Posts

Humanities In A Post-Conscious AI World? 2023-08-28T21:59:57.848Z
Help Understanding Preferences And Evil 2022-08-27T03:42:01.392Z
The Alignment Problem Needs More Positive Fiction 2022-08-21T22:01:59.878Z

Comments

Comment by Netcentrica on How do AI timelines affect how you live your life? · 2023-05-08T05:03:54.018Z · LW · GW

I’ll be seventy later this year, so I don’t worry much about “the future” for myself or how I should live my life differently. I’ve got some grandkids though, and as far as my advice to them goes, I tell their mom that the trades will be safer than clerical or white collar jobs because robotics will lag behind AI. Sure, you can teach an AI to do skilled manual work, say brain surgery, but it’s not going to be making house calls. Creating a robotic plumber would be a massive investment and so not likely to happen. In my humble opinion.

Of course this assumes the world will still need plumbers in the near future. Personally I expect the world to still need plumbers and other tradespeople for the next twenty-plus years. Even if construction were cut back due to its roughly 40% contribution to greenhouse gas emissions, there will still be homes that need maintenance.

My son is a tradesperson as is my son-in-law so I have some knowledge of that lifestyle.

I also know a bit about AI despite my age, as I retired after a thirty-year IT career that included being a developer, a software development team manager and the VP of Operations in a fifty-person print and online textbook publishing company. For the three years since retiring I’ve been writing hard science fiction novellas and short stories about AI and social robots in the near future: about two thousand pages so far, consisting of seven novellas and forty short stories. Hard science fiction requires a ton of research and thinking, and I try to write every day.

I finished my first novella in August of 2020, a few months before Brian Christian published his book “The Alignment Problem” and the term and the issue became popularized. My own belief about AI timelines is that the adoption rate is going to be the fastest ever (see https://hbr.org/2013/11/the-pace-of-technology-adoption-is-speeding-up for historical technology adoption rates). AI will teach itself how to get smarter, and AGI will arrive in just a few years.

Will we solve the “The Alignment Problem” before then? No – because the science of human values will turn out to be perhaps the most challenging technical work AI researchers ever face. What are values? What are they made of? How do they “work” technically? Did you mean genetically inherited values or learned values? Are they genes? If so how would we account for the effect of epigenetics on values as suggested by twin studies? Or are they neurological structures like some form of memory? How do they communicate with each other and change? Is each value made up of sub-values? How would you construct a Bayesian Network to emulate a human values system? Is there a better way? And so on. The science of human values is in its infancy so we are not going to solve the alignment problem any time soon. Unless of course… AI solves it for us. And wouldn't that be an interesting scenario. Does this mean we’re all gonna be killed by AI? As any Master of Futures Studies program faculty member will tell you, it is impossible to predict the future.
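
To show what I mean by the Bayesian network question, here is a deliberately toy sketch in Python (using the pgmpy library). Every variable, edge and number in it is a placeholder I invented purely for illustration, which is really the point: nobody yet knows what the real variables, structure or probabilities of a human values system would be.

```python
# A deliberately toy Bayesian network linking two hypothetical "values"
# (Fairness, Loyalty) to a single decision (Cooperate). All structure and
# numbers are invented placeholders for illustration, not a model of real
# human values.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("Fairness", "Cooperate"), ("Loyalty", "Cooperate")])

# Prior beliefs: how strongly this hypothetical agent holds each value
# (state 0 = low, state 1 = high).
cpd_fair = TabularCPD("Fairness", 2, [[0.3], [0.7]])
cpd_loyal = TabularCPD("Loyalty", 2, [[0.4], [0.6]])

# How the decision depends on the two values; each column is one
# combination of the parent states.
cpd_coop = TabularCPD(
    "Cooperate", 2,
    [[0.9, 0.6, 0.5, 0.1],   # P(Cooperate = no  | Fairness, Loyalty)
     [0.1, 0.4, 0.5, 0.9]],  # P(Cooperate = yes | Fairness, Loyalty)
    evidence=["Fairness", "Loyalty"], evidence_card=[2, 2],
)

model.add_cpds(cpd_fair, cpd_loyal, cpd_coop)
assert model.check_model()

# Query: how likely is cooperation if we observe high fairness?
result = VariableElimination(model).query(["Cooperate"], evidence={"Fairness": 1})
print(result)
```

Even a toy like this immediately runs into the questions above: where do the nodes come from, where do the numbers come from, and how would they change over a lifetime?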

Do I think these issues will affect my grandkids? Absolutely. Can I imagine their world? Not a chance. When I was twenty the personal computer, the internet and cell phones didn’t exist. My future career didn’t exist. So I don’t have much more advice for my grandkids other than the robotics/trades angle.

What would I do differently if I were twenty-something now? Well, if I didn’t go into the trades I’d plan on working for any kind of company involved with the environment. Unlike in my novella series, where the World Governments Federation mandates population control, in the real world people will continue to have babies, and for the next thirty or forty years the global population will continue to grow. Then there are things like climate change, refugees, war and so on. The environment will struggle to deal with all of that and will need a lot of hands on deck.

Now you might be thinking there’s more to life than a career. I agree. I write not to get published but as an act of self-expression, something I consider the highest calling of life. If you know what gives you the greatest personal gratification, I recommend you find a way to make time for it.

Comment by Netcentrica on In Defense of Chatbot Romance · 2023-02-11T19:23:57.137Z · LW · GW

For the past two-plus years I’ve been writing hard science fiction novellas and vignettes about social robots called Companions. The stories are set in this and the next two centuries.

After a thirty-year career in IT I am now retired; I write as a hobby and self-publish. I try to write 300-1,000 words per day and have written seven novellas and forty vignettes.

As a way to address my own social isolation at the time, about ten years ago I also researched and created a social activities group which I ran successfully for two years. Info is here…

https://socialwellness.wordpress.com/

I agree with your views and here are my own responses.

Most people will not ignore real relationships. We are “wetware,” not “software,” and there are significant elements missing from relationships with AI. My fictional Companions are embodied, physically barely distinguishable from humans, and are either artificial general intelligences (AGI) or fully conscious. Due to their technology they are phenomenally skilled at interpersonal communication. I expect some people would prefer this kind of relationship, but not everybody. As I suggest in my stories, it would be just another color on the spectrum of human relationships.

Also I think Dr. Kate Darling’s view of things is important to keep in mind. Humans have had all kinds of pets and animals as companions or co-workers for millennia. As you point out we also have all kinds of relationships with other people but each of these relationships, be it with animals or humans, is distinct.

http://www.katedarling.org/speakingpress

I think negative views of chatbots underestimate both the future ability of AI and human nature. I believe chatbots have the potential to become “real” in their intentions and behavior. With advanced sensors for things like vocal changes, facial micro-expressions and detailed data about our behavior AI will know us better than we know ourselves. People anthropomorphize in endless ways and many will benefit from “on-screen” relationships whether the avatars are perceived as friends, romantic partners or therapeutic counselors.

Most concerns seem to arise from chatbots as they are now, but they will evolve significantly in the coming years and decades. Certainly they can be exploited like any technology, but those issues will be addressed over time, just as our legal and ethical frameworks have addressed every other technology. Human nature is always a two-sided coin.

In my stories, many of which focus on social issues or ethics and justice, most of the concerns regarding “chatbots” have long since been addressed by law, and AI is now an anti-corruption layer of government dealing with all public and private organizations. Screen-based, holographic or embodied companions are as common as cell phones. Contrary to what is popular, my stories contain no sex or violence and very little conflict other than internal conflict. In my vignettes (short stories of around 1k words) I mostly focus on some issue that might arise in a world where AI has become much more social than it currently is: an AI working as an HR manager, a doctor or a detective; an implanted or external AI helping neurodiverse individuals; AI as friends, therapists or romantic partners.

If you think they may be of interest to you they are found here...
https://acompanionanthology.wordpress.com/

The longer novellas focus on larger issues and the AI are simply characters so those may not be of interest to you.

I have written poems, songs and stories since childhood, so I can vouch for most of what you say about writers and characters. There are in general two kinds of writers, however, and I think that may affect the “independent agency” issue. Some writers plan their stories, but others, including famous authors like Stephen King, are “discovery writers”. Discovery writers do not plan their stories; instead they create their characters and the situation and then let the characters decide and dictate everything. I imagine, although I don’t know for sure, that planners would be less inclined to the “independent agency” effect. As a discovery writer myself, I can tell you that I depend entirely upon it. Characters do or say things not because of any plan I have but because it is what they would do as independent agents. I just write it down.

Not sure I’ve added anything to your argument other than support but hopefully I’ve added some food for thought on the subject. 
 

Comment by Netcentrica on Emotional attachment to AIs opens doors to problems · 2023-01-23T19:50:51.194Z · LW · GW

I think you raise a very valid point and I will suggest that it will need to be addressed on multiple levels. Do not expect any technical details here as I am not an academic but a retired person who writes hard science fiction about social robots as a hobby.

With regard to your statement, “We don't have enough data to be sure this should be regulated”, I assume you are referring to the technical aspects of AI, but with regard to human behavior we have more than enough data – humans will pursue the potential of AI to exploit relationships in every way they can and will do so forever, just as they do with everything else.

We ourselves are a kind of AI based on, among other things, a general biological rule you might call “Return On Calories Invested”: investing the fewest calories for the greatest return is one of biology’s most important evolutionary forces. Humans of course are the masters of this, science being the prime example, but in our society crime is also a good example of our relentless pursuit of this rule.

Will emotional bonds with language models cause more harm than good? I think we are at the old “Do guns kill people or do people kill people?” question. AI will need to be dealt with in the same way, with laws. However those laws will also evolve in the same way; some constitutional and regulatory laws will be set down as they are doing in Europe and then case law will follow to address each new form of crime that will be invented. The game of keeping up with the bad guys.

I agree with you that emotional attachment is certain to increase. Some of us become attached to a character in a novel, movie or game and miss them afterwards, and we have the waifu phenomenon in Japan. The movie “Her” is, I think, a thought-provoking speculation. For an academic treatment, Kate Darling explores this in depth in her book “The New Breed”, or you can just watch one of her videos. http://www.katedarling.org/speakingpress 

As I write hard science fiction about social robots, much of it is about ethics and justice. Although that is mostly behind the scenes and implied, it is not always the case. By way of example I’ll direct you to two of my stories. The first is just an excerpt from a much longer story; it explains the thesis of a young woman enrolled in a Masters Of Ethics And Justice In AI program at a fictional institution. I use the term "incarnate" to mean an AI that is legally a citizen with all associated rights and responsibilities. Here is the excerpt…

[BEGIN]

Lyra’s thesis Beyond Companions: Self-Aware Artificial Intelligence and Personal Influence detailed a hypothetical legal case where, in the early days of fully self-aware third generation companions (non-self-aware artificial general intelligence Companions being second generation), the Union of West African States had sued the smaller of the big five manufacturers for including behavior that would encourage micro-transactions. The case argued that the company’s products exploited their ability to perceive human emotions and character to a much greater degree than people could. It was not a claim based on programming code, as it was not possible to make a simple connection between the emergent self of 3G models and their dynamic underlying code. 3G models had to be dealt with by the legal system the same way people were: based on behavior, law, arguments and reasoning.

In Lyra’s thesis the manufacturer argued that their products were incarnate and so the company was not legally responsible for their behavior. The U.W.A.S. argued that if the company could not be held responsible for possible harm caused by their products they should not be allowed to manufacture them. Involving regulatory, consumer, privacy and other areas of law it was a landmark case that would impact the entire industry.

Both sides presented a wide spectrum of legal, ethical and other arguments; however, the court’s final decision favored the union. Lyra’s oral defense was largely centered on the ‘reasons for judgment’ portion of her hypothetical case. She was awarded her Masters degree.

[END]

The excerpt is from https://solveforn.wordpress.com/ 

I think this excerpt echoes a real world problem that will arrive very soon – AI writing its own code and the question of who is responsible for what that code does.

Another issue is considered in my short story (1500 words), “Liminal Life”. This is about a person who forms an attachment to their Companion but then can no longer afford to make their lease payments. It is not a crime but you can easily see how this situation, like a drug dependency, could be exploited.

https://acompanionanthology.wordpress.com/liminal-life/ 

Please note that my stories are not intended as escapism or entertainment; they are reflections on issues and future possibilities. As such, a few of them consider how AI might be used as medical devices. For example, Socialware considers how an implant might address Social Communication Disorder, and The Great Pretender explores an external version of this. Other stories, such as Reminiscing, which is about dementia, and Convergence, about a neurodiverse individual, consider other mental health issues. You can find them here – 
https://acompanionanthology.wordpress.com/table-of-contents-volume-three/ 

In these stories I speculate on how AI might play a positive role in mental health so I am interested in your future post about the mental health issues that such AIs might cause.

Comment by Netcentrica on Aligned with what? · 2023-01-16T03:25:15.374Z · LW · GW

Yes I agree that AI will show us a great deal about ourselves. For that reason I am interested in neurological differences in humans that AI might reflect and often include these in my short stories.

In response to your last paragraph: while most science fiction does portray enforced social order as bad, I do not. I take the benevolent view of AI and see it as an aspect of the civilizing role of society, along with its institutions and laws. Parents impose social order on their children with benevolent intent.

As you have pointed out, if we have alignment then “good” must be defined somewhere, and that suggests a kind of “external” control over the individual, but social norms and laws already represent this and we accept it. I think the problem stems from seeing AI as “other”, as something outside of our society, and I don’t see it that way. This is the theme of my novella “Metamorphosis And The Messenger”, where AI does not represent the evolutionary process of speciation but of metamorphosis. The caterpillar and the butterfly are interdependent.

However even while taking the benevolent side of the argument, the AI depicted in my stories sometimes do make decisions that are highly controversial as the last line of “The Ethics Tutor” suggests; “You don’t think it’s good for me to be in charge? Even if it’s good for you?” In my longer stories (novellas) the AI, now in full control of Earth and humanity’s future, make decisions of much greater consequence because “it’s good for you”.

With regard to your suggestion that “maybe that level of control is what we need over one another to be ‘safe’ and is thus ‘good’”, personally I think that conclusion will come to the majority in its own time due to social evolution. Currently the majority does not understand or accept that while we previously lived in an almost limitless world, that time is over. In a world with acknowledged limits, there cannot be the same degree of personal freedom.

I use a kind of mashup of Buckminster Fuller’s “Spaceship Earth” and Plato’s “Ship Of Fools” in my short story “On Spaceship Earth” to explore this idea where AI acts as an anti-corruption layer within government. 
https://acompanionanthology.wordpress.com/on-spaceship-earth/ 

Survival will determine our future path in this regard, and our values will change accordingly, as they are intended to. The evolutionary benefit of values is that they are highly plastic and can change within centuries or even decades, while genes take up to a million years to complete a species-wide change.

However as one of the alien AI in my stories responds to the question of survival…

“Is not survival your goal?” asked Lena.

“To lose our selves in the process is not to have survived,” replied Pippa.

Lastly, I very much agree with you that we are in a “cart before the horse” situation as far as alignment goes, but I don’t expect any amount of pointing that out will change things. There seems to be a cultural resistance in the AI community to acknowledging the elephant in the room, or the horse in this case: a preference for the immediate, mechanistic problems represented by the cart over the more organic challenges represented by the horse.

However I expect that as AI researchers try to implement alignment they will increasingly be confronted by this issue and gradually, over time, they will reluctantly turn their attention to the horse. 

Comment by Netcentrica on Aligned with what? · 2023-01-14T18:39:24.697Z · LW · GW

I have been writing hard science fiction stories where this issue is key for over two years now. I’m retired after a 30-year career in IT, and my hobby of writing is my full-time “job” now. Most of that time is spent researching AI or other subjects related to the particular stories.

One of the things I have noticed over that time is that those who talk about the alignment problem rarely talk about the point you raise. It is glossed over and taken as self-evident, while I have found that the subject of values appears to be at least as complex as genetics (which I have also had to research). Here is an excerpt from one story…

“Until the advent of artificial intelligence the study of human values had not been taken seriously but was largely considered a pseudoscience. Values had been spoken of for millennia however scientifically no one actually knew what they were, whether they had any physical basis or how they worked. Yet humans based most if not all of their decisions on values and a great deal of the brain’s development between the ages of five and twenty five had to do with values. When AI researchers began to investigate the process by which humans made decisions based on values they found some values seemed to be genetically based but they could not determine in what way, some were learned yet could be inherited and the entire genetic, epigenetic and extra-genetic collection of values interacted in a manner that was a complete mystery. They slowly realized they faced one of the greatest challenges in scientific history.”

Since one can’t write stories where the AI are aligned with human values unless those values are defined I did have to create theories to explain that. Those theories evolved over the course of writing over two thousand pages consisting of seven novellas and forty short stories. In a nutshell…

* In our universe values evolve just like all the other aspects of biological humans did – they are an extension of our genetics, an adaptation that improves survivability. 
* Values exist at the species, cultural and individual levels, so some are genetic and some are learned, but originally even the “social” values were genetic, so when some became purely social they continued to use genetics as their model and to interact with our genetics. 
* The same set of values could be inherent in the universe, given the constants of physics and convergent evolution – in other words, values tend towards uniformity just as matter gives rise to life, life to intelligence and intelligence to civilization. 
* Lastly, I use values as a theory for the basis of consciousness – they represent the evolutionary step beyond instinct and enable rational thought. For there to be values there must be emotions in order for them to have any functional effect, and if there are emotions there is an emergent “I” that feels them. The result of this is that when researchers create AI based on human values, those AI become conscious.

Please keep in mind this is fiction, or perhaps the term speculation takes it a step closer to being a theory. I use this model to structure my stories but also to think through the issues of the real world.

Values being the basis of ethics brings us back to your issue of “good”. Here is a story idea about how I expect ethics might work in AI and thus address the alignment question you raise, “Is there a definition somewhere?” At one thousand words it takes about five minutes to read. My short stories, vignettes really, don’t provide a lot of answers but are intended more as reflections on issues with AI.

https://acompanionanthology.wordpress.com/the-ethics-tutor/ 

With regard to your question, “Is information itself good or bad?”, I come down on the side of Nietzsche (and I have recently read Beyond Good And Evil) that values are relative, so in my opinion information itself is not good or bad. Whether it is perceived as good or bad depends on the way it is applied within the values environment. 
 

Comment by Netcentrica on AI alignment with humans... but with which humans? · 2022-09-10T18:14:52.618Z · LW · GW

When In Rome

Thank you for posting this Geoffrey. I myself have recently been considering posting the question, “Aligned with which values exactly?”

TL;DR - Could an AI be trained to deduce a default set and system of human values by reviewing all human constitutions, laws, policies and regulations in the manner of AlphaGo?

I come at this from a very different angle than you do. I am not an academic but rather am retired after a thirty year career in IT systems management at the national and provincial (Canada) levels.

Aside from my career, my lifelong personal interest has been, well, let’s call it “Human Nature”. So long before I had any interest in AI I was reading about anthropology, archeology, philosophy, psychology, history and so on, but during the last decade I mostly focused on human values. Schwartz and all that. In a very unacademic way, I came to the conclusion that human values seem to explain everything with regard to what individual people feel, think, say and do, and the same goes for groups.

Now that I’m retired I write hard science fiction novellas and short stories about social robots. I don’t write hoping for publication but rather to explore issues of human nature both social (e.g. justice) and personal (e.g. purpose). Writing about how and why social robots might function, and with the theory of convergent evolution in mind, I came to the conclusion that social robots would have to have an operating system based on values.

From my reading up to this point I had gained the impression that the study of human values was largely considered a pseudoscience (my apologies if you feel otherwise). Given my view of the foundational importance of values, I found this attitude and the accompanying lack of hard scientific research into values frustrating.

However as I did the research into artificial intelligence that was necessary to write my stories I realized that my sense of the importance of values was about to be vindicated. The opening paragraph of one of my chapters is as follows… 

During the great expansionist period of the Republic, it was not the fashion to pursue an interest in philosophy. There was much practical work to be done. Science, administration, law and engineering were well-regarded careers. The questions of philosophy popular with young people were understandable and tolerated but were expected to be put aside upon entering adulthood.

All that changed with the advent of artificial intelligence.

As I continued to explore the issues of an AI values-based operating system, the enormity of the problem became clear, and it is expressed as follows in another chapter…

Until the advent of artificial intelligence the study of human values had not been taken seriously. Values had been spoken of for millennia however scientifically no one actually knew what they were, whether they had any physical basis or how they worked as a system. Yet it seemed that humans based most if not all of their decisions on values and a great deal of the brain’s development between the ages of five and twenty five had to do with values. When AI researchers began to investigate the process by which humans made decisions based on values they found some values seemed to be genetically based but they could not determine in what way, some were learned yet could be inherited and the entire genetic, epigenetic and extra-genetic system of values interacted in a manner that was a complete mystery.

They slowly realized they faced one of the greatest challenges in scientific history.

I’ve come to the conclusion that values are too complex a system to be understood by our current sciences. I believe in this regard that we are about where the ancient Greeks were regarding the structure of matter or where genetics was around the time of Gregor Mendel.

Expert systems, or even our most advanced mathematics, are not going to be enough, nor even suitable approaches to solving the problem. Something new will be required. I reviewed Stuart Russell’s approach, which I interpret as "learning by example", and felt it glossed over some significant issues; for example, children learn many things from their parents, not all of them good.

So in answer to your question, “AI alignment with humans... but with which humans?”, might I suggest another approach? Could an AI be trained to deduce a default set and system of human values by reviewing all human constitutions, laws, policies and regulations in the manner of AlphaGo? In every culture and region, constitutions, laws, policies and regulations represent our best attempts to formalize and institutionalize human values based on our ideas of ethics and justice.

I do appreciate the issue of values conflict that you raise. The Nazis passed some laws. But that’s where the AI and the system it develops come in. Perhaps we don’t currently have an AI that is up to the task, but it appears we are getting there.

This approach, it seems, would solve three problems: 1) the problem of "which humans" (because it includes source material from all cultures, etc.), 2) the problem of "which values", for the same reason, and 3) your examples of the contextual problem of "which values apply in which situations", with the approach of “When in Rome, do as the Romans do”. A crude first-pass sketch of the idea follows.
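
If it helps to make the suggestion concrete, here is the kind of crude first pass I have in mind, far short of anything AlphaGo-like: embed passages from a legal corpus and cluster them so the corpus itself suggests recurring value themes a human can then label. The file name, embedding model and cluster count below are placeholders I made up for illustration.

```python
# Toy sketch: embed passages from a (hypothetical) corpus of constitutions and
# laws, cluster them, and inspect the clusters as candidate "recurring value
# themes". This only surfaces themes; it is not a values system.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Hypothetical corpus: one legal/constitutional passage per line.
with open("legal_passages.txt", encoding="utf-8") as f:
    passages = [line.strip() for line in f if line.strip()]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(passages)

kmeans = KMeans(n_clusters=10, random_state=0).fit(embeddings)

# Print a few example passages per cluster so a human can label each theme
# by hand (e.g. "due process", "bodily autonomy", "property").
for cluster_id in range(kmeans.n_clusters):
    examples = [p for p, c in zip(passages, kmeans.labels_) if c == cluster_id][:3]
    print(f"Cluster {cluster_id}:")
    for p in examples:
        print("  -", p)
```

Whether anything along these lines could scale from surfacing themes to actually deducing a default system of values is, of course, exactly the open question.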

Comment by Netcentrica on Help Understanding Preferences And Evil · 2022-09-06T16:07:08.394Z · LW · GW

Thanks for responding Viliam. Totally agree with you that “if homo sapiens actually had no biological foundations for trust, altruism, and cooperation, then... it would be extremely difficult for our societies to instill such values”.

As you say, we have a blend of values that shift as required by our environment. I appreciate your agreement that it’s not really clear how training an AI on human preferences solves the issue raised here.

Of all the things I have ever discussed in person or online, values are the most challenging. I was interested in human values for decades before AI came along, and historically there is very little hard science to be found on the subject. I’m delighted that AI is causing values to be studied widely for the first time; however, in my view we are only about where the ancient Greeks were with regard to the structure of matter, or where Gregor Mendel’s study of pea plants falls with regard to genetics. Both fields turned out to be unimaginably complex. Like those, I expect the study of values will go on indefinitely as we discover how complicated values really are.

I can see how the math involved likely precludes us from writing the necessary code, and that “self-teaching” (sorry, I don’t know the correct word) is the only way an AI could learn human values, but again it seems as if Stuart’s approach is missing a critical component. I’ve finished his book now, and although he goes on at length about different scenarios he never definitively addresses the issue I raise here. I think the analogy that children learn many things from their parents, not all of them “good”, applies here, and Stuart’s response to this problem still seems to gloss over the issue.

Comment by Netcentrica on How to plan for a radically uncertain future? · 2022-09-01T19:55:06.460Z · LW · GW

TL;DR Watch this video ...

or read the list of summary points for the book here

https://medium.com/steveglaveski/book-summary-21-lessons-for-the-21st-century-by-yuval-noah-harari-73722006805a

If you don't know who he is, Yuval Noah Harari is a historian who writes about the future (among other things).  

I'm 68 and retired. I've seen some changes. Investing in companies like Xerox and Kodak would have made sense early in my career. Would have been a bad idea in the long run. The companies that would have made sense to invest in didn't exist yet. 

I started my IT career in 1980 running an IBM mainframe the size of a semi-trailer while its peripherals took up an entire floor. Almost no one but hobbyists owned a PC and the internet did not exist as far as the public was concerned. Cell phones only existed in science fiction. In less than twenty years, by 2000, it was a radically different world. 

All my life I've been interested in the human story, reading widely about evolution, civilization, the arts and sciences. Never before have I seen so many signs that things are about to change dramatically. 

It's only human to try, but I don't believe you are going to be able to "figure this out". I suggest an analogy: the difference between an expert system and AlphaGo. The latter wins by learning, not by knowing a bunch of facts and rules. That's why I suggest this video. He talks about how to think about the future. 

When I retired, I thought about what to do. I had a lifetime's worth of knowledge in my head so I decided to write hard science fiction about the near future, 2025-2325. It's very difficult. Will the idea of cell phones be laughable in 2125? How long will it take for an AI to do a comparative analysis of two genomes in 2075? How will a population of eleven billion by 2100 change the world? Forget about 2100 - how will AI, climate change and geopolitics change the world by 2030?

Currently I'm writing a story about two students in a Masters Of Futures Studies program. They get a wild idea and the story follows their escapades. Futures Studies is not a mature science (if it even is a science), but it is a methodology used by major corporations and governments to plan for the future. Organizations like Shell (where Futures Studies, aka Foresight, was known as Scenario Planning), the US military and the country of Finland, among others, use it, and the stakes are pretty high for them. 

As I write hard science fiction, I have to do a ton of research on whatever I'm writing about, be it genetics, human values, AI or what have you. So I am aware that, unfortunately, if you investigate Futures Studies you will encounter a lot of consultants who sound very woo-woo. But once you sort the wheat from the chaff there is a solid methodology underlying the discipline. It's not perfect (who can predict the future?) but it's as close to rigorous as you'll get. 

Here is an easily understandable explanation of the process by the person whom I have found to be the best communicator in the business. 

Here's the Wikipedia page about Futures Studies

https://en.wikipedia.org/wiki/Futures_studies 

And here's a PDF explaining the methodology as it is generally applied

https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/674209/futures-toolkit-edition-1.pdf

It's a lot, I admit, but is it worth your time? Think of it as an investment. 

It is a highly collaborative process, so maybe get a group of like-minded friends together and try it. There's no peer review process in Futures Studies; that issue is dealt with by the number of people you have involved.

Best of luck.

Comment by Netcentrica on Help Understanding Preferences And Evil · 2022-08-27T17:58:22.969Z · LW · GW

“…human preferences/values/needs/desires/goals/etc. is a necessary but not sufficient condition for achieving alignment.”

I have to agree with you in this regard and on most of your other points. My concern, however, is that Stuart’s communications give the impression that the preferences approach addresses the problem of AI learning things we consider bad, when in fact it doesn’t.  

The model of AI learning our preferences by observing our behavior and then proceeding with uncertainty makes sense to me. However, just as Asimov’s robot characters eventually decide there is a fourth rule that overrides the other three, Stuart’s “Three Principles” model seems incomplete. Preferences do not appear to me, in themselves, to deal with the issue of evil. 

Comment by Netcentrica on Help Understanding Preferences And Evil · 2022-08-27T17:02:05.109Z · LW · GW

Stuart does say something along the same lines as you point out in a later chapter; however, I felt it detracted from his idea of three principles:  

   1. The machine's only objective is to maximize the realization of human preferences.

   2. The machine is initially uncertain about what those preferences are.

   3. The ultimate source of information about human preferences is human behavior.

He goes on at such length to qualify and add special cases that the word “ultimate” in principle #3 seems to have been a poor choice because it becomes so watered down as to lose its authority.

If things like laws, ethics and morality are used to constrain what AI learns from preferences (which seems both sensible and necessary as in the parent/child example you provide) then I don’t see how preferences are “the ultimate source of information” but rather simply one of many training streams. I don’t see that his point #3 itself deals with the issue of evil.

As you point out this whole area is “a matter of much debate” and I’m pretty confident that like philosophical discussions it will go on (as they should) forever however I am not entirely confident that Stuart’s model won’t end up having the same fate as Marvin Minsky’s “Society Of Mind”.

Comment by Netcentrica on The Alignment Problem Needs More Positive Fiction · 2022-08-24T03:44:21.801Z · LW · GW

Reading your response, I have to agree with you. I painted with too broad a brush there. Just because I don’t use elements the general public enjoys in my stories about benevolent AI doesn’t mean that’s the only way it can or has to be done.

Thinking about it now I’m sure stories could be written where there is plenty of action, conflict and romance, while also showing what getting alignment right would look like.

Thanks for raising this point. I think it’s an important clarification regarding the larger issue.