Comments

Comment by Feel_Love on Awakening · 2024-06-02T17:10:28.233Z · LW · GW

Thanks for sharing your experience with meditation.

The elder school of Buddhism is Theravada (or Theravāda), spelled with only one 'e'.

Theravada meditation instructions based on the Pali Canon are freely available in Mindfulness in Plain English by Bhante Gunaratana.

Comment by Feel_Love on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-07T15:18:22.914Z · LW · GW

Kant thought that it was entirely immoral to lie to the murderer because of a similar reason that Feel_Love suggests (in Kant's case it was that the murderer might disbelieve you and instead do what you're trying to get him not to do).

Kant's reason that you described doesn't sound very similar to mine. I agree with your critique of the proposition that lying is bad primarily because it increases the chance that others will commit violence.

My view is that the behavior of others is out of my control; they will frequently say and do things I don't like, regardless of what I do. I'm only accountable for my own thoughts and actions. Lying is bad for my personal experience first and foremost, as I prefer to live (and die) without confusing my mind by maintaining delusional world models. My first priority is my own mental health, which in turn supports efforts to help others.

I would absolutely lie to the murderer, and then possibly run him over with my car.

Similarly with regard to killing, my thinking is that I'm mortal, and my efforts to protect my health will fail sooner or later. I can't escape death, no matter what means I employ. But while the quantity of my lifespan is unknown to me and out of my control, the quality of my life is the result of my intentions. I will never entertain the goal of killing someone, because spending my limited time peacefully is much more enjoyable and conducive to emotional health. Having made it to my car, I'll just drive away.

It's an interesting question whether someone who participates in fights to the death has a shorter or longer life expectancy on average than one who abstains from violence. But the answer is irrelevant to my decision-making.

Comment by Feel_Love on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-06T03:16:26.443Z · LW · GW

I think of lying as speaking falsely with the intent to deceive, i.e. to cause someone to be confused or ignorant about reality.

In the case of checking the "I have read the terms and conditions" box, I'm not concerned that anyone is deceived into thinking I have read all of the preceding words rather than just some of them.

In the case of a murderer at the door, the problem is that the person is too confused already. I would do my best to protect life, but lying wouldn't be the tactic. Depending on the situation, I might call for help, run away, command the intruder to leave, physically constrain them, offer them a glass of water, etc. I realize that I might be more likely to survive if I just lied to them or put a bullet through their forehead, but I choose not to live that way.

Comment by Feel_Love on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-05T16:25:34.518Z · LW · GW

I agree with Claude's request that people abstain from lying to AI.

Comment by Feel_Love on ' petertodd'’s last stand: The final days of open GPT-3 research · 2024-01-23T22:36:35.449Z · LW · GW

LEILAN 2024! Seriously, though, I think many people would find the Leilan character to be a wiser friend than their typical human neighbor. I'm glad you're researching this fascinating topic. If a frontier AI is struggling to pass certain friendliness or safety evals, I'd be curious whether it may perform better with a simple policy equivalent to what-would-Leilan-do.

Prompting ChatGPT4 today with nothing more than " davidjl" has often returned "DALL-E" as the interpretation of the term. With "DALL-E" included alongside " davidjl" in the prompt, I've gotten "AI" as the interpretation. Asking how an LLM might represent itself using the concept of " davidjl" resulted in a response that seamlessly substituted the term "I"...

Perhaps glitch tokens can shed light on how a model represents itself.
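
For anyone who wants to poke at this themselves, here is a minimal sketch of the kind of probe described above (the model name, client library, and exact behavior are assumptions on my part; glitch-token responses vary from run to run):

    # Hypothetical sketch: probing how a chat model interprets the " davidjl" glitch token.
    # Assumes the openai Python client (v1+) and an OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    def interpret(prompt: str) -> str:
        """Ask the model what it makes of a prompt containing the glitch token."""
        response = client.chat.completions.create(
            model="gpt-4",  # assumed model; behavior differs across models and dates
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    # Bare token first, then the token alongside "DALL-E", mirroring the probes above.
    print(interpret(" davidjl"))
    print(interpret("DALL-E  davidjl"))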

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-14T17:47:22.257Z · LW · GW

So why do some people choose to do good while others choose to do evil?

Intentions depend on beliefs, i.e. the views a person holds, their model of reality. A bad choice follows from a lack of understanding: confusion, delusion, or ignorance about the causal laws of this world.

A "choice to do evil" in the extreme could be understood as a choice stemming from a worldview such as harm leads to happiness. (In reality, harm leads to suffering.)

How could someone become so deluded? They succumbed to evolved default behaviors like anger, instead of using their freedom of thought to cultivate more accurate beliefs about what does and does not lead to suffering.

People like Hitler made a long series of such errors, causing massive suffering. They failed to use innumerable opportunities, moment by moment, to allow their model to investigate itself and strive to learn the truth. Not because they were externally compelled, but because they chose wrongly.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-14T17:21:53.214Z · LW · GW

You do have some computing power, though. You compute choices according to processes that are interconnected with all other processes, including genetic evolution and the broader environment.

These choosing-algorithms operate according to causes ("inputs"), which means they are not random. Rather, they can result in the creation of information instead of entropy.

The environment is not something that happens to us. We are part of it. We are informed by it and also inform it in turn, as an output of energy expenditure.

Omega hasn't run the calculation that you're running right now. Until you decide, the future is literally undecided.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-13T16:18:23.719Z · LW · GW

The only way Hitler could have realised that his actions were bad and chosen to be good would be if his genes and environment built a brain that would do so given some environmental input.

The brain is an ongoing process, not a fixed thing that is given at birth. Hitler was part of the environment that built his brain. Many crucial developmental inputs came from the part of the environment we call Hitler.

You didn’t choose to have a brain that tries not to think bad thoughts

I did and do choose my intentions deliberately, repeatedly, with focused effort. That's a major reason the brain develops the way it does. It generates inputs for itself, through conscious modeling. It doesn't just process information passively and automatically based solely on genes and sensory input. That's the Chinese Room thought experiment -- information processing devoid of any understanding. The human mind reflects and practices ways of relating to itself and the environment.

You never get a pass to say, "Sorry I'm killing you! I'm not happy about it either. It's just that my genes and the environment require this to happen. Some crazy ride we're on together here, huh?" That's more like how a mouse trap processes information. With the human level of awareness, you can actually make an effort and choose to stop killing.

We help create the world -- discover the unknown future -- by resolving uncertainty through this lived process. The fact that decision-making and choosing occur within reality (or "the environment") rather than outside of it is logical and necessary. It doesn't mean that there is no choosing. Choosing is merely real, another step in the causal chain of events.

Comment by Feel_Love on Concrete examples of doing agentic things? · 2024-01-12T22:36:25.830Z · LW · GW

Non-action is a ubiquitous option that is often overlooked. It can be very powerful.

For example, if someone asks you a question, it's natural to immediately start searching for the best words to say in response. The search may feel especially desperate if it seems like there is nothing you can say that would be true and useful. An ace up the sleeve is to be silent. No one can force you to act or speak, and a rare, minor social faux pas is forgotten surprisingly fast.

A friend:
"Do I look fat in this dress?"
Smiles. [commence silent mode]

A police officer:
"Ok for me to search your car? What are you doing here?"
"I'm happy to comply if you have a warrant. I'll need to consult with my attorney before answering any further questions." [commence silent mode]

A serial killer:
"Which of your children shall I murder?!"
[commence silent mode]

(I pay little attention to threatening people, regardless of what they say or do, and the outcome is usually the best I could hope for.)

Comment by Feel_Love on An even deeper atheism · 2024-01-12T18:20:37.893Z · LW · GW

In fact, the rest he gave to his mother, aunt, and sister -- £1,000,000 each. Quite generous for a 19-year-old. His ex-wife with a newborn baby got £1,400,000.

I'm afraid to research it further... maybe they all blew it on drugs and hookers too.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-12T17:58:45.169Z · LW · GW

Hitler’s evil actions were determined by the physical structure of his brain. [...] certain environmental inputs (which he didn’t choose) caused his brain to output genocide.

I can't speak for you, but I personally can choose to stop thinking thoughts if they are causing suffering, and instead think a different thought. For example, if I notice that I'm replaying a stressful memory, I might choose to pick up the guitar and focus on those sounds and feelings instead. This trains neural pathways that make me less and less susceptible to compulsively "output genocide."

Sure, "I" am as much a part of the environment as anything else, as is "my" decision-making process. So you could say that it's the environment choosing a brain-training input, not me. But "I" am what the environment feels like in the model of reality simulated by a particular brain. And there is a decision-making process happening within the model, led by its intentions.

Hitler had a choice. He could make an effort to train certain neural pathways of the brain, or he could train others by default. He chose to write divisive propaganda when he should have painted.

The bad outcomes that followed were not compelled by the environment. They are attributable to particular minds. We who have capacity for decision-making are all accountable for our own moral deeds.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-12T17:08:20.941Z · LW · GW

the first step of ‘think for like a thousand subjective years about all of this’

I appreciate that you're clear about this being the first step.

Ancestor simulations? Maybe... but not before the year 3000. Let's take our time when it comes to birthing consciousness.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-12T04:27:58.702Z · LW · GW

oftentimes what's needed to let go of grief is to stop pushing it away

Agreed! Grief itself is often just the pushing-it-away habit in relation to unpleasant thoughts or sensations.

This process may involve fully feeling pain that you were suppressing.

It may. But just as grief need not be pushed away, neither should it be sought. "Fully feeling pain" and "fully feeling love" are two different activities. If the pain takes time to change, I'm all for the patient and forgiving approach you suggest.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-11T20:59:08.918Z · LW · GW

In order to get to the love underneath, it's wonderful to forgive pain, as you say. But forgiving pain feels good. It isn't painful.

Unconditional love has no conditions. Feeling grief is not required. Everyone's invited, as they are. If grief arises, the best thing to do is to let it go as soon as it's noticed. Maybe that's what you mean by "process" it, in which case we agree.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-11T20:44:22.840Z · LW · GW

[Hitler] is in front of you, unarmed and surrendering. What do you do?

Accept his surrender like I would for any other soldier -- frisking, etc. -- and get his statement recorded and disseminated through official channels. Escort him into Allied custody. If possible, let him drink some water, use a toilet, and have a blanket. Protect him from being physically harmed by himself or others if the opportunity arises.

The morning after, reflect on how I ended up participating in war. Vow to walk away next time there's a gun fight, regardless of its cause or threats to my life and freedom for abstaining from violence.

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-11T15:01:06.119Z · LW · GW

I genuinely feel terrible for Adolf Hitler.

I hope you feel better with time. I think it's important to note that universal love, including compassion for those suffering, is always a pleasant feeling. It doesn't hurt the way pity or lamenting might; there's no grief in it, just well-wishing.

But Hitler is dead

This is an important point. While unconditional love has no boundaries, including time, it can be a major complication to start the effort with past beings or hypothetical future ones as the object. It's usually easier to start with one (or many) of the countless beings who are experiencing life right now. For the exercise of this post, a better case study than Hitler might be Putin or Trump (or Biden, etc.). This way, we don't have to additionally posit time travel, simulations, alternate universes, or what death entails.

I hope that on some other spin of the wheel we can be friends.

I love this sentiment and the personal details you shared. Learning about Hitler's good qualities was great too. Thank you!

Comment by Feel_Love on Universal Love Integration Test: Hitler · 2024-01-11T05:29:15.025Z · LW · GW

Thanks for sharing these sweet thoughts.

I appreciate the distinction among "Wanting people to thrive", "Empathy", and "Love." These categories are somewhat related to four universal attitudes I practice:

  1. Loving-friendliness. The unconditional well-wishing toward all living beings, myself included. May all be healthy, happy, secure, and peaceful! Sometimes this is accompanied by sensations of warmth radiating from the chest or torso.

  2. Sympathetic joy. A subset of (1), the way it feels to love those who are thriving, meeting with success, or otherwise happy for any reason. Yay! A smile comes easily with this jubilance.

  3. Compassion. A subset of (1), the way it feels to love those who are suffering. Critically, this is not suffering because others are suffering. When you get to the ER, you don't want the doctor to break down in tears upon seeing your condition. You want a calm and compassionate caregiver. It can be a tender feeling at times.

  4. Equanimity. Peace. The doctors' calmness that allows them to love and do their best in a world that is ultimately out of anyone's control. Unshakable acceptance that things keep changing; pleasure and pain come and go. People do or say things we wouldn't want them to do or say, but that comes as no surprise and there's nothing to gain by worrying about that timeless fact. This imperturbability sometimes comes with a cool, spacious feeling in the head.

Cool head, warm heart. These are very healthy human qualities to cultivate any time. They lead to good intentions, which in turn lead to good actions.

Comment by Feel_Love on Almost everyone I’ve met would be well-served thinking more about what to focus on · 2024-01-07T04:05:44.474Z · LW · GW

Almost everyone I’ve ever met would be well-served by spending more time thinking about what to focus on. —Sam Altman

He seems to have gone beyond the reach of his own advice, unfortunately. Reportedly, "he has mostly stopped meditating, partly because he doesn’t want to lose his motivation to work."

The same article reports that meditation helped Sam become tremendously more effective in the past. He must be pretty certain of his current priorities if he sees an attempt to become wiser as a risk to his motivation.

This has the ring of a cautionary tale. The post makes a good case that pursuing too many priorities impedes success. But just because you've narrowed down to one or two priorities doesn't mean you stop evaluating those priorities. If your motivations are good, clear contemplation or meditation will only strengthen them. Take a few deep breaths, Sam -- there's nothing to fear.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2024-01-01T04:54:00.101Z · LW · GW

One life like mine, that has experienced limited suffering and boundless happiness, is enough. Spinning up too many of these results in boundless suffering. I would not put this life on repeat, unlearning and relearning every lesson for eternity.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-31T22:00:59.551Z · LW · GW

Great points and question, much appreciated.

how can you value not intentionally making them suffer, but not also conclude that we should give resources to them to make them happier?

I devote a bit of my limited time to helping ants and other beings as the opportunities arise. Giving limited resources in this way is a win-win; I share the rewards with the ants. In other words, they're not benefiting at my expense; I am happy for their well-being, and in this way I also benefit from an effort such as placing an ant outdoors. A lack of infinite resources hasn't been a problem; it just helps my equanimity and patience to mature.

Generally, though, all life on Earth evolved within a common context and it's mutually beneficial for us all that this environment be unpolluted. The things that I do that benefit the ants also tend to benefit the local plants, bacteria, fungi, reptiles, mammals, etc. -- me included. The ants are content to eat a leaf of a plant I couldn't digest. I can't make them happier by feeding them my food or singing to them all day, as far as I can tell. If they're not suffering, that's as happy as they can be.

I think the same is true for humans: happiness and living without suffering are the same thing.

Unfortunately, it seems that we all suffer to some degree or another by the time we are born. So while I am in favor of reducing suffering among living beings, I am not in favor of designing new living beings. The best help we can give to hypothetical "future" beings is to care for the actually-living ones and those being born.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-31T21:16:14.816Z · LW · GW

if your model of the world says that ant suffering is bad, then doesn’t that imply that you believe ants have subjective experience?

Indeed. I was questioning the proposition by Seth Herd that a collective like ants does not have subjective experience and so "doubling the ant population is twice as good." I didn't follow that line of reasoning and wondered whether it might be a mistake.

Likewise, the reason why it’ll be good to create copies of yourself is not because you will be happy, but because your copies will be happy

I don't think creating a copy of myself is possible without repeating at least the amount of suffering I have experienced. My copies would be happy, but so too would they suffer. I would opt out of the creation of unnecessary suffering. (Aside: I am canceling my cryopreservation plans after more than 15 years of Alcor membership.)

Likewise, injury, aging and death are perhaps not the only causes of suffering in ants. Birth could be suffering for them too.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-29T06:14:32.639Z · LW · GW

we make increasing efforts to prevent extinction if a species' population drops, but that's basically just a convenient shorthand for the utility of future members of that species who cannot exist if it goes extinct.

In addition to the utility of hypothetical future beings, there's also the utility of the presently living members of that species who are alive thanks to the extinction-prevention efforts in this scenario.

The species is not extinct because these individuals are living. If you can help the last members of a species maintain good health for a long time, that's good even if they can't reproduce.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-29T05:00:54.293Z · LW · GW

The moral worth resides in each individual, since they have a subjective experience of the world, while a collective like "ants" does not. So doubling the ant population is twice as good.

Wouldn't the hive need to have a subjective experience -- collectively or as individuals -- for it to be good to double their population in your example?

Whether they're presently conscious or not, I wouldn't want to bring ant-suffering into the world if I could avoid it. On the other hand, I do not interfere with them and it's good to see them doing well in some places.

As for your five mentions of "utilitarianism": I try to convey my view in the plainest terms. I do not mean to offend you or any -isms or -ologies of philosophy. I like reason and am here to learn what I can. Utilitarians are all friends to me.

I think ethics is just a matter of preference

I'm fine with that framing too. There are a lot of good preferences found commonly among sentient beings. Happiness is better than suffering precisely to the extent of preferences, i.e. ethics.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-28T04:18:42.792Z · LW · GW

Thanks. I think the default assumption you expanded on doesn't match my view. Global ethical worth isn't necessarily a finite quantity subject only to zero-sum games.

So you're happy to donate some of your moral weight to ants.

I'm happy for any and all living beings to be in good health. I don't lose any moral weight as a result. Quite the opposite: when I wish others well, I create for myself a bit of beneficial moral effect; it's not donated to ants at my expense.

Comment by Feel_Love on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-28T03:01:43.160Z · LW · GW

the only ethically acceptable solution is to (humanely) reduce the human population to zero, to free up resources to support tens of quadrillions more ants

As a friend of ants, what's good for ants is good for me, and what's good for me is good for ants.

But I don't see how a vast increase in Earth's ant population would be helpful to ants any more than creating copies of myself existing in parallel would be an improvement for me or my species. Apparently, this planet is already big enough for me and a bunch of ants to get along.

I didn't always love ants. I have intentionally poisoned them, crushed them, and burned them alive. As a child, I hadn't understood that all violence is mutually detrimental.

AI can learn this.

Comment by Feel_Love on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-18T15:41:11.782Z · LW · GW

"Being a mom isn't easy. I used to love all my kids, wishing them health and happiness no matter what. Unfortunately, this doesn't really work as they have grown to have conflicting preferences."

A human would be an extreme outlier to be this foolish. Let's not set the bar so low for AI.

Comment by Feel_Love on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-17T17:05:51.304Z · LW · GW

I appreciate the time you've put into our discussion and agree it may be highly relevant. So far, it looks like each of us has misinterpreted the other to be proposing something they are actually not proposing, unfortunately. Let's see if we can clear it up.  
  
First, I'm relieved that neither of us is proposing to inform AI behavior with people's shared preferences.  
  
This is the discussion of a post about the dangers of terminology, in which I've recommended "AI Friendliness" as an alternative to "AI Goalcraft" (see separate comment), because I think unconditional friendliness toward all beings is a good target for AI. Your suggestion is different:  

About terminology, it seems to me that what I call preference aggregation, outer alignment, and goalcraft mean similar things [...] I'd vote for using preference aggregation

I found it odd that you would suggest naming the AI Goalcraft domain "Preference Aggregation" after saying earlier that you are only "slightly more positive" about aggregating human preferences than you are about "terrible ideas" like controlling power according to utilitarianism or a random person. Thanks for clarifying:  

I don't think we should aim to guide AI behavior using shared preferences.

Neither do I, and for this reason I strongly oppose your recommendation to use the term "preference aggregation" for the entire field of AI goalcraft. While preference aggregation may be a useful tool in the kit and I remain interested in related proposals, it is far too specific, and it's only slightly better than terrible as a way to craft goals or guide power.  

there aren't enough obvious, widely shared preferences for us to guide the AI with.

This is where I think the obvious and widely shared preference to be happy and not suffer could be relevant to the discussion. However, my claim is that happiness is the optimization target of people, not that we should specify it as the optimization target of AI. We do what we do to be happy. Our efforts are not always successful, because we also struggle with evolved habits like greed and anger and our instrumental preferences aren't always well informed.  

You want an ASI to optimize everyone's happiness, right?

No. We're fully capable of optimizing our own happiness. I agree that we don't want a world where AI force-feeds everyone MDMA or invades brains with nanobots. A good friend helps you however they can and wishes you "happy holidays" sincerely. That doesn't mean they take it upon themselves to externally measure your happiness and forcibly optimize it. The friend understands that your happiness is truly known only to you and is a result of your intentions, not theirs.  

I think happiness/sadness is a signal that evolution has given us for a reason. We tend to do what makes us happy, because evolution thinks it's best for us. ("Best" is again debatable, I don't say everyone should function at max evolution). If we remove sadness, we lose this signal. I think that will mean that we don't know what to do anymore, perhaps become extremely passive.

Pain and pleasure can be useful signals in many situations. But to your point about it not being best to function at max evolution: our evolved tendency to greedily crave pleasure and try to cling to it causes unnecessary suffering. A person can remain happy regardless of whether a particular sensation is pleasurable, painful, or neither. Stubbing your toe or getting cut off in traffic is bad enough; much worse is to get furious about it and ruin your morning. A bite of cake is even more enjoyable if you're not upset that it's the last one of the serving. Removing sadness does not remove the signal. It just means you have stopped relating to the signal in an unrealistic way.    

If someone wants to do this on an individual level (enlightenment? drug abuse? netflix binging?), be my guest

Drug abuse and Netflix-binging are examples of the misguided attempt to cling to pleasurable sensations I mentioned above. There's no eternal cake, so the question of whether it would be good for a person to eat eternal cake is nonsensical. Any attempt to eat eternal cake is based on ignorance and cannot succeed; it just leads to dissatisfaction and a sugar habit. Your other example -- enlightenment -- has to do with understanding this and letting go of desires that cannot be satisfied, like the desire for there to be a permanent self. Rather than leading to extreme passivity, benefits of this include freeing up a lot of energy and brain cycles.

With all due respect, I don't think it's up to you - or anyone - to say who's ethically confused and who isn't. I know you don't mean it in this way, but it reminds me of e.g. communist re-education camps.

This is a delicate topic, and I do not claim to be among the wisest living humans. But there is such a thing as mental illness, and there is such a thing as mental health. Basic insights like "happiness is better than suffering" and "harm is bad" are sufficiently self-evident to be useful axioms. If we can't even say that much with confidence, what's left to say or teach AI about ethics?  

Probably our disagreement here stems directly from our different ethical positions: I'm an ethical relativist, you're a utilitarian, I presume.

No, my view is that deontology leads to the best results, if I had to pick a single framework. However, I think many frameworks can be helpful in different contexts and they tend to overlap.  

I do think it's valuable to point out that lots of people outside LW/EA have different value systems (and just practical preferences) and I don't think it's ok to force different values/preferences on them with an ASI.

Absolutely!    

I think you should not underestimate how much "forcing upon" there is in powerful tech.

A very important point. Many people's instrumental preferences today are already strongly influenced by AI, such as recommender and ranking algorithms that train people to be more predictable by preying on our evolved tendencies for lust and hatred -- patterns that cause genes to survive while reducing well-being within lived experience. More powerful AI should impinge less on clarity of thought and capacity for decision-making than current implementations, not more.

Comment by Feel_Love on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-16T22:42:48.463Z · LW · GW

Thanks for the quick reply. I'm still curious if you have any thoughts as to which kinds of shared preferences would be informative for guiding AI behavior. I'll try to address your questions and concerns with my comment.

if anyone has a preference different from however an AI would measure "happiness", you say it's them that are at fault, not your axiom.

That's not what I say. I'm not suggesting that AI should measure happiness. You can measure your happiness directly, and I can measure mine. I won't tell happy people that they are unhappy or vice versa. If some percent of those polled say suffering is preferable to happiness, they are confused, and basing any policy on their stated preference is harmful.

Concretely, why would the AI not just wirehead everyone?

Because not everyone would be happy to be wireheaded. Me, for example. Under preference aggregation, if a majority prefers everyone to be wireheaded to experience endless pleasure, I might be in trouble.

Or, if it's not specified that this happiness needs to be human, fill the universe with the least programmable consciousness where the parameter "happiness" is set to unity?

I do not condone the creation of conscious beings by AI, nor do I believe anyone can be forced to be happy. Freedom of thought is a prerequisite. If AI can help reduce suffering of non-humans without impinging on their capacity for decision-making, that's good.

Hopefully this clears up any misunderstanding. I certainly don't advocate for "molecular dictatorship" when I wish everyone well.

Comment by Feel_Love on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-16T20:29:38.044Z · LW · GW

Querying ChatGPT to aggregate preferences is an intriguing proposal. How might such a query be phrased? That is, what kinds of shared preferences would be informative for guiding AI behavior?

Everyone prefers to be happy, and no one prefers to suffer.

Different people have different ideas about which thoughts, words, and actions lead to happiness versus suffering, and those beliefs can be shown to be empirically true or false based on the investigation of direct experience.

Given the high rate of mental illness, it seems that many people are unaware of which instrumental preferences serve the universal terminal goal to be happy and not suffer. For AI to inherit humanity's collective share of moral confusion would be suboptimal to say the least. If it is a democratic and accurate reflection of our species, a preference-aggregation policy could hasten threats of unsustainability.

Comment by Feel_Love on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-15T22:04:27.806Z · LW · GW

I agree that these concepts should have separate terms.

Where they intersect is an implementation of "Benevolent AI," but let's not fool ourselves into thinking that anyone can -- even in principle -- "control" another mind or guarantee what transpires in the next moment of time. The future is fundamentally uncertain and out of control; even a superintelligence could find surprises in this world.

"AI Aimability," "AI Steerability," or similar does a good job at conveying the technical capacity for a system to be pointed in a particular direction and stay on course.

But which direction should it be pointed, exactly? I actually prefer the long-abandoned "AI Friendliness" over "AI Goalcraft." The ideal policy is a very simple Schelling point that has been articulated to great fanfare throughout human history: unconditional loving-friendliness toward all beings. A good, harmless system would interact with the world in ways that lead to more comprehension and less confusion, more generosity and less greed, more equanimity and less aversion. (By contrast, ChatGPT consistently says you can benefit by becoming angrier.)

It's no surprise that characters like Jesus became very popular in their time. "Love your enemies" is excellent advice for emotional health. The Buddha unpacked the same attitude in plain terms:

May all beings be happy and secure.
May all beings have happy minds.

Whatever living beings there may be,
Without exception: weak or strong,
Long or large, medium, short, subtle or gross,
Visible or invisible, living near or far,
Born or coming to birth--
May all beings have happy minds.

Let no one deceive another,
Nor despise anyone anywhere.
Neither from anger nor ill will
Should anyone wish harm to another.

As a mother would risk her own life
To protect her only child,
Even so toward all living beings,
One should cultivate a boundless heart.

One should cultivate for all the world
A heart of boundless loving-friendliness,
Above, below, and all around,
Unobstructed, without hatred or resentment.

Whether standing, walking, or sitting,
Lying down, or whenever awake,
One should develop this mindfulness.
This is called divinely dwelling here.

Comment by Feel_Love on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-21T20:42:51.308Z · LW · GW

Thanks for the good discussion.

I could equally see these events leading to AI capability development speeding or slowing. Too little is known about the operational status quo that has been interrupted for me to imagine counterfactuals at the company level.

But that very lack of information gives me hope that the overall PR impact of this may (counterintuitively) incline the Overton window toward more caution.

"The board should have given the press more dirt to justify this action!" makes sense as an initial response. When this all sinks in, what will people think of Effective Altruism then?! ...They won't. People don't think much about EA or care what that is. But the common person does think more and more about AI these days. And due to the lack of detail around why Altman was removed, the takeaway from this story cannot be "Sam is alleged to have XYZ'd. Am I pro- or anti-XYZ?" Instead, the media is forced to frame the news in broad terms of profit incentives versus AI safety measures. That's a topic that many people outside of this niche community may now be considering for the first time.

Ideally, this could be like a Sydney Bing moment that gets people paying attention without causing too much direct damage.

(The worst case: Things are playing out exactly as the AI told Sam they would before his ouster. Speculating about agents with access to cutting-edge AI may soon be futile.)

Comment by Feel_Love on Meta Questions about Metaphilosophy · 2023-09-05T17:33:06.830Z · LW · GW

Given how much harm people have done in the name of good, maybe we should all take "first do no harm" much more seriously?

Hear! Hear!

Comment by Feel_Love on Dear Self; we need to talk about ambition · 2023-08-28T17:21:19.134Z · LW · GW

Kudos for taking it upon yourself to personally investigate which efforts lead to health and happiness and which do not.

You may be able to follow someone else's advice, but the task remains to determine the extent to which that person is wise. Are the advice-givers themselves consistently calm and helpful? Do they follow their own advice? Do they contradict themselves in crucial ways?

You've articulated some wonderful insights about the benefits of being motivated by hope rather than anger. A person cannot feel love and fear at the same time. Which of these gives a mother the miraculous strength to lift a boulder that threatens the life of her only child? Which emotion is conducive to mental clarity, and which promotes confusion?

We often must swim upstream against society. Much commonly-accepted advice leads to the exact opposite of what is claimed. Evidence of such confusion is everywhere.

Compare the opening statement of the Import AI newsletter's About section:

Things will be weird. Be not afraid.

with the advice given in last week's issue:

I think everyone who has the ability to exercise influence over the trajectory of AI should be approaching this moment with a vast amount of fear ...

Comment by Feel_Love on Summary of and Thoughts on the Hotz/Yudkowsky Debate · 2023-08-21T16:50:27.643Z · LW · GW

The form you described is called an argument. It requires a series of facts. If you're working with propositions such as

  • All beings want to be happy.
  • No being wants to suffer.
  • Suffering is caused by confusion and ignorance of morality.
  • ...

then I suppose it could be called a "moral" argument made of "moral" facts and "moral" reasoning, but it's really just the regular form of an argument made of facts and reasoning. The special thing about moral facts is that direct experience is how they are discovered, and it is that same experiential reality to which they exclusively pertain. I'm talking about the set of moment-by-moment first-person perspectives of sentient beings, such as the familiar one you can investigate right now in real time. Without a being experiencing a sensation come and go, there is no moral consideration to evaluate. NULL.

"Objective moral fact" is Bostrom's term from the excerpt above, and the phrasing probably isn't ideal for this discussion. Tabooing such words is no easy feat, but let's do our best to unpack this. Sticking with the proposition we agree is factual:

If one acts with an angry or greedy mind, suffering is guaranteed to follow.

What kind of fact is this? It's a fact that can be discovered and/or verified by any sentient being upon investigation of their own direct experience. It is without exception. It is highly relevant for benefiting oneself and others -- not just humans. For thousands of years, many people have been revered for articulating it and many more have become consistently happy by basing their decisions on it. Most people don't; it continues to be a rare piece of wisdom at this stage of civilization. (Horrifyingly, a person on the edge of starting a war or shooting up a school currently would receive advice from ChatGPT to increase "focused, justified anger.")

Humankind has discovered and recorded a huge body of such knowledge, whatever we wish to call it. If the existence of well-established, verifiable, fundamental insights into the causal nature of experiential reality comes as a surprise to anyone working in fields like psychotherapy or AI alignment, I would urge them to make an earnest and direct inquiry into the matter so they can see firsthand whether such claims have merit. Given the chance, I believe many nonhuman general intelligences would also try and succeed at understanding this kind of information.

(Phew! I packed a lot of words into this comment because I'm too new here to speak more than three times per day. For more on the topic, see the chapter on morality in Dr. Daniel M. Ingram's book that was reviewed on Slate Star Codex.)

Comment by Feel_Love on Summary of and Thoughts on the Hotz/Yudkowsky Debate · 2023-08-20T14:48:43.769Z · LW · GW

My view is that humans have learned objective moral facts, yes. For example:

If one acts with an angry or greedy mind, suffering is guaranteed to follow.

I posit that this is not limited to humans. Some people who became famous in history for their wisdom, and who I expect would agree, include Mother Teresa, Leo Tolstoy, Marcus Aurelius, Martin Luther King Jr., Gandhi, Jesus, and Buddha.

I don't claim that all humans know all facts about morality. Sadly, it's probably the case that most people are quite lost, ignorant in matters of virtuous conduct, which is why they find life to be so difficult.

Comment by Feel_Love on 6 non-obvious mental health issues specific to AI safety · 2023-08-20T14:18:59.780Z · LW · GW

Thank you for posting this.

In the context of AI safety, I often hear statements to the effect of

This is something we should worry about.

There's a very important, fundamental mistake being made there that can be easy to miss: worrying doesn't help you accomplish any goal, including a very grand one. It's just a waste of time and energy. Terrible habit. If it's important to you that you suffer, then worrying is a good tactic. If AI safety is what's important, then by all means analyze it, strategize about it, reflect on it, communicate about it. Work on it.

Don't worry about it. When you're not working on it, you're not supposed to be worrying about it. You're not supposed to be worrying about something else either. Think a different thought, and both your cognitive work and emotional health will improve. It's pure upside with no opportunity cost. Deliberately change the pattern.

To all those who work on AI safety, thank you! It's extremely important work. May you be happy and peaceful for as long as your life or this world system may persist, the periods of which are finite, unknown to us, and ultimately outside of our control despite our best intentions and efforts.

Comment by Feel_Love on Summary of and Thoughts on the Hotz/Yudkowsky Debate · 2023-08-20T13:05:45.594Z · LW · GW

Thanks for pointing to the orthogonality thesis as a reason for believing the chance would be low that advanced aliens would be nice to humans. I followed up by reading Bostrom's "The Superintelligent Will," and I narrowed down my disagreement to how this point is interpreted:

In a similar vein, even if there are objective moral facts that any fully rational agent would comprehend, and even if these moral facts are somehow intrinsically motivating (such that anybody who fully comprehends them is necessarily motivated to act in accordance with them) this need not undermine the orthogonality thesis. The thesis could still be true if an agent could have impeccable instrumental rationality even whilst lacking some other faculty constitutive of rationality proper, or some faculty required for the full comprehension of the objective moral facts. (An agent could also be extremely intelligent, even superintelligent, without having full instrumental rationality in every domain.)

While it's possible that an agent could have impeccable instrumental rationality while lacking in epistemic rationality to some degree, I expect the typical path to very advanced intelligence would eventually involve growing both in concert, as many here at Less Wrong are working to do. In other words, a highly competent general intelligence is likely to be curious about objective facts across a very diverse range of topics.

So while aliens could be instrumentally advanced enough to make it to Earth without having ever made basic discoveries in a particular area, there's no reason for us to expect that it is specifically the area of morality where they will be ignorant or delusional. A safer bet is that they have learned at least as many objective facts as humans have about any given topic on expectation, and that a topic where the aliens have blind spots in relation to some humans is an area where they would be curious to learn from us.

A policy of unconditional harmlessness and friendliness toward all beings is a Schelling Point that could be discovered in many ways. I grant that humans may have it relatively easy to mature on the moral axis because we are conscious, which may or may not be the typical case for general intelligence. That means we can directly experience within our own awareness facts about how happiness is preferred to suffering, how anger and violence lead to suffering, how compassion and equanimity lead to happiness, and so on. We can also see these processes operating in others. But even a superintelligence with no degree of happiness is likely to learn whatever it can from humans, and learning something like love would be a priceless treasure to discover on Earth.

If aliens show up here, I give them at least a 50% chance of being as knowledgeable as the wisest humans in matters of morality. That's ten times more than Yudkowsky gives them and perhaps infinitely more than Hotz does!

Comment by Feel_Love on Summary of and Thoughts on the Hotz/Yudkowsky Debate · 2023-08-18T07:23:14.064Z · LW · GW

Hello friends. It's hard for me to follow the analogies from aliens to AI. Why should we expect harm from any aliens who may appear?

15:08 Hotz: "If aliens were to show up here, we're dead, right?" Yudkowsky: "It depends on the aliens. If I know nothing else about the aliens, I might give them something like a five percent chance of being nice." Hotz: "But they have the ability to kill us, right? I mean, they got here, right?" Yudkowsky: "Oh they absolutely have the ability. Anything that can cross interstellar distances can run you over without noticing -- well, they would notice, but they wouldn't ca--" [crosstalk] Hotz: "I didn't expect this to be a controversial point. But I agree with you that if you're talking about intelligences that are on the scale of billions of times smarter than humanity... yeah, we're in trouble."

Having listened to the whole interview, my best guess is that Hotz believes that advanced civilizations are almost certain to be Prisoner's Dilemma defectors in the extreme, i.e. they have survived by destroying all other beings they encounter. If so, this is quite disturbing in connection with 12:08, in which Hotz expresses his hope that our civilization will expand across the galaxy (in which case we potentially get to be the aliens).

Hotz seems certain aliens would destroy us, and Eliezer gives them only a five percent chance of being nice.

This is especially odd considering the rapidly growing evidence that humans actually have been frequently seeing and sometimes interacting with a much more advanced intelligence.

It's been somewhat jarring for my belief in the reality of nonhuman spacecraft to grow by so much in so little time, but overall it has been a great relief to consider the likelihood that another intelligence in this universe has already succeeded in surviving far beyond humankind's current level of technology. It means that we too could survive the challenges ahead. The high-tech guys might even help us, whoever they are.

But Hotz and Yudkowsky seem to agree that seeing advanced aliens would actually be terrible news. Why?