post by [deleted]


comment by aysja · 2024-04-09T09:17:56.495Z · LW(p) · GW(p)

Something feels very off to me about these kinds of speciesist arguments. Like the circle of moral concern hasn’t expanded, but imploded, rooting out the very center from which it grew. Yes, there is a sense in which valuing what I value is arbitrary and selfish, but concluding that I should completely forego what I value seems pretty alarming to me, and, I would assume, to most other humans who currently exist.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2024-04-09T10:08:56.576Z · LW(p) · GW(p)

concluding that I should completely forego what I value seems pretty alarming to me

I did not conclude this. I generally don't see how your comment directly relates to my post. Can you be more specific about the claims you're responding to?

Replies from: D0TheMath
comment by Garrett Baker (D0TheMath) · 2024-04-10T00:16:56.118Z · LW(p) · GW(p)

This view seems implicit in your dismissal of "human species preservationism". If instead you described that view as "the moral view that values love, laughter, happiness, fun, family, and friends", I'm sure Aysja would be less alarmed by your rhetoric (but perhaps more horrified you're willing to so casually throw away such values).

As it is, you're ready to casually throw away such values, without even acknowledging what you're throwing away, lumping it all unreflectively under "speciesism", which I do think is rhetorical cause for alarm.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2024-04-10T01:47:21.704Z · LW(p) · GW(p)

I suspect you fundamentally misinterpreted my post. When I used the term "human species preservationism", I was not referring to the general valuing of positive human experiences like love, laughter, happiness, fun, family, and friendship. Instead, I was drawing a specific distinction between two different moral views:

  1. The view that places inherent moral value on the continued existence of the human species itself, even if this comes at the cost of the wellbeing of individual humans.
  2. The view that prioritizes improving the lives of humans who currently exist (and will exist in the near future), but does not place special value on the abstract notion of the human species continuing to exist for its own sake.

Both of these moral views are compatible with valuing love, happiness, and other positive human experiences. The key difference is that the first view would accept drastically sacrificing the wellbeing of currently existing humans if doing so even slightly reduced the risk of human extinction, while the second view would not.

My intention was not to dismiss or downplay the importance of various values, but instead to clarify our values by making careful distinctions. It is reasonable to critique my language for being too dry, detached, and academic when these are serious topics with real-world stakes. But to the extent you're claiming that I am actually trying to dismiss the value of happiness and friendships, that was simply not part of the post.

Replies from: D0TheMath
comment by Garrett Baker (D0TheMath) · 2024-04-10T07:39:23.635Z · LW(p) · GW(p)

My intention was not to dismiss or downplay the importance of various values, but instead to clarify our values by making careful distinctions. It is reasonable to critique my language for being too dry, detached, and academic when these are serious topics with real-world stakes. But to the extent you're claiming that I am actually trying to dismiss the value of happiness and friendships, that was simply not part of the post.

I can't (and didn't) speak to your intention, but I can speak to the results, which are that you do in fact downplay the importance of values such as love, laughter, happiness, fun, family, and friendship in favor of values like the maximization of pleasure, preference-satisfaction, and short-term increases in wealth & life-spans. I can tell because you talk of the latter, but not of the former.

And regardless of your intention, you do also dismiss their long-term value, by decrying as "speciesist" those who place their long-term value above all else.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2024-04-10T08:33:11.729Z · LW(p) · GW(p)

you do in fact downplay the importance of values such as love, laughter, happiness, fun, family, and friendship in favor of values like the maximization of pleasure, preference-satisfaction [...] I can tell because you talk of the latter, but not of the former.

This seems like an absurd characterization. The concepts of pleasure and preference satisfaction clearly subsume, at least in large part, values such as happiness and fun. The fact that I did not mention each of the values you name individually does not in any way imply that I am downplaying them. Should I have listed every conceivable thing that people might value, just to avoid this particular misinterpretation?

Even if I were downplaying these values, which I was not, it would hardly matter at all to the substance of the essay, since my explicit arguments are independent of the mere vibe you get from reading it. LessWrong is supposed to be a place for thinking clearly and analyzing arguments based on their merits, not for analyzing whether authors are using rhetoric that feels "alarming" to one's values (especially when, read carefully, the rhetoric is not in fact alarming in the sense described).

comment by Unnamed · 2024-04-08T22:14:04.988Z · LW(p) · GW(p)

Building a paperclipper is low-value (from the point of view of total utilitarianism, or any other moral view that wants a big flourishing future) because paperclips are not sentient / are not conscious / are not moral patients / are not capable of flourishing. So filling the lightcone with paperclips is low-value. It maybe has some value for the sake of the paperclipper (if the paperclipper is a moral patient, or whatever the relevant category is) but way less than the future could have.

Your counter is that maybe building an aligned AI is also low-value (from the point of view of total utilitarianism, or any other moral view that wants a big flourishing future) because humans might not much care about having a big flourishing future, or might even actively prefer things like preserving nature. 

If a total utilitarian (or someone who wants a big flourishing future in our lightcone) buys your counter, it seems like the appropriate response is: Oh no! It looks like we're heading towards a future that is many orders of magnitude worse than I hoped, whether or not we solve the alignment problem. Is there some way to get a big flourishing future? Maybe there's something else that we need to build into our AI designs, besides "alignment". (Perhaps mixed with some amount of: Hmmm, maybe I'm confused about morality. If AI-assisted humanity won't want to steer towards a big flourishing future then maybe I've been misguided in having that aim.)

Whereas this post seems to suggest the response of: Oh well, I guess it's a dice roll regardless of what sort of AI we build. Which is giving up awfully quickly, as if we had exhausted the design space for possible AIs and seen that there was no way to move forward with a large chance at a big flourishing future. This response also doesn't seem very quantitative - it goes very quickly from the idea that an aligned AI might not get a big flourishing future, to the view that alignment is "neutral" as if the chances of getting a big flourishing future were identically small under both options. But the obvious question for a total utilitarian who does wind up with just 2 options, each of which is a dice roll, is Which set of dice has better odds?

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2024-04-08T22:40:46.481Z · LW(p) · GW(p)

Whereas this post seems to suggest the response of: Oh well, I guess it's a dice roll regardless of what sort of AI we build. Which is giving up awfully quickly, as if we had exhausted the design space for possible AIs and seen that there was no way to move forward with a large chance at a big flourishing future.

I dispute that I'm "giving up" in any meaningful sense here. I'm happy to consider alternative proposals for how we could make the future large and flourishing from a total utilitarian perspective rather than merely trying to solve technical alignment problems. The post itself was simply intended to discuss the moral implications of AI alignment (itself a massive topic), but it was not intended to be an exhaustive survey of everything we can do to make the future go better. I agree we should aim high, in any case.

This response also doesn't seem very quantitative - it goes very quickly from the idea that an aligned AI might not get a big flourishing future, to the view that alignment is "neutral" as if the chances of getting a big flourishing future were identically small under both options. But the obvious question for a total utilitarian who does wind up with just 2 options, each of which is a dice roll, is Which set of dice has better odds?

I don't think this choice is literally a coin flip in expected value, and I agree that one might lean in one direction over the other. However, I think it's quite hard to quantify this question meaningfully. My personal conclusion is simply that I am not swayed in any particular direction on this question; I am currently suspending judgement. I think one could reasonably still think it's more like a 60-40 thing than a 40-60 thing or a 50-50 coin flip. But in this case, I wanted to let my readers decide for themselves which of these numbers they want to take away from what I wrote, rather than trying to pin down a specific number for them.
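
(A minimal way to make the "which dice has better odds" framing explicit, using placeholder symbols rather than any numbers from the post: let $p_i$ be the probability of a big flourishing future under option $i$, $V_{\text{big}}$ the value of that future, and $V_{\text{small}}$ the value of the fallback outcome. Then

$$\mathbb{E}[U \mid \text{option } i] = p_i\, V_{\text{big}} + (1 - p_i)\, V_{\text{small}},$$

so the two dice rolls differ in expected value by $(p_1 - p_2)(V_{\text{big}} - V_{\text{small}})$, which is enormous for a total utilitarian whenever $V_{\text{big}} \gg V_{\text{small}}$, even if the probability gap is only 0.2, i.e. a 60-40 versus a 40-60 split.)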

comment by ryan_greenblatt · 2024-04-08T18:28:21.261Z · LW(p) · GW(p)

I think this post misses the key considerations for perspective (1): longtermist-style scope sensitive utilitarianism. In this comment, I won't make a positive case for the value of preventing AI takeover from a perspective like (1), but I will argue why I think the discussion in this post mostly misses the point.

(I separately think that preventing unaligned AI control of resources makes sense from perspective (1), but you shouldn't treat this comment as my case for why this is true.)

You should treat this comment as (relatively :)) quick and somewhat messy notes rather than a clear argument. Sorry, I might respond to this post in a clearer way later. (I've edited this comment to add some considerations which I realized I neglected.)

I might be somewhat biased in this discussion as I work in this area and there might be some sunk cost fallacy at work.

(This comment is cross-posted from the EA forum. Matthew responded there, so consider going there for the response.)

First:

Argument two: aligned AIs are more likely to have a preference for creating new conscious entities, furthering utilitarian objectives

It seems odd to me that you don't focus almost entirely on this sort of argument when considering total-utilitarian-style arguments. Naively, these views are fully dominated by the creation of new entities, who are far more numerous and likely could be much more morally valuable than economically productive entities. So I'll just be talking about a perspective like this, where creating new beings with "good" lives dominates.

With that in mind, I think you fail to discuss a large number of extremely important considerations from my perspective:

  • Over time (some subset of) humans (and AIs) will reflect on their views and preferences and will consider utilizing resources in different ways.
  • Over time (some subset of) humans (and AIs) will get much, much smarter or, more minimally, receive advice from entities which are much smarter.
  • It seems likely to me that the vast, vast majority of moral value (from this sort of utilitarian perspective) will be produced by people deliberately trying to produce moral value rather than incidentally via economic production. This applies for both aligned and unaligned AI. I expect that only a tiny fraction of available computation goes toward optimizing economic production, that only a smaller fraction of this is morally relevant, and that the weight on this moral relevance is much lower than for computation specifically optimized for moral value when operating from a similar perspective. This bullet is somewhere between a consideration and a claim, though it seems like possibly our biggest disagreement. I think it's possible that this disagreement is driven by some of the other considerations I list.
  • Exactly what types of beings are created might be much more important than quantity.
  • Ultimately, I don't care about a simplified version of total utilitarianism; I care about what preferences I would endorse on reflection. There is a moderate a priori argument for thinking that other humans who bother to reflect on their preferences might end up in a similar epistemic state. And I care less about the preferences which are relatively contingent among people who are thoughtful about reflection.
  • Large fractions of the current wealth of the richest people are devoted to what they claim is altruism. My guess is that this will increase over time.
  • Just doing a trend extrapolation on people who state an interest in reflection and scope-sensitive altruism already indicates a non-trivial fraction of resources if we weight by current wealth/economic power. (I think; I'm not totally certain here.) This case is even stronger if we consider groups with substantial influence over AI.
  • Being able to substantially affect the preferences of (at least partially unaligned) AIs that will seize power/influence still seems extremely leveraged under perspective (1), even if we accept the arguments in your post. I think this is less leveraged than retaining human control (as we could always later create AIs with the preferences we desire, and I think people with a similar perspective to me will have substantial power). However, it is plausible that under your empirical views the dominant question in being able to influence the preferences of these AIs is whether you have power, not whether you have technical approaches which suffice.
  • I think if I had your implied empirical views about how humanity and unaligned AIs use resources, I would be very excited about a proposal like "politically agitate for humanity to defer most resources to an AI successor which has moral views that people can agree are broadly reasonable and good behind the veil of ignorance". I think your views imply that massive amounts of value are left on the table in either case, such that humanity (hopefully willingly) forfeiting control to a carefully constructed successor looks amazing.
  • Humans who care about using vast amounts of computation might be able to use their resources to buy this computation from people who don't care. Suppose 10% of people (really, resource-weighted people) care about reflecting on their moral views and doing scope-sensitive altruism of a utilitarian bent, and 90% care about jockeying for status without reflecting on their views. It seems plausible to me that the 90% will jockey for status via things that consume relatively small amounts of computation, like buying fancier pieces of land on Earth or the coolest-looking stars, while the 10% who care about using vast amounts of computation can buy it relatively cheaply. Thus, most of the computation will go to those who care (see the toy sketch after this list). Probably most people who don't reflect and buy purely positional goods will care less about computation than about random positional goods (e.g. land on Earth, which will be bid up to (literally) astronomical prices). I could see fashion going either way, but computation becoming a dominant status good seems unlikely unless people do heavy reflection. And if they heavily reflect, then I expect more altruism, etc.
  • Your preference-based arguments seem uncompelling to me because I expect that the dominant source of beings won't be economic production. But I also don't understand what version of preference utilitarianism would match what you're describing, so this seems mostly unimportant.
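
A toy sketch of the computation-market point above (all numbers are hypothetical parameters chosen for illustration, not figures from the comment): if a fixed supply of computation is allocated by willingness to pay, a 10% wealth-weighted minority that bids most of its wealth on computation ends up with the overwhelming majority of it.

```python
# Toy allocation sketch. All parameters below are hypothetical, chosen only to
# illustrate the "most computation goes to those who care" argument.

TOTAL_COMPUTE = 1.0           # normalize the reachable computation to 1

minority_wealth = 0.10        # wealth share of reflective, scope-sensitive agents
majority_wealth = 0.90        # wealth share of status-oriented agents

# Assumed fraction of each group's wealth spent bidding on computation.
minority_bid_fraction = 0.90  # they mostly want computation
majority_bid_fraction = 0.01  # they mostly want positional goods (land, stars)

minority_bid = minority_wealth * minority_bid_fraction   # 0.09
majority_bid = majority_wealth * majority_bid_fraction   # 0.009

# At a single market-clearing price, computation splits in proportion to spending.
minority_share = TOTAL_COMPUTE * minority_bid / (minority_bid + majority_bid)
print(f"Share of computation bought by the 10%: {minority_share:.0%}")  # ~91%
```

Under these made-up numbers the 10% who care end up with roughly 91% of the computation; the qualitative conclusion only flips if the status-seeking majority spends a comparable fraction of its wealth on computation, i.e. if computation itself becomes the dominant status good.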

Given some of our main disagreements, I'm curious what you think humans and unaligned AIs will be economically consuming.

Also, to be clear, none of the considerations I listed make a clear and strong case for unaligned AI being less morally valuable, but they do make the case that the relevant argument here is very different from the considerations you seem to be listing. In particular, I think value won't be coming from incidental consumption.

Replies from: ryan_greenblatt
comment by ryan_greenblatt · 2024-04-08T21:38:35.359Z · LW(p) · GW(p)

One additional meta-level point which I think is important: I think that existing writeups of why human control would have more moral value than unaligned AI control from a longtermist perspective are relatively weak, and specific writeups are often highly flawed. (For some discussion of flaws, see this sequence.)

I just think that this write-up misses what seem to me to be key considerations; I'm not claiming that existing work settles the question or is even robust at all.

And it's somewhat surprising and embarrassing that this is the state of the current work, given that longtermism is reasonably common and arguments for working on AI x-risk from a longtermist perspective are also common.

comment by JBlack · 2024-04-09T00:43:03.385Z · LW(p) · GW(p)

Unfortunately, I think all three of those listed points of view poorly encapsulate anything related to moral worth, and hence evaluating unaligned AIs from them is mostly irrelevant.

They do all capture some fragment of moral worth, and under ordinary circumstances are moderately well correlated with it, but the correlation falls apart out of the distribution of ordinary experience. Unaligned AGI expanding to fill the accessible universe is just about as far out of distribution as it is possible to get.

comment by NicholasKees (nick_kees) · 2024-04-09T10:32:56.329Z · LW(p) · GW(p)

How would (unaligned) superintelligent AI interact with extraterrestrial life? 

Humans, at least, have the capacity for this kind of "cosmopolitanism about moral value." Would the kind of AI that causes human extinction share this? It would be such a tragedy if the legacy of the human race is to leave behind a kind of life that goes forth and paves the universe, obliterating any and all other kinds of life in its path. 

comment by the gears to ascension (lahwran) · 2024-04-08T18:56:07.412Z · LW(p) · GW(p)

Why is consciousness relevant except that you value it? Of course, I do too, and I expect short-term AIs will as well. But why would you or I or they care about such a thing except because we happen to care about it? Would a starkly superintelligent system need to value it?

comment by quetzal_rainbow · 2024-04-08T21:01:01.548Z · LW(p) · GW(p)

The reason why unaligned AIs are more likely to be unconscious in the long term is that consciousness is not the most efficient way to produce paperclips. Even if the first paperclip-optimizer is conscious, it has no reason to keep consciousness once it finds a better way to produce paperclips without it.

Replies from: ryan_greenblatt
comment by ryan_greenblatt · 2024-04-08T21:21:14.967Z · LW(p) · GW(p)

This comment seems to presuppose that the things AIs want have no value from our perspective. This seems unclear, and this post partially argues against it (or at least against this being true relative to the bulk of human resource use).

I do agree with the claim that negligible value will be produced "in the minds of laboring AIs" relative to value from other sources.

Replies from: quetzal_rainbow
comment by quetzal_rainbow · 2024-04-09T08:23:36.598Z · LW(p) · GW(p)

I'm talking about probabilities. Aligned AIs want things that we value in 100% of cases, by definition. Unaligned AIs can want things that we value and things that we don't value at all. Even if we live in a very rosy universe where unaligned AIs want things that we value in 99% of cases, 99% is strictly less than 100%.

My general objection was to argumentation based on the likelihood of consciousness in AIs as they develop, without accounting for "what conscious AIs actually want to do with their consciousness", which can be far more important because the defining feature of intelligence is the ability to turn unlikely states into likely ones.

Replies from: ryan_greenblatt
comment by ryan_greenblatt · 2024-04-09T16:09:11.601Z · LW(p) · GW(p)

I think Matthew's thesis can be summarized as: from a scope-sensitive utilitarian perspective, AIs which are misaligned and seek power look about as aligned with you, in terms of their galactic resource utilization, as you are aligned with other humans (or more so).

I agree that if the AI were aligned with you, you would strictly prefer that.