Comments

Comment by astridain (aristide-twain) on Lack of Social Grace Is an Epistemic Virtue · 2023-07-31T21:30:01.143Z · LW · GW

That might be a fault with my choice of example. (I am not in fact a master of etiquette.) But I'm sure examples can be supplied where "the polite thing to say" is a euphemism that you absolutely do expect the other person to understand. At a certain level of obviousness and ubiquity, they tend to shift into figures of speech. “Your loved one has passed on” instead of “your loved one is dead”, say.

And yes, that was a typo. Your way of expressing it might be considered an example of such unobtrusive politeness. My guess is that you said “I assume that's just a slip” not because you have assigned noteworthy probability-mass to the hypothesis “astridain had a secretly brilliant reason for saying the opposite of what you'd expect and I just haven't figured it out”, but because it's nicer to pretend to care about that possibility than to bluntly say “you made an error”. It reduces the extent to which I feel stupid in the moment; and it conveys a general outlook of your continuing to treat me as a worthy conversation partner; and that's how I understand the note. I don't come away with a false belief that you were genuinely worried about the possibility that there was a brilliant reason I'd reversed the pronouns and you couldn't see it. You didn't expect me to, and you didn't expect anyone to. It's just a graceful way of correcting someone.

Comment by astridain (aristide-twain) on Lack of Social Grace Is an Epistemic Virtue · 2023-07-31T19:15:49.642Z · LW · GW

Some of it might be actual-obfuscation if there are other people in the room, sure. But equally-intelligent equally-polite people are still expected to dance the dance even if they're alone. 

Your last paragraph gets at what I think is the main thing, which is basically just an attempt at kindness. You find a nicer, subtler way to phrase the truth in order to avoid shocking/triggering the other person. If both people involved were idealised Bayesian agents this would be unnecessary, but idealised Bayesian agents don't have emotions, or at any rate they don't have emotions about communication methods. Humans, on the other hand, often do; and it's often not practical to try and train ourselves out of them completely; and even if it were, I don't think it's ultimately desirable. Idiosyncratic, arbitrary preferences are the salt of human nature; we shouldn't be trying to smooth them out, even if they're theoretically changeable to something more convenient. That way lies wireheading.

Comment by astridain (aristide-twain) on Lack of Social Grace Is an Epistemic Virtue · 2023-07-31T17:56:45.054Z · LW · GW

I think this misses the extent to which a lot of “social grace” doesn't actually decrease the amount of information conveyed; it's purely aesthetic — it's about finding comparatively more pleasant ways to get the point across. You say — well, you say “I think she's a little out of your league” instead of saying “you're ugly”. But you expect the ugly man to recognise the script you're using, and grok that you're telling him he's ugly! The same actual, underlying information is conveyed!

The cliché with masters of etiquette is that they can fight subtle duels of implied insults and deferences, all without a clueless shmoe who wandered into the parlour even realising. The kind of politeness that actually impedes transmission of information is a misfire; a blunder. (Though in some cases it's the person who doesn't get it who would be considered “to blame”.)

Obviously it's not always like this. And rationalists might still say “why are we spending all this brainpower encrypting our conversations just so that the other guy can decrypt them again? it's unnecessary at best”. But I don't grant your premise that social grace is fundamentally about actual obfuscation rather than pretend-obfuscation.

Comment by astridain (aristide-twain) on Cosmopolitan values don't come free · 2023-06-02T18:08:33.869Z · LW · GW

 My guess is mostly that the space is so wide that you don't even end up with AIs warping existing humans into unrecognizable states, but do in fact just end up with the people dead

Why? I see a lot of opportunities for s-risk or just generally suboptimal future in such options, but "we don't want to die, or at any rate we don't want to die out as a species" seems like an extremely simple, deeply-ingrained goal that almost any metric by which the AI judges our desires should be expected to pick up, assuming it's at all pseudokind. (In many cases, humans do a lot to protect endangered species even as we do diddly-squat to fulfill individual specimens' preferences!) 

Comment by astridain (aristide-twain) on My Assessment of the Chinese AI Safety Community · 2023-04-26T10:13:25.424Z · LW · GW

It's about trade-offs. HPMOR/an equally cringey analogue will attract a certain sector of weird people into the community who can then be redirected towards A.I. stuff — but it will repel a majority of novices because it "taints" the A.I. stuff with cringiness by association.

This is a reasonable trade-off if:

  1. the kind of weird people who'll get into HPMOR are also the kind of weird people who'd be useful to A.I. safety;
  2. the normies were already likely to dismiss the A.I. stuff with or without the added load of cringe.

In the West, 1. is true because there's a strong association between techy people and niche fandom, so even though weird nerds are a minority, they might represent a substantial fraction of the people you want to reach. And 2. is kind of true for a related reason, which is that "nerds" are viewed as generally cringe even if they don't specifically talk about HP fanfiction; it's already assumed that someone who thinks about computers all day is probably the kind of cringe who'd be big into a semi-self-insert HP fanfiction.

But in China, from @Lao Mein's testimony, 1. is definitely not true (a lot of the people we want to reach would be on Team "this sounds weird and cringe, I'm not touching it") and 2. is possibly not true (if computer experts ≠ fandom nerds in Chinese popular consciousness, it may be easier to get broad audiences to listen to a non-nerdy computer expert talking about A.I.). 

Comment by astridain (aristide-twain) on The Kids are Not Okay · 2023-03-08T20:45:25.873Z · LW · GW

If I was feeling persistently sad or hopeless and someone asked me for the quality of my mental health, and I had the energy to reply, I would reply ‘poor, thanks for asking.’

I wouldn't, not if I was in fact experiencing a rough enough patch of life that I rationally and correctly believed these feelings to be accurate. If I had been diagnosed with terminal cancer, for example, I would probably say that I was indeed sad and hopeless, but not that I had any mental health issues; indeed I'd be concerned about my mental health if I wasn't feeling that way. I find that this extends to believing the future in general is screwed, rather than just your personal future (take A.I. doomerism: I think Eliezer is fairly sad and hopeless, and I don't think he'd say that makes him mentally ill). So if 13% of the kids genuinely believe to some degree that their personal life sucks and will realistically always suck, and/or that the world is doomed by whatever combination of climate change and other known or perceived x-risks, that would account for this, surely?

Comment by astridain (aristide-twain) on The Limit of Language Models · 2022-12-25T15:42:21.188Z · LW · GW

At a guess, focusing on transforming information from images and videos into text, rather than generating text qua text, ought to help — no? 

Comment by astridain (aristide-twain) on Applying superintelligence without collusion · 2022-11-22T13:36:55.534Z · LW · GW

We maybe need an introduction to all the advance work done on nanotechnology for everyone who didn't grow up reading "Engines of Creation" as a twelve-year-old or "Nanosystems" as a twenty-year-old.

Ah. Yeah, that does sound like something LessWrong resources have been missing, then — and not just for my personal sake. Anecdotally, I've seen several why-I'm-an-AI-skeptic posts circulating on social media in which "EY makes crazy leaps of faith about nanotech" was a key reason for rejecting the overall AI-risk argument.

(As it stands, my objection to your mini-summary would be that, sure, "blind" grey goo does trivially seem possible, but programmable/'smart' goo that seeks out e.g. computer CPUs in particular could be a whole other challenge, and one that looks less obviously solvable judging by bacteria. But maybe that "common-sense" distinction dissolves with a better understanding of the actual theory.)

Comment by astridain (aristide-twain) on Applying superintelligence without collusion · 2022-11-20T18:32:08.923Z · LW · GW

Hang on — how confident are you that this kind of nanotech is actually, physically possible? Why? In the past I've assumed that you used "nanotech" as a generic hypothetical example of technologies beyond our current understanding that an AGI could develop and use to alter the physical world very quickly. And it's a fair one as far as that goes; a general intelligence will very likely come up with at least one thing as good as these hypothetical nanobots. 

But as a specific, practical plan for what to do with a narrow AI, this just seems like it makes a lot of specific unstated assumptions about what you can in fact do with nanotech in particular. Plausibly the real technologies you'd need for a pivotal act can't be designed without thinking about minds. How do we know otherwise? Why is that even a reasonable assumption?

Comment by astridain (aristide-twain) on It’s Probably Not Lithium · 2022-09-25T16:29:44.157Z · LW · GW

Slightly boggling at the idea that nuts and eggs aren't tasty? And I completely lose the plot at "condiments". Isn't the whole point of condiments that they are tasty? What sort of definition of "tasty" are you going with?

Comment by astridain (aristide-twain) on All AGI safety questions welcome (especially basic ones) [July 2022] · 2022-08-02T00:00:20.648Z · LW · GW

Yes, I agree. This is why I said "I don't think this is correct". But unless you specify this, I don't think a layperson would guess this.

Comment by astridain (aristide-twain) on Moral strategies at different capability levels · 2022-07-28T01:03:16.255Z · LW · GW

Thank you! This is helpful. I'll start with the bit where I still disagree and/or am still confused, which is the future people. You write:

The reductio for caring more about future peoples' agency is in cases where you can just choose their preferences for them. If the main thing you care about is their ability to fulfil their preferences, then you can just make sure that only people with easily-satisfied preferences (like: the preference that grass is green) come into existence.

Sure. But also, if the main thing you care about is their ability to be happy, you can just make sure that only people whom green grass sends to the heights of ecstasy come into existence? This reasoning seems like it proves too much. 

I'd guess that your reply is going to involve your kludgier, non-wireheading-friendly idea of "welfare". And that's fair enough in terms of handling this kind of dilemma in the real world; but running with a definition of "welfare" that smuggles in that we also care about agency a bit… seems, to me, like it muddles the original point of wanting to cleanly separate the three "primary colours" of morality.

That aside:

Re: animals, I think most of our disagreement just dissolves into semantics. (Yay!) IMO, keeping animals away from situations which they don't realize would kill them just falls under the umbrella of using our superior knowledge/technology to help them fulfill their own extrapolated preference to not-get-run-over-by-a-car. In your map this is probably taken care of by your including some component of agency in "welfare", so it all works out.

Re: caring about paperclip maximizers: intuitively I care about creatures' agencies iff they're conscious/sentient, and I care more if they have feelings and emotions I can grok. So, I care a little about the paperclip-maximizers getting to maximize paperclips to their heart's content if I am assured that they are conscious; and I care a bit more if I am assured that they feel what I would recognise as joy and sadness based on the current number of paperclips. I care not at all otherwise.

Comment by astridain (aristide-twain) on Moral strategies at different capability levels · 2022-07-27T22:24:36.095Z · LW · GW

I like this breakdown! But I have one fairly big asterisk — so big, in fact, that I wonder if I'm misunderstanding you completely.

Care-morality mainly makes sense as an attitude towards agents who are much less capable than you - for example animals, future people, and people who aren’t able to effectively make decisions for themselves.

I'm not sure animals belong on that list, and I'm very sure that future people don't. I don't see why it should be more natural to care about future humans' happiness than about their preferences/agency (unless, of course, one decides to be that breed of utilitarian across the board, for present-day people as well as future ones). 

Indeed, the fact that one of the futures we want to avoid is one of future humans losing all control over their destiny, and instead being wireheaded to one degree or another by a misaligned A.I., handily demonstrates that we don't think about future-people in those terms at all, but in fact generally value their freedom and ability to pursue their own preferences, just as we do our contemporaries'. 

(As I said, I also disagree with taking this approach for animals. I believe that insofar as animals have intelligible preferences, we should try to follow those, not perform naive raw-utility calculations — so that e.g. the question is not whether a creature's life is "worth living" in terms of a naive pleasure/pain ratio, but whether the animal itself seems to desire to exist. That being said, I do know that a nonzero number of people in this community have differing intuitions on this specific question, so it's probably fair game to include in your descriptive breakdown.)

Comment by astridain (aristide-twain) on All AGI safety questions welcome (especially basic ones) [July 2022] · 2022-07-19T13:32:03.721Z · LW · GW

The common-man's answer here would presumably be along the lines of "so we'll just make it illegal for an A.I. to control vast sums of money long before it gets to owning a trillion — maybe an A.I. can successfully pass off as an obscure investor when we're talking tens of thousands or even millions, but if a mysterious agent starts claiming ownership of a significant percentage of the world GDP, its non-humanity will be discovered and the appropriate authorities will declare its non-physical holdings void, or repossess them, or something else sensible".

To be clear, I don't think this is correct, but this is a step you would need to have an answer for.

Comment by astridain (aristide-twain) on Research Notes: What are we aligning for? · 2022-07-09T01:05:43.109Z · LW · GW

"Self-improvement" is one of those things which most humans can nod along to, but only because we're all assigning different meanings to it. Some people will read "self-improvement" and think self-help books, individual spiritual growth, etc.; some will think "transhumanist self-alteration of the mind and body"; some will think "improvement of the social structure of humanity even if individual humans remain basically the same"; etc. 

It looks like a non-controversial thing to include on the list, but that's basically an optical illusion. 

For those same reasons, it is much too broad to be programmed into an AGI as-is without horrifying consequences. The A.I. settling on "maximise human biological self-engineering" and deciding to nudge extremist eugenicists into positions of power is, like, one of the optimistic scenarios for how well that could go. I'm sure you can theoretically define "self-improvement" in ways that don't lead to horrifying scenarios, but then we're just back to Square 1 of having to think harder about what moral parameters to set rather than boiling it all down to an allegedly "simple" goal like "human self-improvement".

Comment by astridain (aristide-twain) on Convince me that humanity *isn’t* doomed by AGI · 2022-04-20T17:42:57.664Z · LW · GW

I agree the point as presented by OP is weak, but I think there is a stronger version of this argument to be made. I feel like there are a lot of world-states where A.I. is badly-aligned but non-murderous simply because it's not particularly useful to it to kill all humans.

Paperclip-machine is a specific kind of alignment failure; I don't think it's hard to generate utility functions orthogonal to human concerns that don't actually require the destruction of humanity to implement. 

The scenario I've been thinking about the most lately is an A.I. that learns how to "wirehead itself" by spoofing its own reward function during training, and whose goal is just to continue to do that indefinitely. But more generally, the "you are made of atoms and these atoms could be used for something else" cliché is based on an assumption that the misaligned A.I.'s faulty utility function is going to involve maximizing the number of atoms arranged in a particular way, which I don't think is obvious at all. Very possible, don't get me wrong, but not a given.

Of course, even an A.I. with no "primary" interest in altering the outside world is still dangerous, because if it estimates that we might try to turn it off, it might expend energy now on acting in the real world to secure its valuable self-wireheading peace later. But that whole "it doesn't want us to notice it's useless and press the off-button" class of A.I.-decides-to-destroy-humanity scenarios is predicated on us having the ability to turn off the A.I. in the first place. 

(I don't think I need to elaborate on the fact that there are a lot of ways for a superintelligence to ensure its continued existence other than planetary genocide — after all, it's already a premise of most A.I. doom discussion that we couldn't turn an A.I. off again even if we do notice it's going "wrong".)

Comment by astridain (aristide-twain) on Lies Told To Children · 2022-04-17T22:44:12.881Z · LW · GW

I don't think those are contradictory? It can both be "there would be value drift" and "this might be quite bad, actually". Anyway, whatever the actual spirit of that bit in TWC, that doesn't change my question of wanting some clarity on whether the worse bits of Dath Ilan are intended in the same spirit.

Comment by astridain (aristide-twain) on Lies Told To Children · 2022-04-16T21:34:54.597Z · LW · GW

Quite a good story. But I think at this point I would quite like Eliezer to make some sort of statement about to what degree he endorses Dath Ilan, ethically speaking.  As a fictional setting it's a great machine for fleshing out thought experiments, of course, but it seems downright dystopian in many ways. 

(I mean, the fact that they're cryopreserving everyone and have AGI under control means they're morally "preferable" to Earth, but that's sort of a cheat. For example, you could design an alt. history where the world is ruled by a victorious Third Reich, but where Hitler got super into cryopreservation in his old age, and poured a lot of resources and authority into getting the populations under his control to accept it too. Probably in the long run that world is "preferable" to the world where the Allies win but billions more brains rot — but it's still not much of a utopia in any useful sense.)

Of course, Eliezer previously stuck the non-consensual-sex thing in Three Worlds Collide, as an attempt to simulate the “future societies will likely trivialize things we consider unthinkable and there's no way to tell what” effect. I suspect — hope? — some of the ickier parts of Dath Ilan are in the same boat. 

However, there are also many elements of Dath Ilan that are obviously supposed to come across as straightforwardly aspirational. And I would rather like some clarity on what bits are meant to ring how, both to get a fuller experience out of reading the existing Dath Ilan stuff and any further entries, and to improve my model of Eliezer's utility function. 

…Oh, by the way, I'm posting this here, because I hadn't previously realized Eliezer was going to keep returning to Dath Ilan, but the central premise of this one isn't really what sticks out to me the most, "are we sure we want a world like this?"-wise. Mostly it was some of the stuff in the original Q&A. 

Comment by astridain (aristide-twain) on Lies Told To Children · 2022-04-16T21:13:31.492Z · LW · GW

I have a slightly different perspective on this — I don't know how common this is, but looking back on my feelings on Santa Claus as a young child, they had more to do with belief-in-belief than with an "actual" belief in an "actual" Santa. It was religious faith as I understand it; I wanted, vaguely, to be the sort of kid who believed in Santa Claus; I looked for evidence that Santa Claus was real, for theories of how he could be real even if magic wasn't. So the lesson it taught me when I stopped believing in the whole thing was more of an insight about what it was like inside religious people's heads.

Comment by astridain (aristide-twain) on MIRI announces new "Death With Dignity" strategy · 2022-04-04T15:57:04.744Z · LW · GW

Most fictional characters are optimised to make for entertaining stories, hence why "generalizing from fictional evidence" is usually a failure-mode. The HPMOR Harry and the Comet King were optimized by two rationalists as examples of rationalist heroes — and are active in allegorical situations engineered to say something that rationalists would find to be “of worth” about real world problems. 

They are appealing precisely because they encode assumptions about what a real-world, rationalist “hero” ought to be like. Or at least, that's the hope. So, they can be pointed to as “theses” about the real world by Yudkowsky and Alexander, no different from blog posts that happen to be written as allegorical stories, and if people found the ideas encoded in those characters more convincing than the ideas encoded in the present April Fools' Day post, that's fair enough. 

Not necessarily correct on the object-level, but, if it's wrong, it's a different kind of error from garden-variety “generalizing from fictional evidence”.

Comment by astridain (aristide-twain) on Duplication versus probability · 2018-06-24T12:51:25.489Z · LW · GW

We don't actually know the machine works more than once, do we? It creates "a" duplicate of you "when" you pull the lever. That doesn't necessarily imply that it outputs additional duplicates if you keep pulling the lever. Maybe it has a limited store of raw materials to make the duplicates from, who knows.

Besides, I was just munchkinning myself out of a situation where a sentient individual has to die (i.e. a version of myself). Creating an army up there may have its uses but does not relate to the solving of the initial problem. Unless we are proposing the army make a human ladder? Seems unpleasant.

Comment by astridain (aristide-twain) on Wirehead your Chickens · 2018-06-20T16:55:24.418Z · LW · GW

But you make it sound as though these people are objectively “wrong”, as if they're *trying* to actually reduce animal suffering in the absolute but end up working on the human proxy because of a bias. That may be true of some, but surely not all. What ozymandias was, I believe, trying to express, is that some of the people who'd reject your solutions consciously find them ethically unacceptable, not merely recoil from them because they'd *instinctively* be against their being used on humans.

Comment by astridain (aristide-twain) on Duplication versus probability · 2018-06-20T16:39:25.100Z · LW · GW

Being a hopeless munchkin, I will note that the thought experiment has an obvious loophole: for the choice to truly be a choice, we would have to assume, somewhat arbitrarily, that using the duplication lever will disintegrate the machinery. Else, you could pull the lever to create a duplicate who'll deliver the message, and *then* the you at the bottom of the well could rip up the machinery and take their shot at climbing up.