Comments
(This comment is written in the ChatGPT style because I've spent so much time talking to language models.)
Calculating the probabilities
The calculation of the probabilities consists of the following steps:
The epistemic split
Either we guessed the digit correctly (probability 1/10; branch A), or we didn't (probability 9/10; branch B).
The computational split
On branch A, all of your measure survives (branch A1) and none dies (branch A2); on branch B, a fraction s of your measure survives (branch B1) and the rest, 1 − s, dies (branch B2), with s < 1.
Putting it all together
Conditional on us subjectively surviving (which QI guarantees), the probability we guessed the digit correctly is, by Bayes' theorem, P(correct | survived) = (1/10 · 1) / (1/10 · 1 + 9/10 · s), which is greater than 1/10 whenever s < 1.
The probability of us having guessed the digit correctly prior to us surviving is, of course, just 1/10.
Verifying them empirically
For the probabilities to be meaningful, they need to be verifiable empirically in some way.
Let's first verify that prior to us surviving, the probability of us guessing the digit correctly is 1/10. We'll run many experiments, guessing a digit each time and instantly verifying it. We'll learn that we're successful, indeed, just 1/10 of the time.
Let's now verify that conditional on us surviving, the probability of having guessed correctly is the higher, post-survival value. We perform the experiment many times again, and this time, every time we survive, other people check whether the guess was correct. They will observe that, among the runs we survive, we guessed correctly more often than 1/10 of the time, matching the conditional probability above.
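Here is a minimal Monte Carlo sketch of both checks, assuming we model a run as: guess a decimal digit (correct with probability 1/10) and, on a wrong guess, survive only with some fraction of measure. The parameter name S_WRONG, its value, and the run count are illustrative placeholders, not parameters from the original setup.

```python
import random

# Minimal Monte Carlo sketch of both checks above. The survival fraction on a
# wrong guess (S_WRONG) and the run count are illustrative placeholders.

N_RUNS = 1_000_000
S_WRONG = 0.01  # hypothetical fraction of measure surviving a wrong guess

correct_total = 0
survived = 0
correct_given_survived = 0

for _ in range(N_RUNS):
    correct = random.random() < 0.1                  # 1/10 chance of guessing the digit
    alive = correct or (random.random() < S_WRONG)   # wrong guess: survive only with S_WRONG
    correct_total += correct
    if alive:
        survived += 1
        correct_given_survived += correct

print("P(correct), unconditional:", correct_total / N_RUNS)             # ~0.1
print("P(correct | survived):   ", correct_given_survived / survived)   # ~0.1 / (0.1 + 0.9 * S_WRONG)
```

The first printed number sits near 1/10; the second sits near (1/10) / (1/10 + 9/10 · s), which is the conditional probability from the calculation above.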
Conclusion
We arrived at the conclusion that the probability jumps at the moment of our awakening. That might sound incredibly counterintuitive, but since it's verifiable empirically, we have no choice but to accept it.
Since that argument doesn't give any testable predictions, it cannot be disproved.
The argument that we cease to exist every time we go to sleep also can't be disproved, so I wouldn't personally lose much sleep over that.
I don't know about similarity... but I was just making a point that QI doesn't require it.
When you die, you die.
The interesting part of QI is that the split happens at the moment of your death. So the state-machine-which-is-you continues being instantiated in at least one world. The idea of your consciousness surviving a quantum suicide doesn't rely on it continuing in implementations of similar state machines, merely in the causal descendant of the state machine which you already inhabit.
It's like your brain being duplicated, but those other copies are never woken up and are instantly killed. Only one copy is woken up. Which guarantees that prior to falling asleep, you can be confident you will wake up as that one specific copy.
There is no alternative to this, unless we hold that personal identity requires something other than the continuity of pattern.
Yes. If I relied on losing a bet and someone knew that, their offering me the bet (and therefore the loss) would make me wary that something would unpredictably go right, I'd win, and my reliance on losing the bet would be thwarted.
If I meet a random person who offers to give me $100 now and claims that later, if it's not proven that they are the Lord of the Matrix, I don't have to pay them $15,000, most of my probability mass on "this will end badly" won't be on "they are the Lord of the Matrix." I don't have the same set of worries here, but the worry remains.
I use Google Chrome on Ubuntu Budgie and it does look to me like both the font and the font size changed.
Character AI used to be extremely good back in December 2022/January 2023, with the bots being very helpful, complex and human-like, rather than exacerbating psychological problems in a very small minority of users. As months passed and the user base grew exponentially, the models were gradually simplified to keep up.
Today, their imperfections are obvious, but many people mistakenly interpret them as the models being too human-like (and therefore harmful), rather than too oversimplified while still passing for an AI (and therefore harmful).
I think we're spinning on an undefined term. I'd bet there are LOTS of details that affect my perception in subtle and aggregate ways which I don't consciously identify.
You're equivocating between perceiving a collection of details and consciously identifying every separate detail.
If I show you a grid of 100 pixels, then (barring imperfect eyesight) you will consciously perceive all 100 of them. But you will not consciously identify every individual pixel unless your attention is aimed at each pixel in a for loop (which would take longer than consciously perceiving the entire grid at once).
There are lots of details that affect your perception that you don't consciously identify. But there is no detail that affects your perception that wouldn't be contained in your consciousness (otherwise it, by definition, couldn't affect your perception).
Computability shows that you can have a classical computer that has the same input/output behavior
That's what I mean (I'm talking about the input/output behavior of individual neurons).
Input/Output behavior is generally not considered to be enough to guarantee same consciousness
It should be, because it is, in fact, enough. (However, neither the post, nor my comment require that.)
Eliezer himself argued that GLUT isn't conscious.
Yes, and that's false (but since that's not the argument in the OP, I don't think I should get sidetracked).
But nonetheless, if the only formalized proposal for consciousness doesn't have the property that simulations preserve consciousness, then clearly the property is not guaranteed.
That's false. If we assume for a second that IIT really is the only formalized theory of consciousness, it doesn't follow that the property is not, in fact, guaranteed. It could also be that IIT is wrong and that in the actual reality, the property is, in fact, guaranteed.
so the idea is that you can describe the brain by treating each neuron as a little black box about which you just know its input/output behavior, and then describe the interactions between those little black boxes. Then, assuming you can implement the input/output behavior of your black boxes with a different substrate (i.e., an artificial neuron)
This is guaranteed, because the universe (and any of its subsets) is computable (that means a classical computer can run software that acts the same way).
And there are orders of magnitude more detail going on in my body (and even just in my brain) than I perceive, let alone that I communicate.
There are no sentient details going on that you wouldn't perceive.
It doesn't matter whether you communicate something; the important part is that you are capable of communicating it, which means that it changes your input/output pattern (if it didn't, you wouldn't be capable of communicating it even in principle).
Circular arguments that "something is discussed, therefore that thing exists"
This isn't the argument in the OP (even though, when reading quickly, I can see how someone could get that impression).
(Because of the Hayflick limit, only some cell lines can go on indefinitely.)
If the SB always guesses heads, she'll be correct 1/3 of the time. For that reason, that is her credence.
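A minimal sketch of that frequency count, assuming the standard setup (heads: one awakening, tails: two awakenings); the trial count is arbitrary:

```python
import random

# Per-awakening frequency with which "heads" is the right guess in the
# standard Sleeping Beauty setup (heads: one awakening, tails: two).
N_TRIALS = 100_000
correct_awakenings = 0
total_awakenings = 0

for _ in range(N_TRIALS):
    heads = random.random() < 0.5
    n_awakenings = 1 if heads else 2
    total_awakenings += n_awakenings
    if heads:
        correct_awakenings += n_awakenings  # she says "heads" at every awakening

print(correct_awakenings / total_awakenings)  # ~1/3
```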
Are the ‘AI companion’ apps, or robots, coming? I mean, yes, obviously?
The technology for bots who are "better" than humans in some way (constructive, pro-social, compassionate, intelligent, caring interactions while thinking 2 levels meta) has been around since 2022. But the target group wouldn't pay enough for GPT-4-level inference, so current human-like bots are significantly downscaled compared to what technology allows.
To consciously take in a piece of information, you don't have to store any bits - you only have to map the correct input to the correct output. (By logical necessity, any transformation that preserves the input/output relationship preserves consciousness.)
Unless you can summarize your argument in at most 2 sentences (with evidence), it's completely ignorable.
This is not how learning any (even slightly complex) topic works.
When I skipped my medication whose withdrawal symptom is strong anxiety, my brain always generated a nightmare to go along with the anxiety, working backwards in the same way.
Edit: Oh, never mind, that's not what you mean.
That wouldn't help. Then the utility would be calculated from (getting two golden bricks) and (murdering my child for a fraction of a second), which still brings lower utility than not following the command.
The set of possible commands for which I can't be maximally rewarded still remains too vast for the statement to be meaningful.
I see your argument. You are saying that "maximal reward", by definition, is something that gives us the maximum utility from all possible actions, and so, by definition, it is our purpose in life.
But actually, utility is a function of both the reward (getting two golden bricks) and what it rewards (murdering my child), not merely a function of the reward itself (getting two golden bricks).
And so it happens that for many possible demands that I could be given ("you have to murder your child"), there are no possible rewards that would give me more utility than not obeying the command.
For that reason, simply because someone will maximally reward me for obeying them doesn't make their commands my objective purpose in life.
Of course, we can respond "but then, by definition, they aren't maximally rewarding you" and by that definition, it would be a correct statement to make. The problem here is that the set of all possible commands for which I can't (by that definition) be maximally rewarded is so vast that the statement "if someone maximally rewards/punishes you, their orders are your purpose of life" becomes meaningless.
How does someone punishing you or rewarding you make their laws your purpose in life (other than you choosing that you want to be rewarded and not punished)?
Either we define "belief" as a computational state encoding a model of the world containing some specific data, or we define "belief" as a first-person mental state.
For the first definition, both we and p-zombies believe we have consciousness. So we can't use our belief we have consciousness to know we're not p-zombies.
For the second definition, only we believe we have consciousness. P-zombies have no beliefs at all. So for the second definition, we can use our belief we have consciousness to know we're not p-zombies.
Since we have a belief in the existence of our consciousness according to both definitions, but p-zombies only according to the first definition, we can know we're not p-zombies.
This is incorrect - in a p-zombie, the information processing isn't accompanied by any first-person experience. So if p-zombies are possible, the p-zombie and I both do the same information processing, but only I am conscious. The p-zombie doesn't believe it's conscious, it only acts that way.
You correctly believe that having the correct information processing always goes hand in hand with believing in consciousness, but that's because p-zombies are impossible. If they were possible, this wouldn't be the case, and we would have special access to the truth that p-zombies lack.
What an undignified way to go.
Ideally, AI characters would get rights as soon as they could pass the Turing test. In the actual reality, we all know how well that will go.
This mindset has a failure mode of no longer being sensitive to the oughts and only noticing descriptive facts about the world.
Christopher Hitchens, who tried waterboarding because he wasn't sure it was torture, wanted to stop almost instantly and was permanently traumatized, concluding it was definitely torture.
There is absolutely no way anyone would voluntarily last 3 minutes unless they simply hold their breath the entire time.
To run with the spirit of your question:
Assume the Dust Theory is true (i.e., the continuity of our experience is maintained purely by there being, somewhere, a next state of the state-machine-which-is-us). That next state doesn't need to be causally connected to your current state. So far so good.
What if there is more than one such subsequent state in the universe? No problem so far. Our measure just splits, and we roll the dice on where we'll find ourselves (it's a meaningless question whether the split happens at the moment of the spatial or the computational divergence).
But what if something steals our measure this way? What if, while sleeping, our sleeping state is instantiated somewhere else (thereby stealing 50% of our measure) and never reconnects to the main computational stream instantiated in our brain (so every time we dream, we toss a coin to jump somewhere else and never come back)?
One obvious solution is to say that our sleeping self isn't us. It's another person whose memories are dumped into our brain upon awakening. This goes well with our sleeping self acting differently than us and often having entirely different memories. In that case, there is no measure stealing going on, because the sleeping stream of consciousness happening in our brain isn't ours.
The reliability of general facts could be checked by various benchmarks. The unreliability of specific studies and papers by personal experience, and by experiences of people I've read online.
I don't understand why, except maybe rephrasing a true fact keeps it true, but rephrasing a study title and a journal title makes it false.
Yes, but that very same process has a high probability of producing correct facts (today's LLMs are relatively reliable) and a very low probability of producing correct studies or papers.
LLMs hallucinate studies/papers so regularly you're lucky to get a real one. That doesn't have an impact on the truth of the facts they claimed beforehand. (Also, yes, Claude 3 Haiku is significantly less intelligent than 3.5 Sonnet.)
Then the problem is that you can't make bets and check your calibration, not that some people will arrive at the wrong conclusion, which is inevitable with probabilistic reasoning.
Would you say that the continuity of your consciousness (as long as you're instantiated by only one body) only exists by consensus?
What if the consensus changed? Would you cease to have the continuity of consciousness?
If the continuity of your consciousness currently doesn't depend on consensus, why think that your next conscious experience is undefined in case of a duplication? (Rather than, let's say, assigning even odds to finding yourself to be either copy?)
Also, I see no reason for thinking the idea of your next subjective experience being undefined (there being no fact of the matter as to which conscious experience, if any, you'll have) is even a coherent possibility. It's clear what it would mean for your next conscious experience to be something specific (like feeling pain while seeing blue). It's also clear what it would mean for it to be NULL (like after a car accident). But it being undefined doesn't sound like a coherent belief.
It's been some time since models have become better than the average human at understanding language.
The central error of this post lies in the belief that we don't persist over time. All other mistakes follow from this one.
Well, a thing that acts like us in one particular situation (say, a thing that types "I'm conscious" in chat) clearly doesn't always have our qualia. Maybe you could say that a thing that acts like us in all possible situations must have our qualia?
Right, that's what I meant.
This is philosophically interesting!
Thank you!
It makes a factual question (does the thing have qualia right now?) logically depend on a huge bundle of counterfactuals, most of which might never be realized.
The I/O behavior being the same is a sufficient condition for it to be our mind upload. A sufficient condition for it to have some qualia, as opposed to having our mind and our qualia, will be weaker.
What if, during uploading, we insert a bug that changes our behavior in one of these counterfactuals
Then it's, to a very slight extent, another person (with the continuum between me and another person being gradual).
but then the upload never actually runs into that situation in the course of its life - does the upload still have the same qualia as the original person, in situations that do get realized?
Then the qualia would be very slightly different, unless I'm missing something. (To bootstrap the intuition, I would expect my self that chooses vanilla ice cream over chocolate ice cream in one specific situation to have very slightly different feelings and preferences in general, resulting in very slightly different qualia, even if he never encounters that situation.) With many such bugs, it would be the same, but to a greater extent.
If there's a thought that you sometimes think, but it doesn't influence your I/O behavior, it can get optimized away
I don't think such thoughts exist (I can always be asked to say out loud what I'm thinking). Generally, I would say that a thought that never, even in principle, influences my output, isn't possible. (The same principle should apply to trying to replace a thought just by a few bits.)
This is not an obviously possible failure mode of uploads - it would require that you get uploaded correctly, but the computer doesn't feed you any sensory input and just keeps running your brain without it. Why would something like that happen?
It seems we cannot allow all behavior-preserving optimizations
We can use the same thought experiments that Chalmers uses to establish that a fine-grained functionally isomorphic copy has the same qualia, modify them, and show that anything that acts like us has our qualia.
The LLM character (rather than the LLM itself) will be conscious to the extent to which its behavior is I/O identical to the person.
Edit: Oh, sorry, this is an old comment. I got this recommended... somehow...
Edit2: Oh, it was curated yesterday.
There is no dependency on any specific hardware.
What's conscious isn't the mathematical structure itself but its implementation.
Check if it's not 4o - they've rolled it out for some/all users and it's used by default.
"we need to have the beginning of a hint of a design for a system smarter than a house cat"
You couldn't make a story about this, I swear.
Great article.
The second rule is to ask for permission.
Is this supposed to be "The second rule is to ask for forgiveness."?
Check out this page, it goes up to 2024.
Nobody would understand that.
This sort of saying-things-directly doesn't usually work unless the other person feels the social obligation to parse what you're saying to the extent they can't run away from it.
Correct me if I'm wrong, but I think we could apply the concept of logical uncertainty to metaphysics and then use Bayes' theorem to update depending on where our metaphysical research takes us, the way we can use it to update the probability of logically necessarily true/false statements.
Bayes' theorem is about the truth of propositions. Why couldn't it be applied to propositions about ontology?
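As a minimal sketch of what that would look like mechanically (the prior and likelihood values below are made-up placeholders, and `bayes_update` is just a hypothetical helper name):

```python
# Minimal sketch of a Bayesian update on a metaphysical/logical proposition H.
# The prior and likelihoods are made-up placeholders; the point is only that
# the update rule doesn't care what kind of proposition H is.

def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H | E) from P(H), P(E | H), and P(E | not H)."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence

# Start at 0.5 and observe an argument we'd expect to encounter more often
# if H were true than if it were false.
posterior = bayes_update(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.4)
print(posterior)  # 2/3
```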
However, this image is obviously optimized to be scary and disgusting. It looks dangerous, with long rows of sharp teeth. It is an eldritch horror. It's at this point that I'd like to point out the simple, obvious fact that "we don't actually know how these models work, and we definitely don't know that they're creepy and dangerous on the inside."
It's optimized to illustrate the point that the neural network isn't trained to actually care about what the person training it thinks it came to care about, it's only optimized to act that way on the training distribution. Unless I'm missing something, arguing the image is wrong would be equivalent to arguing that maybe the model truly cares about what its human trainers want it to care about. (Which we know isn't actually the case.)
Well. Their actual performance is human-like, as long as they're using GPT-4 and have the right prompt. I've talked to such bots.
In any case, the topic is about what future AIs will do, so, by definition, we're speculating about the future.
They're accused, not whistleblowers. They can't retroactively gain the right to anonymity, since their identities have already been revealed.
They could argue that they became whistleblowers as well, and so they should be retroactively anonymized, but that would interfere with the first whistleblowing accusation (there is no point in whistleblowing against anonymous people), and also they're (I assume) in a position of comparative power here.
There could be a second whistleblowing accusation made by them (but this time anonymously) against the (this time) deanonymized accuser, but given their (I assume) higher social power, that doesn't seem appropriate.
I agree that it feels wrong to reveal the identities of Alice and/or Chloe without concrete evidence of major wrongdoing, but I don't think we have a good theoretical framework for why that is.
Ethically (and pragmatically), you want whistleblowers to have the right to anonymity, or else you'll learn of much less wrongdoing than you would otherwise; and because whistleblowers are (usually) in a position of lower social power, anonymity is meant to compensate for that, I suppose.
I will not "make friends" with an appliance.
That's really substratist of you.
But in any case, the toaster (working in tandem with the LLM "simulating" the toaster-AI-character) will predict that and persuade you some other way.