Posts

Signer's Shortform 2024-03-01T19:25:46.989Z

Comments

Comment by Signer on When is a mind me? · 2024-04-19T13:57:20.076Z · LW · GW

If we live in naive MWI, an IBP agent would not care for good reasons, because naive MWI is a “library of babel” where essentially every conceivable thing happens no matter what you do.

Doesn't the frequency of amplitude-patterns change depending on what you do? So an agent can care about that instead of point-states.

Comment by Signer on When is a mind me? · 2024-04-18T07:46:26.101Z · LW · GW

In the case of teleportation, I think teleportation-phobic people are mostly making an implicit error of the form “mistakenly modeling situations as though you are a Cartesian Ghost who is observing experiences from outside the universe”, not making a mistake about what their preferences are per se.

Why not both? I can imagine that someone would be persuaded to accept teleportation/uploading if they stopped believing in a physical Cartesian Ghost. But it's possible that if you remind them that continuity of experience, like "table", is just a description of a physical situation and not a divinely blessed, necessary value, that would be enough to tip the balance toward them valuing carbon or whatever. It's bad to be wrong about Cartesian Ghosts, but it's also bad to think that you don't have a choice about how you value experience.

Comment by Signer on When is a mind me? · 2024-04-17T18:36:42.421Z · LW · GW

Analogy: When you’re writing in your personal diary, you’re free to define “table” however you want. But in ordinary English-language discourse, if you call all penguins “tables” you’ll just be wrong. And this fact isn’t changed at all by the fact that “table” lacks a perfectly formal physics-level definition.

You're also free to define "I" however you want in your values. You're only wrong if your definitions imply a wrong physical reality. But defining "I" and "experiences" in such a way that you will not experience anything after teleportation is possible without implying anything physically wrong.

You can be wrong about the physical reality of teleportation. But even after you have figured out that there is no additional physical process going on that kills your soul, except for the change of location, you still can move from "my soul crashes against an asteroid" to "soul-death in my values means sudden change in location" instead of to "my soul remains alive".

It's not like I even expect you specifically to mean "not liking teleportation is necessarily irrational" much. It's just that saying that there should be an actual answer to questions about "I" and "experiences" makes people moral-realist.

Comment by Signer on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-17T15:23:41.534Z · LW · GW

I'm asking: how do physicists in the laboratory know that their observations are sharp-valued and classical?

Comment by Signer on When is a mind me? · 2024-04-17T07:34:26.808Z · LW · GW

If we were just talking about word definitions and nothing else, then sure, define “self” however you want. You have the universe’s permission to define yourself into dying as often or as rarely as you’d like, if word definitions alone are what concerns you.

But this post hasn’t been talking about word definitions. It’s been talking about substantive predictive questions like “What’s the very next thing I’m going to see? The other side of the teleporter? Or nothing at all?”

There should be an actual answer to this, at least to the same degree there’s an answer to “When I step through this doorway, will I have another experience? And if so, what will that experience be?”

Why? If "I" is an arbitrary definition, then "When I step through this doorway, will I have another experience?" depends on this arbitrary definition and so is also arbitrary.

But I hope the arguments I’ve laid out above make it clear what the right answer has to be: You should anticipate having both experiences.

So you always anticipate all possible experiences, because of the multiverse? And if they are weighted, then wouldn't discovering that you are made of mini-yous change your anticipation even without changing your brain state?

Comment by Signer on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-16T18:30:53.340Z · LW · GW

What's the evidence for these "sharp-valued classical observations" being real things?

Comment by Signer on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-16T18:23:48.416Z · LW · GW

In particular, a many worlder has to discard unobserved results in the same way as a Copenhagenist—it's just that they interpret doing so as the unobserved results existing in another branch, rather than being snipped off by collapse.

A many-worlder doesn't have to discard unobserved results - you may care about other branches.

Comment by Signer on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-16T17:50:52.516Z · LW · GW

The wrong part is mostly in https://arxiv.org/pdf/1405.7577.pdf, but: indexical probabilities of being a copy are value-laden - it seems like the derivation first assumes that branching happens globally and then assumes that you are forbidden to count the different instantiations of yourself that were created by this global process.

Comment by Signer on Ackshually, many worlds is wrong · 2024-04-16T17:22:16.309Z · LW · GW

"The" was just me being bad in English. What I mean is:

  1. There is probably a way to mathematically model true stochasticity. Properly, not as many-worlds.
  2. Math being deterministic shouldn't be a problem, because the laws of a truly stochastic world are not stochastic themselves.
  3. I don't expect any such model to be simpler than the many-worlds model. And that's why you shouldn't believe in true stochasticity.
  4. If 1 is wrong and it's not possible to mathematically model true stochasticity, then it's even worse and I would question your assertion that true stochasticity is coherent.
  5. If you say that mathematical models turn out complex because deterministic math is an unnatural language for true stochasticity, then how do you compare them without math? The program that outputs an array is also simpler than the one that outputs one sample from that array (see the sketch below).
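A toy sketch of point 5, in Python (the specific functions and the 20-bit example index are my own illustration, just to make the Kolmogorov-style intuition concrete): the program that enumerates every outcome doesn't need to encode which outcome occurred, while the program that outputs one sample has to carry the index of that sample on top of the same enumeration logic, which in general costs about n extra bits.

```python
from itertools import product

n = 20  # length of the binary outcomes

def all_outcomes():
    # Enumerates every binary string of length n. Its description length is
    # roughly the size of this code plus a few bits for n - it does not grow
    # with the number of outcomes (2**n). Not called here, to keep output short.
    for bits in product("01", repeat=n):
        print("".join(bits))

def one_sample(index):
    # Outputs a single specific string. On top of the same enumeration logic
    # it must also encode `index`, which in general costs about n extra bits.
    for i, bits in enumerate(product("01", repeat=n)):
        if i == index:
            return "".join(bits)

print(one_sample(0b10110011100011110000))  # the index is the extra information
```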

How would you formulate this axiom?

Ugh, I'm bad at math. Let's say, given the space of outcomes O and a reality predicate R, the axiom would be ∃!o ∈ O: R(o), i.e. exactly one outcome is real.

Comment by Signer on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-16T16:41:13.269Z · LW · GW

Carroll's additional assumptions are not relied on by the MWI.

Comment by Signer on Ackshually, many worlds is wrong · 2024-04-16T16:14:55.493Z · LW · GW

I don't know, any model you like? Space of outcomes with "one outcome is real" axiom. The point is that I can understand the argument for why the true stochasticity may be coherent, but I don't get why it would be better.

Comment by Signer on Ackshually, many worlds is wrong · 2024-04-12T19:29:39.876Z · LW · GW

I disagree with this part—if Harry does the quantum equivalent of flipping an unbiased coin, then there’s a branch of the universe’s wavefunction in which Harry sees heads and says “gee, isn’t it interesting that I see heads and not tails, I wonder how that works, hmm why did my thread of subjective experience carry me into the heads branch?”, and there’s also a branch of the universe’s wavefunction in which Harry sees tails and says “gee, isn’t it interesting that I see tails and not heads, I wonder how that works, hmm why did my thread of subjective experience carry me into the tails branch?”. I don’t think either of these Harrys is “preferred”.

This is how it works in MWI without additional postulates. But if you postulate the probability that you will find yourself somewhere, then you are postulating the difference between the case where you have found yourself there, and the case where you haven't. Having a number for how much you prefer something is the whole point of indexical probabilities. And as the probability of some future "you" goes to zero, this future "you" goes to not being the continuation of your subjective experience, right? Surely that would make this "you" dispreferred in some sense?

Comment by Signer on Ackshually, many worlds is wrong · 2024-04-12T10:57:33.763Z · LW · GW
  1. such formalisms are unwieldy

Do you actually need any other reason to not believe in True Randomness?

  1. that’s just passing the buck to the one who interprets the formalism

Any argument is just passing the buck to the one who interprets the language.

Comment by Signer on Ackshually, many worlds is wrong · 2024-04-12T02:53:20.789Z · LW · GW

If the simplest assumption is that the world is just quantum mechanical

It isn't a simpler assumption? Mathematically, "one thing is real" is not simpler than "everything is real". And I wouldn't call a "philosophically, but not mathematically coherent" objection "technical"? Like, are you saying the mathematical model of true stochasticity (with some "one thing is real" formalization) is somehow incomplete or imprecise or wrong, because mathematics is deterministic? Because it's not like the laws of a truly stochastic world are themselves stochastic.

Comment by Signer on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-09T17:54:25.789Z · LW · GW

My intuition finds zero problem with many worlds interpretation.

Why do you care about the Born measure?

Comment by Signer on My intellectual journey to (dis)solve the hard problem of consciousness · 2024-04-07T08:50:10.843Z · LW · GW

this is in some sense the only thing I know for sure

You don't. All your specific experiences are imprecise approximations: you can't be sure what exact color you saw for how many nanoseconds, you can't be sure that all of your brain, except a small part implementing only the current thought, hasn't evaporated a microsecond ago. So you can have imprecise models of a fish brain the same way you have imprecise models of your brain - your awareness of your brain is causally connected to your brain the same way your thoughts can be causally connected to a fish brain. You just can't be fully fish.

Comment by Signer on My intellectual journey to (dis)solve the hard problem of consciousness · 2024-04-07T08:26:49.772Z · LW · GW

I would be curious to know what you know about my box trying to solve the meta-problem.

Sounds unethical. At least don't kill them afterwards.

Any conclusions would raise the usual questions about how much of the AI's reasoning is about real things and how much is about extrapolating human discourse. The actual implementation of this reasoning in an AI could be interesting, especially given that the AI would have different assumptions about its situation. But it wouldn't necessarily be the same as in a human brain.

Philosophically, I mostly don't see how that is different from introspecting your sensations and thoughts and writing an isomorphic Python program. I guess Chalmers may agree that we have as much evidence of AIs' consciousness as of other humans', but would still ask why the thing that implements this reasoning is not a zombie.

But the most fun to think about are cases where it wouldn't apparently solve the problem: like if the reasoning was definitely generated by a simple function over relevant words, but you still couldn't find where it differs from human reasoning. Or maybe the actual implementation would be so complex that humans couldn't comprehend it on a lower level than what we have now.

The justification for pruning this neuron seems to me to be that if you can explain basically everything without using a dualistic view, it is so much simpler.

Yeah, but can you? Your story ended with stating the meta-problem, so until it's actually solved, you can't explain everything. So how did you actually check that you would be able to explain everything once it's solved? Just stating the meta-problem of consciousness is like stating the meta-problem of why people talk about light and calling the idea of light "a virus".

Comment by Signer on My intellectual journey to (dis)solve the hard problem of consciousness · 2024-04-07T02:50:47.670Z · LW · GW

Sure, "everything is a cluster" or "everything is a list" is as right as "everything is emergent". But what's the actual justification for pruning that neuron? You can prune everything like that.

Great! This text by Yudkowsky has convinced me that the Philosophical Zombie thought experiment leads only to epiphenomenalism and must be avoided at all costs.

Do you mean that the original argument that uses zombies leads only to epiphenomenalism, or that if zombies were real that would mean consciousness is epiphenomenal, or what?

Comment by Signer on Beauty and the Bets · 2024-03-28T11:58:50.625Z · LW · GW

And the answer is no, you shouldn’t. But probability space for Technicolor Sleeping beauty is not talking about probabilities of events happening in this awakening, because most of them are illdefined for reasons explained in the previous post.

So probability theory can't possibly answer whether I should take free money, got it.

And even if "Blue" is "Blue happens during experiment", you wouldn't accept worse odds than 1:1 for Blue, even when you see Blue?

Comment by Signer on Beauty and the Bets · 2024-03-28T10:29:07.755Z · LW · GW

No, I mean the Beauty awakes, sees Blue, gets a proposal to bet on Red with 1:1 odds, and you recommend accepting this bet?

Comment by Signer on Beauty and the Bets · 2024-03-28T08:49:32.287Z · LW · GW

You observe outcome “Blue” which correspond to event “Blue or Red”.

So you bet 1:1 on Red after observing this “Blue or Red”?

Comment by Signer on Beauty and the Bets · 2024-03-28T07:34:35.144Z · LW · GW

mathematically sound

*ethically

Utility Instability under Thirdism

Works against Thirdism in the Fissure experiment too.

Technicolor Sleeping Beauty

I mean, if you are going to precommit to the right strategy anyway, why do you even need probability theory? The whole question is how you decide to ignore that P(Heads|Blue) = 1/3 when you chose Red and see Blue. And how is it not "a probabilistic model produces incorrect betting odds", when you need to precommit to ignore it?
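For concreteness, a minimal sketch of where that per-awakening 1/3 comes from, assuming the variant where the room's color is redrawn 50/50 each day (the code and parameter names are my own illustration; whether this per-awakening frequency is the right thing to call "probability" is exactly what's in dispute here):

```python
import random

def technicolor(trials=100_000):
    # Count Heads among awakenings in which the Beauty sees Blue.
    blue_heads, blue_total = 0, 0
    for _ in range(trials):
        heads = random.random() < 0.5
        days = 1 if heads else 2          # Heads: Monday only; Tails: Monday and Tuesday
        for _ in range(days):
            if random.random() < 0.5:     # assumption: the room is Blue with probability 1/2 each day
                blue_total += 1
                blue_heads += heads
    return blue_heads / blue_total

print(technicolor())  # ~0.33 - per-awakening frequency of Heads among Blue awakenings
```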

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-27T13:06:19.058Z · LW · GW

Somehow every time people talk about joints, it turns out to be more about naive intuitions of personal identity than about reality^^.

I don't see how it is possible in principle. If the Beauty is in the middle of an experiment, how can she start participating in another experiment without breaking the setting of the current one?

If you insist on Monday and Tuesday being in the same week, then by backing up her memory: after each awakening we save her memory and schedule the memory loading and the new experiment for a later free week. Or we can start a new experiment after each awakening and schedule the Tuesdays for later. Does either of these allow you to change your model?

In what sense is she the same person anyway if you treat any waking moment as a different person?

You can treat every memory sequence as a different person.

No, they are not. Events that happen to Beauty on Monday and Tuesday are not mutually exclusive because they are sequential. On Tails if an awakening happened to her on Monday it necessary means that an awakening will happen to her on Tuesday in the same experiment.

But the same argument isn’t applicable to fissure, where awakening in different Rooms are not sequential, and truly are mutually exclusive. If you are awaken in Room 1 you definetely are not awaken in Room 2 in this experiment and vice versa.

I'm not saying the arguments are literally identical.

Your argument is:

  1. The awakening on Tuesday happens always and only after the awakening on Monday.
  2. Therefore !(P(Monday) = 0 & P(Tuesday) = 1) & !(P(Monday) > 0 & P(Tuesday) < 1).
  3. Therefore they are not exclusive.

The argument about copies is:

  1. The awakening in Room 1 always happens and the awakening in Room 2 always happens.
  2. Therefore !(P(Room 1) < 1) & !(P(Room 2) < 1).
  3. Therefore they are not exclusive.

Why doesn't the second one work?

But not all definitions are made equal.

I agree, some are more preferable. Therefore probabilities depend on preferences.

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-21T12:22:32.232Z · LW · GW

I’m afraid I won’t be able to address your concerns without the specifics. Currently I’m not even sure that they are true. According to Wei Dai in one of a previous comments our current best theory claims that Everett branches are causally disconnected and I’m more than happy to stick to that until our theories change.

They are approximately disconnected according to our current best theory. Like your clones in different rooms are approximately disconnected, but still gravitationally influence each other.

You can participate in a thousand fissure experiment in a row and accumulate a list of rooms and coin outcomes corresponding to your experience and I expect them to fit Lewis’s model. 75% of time you find yourself in room 1, 50% of time the coin is Heads.

I still don't get how it's consistent with your argument about the statistical test. It's not about multiple experiments starting from each copy, right? You would still object to simulating multiple Beauties starting from each awakening as random? And would be OK with simulating multiple Fissures from one original as random?
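A minimal sketch of the frequencies quoted above, assuming (as Lewis's model does here, and this assumption is the contested part) that on Heads the single person is in Room 1 and that on Tails "you" are equally likely to be either copy; the code is just my illustration:

```python
import random

def fissure(trials=100_000):
    in_room_1, heads_count = 0, 0
    for _ in range(trials):
        heads = random.random() < 0.5
        heads_count += heads
        if heads:
            room = 1                      # assumption: on Heads the single person is placed in Room 1
        else:
            room = random.choice([1, 2])  # assumption: on Tails "you" are either copy with equal chance
        in_room_1 += (room == 1)
    return in_room_1 / trials, heads_count / trials

print(fissure())  # ~(0.75, 0.5): 75% of the time Room 1, 50% of the time Heads
```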

Because coexistence in space happens separately to different people who are not causally connected, while coexistence in one timeline happen to the same person, whose past and future are causally connected. I really don’t understand why everyone seem to have so much trouble with such an obvious point.

I understand that there is a difference. The trouble is with the justification for why this difference is relevant. Like, you based your modelling of Monday and Tuesday as both happening on how we usually treat events when we use probability theory. But the same justification is even more obvious when both the awakening in Room 1 and the awakening in Room 2 happen simultaneously. Or you say that the Beauty knows that she will be awake both times, so she can't ignore this information. But both copies also know that they both will be awake, so why can they ignore it?

If you participate in a Fissure experiment you do not experience being at two rooms on Tails. You are in only one of the rooms in any case, and another version of you is in another room when it’s Tails.

Is this what it is all about? It depends on the definition of "you". Under some definitions the Beauty also doesn't experience both days. Are you just saying that the distinction is that no sane human would treat different moments as distinct identities?

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-20T18:51:55.115Z · LW · GW

Can't we model interference as separate branches? My QM is a bit rusty, what kind of causal behaviour is implied? It's not that we can actually jump from one branch to the other.

Don't know the specifics, as usual, but as far as I know, the amplitudes of a branch would be slightly different from what you get by evolving this branch in isolation, because the other branch would also spread everywhere. The point is just that they all exist, so, as you say, why use an imperfect approximation?

Simultaneous of existence has nothing to do with it. Elga’s model is wrong here because unlike the Sleeping Beauty, learning that you are in Room 1 is evidence for Heads, as you could not be sure to find yourself in Room 1 no matter what. Here Lewis’ model seems a better fit.

I meant the experiment where you don't know which room it is, but anyway - wouldn't Lewis' model fail the statistical test, because it doesn't generate both rooms on Tails? I don't get why modeling coexistence in one timeline is necessary, but coexistence in space is not.

What do you mean by "can be correctly approximated as random sampling"? If all souls are instantiated, then Elga's model still wouldn't pass the statistical test.

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-14T16:07:59.128Z · LW · GW

Oh, right, I missed that your simulation has 1/3 Heads. Thank you for your patient cooperation in finding mistakes in your arguments, by the way. So, why is it OK for a simulation of an outcome with 1/2 probability to have 1/3 frequency? That sounds like a more serious failure of the statistical test.

Nothing out of the ordinary. The Beauty will generate the list with the same statistical properties. Two lists if the coin is Tails.

I imagined that the Beauty would sample just once. And then, if we combine all samples into a list, we will see that if the Beauty uses your model, the list will fail the "have the correct number of days" test.
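To make the disagreement concrete, a minimal sketch of the two counting conventions under the standard protocol (one awakening on Heads, two on Tails); the code is just my illustration of where the 1/2 and the 1/3 each come from:

```python
import random

def sleeping_beauty(trials=100_000):
    heads_experiments = 0
    awakenings = []                          # one entry per awakening: (coin, day)
    for _ in range(trials):
        heads = random.random() < 0.5
        heads_experiments += heads
        days = ["Monday"] if heads else ["Monday", "Tuesday"]
        for day in days:
            awakenings.append(("Heads" if heads else "Tails", day))
    per_experiment = heads_experiments / trials
    per_awakening = sum(coin == "Heads" for coin, _ in awakenings) / len(awakenings)
    return per_experiment, per_awakening

print(sleeping_beauty())  # ~(0.5, 0.33): Heads frequency per experiment vs per awakening
```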

Which is “Beauty is awakened today which is Monday” or simply “Beauty is awakened on Monday” just as I was saying.

They are not the same thing? The first one is false on Tuesday.

(I'm also interested in your thoughts about copies in another thread).

Comment by Signer on 'Empiricism!' as Anti-Epistemology · 2024-03-14T06:57:00.986Z · LW · GW

I agree that this should be said, but there is also actual disagreement about which theory is better.

Getting reliable 20% returns every year is really quite amazingly hard.

The foundations for analogous arguments about future AI systems are not sufficiently understood - I mean, maybe we can get a very capable system that optimises softly like current systems.

And then the AI companies, if they’re allowed to keep selling those—we have now observed—just brute-RLHF their models into not talking about that. Which means we can’t get any trustworthy observations of what later models would otherwise be thinking, past that point of AI company shenanigans.

This seems to me like the weakest point of the whole theory - models not only "don't talk" about wiping out humanity; they don't always kill you, even if you give them (or make them think they have) a real chance. Yes, it's not reliable. But the question is how much we should update from Sydney (which was mostly fixed) versus RLHF mostly working. And whether RLHF is actually changing thoughts or the model is secretly acting benevolent is an empirical question with different predictions - can't we just look at the weights?

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-10T14:57:25.499Z · LW · GW

What else does event “Monday” that has 2⁄3 probability means then?

It means "today is Monday".

I do not understand what you mean here. Beauty is part of the simulation. Nothing prevents any person from running the same code and getting the same results.

I mean, what will happen if the Beauty runs the same code? Like you said, "any person" - what if this person is the Beauty during the experiment? If we then compare the combined statistics, which model will be closer to reality?

Why would it?

My thinking is: because then the Beauty would experience more Tails, and the simulation would have to reproduce that.

How is definition of knowledge relevant to probability theory? I suppose, if someone redefines “knowledge” as “being wrong” then yes, in such definition the Beauty should not accept the correct model, but why would we do it?

The point of using probability theory is to be right. That's why your simulations have persuasive power. But a different definition of knowledge may value the average knowledge across the Beauty's awake moments instead of the knowledge of an outside observer.

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-09T10:25:24.862Z · LW · GW

This does not describe any experience of mine.

Sure, I'm not saying you are usually wrong about your sensations, but it still means there are physical conditions on your thoughts being right - when you are right about your sensation, you are right because that sensation influenced your thoughts. Otherwise being wrong about a past sensation wouldn't work. And if there are conditions, then they can be violated.

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-08T15:35:58.868Z · LW · GW

I can be wrong about what is happening to produce some sensation

And about having it in the past, and about which sensation you are having. To calibrate you about how unsurprising it should be.

Well, it's hard to give impressive examples in normal conditions - it's like asking to demonstrate a nuclear reaction with two sticks - the brain tries not to be wrong about stuff. Non-impressive examples include lying to yourself - deliberately thinking "I'm feeling warmth" and so on, when you know that you don't. Or answering "Yes" to "Are you feeling warm?" when you are distracted, and then realizing that no, you weren't really tracking your feelings at that moment. But something persistent that survives you actually querying the relevant parts of the brain, and without externally spoofing this connection... Something like reading that you are supposed to feel warmth when looking at kittens, believing it, but not actually feeling it?

I guess I'll go look at what people did with actual electrodes, if "you can misidentify a sensation, you can be wrong about it being present at any point in time, but you can't be wrong about having it now" still seems likely to you.

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-08T14:02:38.450Z · LW · GW

They are supposed to test consistency of beliefs. I mean, if you think some part of the experiment is impossible, like separating your thoughts from your experiences, say so. I just want to know what your beliefs are.

And the part about memory or colors is not a thought experiment but just an observation about reality? You do agree about that part, that whatever sensation you name, you can be wrong about having it, right?

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-08T09:42:00.689Z · LW · GW

I can be mistaken about the cause of a sensation of warmth, but not about the fact of having such a sensation.

That's incorrect, unless you make it an axiom. You do at least agree that you can be mistaken about having had a sensation in the past? But that implies that a sensation must actually modify your memory for you to be right about it. You also obviously can be mistaken about which sensation you are having - you can initially think that you are seeing 0x0000ff, but after a second conclude that no, it's actually 0x0000fe. And I'm not talking about the external cause of your sensations, I'm talking about you inspecting the sensations themselves.

In the case of consciousness, to speculate about some part not being what it seems is still to be conscious in making that speculation. There is no way to catch one’s own tail here.

You can speculate unconsciously. Like, if we isolate some part of your brain that makes you think "I can't be wrong about being conscious, therefore I'm conscious", put you in a coma and run just that thought, would you say you are not mistaken in that moment, even though you are in a coma?

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-07T18:33:17.320Z · LW · GW

No, I'm asking you to constrain the space of solutions using the theory we have. For example, if you know your consciousness the same way you know the sun's warmth, then we now know you can in principle be wrong about being conscious - because you can think that you are feeling warmth when actually your thoughts about it were generated by electrodes in your brain. Agree?

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-07T17:56:06.367Z · LW · GW

We have non-ordinary theories about many things that ordinary words are about, like light. What I want is for you to consider implications of some proper theory of knowledge for your claim about knowing for a fact that you are conscious. Not "theory of knowledge" as some complicated philosophical construction - just non-controversial facts, like that you have to interact with something to know about it.

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-07T17:09:29.262Z · LW · GW

I'm just asking what you mean by "knowing" in "But here I am, conscious anyway, knowing this from my own experience, from the very fact of having experience." If you don't know what you mean, and nobody does, then why are you using "knowing"?

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-07T11:55:02.416Z · LW · GW

I mean, we know how knowing works - you do not experience knowing. For you to know how things are, you have to be connected to those things. And independently of consciousness we also know how "you" works - identity is just an ethical construct over something physical, like the brain. So you can at least imagine how an explanation of you knowing might look, right?

Comment by Signer on Many arguments for AI x-risk are wrong · 2024-03-07T07:34:45.611Z · LW · GW

The MTurkers are certainly affecting the model, but the model is not imitating the MTurkers, nor is it doing what the MTurkers want, nor is it listening to the MTurkers’ advice. Instead the model is learning to exploit weaknesses in the MTurkers’ play, including via weird out-of-the-box strategies that would have never occurred to the MTurkers themselves.

How is this very different from RLHF?

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-07T07:19:06.457Z · LW · GW

But here I am, conscious anyway, knowing this from my own experience, from the very fact of having experience.

What do you mean by "I" here - what physical thing does the knowing?

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-06T16:12:34.182Z · LW · GW

Are you saying that probability of Heads is in principle not defined before the Beauty awakens in the experiment? Or just that it can’t be defined if we assume that Elga’s model is true? Because if it’s the latter—it’s not a point in favor of Elga’s model.

It can't be usefully defined if we assume that Elga's model is true. I agree that it is not a point in favor. That doesn't mean we can't use it instead of assuming it is true.

Try rigorously specifying the event “today is Monday” in the Sleeping Beauty problem.

What do you mean by "rigorously"? If "rigorously" means "using probability theory", it is specified as Monday in Elga's model. If "rigorously" means "connected to reality", today is specified as Monday on physical Monday, and as Tuesday on physical Tuesday.

It’s just the definition of the problem, that when the coin is Tails first Monday awakening and then Tuesday awakening happens. We do not have a disagreement here, do we?

We do! You are using the wrong "happens" and "then" in the definition - the actual definition uses words connected to reality, not parts of probability theory. It's not a theorem of probability theory that if an event physically happens, it has P > 0. And "awakening happens" is not even directly represented in Elga's model.

Yes, it's all unreasonable pedantry, but you are just all like "Math! Math!".

The main question of the Sleeping Beauty problem is what her credence for Heads should be when she is awakened, while participating in the experiment. This is the question my model is answering. People just mistakenly assume that it means “What is you credence specifically today”, because they think that “today” is a coherent variable in Sleeping Beauty, while it’s not.

On wiki it's "When you are first awakened, to what degree ought you believe that the outcome of the coin toss is Heads?" - notice the "ought"^^. And the point is mostly that humans have selfish preferences.

I suppose you mean something else by “subjective experience of Beauty”?

Nah, I was just wrong. But... Ugh, I'm not sure about this part. First of all, Elga's model doesn't have "Beauty awakened on Monday" or whatever you simulate - how do you compare statistics with different outcomes? And what would happen if the Beauty performed the simulation instead of you? I think then Elga's model would be statistically closest, right? Also, what if we tell the Beauty what day it is after she tells us her credence - would you then change your simulation to have 1/3 Heads?

I've already explained what Elga's model does wrong—it's talking about a random awakening in Sleeping Beauty. So if we think that it is correct for a *current awakening* we have to smuggle in the notion that the current awakening is random, which isn't specified in the condition of the experiment. Which may give you an impression that you learn more, but that's because you've unlawfully assumed a thing.

No, that's the point - it means they are using different definitions of knowledge. You can use Elga's model without assuming randomness of an awakening, whatever that means. You'll need a preferred definition of knowledge instead, but everyone already has preferences.

I’ve shown that this is the default way to deal with this problem according to probability theory as it is, without making any extra assumptions out of nowhere.

"Default" doesn't mean "better" - if extra assumptions give you what you want, then it's better to make more assumptions.

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-06T12:34:54.117Z · LW · GW

I don’t think I understand what you mean here. Can you elaborate? I’m talking about the difference in causal graphs.

I mean that different branches are causally connected - there is some level of interference between them. In practice you would be approximating it differently - the coin toss causing all branches, as opposed to Monday causing Tuesday, yes. But it's basically the same causal graph as if we copied the Beauty instead of reawakening her, so I don't get why such causality matters. You said in another comment that copying changes things, but I assume (from the OP) that you would still say that Elga's model is not allowed, because both rooms exist simultaneously? Well, branches also exist simultaneously.

The math stays the same, regardless. That’s the whole point.

It doesn't - if all branches exist, then P of everything is 1. Even if you believe in Born probabilities, they are probably indexical too.

“Objective statistics” shows that all the sheets are spread among students and all the questions are asked. And yet there is a meaningful way to say that to a particular student there is a specific probability to receive a particular question in the exam.

...or do you accept Elga's model for copies, and it really is all about awakenings being sequential? Why wouldn't the same arguments about changing the probability of questions apply here? Or "if two models are claiming that you get different amount of knowledge while observing the same data one is definitely wrong"?

Comment by Signer on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-05T19:03:34.967Z · LW · GW

As far as ethics is concerned, it's not a question of fact whether any LLM is conscious. It's just a question of which abstract similarities to the human mind you value. There is no fundamental difference between neural processes that people call conscious vs unconscious.

Comment by Signer on Many arguments for AI x-risk are wrong · 2024-03-05T18:42:14.283Z · LW · GW

Undo the update from the “counting argument”, however, and the probability of scheming plummets substantially.

Wait, why? Like, where is the low probability actually coming from? I guess from some informal model of inductive biases, but then why is "formally I only have the Solomonoff prior, but I expect other biases to not help" not an argument?

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-05T17:43:53.177Z · LW · GW

Elga model doesn’t have an explanation for the change of probabilities of the coin

There is no change of probabilities, because there are no probabilities without outcome space.

My model doesn’t have such issues

And why "the model doesn't represent "today is Monday"" is not weird, when that was what you wanted to know in the first place? Wouldn't it fail the statistical test if we simulated only subjective experience of Beauty?

Math doesn’t always let us do things that we want, which is what makes it useful in the first place.

But it's not math, it's your objectivity-biased "no-weirdness" principle. Without it you can use Elga's model to get more knowledge for yourself, in some sense.

But when applied to Sleeping Beauty problem, where Tails&Monday and Tails&Tuesday happen sequentially and, therefore, are not mutually exclusive it does, because by definition sample space has to consist of mutually exclusive events.

It's not a theorem of probability theory that Sleeping Beauty is a problem, where Tails&Monday and Tails&Tuesday happen sequentially and, therefore, are not mutually exclusive.

Have I missed something?

You've shown that there is a persuasive argument for treating Monday and Tuesday as both happening simultaneously, that it is possible to treat them like this. But you haven't shown that they definitely can't be treated differently.

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-05T17:21:22.343Z · LW · GW

The problem seems to be equivalent to picking a random branch from several possible ones, according to their known probabilities. Which is modelled as logical uncertainty without much problems, as far as I see.

But who does the picking? The problem is that all branches exist, so objective statistics shows them always existing simultaneously.

Monday and Tuesday awakening in one branch are causally connected. Different branches are not.

On the fundamental level, they are. And if you are fine with approximations, then you can treat Elga's model as an approximation too.

Comment by Signer on The Solution to Sleeping Beauty · 2024-03-04T18:44:35.469Z · LW · GW

Which is equal to the sample space of the coin toss, just with different names for the outcomes.

Well, then by your arguments it can't be describing the Sleeping Beauty problem, since it is a much better match for the Just a Coin Toss problem.

Whether Monday is today or was yesterday is irrelevant—it is the same outcome, anyway.

But what if you actually want to know?

Well, here is why not. Because in the latter case you are making baseless assumptions and contradicting the axioms of probability theory.

Again, Elga's model doesn't contradict the axioms of probability theory. Are we supposed to just ignore the mathematical fact that you can use probability theory with a subjective decomposition of outcomes?

Comment by Signer on Signer's Shortform · 2024-03-01T19:25:47.108Z · LW · GW

Oh, I just got it: NNs have stupidity inductive bias.

Comment by Signer on Counting arguments provide no evidence for AI doom · 2024-02-29T17:49:38.751Z · LW · GW

Once we understand that relationship, it should become pretty clear why the overfitting argument doesn’t work: the overfit model is essentially the 2n model, where it takes more bits to specify the core logic, and then tries to “win” on the simplicity by having m unspecified bits of extra information. But that doesn’t really matter: what matters is the size of the core logic, and if there are simple patterns that can fit the data in n bits rather than 2n bits, you’ll learn those.

Under this picture, or any other simplicity bias, why do NNs with more parameters generalize better?

Comment by Signer on Counting arguments provide no evidence for AI doom · 2024-02-29T17:43:27.718Z · LW · GW

I don't get how you can arrive at 0.1% for future AI systems even if NNs are biased against scheming. Humans scheme; future AI systems trained to be capable of long if-then chains may also learn to scheme, maybe because explicitly changing biases is good for performance. Or even, what, you have <0.1% on future AI systems not using NNs?

Also, I'm not saying "but it doesn't matter", but assuming everyone agrees that a spectrally biased NN with a classifier or whatever is a promising model of a safe system: do you then propose that we should not worry and just make the most advanced AI we can as fast as possible? Or would it be better to first reduce the remaining uncertainty about the behavior of future systems?

Comment by Signer on Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do. · 2024-02-25T08:29:03.852Z · LW · GW

I think it’s extremely rare to have an asymmetric distribution towards thinking the best happiness is better in expectation.

In a survey from SSC I counted ~10% of answers that preferred <50% probability of heaven vs hell to certainty of oblivion. 10% is not "extremely rare".

Comment by Signer on And All the Shoggoths Merely Players · 2024-02-22T17:57:01.273Z · LW · GW

AutoGPT is certainly not capable to do this

It's not capable under all conditions, but you can certainly prepare conditions under which AutoGPT can kill you: you can connect it to a robot arm with a knife, explain what commands do what, and tell it to proceed. And AutoGPT will not suddenly start trying to kill you just because it can, right?

If this alignment failure doesn’t kill everyone, we can fix it even by very dumb methods, like “RLHF against failure outputs”, but it doesn’t tell us anything about kill-everyone level of capabilities.

Why doesn't it? Fixing alignment failures under relatively safe conditions may fix them for other conditions too. Or why are you thinking about "kill-everyone" capabilities anyway - do you expect RLHF to work for arbitrary levels of capabilities as long as you don't die doing it? Like, if an ASI trained some weaker AI by RLHF in an environment where it can destroy an Earth or two, would it work?

What happened to ChatGPT in slightly unusual environment despite all alignment training.

Huh, it's worse than I expected, thanks. And it even gets worse from GPT-3 to 4. But still - extrapolation from this requires quantification - after all, they did mostly fix it by using a different prompt. How do you decide whether it's just evidence for "we need more finetuning"?