Comments

Comment by red75prime on When is a mind me? · 2024-04-19T21:35:56.322Z · LW · GW

What concrete fact about the physical world do you think you're missing? What are you ignorant of?

Let's flip a very unfair quantum coin with 1:2^1000000 heads-to-tails odds (preparing such a quantum state would require quite an engineering feat, but it's theoretically possible). You shouldn't expect to see heads if the quantum state is prepared correctly, but the post-flip universe (in MWI) contains a branch where you see heads. So, by your logic, you should expect to see both heads and tails even if the state is prepared correctly.

What I do not know is how it all ties together. Is MWI wrong? Is copying not equivalent to MWI branching (thanks to the no-cloning theorem, for example)? And so on.

Comment by red75prime on Ackshually, many worlds is wrong · 2024-04-12T22:15:12.699Z · LW · GW

"Thread of subjective experience" was an aside (just one of the mechanisms that explains why we "find ourselves" in a world that behaves according to the Born rule), don't focus too much on it.

The core question is which physical mechanism (everything should be physical, right?) ensures that you will almost never see a string of a billion tails after a billion quantum coin flips, while the universe contains a quantum branch with you looking in astonishment at a string of a billion tails. Why should you expect that it will almost certainly not happen, when there's always a physical instance of you that sees it happen?

You'll have 2^1000000000 branches with exactly the same amplitude. You'll experience every one of them. Which physical mechanism will make it more likely for you to experience strings with roughly the same number of heads and tails?

In the Copenhagen interpretation it's trivial: when the quantum coin flipper records the result of a flip, the universe somehow samples from a probability distribution, and the rest is plain old probability theory. You don't expect to observe a string of a billion tails (or any other preselected string), because the you who observes this string almost never exists.
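
To put rough numbers on "plain old probability theory", here is a back-of-the-envelope sketch in Python (the Chebyshev bound is a deliberately crude illustration, not a tight estimate):

```python
from math import sqrt

n = 1_000_000_000  # a billion fair quantum coin flips

# Born-rule probability of any single preselected string (e.g. all tails):
# 2**-n, roughly 10**-301000000. The "you" who observes it almost never exists.

# Chebyshev bound: the probability (equivalently, the fraction of the 2**n
# equal-amplitude branches) in which the heads count deviates from n/2 by
# more than k standard deviations is at most 1/k**2.
sigma = sqrt(n) / 2  # standard deviation of the number of heads
k = 1000             # 1000 sigma is still only ~1.6e7 flips away from n/2
print(f"P(|heads - n/2| > {k * sigma:,.0f}) <= {1 / k**2:.0e}")
```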

What happens in MWI?

Comment by red75prime on Ackshually, many worlds is wrong · 2024-04-12T19:12:17.410Z · LW · GW

I haven't fully understood your stance towards the many minds interpretation. Do you find it unnecessary?

I don’t think either of these Harrys is “preferred”.

And simultaneously you think that the existence of future Harries who observe events with probabilities approaching zero is not a problem, because the current Harry will almost never find himself to be those future Harries. I don't understand what that means exactly.

Harries who observe those rare events exist, and they wonder how they found themselves in those unlikely situations. Harries who haven't found anything unusual exist too. The current Harry became all of those future Harries.

So, we have a quantum state of the universe that factorizes into states with different Harries. OK. What property distinguishes a universe where "Harry found himself in a tails branch" from a universe where "Harry found himself in a heads branch"?

You have already answered it: "I don’t think either of these Harrys is “preferred”." That is, there's no property of the universe that distinguishes those outcomes.

Let's get back to the initial question: what does it mean that "Harry will almost never find himself to be those future Harries"? To answer that, we need to jump from a single physical universe (containing a multitude of Harries who found themselves in branches of every possible probability) to a single one (or maybe a set) of those Harries and proclaim that, indeed, that Harry (or those Harries) found himself in a typical branch of the universe, and all the other Harries don't matter for some reason (their amplitudes are too low to matter despite them being fully conscious? That's the point I don't understand).

The many minds interpretation solves this by proposing metaphysical threads of consciousness, thus adding a property that distinguishes outcomes where Harry observes different things. So we can say that indeed the vast majority of Harries' threads of consciousness ended up in probable branches.

I don't like this interpretation. Why don't we use a single thread of consciousness that adheres to the Born rule? Or why don't we get rid of threads of consciousness altogether and just use the Copenhagen interpretation?

So, my question is: how do you tackle this problem? I hope I've made it sufficiently coherent.

My own resolution is that either collapse is objective, or, due to imperfect decoherence, the vast majority of branches (which also have relatively low amplitude) interfere with each other, making it impossible for conscious beings to exist in them and, consequently, to observe them (this doesn't explain the billion-quantum-coin-flips scenario in my comment below).

Comment by red75prime on Ackshually, many worlds is wrong · 2024-04-12T09:50:47.492Z · LW · GW

For example: “as quantum amplitude of a piece of the wavefunction goes to zero, the probability that I will ‘find myself’ in that piece also goes to zero”

What I really don't like about this formulation is the extreme vagueness of "I will find myself", which implies that there's some preferred future "I" out of many, who is defined not only by the observations he receives, but also by being a preferred continuation of subjective experience determined by an unknown mechanism.

It can be formalized as the many minds interpretation, incurring an additional complexity penalty and undermining the surface simplicity of the assumption. The coexistence of infinitely many threads of subjective experience (measurement operators can produce continuous probability distributions) in a single physical system also doesn't strike me as "feeling more natural".

Comment by red75prime on What is the best argument that LLMs are shoggoths? · 2024-03-18T09:02:41.797Z · LW · GW

First, a factual statement that is true to the best of my knowledge: the LLM state that is used to produce the probability distribution for the next token is completely determined by the contents of its input buffer (plus a bit of nondeterminism due to parallel processing and the non-associativity of floating-point arithmetic).

That is, an LLM can pass only a single token (around 2 bytes) to its future self. That follows from the above.
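
A minimal sketch of the factual statement, assuming a Hugging Face transformers GPT-2 checkpoint (the model choice is just an example): the next-token distribution is a pure function of the token ids currently in the input buffer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_token_distribution(text: str) -> torch.Tensor:
    # The distribution depends only on the tokens in the input buffer;
    # no hidden state survives between calls.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return torch.softmax(logits, dim=-1)

p1 = next_token_distribution("The cat sat on the")
p2 = next_token_distribution("The cat sat on the")
print(torch.allclose(p1, p2))  # True, up to floating-point nondeterminism
```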

What comes next is a plausible (to me) speculation.

For humans, what's passed to our future self is most likely much more than a single token. That is, the state of the human brain that leads to writing (or uttering) the next word most likely cannot be derived from a small subset of the previous state plus the last written word (that is, the state of the brain changes not only because we have written or said a word, but by other means too).

This difference can lead to completely different processes by which an LLM mimics human output, that is, to potential shoggothification. But to be a real shoggoth, the LLM also needs a way to covertly update its shoggoth state, that is, the part of its state that can lead to inhuman behavior. The output buffer is the only thing it has to maintain state, so the shoggoth state would have to be steganographically encoded in it, severely limiting its information density and update rate.

I wonder how a shoggoth state could arise at all, but maybe that's just my lack of imagination.

Comment by red75prime on 0th Person and 1st Person Logic · 2024-03-13T19:46:07.431Z · LW · GW

Expanding a bit on the topic.

Exhibit A: flip a fair coin and move a suspended robot into a green or red room using a second coin with probabilities (99%, 1%) for heads, and (1%, 99%) for tails.

Exhibit B: flip a fair coin and create 99 copies of the robot in green rooms and 1 copy in a red room for heads, and reverse colors otherwise.

What causes the robot to see red instead of green in exhibit A? Physical processes that brought about a world where the robot sees red.

What causes a robot to see red instead of green in exhibit B? The fact that it sees red, nothing more. The physical instance of the robot that sees red in one possible world could be the instance that sees green in another possible world, of course (physical causality is surely intact). But a robot-who-sees-red (that is, one of the instances that see red) cannot be made into a robot-who-sees-green by physical manipulation. That is, the subjective causality of seeing red is cut off from physical causes (in the case of multiple copies of an observer), and as such cannot be used as a basis for probabilistic judgments.
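
A toy Monte Carlo of the two exhibits (the instance-counting convention in exhibit B is imposed from the outside, which is exactly the point):

```python
import random

def exhibit_a(trials: int = 100_000) -> float:
    # One robot; a biased second coin physically selects its room.
    red = 0
    for _ in range(trials):
        heads = random.random() < 0.5
        p_red = 0.01 if heads else 0.99
        red += random.random() < p_red
    return red / trials  # ~0.5, grounded in a physical sampling process

def exhibit_b(trials: int = 100_000) -> float:
    # 100 copies per run; every run contains both red-seers and green-seers.
    red_instances = total_instances = 0
    for _ in range(trials):
        heads = random.random() < 0.5
        red_instances += 1 if heads else 99
        total_instances += 100
    # ~0.5 again, but only as a convention for counting instances:
    # nothing physical ever "samples" which instance you are.
    return red_instances / total_instances

print(exhibit_a(), exhibit_b())
```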

I guess that if I don't see a resolution of the Anthropic Trilemma within the framework of MWI in about 10 years, I'll be almost sure that MWI is wrong.

Comment by red75prime on 0th Person and 1st Person Logic · 2024-03-12T09:46:48.351Z · LW · GW

I have a solution that is completely underwhelming, but I can see no flaws in it, besides the complete lack of a definition of which part of the mental state should be preserved to still count as you, and the rejection of MWI (as well as the fact that I cannot see in it any useful insights into why we have what looks like continuous subjective experience).

  1. You can't consistently assign probabilities to future observations in scenarios where you expect the creation of multiple instances of your mental state. All instances exist, and there are no counterfactual worlds where you end up as a mental state in a different location/time (as opposed to the one you happened to actually observe). You are here because your observations tell you that you are here, not because something intangible moved from the previous "you"(1) to the current "you" located here.
  2. The Born rule works because MWI is wrong. Collapse is objective and there are no alternative yous.

(1) I use "you" in scare quotes to designate something beyond all the information available in the mental state, something that presumably is unique and moves continuously (or jumps) through time.

Let's iterate through the questions of The Anthropic Trilemma.

  1. The Boltzmann Brain problem: no probabilities, no updates. Observing either room doesn't tell you anything about the value of the digit of pi. It tells you that you observe the room you observe.
  2. Winning the lottery: there are no alternative quantum branches, so your machinations don't change anything.
  3. Personal future: Britney Spears observes that she has the memories of Britney Spears; you observe that you have your memories. There are no alternative scenarios if you are defined just by the information in your mental state. If you jump off the cliff, you can expect that someone with a memory of deciding to jump off the cliff (as well as all your other memories) will hit the ground, and there will be no continuation of this mental state at that time and place. And your memory tells you that it will be you who experiences the consequences of your decisions (whatever the underlying causes of that).

Probabilistic calculations of your future experiences work as expected if you add "conditional on me experiencing being here and now".

It's not unlike the operator do(X=x) in graphical models, which cuts off all other causal influences on X.
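
A toy sketch of the do(X=x) analogy, with made-up numbers (Z influences both X and Y, X influences Y): intervening on X cuts Z's causal influence on X, while ordinary conditioning keeps it.

```python
import random

def sample(intervene_x=None):
    # Toy structural model: Z -> X, Z -> Y, X -> Y.
    z = random.random() < 0.5
    x = (random.random() < (0.9 if z else 0.1)) if intervene_x is None else intervene_x
    y = random.random() < (0.2 + 0.4 * x + 0.3 * z)
    return z, x, y

N = 200_000
observations = [sample() for _ in range(N)]
p_y_given_x1 = sum(y for _, x, y in observations if x) / sum(1 for _, x, _ in observations if x)
p_y_do_x1 = sum(sample(intervene_x=True)[2] for _ in range(N)) / N
print(p_y_given_x1, p_y_do_x1)  # conditioning keeps Z's influence (~0.87); do() cuts it (~0.75)
```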

Comment by red75prime on I played the AI box game as the Gatekeeper — and lost · 2024-02-17T11:07:56.192Z · LW · GW

Do you think the exploited flaw is universal or, at least, common?

Comment by red75prime on Scale Was All We Needed, At First · 2024-02-14T08:35:18.638Z · LW · GW

Excellent story. But what about the "pull the plug" option? Did ALICE find a way to run itself efficiently on traditional datacenters that aren't packed with backprop and inference accelerators? And would shutting those down have required more political will than the government could muster at the time?

Comment by red75prime on Is a random box of gas predictable after 20 seconds? · 2024-01-30T15:54:34.989Z · LW · GW

Citing https://arxiv.org/abs/cond-mat/9403051: "Furthermore if a quantum system does possess this property (whatever it may be), then we might hope that the inherent uncertainties in quantum mechanics lead to a thermal distribution for the momentum of a single atom, even if we always start with exactly the same initial state, and make the measurement at exactly the same time."

Then the author proceeds to demonstrate that this is indeed the case. I guess it partially answers the question: the quantum state thermalises, and you'll get a classical thermal distribution of results for at least some measurements, even when measuring the system in the same quantum state.

The less initial uncertainty there is in energy, the faster the system thermalises. That is, to slow quantum thermalisation down you need to initialize the system with atoms in highly localized positions, but then you can't know their exact velocities and can't predict the classical evolution.

Comment by red75prime on Is a random box of gas predictable after 20 seconds? · 2024-01-26T08:43:43.956Z · LW · GW

we are assuming that without random perturbation, you would get 100% accuracy

That is, the question is not about real argon gas but about a billiard-ball model? That should be stated in the question.

Comment by red75prime on Is a random box of gas predictable after 20 seconds? · 2024-01-25T16:55:11.797Z · LW · GW

Comment by red75prime on Another Non-Anthropic Paradox: The Unsurprising Rareness of Rare Events · 2024-01-22T10:41:53.566Z · LW · GW

there are creatures in the possible mind space[3] whose intuition works in the opposite way. They are surprised specifically by the sequence of HHTHTTHTTH and do not mind the sequence of HHHHHHHHHH

That is, creatures who aren't surprised by outcomes of lower Kolmogorov complexity, or who are not surprised by the fact that the language they use for estimating Kolmogorov complexity has a special compact case for producing "HHTHTTHTTH".

Looks possible, but not probable.
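
A crude illustration of the complexity point, using zlib compression as a rough stand-in for Kolmogorov complexity (it is only a proxy, and a language-dependent one at that):

```python
import random
import zlib

def crude_complexity(s: str) -> int:
    # Compressed length as a rough, language-dependent proxy for Kolmogorov complexity.
    return len(zlib.compress(s.encode()))

uniform = "H" * 1000
random_seq = "".join(random.choice("HT") for _ in range(1000))
print(crude_complexity(uniform), crude_complexity(random_seq))
# The all-heads string compresses far better, i.e. it is much "simpler".
# A creature whose description language instead had a compact case for the
# random-looking string would have the opposite intuitions about surprise.
```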

Comment by red75prime on You can rack up massive amounts of data quickly by asking questions to all your friends · 2024-01-21T15:27:11.373Z · LW · GW

For returns below $2000, I'd use a 50/50 quantum-random strategy just for the fun of dropping Omega's stats.

Comment by red75prime on Some quick thoughts on "AI is easy to control" · 2023-12-06T12:46:57.062Z · LW · GW

what happens if we automatically evaluate plans generated by superhuman AIs using current LLMs and then launch plans that our current LLMs look at and say, "this looks good". 

The obvious failure mode is that the LLM is not powerful enough to predict the consequences of the plan. The obvious fix is to include a human-relevant description of the consequences. The obvious failure modes: a manipulated description of the consequences, optimizing for LLM jailbreaking. The obvious fix: ...

I won't continue. Shallow rebuttals are not that convincing, but deep ones come close to capability research, so I don't expect to find interesting answers.

Comment by red75prime on 2023 Unofficial LessWrong Census/Survey · 2023-12-05T10:53:41.249Z · LW · GW

What if all I can assign is a probability distribution over probabilities? Like in the extraterrestrial life question. All that can be said is that extraterrestrial life is sufficiently rare that we haven't found evidence of it yet. Our observation of our own existence is conditioned on our existence, so it doesn't provide much evidence one way or another.

Should I sample the distribution to give an answer, or maybe take the mode, the mean, or the median? I've chosen a value that is far from both extremes, but I might have done something else, with no clear justification for any of the choices.
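
A toy sketch of the dilemma (the Beta shape below is made up, just to have a concrete distribution over probabilities to summarize):

```python
from scipy import stats

# A belief about a probability p, expressed as a distribution over p
# rather than as a single number (shape parameters are illustrative only).
belief = stats.beta(a=0.5, b=2.0)  # mass skewed toward small p

mean = belief.mean()
median = belief.median()
mode = 0.0  # for a < 1 the density piles up at p = 0
sampled = belief.rvs(random_state=0)

print(f"mean={mean:.3f} median={median:.3f} mode={mode:.3f} sampled={sampled:.3f}")
# Each summary yields a different single number; the survey answer depends
# on an essentially arbitrary choice among them.
```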

Comment by red75prime on An Idea on How LLMs Can Show Self-Serving Bias · 2023-11-24T09:45:43.059Z · LW · GW

This means that LLMs can inadvertently learn to replicate these biases in their outputs.

Or the network learns to put more trust in tokens that were already "thought about" during generation.

Comment by red75prime on Why am I Me? · 2023-06-29T07:09:31.101Z · LW · GW

Suppose when you are about to die [...] Omega shows up

Suppose something pertaining more to the real world: if you think that you are here and now because there will not be significantly more people in the future, then you are more likely to become depressed.

Also, why does Omega use 95% and not 50%, 10%, or 0.000001%?

ETA: Ah, Omega in this case is an embodiment of the litany of Tarski. Still, if there turns out to be no catastrophe, we are those 5% who violate the litany. Not saying that the litany comes as close to useless as it can get when we are talking about a belief in an inevitable catastrophe you can do nothing about.

Comment by red75prime on Lessons On How To Get Things Right On The First Try · 2023-06-20T15:28:19.915Z · LW · GW

After all, in the AI situation for which the exercise is a metaphor, we don’t know exactly when something might foom; we want elbow room.

Or you can pretend that you are impersonating an AI that is preparing to go foom.

Comment by red75prime on How could AIs 'see' each other's source code? · 2023-06-04T09:54:51.372Z · LW · GW

conduct a hostage exchange by meeting in a neutral country, and bring lots of guns and other hostages they intend not to exchange that day

That is, they alter the payoff matrix instead of trying to achieve CC in the prisoner's dilemma. And that may be more efficient than spending time and energy on proofs, source code verification protocols, and the yet unknown downsides of being an agent that you can robustly CC with, while being the same kind of agent.

Comment by red75prime on AI Will Not Want to Self-Improve · 2023-05-17T10:04:39.341Z · LW · GW

the simpler the utility function the easier time it has guaranteeing the alignment of the improved version

If we are talking about a theoretical argmax_a E[U|a] AI, where E[U|a] (the expectation of utility given the action a) somehow points to the external world, then sure. If we are talking about a real AI with an aspiration to become the physical embodiment of the aforementioned theoretical concept (with the said aspiration somehow encoded outside of E[U|a], because E[U|a] is simple), then things get more hairy.
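
For concreteness, a minimal sketch of the theoretical argmax_a E[U|a] agent (the action names and numbers are made up): the whole agent is a single maximization over a given outcome model and a given utility function, which is what makes it simple, and exactly what a real, physically embedded AI is not.

```python
def argmax_expected_utility(actions, outcome_dist, utility):
    # Theoretical argmax_a E[U|a] agent: no internal state, no aspirations,
    # just a maximization over a given model and a given utility function.
    def expected_u(a):
        return sum(p * utility(o) for o, p in outcome_dist(a).items())
    return max(actions, key=expected_u)

# Toy usage with made-up actions and outcomes.
dist = {"safe": {"ok": 1.0}, "risky": {"great": 0.5, "bad": 0.5}}
util = {"ok": 1.0, "great": 2.0, "bad": -5.0}
print(argmax_expected_utility(["safe", "risky"], dist.get, util.get))  # -> safe
```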

Comment by red75prime on AGI-Automated Interpretability is Suicide · 2023-05-11T09:47:52.629Z · LW · GW

You said it yourself, GPT ""wants"" to predict the correct  probability distribution of the next token

No, I said that GPT does predict the next token, while probably not containing anything that can be interpreted as "I want to predict the next token". Just as a bacterium does divide (with possible adaptive mutations) while not containing "be fruitful and multiply" written somewhere inside.

If you instead meant that GPT is "just an algorithm"

No, I certainly didn't mean that. If the extended Church-Turing thesis holds for the macroscopic behavior of our bodies, we can indeed be represented as Turing-machine algorithms (with a polynomial multiplier on efficiency).

What I feel, but can't precisely convey, is that there's a huge gulf (in computational complexity maybe) between agentic systems (that do have explicit internal representation of, at least, some of their goals) and "zombie-agentic" systems (that act like agents with goals, but have no explicit internal representation of those goals).

we don't know what our utility actually is

How do you define the goal (or utility function) of an agent? Is it something that actually happens when the universe containing the agent evolves in its usual physical fashion? Or is it something that was somehow intended to happen when the agent is run (but may not actually happen due to circumstances and the agent's shortcomings)?

Comment by red75prime on AGI-Automated Interpretability is Suicide · 2023-05-10T23:21:04.269Z · LW · GW

I really don't expect "goals" to be explicitly written down in the network. There will very likely not be a thing that says "I want to predict the next token" or "I want to make paperclips" or even a utility function of that. My mental image of goals is that they are put "on top" of the model/mind/agent/person. Whatever they seem to pursue, independently of their explicit reasoning.

I'm sure that I don't understand you. GPT most likely doesn't have "I want to predict the next token" written somewhere, because it doesn't want to predict the next token. There's nothing in there that will actively try to predict the next token no matter what. It's just the thing it does when it runs.

Is it possible to have a system that just "actively tries to make paperclips no matter what" when it runs, but doesn't reflect that in its reasoning and planning? I have a feeling that it would require God-level sophistication and knowledge of the universe to create a device that acts like that: a device that just happens to act in a way that robustly maximizes paperclips while not containing anything that can be interpreted as that goal.

I found that I can't precisely formulate why I feel that. Maybe I'll be able to express that in a few weeks (or I'll find that the feeling is misguided).

Comment by red75prime on AGI-Automated Interpretability is Suicide · 2023-05-10T20:12:47.002Z · LW · GW

Solving interpretability with an AGI (even with humans-in-the-loop) might not lead to particularly great insights on a general alignment theory or even on how to specifically align a particular AGI

Wouldn't it at least solve corrigibility by making it possible to detect the formation of undesirable end-goals? I think even GPT-4 can classify a textual interpretation of an end-goal on the basis of its general desirability for humans.

It seems to need another assumption, namely that the AGI has sufficient control over its internal state, and enough knowledge of the detection network, to bring itself into a state whose interpretation doesn't trip the detection network while still allowing the AGI to advance its agenda.

Comment by red75prime on Have you heard about MIT's "liquid neural networks"? What do you think about them? · 2023-05-10T18:43:14.462Z · LW · GW

I have low confidence in this, but I guess it (OOD generalization by "liquid" networks) works well in differentiable continuous domains (like low-level motion planning) by exploiting the natural smoothness of the system. So I wouldn't get my hopes up about its universal applicability.

Comment by red75prime on LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem · 2023-05-10T18:30:29.890Z · LW · GW

If you have a next-frame video predictor, you can't ask it how a human would feel. You can't ask it anything at all - except "what might be the next frame of thus-and-such video?". Right?

Not exactly. You can extract embeddings from a video predictor (activations of the next-to-last layer may do, or you can use techniques that enhance the semantic information captured in the embeddings) and then use supervised learning to train a simple classifier from an embedding to human feelings on a modest number of video/feeling pairs.
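
A minimal sketch of that recipe, with the video predictor stubbed out and the training data faked with random numbers (the model hook, embedding size, and labels are all placeholders):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_embedding(video) -> np.ndarray:
    # Hypothetical hook: run the video predictor and return the activations
    # of its next-to-last layer; the details depend entirely on the predictor.
    raise NotImplementedError

# A modest labelled set: one embedding per clip plus a human-feeling label.
# Faked here purely to show the shape of the supervised probe.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 512))  # 200 clips, 512-d embeddings
labels = rng.integers(0, 3, size=200)     # e.g. 0=calm, 1=happy, 2=distressed

probe = LogisticRegression(max_iter=1000)
probe.fit(embeddings, labels)

# At query time: "how would a human feel about this video?"
# feeling = probe.predict(extract_embedding(new_video)[None, :])
```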

Comment by red75prime on Yoshua Bengio argues for tool-AI and to ban "executive-AI" · 2023-05-10T13:36:25.029Z · LW · GW

the issue I still see is - how do you recognize an ai executive that is trying to disguise itself?

It can't disguise itself without researching disguising methods first. The question is whether interpretability tools will be up to the task of catching it.

It will not work for catching an AI executive that originates outside of the controlled environment (unless it queries the AI scientist). But given that such attempts will originate from uncoordinated and relatively computationally underpowered sources, it may be possible to preemptively enumerate the disguising techniques that such an AI executive could come up with. If there are undetectable varieties..., well, it's mostly game over.

Comment by red75prime on All AGI Safety questions welcome (especially basic ones) [May 2023] · 2023-05-09T21:57:28.282Z · LW · GW

Thanks. Could we be sure that a bare utility maximizer doesn't modify itself into a mugging-proof version? I think we can. Such a modification would drastically decrease expected utility.

It's a bit of a relief that a sizeable portion of possible intelligences can be stopped by playing god to them.

Comment by red75prime on All AGI Safety questions welcome (especially basic ones) [May 2023] · 2023-05-09T15:23:05.828Z · LW · GW

Are there ways to make a utility maximizer impervious to Pascal's mugging?

Comment by red75prime on An artificially structured argument for expecting AGI ruin · 2023-05-08T09:11:52.540Z · LW · GW

Humans were created by evolution, but [...]

We know that evolution has no preferences (evolution is not an agent), so we generally don't frame our preferences as an approximation of evolution's. People who believe that they were created with some goal in the creator's mind do engage in reasoning about what was truly meant for them to do.

Comment by red75prime on An artificially structured argument for expecting AGI ruin · 2023-05-08T08:39:13.619Z · LW · GW

See also, in the OP: "Problem of Fully Updated Deference: Normative uncertainty doesn't address the core obstacles to corrigibility."

The provided link assumes that any preference can be expressed as a utility function over world-states. If you don't assume that (and you shouldn't, as human preferences can't be expressed as such), you cannot maximize a weighted average of potential utility functions. Some actions are preference-wise irreversible. Take virtue ethics, for example: wiping out your memory doesn't restore your status as a virtuous person even if the world no longer contains any information about your unvirtuous acts, so you don't plan to do that.

When I asked here earlier why the article "Problem of Fully Updated Deference" uses an incorrect assumption, I got the answer that it's better to have some approximation than none, as it allows moving forward in exploring the problem of alignment. But I see that it has become an unconditional cornerstone rather than a toy example of analysis.

Comment by red75prime on An artificially structured argument for expecting AGI ruin · 2023-05-08T00:58:04.440Z · LW · GW

As a strong default, STEM-level AGIs will have "goals"—or will at least look from the outside like they do. By this I mean that they'll select outputs that competently steer the world toward particular states.

Clarification: when talking about world-states I mean the world-state minus the state of the agent (we are interested in the external actions of the agent).

For starters, you can have goal-directed behavior without steering the world toward particular states. Novelty seeking, for example, doesn't imply any particular world-state to achieve.

And I think that the stronger default is that the agent will have goal uncertainty. What can a reinforcement learning agent say about its desired world-states or world-histories (the goal might not be expressible as a utility function over world-states) upon introspection? Nothing. Would it conclude that its goal is to make sure it self-stimulates for as long as possible? Given its vast knowledge of humans, the idea looks fairly dumb (it has low prior probability), and its realization contradicts almost any other possibility.

The only kind of agent that will know its goal with certainty is an agent that was programmed with its preferences explicitly pointing to the external world. That is, upon introspection the agent finds that its action-selection circuitry contains a module that compares the expected world-states (or world-state/action pairs) produced by a given set of actions. That is, someone was dumb enough to try to program an explicit utility function, but secured sufficient funding anyway (a completely possible situation, I agree).

But does it really remove goal uncertainty? A sufficiently intelligent agent knows that its utility function is an approximation of the true preferences of its creator. That is, the prior probability of "stated goal == true goal" is infinitesimal (alignment is hard and the agent knows it). Will that be enough to prevent the usual "kill them all and make tiny molecular squiggles"? The agent still has a choice of which actions to feed to the action-selection block.

Comment by red75prime on Hell is Game Theory Folk Theorems · 2023-05-01T06:28:04.738Z · LW · GW

In a realistic setting, agents will be highly incentivized to seek other forms of punishment besides turning the dial. But nice toy hell.

Comment by red75prime on Could a superintelligence deduce general relativity from a falling apple? An investigation · 2023-04-24T12:58:35.021Z · LW · GW

Thanks for clearing up my confusion. I've grown rusty on the topic of AIXI.

So going forwards from simple theories and seeing how they bridge to your effective model would probably do the trick

Assuming that there's not much fine-tuning to do. Locating our world in the string theory landscape could take quite a few bits if it's computationally feasible at all.

And remember, we're talking about an ASI here

It hinges on the assumption that an ASI of this type is physically realizable. I can't find it now, but I remember that the preprocessing step, where heuristic generation happens, for one variant of computable AIXI was found to take an impractical amount of time. Am I wrong? Are there newer developments?

Comment by red75prime on Could a superintelligence deduce general relativity from a falling apple? An investigation · 2023-04-24T07:45:23.781Z · LW · GW

it seems plausible that you could have GR + QFT and a megabyte of briding laws plus some other data to specify local conditions and so on. 

How can a computationally bounded variant of AIXI arrive at QFT? You most likely can't faithfully simulate a non-trivial quantum system on a classical computer within reasonable time limits. The AIXI is bound to find some computationally feasible approximation of QFT first (Maxwell's equations and a cutoff at some arbitrary energy to prevent an ultraviolet catastrophe, maybe). And with no access to experiments, it cannot test simpler systems.

Comment by red75prime on Could a superintelligence deduce general relativity from a falling apple? An investigation · 2023-04-23T21:54:55.209Z · LW · GW

I mean, are there reasons to assume that computable AIXI (or its variants) can be realized as a physically feasible device? I can't find papers indicating significant progress in making feasible AIXI approximations.

Comment by red75prime on Could a superintelligence deduce general relativity from a falling apple? An investigation · 2023-04-23T21:08:39.137Z · LW · GW

Assume it has disgusting amounts of compute

Isn't it the same as "assume that it can do argmax as fast as needed for this scenario"?

Comment by red75prime on How does AI Risk Affect the Simulation Hypothesis? · 2023-04-21T08:40:10.108Z · LW · GW

Of all the peoples' lives that exist and have existed, what are the chances I'm living [...here and now]

Is there a more charitable interpretation of this line of thinking than "My soul selected this particular body out of all those available"?

You being you as you are is a product of your body developing in the circumstances it happened to develop in.

Comment by red75prime on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-10T01:57:33.625Z · LW · GW

"Hard problem of corrigibility" refers to Problem of fully updated deference - Arbital, which uses a simplification (human preferences can be described as a utility function) that can be inappropriate for the problem. Human preferences are obviously path-dependent (you don't want to be painfully disassembled and reconstituted as a perfectly happy person with no memory of disassembly). Was appropriateness of the above simplification discussed somewhere?