Weighting the probability of being a mind by the amount of matter composing the computer that calculates that mind

post by yttrium · 2014-02-11T15:34:52.374Z · LW · GW · Legacy · 23 comments

TL;DR by lavalamp: Treating "computers running minds" as discrete objects might cause a paradox in probability calculations that involve self-location. "The probability of being a certain mind" is probably an extensive physical quantity, i.e. rises proportionally to the size of the physical system doing the associated computations.

There are two computers simulating two minds. At some time, one of the minds is shown a red light and the other is shown a green one (call this "Situation 1"). Conditional on being one of the minds, what probability should you assign to seeing red?

Naively, the answer seems to be 1/2, which comes from assigning equal probability to being each of the minds. If one had three computers and showed two of them a red light and the third a green one, the probability would be calculated as 2/3, even if the two red-seeing computers are in exactly the same computational state at all times (call this "Situation 2").

However, I think that taking this point of view leads to paradoxes.

An example: Consider an electrical circuit made of (ideal) wires, resistors, capacitors and transistors (sufficient in principle to build a computer), with the supply voltage coming from outside the circuit under consideration. Under assumptions about the physical implementation of this circuit that do not restrict the possible circuit diagrams, it is possible to split the matter composing it into two parts that each comprise a working circuit reproducing the original circuit's behavior independently of the other part, analogously to how the Ebborians' brains are split.* To clarify, what I have in mind is cutting up the wires and resistors orthogonally to their cross-sections: after the split, equivalent wires are at equivalent potentials at the same time, but the currents flowing are reduced by some factor.

Now imagine the circuit is a computer simulating the mind that is going to see red in Situation 1 (the mind that will see green still exists). If one splits the circuit as described, one suddenly ends up with two circuits simulating the same mind, i.e. Situation 2 (let's imagine that the computers are split before they are turned on for the first time, so that stream-of-consciousness considerations will not influence the calculated probability, like e.g. Deda answering 1/2 to Yu'el's question in the linked article). However, it is not clear how far apart the circuit components need to be before they should be considered "split". That is, if one fixes a direction in which the circuits are moved apart and defines P(d) as the probability one should assign to seeing red, as a function of the distance d by which the circuits have been moved apart, then in the naive model P(0) would be 1/2 and P(∞) would be 2/3, but there seems to be no intuitive candidate for how the function should look in between.

I therefore think it is more plausible that something closer to the correct way to calculate the probability of having one mind's experiences involves weighting this probability by the amount (perhaps mass or electron count) of matter that calculates the mind. On this view, the matter composing the two parts after the split adds up exactly to the matter of the original circuit, so P(d) is constant over all distances.
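
For concreteness, here is a minimal sketch (not from the original post; the masses are placeholder numbers) contrasting the naive count-based rule with the proposed matter-weighted rule:

```python
# Sketch: compare the naive "one computer, one vote" rule with the proposed
# matter-weighted rule for P(seeing red). Masses are hypothetical placeholders.

def p_red_by_count(n_red, n_green):
    """Naive rule: every discrete computer gets equal weight."""
    return n_red / (n_red + n_green)

def p_red_by_mass(red_masses, green_masses):
    """Proposed rule: weight by the amount of matter doing the computation."""
    return sum(red_masses) / (sum(red_masses) + sum(green_masses))

# Situation 1: one red-seeing circuit (2 kg) and one green-seeing circuit (2 kg).
print(p_red_by_count(1, 1))              # 0.5
print(p_red_by_mass([2.0], [2.0]))       # 0.5

# Situation 2: the red-seeing circuit has been split into two 1 kg halves.
print(p_red_by_count(2, 1))              # 0.666... (jumps at some distance d)
print(p_red_by_mass([1.0, 1.0], [2.0]))  # 0.5 (so P(d) stays constant)
```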

What do you think?

*Namely, the resistors could be full cylinders with the wires protruding along the axes; one could then split them along a plane surface that includes the cylinder's axis and end up with two resistors, each with twice the resistance.

The capacitors could look exactly like in this picture and could then be split up along a plane that includes the wires, so that each part's capacitance is half the original.

The transistors could look exactly like in this picture (being homogeneous in the z-direction), and be split in half across a plane parallel to the plane of the picture.

If one does all of these splittings, splits up the wires so that the parts of each electronic component are connected in the same way as in the original circuit, and then operates the resulting circuits with the same supply voltage as the original circuit, the voltages on all wires will always be the same as in the original circuit, and the currents will be halved.
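
A minimal numerical check of the resistor case (my own sketch; V and R are arbitrary numbers): each half-cylinder has the same length but half the cross-sectional area, hence twice the resistance, so on the same supply voltage each half carries half the original current and the two halves together reproduce it.

```python
# Sketch: splitting a resistor along its axis under a fixed supply voltage V.
V = 5.0     # supply voltage (arbitrary)
R = 100.0   # resistance of the unsplit cylinder (arbitrary)

i_original = V / R       # current through the original resistor
R_half = 2 * R           # same length, half the cross-section -> double resistance
i_half = V / R_half      # current through one of the two halves

assert abs(i_half - i_original / 2) < 1e-12   # each half carries half the current
assert abs(2 * i_half - i_original) < 1e-12   # together they match the original
```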

23 comments

comment by kokotajlod · 2014-02-11T18:07:48.193Z · LW(p) · GW(p)

If you haven't already, you should read Bostrom's paper Quantity of Experience: Brain Duplication and Degrees of Consciousness, where he discusses exactly this scenario.

comment by lmm · 2014-02-11T19:26:43.557Z · LW(p) · GW(p)

It seems like this is covered by the line that you should use the number of bits needed to locate you in the universe as a factor. All other things being equal, it takes 1 more bit to describe either of the 1kg computers than the 2kg computer.

Replies from: kokotajlod
comment by kokotajlod · 2014-02-11T19:54:14.480Z · LW(p) · GW(p)

All other things being equal, it takes 1 more bit to describe either of the 1kg computers than the 2kg computer.

I don't see why this is. Doesn't it all depend on what language you use, and what the fundamental laws of physics are?

Replies from: lmm
comment by lmm · 2014-02-11T20:21:25.955Z · LW(p) · GW(p)

Sorry, what I meant was: if you're comparing an otherwise identical scenario with two 1kg computers or one 2kg computer (and they're physically the same, just sliced in half or not), you need an extra bit to individually identify one of the two 1kg computers that you don't need to include when it's a single 2kg computer. You're right that in a general scenario the difference is probably not exactly 1 bit - which actually aligns more closely with my intuitions than the idea that a 2kg computer is always "twice as real" as a 1kg one.

Replies from: kokotajlod
comment by kokotajlod · 2014-02-12T01:57:38.511Z · LW(p) · GW(p)

I think it depends on how you do the identification, right? If your identification is simply "Find the simplest set of directions that takes the universe as input and outputs the computer," then what you will get might look like a bunch of coordinates and dimensions, in which case a smaller object would actually be easier to describe, and having copies elsewhere would be irrelevant.

Replies from: lmm
comment by lmm · 2014-02-12T18:04:17.787Z · LW(p) · GW(p)

If you imagine a universe tiled with computers, it's easier to identify any individual one the bigger the computers are, right?

I think this generalizes to more usual universes, but I could be wrong.

Replies from: kokotajlod
comment by kokotajlod · 2014-02-12T21:17:39.699Z · LW(p) · GW(p)

Still depends on how you do the identification. If you have to describe not just the location of the computer but also the collection of fundamental entities that form it, then the more fundamental entities there are forming it, the harder it (may) be to describe.

Also, we aren't talking about a universe tiled with computers; we are talking about a single 2kg computer or two 1kg computers. We leave it unspecified what the rest of the world looks like. EDIT: Rather, what I should say is: I'm not so sure it generalizes.

comment by Manfred · 2014-02-11T16:43:19.032Z · LW(p) · GW(p)

Suppose there's a paperclip maximizer that could either be running on a 1 kg computer or a 2 kg computer - say the humans flipped a coin when picking which computer to run it on.

Since the computations are the same, the paperclip maximizer doesn't know whether it's 1 kg or 2 kg until I tell it. But before I tell it, I offer the paperclip maximizer a choice between options A and B: A results in 5 paperclips if it's 1 kg and 0 otherwise, B results in 4 paperclips if it's 2 kg and 0 otherwise.

It seems like the paperclip-maximizing strategy is to give equal weight (ha) to being 1 kg and 2 kg, and pick A.
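
A quick arithmetic check of that claim (my own sketch; the mass-weighted credences in the second half are the ones the post proposes, included only for contrast):

```python
# Sketch: expected paperclips from options A and B under two weightings.

# Equal credence in being the 1 kg or the 2 kg machine:
ev_A = 0.5 * 5 + 0.5 * 0   # A pays 5 paperclips only if it's the 1 kg machine -> 2.5
ev_B = 0.5 * 0 + 0.5 * 4   # B pays 4 paperclips only if it's the 2 kg machine -> 2.0
assert ev_A > ev_B          # equal weighting favours A

# For contrast, mass-weighted credences (1/3 for 1 kg, 2/3 for 2 kg) flip the choice:
ev_A_mass = (1/3) * 5       # about 1.67
ev_B_mass = (2/3) * 4       # about 2.67
assert ev_B_mass > ev_A_mass
```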

Replies from: kokotajlod, Gurkenglas, yttrium
comment by kokotajlod · 2014-02-11T18:13:40.003Z · LW(p) · GW(p)

What if, instead of a paperclip-maximizer, we had a machine that was designed to maximize the amount of machine-pleasure in the world, where machine-pleasure is "the firing of a certain reward circuit in a system that is sufficiently similar to myself."

Then it seems yttrium has a point: it is all going to come down to when the machine decides there are two systems in the world, and when it decides there is only one. And there is no "obvious" choice for the machine to make in this regard.

Edit: And so, if we want to make an AI that maximizes (among other things) certain subjective human experiences, we will have to make sure it doesn't come to some sort of crazy conclusion about what that entails.

I've followed you, Manfred, in framing this question in terms of values and right actions. But the original question was framed in terms of expectations and future experiences. Do you think that the original question doesn't make sense, or do you have something to say about the original formulation as well? I myself am on the fence.

Replies from: Manfred
comment by Manfred · 2014-02-11T20:19:18.492Z · LW(p) · GW(p)

If you make a robot that explicitly cares about things differently depending on how heavy it is, then sure, it can take actions as if it cared about things more when it's heavier.

But that is done using the same probabilities as normal, merely a different utility function. Changing your utilities without changing your probabilities has no impact on the "probability of being a mind."

Replies from: kokotajlod
comment by kokotajlod · 2014-02-12T01:54:21.350Z · LW(p) · GW(p)

We don't have to program the machine to explicitly care about things differently depending on how heavy they are. Instead, we program the machine to care simply about how many systems exist--but wait! It turns out we don't know what we mean by that! According to yttrium.

comment by Gurkenglas · 2014-02-13T09:13:57.406Z · LW(p) · GW(p)

That's because the maximizer is now conditioning not on the probability that it is running on 1kg vs 2kg hardware, but on the probability that you/Omega selected the 1kg/2kg machine to talk to, which sounds intuitively closer to 50/50 based on your arguments.

But now suppose that I made the same deal to the maximizer.

There was a post about this point recently somewhere around here: your solution to the Monty Hall problem should depend on what you know about the algorithm behind the moderator's choice to open a door, and which one.

comment by yttrium · 2014-02-11T19:25:28.352Z · LW(p) · GW(p)

If I understand you correctly, your scenario differs from the one I had in mind in that I'd have both computers instantiated at the same time (I've clarified that in the post), and then consider the relative probability of experiencing what the 1 kg computer experiences vs. experiencing what the 2 kg computer experiences. It seems like one could adapt your scenario by creating a 1 kg and a 2 kg computer at the same time, offering both of them a choice between A and B, and then generating 5 paperclips if the 1 kg computer chooses A and (additionally) 4 paperclips if the 2 kg computer chooses B. Then the right choice for both systems (which still can't distinguish themselves from each other) would still be A, but I don't see how this is related to the relative weight of both maximizers' experiences - after all, how much value to give each computer's vote is decided by the operators of the experiment, not the computers. To the contrary, if the maximizer cares about the experienced number of paperclips, and each of the maximizers only learns about the paperclips generated by its own choice regarding the given options, I'd still say that the maximizer should choose B.

Replies from: Manfred
comment by Manfred · 2014-02-11T20:33:54.507Z · LW(p) · GW(p)

To the contrary, if the maximizer cares about the experienced number of paperclips, and each of the maximizers only learns about the paperclips generated by its own choice regarding the given options

Right, that's why I split them up into different worlds, so that they don't get any utility from paperclips created by the other paperclip maximizer.

how much value to give each computer's vote is decided by the operators of the experiment, not the computers

Not true - see the Sleeping Beauty problem.

Replies from: yttrium
comment by yttrium · 2014-02-12T07:55:58.040Z · LW(p) · GW(p)

I still think that the scenario you describe is not obviously, and not according to all philosophical intuitions, the same as one where both minds exist in parallel.

Also, the expected number of paperclips (what you describe) is not equal to the expected experienced number of paperclips (which would be the relevant weighting for my post). After all, if A involves killing the maximizer before generating any paperclips, the paperclip-maximizer would choose A, while the experienced-paperclip-maximizer would choose B. When choosing A, the probability of experiencing paperclips would obviously differ from the probability of paperclips existing.

Replies from: Manfred
comment by Manfred · 2014-02-12T19:15:24.958Z · LW(p) · GW(p)

Also, the expected number of paperclips (what you describe) is not equal to the expected experienced number of paperclips (what would be the relevant weighting for my post).

If you make robots that maximize your proposed "subjective experience" (proportional to mass) and I make robots that maximize some totally different "subjective experience" (how about proportional to mass squared!), all of those robots will act exactly as one would expect - the linear-experience maximizers would maximize linear experience, the squared-experience maximizers would maximize squared experience.

Because anything can be put into a utility function, it's very hard to talk about subjective experience by referencing utility functions. We want to reduce "subjective experience" to some kind of behavior that we don't have to put into the utility function by hand.

In the Sleeping Beauty problem, we can start with an agent that selfishly values some payoff (say, candy bars), with no specific weighting on the number of copies, and no explicit terms for "subjective experience." But then we put it in an unusual situation, and it turns out that the optimal betting strategy is the one where it gives more weight to worlds where there are more copies of it. That kind of behavior is what indicates to me that there's something going on with subjective experience.

comment by Scott Garrabrant · 2014-02-11T18:17:02.200Z · LW(p) · GW(p)

It seems like if you weight by matter, you also have to weight by the amount of time it takes to do the computation. Slower minds should be more likely.

Replies from: Scott Garrabrant
comment by Scott Garrabrant · 2014-02-11T18:41:17.243Z · LW(p) · GW(p)

This raised a question in my mind about relativity. It was a dead end, but in case others have a similar thought:

You have to take the time as experienced by the mind itself, and you have to take the rest mass of the mind.

If you do this, you can go on thinking about weighting by the mass of the brain without having to choose a reference frame.

comment by ThisSpaceAvailable · 2014-02-16T07:08:09.100Z · LW(p) · GW(p)

If I take a 1 kg computer and put a 1 kg rock on top of it, do I now have 2 kg computer? Are you only counting the "essential" weight, and if so, how do you define "essential"? What if I have a 100 kg computer, of which 1 kg is running a sentient program, and 99 kg is playing Solitaire? How do you decide how much of the computations are part of the sentience?

What if we run a computer, record its state at each clock cycle, and broadcast those states to a billion TV screens? Do we now weight the computer nine orders of magnitude more than we would otherwise?

Replies from: yttrium
comment by yttrium · 2014-07-28T07:10:08.257Z · LW(p) · GW(p)

The rock on top of the computer wouldn't count toward the "amount doing the computation". Apart from that, I agree that weight probably isn't the right quantity. A better way to formulate what I am getting at would maybe be that "the probability of being a mind is an extensive physical quantity". I have updated the post accordingly.

Regarding your second paragraph: No, the TV screens aren't part of the matter that does the computation.

comment by lavalamp · 2014-02-12T01:11:19.730Z · LW(p) · GW(p)

I think you're getting downvoted for your TL;DR, which is extremely difficult to parse. May I suggest:

TL;DR: Treating "computers running minds" as discrete objects might cause a paradox in probability calculations that involve self-location.

Replies from: yttrium
comment by yttrium · 2014-02-12T08:06:25.116Z · LW(p) · GW(p)

Changed it, that sounds better.

Replies from: ThisSpaceAvailable
comment by ThisSpaceAvailable · 2014-02-16T07:09:54.289Z · LW(p) · GW(p)

Of course, now your introductory phrase "TL;DR by lavalamp:" doesn't make sense without the knowledge that "lavalamp" is a proper noun.