# Negentropy Overrated?

post by paulfchristiano · 2011-12-29T10:14:30.847Z · LW · GW · Legacy · 32 comments

This post may be interesting to some LWers.

In summary: it looks like our universe can support reversible computers which don't create entropy. Reversible computers can simulate irreversible computers, with pretty mild time and space blowup. So if moral value comes from computation, negentropy probably won't be such an important resource for distant future folks, and if the universe lasts a long time we may be able to simulate astronomically long-lived civilizations (easily 10^(10^25) clock cycles, using current estimates and neglecting other obstructions).

Has this been discussed before, and/or is there some reason that it doesn't work or isn't relevant? I suspect that this consideration won't matter in the long run, but it is at least interesting and seems to significantly deflate (long-run) concerns about entropy.

comment by Wei_Dai · 2011-12-29T18:24:18.818Z · LW(p) · GW(p)

Anything you can compute using an infinite amount of negentropy, time T, and space S, can be computed using only about S negentropy if you are willing to spend a little bit of extra space and time (S log T and T^1.5, for example, or S and 2^S * T). So future universes may be constrained by available space and time rather than negentropy,

I don't understand the conclusion here. It sounds like the set of potential computations we can do is a (somewhat complicated) function of available time, space, and negentropy. Given a fixed amount of time and space, we can do more computations if we had more negentropy, right? So in what sense would we not be constrained by negentropy?

comment by paulfchristiano · 2011-12-29T18:31:14.628Z · LW(p) · GW(p)

In general, a source of unlimited negentropy buys you only a small polynomial increase in the available time and space. So negentropy does matter, but the total amount of computation you can do is dominated by the available space and time rather than the available negentropy.

In the limit where you have exponentially more time than space (say, the universe turns out to be some arbitrary reversible bounded cellular automaton) then entropy does no good at all.

comment by Wei_Dai · 2011-12-29T18:57:51.314Z · LW(p) · GW(p)

In the limit where you have exponentially more time than space (say, the universe turns out to be some arbitrary reversible bounded cellular automaton) then entropy does no good at all.

Ok, I see, but this assumes that once you've completed a computation, a second execution of it has no moral value, right? (Because more negentropy would allow you to drive the reversible computation forward faster and complete more executions in the same available time.)

comment by paulfchristiano · 2011-12-29T19:15:48.030Z · LW(p) · GW(p)

Yes--if over the course of your computation you explore on a fraction X of all possible states of the computer, a supply of infinite negentropy would allow you to run the computation something like 1/X faster.

comment by JGWeissman · 2011-12-29T17:53:36.038Z · LW(p) · GW(p)

Before discussing application to such hard problems as simulating brains, I would like to see an example of a short irreversible program, it reversible equivalent, and a trace through running each one that demonstrates the advantage of the reversible program.

comment by paulfchristiano · 2011-12-29T19:29:45.161Z · LW(p) · GW(p)

I want to calculate f^128(X) (mod N), where f is a polynomial and f^128 is f composed with itself 128 times.

First I calculate f(X) (mod N). Now my workspace is:

X, f(X), garbage.

I copy out the f(X) to some scratch space. Now my workspace is:

X, f(X), garbage, f(X).

Now I reverse my original computation to clear the garbage.

X, 0, 0, f(X).

Now I repeat this process to compute f(f(X)).

X, f(f(X)), garbage, f(X).

Now copy out f(f(X)) and uncompute.

X, 0, 0, f(X), f(f(X)).

Now I repeat my calculation of f(X), but backwards.

X, 0, 0, 0, f(f(X)).

Now I repeat my computation of f(f(X)), but starting from f(f(X)) instead of from X.

X, 0, 0, 0, f(f(X)), f(f(f(f(X))))

Now I reverse my computation of f(f(X)):

X, 0, 0, 0, 0, f(f(f(f(X))))

And so on.

I end up with

X, 0, 0, 0, 0, 0, 0, 0, 0, 0, f^128(X).

I've used 11 times more memory than required to compute f. But on the flipside, I've generated no waste heat, while just applying f over and over again would have generated 128 times the waste heat required to compute f.

comment by JGWeissman · 2011-12-29T19:42:48.705Z · LW(p) · GW(p)

1. First I calculate f(X) (mod N). Now my workspace is:

X, f(X), garbage-1.

2. I copy out the f(X) to some scratch space. Now my workspace is:

X, f(X), garbage-2, f(X).

3. Now I reverse my original computation to clear the garbage.

X, 0, 0, f(X).

Is garbage-1 the same as garbage-2? If so, how do you copy f(X) in step 2 without introducing more garbage? If not, how do you reverse the computation in step 3?

comment by paulfchristiano · 2011-12-29T20:14:03.423Z · LW(p) · GW(p)

You can copy something reversibly (into fresh memory) without creating garbage by a using a CNOT (controlled not).

comment by timtyler · 2011-12-29T14:13:38.310Z · LW(p) · GW(p)

Has this been discussed before, and/or is there some reason that it doesn't work or isn't relevant?

I go over some of the issues here. One of the points as a taster:

However, in order to attain the supposed benefits of reversible computation, the reversible machine must actually be run backwards to attain its original state. If this step is not taken, then typically the machine becomes clogged up with digital heat i.e. entropy, and is thus rendered unable to perform further useful work.

comment by paulfchristiano · 2011-12-29T18:18:28.572Z · LW(p) · GW(p)

I am talking about very long term applications, which (it seems?) you aren't trying to address.

For example: yes, to run a reversible computer without waste heat you need to actually uncompute intermediate results. This introduces time overhead which is generally unacceptable for real applications in the modern world, where negentropy is abundant. But what does this have to do with the long term capability of the universe for computation?

comment by timtyler · 2011-12-29T18:45:17.148Z · LW(p) · GW(p)

Reversible computing is good news for power consumption and heat dissipation (including in the long term) - but not great news - bacause of the reasons I go over in my article.

If you think that actually running things backwards is actually an attractive answer, perhaps think some more about how you are going to correct all errors and prevent an error catastrophe upon reversal - and about how you are going to propagate the "reverse now" signal in a reversible manner.

comment by paulfchristiano · 2011-12-29T19:11:52.971Z · LW(p) · GW(p)

perhaps think some more about how you are going to correct all errors and prevent an error catastrophe upon reversal

The normal way. This generates waste heat, but at a rate which depends on the error rate of your components. Under our current understanding of physics, this can be driven essentially to zero in the long run. Even if it can't, it can at least be driven down until we encounter some as-yet-unknown fundamental physical limitation. If we imagine people living in a reversible CA, or any other laws of physics which we can understand, then we can see how they could build an error-free computer once they had a theory of everything. Do you suspect our universe is more complicated, so that such an understanding is impossible? What do you think determines the quantitative bound on achievable error rate?

and about how you are going to propagate the "reverse now" signal in a reversible manner.

I don't understand this objection. I can write down reversible circuits which coordinate with only a constant factor penalty (it is trivial if I don't care about the constant--just CNOT in a 'reverse' bit to each gate from a central controller, and then make each gate perform its operation in reverse if the bit is set, tripling the number of gates). What fundamental non-idealness of reality are you appealing to here, that would prevent a straightforward solution?

comment by timtyler · 2011-12-30T11:47:33.949Z · LW(p) · GW(p)

perhaps think some more about how you are going to correct all errors and prevent an error catastrophe upon reversal

The normal way. This generates waste heat, but at a rate which depends on the error rate of your components. Under our current understanding of physics, this can be driven essentially to zero in the long run.

It seems to me that with hardware error correction, you have to pay for your error rate with heat. If you want to recover your machine by running it backwards you need a very low hardware error rate - which is correspondingly expensive in terms of heat dissipation.

If we imagine people living in a reversible CA, or any other laws of physics which we can understand, then we can see how they could build an error-free computer once they had a theory of everything.

I am not clear how having a TOE will help with cosmic rays and thermal noise. Of course you can deal with thermal noise using a fridge - but then your fridge needs a power supply...

and about how you are going to propagate the "reverse now" signal in a reversible manner.

I don't understand this objection. I can write down reversible circuits which coordinate with only a constant factor penalty (it is trivial if I don't care about the constant--just CNOT in a 'reverse' bit to each gate from a central controller, and then make each gate perform its operation in reverse if the bit is set, tripling the number of gates).

Propagating a "reverse" signal through a large system and getting the components to synchronously reverse course is not exactly trivial. It's also hard to do reversibly, since the "reverse" signal itself tends to dissipate at the edges of the system. - though as you say, that's a one-off cost.

You proposed "tripling of the number of gates". Plus the machine has twice the runtime because of the "running backwards" business. Reversibility has some costs, it seems...

I am pretty sceptical about the idea that the future will see reversible computers that people will bother to run backwards. Instead, I expect that we will see more of the strategies mentioned in my article.

comment by Sniffnoy · 2011-12-29T20:23:16.041Z · LW(p) · GW(p)

I think if you realistically want to run a reversible computer, you need to uncompute regardless, because otherwise your space is non-reusable, becoming only as useful as time.

comment by gwern · 2011-12-29T17:04:04.008Z · LW(p) · GW(p)

This sounds right to me. Reversible computing means you can get closer to the energy limit set by Landauer's principle, but you still don't drive the negentropy cost per bit to zero.

comment by paulfchristiano · 2011-12-29T18:11:39.049Z · LW(p) · GW(p)

You get the energy limit set by Landauer's principle without reversible computing. Reversible computing completely circumvents Landauer's principle (although there may be other limitations).

comment by gwern · 2011-12-29T21:37:55.179Z · LW(p) · GW(p)

I don't follow. You still have to pay for erasing and changing your bits, regardless of whether you use reversible computing and do the erasure at the end, or whether you do it during the computation as in irreversible computing.

comment by paulfchristiano · 2011-12-29T21:44:27.985Z · LW(p) · GW(p)

You generally uncompute intermediate results in reversible computation, rather than erasing them: if you produced some garbage by starting from a low entropy state and running the computation C forward, you can get rid of the garbage by just running C backwards (perhaps first copying whatever output you care about, so that it doesn't get destroyed).

comment by gwern · 2011-12-29T22:01:50.249Z · LW(p) · GW(p)

perhaps first copying whatever output you care about, so that it doesn't get destroyed

Well, yeah. You're going to use up negentropy for that - where are you copying to? Reversible computing just means you spend less negentropy. (Feel like I've said this before.)

comment by paulfchristiano · 2011-12-29T22:32:24.437Z · LW(p) · GW(p)

Yes, you just produce less entropy. But you produce a lot less entropy and it is completely unrelated to Landauer's principle.

Suppose I want to calculate a 1TB document created by a googol person civilization running for a googol googol years. I only have to produce a TB of entropy, not more than a googol googol googol bits (as I would have to if I used irreversible computing naively).

comment by Vladimir_Nesov · 2011-12-30T15:03:45.001Z · LW(p) · GW(p)

Very nice! So you don't just cheaply compute-and-uncompute a lot of independent worlds, you can allow them to leave an arbitrarily-difficult-to-produce trace on the future worlds. Given how much entropy we really have, sufficiently small persons for example can be spared from uncomputation.

In particular, a person can live in an incrementally computed-and-uncomputed virtual world that is being regularly reversed to its initial state, with the effect that only the person consumes entropy, and the whole arbitrarily complicated world has zero entropic footprint. The world could also be optimized game-save-style over person-time, starting from an initial state, but going forward, so that some version of the person does all the updating, and so carries the excess entropy. Alternatively, improvements to the world could be carried out by discarded copies. Think of software downloaded from the distant future...

Or more generally, this is just time travel, where you can transport sufficiently small things (and people) to the past (or between timelines) and change things the next time over. You travel forwards in time by computing, backwards by uncomputing, you can take some luggage with you, you can climb up a different timeline by observing without interference, and you can intervene and change a timeline any time you want. The network of people navigating baseline network of virtual worlds builds up a second level (implementing meta-time), which itself can be navigated and intervened-in by other observers, and so forth. Sounds like a Greg Egan novel (that wasn't written yet).

comment by Dmytry · 2011-12-29T13:19:39.059Z · LW(p) · GW(p)

The interesting issue with reversible computation is that you do not need complete reversibility to reap nearly all the energy saving benefits of reversible computation. For example, a CPU built on perfect reversible logic, which would run normal irreversibly computing software, would still have ridiculously low energy requirements compared to today's CPUs, as most of the irreversibility happens on the very low level (think intermediate steps when adding two numbers, think loss of state of each transistor in CPU at every clock cycle) and at the level of software the number of erased bits per cycle is actually rather small. edit: and can probably be further minimized by optimizing compiler without changing behaviour of program in any way or changing it's memory requirements by more than a constant factor

comment by Douglas_Knight · 2011-12-30T02:35:22.423Z · LW(p) · GW(p)

I think this article needs the terms "P" and "PSPACE": reversible computing can apply PSPACE algorithms, while normal computers are limited to P by Landauer's principle. Yes, there's nothing true in that sentence that's not already explicit in the article, but abstractions are very useful for organizing understanding.

comment by Dmytry · 2011-12-29T23:06:02.644Z · LW(p) · GW(p)

Other thing that may be interesting to note. You can describe reversible algorithms in straight C or C++ . Instead of the assignment operator, you need to use operators ^= , += , -= which are reversible, and you would need a compiler that would be intelligent about wiping the temporary variables in the expressions by reversion of the statement. The multiplications are slightly tricky but a+=b*c can be implemented in a reversible manner.

I am a game engine programmer (with background in special effects / simulation) and it seems to me that I could easily write a neural network simulator (fairly basic) which would make use of reversible computation and would use about 4x the memory but would have a hundred million times lower power consumption for 10^14 synapses. Essentially it will only wipe as much data per step as is the input and output of conscious being, which is much smaller than the state ( assuming 1 megapixel vision and 10 000 muscles.) The exercise seems pretty trivial to me; the calculation structure would vaguely resemble a Feistel network (I drew my inspiration from it), and it would be able to run the being in reverse using some constant-size extra data that is carried along with the being, if a log of the being's original 'sensory input' is recorded.

comment by Will_Newsome · 2011-12-29T21:49:31.797Z · LW(p) · GW(p)

if you have a sufficient density of superintelligences you can reverse the computation done by the universe itself which is pretty crazy isnt it yeah it is anyway this has practical consequences for ai cooperation problems so your decision theory should be able to handle it if your decision theory doesnt suck the end result can end up looking like qm with a nonrandom collapse postulate if superintelligences reverse computations in a way that is not identical to the born rule ie they prefer worlds where magical unicorns exist or whatever

comment by paulfchristiano · 2011-12-29T22:26:32.728Z · LW(p) · GW(p)

Let me try to parse and respond:

Yes, it seems possible that a superintelligence embedded in the world can exert enough control to reverse the universe's computation. I don't think density is the relevant notion, and there are confusing issues going on here (which have certainly been touched on at LW). It is not clear that this sort of reversal is possible.

Yes, it seems like reversing the universe's computation may be useful under a broad range of circumstances, though ai cooperation is not high on the list for me--recovering negentropy the universe has already burned seems much more likely, and incomprehensible superintelligence-activities seem more likely still. I agree that recovering as much negentropy as possible may require cooperation between not-identical AIs.

I agree that if you have some sort of causal/decision-theoretic-significance account of consciousness, the possibility of these interactions might give you a Born rule. But it seems like an account of consciousness has already done most of the work of explaining the Born rule, and this is sort of an afterthought. I think it is possible in principle that consciousness in our universe only exists because of some convergence of this form, but again most of the mystery is coming from the account of consciousness (and similar strangeness is possible without QM).

I don't have a good model for why you write the things you write (though I rarely vote them down). LessWrong seems like the only place anywhere on earth where people care about some of the things you think about it, but your behavior suggests you don't care about it. I accept that this is due to facts about you I don't understand, and that comments like this one may just make the issue worse, but it does seem to me like the stuff you write about isn't at very much inferential distance from the zeitgeist here, and that if you wanted to talk about it uncryptically you just could, and if you don't want to talk about it it is perplexing that you bother.

comment by Will_Newsome · 2011-12-29T22:36:08.318Z · LW(p) · GW(p)

Density is important because entanglements run away at light speed: if you can't get those entanglements back then you can't reverse the past, you can only reverse back to the point where the superintelligence originated, which isn't that neat. The only way to recohere the system is if there's a boundary condition or, more understandably, if a superintelligence catches your lost information and rushes it back to you (and presumably you would do the same for it, so it's a very clear-cut trade scenario). Problem is that it might take a very high density of superintelligences throughout the universe to pull this off all the way back to the big bang: even so it seems likely that you can get perfect information for at least the last few thousand years, enough to, say, ressurect every human who'd ever died. Even cooler than cryonics. (ETA: Insofar that these ideas are correct they're Steve Rayhawk's, insofar as they're retarded they're mine.)

I don't follow what consciousness has to do with it? I think I see what you're getting at but your use of the word "consciousness" kinda threw me off, and I'd rather not misinterpret you.

comment by paulfchristiano · 2011-12-29T22:52:57.826Z · LW(p) · GW(p)

The only way to recohere the system is if there's a boundary condition

Or if spacetime is compact in any way, or in fact if there is only finitely much negentropy. In any of these cases, your abstraction of a bunch of distinct but potentially interfering branches will break down, and you can pick up all of your old waste heat even if it is "running away" at light speed.

In the other universes, where somehow things can continue expanding at light speed indefinitely, you can recover perfect info by exploring possible physical theories until you find yourself.

But it seems like all of these considerations are far too frail to have much impact in themselves; they just serve to put a lower bound on how weird we can expect the far future to be.

comment by Will_Newsome · 2011-12-29T23:11:35.976Z · LW(p) · GW(p)

In the other universes, where somehow things can continue expanding at light speed indefinitely, you can recover perfect info by exploring possible physical theories until you find yourself.

I don't immediately see how this gets around the problem; I'm probably just being stupid, but aren't you still left with a bunch of possible histories consistent with your current state/decisions only an unknown subset of which are real? ("Real" in the usual sense, i.e. can reliably be used to coordinate with other agents.)

I agree re lower bound on weirdness, I'd add it serves as a lower bound on how competent your decision theory has to be (which shouldn't be a problem).

comment by paulfchristiano · 2011-12-29T23:55:14.115Z · LW(p) · GW(p)

("Real" in the usual sense, i.e. can reliably be used to coordinate with other agents.)

With respect to e.g. bringing back all of the dead at least it doesn't seem to matter: there are lots of histories consistent with your memories, some of them aren't consistent with your 'real' history, but (at least if we have appropriate philosophical views towards our prior and so on) each of these histories also leads to an agent in your current situation, so if each one of them guesses the same distribution you end up with the same guesses as if each had been informed of their "real" history. With respect to uncomputing the universe I agree that you can't recover all of the negentropy, but you do seem to recover perfect info in the relevant sense and in such universes you have infinitely much stuff anyway.

comment by Will_Newsome · 2011-12-30T01:36:45.834Z · LW(p) · GW(p)

Okay, I understand now; I was thinking about the problem of reversing the past. Your arguments make sense if you just want to resurrect folk; it's possible (as you seem to think?) that there's no particularly good reason to reverse the past as long you have tons of computing power and all the information about the past that you'd need in practice. It's definitely true that the latter strategy is applicable in more possible universes.

comment by Will_Newsome · 2011-12-29T21:53:16.269Z · LW(p) · GW(p)

New hobby: expounding profound ideas in lolcat or without any punctuation or capitalization.