Taking Occam Seriously
post by steven0461 · 2009-05-29T17:31:52.268Z · LW · GW · Legacy · 51 comments
Paul Almond's site has many philosophically deep articles on theoretical rationality under LessWrongish assumptions, including but not limited to some great atheology, an attempt to solve the problem of arbitrary UTM choice, a possible anthropic explanation of why space is 3D, a thorough defense of Occam's Razor, a lot of AI theory that I haven't tried to understand, and an attempt to explain what it means for minds to be implemented (related in approach to this and this).
Comments sorted by top scores.
comment by SilasBarta · 2009-05-29T21:31:50.826Z · LW(p) · GW(p)
Okay, now for a more substantive comment (ETA: see note 1). I read the essay on what it means for a mind to be implemented, and Almond talks about a "problem" presented by Searle that says, "Can you call a wall a mind (or a word processor program) on the grounds that you can find some isomorphism between the molecular motion of the wall and some more interesting program?" and thus, "Why is the isomorphism to the program somehow less of a valid interpretation than the one we apply to an actual computer running the known program?"
I really don't see what the problem is here. The argument relies on the possibility of finding an isomorphism between an arbitrary "interesting" algorithm, and something completely random. Yes, you can do it: but only by applying an interpreter of such complexity that it is itself the algorithm, and the random process is just background noise.
The reason that we call a PC (or a domino setup) a "computer" is that its internal dynamics are consistently isomorphic to the abstract calculation procedure the user wants it to perform. In a random environment, there is no such consistency, and as time progresses you must keep expanding your interpretation so that it continues to output what WordStar does. Which, again, makes you the algorithm, not the wall's random molecular motions.
(Edit to add: By contrast, a PC's interpreter (the graphics card, monitor, mouse, keyboard, etc.) does not change in complexity, nor does the mapping it performs from the CPU/memory to me.)
Surely, the above differences show how you can meaningfully differentiate between true programs/minds and random processes, yet Almond doesn't mention this possibility (or I don't understand him).
1 By this remark, I was absolutely not meaning to trivialize the other comments here. Rather, at the time I posted this, there were few comments, and I had just made a comment with no substance. The remark compares to my other comment, not to any other commenter.
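To make the asymmetry concrete, here is a toy sketch (my own construction, not anything from Almond's essay; the WordStar stand-in and the 16-byte chunking are arbitrary). A genuine implementation admits a short, fixed decoder no matter how long the run gets, while the only "interpreter" that reads the same output out of random noise is one that carries the output inside itself and grows with it:

```python
import os, zlib

# Toy comparison: how much "interpreter" is needed to read a target program's
# output out of (a) a genuine implementation and (b) random noise.

target_output = b"WordStar frame " * 200              # stand-in for the algorithm's output

# (a) A genuine implementation: a fixed, short decoder suffices, however long the
#     run gets. Here the "machine state" is a compressed trace and the interpreter
#     is just zlib.decompress -- its size does not grow with the output.
machine_state = zlib.compress(target_output)
assert zlib.decompress(machine_state) == target_output

# (b) A wall of random noise: the only "interpreter" mapping it to the target
#     output is one that stores the output itself, e.g. a lookup table pairing
#     noise chunks with output chunks. Its size grows in step with the output.
noise = os.urandom(len(target_output))
lookup_interpreter = dict(zip(
    (noise[i:i + 16] for i in range(0, len(noise), 16)),
    (target_output[i:i + 16] for i in range(0, len(target_output), 16)),
))
interpreter_size = sum(len(k) + len(v) for k, v in lookup_interpreter.items())
print("fixed decoder handles any run length; lookup interpreter already stores",
      interpreter_size, "bytes for", len(target_output), "bytes of output")
```

The second kind of interpretation is the one I'm saying has to keep expanding.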
Replies from: gjm, PaulUK, steven0461↑ comment by gjm · 2009-05-30T23:40:53.447Z · LW(p) · GW(p)
One picky remark: Paul Almond ascribes this argument to Searle, and indeed it appears in a work of Searle's from 1990; but Hilary Putnam published a clearer and more rigorous presentation of it, two years earlier, in his book "Representation and reality".
(Putnam also demolished the rather silly Goedelian argument against artificial intelligence that's commonly attributed to J R Lucas before Lucas even published it. Oh, and he was one of the key players in solving Hilbert's 10th problem. Quite a clever chap.)
↑ comment by PaulUK · 2009-05-30T02:16:26.775Z · LW(p) · GW(p)
As the author of this article, I will reply to this, though it is hard to make much of a reply here. (I actually got here out of curiosity when I saw the site logs.) I am, however, always pleased to discuss issues like this with people. One issue with this reply is that it is not just randomness we have to worry about. If we are basing a computational interpretation on randomness, yes, we may need to make the computational interpretation progressively more extreme, but Searle's famous WordStar-running-in-a-wall example is just one example. We may not even have the computational interpretation based on randomness: it could conceivably be based on structure in something else, even though that structure would not be considered to be running the computer program except under a very forced interpretation. Where would we draw the line? Another point: why should it matter if we use a progressively more extreme interpretation? We might, for example, just want to say that a computation ran for 10 seconds, which relies on a fixed interpretation (if a complex one), and what happens after that may not interest us. Another issue is that the main argument had been about statistical issues with combining computers when considering probability questions - the whole thing had not been based on Searle, who would not take me any more seriously, by the way.
Replies from: SilasBarta↑ comment by SilasBarta · 2009-05-30T04:36:13.591Z · LW(p) · GW(p)
We may not even have the computational interpretation based on randomness: it could conceivably be based on structure in something else, even though that structure would not be considered to be running the computer program except under a very forced interpretation. Where would we draw the line?
We would draw the line where our good old friend mutual information comes in. If learning the results of the other phenomenon tells you something about the results of the algorithm you want to run, then there is mutual information, and the phenomenon counts as a (partial) implementation of the algorithm.
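As a rough numerical illustration of that criterion (my own toy example; the streams, noise level, and sample size are made up), empirical mutual information cleanly separates a faithful implementation from a weakly correlated process and from Searle's wall:

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (bits per symbol) between two equal-length streams."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

random.seed(0)
algorithm = [random.randrange(2) for _ in range(10_000)]        # the algorithm we care about
faithful  = list(algorithm)                                     # a consistent implementation
noisy     = [b ^ (random.random() < 0.45) for b in algorithm]   # weakly correlated process
wall      = [random.randrange(2) for _ in range(10_000)]        # Searle's wall: no correlation

for name, stream in [("faithful", faithful), ("noisy", noisy), ("wall", wall)]:
    print(f"{name:8s}: {mutual_information(algorithm, stream):.4f} bits/symbol")
```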
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T05:23:34.220Z · LW(p) · GW(p)
This is an approach I considered back in the 1990s, actually, and at the time I considered it correct. I get the idea. We say that the "finding algorithm" somehow detracts from what is running. The problem is, this does not leave a clearly defined algorithm as the one being found. If X is found by F, you might say that all that runs is a "partial version of X" and that X only exists when found by F. This, however, would not just apply to deeply hidden algorithms. I could equally well apply it to your brain. I would have to run some sort of algorithm, F, on your brain to work out that some algorithm corresponding to you, X, is running. Clearly, that would be nothing like as severe as the extreme situations discussed in that article, but what does it mean for your status? Does it mean that the X corresponding to you does not exist? Are you "not all there" in some sense?
Here is a thought experiment:
A mind running in a VR system (suppose the two are one software package to make this easier) gradually encrypts itself. By this I mean that it goes through a series of steps, each intended to make it slightly more difficult to realize that the mind is there. There is no end to this. When does the mind cease to exist? When it is so hard to find that you would need a program as long as the one being hidden to find it? I say that is arbitrary.
You suggest that maybe the program running the mind just exists "partially" in some way, which I fully understand. What would the experience be like for the mind as the encryption gets more and more extreme? I say this causes issues, which are readily resolved if we simply say that the mind's measure decreases.
I can also add a statistical issue to this, which I have not written up yet. (I have a lot to add on this subject. It may be obvious that I need to argue that this applies to everything, and not just minds, to avoid some weird kind of dualism.).
Suppose we have two simulations of you, running in VRs. One is about to look in a box and see a red ball. The other will see a blue ball. We subject the version that will see the blue ball to some process that makes it slightly harder to find. You don't know which version you are. How strongly should you expect to see a blue ball when you look in the box? Do you say it is 50/50 whether you will see a red ball or a blue ball? We keep increasing the "encryption" a bit each time I ask the question. If your idea that the mind is somehow only "partial" because it needs the finding algorithm to find it is right, I suggest we end up with statistical incoherency. We can only say that the probability is 50/50 when the situations are exactly the same, but that will never be the case in any real situation. For any situation, one mind will need a bit more finding than the other.
In other words, if you think the length of the finding algorithm makes the algorithm running a mind somehow "partial", in a statistical question in which you had two possibilities, one in which your mind was harder to find than the other, and you don't know which situation you are in, when would you eliminate the "partial" mind as a possibility? If you say, "Never. As the encryption increases I would just say I am less and less likely to be in that situation" you have effectively agreed with me by adopting an approach where each mind is as valid as the other (you accept either as a candidate for your situation but treat them differently with regard to statistics - which is what I do). If you say that one mind cannot be a candidate for your situation then you have the issue of cut-off point. What cut-off point? When would you say, “This mind is real. This mind is only partial so cannot be a candidate for my experience. Therefore, I am the first mind?”
I would point out that I do not ignore these issues. I address them by using measure. I take the view that a mind which takes more finding exists with less measure, because a smaller proportion of the set of all possible algorithms that could be used to find something like it will find something like it.
Finally, this only deals with one issue. There is also the issue of combining computers in the statistical thought experiments that I mentioned in the first article of that series. My intention in that series is to try to show that these various issues demand that we take a particular view about minds and reality to maintain statistical coherency.
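To give a toy model of what I mean by measure (my own illustration for this thread, not something from the articles; the XOR encryption and the bit-counting are stand-ins for "finding algorithms"): hide a small "mind" behind a k-bit key and count what fraction of candidate finders recover it. The measure falls off smoothly as 2^-k, with no privileged cut-off point of the kind being asked for.

```python
from itertools import product

mind = b"hello, I am a very small mind"

def xor_with_key(data: bytes, key_bits: tuple) -> bytes:
    """XOR `data` with a repeating key built from a tuple of bits (zero-padded to bytes)."""
    padded = "".join(map(str, key_bits)).ljust(8 * ((len(key_bits) + 7) // 8), "0")
    key = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

for k in range(1, 13):                          # length of the hiding key, in bits
    true_key = tuple(1 for _ in range(k))       # an arbitrary fixed key
    hidden = xor_with_key(mind, true_key)       # the "encrypted" mind
    finders = sum(1 for cand in product((0, 1), repeat=k)
                  if xor_with_key(hidden, cand) == mind)
    print(f"key length {k:2d} bits: {finders}/{2 ** k} candidate finders succeed "
          f"(measure ~ {finders / 2 ** k:.6f})")
```

The point is just that nothing special happens at any particular key length; the fraction of finders keeps shrinking smoothly.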
Replies from: loqi↑ comment by loqi · 2009-05-30T18:38:35.784Z · LW(p) · GW(p)
When does the mind cease to exist? [...] I take the view that a mind which takes more finding exists with less measure, because a smaller proportion of the set of all possible algorithms that could be used to find something like it will find something like it.
I'm running into trouble with the concept of "existence" as it's being applied here. Surely existence of abstract information and processes must be relative to a chosen reference frame? The "possible algorithms" need to be specified relative to a chosen data set and initial condition, like "observable physical properties of Searle's wall given sufficient locality". Clearly an observer outside of our light cone couldn't discern anything about the wall, regardless of algorithm.
An encrypted mind "existing less" doesn't seem to carry any subjective consequences for the mind itself. What if a mind encrypts itself but shares the key with a few others? Wouldn't its "existence" depend on whether or not the reference frame has access to the key?
If you've read it, I'm curious to know what you think of the "dust hypothesis" from Egan's Permutation City in this context.
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T18:57:14.131Z · LW(p) · GW(p)
"Less measure" is only meant to be of significance statistically, not subjectively. For example, if you could exist in one of two ways, one with measure X and one with measure of 0.001X, I would say you should think it more likely you are in the first situation. In other words, I am agreeing (if you are arguing for this) that there should be no subjective difference for the mind in the extreme situation. I just think we should think that that situation corresponds to "less" observers in some way.
My own argument is actually a justification of something a bit like the dust hypothesis in "Permutation City". However, there are some significant differences, so the analogy should not be taken too far. I would say that the characters in Greg Egan's novel undergo a huge decrease in measure, which could cause philosophical issues - though it would not feel different after it had happened to you.
I think we should consider this in terms of measure because there are "more ways to find you" in some situations than in others. It is almost like you have more minds in one situation than another - though there are no absolute numbers and really it should be considered in terms of density. If you want to see why I think measure is important, this first article may help: http://www.paul-almond.com/Substrate1.htm.
Replies from: loqi↑ comment by loqi · 2009-05-30T20:16:56.749Z · LW(p) · GW(p)
For example, if you could exist in one of two ways, one with measure X and one with measure of 0.001X, I would say you should think it more likely you are in the first situation. [...] I just think we should think that that situation corresponds to "less" observers in some way.
This seems tautological to me. Your measure needs to be defined relative to a given set of observers.
I think we should consider this in terms of measure because there are "more ways to find you" in some situations than in others.
More ways for who to find you?
If you want to see why I think measure is important, this first article may help
Very interesting piece. I'll be thinking about the Mars colony scenario for a while. I do have a couple of immediate responses.
How likely is it that you are in Computer A, B or C?
As long as the simulations are identical and interact identically (from the simulation's point of view) with the external world, I don't think the above question is meaningful. A mind doesn't have a geographical location, only implementations of it embedded in a coordinate space do. So A, B, and C are not disjoint possibilities, which means probability mass isn't split between them.
The more redundancy in a particular implementation of a version you, then the more likely it is that that implementation is causing your experiences.
I see this the other way around. The more redundancy in a particular implementation, the more encodings of your own experiences you will expect to find embedded within your accessible reality, assuming you have causal access to the implementation-space. If you are causally disconnected from your implementation (e.g., run on hypothetical tamper-proof hardware without access to I/O), do you exist with measure zero? If you share your virtual environment with millions of other simulated minds with whom you can interact, do they all still exist with measure zero?
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T20:31:54.256Z · LW(p) · GW(p)
"As long as the simulations are identical and interact identically (from the simulation's point of view) with the external world, I don't think the above question is meaningful. A mind doesn't have a geographical location, only implementations of it embedded in a coordinate space do. So A, B, and C are not disjoint possibilities, which means probability mass isn't split between them."
I dealt with this objection in the second article of the series. It would be easy to say that there are two simulations, in which slightly different things are going to happen. For example, we could have one simulation in which you are going to see a red ball when you open a box and one where you are going to see a blue ball. We could have lots of computers running the red ball situation and then combine them and discuss how this affects probability (if at all).
"The more redundancy in a particular implementation of a version you, then the more likely it is that that implementation is causing your experiences."
Does this mean that if we had a billion identical simulations of you in a VR where you were about to see a red ball and one (different) simulation of you in a VR where you are about to see a blue ball, and all these were running on separate computers, and you did not know which situation you were in, you would not think it more likely you were going to see a red ball? (and I know a common answer here is that it is still 50/50 - that copies don't count - which I can answer if you say that and which is addressed in the second article - I am just curious what you would say about that.)
" see this the other way around. The more redundancy in a particular implementation, the more encodings of your own experiences you will expect to find embedded within your accessible reality, assuming you have causal access to the implementation-space. If you are causally disconnected from your implementation (e.g., run on hypothetical tamper-proof hardware without access to I/O), do you exist with measure zero? If you share your virtual environment with millions of other simulated minds with whom you can interact, do they all still exist with measure zero?"
I am not making any suggestion that there is any connection between measure, redundancy and whether or not you are connected to I/O. Whether you are connected to I/O does not interest me much. However, some particularly low measure situations may be hard to connect to I/O if they are associated with very extreme interpretations.
Replies from: loqi↑ comment by loqi · 2009-05-31T00:02:19.083Z · LW(p) · GW(p)
I dealt with this objection in the second article of the series. It would be easy to say that there are two simulations, in which slightly different things are going to happen.
While this is also a valid and interesting scenario to consider, I don't think it "deals with the objection". The idea that "which computer am I running on?" is a meaningful question for someone whose experiences have multiple encodings in an environment seems pretty central to the discussion.
Does this mean that if we had a billion identical simulations of you in a VR where you were about to see a red ball and one (different) simulation of you in a VR where you are about to see a blue ball, and all these were running on separate computers, and you did not know which situation you were in, you would not think it more likely you were going to see a red ball?
I actually don't have a good answer to this, and the flavor of my confusion leads me to suspect the definitions involved. I think the word "you" in this context denotes something of an unnatural category. To consider the question of anticipating different experiences, I have to assume a specific self exists prior to copying. Are the subsequent experiences of the copies "mine" relative to this self? If so, then it is certain that "I" will experience both drawing a red ball and drawing a blue ball, and the question seems meaningless. I feel that I may be missing a simple counter-example here.
I know a common answer here is that it is still 50/50 - that copies don't count - which I can answer if you say that and which is addressed in the second article
50/50 makes sense to me only as far it represents a default state of belief about a pair of mutually exclusive possibilities in the absence of any relevant information, but the exclusivity troubles me. I read objection 9, and I'm not bothered by the "strange" conclusion of sensitivity to minor alterations (perhaps this leads to contradictions elsewhere that I haven't perceived?). I agree that counting algorithms is just a dressed-up version of counting machines, because the entire question is predicated on the algorithms being subjectively isomorphic (they're only different in that some underlying physical or virtual machine is behaving differently to encode the same experience).
Of course, this leads to the problem of interpretation, which suggests to me that "information" and "algorithm" may be ill-defined concepts except in terms of one another. This is why I think I/O is important, because a mind may depend on a subjective environment to function. If this is the case, removal of the environment is basically removal of the mind. A mind of this sort, subjectively dependent on its own substrate, can be "destroyed" relative to observers of the environment, as they now have evidence for the following reasoning:
- Mind M cannot logically exist except as self-observably embedded in environment E. So if E lacks such an encoding, M cannot exist.
- I have observed E, and have sound reasons (local to E) to doubt the existence of a suitable encoding of M.
- Therefore, M does not exist.
So far, this is the only substrate dependence argument I find convincing, but it requires the explicit dependence of M on E, which requires I/O.
Replies from: PaulUK↑ comment by PaulUK · 2009-05-31T00:33:50.142Z · LW(p) · GW(p)
"Are the subsequent experiences of the copies "mine" relative to this self? If so, then it is certain that "I" will experience both drawing a red ball and drawing a blue ball, and the question seems meaningless. I feel that I may be missing a simple counter-example here."
No. Assume you have already been copied and you know you are one of the software versions. (Some proof of this has been provided). What you don't know is whether you are in a red ball simulation or a blue ball simulation. You do know that there are a lot of (identical - in the digital sense) red ball simulations and one blue ball simulation. My view on this is that you should presume yourself more likely to be in the red ball simulation.
Some people say that the probability is 50/50 because copies don't count. I would make these points:
- sensitivity, which you clearly know about.
- it is hard to say where each program starts and ends. For example, we could say that the room with each red ball simulation computer in it is a simulation of a room with a red ball simulation computer in it - in other words, the physical environment around the computer could validly be considered part of the program. It is trivial to argue that a physical system is a valid simulation of itself. As each computer is going to be in a slightly different physical environment, it could be argued that this means that all the programs are different, even if the digital representation put into the box by the humans is the same. The natural tendency of humans is just to focus on the 1s and 0s - which is just a preferred interpretation.
- Humans may say that each program is "digitally" the same but we might interpret the data slightly differently. For example, one program run may have a voltage of 11.964V in a certain switch at a certain time. Another program run may have a voltage of 11.985V to represent the same binary value. It could be argued that this makes them different programs, each of which is simulating a computer with an uploaded mind on it with different voltages in the switches (again, using the idea that a thing is also a computer simulation of that thing if we are going to start counting simulations).
I just think that when we try to go for 50/50 (copies don't count) we can get into a huge mess that a lot of people can miss. While I don't think you agree with me, I think maybe you can see this mess.
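To put rough numbers on the disagreement (a tabulation of my own for this thread; the one-billion figure is just the number from my question above):

```python
# The two answers being argued about: "count every running copy" versus
# "identical copies collapse into one program".

n_red, n_blue = 1_000_000_000, 1     # a billion identical red-ball sims, one blue-ball sim

p_red_counting_copies   = n_red / (n_red + n_blue)   # each running copy gets equal weight
p_red_copies_dont_count = 1 / 2                      # one red "program" versus one blue "program"

print(f"P(red), counting copies:    {p_red_counting_copies:.9f}")
print(f"P(red), copies don't count: {p_red_copies_dont_count:.9f}")

# My reading of the sensitivity point: on the copies-don't-count view, making one
# red simulation trivially different (one voltage, one bit of the surrounding room)
# suddenly yields two distinct red programs against one blue, and the answer jumps:
print(f"P(red) after one trivial difference, copies-don't-count view: {2 / 3:.3f}")
```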
"While this is also a valid and interesting scenario to consider, I don't think it "deals with the objection". The idea that "which computer am I running on?" is a meaningful question for someone whose experiences have multiple encodings in an environment seems pretty central to the discussion."
I think the suggested scenario makes it meaningful. There is also the issue of turning off some of the machines. If you know you are running on a billion identical machines, and that 90% of them are about to be turned off, then it could become an important issue for you. It would make things very similar to what is regarded as "quantum suicide".
We can also consider another situation:
You have a number of computers, all running the same program, and something in the external world is going to affect these computers, for example a visitor from the outside world will "login" and visit you - we could discuss the probability of meeting the visitor while the simulations are all identical.
"This is why I think I/O is important, because a mind may depend on a subjective environment to function. If this is the case, removal of the environment is basically removal of the mind."
I don't know if I fully understood that - are you suggesting that a reclusive AI or uploaded brain simulation would not exist as a conscious entity?
As you asked me about Permutation City (Greg Egan's novel) before, I will elaborate on that a bit.
The "dust hypothesis" in Permutation City was the idea that all the bits of reality could be stuck together in different ways, to get different universe. The idea here is that every interpretation of an object, or part of an object, that can be made, in principle, by an interpretative algorithm, exists as an object in its own right. This argument applies it to minds, but I would clearly have to claim it applies to everything to avoid being some kind of weird dualist. It is therefore a somewhat more general view. Egan's cosmology requires a universe to exist to get scrambled up in different ways. With a view like this, you don't need to assume anything exists. While a lot of people would find this counter-intuitive, if you accept that interpretations that produce objects produce real objects, there is nothing stopping you producing an object by interpreting very little data, or no data at all. In this kind of view, even if you had nothing except logic, interpretation algorithms that could be applied in principle with no input - on nothing at all - would still describe objects, which this kind of cosmology would say would have to exist as abstractions of nothing. Further objects would exist that would be abstractions of these. In other words, if we take the view that every abstraction of any object physically exists as a definition of the idea of physical existence, it makes the existence of a physical reality mandatory.
"Of course, this leads to the problem of interpretation, which suggests to me that "information" and "algorithm" may be ill-defined concepts except in terms of one another. This is why I think I/O is important, because a mind may depend on a subjective environment to function."
and I simply take universal realizability at face value. That is my response to this kind of issue. It frees me totally from any concerns about consistency - and the use of measure even makes things statistically predictable.
Replies from: loqi↑ comment by loqi · 2009-05-31T01:27:26.216Z · LW(p) · GW(p)
Assume you have already been copied and you know you are one of the software versions. (Some proof of this has been provided). What you don't know is whether you are in a red ball simulation or a blue ball simulation. You do know that there are a lot of (identical - in the digital sense) red ball simulations and one blue ball simulation. My view on this is that you should presume yourself more likely to be in the red ball simulation.
Ah, this does more precisely address the issue. However, I don't think it changes my inconclusive response. As my subjective experiences are still identical up until the ball is drawn, I don't identify exclusively with either substrate and still anticipate a future where "I" experience both possibilities.
As each computer is going to be in a slightly different physical environment, it could be argued that this means that all the programs are different, even if the digital representation put into the box by the humans is the same.
If this is accepted, it seems to rule out the concept of identity altogether, except as excruciatingly defined over specific physical states, with no reliance on a more general principle.
The natural tendency of humans is just to to focus on the 1s and 0s - which is just a preferred interpretation.
Maybe sometimes, but not always. The digital interpretation can come into the picture if the mind in question is capable of observing a digital interpretation of its own substrate. This relies on the same sort of assumption as my previous example involving self-observability.
I just think that when we try to go for 50/50 (copies don't count) we can get into a huge mess that a lot of people can miss. While I don't think you agree with me, I think maybe you can see this mess.
I'm not sure if we're thinking of the same mess. It seems to me the mess arises from the assumptions necessary to invoke probability, but I'm willing to be convinced of the validity of a probabilistic resolution.
If you know you are running on a billion identical machines, and that 90% of them are about to be turned off, then it could become an important issue for you. It would make things very similar to what is regarded as "quantum suicide".
They do seem similar. The major difference I see is that quantum suicide (or its dust analogue, Paul Durham running a lone copy and then shutting it down) produces near-certainty in the existence of an environment you once inhabited, but no longer do. Shutting down extra copies with identical subjective environments produces no similar outcome. The only difference it makes is that you can find fewer encodings of yourself in your environment.
The visitor scenario seems isomorphic to the red ball scenario. Both outcomes are guaranteed to occur.
I don't know if I fully understood that - are you suggesting that a reclusive AI or uploaded brain simulation would not exist as a conscious entity?
No, I was pointing out the only example I could synthesize where substrate dependence made sense to me. A reclusive AI or isolated brain simulation by definition doesn't have access to the environment containing its substrate, so I can't see what substrate dependence even means for them.
In other words, if we take the view that every abstraction of any object physically exists as a definition of the idea of physical existence, it makes the existence of a physical reality mandatory.
I don't think I followed this. Doesn't any definition of the idea of physical existence mandate a physical reality?
I simply take universal realizability at face value. That is my response to this kind of issue. It frees me totally from any concerns about consistency - and the use of measure even makes things statistically predictable.
I still don't see where you get statistics out of universal realizability. It seems to imply that observers require arbitrary information about a system in order to interpret that system as performing a computation, but if the observers themselves are defined to be computations, the "universality" is at least constrained by the requirement for correlation (information) between the two computations. I admit I find this pretty confusing, I'll read your article on interpretation.
↑ comment by steven0461 · 2009-05-29T22:01:09.420Z · LW(p) · GW(p)
That's his "objection 4", if I'm not mistaken. Complexity of interpretation comes in degrees. How much of the complexity needs to be in the interpretation and not in the computer, before you can say the algorithm isn't really being implemented?
Incidentally, I only linked to part 3 because it has links to part 1 and part 2. I should probably have made this clear.
Replies from: SilasBarta↑ comment by SilasBarta · 2009-05-29T22:26:03.213Z · LW(p) · GW(p)
Objection 4 (and the response) treats it as an issue of the absolute length (or complexity) of the interpreter. That's not the same as my point, which is that the interpreter must be continually expanded in order to keep mapping random data onto an algorithm's output. That's why I conclude you can distinguish them: some interpretations necessarily expand as time progresses, others don't.
Also, Almond's response frames it as a problem of how to say "Length L or greater is impermissible". He doesn't address the alternative of asking if the interpreter is longer than the algorithm it's finding, focusing instead on the absolute length.
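For what it's worth, here is a crude way to operationalize that alternative test, using compression as a stand-in for description length (my own sketch; zlib is obviously a weak proxy for Kolmogorov complexity, and the "trace" is an arbitrary stand-in):

```python
import os, zlib

def c(data: bytes) -> int:
    """Crude description-length proxy: zlib-compressed size in bytes."""
    return len(zlib.compress(data, 9))

def interpreter_cost(source: bytes, target: bytes) -> int:
    """Rough proxy for the extra description needed to get `target` out of `source`:
    C(source + target) - C(source), a standard compression-distance trick."""
    return max(c(source + target) - c(source), 0)

algorithm_trace = b"step: update cell, carry, halt-check\n" * 500   # stand-in for the algorithm's output
real_computer   = algorithm_trace                                   # a faithful implementation
wall            = os.urandom(len(algorithm_trace))                  # Searle's wall

for name, source in [("real computer", real_computer), ("random wall", wall)]:
    ratio = interpreter_cost(source, algorithm_trace) / c(algorithm_trace)
    # ~0: a short fixed interpreter suffices; ~1: the "interpreter" is about as
    # long as the algorithm it is supposedly finding.
    print(f"{name:13s}: interpreter length / algorithm length ~ {ratio:.2f}")
```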
Replies from: steven0461↑ comment by steven0461 · 2009-05-29T22:31:37.064Z · LW(p) · GW(p)
the alternative of asking if the interpreter is longer than the algorithm it's finding
still sounds to me like it involves an unreasonable discontinuous jump at a certain complexity level.
I haven't read these articles recently, by the way, so I'm not committing to defend their content. (I don't think of that as a sufficient reason not to have posted the links, but I may be wrong on that.)
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T02:21:27.032Z · LW(p) · GW(p)
I hope it is okay for me to reply to all these. Right, yes, that is my position, steven. When the interpreter algorithm's length hits the length of the algorithm it is finding, nothing of any import happens. Would we seriously say, for example, that a mind corresponding to a 10^21 bit computer program would be fine, and enjoying a conscious existence, if it was "findable" by a 10^21 bit program, but would suddenly cease to exist if it was findable only by a 10^21+1 bit program? I would say no. However, I can understand that this is how people often see it. For some reason, the point at which one algorithmic length exceeds the other is the point at which people think things are going too far.
Replies from: SilasBarta↑ comment by SilasBarta · 2009-05-30T04:27:28.357Z · LW(p) · GW(p)
Thanks for joining the discussion, PaulUK/Paul Almond. (I'll refer to you with the former.)
Would we seriously say, for example, that a mind corresponding to a 10^21 bit computer program would be fine, and enjoying a conscious existence, if it was "findable" by a 10^21 bit program, but would suddenly cease to exist if it was findable only by a 10^21+1 bit program? I would say no.
Well, then I'm going to apply Occam's razor back onto this. If you require a 10^21+1 bit program to extract a known 10^21 bit program, we should prefer the explanation:
a) "You wrote a program one bit too long."
rather than,
b) "You found a naturally occurring instance of a 10^21 bit algorithm that just happens to need a 10^21+1 bit algorithm in order to map it to the known 10^21 bit algorithm."
See the problem?
The whole point of explaining a phenomenon as implementing an algorithm is that, given the phenomenon, we don't need to do the whole algorithm separately. What if I sold you a "computer" with the proviso that "you have to manually check each answer it gives you"?
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T04:54:26.361Z · LW(p) · GW(p)
Either name is fine (since it is hardly a secret who I am here).
Yes, I see the problem, but this was very much in my mind when I wrote all this. I could hardly have missed the issue. I would have to accept it or deny it, and in fact I considered it a great deal. It is the first thing you would need to consider. I still maintain that there is nothing special about this algorithm length. I actually think your practical example of buying the computer, if anything, counts against it. Suppose you sold me a computer and it "allegedly" ran a program 10^21 bits long, but I had to use another computer running a program that was (10^21)+1 bits long to analyze what it was doing and get any useful output. Would I want my money back? Of course I would. However, I would also want my money back if I needed a (10^21)-1 bit program to analyze the computer – and so would you. As a consumer, the thing would be practically useless anyway. In one case I am having to do all of the computer's job, and a tiny bit more, just to get any output. In the other case I am having to do a tiny bit less than the computer's job to get any output: it would hardly make a practical difference. There is no sudden point at which I would want my money back: I would want it back long before we got near 10^21 bits. Can you show that 10^21 bits is special? I would say that to have it as special you pretty much have to postulate it, and I want to work with a minimum of postulates: it is my whole approach, though it causes some conclusions I hardly find comfortable.
You have mentioned Occam’s razor, but we may disagree on how it should be applied. What Occam originally said was probably too vague to help much in these matters, so we should go with what seems a reasonable “modernization” of Occam’s razor. I do not think Occam’s razor tells us to reduce the amount of stuff we accept. Rather, I think it tells us to reduce the amount of stuff we accept as intrinsically existing. I would not, for example, regard Occam's razor as arguing against the many-worlds interpretation of quantum mechanics, as many people would. I would say that Occam's razor argues against having some arbitrary wavefunction collapse mechanism if we need not assume one.
I would also say that this does not resolve the issue of combining computers and probability that I raised in the first article. My intention was to put a number of such issues together and show that we need to do the sort of thing I said to get around these difficult issues.
comment by Brian_Tomasik · 2015-06-14T02:26:34.997Z · LW(p) · GW(p)
Paul's site has been offline since 2013. Hopefully it will come back, but in the meanwhile, here are links to most of his pieces on Internet Archive.
comment by [deleted] · 2009-05-30T05:17:48.164Z · LW(p) · GW(p)
Hoping this is a good place to ask questions, Paul, what do you think of Friendly AI--things like how necessary, how possible, and how desirable it is?
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T05:37:59.757Z · LW(p) · GW(p)
I think it is possible and inevitable (though I am unsure of the timescale). I think it has some risks (risks which I think are understated by some people who make false analogies between simple systems and systems with minds - minds which may be designed to adapt to an environment, may be given simple goals, and may construct more sophisticated goals to satisfy them, goals which humans may not even have specified) and would need extreme caution, but I don't think these risks are avoided in any way if only dangerous people are left to do it. I would also say - please don't view me as authoritative in any way on this.
comment by PaulAlmond · 2010-08-16T22:07:30.972Z · LW(p) · GW(p)
(This is my new username. I was formerly PaulUK.) Just a quick note to say that, after leaving the "Minds, Measure, Substrate and Value" series for a while, I am currently writing Part 4, which will deal with some of the objections that have been made by, among others, Less Wrong members. Part 5, hopefully, will generalize into a cosmological view, as opposed to one that is just about minds.
comment by CarlShulman · 2009-05-29T18:28:13.491Z · LW(p) · GW(p)
Good find, who is Paul Almond other than the author of those articles?
Replies from: PaulUK, steven0461↑ comment by PaulUK · 2009-05-30T02:24:55.400Z · LW(p) · GW(p)
I am just someone who has an interest in these issues, Carl, and they are all written in a private capacity: I am not, for example, anyone who works at a university. I have worked as a programmer, and as a teacher of computing, in the past. I think machineslikeus describes me as an "independent researcher" or something like that... which means, I suppose, that I write articles.
Replies from: CarlShulman↑ comment by CarlShulman · 2009-05-30T03:22:54.597Z · LW(p) · GW(p)
Thanks for enriching the infosphere with some nice work, Paul.
↑ comment by steven0461 · 2009-05-29T18:38:56.835Z · LW(p) · GW(p)
Good find, who is Paul Almond other than the author of those articles?
comment by hrishimittal · 2009-05-29T18:21:03.550Z · LW(p) · GW(p)
Thanks. That looks like a really interesting body of work. This one on ethics is quite a fun read.
comment by timtyler · 2009-05-29T22:02:01.731Z · LW(p) · GW(p)
I read the anthropic explanation why space is 3D - one of the more-promising-sounding titles. I did not find it terribly convincing.
Replies from: PaulUK, timtyler↑ comment by PaulUK · 2009-05-30T02:04:06.173Z · LW(p) · GW(p)
I do not find the article on 3D space terribly convincing either - and I am the author of it - so I would have to be understanding if you don't. It is generally my policy, though, that my articles reflect how I think of things at the time I wrote them and I don't remove them if my views change - though I might occasionally add notes after. I do think that an anthropic explanation still works for this: I just don't think mine was a particularly good one.
Replies from: timtyler↑ comment by timtyler · 2009-05-30T08:01:16.439Z · LW(p) · GW(p)
It's a difficult topic. Life (e.g. self-replicating CAs) exists fine in 2, 3 and 4 dimensions, though there is still the issue of evolving intelligence. Some say that three is the only number of dimensions that permits you to tie knots, though the significance of knots is unclear. I am not convinced that 3 is terribly special - and I'm not sure we know enough about physics and biology to coherently address the issue yet.
Replies from: Nick_Tarleton↑ comment by Nick_Tarleton · 2009-05-30T08:17:55.440Z · LW(p) · GW(p)
With physical laws of character similar to ours (not a CA), though, there are further reasons to think life requires 3 space dimensions (and 1 of time).
Max Tegmark: On the dimensionality of spacetime [PDF]
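For readers who don't want to open the PDF, the orbit-stability piece of the argument goes roughly like this (the standard derivation Tegmark's paper surveys, constants suppressed):

```latex
% Central attraction in d spatial dimensions (Gauss's law): F(r) = -k / r^{d-1}.
% Effective radial potential for angular momentum L (d \neq 2):
\[
  V_{\mathrm{eff}}(r) \;=\; \frac{L^2}{2 m r^2} \;-\; \frac{k}{(d-2)\,r^{\,d-2}} .
\]
% A circular orbit at r_0 requires V_eff'(r_0) = 0, i.e. k / r_0^{d-1} = L^2 / (m r_0^3).
% Substituting into the second derivative:
\[
  V_{\mathrm{eff}}''(r_0) \;=\; \frac{3 L^2}{m r_0^4} \;-\; \frac{(d-1)\,k}{r_0^{\,d}}
                          \;=\; \frac{(4-d)\,L^2}{m r_0^4} ,
\]
% which is positive (a stable orbit) only when d < 4.
```

So stable orbits fail for four or more space dimensions; the paper's case against fewer than three rests on separate arguments.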
Replies from: timtyler, rwallace↑ comment by timtyler · 2009-05-30T15:38:55.413Z · LW(p) · GW(p)
I don't really see how you can build an anthropic argument out of that, though. The idea that if you make a radical mutation in one aspect of a life-supporting universe, then it no longer supports life is probably not particularly unusual. For example, if you make the game of life 3D using the same totalistic rule then it no longer supports life either. That is just a consequence of dead universes being more common than living ones, and doesn't have anything to do with there being something special about the dimensionality of our space-time.
↑ comment by rwallace · 2009-05-30T11:29:21.441Z · LW(p) · GW(p)
And cellular automata don't select for intelligence, so it is at least reasonable to conjecture that most observers evolve under physical laws of character similar to ours (and therefore, by the orbit stability argument, in three dimensions of space).
Replies from: PaulUK↑ comment by timtyler · 2009-05-29T22:20:48.639Z · LW(p) · GW(p)
I looked at "What is a Low Level Language" too.
That is a topic I am more interested in. The discussion has some merits, though it seems rather long and repetitive. At least he understands that there is an issue here. He seems to think it is proven that an LLL is the best thing for Occam's razor. I am less sure about that.
Replies from: None↑ comment by [deleted] · 2009-05-30T08:36:36.368Z · LW(p) · GW(p)
I looked at "What is a Low Level Language" too.
I looked at that one and had to skip to the end. What he's getting at doesn't match what I, or most programmers I know, mean by "low level."
Low level just means it's far from how humans think. It's low because we like to put things on top of it that are easier for us to deal with. The idea that it's "close to the machine" just comes from the fact that that's the most popular reason to deliberately make something low level. (Making it easy to analyze is probably the second most popular.)
Replies from: PaulUK↑ comment by PaulUK · 2009-05-30T13:39:09.315Z · LW(p) · GW(p)
But I wasn't trying to argue that "low level" does mean "close to the machine". That, however, is a way it is often expressed. I merely listed that as one idea of "low level". If I had not mentioned it in the article, someone would have simply said "A low level language is close to the machine" and thought that dealt with it, so I had to deal with it for completeness. I was not saying that "low level" as "close to the machine" was a formal, official idea - and I actually argued that it isn't. I was after a language which is, as far as possible, free from prejudice towards particular applications and I was arguing that it can be non-trivial to get one. You might dispute my use of the word "low level" for this, but I would say that this is largely a semantic issue and that there is still a need for knowing that we can get a language with these properties. What I was proposing was a way of taking two languages and testing them against each other, without any reference to any third language, any hardware or any physics, to determine which of them is most free of any prejudice towards particular uses - in a way, which of them is as close as possible to being free of any information content.
Replies from: None↑ comment by [deleted] · 2009-05-30T19:42:26.232Z · LW(p) · GW(p)
But I wasn't trying to argue that "low level" does mean "close to the machine".
I didn't think, and didn't mean to imply that I thought, you were. I mentioned it for the same reason you did: to help describe my meaning of "low level" by its connection to something related.
I was after a language which is, as far as possible, free from prejudice towards particular applications and I was arguing that it can be non-trivial to get one. You might dispute my use of the word "low level" for this, but I would say that this is largely a semantic issue and that there is still a need for knowing that we can get a language with these properties.
I don't think that's what you're really after. When you describe what you want, it sounds like a language that is prejudiced against describing things that are complicated in reality, so the complexity of the description matches the complexity of the reality.
It's not just a semantic problem that you're calling it "low level." "Low level" means it's far from how humans think, which tends to remove human prejudice. You call it "low level" because you think you can find it by removing prejudice. You actually need to switch from one prejudice to another to get what you want.
(Also, thanks for the reply. Sorry I didn't read the whole thing, but I got to the list of methods you had rejected, and it was just too much. It feels a lot longer to someone who thinks the basic idea behind all the methods is off base.)
comment by Will_Newsome · 2010-08-06T02:10:22.440Z · LW(p) · GW(p)
Occam's Razor Part 7: Hierarchy and Ontology
OM NOM NOM NOM NOM NOM NOM! Thank you Paul Almond for writing this, Justin for pointing me to this, and Steven for enabling Justin to point me to this. Great stuff, and very relevant to what I've been thinking about lately.
comment by Lightwave · 2009-05-29T22:36:03.268Z · LW(p) · GW(p)
Another fun read: Civilization-Level Quantum Suicide
Replies from: steven0461, alvarojabril, HalFinney
Quantum Suicide Reality Editing
If you accept the idea of quantum suicide then you should be open to the idea of using it for editing reality. You could construct some system that monitored events for you and would immediately cause you to cease to exist if events did not happen as you wanted them. The idea would be that you would continue to exist only in those future worlds in which events happened as desired, so that from your point of view, events would always happen as you wanted. You would be using quantum suicide to control your reality.
Quantum Suicide Computing
Suppose you had some computing problem which would take a long time to solve, but you have some way of checking possible answers. You could set up some system which uses quantum events to generate a random answer to the computation and then automatically causes you to cease to exist if the answer is not the correct answer, or if it is not better, in some sense, than the previous answer that you obtained. The idea would be that future worlds would exist in which all possible answers were generated, and you would only exist in those worlds where the answer was correct or better than previously generated answers, thereby giving you the perception of having enormous computing power.
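Viewed from the outside, this recipe is just post-selection (which is why, as noted below, Aaronson files it under anthropic computing). A toy classical rendering, with made-up numbers and an arbitrary stand-in checker:

```python
import random

# Every "branch" guesses an answer; branches that guess wrong are discarded
# ("don't survive"); any surviving branch remembers solving the problem in a
# single guess. From the outside it is rejection sampling.

def checker(candidate: int, target: str) -> bool:
    """The cheap verification step the scheme assumes we have."""
    return str(candidate) == target

target = "27182"                                   # stand-in for a hard-to-find answer
branches = (random.randrange(100_000) for _ in range(1_000_000))

survivors = [b for b in branches if checker(b, target)]
print(f"{len(survivors)} surviving branch(es) out of 1,000,000; "
      f"each remembers 'computing' {target} instantly")
```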
↑ comment by steven0461 · 2009-05-29T22:42:52.479Z · LW(p) · GW(p)
Scott Aaronson calls this anthropic computing (under the heading The Anthropic Principle).
↑ comment by alvarojabril · 2009-05-29T22:51:24.552Z · LW(p) · GW(p)
Could be a pretty wild dystopia for the people who aren't hooked up - elites constantly disappearing and the clocks are all wrong. Come to think of it, did I say DYStopia?
↑ comment by HalFinney · 2009-05-31T04:33:36.108Z · LW(p) · GW(p)
Another idea along these lines is mentioned in this blog post:
http://blogs.discovermagazine.com/discoblog/2008/08/11/will-the-lhc’s-future-cancel-out-its-past/
"Now two physicists claim in a new study that no matter how hard we try, we may never turn the LHC on at all. The study is authored by Holger Nielsen and Masao Ninomiya, who argue that the very particles the LHC produces will prevent the accelerator from ever being used. Harvard post-doc and CERN collaborator Kevin Black relates their argument to the grandfather paradox—that a particle like the Higgs boson goes back in time and prevents its own birth (i.e. the future changes the events of the present)."
It's not exactly quantum suicide, but a similar effect is claimed to actually reach into the past to cancel out any branch where lots of Higgs bosons are produced, as the LHC arguably will do. The prediction is that neither the LHC nor any similarly powerful collider will ever successfully operate at full power - and so far it's coming true!
(Original paper at http://arxiv.org/abs/0802.2991 )
comment by Vladimir_Nesov · 2009-05-29T21:39:12.844Z · LW(p) · GW(p)
What features make it a worthwhile read, compared to all the other clueless/useless philosophy (of mind)?
Replies from: steven0461↑ comment by steven0461 · 2009-05-29T22:27:33.859Z · LW(p) · GW(p)
Almond (like Yudkowsky and, so I hear, Drescher) looks at these topics through much more of an AI/CS lens than most who write about them. There's also stuff in there of interest from the futurist/transhumanist perspective that's common on LW. I'll admit that many of the essays are long, and I won't say you'll want to read them in their entirety, but I remember finding some of those that I read insightful and carefully argued.
comment by SilasBarta · 2009-05-29T18:42:29.299Z · LW(p) · GW(p)
Just the list of topics has me drooling ... hope it's worth the read.
comment by mathemajician · 2009-05-29T22:48:54.856Z · LW(p) · GW(p)
The title is a bit misleading. "Algorithmic complexity" is about the time and space resources required for computations (P != NP? etc...), whereas this web site seems to be more about "Algorithmic Information Theory", also known as "Kolmogorov Complexity Theory".
Replies from: steven0461↑ comment by steven0461 · 2009-05-29T22:52:03.486Z · LW(p) · GW(p)
Are you sure? I always thought that was "computational complexity". This seems to agree with my usage, as does this.
Replies from: mathemajician↑ comment by mathemajician · 2009-05-29T23:53:14.639Z · LW(p) · GW(p)
One group within the community calls it "Algorithmic Information Theory" or AIT, and another "Kolmogorov complexity". I talked to Hutter when he was writing that article for Scholarpedia that you cite. He decided to use the more neutral term "algorithmic complexity" so as not to take sides on this issue. Unfortunately, "algorithmic complexity" is more typically taken to mean "computational complexity theory". For example, if you search for it on Wikipedia you will get redirected. I know, it's all kind of ridiculous and confusing...
Replies from: steven0461↑ comment by steven0461 · 2009-05-30T00:12:37.939Z · LW(p) · GW(p)
Changed title.
To make things more confusing, this says algorithmic/Kolmogorov complexity is a subfield of AIT.
Replies from: mathemajician↑ comment by mathemajician · 2009-05-30T01:03:02.594Z · LW(p) · GW(p)
Yeah... I know :-\ There are various political forces at work within this community that I try to stay clear of.