Comment by Toby_Ord2 on Continuous Improvement · 2009-01-11T17:32:25.000Z · LW · GW

the value of this memory card, was worth more than the rest of the entire observable universe minus the card

I doubt this would be true. I think the value of the card would actually be close to zero (though I'm not completely sure). It does let one solve the halting problem for machines of up to 10,000 states, but it does so with time and space complexity O(busy_beaver(n)). In other words, using the entire observable universe as computing power and the card as an oracle, you might be able to solve the halting problem for 7-state machines or so. Not that good... The same goes for having the first 10,000 bits of Omega. What you really want are the bits of Tau, which directly encode whether the nth machine halts. Sure, you need exponentially more of them, but your computation is then much faster.
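The trade-off can be sketched with a toy model. The "machines" below are hypothetical stand-ins (real halting is undecidable, so a step bound is used as a cheat when building the table), but it shows why the bits of Tau are so convenient: once you have them, each halting query is a single lookup, with no busy-beaver-scale simulation at query time.

```python
# Toy model of "Tau": one bit per machine, directly encoding halting.
# Machine n searches for a k >= 1 with 2**k % n == 1 and halts when it
# finds one (so odd n >= 3 halt; even n and n = 1 search forever).
# Real halting is undecidable: the step bound below is a cheat that
# stands in for the oracle used to build the table in the first place.

def machine_halts(n, max_steps=1000):
    x = 1
    for _ in range(max_steps):
        x = (2 * x) % n
        if x == 1:
            return True
    return False

N = 64
tau = [machine_halts(n) for n in range(1, N + 1)]  # expensive to build...

def halts(n):
    return tau[n - 1]   # ...but each query is now an O(1) lookup

assert halts(27) and not halts(32)
```

With a prefix of Omega instead, answering even one query means dovetailing all programs until enough halting probability has accumulated, which is where the busy-beaver-scale running time comes in.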

Comment by Toby_Ord2 on Free to Optimize · 2009-01-04T22:09:55.000Z · LW · GW

OK. That makes more sense then. I'm not sure why you call it 'Fun Theory' though. It sounds like you intend it to be a theory of 'the good life', but a non-hedonistic one. Strangely it is one where people having 'fun' in the ordinary sense is not what matters, despite the name of the theory.

This is a moral theory about what should be fun

I don't think that can be right. You are not saying that there is a moral imperative for certain things to be fun, or to not be fun, as that doesn't really make sense (at least I can't make sense of it). You are instead saying that certain conditions are bad, even when the person is having fun (in the ordinary sense). Maybe you are saying that what is good for someone mostly maps to their fun, but with several key exceptions (which the theory then lists).

In any event, I agree with Z.M. Davis that you should capitalize your 'Fun' when you are using it in a technical sense, and explaining the sense in more detail or using a different word altogether might also help.

Comment by Toby_Ord2 on Free to Optimize · 2009-01-04T19:32:01.000Z · LW · GW


Are you saying that one's brain state can be identical in two different scenarios but that you are having a different amount of fun in each? If so, I'm not sure you are talking about what most people call fun (i.e. a property of your experiences). If not, then what quantity are you talking about in this post where you have less of it if certain counterfactuals are true?

Comment by Toby_Ord2 on Complexity and Intelligence · 2008-11-04T14:20:12.000Z · LW · GW

I would drop dead of shock

Eliezer, just as it was interesting to ask what probability estimate 'Nuts!' amounted to, I think it would be very useful for the forum of Overcoming Bias to ask what your implicit probability estimate is for a 500-state TM being able to solve the halting problem for all TMs of up to 50 states.

I imagine that 'I would drop dead of shock' was intended to convey a probability of less than 1 in 10,000, or maybe 1 in 1,000,000?

Comment by Toby_Ord2 on Economic Definition of Intelligence? · 2008-10-29T22:07:42.000Z · LW · GW

Sorry, I didn't see that you had answered most of this question in the other thread where I first asked it.

Toby, if you were too dumb to see the closed-form solution to problem 1, it might take an intense effort to tweak the bit on each occasion, or perhaps you might have trouble turning the global criterion of total success or failure into a local bit-fixer; now imagine that you are also a mind that finds it very easy to sing MP3s...

The reason you think one problem is simple is that you perceive a solution in closed form; you can imagine a short program, much shorter than 10 million bits, that solves it, and the work of inventing this program was done in your mind without apparent effort. So this problem is very trivial on the meta-level because the program that solves it optimally appears very quickly in the ordering of possible programs and is moreover prominent in that ordering relative to our instinctive transformations of the problem specification.

But if you were trying random solutions and the solution tester was a black box, then the alternating-bits problem would indeed be harder - so you can't be measuring the raw difficulty of optimization if you say that one is easier than the other.

This is why I say that the human notion of "impressiveness" is best constructed out of a more primitive notion of "optimization".

We also do, legitimately, find it more natural to talk about "optimized" performance on multiple problems than on a single problem - if we're talking about just a single problem, then it may not compress the message much to say "This is the goal" rather than just "This is the output."

I take it then that you agree that (1) is a problem of 9,999,999 bits and that the travelling salesman version is as well. Could you take these things and generate an example which doesn't just give 'optimization power', but 'intelligence', or maybe just 'intelligence-without-adjusting-for-resources-spent'? You say it is over a set of problem domains, but presumably not over all of them, given the no-free-lunch theorems. Can you give an example, or is this still vague?

Comment by Toby_Ord2 on Economic Definition of Intelligence? · 2008-10-29T22:01:56.000Z · LW · GW


I'm afraid that I'm not sure precisely what your measure is, and I think this is because you have given zero precise examples: even of its subcomponents. For example, here are two optimization problems:

1) You have to output 10 million bits. The goal is to output them so that no two consecutive bits are different.

2) You have to output 10 million bits. The goal is to output them so that when interpreted as an MP3 file, they would make a nice sounding song.

Now, the solution space for (1) consists of two possibilities (all 1s, all 0s) out of 2^10000000, for a total of 9,999,999 bits. The solution space for (2) is millions of times wider, leading to fewer bits. However, intuitively, (2) is a much harder problem, and things that optimize (2) are actually doing more of the work of intelligence; after all, (1) can be achieved in a few lines of code and very little time or space, while (2) takes much more of these resources.

(2) is a pretty complex problem, but can you give some specifics for (1)? Is it exactly 9,999,999 bits? If so, is this the 'optimization power'? Is this a function of the size of the solution space and the size of the problem space only? If there were another program attempting to produce a sequence of 100 million bits coding some complex solution to a large travelling salesman problem, such that only two bitstrings suffice, would this have the same amount of optimization power? Or is it a function of the solution space itself and not just its size?

Without even a single simple example, it is impossible to narrow down your answer enough to properly critique it. So far I see it as no more precise than Legg and Hutter's definition.

Comment by Toby_Ord2 on Measuring Optimization Power · 2008-10-28T17:06:09.000Z · LW · GW

I agree with David's points about the roughness of the search space being a crucial factor in a meaningful definition of optimization power.

Comment by Toby_Ord2 on Measuring Optimization Power · 2008-10-28T17:04:11.000Z · LW · GW

I'm not sure that I get this. Perhaps I understand the maths, but not the point of it. Here are two optimization problems:

1) You have to output 10 million bits. The goal is to output them so that no two consecutive bits are different.

2) You have to output 10 million bits. The goal is to output them so that when interpreted as an MP3 file, they would make a nice sounding song.

Now, the solution space for (1) consists of two possibilities (all 1s, all 0s) out of 2^10000000, for a total of 9,999,999 bits. The solution space for (2) is millions of times wider, leading to fewer bits. However, intuitively, (2) is a much harder problem, and things that optimize (2) are actually doing more of the work of intelligence; after all, (1) can be achieved in a few lines of code and very little time or space, while (2) takes much more of these resources.
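To make the numbers concrete, here is a scaled-down sketch (my own toy, not from the post) of both the bit measure for (1) and the gap between that measure and the cost of blind search:

```python
from itertools import product
from math import log2
import random

# Scaled-down check of the bit measure in (1): with n output bits, the
# strings where no two consecutive bits differ are just the all-0s and
# all-1s strings, so the measure is log2(2**n / 2) = n - 1 bits
# (9,999,999 bits for n = 10 million, as stated above).

def optimization_power(n_bits, is_solution):
    total = 2 ** n_bits
    hits = sum(1 for s in product([0, 1], repeat=n_bits) if is_solution(s))
    return log2(total / hits)

def all_bits_equal(s):
    return all(a == b for a, b in zip(s, s[1:]))

assert optimization_power(10, all_bits_equal) == 9.0

# The same count says nothing about search cost: blind random search
# needs about 2**(n-1) tries on average, even though a closed-form
# solver writes the answer down instantly.
random.seed(0)
n = 8

def tries_until_solved():
    tries = 0
    while True:
        tries += 1
        s = random.getrandbits(n)
        if s in (0, (1 << n) - 1):   # all 0s or all 1s
            return tries

mean = sum(tries_until_solved() for _ in range(2000)) / 2000
assert 100 < mean < 160              # close to 2**(n-1) = 128
```

So the bit measure is well defined for scaled-down versions, but it tracks the narrowness of the target, not the difficulty of hitting it.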

Comment by Toby_Ord2 on Ethics Notes · 2008-10-22T11:22:28.000Z · LW · GW

But if you say "Shut up and do what seems impossible!", then that, to me, sounds like dispelling part of the essential message - that what seems impossible doesn't look like it "seems impossible", it just looks impossible.

"Shut up and do what seems impossible!" is the literally correct message. The other one is the exaggerated form. Sometimes exaggeration is a good rhetorical device, but it does turn off some serious readers.

"Don't do it, even if it seems right" sounds merely clever by comparison

This was my point. This advice is useful and clever, though not profound. This literal presentation is both more clear in what it is saying and clear that it is not profound. I would have thought that the enterprise of creating statements that sound more profound than they are is not a very attractive one for rationalists. Memorable statements are certainly a good thing, but making them literally false and spuriously paradoxical does not seem worth it. This isn't playing fair. Any statement can be turned into a pseudo-profundity with these methods: witness many teachings of cults throughout the ages. I think these are the methods of what you have called 'Dark Side Epistemology'.

Comment by Toby_Ord2 on Prices or Bindings? · 2008-10-22T09:47:08.000Z · LW · GW


Crossman and Crowley make very good points above, delineating three possible types of justification for some of the things you say:

1) Don't turn him in because the negative effects of the undermining of the institution will outweigh the benefits

2) Don't turn him in because [some non-consequentialist reason on non-consequentialist grounds]

3) Don't turn him in because you will have rationally (on consequentialist grounds) tied yourself to the mast, making it impossible to turn him in, so as to achieve greater benefits.

(1) and (3) are classic pieces of consequentialism, the first dating back at least to Mill. If your reason is like those, then you are probably a consequentialist and there is no need to reinvent the wheel: I can provide some references for you. If you support (2), perhaps on some kind of Newcomb's problem grounds, then this deserves a clear explanation. Why, on account of a tricky paradoxical situation that may not even be possible, will you predictably start choosing to make things worse in situations that are not Newcomb situations? Unless you are explicit about your beliefs, we can't help debug them effectively, and you then can't hold them with confidence for they won't have undergone peer scrutiny. [The same still goes for your meta-ethical claims].

Comment by Toby_Ord2 on Ethical Injunctions · 2008-10-21T10:33:22.000Z · LW · GW

You should never, ever murder an innocent person who's helped you, even if it's the right thing to do

Shut up and do the impossible!

As written, both these statements are conceptually confused. I understand that you didn't actually mean either of them literally, but I would advise against trading on such deep-sounding conceptual confusions.

You should never, ever do X, even if you are exceedingly confident that it is the right thing to do

This sounds less profound, but will actually be true for some value of X, unlike the first sentence or its derivatives. It sounds as profound as it is, and no more. I believe this is the right standard.

Comment by Toby_Ord2 on Brief Break · 2008-09-01T18:32:08.000Z · LW · GW


It is certainly similar to those problems, but slightly different. For example, justifying Occam's Razor requires a bit more than we need here. In our case, we are just looking for a canonical complexity measure for finite strings. For Occam's Razor we also need to show that we have reason to prefer theories expressible by simpler strings to those specified by more complex strings. As an example, we already have such a canonical complexity measure for infinite strings. It is not perfect, as you might want some complexity measure defined with o-machines instead, or with finite state automata or whatever. These would give different complexity measures, but at least the Turing machine level one marks out a basic type of complexity, rather than an infinite set of complexity measures, as for finite strings.


Where are you going to put all this?

Is that a physical question? If so, it is just complexity-relative-to-physics or to be more precise: complexity-relative-to-physical-size. If you mean that it seems more complex: of course it does to us, but we have no canonical way of showing that, as opposed to measures based on its rival machines from the same computability class (which is just begging the question). As I said, there might be a canonical complexity measure for finite strings, but if so, this hasn't been proven yet. I don't know what the upshot of all this is, and in many practical cases I'm sure the stuff I'm saying can be safely put aside, but it is worth being aware of it, particularly when trying to get appropriately general and theoretical results (such as the AIXI stuff). If we talk about AIXI(M) where AIXI is a function of machine M, and call my pathological machine P, then AIXI(TM) looks pretty much the same as AIXI(LambdaCalculus) and AIXI(Java) and every other sensible language we use, but it looks completely different to AIXI(P) until we start looking at strings of order 3^^^3. Whether that is a problem depends upon the questions being asked.

Comment by Toby_Ord2 on Brief Break · 2008-09-01T14:05:04.000Z · LW · GW


Why not the standard approach of using Shannon's state x symbol complexity for Turing machines?

Why choose a Turing machine? They are clearly not a canonical mathematical entity, just a historical artifact. Their level of power is a canonical mathematical entity, but there are many Turing-equivalent models of computation. This just gets us simplicity relative to Turing machines where what we wanted was simplicity simpliciter (i.e. absolute simplicity). If someone came to you with a seemingly bizarre Turing-complete model, where the shortest program for successor was 3^^^3 bits and all the short programs were reserved for things that look crazy to us, how can you show him that Turing machines are the more appropriate model? Of course, it is obvious to us that Turing machines are in some sense a better judge of the notion of simplicity that we want, but can we show this mathematically? If we can, it hasn't yet been done. Turing machines might look simple to Turing machines (and human brains) but this new model might look simple to itself (e.g. has a 1 bit universal machine etc.).

It looks like we have to settle for a non-canonical concept of simplicity relative to human intuitions, or simplicity relative to physics, or the like. I think this is a deep point, which is particularly puzzling and not sufficiently acknowledged in Kolmogorov complexity circles. It feels like there must be some important mathematically canonical measure of complexity of finite strings, just like there is for infinite strings, but all we have are 'complexity-relative-to-X' and perhaps this is all there is.
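A crude illustration of this relativity, using off-the-shelf compressors as stand-ins for reference machines (they are not universal machines, so this is only an analogy):

```python
import bz2
import zlib

# Two different "reference machines" assign different description
# lengths to the same finite string; neither ordering is canonical,
# and for short strings the machine-dependent constant can dominate.

s = b"ab" * 5000   # a highly regular 10,000-byte string

len_zlib = len(zlib.compress(s, 9))
len_bz2 = len(bz2.compress(s, 9))

print(len_zlib, len_bz2)   # both tiny, but they disagree
assert len_zlib != len_bz2
assert len_zlib < len(s) and len_bz2 < len(s)
```

The invariance theorem bounds such disagreements by an additive constant, but nothing singles out which machine's constant is the 'right' one, which is the point above.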

Comment by Toby_Ord2 on Brief Break · 2008-09-01T10:33:34.000Z · LW · GW


That's why a tiny reference machine is used.

I think that Tim is pointing out that there is no available mathematical measure for the 'tinyness' of this machine which is not circular. You seem to be saying that the machine looks simple to most people and that all other machines which people class as simple could be simulated on this machine within a few hundred bits. This has two problems. Firstly, it is not provable that all other machines which we class as similarly simple will be simulated within a few hundred bits as it is an empirical question which other machines people find simple. I'll grant that we can be reasonably confident though. The second problem is that we haven't defined any canonical mathematical measure of simplicity, just a measure of 'simplicity relative to the empirical facts about what humans find simple'. Perhaps we could use physics instead of humans and look at physically small Turing machines and then have 'simplicity relative to the empirical facts about what can be done in small volumes'. These are no doubt interesting, but are still concepts of relative simplicity, not a canonical absolute simplicity/complexity. No such measure has been discovered, and perhaps there can be no such measure. We can contrast this with complexity of infinite strings, where there is a convergence between all base machines and thus an absolute measure of simplicity. The problem is that we are now looking for something to deal with finite strings, not infinite ones.

Comment by Toby_Ord2 on Abstracted Idealized Dynamics · 2008-08-13T13:17:41.000Z · LW · GW

Great! Now I can see several points where I disagree or would like more information.

1) Is X really asserting that Y shares his ultimate moral framework (i.e. that they would converge given time and arguments etc)?

If Y is a psychopath murderer who will simply never accept that he shouldn't kill, can I still judge that Y should refrain from killing? On the current form, to do so would involve asserting that we share a framework, but even people who know this to be false can judge that he shouldn't kill, can't they?

2) I don't know what it means to be the solution to a problem. You say:

'I should Z' means that Z answers the question, "What will save my people? How can we all have more fun? How can we get more control over our own lives? What's the funniest jokes we can tell? ..."

Suppose Z is the act of saying "no". How does this answer the question (or 'solve the problem')? Suppose it leads you to have a bit less fun and others to have a bit more fun and generally has positive effects on some parts of the question and negative on others. How are these integrated? As you phrased it, it is clearly not a unified question and I don't know what makes one act rather than another an answer to a list of questions (when presumably it doesn't satisfy each one in the list). Is there some complex and not consciously known weighting of the terms? I thought you denied that earlier in the series. This part seems very non-algorithmic at the moment.

3) The interpretation says 'implicitly defined by the machinery ... which they both use to make desirability judgments'?

What if there is not such machinery that they both use? I thought only X's machinery counted here as X is the judger.

4) You will have to say more about 'implicitly defined by the machinery ... use[d] to make desirability judgments'. This is really vague. I know you have said more on this, but never in very precise terms, just by analogy.

5) Is the problem W meant to be the endpoint of thought (i.e. the problem that would be arrived at), or is it meant to be the current idea which involves requests for self modification (e.g. 'Save a lot of lives, promote happiness, and factor in whatever things I have not thought of but could be convinced of.') It is not clear from the current statement (or indeed your previous posts), but would be made clear by a solution to (4).

Comment by Toby_Ord2 on Abstracted Idealized Dynamics · 2008-08-12T20:54:26.000Z · LW · GW


I didn't mean that most philosophy papers I read have lots of mathematical symbols (they typically don't), and I agree with you that over-formalization can occur sometimes (though it is probably less common in philosophy than under-formalization). What I meant is the practice of clear and concise statements of the main points and attendant qualifications in the kind of structured English that good philosophers use. For example, I gave the following as a guess at what you might be meaning:

When X judges that Y should Z, X is judging that were she fully informed, she would want Y to Z

This allows X to be incorrect in her judgments (if she wouldn't want Y to Z when given full information). It allows for others to try to persuade X that her judgment is incorrect (it preserves a role for moral argument). It reduces 'should' to mere want (which is arguably simpler). It is, however, a conception of should that is judger-dependent: it could be the case that X correctly judges that Y should Z, while W correctly judges that Y should not Z.

The first line was a fairly clear and concise statement of a meta-ethical position (which you said you don't share, and nor do I for that matter). The next few sentences describe some of its nice features as well as a downside. There is very little technical language -- just 'judge', 'fully informed' and 'want'. In the previous comment I gave a sentence or two saying what was meant by 'fully informed' and if challenged I could have described the other terms. Given that you think it is incorrect, could you perhaps fix it, providing a similar short piece of text that describes your view with a couple of terms that can bear the brunt of further questioning and elaboration.

Comment by Toby_Ord2 on Abstracted Idealized Dynamics · 2008-08-12T10:37:05.000Z · LW · GW


I agree with most of the distinctions and analogies that you have been pointing out, but I still doubt that I agree with your overall position. No-one here can know whether they agree with your position because it is very much underdetermined by your posts. I can have a go at formulating what I see as the strongest objections to your position if you clearly enunciate it in one place. Oddly enough, the philosophy articles that I read tend to be much more technically precise than your posts. I don't mean that you couldn't write more technically precise posts on metaethics, just that I would like you to.

In the same way as scientific theories need to be clear enough to allow concrete prediction and potential falsification, so philosophical theories need to be clear enough that others can use them without any knowledge of their author to make new claims about their subject matter. Many people here may feel that you have made many telling points (which you have), but I doubt that they understand your theory in the sense that they could apply it in a wide range of situations where it is applicable. I would love a short post consisting of at most a paragraph of introduction, then a bi-conditional linking a person's judgement about what another person should do in a given situation to some naturalistic facts, and then a paragraph or two helping resolve any ambiguities. Then others can actually argue against it, and an absence of good counterarguments could start to provide some evidence in its favour (though of course, surviving the criticisms of a few grad-student philosophers would still not be all that much evidence).

Comment by Toby_Ord2 on Morality as Fixed Computation · 2008-08-08T12:04:04.000Z · LW · GW


Sorry for not being more precise. I was actually asking what a given person's Q_P is, put in terms that we have already defined. You give a partial example of such a question, but it is not enough for me to tell what metaethical theory you are expressing. For example, suppose Mary currently values her own pleasure and nothing else, but that were she exposed to certain arguments she would come to value everyone's pleasure (in particular, the sum of everyone's pleasure) and that no other arguments would ever lead her to value anything else. This is obviously unrealistic, but I'm trying to determine what you mean via a simple example. Would Q_Mary be 'What maximizes Mary's pleasure?' or 'What maximizes the sum of pleasure?' or would it be something else? On my attempted summary, Q_Mary would be the second of these questions, as that is what she would want if she knew all relevant arguments. Also, does it matter whether we suppose that Mary is open to changing her original values or strongly opposed to doing so?

(Items marked in bold have to be morally evaluated.)

I don't think so. For example, when I said 'incorrect' I meant 'made a judgement which was false'. When I said 'best' arguments, I didn't mean the morally superior arguments, just the ones that are most convincing (just as the 'best available scientific theory' is not a moral claim). Feel free to replace that with something like 'if she had access to all relevant arguments', or 'if there exists an argument which would convince her' or the like. There are many ways this could be made precise, but it is not my task to do so: I want you to do so, so that I can better see and reply to your position.

Regarding the comment about assessing future Q_Ps from the standpoint of old ones, I still don't see a precise answer here. For example, if Q_P,T1 approves of Q_P,T2 which approves of Q_P,T3 but Q_P,T1 doesn't approve of Q_P,T3, then what are we to say? Did two good changes make a bad change?

Comment by Toby_Ord2 on Morality as Fixed Computation · 2008-08-08T10:25:11.000Z · LW · GW

Thanks for responding to my summary attempt. I agree with Robin that it is important to be able to clearly and succinctly express your main position, as only then can it be subject to proper criticism to see how well it holds up. In one way, I'm glad that you didn't like my attempted summary as I think the position therein is false, but it does mean that we should keep looking for a neat summary. You currently have:

'I should X' means that X answers the question, "What will save my people? How can we all have more fun? How can we get more control over our own lives? What's the funniest jokes we can tell? ..."

But I'm not clear where the particular question is supposed to come from. I understand that you are trying to make it a fixed question in order to avoid deliberate preference change or self-fulfilling questions. So let's say that for each person P, there is a specific question Q_P such that:

For a person P, 'I should X', means that X answers the question Q_P.

Now how is Q_P generated? Is it what P would want were she given access to all the best empirical and moral arguments (what I called being fully informed)? If so, do we have to time-index the judgment as well? i.e. if P's preferences change at some later time T1, then did the person mean something different by 'I should X' before and after T1, or was the person just incorrect at one of those times? What if the change is just through acquiring better information (empirical or moral)?

Comment by Toby_Ord2 on The Meaning of Right · 2008-07-30T23:53:00.000Z · LW · GW

To cover cases where people are making judgments about what others should do, I could also extend this summary in a slightly more cumbersome way:

When X judges that Y should Z, X is judging that were she fully informed, she would want Y to Z

This allows X to be incorrect in her judgments (if she wouldn't want Y to Z when given full information). It allows for others to try to persuade X that her judgment is incorrect (it preserves a role for moral argument). It reduces 'should' to mere want (which is arguably simpler). It is, however, a conception of should that is judger-dependent: it could be the case that X correctly judges that Y should Z, while W correctly judges that Y should not Z.

Comment by Toby_Ord2 on The Meaning of Right · 2008-07-30T23:18:00.000Z · LW · GW


I've just reread your article and was wondering if this is a good quick summary of your position (leaving apart how you got to it):

'I should X' means that I would attempt to X were I fully informed.

Here 'fully informed' is supposed to include complete relevant empirical information and also access to all the best relevant philosophical arguments.

Comment by Toby_Ord2 on Interpersonal Morality · 2008-07-29T22:50:22.000Z · LW · GW

If there's a standard alternative term in moral philosophy then do please let me know.

As far as I know, there is not. In moral philosophy, when deontologists talk about morality, they are typically talking about things that are for the benefit of others. Indeed, they even have conversations about how to balance between self-interest and the demands of morality. In contrast, consequentialists have a theory that already accounts for the benefit of the agent who is doing the decision making: it counts just as much as anyone else. Thus for consequentialists, there is typically no separate conflict between self-interest and morality: morality for them already takes this into account. So in summary, many moral philosophers are aware of the distinction, but I don't know of any pre-existing terms for it.

By the way, don't worry too much about explaining all pre-requisites before making a post. Explaining some of them afterwards in response to comments can be a more engaging way to do it. In particular, it means that we readers can see which parts we are skeptical of and then just focus our attention on posts which defend that aspect, skimming the ones that we already agree with. Even when it comes to the book, it will probably be worth giving a sketch of where you want to end up early on, with forward references to the appropriate later chapters as needed. This will let the readers read the pre-requisite chapters in a more focused way.

Comment by Toby_Ord2 on The Meaning of Right · 2008-07-29T13:51:49.000Z · LW · GW

wrongness flows backward from the shooting, as rightness flows backward from the button, and the wrongness outweighs the rightness.

I suppose you could say this, but if I understand you correctly, then it goes against common usage. Usually those who study ethics would say that rightness is not the type of thing that can add with wrongness to get net wrongness (or net rightness for that matter). That is, if they were talking about that kind of thing, they wouldn't use the word 'rightness'. The same goes for 'should' or 'ought'. Terms used for this kind of stuff that can add together: [goodness / badness], [pro tanto reason for / pro tanto reason against].

If you merely meant that any wrong act on the chain trumps any right act further in the future, then I suppose these words would be (almost) normal usage, but in this case it doesn't deal with ethical examples very well. For instance, in the consequentialist case above, we need to know the degree of goodness and badness in the two events to know whether the child-saving event outweighs the person-shooting event. Wrongness trumping rightness is not a useful explanation of what is going on if a consequentialist agent was considering whether to shoot the person. If you want the kind of additivity of value that is relevant in such a case, then call it goodness, not rightness/shouldness. And if this is the type of thing you are talking about, then why not just look at each path and sum the goodness in it, choosing the path with the highest sum? Why say that we sum the goodness in a path in reverse chronological order? How does this help?

Regarding the terms 'ethics' and 'morality', philosophers use them to mean the same thing. Thus, 'metamorality' would mean the same thing as 'metaethics', it is just that no-one else uses the former term (Overcoming Bias is the top page on Google for that term). There is nothing stopping you from using 'ethics' and 'morality' to mean different things, but since this is not standard usage, it would lead to a lot of confusion when trying to explain your views.

Comment by Toby_Ord2 on The Meaning of Right · 2008-07-29T12:08:02.000Z · LW · GW

There are some good thoughts here, but I don't think the story is a correct and complete account of metamorality (or as the rest of the world calls it: metaethics). I imagine that there will be more posts on Eliezer's theory later and more opportunities to voice concerns, but for now I just want to take issue with the account of 'shouldness' flowing back through the causal links.

'Shouldness' doesn't always flow backwards in the way Eliezer mentioned. E.g. suppose that in order to push the button, you need to shoot someone who will fall down on it. This would make the whole thing impermissible. If we started by judging saving the child as something we should do, then the backwards chain prematurely terminates when we find that the only way to achieve this involves killing someone. Obviously, we would really want to consider not just the end state of the chain when working out whether we should save the child, but to evaluate the whole sequence in the first place. For if the end state is only possible given something that is impermissible, then it wasn't something we should bring about in the first place. Indeed, I think this flowing back from 'should' is a rather useless description. It is true that if we should (all things considered) do X, then we should do all the things necessary for X, but we can only know whether we should do X (all things considered) if we have already evaluated the other actions in the chain. It is a much more fruitful account to look forward, searching the available paths and then selecting the best one. This is how it is described by many philosophers, including a particularly precise treatment by Fred Feldman in his paper World Utilitarianism and his book Doing the Best We Can.

(Note also that this does not assume consequentialism is true: deontologists can define the goodness of paths in a way that involves things other than the goodness of the consequences of the path.)

Comment by Toby_Ord2 on The Genetic Fallacy · 2008-07-11T09:52:09.000Z · LW · GW

One thing to be aware of when considering logical fallacies is that there are two ways in which people consider something to be a fallacy. On the strict account, it is a form of argumentation that doesn't rule out all cases in which the conclusion is false. Appeals to authority and considerations of the history of a claim are obviously fallacious in this sense. The loose account is a form of argumentation that is deeply flawed. It is in this sense that appeal to authority and considerations of the history of a claim may not be fallacious, for they sometimes give us some useful reasons to believe or disbelieve in the claim. Certain considerations don't give deductive (logical) validity, but do give Bayesian support.

Comment by Toby_Ord2 on The Outside View's Domain · 2008-06-21T10:52:48.000Z · LW · GW

Well said.

Comment by Toby_Ord2 on Living in Many Worlds · 2008-06-05T15:11:06.000Z · LW · GW

It all adds up to normality, in all the worlds.

Eliezer, you say this, and similar things, a number of times here. They are, of course, untrue. There are uncountably many instances where, for example, all coins in history flip tails every time. You mean that it almost always adds up to normality, and this is true. For very high abnormality, the measure of worlds where it happens is equal to the associated small probability.

Regarding average utilitarianism, I also think this is a highly suspect conclusion from this evidence (and this is coming from a utilitarian philosopher). We can talk about this when you are in Oxford if you want: perhaps you have additional reasons that you haven't given here.

Comment by Toby_Ord2 on Identity Isn't In Specific Atoms · 2008-04-19T21:04:44.000Z · LW · GW

Suppose I take two atoms of helium-4 in a balloon, and swap their locations via teleportation.

For a book version, you will definitely want to be more precise here. I assumed they were in different quantum states (this seems a very reasonable assumption failing a specification to the contrary). Perhaps they had different spins, energies, momenta, etc. This means that the swapping did make sense.

Comment by Toby_Ord2 on The Quantum Arena · 2008-04-15T23:28:13.000Z · LW · GW


Very minor quibble/question. I assume you mean 2^Aleph_0 rather than Aleph_1. Unless one is doing something with the cardinals/ordinals themselves, it is almost always the numbers Aleph_0, 2^Aleph_0, 2^2^Aleph_0... that come up rather than Aleph_n. You may therefore like the convenient Beth numbers instead, where:

Beth_0 = Aleph_0
Beth_(n+1) = 2^Beth_n
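Written out in standard notation (the limit-ordinal clause is my addition for completeness, not part of the original comment):

```latex
\beth_0 = \aleph_0, \qquad
\beth_{\alpha+1} = 2^{\beth_\alpha}, \qquad
\beth_\lambda = \sup_{\alpha < \lambda} \beth_\alpha \quad \text{for limit ordinals } \lambda.
```

The cardinality of the continuum is then \beth_1 = 2^{\aleph_0}, and identifying it with \aleph_1 is exactly the continuum hypothesis.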

Comment by Toby_Ord2 on Newcomb's Problem and Regret of Rationality · 2008-02-01T12:36:26.000Z · LW · GW

I think Anonymous, Unknown and Eliezer have been very helpful so far. Following on from them, here is my take:

There are many ways Omega could be doing the prediction/placement and it may well matter exactly how the problem is set up. For example, you might be deterministic and he is precalculating your choice (much like we might be able to do with an insect or computer program), or he might be using a quantum suicide method, (quantum) randomizing whether the million goes in and then destroying the world iff you pick the wrong option (This will lead to us observing him being correct 100/100 times assuming a many worlds interpretation of QM). Or he could have just got lucky with the last 100 people he tried it on.

If it is the deterministic option, then what do the counterfactuals about choosing the other box even mean? My approach is to say that 'You could choose X' means that if you had desired to choose X, then you would have. This is a standard way of understanding 'could' in a deterministic universe. Then the answer depends on how we suppose the world to be different to give you counterfactual desires. If we do it with a miracle near the moment of choice (history is the same, but then your desires change non-physically), then you ought to two-box as Omega can't have predicted this. If we do it with an earlier miracle, or with a change to the initial conditions of the universe (the Tannsjo interpretation of counterfactuals), then you ought to one-box as Omega would have predicted your choice. Thus, if we are understanding Omega as extrapolating your deterministic thinking, then the answer will depend on how we understand the counterfactuals. One-boxers and Two-boxers would be people who interpret the natural counterfactual in the example in different (and equally valid) ways.

If we understand it as Omega using a quantum suicide method, then the objectively right choice depends on his initial probabilities of putting the million in the box. If he does it with a 50% chance, then take just one box. There is a 50% chance the world will end either choice, but this way, in the case where it doesn't, you will have a million rather than a thousand. If, however, he uses a 99% chance of putting nothing in the box, then one-boxing has a 99% chance of destroying the world which dominates the value of the extra money, so instead two-box, take the thousand and live.
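A quick sketch of that expected-value reasoning (my own illustration, not part of the original comment; the payoff figures and the rule that you survive iff your choice matches the box's contents are taken from the setup above):

```python
def outcomes(p_million, one_box):
    """Return (survival probability, payoff if you survive) under the
    quantum-suicide reading: Omega fills the box with probability
    p_million, and the world is destroyed iff your choice mismatches
    the box's contents."""
    if one_box:
        # You survive only in the branches where the million was placed.
        return p_million, 1_000_000
    else:
        # You survive only in the branches where the box was left empty.
        return 1 - p_million, 1_000

for p in (0.5, 0.01):
    for choice in (True, False):
        surv, pay = outcomes(p, choice)
        label = "one-box" if choice else "two-box"
        print(f"p(million)={p:.2f} {label}: survive {surv:.0%}, payoff ${pay:,}")
```

With p = 0.5 both choices survive equally often, so one-boxing wins on payoff conditional on survival; with p = 0.01 one-boxing destroys the world in 99% of branches, matching the conclusion above.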

If he just got lucky a hundred times, then you are best off two-boxing.

If he time travels, then it depends on the nature of time-travel...

Thus the answer depends on key details not told to us at the outset. Some people accuse all philosophical examples (like the trolley problems) of not giving enough information, but in those cases it is fairly obvious how we are expected to fill in the details. This is not true here. I don't think the Newcomb problem has a single correct answer. The value of it is to show us the different possibilities that could lead to the situation as specified and to see how they give different answers, hopefully illuminating the topic of free-will, counterfactuals and prediction.

Comment by Toby_Ord2 on But There's Still A Chance, Right? · 2008-01-07T01:26:55.000Z · LW · GW

Unknown, I agree entirely with your comments about the distinction between the idealised calculable probabilities and the actual error prone human calculations of them.

Nominull, I think you are right that the problem feels somewhat paradoxical. Many things do when considering actual human rationality (a species of 'bounded rationality' rather than ideal rationality). However, there is no logical problem with what you are saying. For most real world claims, we cannot have justifiable degrees of beliefs greater than one minus a billionth. Moreover, I don't have a justifiable degree of belief greater than one minus a billionth in my last statement being true (I'm pretty sure, but I could have made a mistake...). This lack of complete certainty about our lack of complete certainty is just one of the disadvantages of having resource bounds (time, memory, accuracy) on our reasoning. On a practical note, while we cannot completely correct ourselves, merely proposing a safe upper bound to confidence in typical situations, memorizing it as a simple number, and then using it in practice is fairly safe, and likely to improve our confidence estimates.


Comment by Toby_Ord2 on But There's Still A Chance, Right? · 2008-01-06T17:11:08.000Z · LW · GW

Carl, that is a good point. I'm not quite sure what to say about such cases. One thing that springs to mind, though, is that in realistic examples you couldn't have investigated each of those options to see if it was a real option, and even if you could, you couldn't be sure of all of that at once. You must know it through some more general principle whereby there is, say, an option per natural number up to a trillion. However, how certain can you be of that principle? That it isn't really up to only a million?

Hmmmm... Maybe I have an example that I can assert with confidence greater than one minus a billionth:

'The universe does not contain precisely 123,456,678,901,234,567,890 particles.'

I can't think of a sensible, important claim like Eliezer's original one though, and I stand by my advice to be very careful about claiming less than a billionth probability of error, even for a claim about the colour of a piece of paper held in front of you.


Comment by Toby_Ord2 on But There's Still A Chance, Right? · 2008-01-06T16:24:36.000Z · LW · GW

"The odds of that are something like two to the power of seven hundred and fifty million to one."

As Eliezer admitted, it is a very bad idea to ascribe probabilities like this to real world propositions. I think that the strongest reason is that it is just too easy for the presuppositions to be false or for your thinking to have been mistaken. For example, if I gave a five line logical proof of something, that would supposedly mean that there is no chance that its conclusion is false given the premisses, but actually the chance that I would make a logical error (even a transcription error somewhere) is at least one in a billion (~ 1 in 2^30). There is at least this much chance that either Eliezer's reasoning or the basic scientific assumptions were seriously flawed in some way. Given the chance of error in even the simplest logical arguments (let alone the larger chance that the presuppositions about genes etc are false), we really shouldn't ascribe probabilities smaller than 1 in a billion to factual claims at all. Better to say that the probability of this happening by chance given the scientific presuppositions is vanishingly small. Or that the probability of it happening by chance pretty much equals the probability of the presuppositions being false.
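The point can be put numerically. In this sketch (my own illustration; the one-in-a-billion per-step flaw rate is the figure assumed above), any nonzero chance that the reasoning itself is flawed puts a floor under the error probability one can honestly report:

```python
def claimable_error_probability(p_model_error, p_step_flaw, n_steps):
    """Lower bound on the error probability one can honestly report.

    p_model_error: error chance the argument itself assigns
    p_step_flaw:   assumed chance that any single step of reasoning is
                   flawed (invalid inference, transcription slip, ...)
    n_steps:       number of steps in the argument

    If any step is flawed, the conclusion is unreliable, so the honest
    error estimate can never fall below the chance of such a flaw.
    """
    p_some_flaw = 1 - (1 - p_step_flaw) ** n_steps
    return p_some_flaw + (1 - p_some_flaw) * p_model_error

# A five-line proof whose conclusion 'cannot' be false, with a
# one-in-a-billion flaw chance per line:
print(claimable_error_probability(0.0, 1e-9, 5))  # ~5e-9, not zero
```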


Comment by Toby_Ord2 on Fake Utility Functions · 2007-12-07T00:26:31.000Z · LW · GW

There are certainly a lot of people who have been working on this problem for a long time. Indeed, since before computers were invented. Obviously I'm talking about moral philosophers. There is a lot of bad moral philosophy, but there is also a fair amount of very good moral philosophy tucked away in there -- more than one lifetime worth of brilliant insights. It is tucked away well enough that I doubt Eliezer has encountered more than a little of it. I could certainly understand people thinking it is all rubbish by taking a reasonably large sample and coming away only with poorly thought out ethics (which happens all too often), but there really is some good stuff in there.

My advice would be to read Reasons and Persons (by Derek Parfit) and The Methods of Ethics (by Henry Sidgwick). They are good starting places and someone like Eliezer would probably enjoy reading them too.

The post implies that utilitarianism is obviously false, but I don't think this is so. Even if it were false, do you really think it would be so obviously false? Utilitarians have unsurprisingly been aware of these issues for a very long time and have answers to them. Happiness being the sole good (for humans at least) is in no way invalidated by the complexity of relationship bonds. It is also not invalidated by the fact that people sometimes prefer outcomes which make them less happy (indeed there is one flavour of utilitarianism for happiness and one for preferences and they each have adherents).

It is certainly difficult to work out the exhaustive list of what has intrinsic value (I agree with that!), and I would have strong reservations about putting 'happiness' into the AI given my current uncertainty and the consequences of being mistaken, but it is far from being obviously false. In particular, it has the best claim I know of to fitting your description of the property that is necessary in everything that is good ('what use X without leading to any happiness?').

Comment by Toby_Ord2 on Not for the Sake of Happiness (Alone) · 2007-11-23T17:42:53.000Z · LW · GW

g, you have suggested a few of my reasons. I have thought quite a lot about this and could write many pages, but I will just give an outline here.

(1) Almost everything we want (for ourselves) increases our happiness. Many of these things evidently have no intrinsic value themselves (such as Eliezer's Ice-cream case). We often think we want them intrinsically, but on closer inspection, if we really ask whether we would want them if they didn't make us happy we find the answer is 'no'. Some people think that certain things resist this argument by having some intrinsic value even without contributing to happiness. I am not convinced by any of these examples and have an alternative explanation as to my opponents' views: they are having difficulty really imagining the case without any happiness accruing.

(2) I think that our lives cannot go better based on things that don't affect our mental states (such as based on what someone else does behind closed doors). If you accept this, that our lives are a function of our mental states, then happiness (broadly construed) seems the best explanation of what it is about our mental states that makes a possible life more valuable than another.

(3) I have some sympathy with preference accounts, but they are liable to count too many preferences, leading to double counting (my wife and I each prefer the other's life to go better even if we never find out, so do we count twice as much as single people?) and preferences based on false beliefs (wanting to drive a Ferrari because of a false belief that they are safer). Once we start ruling out the inappropriate preference types and saying that only the remaining ones count, it seems to me that this just leads back to hedonism.

Note that I'm saying that I think happiness is the only factor in determining whether a life goes well in a particular sense; this needn't be the same as the most interesting life or the most ethical life. Indeed, I think the most ethical life is the one that leads to the greatest sum of happiness across all lives (utilitarianism). I'm not completely convinced of any of this, but am far more convinced than I am by any rival theories.

Comment by Toby_Ord2 on Not for the Sake of Happiness (Alone) · 2007-11-23T11:33:46.000Z · LW · GW

Wei, yes my comment was less clear than I was hoping. I was talking about the distinction between 'psychological hedonism' and 'hedonism' and I also mentioned the many person versions of these theories ('psychological utilitarianism' and 'utilitarianism'). Let's forget about the many person versions for the moment and just look at the simple theories.

Hedonism is the theory that the only thing good for each individual is his or her happiness. If you have two worlds, A and B and the happiness for Mary is higher in world A, then world A is better for Mary. This is a theory of what makes someone's life go well, or to put it another way, about what is of objective value in a person's life. It is often used as a component of an ethical theory such as utilitarianism.

Psychological hedonism is the theory that people ultimately aim to increase their happiness. Thus, if they can do one of two acts, X and Y and realise that X will increase their happiness more than Y, they will do X. This is not a theory of what makes someone's life go well, or a theory of ethics. It is merely a theory of psychological motivation. In other words, it is a scientific hypothesis which says that people are wired up so that they are ultimately pursuing their own happiness.

There is some connection between these theories, but it is quite possible to hold one and not the other. For example, I think that hedonism is true but psychological hedonism is false. I even think this can be a good thing since people get more happiness when not directly aiming at it. Helping your lover because you love them leads to more happiness than helping them in order to get more happiness. It is also quite possible to accept psychological hedonism and not hedonism. You might think that people are motivated to increase their happiness, but that they shouldn't be. For example, it might be best for them to live a profound life, not a happy one.

Each theory says that happiness is the ultimate thing of value in a certain sense, but these are different senses. The first is about what I would call actual value: it is about the type of value that is involved in a 'should' claim. It is normative. The second is about what people are actually motivated to do. It is involved in 'would' claims.

Eliezer has shown that he does care about some of the things that make him happy over and above the happiness they bring, however he asked:

'The question, rather, is whether we should care about the things that make us happy, apart from any happiness they bring.'

Whether he would do something and whether he should are different things, and I'm not satisfied that he has answered the latter.

Comment by Toby_Ord2 on Not for the Sake of Happiness (Alone) · 2007-11-22T11:41:48.000Z · LW · GW


There is potentially some confusion on the term 'value' here. Happiness is not my ultimate (personal) end. I aim at other things which in turn bring me happiness and as many have said, this brings me more happiness than if I aimed at it. In this sense, it is not the sole object of (personal) value to me. However, I believe that the only thing that is good for a person (including me) is their happiness (broadly construed). In that sense, it is the only thing of (personal) value to me. These are two different senses of value.

Psychological hedonists are talking about the former sense of value: that we aim at personal happiness. You also mentioned that others ('psychological utilitarians', to coin a term) might claim that we only aim at the sum of happiness. I think both of these are false, and in fact probably no-one solely aims at these things. However, I think that the most plausible ethical theories are variants of utilitarianism (and fairly sophisticated ones at that), which imply that the only thing that makes an individual's life go well is that individual's happiness (broadly construed).

You could quite coherently think that you would fight to avoid the pill and also that if it were slipped in your drink that your life would (personally) go better. Of course the major reason not to take it is that your real scientific breakthroughs benefit others too, but I gather that we are supposed to be bracketing this (obvious) possibility for the purposes of this discussion, and questioning whether you would/should take it in the absence of any external benefits. I'm claiming that you can quite coherently think that you wouldn't take it (because that is how your psychology is set up) and yet that you should take it (because it would make your life go better). Such conflicts happen all the time.

My experience in philosophy is that it is fairly common for philosophers to espouse psychological hedonism, though I have never heard anyone argue for psychological utilitarianism. You appear to be arguing against both of these positions. There is a historical tradition of arguing for (ethical) utilitarianism. Even there, the trend is strongly against it these days and it is much more common to hear philosophers arguing that it is false. I'm not sure what you think of this position. From your comments above, it makes it look like you think it is false, but that may just be confusion about the word 'value'.

Comment by Toby_Ord2 on "Can't Say No" Spending · 2007-10-18T13:33:16.000Z · LW · GW


Which correlation studies are you talking about? We would actually need quite some evidence to suggest that aid is net harmful, or very inefficient. I haven't seen anything to suggest this. Even if it has net zero financial effect, that doesn't mean it isn't amazingly efficient at health effects etc. I was very unimpressed with the standard of those Spiegel pieces, especially the interview.

I certainly think we need much more focus on the efficiency of aid (as you know, I'm spending much of my time starting an organization to see to this) and also more randomized trials to assess the impact of various interventions. However, the strongest negative claim we can really make at the moment is that it is quite possible aid has had a net negative effect, but we would need to look into it much more. Going any further strikes me as overconfident.

Comment by Toby_Ord2 on "Can't Say No" Spending · 2007-10-18T11:20:52.000Z · LW · GW

Jeff Gray:

It is easy to get blinded by large numbers, but trillions of dollars over 50 years over billions of people is not very much -- just $20 per person per year or so. It is not surprising that this hasn't industrialised the rest of the world over that period of time. It is an enormous problem and even if tackled very efficiently, it will take trillions more before the gap closes. I strongly suggest using 'dollars per person per year' as the unit to see the relative scales of things.

Comment by Toby_Ord2 on "Can't Say No" Spending · 2007-10-18T11:12:53.000Z · LW · GW

Eliezer: I'm afraid you've got this one quite wrong. I can elaborate further in the future, but for now I'll just expand upon what Carl wrote:

Total aid to Sub-Saharan Africa (SSA) from 1950 onwards = $568 billion (according to Easterly)

(I'm just going to look at things up to 1990 as life expectancy data gets skewed by AIDS at that point. Thus $568 billion is a conservative overestimate of money spent until 1990)

Average population in SSA (1950-1990) = 317 million

Life expectancy in SSA according to World Population Prospects (i.e. the UN estimates): 37.6 in 1950-55; 49.9 in 1985-90

(the World Bank estimates include a larger life expectancy increase, but I'll use the conservative data)

Gains in life expectancy (1950-1990) = 12.3 years (a 33% increase)

Costs per person in SSA = $1,791

Cost per person in SSA per year = $36

So $36 per person per year has been associated with a 33% life expectancy increase. That is just staggering.
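The arithmetic above is easy to verify (a quick check of my own; the 50-year divisor is an assumption, since the $568 billion covers aid 'from 1950 onwards' while the life expectancy window ends in 1990):

```python
total_aid = 568e9          # aid to SSA from 1950 onwards, per Easterly (USD)
population = 317e6         # average SSA population, 1950-1990
years = 50                 # assumed: total spread over roughly 1950-2000

per_person = total_aid / population
per_person_per_year = per_person / years

le_1950, le_1990 = 37.6, 49.9   # UN life expectancy estimates
gain_years = le_1990 - le_1950
gain_pct = gain_years / le_1950 * 100

print(f"${per_person:,.0f} per person")                 # ~ $1,792 (the comment rounds down to $1,791)
print(f"${per_person_per_year:,.0f} per person-year")   # ~ $36
print(f"{gain_years:.1f} years ({gain_pct:.0f}%)")      # 12.3 years (33%)
```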

Even if only a tenth of this increase is due to aid, and there were no morbidity advantages and no economic advantages, it would be a fantastic success compared to spending the money on almost all projects in the developed world.

I think the balance of evidence is actually that aid has done tremendous good in bulk, and certainly that aid by intelligent informed givers at the margin is vastly better again (and there are much better statistics for that one).

It certainly takes more than a single study claiming that aid has no financial benefits to show that aid has been wasted, especially in light of such a strong prima facie case that aid has helped.