This is an interesting theorem which helps illuminate the relationship between unbounded utilities and St Petersburg gambles. I particularly appreciate that you don't make an explicit assumption that the values of gambles must be representable by real numbers, which is very common but unhelpful in a setting like this. However, I do worry a bit about the argument structure.
The St Petersburg gamble is a famously paradox-riddled case. That is, it is a very difficult case where it isn't clear what to say, and many theories seem to produce outlandish results. When this happens, it isn't so impressive to say that we can rule out an opposing theory because in that paradox-riddled situation it would lead to strange results. It strikes me as similar to saying that a rival theory leads to strange results in variable population-size cases so we can reject it (when actually, all theories do), or that it leads to strange results in infinite population cases (when again, all theories do).
Even if one had a proof that an alternative theory doesn't lead to strange conclusions in the St Petersburg gamble, I don't think this would count all that much in its favour, as it seems plausible to me that various rules of decision theory that were developed in the cleaner cases of finite possibility spaces (or well-behaved infinite spaces) need to be tweaked to account for more pathological possibility spaces. For a simple example, I'm sympathetic to the sure thing principle, but it directly implies that the St Petersburg gamble is better than itself, because an unresolved gamble is better than a resolved one, no matter how the latter was resolved. My guess is that this means the sure thing principle needs to have its scope limited to exclude gambles whose value is higher than that of any of their resolutions.
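To make the pathology concrete, here is a minimal Python sketch (my own illustration, not part of the original argument): every resolved play of the St Petersburg gamble pays some finite amount, yet the truncated expected value grows without bound, which is exactly the feature that lets a naive sure-thing argument rank the unresolved gamble above each of its own resolutions.

```python
import random

def play_st_petersburg(rng: random.Random) -> int:
    """Flip a fair coin until it lands heads; the payout doubles with each tail."""
    payout = 2
    while rng.random() < 0.5:  # tails: double the pot and flip again
        payout *= 2
    return payout

def truncated_expected_value(n: int) -> float:
    """Expected value counting only the first n possible stopping points.

    Each outcome k contributes 2**k * 2**-k = 1, so the partial sums diverge."""
    return sum((2 ** k) * (0.5 ** k) for k in range(1, n + 1))

rng = random.Random(0)
print("Ten resolved outcomes (all finite):",
      [play_st_petersburg(rng) for _ in range(10)])
for n in (10, 100, 1000):
    print(f"Expected value truncated at {n} flips:", truncated_expected_value(n))
# Every resolution pays a finite 2**k, strictly less than the (divergent)
# value of the unresolved gamble -- hence the trouble for the sure thing principle.
```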
Regarding your question, I don't see theoretical reasons why one shouldn't be making deals like that (assuming one can and would stick to them etc). I'm not sure which decision theory to apply to them though.
The Moral Parliament idea generally has a problem regarding time. If it is thought of as making decisions for the next action (or other bounded time period), with a new distribution of votes etc when the next choice comes up, then there are intertemporal swaps (and thus Pareto improvements according to each theory) that it won't be able to achieve. This is pretty bad, as it at least appears to be getting Pareto-dominated by another method. However, if it is making one decision for all time over all policies for resolving future decisions, then (1) it is even harder to apply in real life than it looked, and (2) it doesn't seem to be able to deal with cases where you learn more about ethics (i.e. update your credence function over moral theories) -- at least not without quite a bit of extra explanation about how that works. I suppose the best answer may well be that the policies over which the representatives are arguing include branches dealing with all ways the credences could change, weighted by their probabilities. This is even more messy.
My guess is that of these two broad options (decide one bounded decision vs decide everything all at once) the latter is better. But either way it is a bit less intuitive than it first appears.
This is a good idea, though not a new one. Others have abandoned the idea of a formal system for this on the grounds that:
1) It may be illegal
2) Quite a few people think it is illegal or morally dubious (whether or not it is actually illegal or immoral)
It would be insane to proceed with this without resolving (1). If it were illegal, it would open you up to criminal prosecution and, more importantly, seriously hurt the movements you are trying to help. I think that whether or not it turns out to be illegal, (2) is sufficient reason not to pursue it. It may cause serious reputational damage to the movement, which I'd expect to easily outweigh the financial benefits.
I also think that the 10% to 20% boost is extremely optimistic. That would only be achieved if almost everyone was using it and they all wanted to spend most of their money funding charities that don't operate in their countries. I'd expect something more like a boost of a few percent.
Note that there are also very good alternatives. One example is a large effort to encourage people to informally do this in a non-matched way by donating to the subset of effective charities that are tax deductible in their country. This could get most of the benefits for none of the costs.
This is a really nice and useful article. I particularly like the list of problems AI experts assumed would be AI-complete, but turned out not to be.
I'd add that if we are trying to reach the conclusion that "we should be more worried about non-general intelligences than we currently are", then we don't need it to be true that general intelligences are really difficult. It would be enough that "there is a reasonable chance we will encounter a dangerous non-general one before a dangerous general one". I'd be inclined to believe that even without any of the theorising about possibility.
I think one reason for the focus on 'general' in the AI Safety community is that it is a stand-in for the observation that we are not worried about path planners or chess programs or self-driving cars etc. One way to say this is that these are specialised systems, not general ones. But you rightly point out that it doesn't follow that we should only be worried about completely general systems.
Thanks for bringing this up Luke. I think the term 'friendly AI' has become something of an albatross around our necks as it can't be taken seriously by people who take themselves seriously. This leaves people studying this area without a usable name for what they are doing. For example, I talk with parts of the UK government about the risks of AGI. I could never use the term 'friendly AI' in such contexts -- at least without seriously undermining my own points. As far as I recall, the term was not originally selected with the purpose of getting traction with policy makers or academics, so we shouldn't be too surprised if we can see something that looks superior for such purposes. I'm glad to hear from your post that 'AGI safety' hasn't rubbed people up the wrong way, as feared.
It seems from the poll that there is a front runner, which is what I tend to use already. It is not too late to change which term is promoted by MIRI / FHI etc. I think we should.
This is quite possibly the best LW comment I've ever read. An excellent point with a really concise explanation. In fact it is one of the most interesting points I've seen within Kolmogorov complexity too. Well done on independently deriving the result!
Without good ways to overcome selection bias, it is unclear that data like this can provide any evidence of outsized impact of unconventional approaches. I would expect a list of achievements as impressive as the above whether or not there was any correlation between the two.
Carl,
You are completely right that there is a somewhat illicit factor-of-1000 intuition pump in a certain direction in the normal problem specification, which makes it a bit one-sided. Will MacAskill and I had half-written a paper on this and related points regarding decision-theoretic uncertainty and Newcomb's problem before discovering that Nozick had already considered it (even if very few people have read or remembered his commentary on this).
We did still work out, though, that you can use this idea to create compound problems where, for any reasonable distribution of credences across the types of decision theory, you should one-box on one of them and two-box on the other: something that all the (first-order) decision theories agree is wrong. So much the worse for them, we think. I've stopped looking into this, but I think Will has a draft paper where he talks about this alongside some other issues.
Regarding (2), this is a particularly clean way to do it (with some results of my old simulations too).
http://www.amirrorclear.net/academic/papers/sipd.pdf
http://www.amirrorclear.net/academic/ideas/dilemma/index.html
We can't use the universal prior in practice unless physics contains harnessable non-recursive processes. However, this is exactly the situation in which the universal prior doesn't always work. Thus, one source of the 'magic' is that it allows us access to higher levels of computation than the phenomena we are predicting (and to be certain of this).
Also, the constants involved could be terrible and there are no guarantees about this (not even probabilistic ones). It is nice to reach some ratio in the limit, but if your first Graham's number of guesses are bad, then that is very bad for (almost) all purposes.
Phil,
It's not actually that hard to make a commitment to give away a large fraction of your income. I've done it, my wife has done it, several of my friends have done it etc. Even for yourself, the benefits of peace of mind and lack of cognitive dissonance will be worth the price, and by my calculations you can make the benefits for others at least 10,000 times as big as the costs for yourself. The trick is to do some big thinking and decision making about how to live very rarely (say once a year) then limit your salary through regular giving. That way you don't have to agonise at the hairdresser's etc, you just live within your reduced means. Check out my site on this, http://www.givingwhatwecan.org -- if you haven't already.
I didn't watch the video, but I don't see how that could be true. Occam's razor is about complexity, while the conjunction fallacy is about logical strength.
Sure, 'P & Q' is more complex than 'P', but 'P' is simpler than '(P or ~Q)' despite being logically stronger in the same way (P is equivalent to (P or ~Q) & (P or Q)).
(Another way to see this is that violating Occam's razor does not make things fallacies).
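A quick truth-table check of the equivalence used above (my own illustration, not part of the original comment):

```python
from itertools import product

# Verify that P is logically equivalent to (P or ~Q) & (P or Q): the simpler
# sentence P is the logically stronger one, despite Occam preferring it.
for p, q in product([True, False], repeat=2):
    assert p == ((p or not q) and (p or q)), (p, q)
print("P  <=>  (P or ~Q) & (P or Q)  holds on all four valuations")
```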
This certainly doesn't work in all cases:
There is a hidden object which is either green, red or blue. Three people have conflicting opinions about its colour, based on different pieces of reasoning. If you are the one who believes it is green, you have to add up the opponents who say not-green, despite the fact that there is no single not-green position (think of the symmetry -- otherwise everyone could have too great confidence). The same holds true if these are expert opinions.
The above example is basically as general as possible, so in order for your argument to work it will need to add specifics of some sort.
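Here is a small Bayesian sketch of the coloured-object example (my own illustration, with an assumed expert-reliability parameter q): whatever reliability you pick, the symmetry forces a posterior of one third on each colour, so the believer in green really does have to weigh both dissenters against green, even though they disagree with each other.

```python
from math import prod

colours = ["green", "red", "blue"]
reports = ["green", "red", "blue"]  # three experts, one conflicting report each
q = 0.8  # assumed chance that an expert names the true colour

def likelihood(report: str, truth: str) -> float:
    # A mistaken expert errs uniformly over the two remaining colours.
    return q if report == truth else (1 - q) / 2

unnorm = {c: (1 / 3) * prod(likelihood(r, c) for r in reports) for c in colours}
total = sum(unnorm.values())
print({c: round(p / total, 3) for c, p in unnorm.items()})
# {'green': 0.333, 'red': 0.333, 'blue': 0.333} -- the same for any q, by symmetry.
```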
Also, the Koran/Bible case doesn't work. By symmetry, the Koran readers can say that they don't need to add up the Bible readers and the atheists, since they are heterogeneous, so they can keep their belief in the Koran...
I don't think I can be persuaded.
I have many good responses to the comments here, and I suppose I could sketch out some of the main arguments against anti-realism, but there are also many serious demands on my time and sadly this doesn't look like a productive discussion. There seems to be very little real interest in finding out more (with a couple of notable exceptions). Instead the focus is on how to justify what is already believed without finding out anything else about what the opponents are saying (which is particularly alarming given that many commenters are pointing out that they don't understand what the opponents are saying!).
Given all of this, I fear that writing a post would not be a good use of my time.
You are correct that it is reasonable to assign high confidence to atheism even if it doesn't have 80% support, but we must be very careful here. Atheism is presumably the strongest example of such a claim here on Less Wrong (i.e. one which you can tell a great story why so many intelligent people would disagree etc and hold a high confidence in the face of disagreement). However, this does not mean that we can say that any other given view is just like atheism in this respect and thus hold beliefs in the face of expert disagreement, that would be far too convenient.
Roko, you make a good point that it can be quite murky just what realism and anti-realism mean (in ethics or in anything else). However, I don't agree with what you write after that. Your Strong Moral Realism is a claim that is outside the domain of philosophy, as it is an empirical claim in the domain of exo-biology or exo-sociology or something. No matter what the truth of a meta-ethical claim, smart entities might refuse to believe it (the same goes for other philosophical claims or mathematical claims).
Pick your favourite philosophical claim. I'm sure there are very smart possible entities that don't believe this and very smart ones that do. There are probably also very smart entities without the concepts needed to consider it.
I understand why you introduced Strong Moral Realism: you want to be able to see why the truth of realism would matter and so you came up with truth conditions. However, reducing a philosophical claim to an empirical one never quite captures it.
For what it's worth, I think that the empirical claim Strong Moral Realism is false, but I wouldn't be surprised if there was considerable agreement among radically different entities on how to transform the world.
Thanks for looking that up Carl -- I didn't know they had the break-downs. This is the more relevant result for this discussion, but it doesn't change my point much. Unless it was 80% or so in favour of anti-realism, I think holding something like 95% credence in anti-realism is far too high for non-experts.
You are entirely right that the 56% would split up into many subgroups, but I don't really see how this weakens my point: more philosophers support realist positions than anti-realist ones. For what it's worth, the anti-realists are also fragmented in a similar way.
In metaethics, there are typically very good arguments against all known views, and only relatively weak arguments for each of them. For anything in philosophy, a good first stop is the Stanford Encyclopedia of Philosophy. Here are some articles on the topic at SEP:
I think the best book to read on metaethics is:
There are a lot of posts here that presuppose some combination of moral anti-realism and value complexity. These views go together well: if value is not fundamental, but dependent on characteristics of humans, then it can derive complexity from this and not suffer due to Occam's Razor.
There are another pair of views that go together well: moral realism and value simplicity. Many posts here strongly dismiss these views, effectively allocating near-zero probability to them. I want to point out that this is a case of non-experts being very much at odds with expert opinion and being clearly overconfident. In the Phil Papers survey for example, 56.3% of philosophers lean towards or believe realism, while only 27.7% lean towards or accept anti-realism.
http://philpapers.org/surveys/results.pl
Given this, and given comments from people like me in the intersection of the philosophical and LW communities who can point out that it isn't a case of stupid philosophers supporting realism and all the really smart ones supporting anti-realism, there is no way that the LW community should have anything like the confidence that it does on this point.
Moreover, I should point out that most of the realists lean towards naturalism, which allows a form of realism that is very different to the one that Eliezer critiques. I should also add that within philosophy, the trend is probably not towards anti-realism, but towards realism. The high tide of anti-realism was probably in the middle of the 20th Century, and since then it has lost its shiny newness and people have come up with good arguments against it (which are never discussed here...).
Even for experts in meta-ethics, I can't see how their confidence can get outside the 30%-70% range given the expert disagreement. For non-experts, I really can't see how one could even get to 50% confidence in anti-realism, much less the kind of 98% confidence that is typically expressed here.
Why?
Whenever you deviate from maximizing expected value (in contexts where this is possible) you can normally find examples where this behaviour looks incorrect. For example, we might be value-pumped or something.
(And why do you find it odd, BTW?)
For one thing, negentropy may well be one of the most generally useful resources, but it seems somewhat unlikely to be intrinsically good (more likely it matters what you do with it). Thus, the question looks like one of descriptive uncertainty, just as if you had asked about money: uncertainty about whether you value that according to a particular function is descriptive uncertainty for all plausible theories. Also, while evaluative uncertainty does arise in self-interested cases, this example is a strange case of self-interest for reasons others have pointed out.
Thanks for the post Wei, I have a couple of comments.
Firstly, the dichotomy between Robin's approach and Nick's and mine is not right. Nick and I have always been tempted to treat moral and descriptive uncertainty in exactly the same way insofar as this is possible. However, there are cases where this appears to be ill-defined (e.g. how much happiness for utilitarians is worth breaking a promise for Kantians?), and to deal with these cases Nick and I consider methods that are more generally applicable. We don't consider the bargaining/voting/market approach to be very plausible as a contender for a unique canonical answer, but as an approach to at least get the hard cases mostly right instead of remaining silent about them.
In the case you consider (which I find rather odd...) Nick and I would simply multiply it out. However, even if you looked at what our bargaining solution would do, it is not quite what you say. One thing we know is that simple majoritarianism doesn't work (it is equivalent to picking the theory with the highest credence in two-theory cases). We would prefer to use a random dictator model, or allow bargaining over future situations too, or all conceivable situations, such that the proponent of the square-root view would be willing to offer to capitulate in most future votes in order to win this one.
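A hedged sketch of the contrast drawn above (my own illustration, with hypothetical credences): with only two theories, simple majoritarianism always hands the decision to whichever theory has the higher credence, whereas a random dictator model gives each theory a chance of deciding in proportion to one's credence in it.

```python
import random

credences = {"utilitarianism": 0.6, "square-root view": 0.4}  # hypothetical numbers

def majoritarian_choice(credences: dict) -> str:
    # In a two-theory case this always picks the higher-credence theory.
    return max(credences, key=credences.get)

def random_dictator_choice(credences: dict, rng: random.Random) -> str:
    # Each theory dictates with probability equal to one's credence in it.
    theories, weights = zip(*credences.items())
    return rng.choices(theories, weights=weights, k=1)[0]

rng = random.Random(0)
print(majoritarian_choice(credences))          # always 'utilitarianism'
print(random_dictator_choice(credences, rng))  # 'square-root view' 40% of the time
```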
Not quite. It depends on your beliefs about how the calculation could go wrong and how much this would change the result. If you are very confident in all parts except a minor correcting term, and are simply told that there is an error in the calculation, then you can still have some kind of rough confidence in the result (you can see how to spell this out in maths). If you know the exact part of the calculation that was mistaken, then the situation is slightly different, but still not identical to reverting to your prior.
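A toy illustration of my own of how the maths might be spelled out (not the model from the FHI paper mentioned below): give each part of the calculation a prior chance of containing an error and a chance that such an error would overturn the conclusion, then condition on learning only that some error exists.

```python
parts = {
    # part: (prior prob. of an error here, prob. such an error flips the result)
    "main argument":         (0.01, 0.9),
    "minor correcting term": (0.20, 0.05),
}

# For simplicity assume at most one part is in error. Learning only that "there
# is an error somewhere" shifts credence towards the error-prone minor term.
total_error = sum(p for p, _ in parts.values())
posterior_fault = {name: p / total_error for name, (p, _) in parts.items()}

p_overturned = sum(posterior_fault[name] * flip for name, (_, flip) in parts.items())
print("P(fault lies in each part | an error exists):", posterior_fault)
print("P(conclusion overturned | an error exists):", round(p_overturned, 3))
# ~0.09 -- still rough confidence in the result, quite unlike reverting to the prior.
```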
You may wish to check out the paper we wrote at the FHI on the problem of taking into account mistakes in one's own argument. The mathematical result is the same as the one here, but the proof is more compelling. Also, we demonstrate that when applied to the LHC, the result is very different to the above analysis.
http://www.fhi.ox.ac.uk/__data/assets/pdf_file/0006/4020/probing-the-improbable.pdf
We talk about this a bit at FHI. Nick has written a paper which is relevant:
Everybody knows blue/green are correct categories, while grue/bleen are not.
Philosophers invented grue/bleen precisely to be categories that are obviously incorrect, yet difficult to formally separate from the intuitively correct ones. There are of course less obvious cases, but the elucidation of the problem required them to come up with a particularly clear example.
You can also listen to an interview with one of Sarah Lichtenstein's subjects who refused to make his preferences consistent even after the money-pump aspect was explained:
http://www.decisionresearch.org/publications/books/construction-preference/listen.html
Calling fuzzy logic "truth functional" sounds like you're changing the semantics;
'Truth functional' means that the truth value of a sentence is a function of the truth values of the propositional variables within that sentence. Fuzzy logic works this way. Probability theory does not. It is not just that one is talking about degrees of truth and the other is talking about probabilities. The analogue to truth values in probability theory are probabilities, and the probability of a sentence is not a function of the probabilities of the variables that make up that sentence (as I pointed out in the preceding comment, when A and B have the same probability, but A^A has a different probability to A^B). Thus propositional fuzzy logic is inherently different to probability theory.
You might be able to create a version of 'fuzzy logic' in which it is non truth-functional, but then it wouldn't really be fuzzy logic anymore. This would be like saying that there are versions of 'mammal' where fish are mammals, but we have to understand 'mammal' to mean what we normally mean by 'animal'. Sure, you could reinterpret the terms in this way, but the people who created the terms don't use them that way, and it just seems to be a distraction.
At least that is as far as I understand. I am not an expert on non-classical logic, but I'm pretty sure that fuzzy logic is always understood so as to be truth-functional.
You can select your "fuzzy logic" functions (the set of functions used to specify a fuzzy logic, which say what value to assign A and B, A or B, and not A, as a function of the values of A and B) to be consistent with probability theory, and then you'll always get the same answer as probability theory.
How do you do this? As far as I understand, it is impossible since probability is not truth functional. For example, suppose A and B both have probability 0.5 and are independent. In this case, the probability of 'A^B' is 0.25, while the probability of 'A^A' is 0.5. You can't do this in a (truth-functional) logic, as it has to produce the same value for both of these expressions if A and B have the same truth value. This is why minimum and maximum are used.
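A quick simulation of my own making the point above concrete: A and B get the same 'value' (probability 0.5), so any truth-functional conjunction must score A^A and A^B identically, yet their probabilities come apart.

```python
import random

rng = random.Random(0)
n = 100_000
a_and_a = a_and_b = 0
for _ in range(n):
    a = rng.random() < 0.5
    b = rng.random() < 0.5  # independent of a
    a_and_a += a and a
    a_and_b += a and b

print("P(A ^ A) ~", a_and_a / n)  # close to 0.5
print("P(A ^ B) ~", a_and_b / n)  # close to 0.25
# A truth-functional conjunction sees only the pair of values (0.5, 0.5), so
# no choice of function (min, product, ...) can yield both 0.5 and 0.25.
```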
I agree that there are very interesting questions here. We have quite natural ways of describing uncomputable functions very far up the arithmetical hierarchy, and it seems that they can be described in some kind of recursive language even if the things they describe are not recursive (using 'recursive' in the recursion theory sense both times). Turing tried something like this in Systems of Logic Based on Ordinals (Turing, 1939), but that was with formal logic and systems where you repeatedly add the Gödel sentence of a system into the system as an axiom, repeating this into the transfinite. A similar thing could be done describing levels of computability transfinitely far up the arithmetical hierarchy, using recursively represented ordinals to index them. However, then people like you and me will want to use certain well-defined but non-recursive ordinals to do the indexing, and it seems to degenerate in the standard kind of way, just a lot further up the hierarchy than before.
I outlined a few more possibilities on Overcoming Bias last year:
There are many ways Omega could be doing the prediction/placement and it may well matter exactly how the problem is set up. For example, you might be deterministic and he is precalculating your choice (much like we might be able to do with an insect or computer program), or he might be using a quantum suicide method, (quantum) randomizing whether the million goes in and then destroying the world iff you pick the wrong option (this will lead to us observing him being correct 100/100 times, assuming a many-worlds interpretation of QM). Or he could have just got lucky with the last 100 people he tried it on.
If it is the deterministic option, then what do the counterfactuals about choosing the other box even mean? My approach is to say that 'You could choose X' means that if you had desired to choose X, then you would have. This is a standard way of understanding 'could' in a deterministic universe. Then the answer depends on how we suppose the world to be different to give you counterfactual desires. If we do it with a miracle near the moment of choice (history is the same, but then your desires change non-physically), then you ought to two-box, as Omega can't have predicted this. If we do it with an earlier miracle, or with a change to the initial conditions of the universe (the Tannsjo interpretation of counterfactuals), then you ought to one-box, as Omega would have predicted your choice. Thus, if we are understanding Omega as extrapolating your deterministic thinking, then the answer will depend on how we understand the counterfactuals. One-boxers and two-boxers would be people who interpret the natural counterfactual in the example in different (and equally valid) ways.
If we understand it as Omega using a quantum suicide method, then the objectively right choice depends on his initial probabilities of putting the million in the box. If he does it with a 50% chance, then take just one box. There is a 50% chance the world will end either choice, but this way, in the case where it doesn't, you will have a million rather than a thousand. If, however, he uses a 99% chance of putting nothing in the box, then one-boxing has a 99% chance of destroying the world which dominates the value of the extra money, so instead two-box, take the thousand and live.
If he just got lucky a hundred times, then you are best off two-boxing.
If he time travels, then it depends on the nature of time-travel...
Thus the answer depends on key details not told to us at the outset. Some people accuse all philosophical examples (like the trolley problems) of not giving enough information, but in those cases it is fairly obvious how we are expected to fill in the details. This is not true here. I don't think the Newcomb problem has a single correct answer. The value of it is to show us the different possibilities that could lead to the situation as specified and to see how they give different answers, hopefully illuminating the topics of free will, counterfactuals, and prediction.
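For the quantum-suicide reading above, a tiny numerical sketch of my own makes the dependence on Omega's placement probability explicit:

```python
def quantum_suicide_outcomes(p_million_placed: float) -> dict:
    # Omega destroys the world iff your choice mismatches the box contents,
    # so each choice survives only in the matching branch.
    return {
        "one-box": {"P(survive)": p_million_placed, "payoff if survive": 1_000_000},
        "two-box": {"P(survive)": 1 - p_million_placed, "payoff if survive": 1_000},
    }

for p in (0.5, 0.01):
    print(f"P(million placed) = {p}:", quantum_suicide_outcomes(p))
# At p = 0.5 the survival odds are equal either way, so one-boxing wins on payoff;
# at p = 0.01 one-boxing destroys the world 99% of the time, so two-boxing
# (take the thousand and live) looks better despite the smaller payoff.
```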