Comments

Comment by thrawnca on Newcomb's Problem and Regret of Rationality · 2016-11-29T02:19:05.492Z · LW · GW

A: Live 500 years and then die, with certainty. B: Live forever, with probability 0.000000001%; die within the next ten seconds, with probability 99.999999999%

If this were the only chance you would ever get to determine your lifespan, then choose B.

In the real world, it would probably be a better idea to discard both options and use your natural lifespan to search for alternative paths to immortality.

Comment by thrawnca on Newcomb's Problem and Regret of Rationality · 2016-11-29T00:44:26.111Z · LW · GW

somewhat confident of Omega's prediction

51% confidence would suffice.

  • Two-box expected value: 0.51 × $1K + 0.49 × $1.001M ≈ $491,000
  • One-box expected value: 0.51 × $1M + 0.49 × $0 = $510,000

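For illustration, here is a minimal sketch of that calculation in Python, with the predictor's accuracy left as a parameter (the $1,000 / $1,000,000 payoffs are the standard ones from the problem):

```python
# Expected value of one-boxing vs two-boxing as a function of the
# predictor's accuracy p, assuming the standard $1,000 / $1,000,000 boxes.

def one_box_ev(p):
    # Predictor right with probability p: the opaque box holds $1,000,000.
    return p * 1_000_000 + (1 - p) * 0

def two_box_ev(p):
    # Predictor right (prob p): the opaque box is empty, so only the $1,000.
    # Predictor wrong (prob 1 - p): $1,000,000 + $1,000.
    return p * 1_000 + (1 - p) * 1_001_000

print(one_box_ev(0.51))  # ≈ 510,000
print(two_box_ev(0.51))  # ≈ 491,000
```

On these numbers, one-boxing actually pulls ahead at any accuracy above 0.5005, so 51% is comfortably enough.
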
Comment by thrawnca on Counterfactual Mugging · 2016-11-28T05:17:30.070Z · LW · GW

At some point you'll predictably approach death

I'm pretty sure that decision theories are not designed on that basis. We don't want an AI to start making different decisions based on the probability of an upcoming decommission. We don't want it to become nihilistic and stop making decisions because it predicted the heat death of the universe and decided that all paths have zero value. If death is actually tied to the decision in some way, then sure, take that into account, but otherwise, I don't think a decision theory should have "death is inevitably coming for us all" as a factor.

Comment by thrawnca on Counterfactual Mugging · 2016-11-20T22:18:33.990Z · LW · GW

How do you resolve that tension?

Well, as I said before, my view is that the scenario as given (single-shot with no precommitment) is not the most helpful hypothetical for designing a decision theory. An iterated version would actually be more relevant, since we want to design an AI that can make more than one decision. And in the iterated version, the tension is largely resolved, because there is a clear motivation to stick with the decision: we still hope for the next coin to come down heads.
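
As a rough illustration of that point, here is a quick simulation of the iterated version, assuming only the usual $100 / $10,000 stakes, a fair coin, and that Omega rewards only agents whose policy is to pay:

```python
import random

def lifetime_winnings(always_pay, rounds=100_000, seed=0):
    """Total winnings over many repetitions of the mugging.

    Assumes Omega only pays out the $10,000 on heads if the agent is
    the kind of agent that hands over the $100 on tails.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        heads = rng.random() < 0.5
        if heads:
            total += 10_000 if always_pay else 0
        else:
            total -= 100 if always_pay else 0
    return total

print(lifetime_winnings(always_pay=True))   # averages out to roughly +4,950 per round
print(lifetime_winnings(always_pay=False))  # 0
```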

Comment by thrawnca on Ends Don't Justify Means (Among Humans) · 2016-11-20T22:13:42.700Z · LW · GW

what happens when the consequences grow large? Say 1 person to save 500, or 1 to save 3^^^^3?

If 3^^^^3 lives are at stake, and we assume that we are running on faulty or even hostile hardware, then it becomes all the more important not to rely on potentially-corrupted "seems like this will work".

Comment by thrawnca on Evolutions Are Stupid (But Work Anyway) · 2016-10-31T02:18:26.632Z · LW · GW

Well, humans can build calculators. That they can't be the calculators that they create doesn't demand an unusual explanation.

Yes, but don't these articles emphasise how evolution doesn't do miracles, doesn't get everything right at once, and takes a very long time to do anything awesome? The fact that humans can do so much more than normal evolutionary processes marks us as a rather significant anomaly.

Comment by thrawnca on Counterfactual Mugging · 2016-10-31T02:06:56.107Z · LW · GW

Your decision is a result of your decision theory

I get that that could work for a computer, because a computer can be bound by an overall decision theory without attempting to think about whether that decision theory still makes sense in the current situation.

I don't mind predictors in eg Newcomb's problem. Effectively, there is a backward causal arrow, because whatever you choose causes the predictor to have already acted differently. Unusual, but reasonable.

However, in this case, yes, your choice affects the predictor's earlier decision - but since the coin never came down heads, who cares any more how the predictor would have acted? Why care about being the kind of person who will pay the counterfactual mugger, if there will never again be any opportunity for it to pay off?

Comment by thrawnca on Evolutions Are Stupid (But Work Anyway) · 2016-10-28T05:33:55.605Z · LW · GW

Humans can do things that evolutions probably can't do period over the expected lifetime of the universe.

This does raise the question: how, then, did an evolutionary process produce something so much more efficient than itself?

(And if we are products of evolutionary processes, then all our actions are basically facets of evolution, so isn't that sentence self-contradictory?)

Comment by thrawnca on Counterfactual Mugging · 2016-10-28T05:20:17.127Z · LW · GW

there is no distinction between making the decision ahead of time or not

Except that even if you make the decision, what would motivate you to stick to it once it can no longer pay off?

Your only motivation to pay is the hope of obtaining the $10000. If that hope does not exist, what reason would you have to abide by the decision that you make now?

Comment by thrawnca on The Robbers Cave Experiment · 2016-10-28T05:11:52.859Z · LW · GW

I didn't mean to suggest that the existence of suffering is evidence that there is a God. What I meant was, the known fact of "shared threat -> people come together" makes the reality of suffering less powerful evidence against the existence of a God.

Comment by thrawnca on Counterfactual Mugging · 2016-10-14T03:21:25.443Z · LW · GW

we want a rigorous, formal explanation of exactly how, when, and why you should or should not stick to your precommitment

Well, if we're designing an AI now, then we have the capability to make a binding precommitment, simply by writing code. And we are still in a position where we can hope for the coin to come down heads. So yes, in that privileged position, we should bind the AI to pay up.

However, to the question as stated, "is the decision to give up $100 when you have no real benefit from it, only counterfactual benefit, an example of winning?" I would still answer, "No, you don't achieve your goals/utility by paying up." We're specifically told that the coin has already been flipped. Losing $100 has negative utility, and positive utility isn't on the table.

Alternatively, since it's asking specifically about the decision, I would answer: if you don't make the decision until after the coin has come down tails, then paying is the wrong decision. Only if you decide in advance (when you still hope for heads) can a decision to pay have the best expected value.

Even if deciding in advance, though, it's still not a guaranteed win, but rather a gamble. So I don't see any inconsistency in saying, on the one hand, "You should make a binding precommitment to pay", and on the other hand, "If the coin has already come down tails without a precommitment, you shouldn't pay."

If there were a lottery where the expected value of a ticket was actually positive, and someone offered to sell you their ticket at cost price, then buying it in advance would make sense. But if you didn't buy it, and then the winners were announced and that ticket didn't win, buying it would no longer make sense.
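
For what it's worth, the arithmetic behind that distinction is simple (assuming the usual $100 / $10,000 stakes and a fair coin):

```python
# Expected value of a binding precommitment, evaluated before the flip,
# versus the value of paying once the coin has already come down tails.
p_heads = 0.5

ev_precommit_before_flip = p_heads * 10_000 + (1 - p_heads) * (-100)  # +4950.0
ev_pay_after_tails = -100                                             # a pure loss

print(ev_precommit_before_flip, ev_pay_after_tails)
```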

Comment by thrawnca on Counterfactual Mugging · 2016-09-12T04:52:43.381Z · LW · GW

We are told no such thing. We are told it's a fair coin and that can only mean that if you divide up worlds by their probability density, you win in half of them. This is defined.

No, take another look:

in the overwhelming measure of the MWI worlds it gives the same outcome. You don't care about a fraction that sees a different result, in all reality the result is that Omega won't even consider giving you $10000, it only asks for your $100.

Comment by thrawnca on Counterfactual Mugging · 2016-09-11T22:57:45.110Z · LW · GW

I wouldn't trust myself to accurately predict the odds of another repetition, so I don't think it would unravel for me. But this comes back to my earlier point that you really need some external motivation, some precommitment, because "I want the 10K" loses its power as soon as the coin comes down tails.

Comment by thrawnca on Counterfactual Mugging · 2016-09-11T22:54:29.074Z · LW · GW

if your decision theory pays up, then if he flips tails, you pay $100 for no possible benefit.

But in the single-shot scenario, after it comes down tails, what motivation does an ideal game theorist have to stick to the decision theory?

As with Parfit's hitchhiker, although in advance you might agree that it's a worthwhile deal, when it comes to the point of actually paying up, your motivation is gone, unless you have bound yourself in some other way.

Comment by thrawnca on Game Theory As A Dark Art · 2016-08-30T05:19:06.667Z · LW · GW

It should also be possible to milk the scenario for publicity: "Our opponents sold out to the evil plutocrat and passed horrible legislation so he would bankroll them!"

I wish I were more confident that that strategy would actually work...

Comment by thrawnca on Counterfactual Mugging · 2016-08-18T02:52:47.733Z · LW · GW

is the decision to give up $100 when you have no real benefit from it, only counterfactual benefit, an example of winning?

No, it's a clear loss.

The only winning scenario is "the coin comes down heads, and you had an effective commitment to pay if it had come down tails."

By making a binding precommitment, you effectively gamble that the coin will come down heads. If it comes down tails instead, clearly you have lost the gamble. Giving the $100 when you didn't even make the precommitment would just be pointlessly giving away money.

Comment by thrawnca on Counterfactual Mugging · 2016-08-18T02:42:02.810Z · LW · GW

The beggars-and-gods formulation is the same problem.

I don't think so; I think the element of repetition substantially alters it - but in a good way, one that makes it more useful in designing a real-world agent. Because in reality, we want to design decision theories that will solve problems multiple times.

At the point of meeting a beggar, although my prospects of obtaining a gold coin this time around are gone, nonetheless my overall commitment is not meaningless. I can still think, "I want to be the kind of person who gives pennies to beggars, because overall I will come out ahead", and this thought remains applicable. I know that I can average out my losses with greater wins, and so I still want to stick to the algorithm.

In the single-shot scenario, however, my commitment becomes worthless once the coin comes down tails. There will never be any more 10K; there is no motivation any more to give 100. Following my precommitment, unless it is externally enforced, no longer makes any sense.

So the scenarios are significantly different.

Comment by thrawnca on Counterfactual Mugging · 2016-08-18T02:34:22.370Z · LW · GW

Sorry, but I'm not in the habit of taking one for the quantum superteam. And I don't think that it really helps to solve the problem; it just means that you don't necessarily care so much about winning any more. Not exactly the point.

Plus we are explicitly told that the coin is deterministic and comes down tails in the majority of worlds.

Comment by thrawnca on Counterfactual Mugging · 2016-08-17T07:52:19.068Z · LW · GW

I think that what really does my head in about this problem is this: although I may right now be motivated to make a commitment because of the hope of winning the 10K, my commitment cannot rely on that motivation, because when it comes to the crunch, that possibility has evaporated and the associated motivation is gone. I can only make an effective commitment if I have something more persistent - like the suggested $1000 contract with a third party. Without that, I cannot trust my future self to follow through, because the reasons that I would currently like it to follow through will no longer apply.

MBlume stated that if you want to be known as the sort of person who'll do X given Y, then when Y turns up, you'd better do X. That's a good principle - but it too can't apply, unless at the point of being presented with the request for $100, you still care about being known as that sort of person - in other words, you expect a later repetition of the scenario in some form or another. This applies as well to Eliezer's reasoning about how to design a self-modifying decision agent - which will have to make many future decisions of the same kind.

Just wanting the 10K isn't enough to make an effective precommitment. You need some motivation that will persist in the face of no longer having the possibility of the 10K.

Comment by thrawnca on Counterfactual Mugging · 2016-08-16T22:29:35.680Z · LW · GW

This is an attempt to examine the consequences of that.

Yes, but if the artificial scenario doesn't reflect anything in the real world, then even if we get the right answer, therefore what? It's like being vaccinated against a fictitious disease; even if you successfully develop the antibodies, what good do they do?

It seems to me that the "beggars and gods" variant mentioned earlier in the comments, where the opportunity repeats itself each day, is actually a more useful study. Sure, it's much more intuitive; it doesn't tie our brains up in knots, trying to work out a way to intend to do something at a point when all our motivation to do so has evaporated. But reality doesn't have to be complicated. Sometimes you just have to learn to throw in the pebble.

Comment by thrawnca on 0 And 1 Are Not Probabilities · 2016-08-16T01:25:40.970Z · LW · GW

Perhaps the only appropriate uses for probability 0 and 1 are to refer to logical contradictions (eg P & !P) and tautologies (P -> P), rather than real-world probabilities?
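
For reference, one way to see why those two values sit apart from ordinary real-world probabilities is the log-odds form the post leans on, where 0 and 1 are exactly the points pushed out to infinity (a standard identity, not something from the comment itself):

```latex
\log\frac{p}{1-p} \longrightarrow -\infty \text{ as } p \to 0,
\qquad
\log\frac{p}{1-p} \longrightarrow +\infty \text{ as } p \to 1
```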

Comment by thrawnca on 0 And 1 Are Not Probabilities · 2016-08-16T01:17:11.730Z · LW · GW

Nope. He's saying that based on his best analysis, it appears to be the case.

Comment by thrawnca on Counterfactual Mugging · 2016-08-15T22:44:15.634Z · LW · GW

Does this particular thought experiment really have any practical application?

I can think of plenty of similar scenarios that are genuinely useful and worth considering, but all of them can be expressed with much simpler and more intuitive scenarios - eg when the offer will/might be repeated, or when you get to choose in advance whether to flip the coin and win 10000/lose 100. But with the scenario as stated - what real phenomenon is there that would reward you for being willing to counterfactually take an otherwise-detrimental action for no reason other than qualifying for the counterfactual reward? Even if we decide the best course of action in this contrived scenario - therefore what?

Comment by thrawnca on Counterfactual Mugging · 2016-08-15T22:38:43.378Z · LW · GW

Also, there is the possibility of future scenarios arising in which Bob could choose to take comparable actions, and we want to encourage him in doing so. I agree that the cases are not exactly analogous.

Comment by thrawnca on Asch's Conformity Experiment · 2016-07-27T03:52:56.915Z · LW · GW

people who are say, religious, or superstitious, or believe in various other obviously false things

Why do you think you know this?

Comment by thrawnca on Universal Law · 2016-07-27T02:41:16.808Z · LW · GW

A while ago, I came across a mathematics problem involving the calculation of the length of one side of a triangle, given the internal angles and the lengths of the other two sides. Eventually, after working through the trigonometry of it (which I have now forgotten, but could re-derive if I had to), I realised that it incorporated Pythagoras' Theorem, but with an extra term based on the cosine of one of the angles. The cosine of 90 degrees is zero, so in a right-angled triangle, this extra term disappears, leaving Pythagoras' Theorem as usual.

The older law that I knew turned out to be a special case of the more general law.
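
The general law in question is the law of cosines; writing a and b for the known sides, C for the angle between them, and c for the side being calculated (labels are mine, not from the original comment), setting C to 90° zeroes out the extra term and recovers Pythagoras:

```latex
c^2 = a^2 + b^2 - 2ab\cos C,
\qquad
\cos 90^\circ = 0 \;\Rightarrow\; c^2 = a^2 + b^2
```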

Comment by thrawnca on The Least Convenient Possible World · 2016-07-27T02:08:55.247Z · LW · GW

If the hypothetical Omega tells you that there is indeed a maximum value for happiness, and you will certainly be maximally happy inside the box: do you step into the box then?

This would depend on my level of trust in Omega (why would I believe it? Because Omega said so. Why believe Omega? That depends on how much Omega has demonstrated near-omniscience and honesty). And in the absence of Omega telling me so, I'm rather skeptical of the idea.

Comment by thrawnca on Guardians of the Truth · 2016-07-27T01:45:53.618Z · LW · GW

If you believe in G-d then you believe in a being that can change reality just by willing it

OK, so by that definition...if you instead believe in a perfect rationalist that has achieved immortality, lived longer than we can meaningfully express, and now operates technology that is sufficiently advanced to be indistinguishable from magic, including being involved in the formation of planets, then - what label should you use instead of 'G-d'?

Comment by thrawnca on Mere Messiahs · 2016-07-27T01:18:49.670Z · LW · GW

I know what a garage would behave like if it contained a benevolent God

Do you, though? What if that God was vastly more intelligent than us; would you understand all of His reasons and agree with all of His policy decisions? Is there not a risk that you would conclude, on balance, "There should be no 'banned products shops'", while a more knowledgeable entity might decide that they are worth keeping open?

Comment by thrawnca on Mere Messiahs · 2016-07-27T01:14:14.649Z · LW · GW

it greatly changes the "facts" in your "case study".

Actually, does it not add another level of putting Jesus on a pedestal above everyone else?

It changes the equation when comparing Jesus to John Perry (indicating that Jesus' suffering was greatly heroic after all), but perhaps intensifies the "Alas, somehow it seems greater for a hero to have steel skin and godlike powers."

(Btw I'm one of the abovementioned Christians. Just thought I'd point out that the article's point is not greatly changed.)

Comment by thrawnca on Harry Potter and the Methods of Rationality discussion thread, part 16, chapter 85 · 2016-07-24T22:33:10.930Z · LW · GW

you cannot be destroyed

In the sense that your mind and magic will hang around, yes. But your material form can still be destroyed, and material destruction of a Horcrux will destroy its ability to anchor the spirit.

So, if two people are mutual Horcruxen, you can still kill person 1, at which point s/he will become a disembodied spirit dependent on person 2, but will cease to be an effective Horcrux for person 2. You can then kill person 2, which will permanently kill both of them.

All you really achieve with mutual Horcruxen is to make your Horcrux portable and fragile (subject to illness, aging, accident, etc).

Comment by thrawnca on The Robbers Cave Experiment · 2016-07-22T03:06:16.319Z · LW · GW

It appears the key issue in creating conflict is that the two groups must not be permitted to get to know each other and become friendly

Because then, of course, they might start attributing each other's negative actions to environmental factors, instead of assuming them to be based on inherent evil.

Comment by thrawnca on The Robbers Cave Experiment · 2016-07-22T02:57:21.697Z · LW · GW

If those are the unfortunate downsides of policies that are worthwhile overall, then I don't think that qualifies for 'supervillain' status.

I mean, if you're postulating the existence of God, then that also brings up the possibility of an afterlife, etc, so there could well be a bigger picture and higher stakes than threescore years and ten. Sometimes it's rational to say, That is a tragedy, but this course of action is still for the best. Policy debates should not appear one-sided.

If anything, this provides a possible answer to the atheist's question, "Why would God allow suffering?"

Comment by thrawnca on Religion's Claim to be Non-Disprovable · 2016-04-08T04:27:24.049Z · LW · GW

Isn't this over-generalising?

"religion makes claims, not arguments, and then changes its claims when they become untenable." "claims are all religion has got" "the religious method of claiming is just 'because God said so'"

Which religion(s) are you talking about? I have a hard time accepting that anyone knows enough to talk about all of them.

Comment by thrawnca on The Least Convenient Possible World · 2016-03-22T02:29:30.888Z · LW · GW

The happiness box is an interesting speculation, but it involves an assumption that, in my view, undermines it: "you will be completely happy."

This is assuming that happiness has a maximum, and the best you can do is top up to that maximum. If that were true, then the happiness box might indeed be the peak of existence. But is it true?

Comment by thrawnca on How to Convince Me That 2 + 2 = 3 · 2016-03-07T04:49:21.860Z · LW · GW

Email sent about a week ago. Did it get spam-filtered?

Comment by thrawnca on Privileging the Hypothesis · 2016-03-01T10:27:43.537Z · LW · GW

I tend to think that the Bible and the Koran are sufficient evidence to draw our attention to the Jehovah and Allah hypotheses, respectively. Each is a substantial work of literature, claiming to have been inspired by direct communication from a higher power, and each has millions of adherents claiming that its teachings have made them better people. That isn't absolute proof, of course, but it sounds to me like enough to privilege the hypotheses.

Comment by thrawnca on The ethics of eating meat · 2016-02-19T10:43:11.585Z · LW · GW

I think it's pretty clear that animals can feel pain, distress, etc. So we should aim for practices that minimise those things. It's certainly possible - though harder on a mass scale like factory farming.

Also, from a utilitarian perspective, it's clear that eating plants is much more ecologically efficient than feeding plants to animals and then eating the animals. On the other hand, as Elo points out, there are crops and terrain that are not well suited to human food, and might more profitably be used to raise edible animals.

So I'd say that there could be an equilibrium, a point where our overall meat consumption is about right; less would be basically a wasted opportunity; more would be an inefficient use of resources and a risk of oppressive practices. And I'd say that that point is much lower than current overall consumption.

Comment by thrawnca on How to Convince Me That 2 + 2 = 3 · 2016-02-19T01:42:59.939Z · LW · GW

"if, for example, there was an Islamic theologian who offered to debate the issues with me then I would be inclined to do it and follow where the belief updates lead."

Is that an open offer to theologians of all stripes?

Comment by thrawnca on How to Convince Me That 2 + 2 = 3 · 2016-02-17T00:57:48.345Z · LW · GW

In discussing Newcomb's problem, Eliezer at one point stated, "Be careful of this sort of argument, any time you find yourself defining the "winner" as someone other than the agent who is currently smiling from on top of a giant heap of utility."

This aligns well with a New Testament statement from Jesus, "Ye shall know them by their fruits...every good tree bringeth forth good fruit, but a corrupt tree bringeth forth evil fruit."

So, I'm only a novice of the Bayesian Conspiracy, but I can calculate the breast cancer percentages and the red vs blue pearl probabilities. To answer Eliezer's question, to convince me of the truth of Islam, it would have to show me better outcomes, better fruit, than Christianity, across the board. In many cases its principles don't conflict with Christianity; but where they do, I would have to establish that resolving the conflict in favor of Islam will lead to better outcomes.

Consider, as just one example, the fruits identified by LW's own Swimmer963 at http://lesswrong.com/lw/4pg/positive_thinking/. Would following Islam give me a better foundation than that for positive thinking, resilience, and motivation for mutual help? Then I would be interested. Not convinced by that alone, but interested.

If Islam consistently offered a more coherent worldview that more effectively helped me to become a better person and achieve my goals, then I would have cause to consider that it might have more truth than what I believe now. As far as I have yet determined, it doesn't; its teachings seem to be more limited, less useful, than what I already have.