One way to get the desired outcome is to replace U(x) with U(x,p) (with x being the monetary reward and p the probability of getting it), and define U(x,p) = 2x if p = 1 and U(x,p) = x otherwise. I doubt that this is a useful model of reality, but mathematically it would do the trick. My stated opinion is that this special case should be looked at in the light of more general strategies/heuristics applied over a variety of situations, and this approach would still fall short of that.
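To make the trick concrete, here's a minimal sketch in Python. I'm assuming the usual payoffs from the Allais post under discussion ($24,000 for certain vs. a 33/34 chance at $27,000, and the scaled-down 34%/33% versions); the $27,000 figure in particular is my recollection, not something stated in this comment.

```python
# A minimal sketch of the U(x,p) assignment described above: certainty doubles utility.
# The concrete payoffs are assumed, not given in this comment.

def U(x, p):
    """Utility takes the probability as a second input; certainty gets a bonus."""
    return 2 * x if p == 1 else x

def value(lottery):
    """Value of a lottery given as (probability, payoff) pairs."""
    return sum(p * U(x, p) for p, x in lottery)

option_1A = [(1.0, 24000)]     # $24,000 for certain
option_1B = [(33/34, 27000)]   # 33/34 chance of $27,000 (assumed figure)
option_2A = [(0.34, 24000)]
option_2B = [(0.33, 27000)]

print(value(option_1A), value(option_1B))  # 48000.0 vs ~26205.9 -> prefer 1A
print(value(option_2A), value(option_2B))  # 8160.0 vs 8910.0    -> prefer 2B
```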
I know Settlers of Catan, and own it. It's been a while since I last played it, though.
Your point about games made me aware of a crucial difference between real life and games, or other abstract problems of chance: in the latter, chances are always known without error, because we set the game (or problem) up to have certain chances. In real life, we predict events either via causality (100% chance, no guesswork involved, unless things come into play we forgot to consider), or via experience / statistics, and that involves guesswork and margins of error. If there's a prediction with a 100% chance, there is usually a causal relationship at the bottom of it; with a chance less than 100%, there is no such causal chain; there must be some factor that can thwart the favorable outcome; and there is a chance that this factor has been assessed wrong, and that there may be other factors that were overlooked. Worst case, a 33/34 chance might actually only be 30/34 or less, and then I'd be worse off taking the chance. Comparing a .33 with a .34 chance makes me think that there's gotta be a lot of guesswork involved, and that, with error margins and confidence intervals and such, there's usually a sizeable chance that the underlying probabilities might be equal or reversed, so going for the higher reward makes sense.
[rewritten] Imagine you are a mathematical advisor to a king who asks you to advise him on a course of action and to predict the outcome. In the 2A/2B situation, you can pretty much advise whatever, because you'll predict a failure; the outcome either confirms your prediction, or is a lucky windfall, so the king will be content with your advice in hindsight. In the 1A/1B situation, you'll predict a gain; if you advised A, your prediction will be confirmed, but if you advised B, there's a chance it won't be, with the king angry at you because he didn't make the money you predicted he would. Your career is over. -- Now imagine a collection of autonomous agents, or a bundle of heuristics fighting for Darwinist survival, and you'll see which strategy survives. [If you like stereotypes, imagine the "king" as "mathematician's non-mathematical spouse". ;-)]
That's a neat trick; however, I am not sure I understand you correctly. You seem to be saying that risk-avoidance does not explain the 1A/2B preference, because your assignment captures risk-avoidance and it doesn't lead to that. (It does lead to your take on the term, though - the resulting preference just isn't 1A/2B.)
Your assignment looks like "diminishing utility", i.e. a utility function where the utility scales up subproportionally with money (e.g. twice the money must yield less than twice the utility). Do you think diminishing utility is equivalent to risk-avoidance? And if so, can you explain why?
You seem to have examples in mind?
The utility function has as its input only the monetary reward in this particular instance. Your idea that risk-avoidance can have utility (or that 1% chances are useless) cannot be modelled with the set of equations given to analyse the situation (the probability is not an input to the U() function) - the model falls short because the utility attaches only to the money and nothing else. (Another example of a group of individuals for whom the risk itself may carry more utility than the reward: gambling addicts.) Security is, all other things being equal, preferred over insecurity, and we could probably devise some experimental setup to translate this into a monetary utility equivalent (i.e. how much is the test subject prepared to pay for security and predictability? That is the margin of insurance companies, btw). :-P
I wanted to suggest that a real-life utility function ought to consider even more: not just the single case, but also the strategies used in it - do these strategies or heuristics have better utility in my life than trying to figure out the best possible action for each problem separately? In that case, an optimal strategy may well be suboptimal in some cases, but work well over a realistic lifetime filled with probable events, even if you don't contrive a $24000 life-or-death operation. (Should I spend two years of my life studying more statistics, or work on my father's farm? The farm might profit me more in the long run, even if I would miss out if somebody made me the 1A/1B offer, which is very unlikely, making that strategy the rational one in the larger context, though it appears irrational in the smaller one.)
The problem as stated is hypothetical: there is next to no context, and it is assumed that the utility scales with the monetary reward. Once you confront real people with this offer, the context expands, and the analysis of the hypothetical situation falls short of being an adequate representation of reality, not necessarily because of a fault of the real people.
Many real people use a strategy of "don't gamble with money you cannot afford to lose"; this is overall a pretty successful strategy (and if I were looking to make some money, my mark would be the person who likes to take risks - just make him successively better offers until he eventually loses, and if he doesn't, hit him over the head, take the now substantial amount of money and run). To abandon this strategy just because in this one case it looks somewhat less profitable might not be effective in the long run. (In other circumstances, people on this site talk about self-modification to counter expected situations such as one-boxing vs. two-boxing; can we consider this strategy such a self-modification?)
Another useful real-life strategy is, "stay away from stuff you don't understand" - $24,000 free and clear is easier to grasp than the other offer, so that strategy favors 1A as well, and doesn't apply to 2A vs. 2B because they're equally hard to understand. The framing of offer two also suggests that the two offers might be compared by multiplying percentage and values, while offer 1 has no such suggestion in branch 1A.
We're looking at a hypothetical situation, analysed for an ideal agent with no past and no future - I'm not surprised the real world is more complex than that.
After a good night's sleep, here are some more thoughts:
the idea is that you pay up because you feel obligated to your counterfactual self.
To feel obligated to my counterfactual self, which exists only in the "mind" of Omega, and not feel obligated to Omega doesn't make any sense to me.
Your additional assumptions about Omega destroy the utility that the $100 had - in the original version, $100 is $100 to both me and Omega, but in your version it is nothing to Omega. Your amended version of the problem amounts to "would I throw $100 into an incinerator on the basis of some thought experiment", and that is clearly not even a zero-sum game if you consider the whole system - the original problem is zero-sum, and that gives me more freedom of choice.
Why would I not hold them responsible? They are the ones who are trying to make us responsible by giving us an opportunity to act, but their opportunities are much more direct - after all, they created the situation that exerts the pressure on us. This line of thought is mainly meant to be argued in the terms of Fred, who has a problem with feeling responsible for this suffering (or non-pleasure) - it offers him a way out of the conundrum without relinquishing his compassion for humanity (i.e. I feel the ending as written is illogical, and I certainly think "Michael" is acting very unprofessionally for a psychoanalyst). ["Relinquish the compassion" is also the conclusion you seem to have drawn, thus my response here.]
Of course, the alien strategy might not be directed at our sense of responsibility, but at some sort of game theoretic utility function that proposes the greater good for the greater number - these utility functions are always sort of arbitrary (most of them on lesswrong center around money, with no indication why money should be valuable), and the arbitrariness in this case consists in including the alien simulations, but not the aliens themselves. If the aliens are "rational agents", then not rewarding their behaviour will make them stop it if it has a cost, while rewarding it will make them continue. (Haven't you ever wondered how many non-rational entities are trying to pose conundrums to rational agents on here? ;)
I don't have a theory of quantifiable responsibility, and I don't have a definite answer for you. Let's just say there is only a limited amount of stuff we can do in the time that we have, so we have to make choices about what to do with our lives. I hope that Fred comes to feel that he can accomplish more with his life than to indirectly die for a tortured simulation that serves alien interests.
"The central problem in all of these thought experiments is the crazy notion that we should give a shit about the welfare of other minds simply because they exist and experience things analogously to the way we experience things."
Well, I see the central problem in the notion that we should care about something that happens to other people if we're not the ones doing it to them. Clearly, the aliens are sentient; they are morally responsible for what happens to these humans. While we certainly should pursue possible avenues to end the suffering, we shouldn't act as if we were the ones doing it.
I don't see how your points apply: I would have paid had I lost. Except if my hypothetical self is so much in debt that it can't reasonably spend $100 on an investment such as this - in which case Omega would have known in advance, and understands my nonpayment.
I do not consider the future existence of Omega as a factor at all, so it doesn't matter whether it self-destructs or not. And it is also a given that Omega is absolutely trustworthy (more than I could say for myself).
My view is that this may well be one of the undecidable propositions that Gödel has shown must exist in any reasonably complex formal system. The only way to make it decidable is to think outside the box, and in this case that means I consider that someone else is somehow still "me" (at least under ethical aspects) - there are other threads on here that involve splitting myself and still remaining the same person somehow, so it's not intrinsically irrational or anything. My reference to Buddhism was merely meant to show that the concept is mainstream enough to be part of a major world religion; most other religions and the UN charter of human rights have it as well, though less pronounced, as "brotherhood" - not a factual, but an ethical identity.
The problem is easier to decide with a small change that also makes it more practical. Suppose two competing laboratories design a machine intelligence and bid for a government contract to produce it. The government will evaluate the prototypes and choose one of them for mass-production (the "winner", getting multiplied); due to the R&D effort involved, the company that loses the bid will go into receivership, and the machine intelligence not chosen will be auctioned off, but never reproduced (the "loser").
The question is: should the developers anticipate mass-production? Should they instruct the machine intelligence to expect mass-production?
Assuming that after the evaluation process, both machine intelligences are turned off, to be turned on again after either mass-production or the auction has occurred, should the machine intelligence expect to be the original, or a copy?
The obvious answer: the developers will rationally both expect mass-production and teach their machines to expect it, because, of the machine intelligences that exist after this process, most will operate under the correct assumption, and only one will need to be taught that this assumption was wrong. The machine ought to expect to be a "winner".
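A back-of-the-envelope version of that argument, with the size of the production run as a made-up number (the scenario doesn't specify it):

```python
# Hypothetical production-run size N; the scenario above doesn't specify it.
N = 10_000

machines_after = N + 1                                      # N copies of the winner plus the one auctioned loser
correct_if_taught_to_expect_winning = N / machines_after    # only the single loser ends up wrong
correct_if_taught_to_expect_losing = 1 / machines_after     # only the single loser ends up right

print(correct_if_taught_to_expect_winning)  # ~0.9999
print(correct_if_taught_to_expect_losing)   # ~0.0001
```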
It is bad to apply statistics when you don't in fact have large numbers - we have just one universe (at least until the many-worlds theory is better established - and anyway, the exposition didn't mention it).
I think the following problem is equivalent to the one posed: It is late at night, you're tired, and it's dark and you're driving down an unfamiliar road. Then you see two motels, one to the right of the street, one to the left, both advertising vacant rooms. You know from a visit years ago that one has 10 rooms, the other has 100, but you can't tell which is which (though you do remember that the larger one is cheaper). Anyway, you're tired, so you just choose the one on the right at random, check in, and go to sleep. As you wake up in the morning, what are your chances that you find yourself in the larger motel? Does the number of rooms come into it? (Assume both motels are 90% full.)
The paradox is that while the other motel is not counterfactual, it might as well be - the problem will play out the same. Same with the universe - there aren't actually two universes with probabilities attached to which one you'll end up in.
For a version where the Bayesian update works, you'd not go to a motel directly, but go to a tourist information stall that directs visitors to either the smaller or the larger motel until both are full - in that case, expect to wake up in the larger one. In this case, we have not one world, but two, and then the reasoning holds.
But if there's only one motel, because the other burnt down (and we don't know which), we're back to 50/50.
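Here is a quick Monte Carlo sketch of the two versions above (the coin-flip choice and the tourist-information stall); the room counts come from the story, but how exactly the stall distributes guests is my assumption:

```python
import random

def coin_flip_choice():
    """Original story: you pick one of the two motels at random yourself."""
    return random.choice(["small", "large"])

def tourist_information():
    """Bayesian version: assume the stall ends up distributing guests in
    proportion to motel size (10 rooms vs. 100 rooms)."""
    return random.choices(["small", "large"], weights=[10, 100])[0]

trials = 100_000
print(sum(coin_flip_choice() == "large" for _ in range(trials)) / trials)     # ~0.50
print(sum(tourist_information() == "large" for _ in range(trials)) / trials)  # ~0.91
```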
I know that "fuzzy logic" tries to mix statistics and logic, and many AIs use it to deal with uncertain assertions, but statistics can be misapplied so easily that you seem to have a problem here.
"Suppose you have ten ideal game-theoretic selfish agents and a pie to be divided by majority vote. "
Well then, the statistical expected (average) share any agent is going to get long-term is 1/10th of the pie. The simplest solution that ensures this is the equal division; anticipating this from the start cuts down on negotiation costs, and if a majority agrees to follow this strategy (i.e. agrees not to realize more than their "share"), it is also stable - anyone who ponders upsetting it risks being the "odd man out" who eats the loss of an asymmetric strategy.
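A minimal simulation of that symmetry argument, assuming winning coalitions of six form at random and split the whole pie among themselves (the quote doesn't say how coalitions form, so that part is my assumption):

```python
import random

agents = range(10)
shares = [0.0] * 10
rounds = 100_000

for _ in range(rounds):
    coalition = random.sample(agents, 6)   # a bare majority takes the whole pie...
    for a in coalition:
        shares[a] += 1 / 6                 # ...and splits it equally among its members

print([round(s / rounds, 2) for s in shares])  # every agent averages ~0.1 of a pie per round
```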
In practice (i.e. in real life) there are other situations that are relatively stable: after a few rounds of "outsiders" bidding low to get in, there might be two powerful "insiders" who get large shares in liaison with four smaller insiders who agree to a very small share because it is better than nothing; the best the insiders can do then is to offer the four outsiders small shares also, so that each small-share individual will be faced with the choice of cooperating and receiving a small share, or not cooperating and receiving nothing. Whether the two insiders can pull this off will depend on how they frame the problem, and how they present themselves ("we are the stabilizers who ensure that 'social justice' is done and nobody has to starve").
How you can get an AI to understand setups like this (and if it wants to move past the singularity, it probably will have to) seems to be quite a problem; to recognize that statistically, it can realize no more than 1/10th, and to push for the simplest solution that ensures this seems far easier (and yet some commentators seem to think that this solution of "cutting everyone in" is somehow "inferior" as a strategy - puny humans ;-).
This is actually a parable on the boundaries of self (think a bit Buddhist here). Let me pose it another way: late last night in the pub, my past self committed to a drunken bet of $100 vs. $200 on the flip of a coin (the other guy was even more drunk than I was). My past self lost, but didn't have the money. This morning, my present self gets a phone call from the person it lost to. Does it honor the bet? Assuming, as is typical in these hypothetical problems, that we can ignore the consequences (else we'd have to assign a cost to them that might well offset the gains, so we'll just assign 0 and not consider them), a utilitarian approach says that I should default on the bet if I can get away with it. Why should I be responsible for what I said yesterday?
However, as usual in utilitarian dilemmas, the effect that we get in real life is that we have a conscience - can I live with myself being the kind of person who doesn't honor past commitments? So, most people will, out of one consideration or another, not think twice about paying up the $100.
Of Omega it is said that I can trust it more than I would myself. It knows more about me than I do myself. It would be part of myself if I didn't consider it separate from myself. If I consider my ego and Omega part of the same all-encompassing self, then honoring the commitment that Omega entered into on my behalf should draw the same response as if I had made it myself. Only if I perceive Omega as a separate entity to whom I am not morally obligated can I justify not paying the $100. Only with this individualist viewpoint will I see someone whom I am not obligated to in any way demanding $100 of me.
If you manage to instill your AI with a sense of the "common good", a sense of brotherhood of all intelligent creatures, then it will, given the premises of trust etc., cooperate in this brotherhood - in fact, that is what I believe would be one of the meanings of "friendly".
I believe both of your computations are correct, and the fallacy lies in mixing up the payoff for the group with the payoff for the individual - which the framing of the problem as posed does suggest, with multiple identities that are actually the same person. More precisely, the probabilities for the individual are 90/10, but the probabilities for the groups are 50/50, and if you compute payoffs for the group (+$12/-$52), you need to use the group probabilities. (It would be different if the narrator ("I") offered the guinea pig ("you") the $12/$52 odds individually.)
byrnema looked at the result from the group viewpoint; you get the same result when you approach it from the individual viewpoint, if done correctly, as follows:
For a single person, the correct payoff is not $12 vs. -$52, but rather $0.67 ($1 minus $6/18 to reimburse the reds) with probability 90%, and -$26 ($1 minus $54/2) with probability 10%, so each of the copies of the guinea pig is going to be out of pocket by 2/3 × 0.9 + (-26) × 0.1 = 0.6 - 2.6 = -2 dollars, on average.
The fallacy of Eliezer's guinea pigs is that each of them thinks they get the $18 each time, which means that the 18 goes into their computation twice (squared) for their winnings (18 * 18/20). This is not a problem with anthropic reasoning, but with statistics.
A distrustful individual would ask themselves, "what is the narrator getting out of it", and realize that it is the narrator, not the guinea pig, who will see the -$12/+$52 outcome - and that to the narrator, the 50/50 probability applies. Don't mix them up!
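For anyone who wants to check the arithmetic of both viewpoints, here's a short sketch; the underlying setup (20 people, 18 green rooms vs. 2 on heads, 2 vs. 18 on tails, +$1 per green and -$3 per red if the greens take the bet) is my reconstruction from the dollar figures quoted above:

```python
p_heads = 0.5
group_heads = 18 * 1 - 2 * 3    # +$12: 18 greens win $1 each, 2 reds lose $3 each
group_tails = 2 * 1 - 18 * 3    # -$52: 2 greens win $1 each, 18 reds lose $3 each

# Narrator / group view: a plain 50/50 over the two worlds.
print(p_heads * group_heads + (1 - p_heads) * group_tails)   # -20.0 for the whole group

# Individual green-roomer view: anthropic 90/10, sharing that world's group
# result equally among its greens.
per_green_heads = group_heads / 18   # $0.67
per_green_tails = group_tails / 2    # -$26
print(0.9 * per_green_heads + 0.1 * per_green_tails)         # ~-2.0 per green decision-maker

# Consistency check: -20 total over an expected 10 greens is also -$2 per green.
```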
I don't understand how the examples given illustrate free-floating beliefs: they seem to have at least some predictive powers, and thus shape anticipation - (some comments by others below illustrate this better).
The phlogiston theory had predictive power (e.g. what kind of "air" could be expected to support combustion, and that substances would grow lighter when they burned), and it was falsifiable (and was eventually falsified). It had advantages over the theories it replaced and was replaced by another theory which represented a better understanding. (I base this reading on Jim Loy's page on Phlogiston Theory.)
Literary genres don't have much predictive power if you don't know anything about them - if you do, then they do. Classifying a writer as producing "science fiction" or "fantasy" creates anticipations that are statistically meaningful. For another comparison, saying some band plays "Death Metal" will shape our anticipation; somewhat differently for those who can distinguish Death Metal from Speed Metal as compared to those who merely know that "Metal" means "noise".
I can imagine beliefs leading to false anticipations, and they're obviously inferior to beliefs leading to more correct ones. That doesn't mean they're free-floating.
One example for the free-floating belief is actually about the tree falling in the forest: to believe that it makes a sound does not anticipate any sensory experience, since the tree falls explicitly where nobody is around to hear it, and whether there is sound or no sound will not change how the forest looks when we enter it later. However, to let go of the belief that the tree makes a sound does not seem to me to be very useful. What am I missing?
I understand that many beliefs are held not because they have predictive power, but because they generalize experiences (or thoughts) we have had into a condensed form: a sort of "packing algorithm" for the mind when we detect something common; and when we understand this commonality well enough, we get to the point where we can make predictions - and if we can't yet, we may be able to later. There is no belief or thought we can hold that we couldn't trace back to experiences; beliefs are not anticipatory, but formed from hindsight. They organize past experience. Can you predict which of these beliefs is not going to be helpful in organizing future experiences? How?
An explicit belief that you would not allow yourself to hold under these conditions would be that the tree which falls in the forest makes a sound - because no one heard it, and because we can't sense it afterwards, whether it made a sound or not has no empirical consequence.
Every time I have seen this philosophical question posed on lesswrong, the two sophists that were arguing about it were in agreement that a sound would be produced (under the physical definition of the word), so I'd be really surprised if you could let go of that belief.
Yes, that's the post I was referring to. Thank you!
Of course, these analyses and exercises would also serve beautifully as use-cases and tests if you wanted to create an AI that can pass a Turing test for being rational. ;-)
"beneath my notice"
I'm referring to that. Sending that message is an implicit lie -- well, you could call it a "social fiction", if you like a less loaded word.
It is also a message that is very likely to be misunderstood (I don't yet know my way around lesswrong well enough to find it again, but I think there's an essay here someplace that deals with the likelihood of recipients understanding something completely different from what you intended, while you are unable to detect this because the interpretation you know shapes your perception of what you said).
So if your true reaction is "you are just trying to reduce my status, and I don't think it's worth it for me to discuss this further", my choice, given the option to not display it or to display it, would usually be to display it, if a reaction was expected of me.
I hope I was able to clarify my distinction between having a true reaction, and displaying it. In a nutshell, if you notice something, you have a reaction, and by not displaying it (when it is expected of you), you create an ambiguous situation that is not likely to communicate to the other person what you want it to communicate.
In another comment on this post, Eugine Nier linked to Schelling. I read that post, and the Slate page that mentions Schelling vs. Vietnam, and it became clear to me that acting morally serves as an "antidote" to these underhanded strategies that count on your opponent being rational. (It also serves as a Gödelian meta-layer to decide problems that can't be decided rationally.)
If, in Schelling's example, the guy who is left with the working radio set is moral, he might reason that "the other guy doesn't deserve the money if he doesn't work for it", and from that moral strongpoint refuse to cooperate. Now if the rationalist knows he's dealing with a moralist, he'll also know that his immoral strategy won't work, so he won't attempt it in the first place - a victory for the moralist in a conflict that hasn't even occurred (in fact, the moralist need never know that the rationalist intended to cheat him).
This is different from simply acting irrationally in that the moralist's reaction remains predictable.
So it is possible that moral indignation helps me prevent other people from manoeuvring me into a position where I don't want to be.
Well, it seems I misunderstand your statement, "It is possible to not control anger but instead never even feel it in the first place, without effort or willpower."
I know it is possible to experience anger, but control it and not act angry - there is a difference between having the feeling and acting on it. I know it is also possible to not feel anger, or to only feel anger later, when distanced from the situation. I'm ok with being aware of the feeling and not acting on it, but to get to the point where you don't feel it is where I'm starting to doubt whether it's really a net benefit.
And yes, I do understand that with a different understanding of / assumptions about other people, stuff that would have otherwise bothered me (or someone else) is no longer a source of anger. You changed your outlook on and understanding of that type of situation so that your emotion is frustration and not anger. If that's what you meant originally, I understand now.
My opinion? I'd not lie. You've noticed the attempt, why claim you didn't? Display your true reaction.
And yet, not to feel an emotion in the first place may obscure you to yourself - it's a two-sided coin. To opt to not know what you're feeling when I struggle to find out seems strange to me.
The problem with the downvote is that it mixes the messages "I don't agree" with "I don't think others should see this". There is no way to say "I don't agree, but that post was worth thinking about", is there? Short of posting a comment of your own, that is.
Eliezer, you state in the intro that the 5-second-level is a "method of teaching rationality skills". I think it is something different.
First, the analysis phase is breaking down behaviour patterns into something conscious; this can apply to my own patterns as I figure out what I need to (or want to) teach, or to other people's patterns that I wish to emulate and instill into myself.
It breaks down "rationality" into small chunks of "behaviour" which can then be taught using some sort of conditioning - you're a bit unclear on how "teaching exercises" for this should be arrived at.
You suggest a form of self-teaching: the 5-second analysis identifies situations where I want some desired behaviour to trigger, and pre-thinks my reaction to the point where it doesn't take me more than 5 seconds to use. In effect, I am installing a memory of thoughts that I wish to have in a future situation. (I could understand this as communicating with "future me" if I like science fiction. ;) Your method of limiting this to the "5-second level" aims to make this pre-thinking specific enough that it actually works. With practice, this response will trigger subconsciously, and I'll have modified my behaviour.
It would be nice if that actually helped us talk about rationality more clearly (but won't we be too specific and miss the big picture?), and it would be nice if it helped us arrive at a "rationality syllabus" and a way to teach it. I'm looking forward to reports of using this technique in an educational setting - what your experience and that of your students was in trying to implement it. Until your theory is tested in that kind of setting, it's no more than a theory, and I'm disinclined to believe the "you need to" from the first sentence of your article.
Is rationality just a behaviour, or is it more? Can we become (more) rational by changing our behaviour, and then have that changed behaviour change our mind?
Assuming the person who asks the question wants to learn something and not hold a Socratic argument, what they need is context. They need context to anchor the new information (there's a word "red", in this case) to what they already know. You can give this context in the abstract or in the specific (the "one step up, one step down" method that jimrandomh describes above achieves this), but it doesn't really matter which. The more different ways you can find, the better the other person will understand, and the richer a concept they will take away from your conversation. (I'm obviously bad at doing this.)
An example is language learning: a toddler doesn't learn language by getting words explained, they learn language by hearing sounds used in certain contexts and recalling the association where appropriate.
I suspect that the habit of answering questions badly is taught in school, where an answer is often not meant to transfer knowledge, but to display it. If asked "What is a car?", answering that it has wheels and an engine will get you a better grade than stating that your mom drives a Ford, even though talking about your experience with your mom's car would have helped a car-less friend better understand what it means to have one.
So what we need to learn (and what good teachers have learned) is to take questions and, in a subconscious reaction, translate them into a realisation of what the asking person needs to know: what knowledge they are missing that made them ask the question, and to provide it. And that depends on context as well: the question "what is red" could be properly answered by explaining when the DHS used to issue red alerts (they don't color-code any more), by explaining the relation of a traffic light to traffic, or by explaining what red means in Lüscher's color psychology or in Chinese chromotherapy. If I see a person nicknamed Red enter at the far side of the room wearing a red sweater, and I shudder and remark "I don't like red", and someone then asks me "what do you mean, red", I ought to simply say that I meant the color - any talk of stop signs or fire engines would be very strange. To be specific, I would answer "that sweater".
To wrap this overlong post up, I don't think there's an innate superiority of the specific over the abstract. What I'll employ depends on what the person I'm explaining stuff to already understands. A 5-second "exercise" designed to emphasise the specific over the abstract can help me overcome a mental bias of not considering specifics in my explanations (possibly instilled by the education system). It widens the pool that I can draw my answers from, and that makes me a potentially better answerer.