Posts

St. Petersburg Mugging Implies You Have Bounded Utility 2011-06-07T15:06:55.021Z
Pancritical Rationalism Can Apply to Preferences and Behavior 2011-05-25T12:06:11.700Z
The Aliens have Landed! 2011-05-19T17:09:16.761Z
Why is my sister related to me only 50%? 2011-05-06T18:17:58.381Z
Leadership and Self Deception, Anatomy of Peace 2011-05-06T03:56:55.132Z
Friendly to who? 2011-04-16T11:43:06.959Z

Comments

Comment by TimFreeman on Models as definitions · 2015-04-22T05:52:47.069Z · LW · GW

Humans can be recognized inductively: Pick a time such as the present when it is not common to manipulate genomes. Define a human to be everyone genetically human at that time, plus all descendants who resulted from the naturally occurring process, along with some constraints on the life from conception to the present to rule out various kinds of manipulation.

Or maybe just say that the humans are the genetic humans at the start time, and that's all. Caring for the initial set of humans should lead to caring for their descendants because humans care about their descendants, so if you're doing FAI you're done. If you want to recognize humans for some other purpose this may not be sufficient.

Predicting human behavior seems harder than recognizing humans, so it seems to me that you're presupposing the solution of a hard problem in order to solve an easy problem.

An entirely separate problem is that if you train to discover what humans would do in one situation and then stop training and then use the trained inference scheme in new situations, you're open to the objection that the new situations might be outside the domain covered by the original training.

Comment by TimFreeman on "Spiritual" techniques that actually work thread · 2015-03-11T15:09:53.326Z · LW · GW

Hyperventilating leads to hallucinations instead of stimulation. I went to a Holotropic Breathwork session once. Some years before that, I went to a Sufi workshop in NYC where Hu was chanted to get the same result. I have to admit I cheated at both events -- I limited my breathing rate or depth so not much happened to me.

Listening to the reports from the other participants of the Holotropic Breathwork session made my motives very clear to me. I don't want any of that. I like the way my mind works. I might consider making purposeful and careful changes to how my mind works, but I do not want random changes. I don't take psychoactive drugs for the same reason.

Comment by TimFreeman on The Problem with AIXI · 2014-03-20T15:26:04.414Z · LW · GW

If you give up on the AIXI agent exploring the entire set of possible hypotheses and instead have it explore a small fixed list, the toy models can be very small. Here is a unit test for something more involved than AIXI that's feasible because of the small hypothesis list.
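
A minimal sketch of the general idea, assuming a tiny fixed hypothesis list, a one-step greedy action choice, and two made-up toy environments (an illustration of the restricted setting, not the unit test linked above):

```python
# Toy Bayesian agent over a small, fixed list of hypotheses (environment models).
# Each hypothesis maps (history, action) to a distribution over (obs, reward) pairs.

def env_rewards_action_0(history, action):
    # Deterministic toy environment that rewards action 0.
    return {(0, 1 if action == 0 else 0): 1.0}

def env_rewards_action_1(history, action):
    # Deterministic toy environment that rewards action 1.
    return {(0, 1 if action == 1 else 0): 1.0}

HYPOTHESES = [env_rewards_action_0, env_rewards_action_1]

def pick_action(weights, history, actions=(0, 1)):
    # Greedy one-step choice: maximize expected reward under the current mixture.
    def expected_reward(a):
        return sum(w * p * r
                   for w, h in zip(weights, HYPOTHESES)
                   for (o, r), p in h(history, a).items())
    return max(actions, key=expected_reward)

def update_weights(weights, history, action, obs, reward):
    # Bayesian update of the hypothesis weights on the observed (obs, reward) pair.
    new = [w * h(history, action).get((obs, reward), 0.0)
           for w, h in zip(weights, HYPOTHESES)]
    total = sum(new) or 1.0
    return [w / total for w in new]
```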

Comment by TimFreeman on Group Rationality Diary, January 16-31 · 2014-02-28T06:58:17.773Z · LW · GW

Getting a programming job is not contingent on getting a degree. There's an easy test for competence at programming in a job interview: ask the candidate to write code on a whiteboard. I am aware of at least one Silicon Valley company that does that and have observed them to hire people who never finished their BS in CS. (I'd rather ask candidates to write code and debug on a laptop, but the HR department won't permit it.)

Getting a degree doesn't hurt. It might push up your salary -- even if one company has enough sense to evaluate the competence of a programmer directly, the other companies offering jobs to that programmer are probably looking at credentials, so it's rational for a company to base salaries on credentials even if they are willing to hire someone who doesn't have them. Last I checked, a BS in CS made sense financially, an MS made some sense too, and a PhD was not worth the time unless you want a career writing research papers. I got a PhD apparently to postpone coming into contact with the real world. Do not do that.

If you can't demonstrate competent programming in a job interview (either due to stage fright or due to not being all that competent), getting a degree is very important. I interview a lot of people and see a lot of stage fright. I have had people I worked with and knew to be competent not get hired because of how they responded emotionally to the interview situation. What I'm calling "stage fright" is really cognitive impairment due to the emotional situation; it is usually less intense than the troubles of a thespian trying to perform on stage. Until you've done some interviews, you don't know how much the interview situation will impair you.

Does anyone know if ex-military people get stage fright at job interviews? You'd think that being trained to kill people would fix the stage fright when there's only one other person in the room and that person is reasonably polite, but I have not had the opportunity to observe both the interview of an ex-military person and their performance as a programmer in a realistic work environment.

Comment by TimFreeman on Lifestyle interventions to increase longevity · 2014-02-28T06:10:01.424Z · LW · GW

I have experienced consequences of donating blood too often. The blood donation places check your hemoglobin, but I have experienced iron deficiency symptoms when my hemoglobin was normal and my serum ferritin was low. The symptoms were twitchy legs when I was trying to sleep, and insomnia; the iron deficiency was confirmed with a ferritin test. The symptoms went away and my ferritin returned to normal when I took iron supplements and stopped donating blood, and I stopped the iron supplements after the normal ferritin test.

The blood donation places will encourage you to donate every 2 months, and according to a research paper I found when I was having this problem, essentially everyone who donates that often for two years will end up with low serum ferritin.

I have no reason to disagree with the OP's recommendation of donating blood every year or two.

Comment by TimFreeman on Quantum Mechanics and Personal Identity · 2013-10-27T19:39:51.998Z · LW · GW

Well, I suppose it's an improvement that you've identified what you're arguing against.

Unfortunately the statements you disagree with don't much resemble what I said. Specifically:

The argument you made was that copy-and-destroy is not bad because a world where that is done is not worse than our own.

I did not compare one world to another.

Pointing out that your definition of something, like harm, is shared by few people is not argumentum ad populum, it's pointing out that you are trying to sound like you're talking about something people care about but you're really not.

I did not define "harm".

The disconnect between what I said and what you heard is big enough that saying more doesn't seem likely to make things better.

The intent to make a website for the purpose of fostering rational conversation is good, and this one is the best I know, but it's still so cringe-inducing that I ignore it for months at a time. This dialogue was typical. There has to be a better way but I don't know what it is.

Comment by TimFreeman on Quantum Mechanics and Personal Identity · 2013-10-26T05:52:47.433Z · LW · GW

Nothing I have said in this conversation presupposed ignorance, blissful or otherwise.

I give up, feel free to disagree with what you imagine I said.

Check out Argumentum ad Populum. With all the references to "most people", you seem to be committing that fallacy so often that I am unable to identify anything else in what you say.

Comment by TimFreeman on Quantum Mechanics and Personal Identity · 2013-10-25T05:50:46.383Z · LW · GW

This reasoning can be used to justify almost any form of "what you don't know won't hurt you". For instance, a world where people cheated on their spouse but it was never discovered would function, from the point of view of everyone, as well as or better than the similar world where they remained faithful.

Your example is too vague for me to want to talk about. Does this world have children that are conceived by sex, children that are expensive to raise, and property rights? Does it have sexually transmitted diseases? Does it have paternity tests? Does it have perfect contraception? You stipulated that affairs are never discovered, so liberal use of paternity tests implies no children from the affairs.

I'm also leery of the example because I'm not sure it's relevant. If you turn off the children, in some scenarios you turn off the evolution so my idea of looking at evolution to decide what concepts are useful doesn't work. If you leave the children in the story, then for some values of the other unknowns jealousy is part of the evolutionarily stable strategy, so your example maybe doesn't work.

Can you argue your point without relying so much on the example? "Most of us think X is bad" is perhaps true for the person-copying scheme and if that's the entire content of your argument then we can't address the question of whether most of us should think X is bad.

Comment by TimFreeman on Quantum Mechanics and Personal Identity · 2013-10-21T15:44:03.887Z · LW · GW

OTOH, some such choices are worse than others.

If you have an argument, please make it. Pointing off to a page with a laundry list of 37 things isn't an argument.

One way to find useful concepts is to use evolutionary arguments. Imagine a world in which it is useful and possible to commute back and forth to Mars by copy-and-destroy. Some people do it and endure arguments about whether they are still the "same" person when they get back; some people don't do it because of philosophical reservations about being the "same" person. Since we hypothesized that visiting Mars this way is useful, the ones without the philosophical reservation will be better off, in the sense that if visiting Mars is useful enough they will be able to out-compete the people who won't visit Mars that way.

So if you want to say that going places by copy-and-destroy is a bad thing for the person taking the trip, you should be able to describe the important way in which this hypothetical world where copy-and-destroy is useful is different from our own. I can't do that, and I would be very interested if you can.

Freezing followed by destructive upload seems moderately likely to be useful in the next few decades, so this hypothetical situation with commuting to Mars is not irrelevant.

Comment by TimFreeman on Open Problems Related to Solomonoff Induction · 2013-06-23T18:34:35.814Z · LW · GW

Suppose we define a generalized version of Solomonoff Induction based on some second-order logic. The truth predicate for this logic can't be defined within the logic, and therefore a device that can decide the truth value of arbitrary statements in this logic has no finite description within this logic. If an alien claimed to have such a device, this generalized Solomonoff induction would assign the hypothesis that they're telling the truth zero probability, whereas we would assign it some small but positive probability.

I'm not sure I understand you correctly, but there are two immediate problems with this:

  • If the goal is to figure out how useful Solomonoff induction is, then "a generalized version of Solomonoff Induction based on some second-order logic" is not relevant. We don't need random generalizations of Solomonoff induction to work in order to decide whether Solomonoff induction works. I think this is repairable, see below.
  • Whether the alien has a device that does such-and-such is not a property of the world, so Solomonoff induction does not assign a probability to it. At any given time, all you have observed is the behavior of the device for some finite past, and perhaps what the inside of the device looks like, if you get to see. Any finite amount of past observations will be assigned positive probability by the universal prior so there is never a moment when you encounter a contradiction.

If I understand your issue right, you can explore the same issue using stock Solomonoff induction: What happens if an alien shows up with a device that produces some uncomputable result? The prior probability of the present situation will become progressively smaller as you make more observations and asymptotically approach zero. If we assume quantum mechanics really is nondeterministic, that will be the normal case anyway, so nothing special is happening here.

Comment by TimFreeman on Open Problems Related to Solomonoff Induction · 2013-06-23T18:02:11.551Z · LW · GW

Consider an arbitrary probability distribution P, and the smallest integer (or the lexicographically least object) x such that P(x) < 1/3^^^3 (in Knuth's up-arrow notation). Since x has a short description, a universal distribution shouldn't assign it such a low probability, but P does, so P can't be a universal distribution.

The description of x has to include the description of P, and that has to be computable if a universal distribution is going to assign positive probability to x.

If P has a short computable description, then yes, you can conclude that P is not a universal distribution. Universal distributions are not computable.
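
A rough formalization of that case, as I understand it (this gloss assumes P is computable and short, and that comparing P(x) against the threshold is decidable):

```latex
% x is computable from a program for P plus O(1) extra bits, so
%   K(x) <= K(P) + O(1).
% By the coding theorem, a universal prior M satisfies M(x) >= 2^{-K(x)-O(1)}, hence
\[
  M(x) \;\ge\; 2^{-K(P)-O(1)} \;\gg\; \frac{1}{3\uparrow\uparrow\uparrow 3} \;>\; P(x),
\]
% so P disagrees with M at x and cannot itself be a universal distribution.
```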

If the shortest computable description of P is long, then you can't conclude from this argument that P is not a universal distribution, but I suspect that it still can't be a universal distribution, since P is computable.

If there is no computable description of P, then we don't know that there is a computable description of x, so you have no contradiction to start with.

Comment by TimFreeman on Crisis of Faith · 2013-06-23T16:49:14.740Z · LW · GW

You're absolutely right that learning to lie really well and actually lying to one's family, the "genuinely wonderful people" they know, everyone in one's "social structure" and business, as well as one's husband and daughter MIGHT be the "compassionate thing to do". But why would you pick out exactly that option among all the possibilities?

Because it's a possibility that the post we're talking about apparently did not consider. The Litany of Gendlin was mentioned in the original post, and I think that when interpreted as a way to interact with others, the Litany of Gendlin is obviously the wrong thing to do in some circumstances.

Perhaps having these beautifully phrased things with a person's name attached is a liability. If I add a caveat that it's only about one's internal process, or it's only about communication with people that either aspire to be rational or that you have no meaningful relationship with, then it's not beautifully phrased anymore, and it's not the Litany of Gendlin anymore, and it seems hopeless for the resulting Litany of Tim to get enough mindshare to matter.

But where exactly is the boundary dividing those things that, however uncomfortable or even devastating, must be said or written and those things about which one can deceive or dupe those one loves and respects?

Actually it wasn't a rhetorical question. I was genuinely curious how you'd describe the boundary.

I'm not curious about that, and in the absence of financial incentives I'm not willing to try to answer that question. There is no simple description of how to deal with the world that's something a reasonable person will actually want to do.

Comment by TimFreeman on Crisis of Faith · 2013-06-22T19:55:59.132Z · LW · GW

You seem to think that if you can imagine even one possible short-term benefit from lying or not-disclosing something, then that's sufficient justification to do so.

That's not what I said. I said several things, and it's not clear which one you're responding to; you should use quote-rebuttal format so people know what you're talking about. Best guess is that you're responding to this:

[learning to lie really well] might be the compassionate thing to do, if you believe that the people you interact with would not benefit from hearing that you no longer believe.

You sharpened my "might be" to "is" just so you could disagree.

But where exactly is the boundary dividing those things that, however uncomfortable or even devastating, must be said or written and those things about which one can deceive or dupe those one loves and respects?

This is a rhetorical question, and it only makes sense in context if your point is that in the absence of such a boundary with an exact location that makes it clear when to lie, we should be honest. But if you can clearly identify which side of the boundary the alternative you're considering is on because it is nowhere close to the boundary, then the fact that you don't know exactly where the boundary is doesn't affect what you should do with that alternative.

You're doing the slippery slope fallacy.

Heretics have been burned at the stake before, so compassion isn't the only consideration when you're deciding whether to lie to your peers about your religious beliefs. My main point is that the Litany of Gendlin is sometimes a bad idea. We should be clear that you haven't cast any doubt on that, even though you're debating whether lying to one's peers is compassionate.

Given that religious relatives tend to fubar cryonics arrangements, the analogy with being burned at the stake is apt. Religious books tend to say nothing about cryonics, but the actual social process of religious groups tends to be strongly against it in practice.

(Edit: This all assumes that the Litany of Gendlin is about how to interact with others. If it's about internal dialogue, then of course it's not saying that one should or should not lie to others. IMO it is too ambiguous.)

Comment by TimFreeman on The Unfinished Mystery of the Shangri-La Diet · 2012-12-24T23:34:52.969Z · LW · GW

Just drink two tablespoons of extra-light olive oil early in the morning... don't eat anything else for at least an hour afterward... and in a few days it will no longer take willpower to eat less; you'll feel so full all the time, you'll have to remind yourself to eat.

...and then increase the dose to 4 tablespoons if that doesn't work, and then try some other stuff such as crazy-spicing your food if that doesn't work, according to page 62 and Chapter 6 of Roberts' "Shangri-La Diet" book. I hope you at least tried the higher dose before giving up.

Comment by TimFreeman on Secrets of the eliminati · 2012-10-31T04:33:13.542Z · LW · GW

How do you add two utilities together?

They are numbers. Add them.

So are the atmospheric pressure in my room and the price of silver. But you cannot add them together (unless you have a conversion factor from millibars to dollars per ounce).

Your analogy is invalid, and in general analogy is a poor substitute for a rational argument. In the thread you're replying to, I proposed a scheme for getting Alice's utility to be commensurate with Bob's so they can be added. It makes sense to argue that the scheme doesn't work, but it doesn't make sense to pretend it does not exist.

Comment by TimFreeman on On Being Okay with the Truth · 2012-05-29T03:57:36.968Z · LW · GW

I would expect that peer pressure can make people stop doing evil things (either by force, or by changing their cost-benefit calculation of evil acts). Objective morality, or rather a definition of morality consistent within the group can help organize efficient peer pressure.

So in a conversation between a person A who believes in objective morality and a person B who does not, a possible motive for A is to convince onlookers by any means possible that objective morality exists. Convincing B is not particularly important, since effective peer pressure merely requires having enough people on board, not having any particular individual on board. In those conversations, I always had the role of B, and I assumed, perhaps mistakenly, that A's primary goal was to persuade me since A was talking to me. Thank you for the insight.

Comment by TimFreeman on Morality is not about willpower · 2011-10-10T18:56:45.483Z · LW · GW

The fallacy is the one I just described: attaching a utility function post hoc to what the system does and does not do.

A fallacy is a false statement. (Not all false statements are fallacies; a fallacy must also be plausible enough that someone is at risk of being deceived by it, but that doesn't matter here.) "Attaching a utility function post hoc to what the system does and does not do" is an activity. It is not a statement, so it cannot be false, and it cannot be a fallacy. You'll have to try again if you want to make sense here.

The TSUF is not a utility function.

It is a function that maps world-states to utilities, so it is a utility function. You'll have to try again if you want to make sense here too.

We're nearly at the point where it's not worth my while to listen to you because you don't speak carefully enough. Can you do something to improve, please? Perhaps get a friend to review your posts, or write things one day and reread them the next before posting, or simply make an effort not to say things that are obviously false.

Comment by TimFreeman on Morality is not about willpower · 2011-10-10T18:30:16.304Z · LW · GW

With [the universal] prior, TSUF-like utility functions aren't going to dominate the set of utility functions consistent with the person's behavior

How do you know this? If that's true, it can only be true by being a mathematical theorem...

No, it's true in the same sense that the statement "I have hands" is true. That is, it's an informal empirical statement about the world. People can be vaguely understood as having purposeful behavior. When you put them in strange situations, this breaks down a bit, and if you wish to understand them as having purposeful behavior you have to contrive the utility function a bit, but for the most part people do things for a comprehensible purpose. If TSUFs were the simplest utility functions that described humans, then human behavior would be random, which it isn't. Thus the simplest utility functions that describe humans aren't going to be TSUF-like.

Comment by TimFreeman on Morality is not about willpower · 2011-10-10T00:21:23.003Z · LW · GW

Some agents, but not all of them, determine their actions entirely using a time-invariant scalar function U(s) over the state space.

If we're talking about ascribing utility functions to humans, then the state space is the universe, right? (That is, the same universe the astronomers talk about.) In that case, the state space contains clocks, so there's no problem with having a time-dependent utility function, since the time is already present in the domain of the utility function.

Thus, I don't see the semantic misunderstanding -- human behavior is consistent with at least one utility function even in the formalism you have in mind.

(Maybe the state space is the part of the universe outside of the decision-making apparatus of the subject. No matter, that state space contains clocks too.)

The interesting question here for me is whether any of those alternatives to having a utility function mentioned in the Allais paradox Wikipedia article are actually useful if you're trying to help the subject get what they want. Can someone give me a clue how to raise the level of discourse enough so it's possible to talk about that, instead of wading through trivialities? PM'ing me would be fine if you have a suggestion here but don't want it to generate responses that will be more trivialities to wade through.

Comment by TimFreeman on Morality is not about willpower · 2011-10-09T23:59:47.163Z · LW · GW

This is the Texas Sharpshooter fallacy again. Labelling what a system does with 1 and what it does not with 0 tells you nothing about the system.

You say "again", but in the cited link it's called the "Texas Sharpshooter Utility Function". The word "fallacy" does not appear. If you're going to claim there's a fallacy here, you should support that statement. Where's the fallacy?

It makes no predictions. It does not constrain expectation in any way. It is woo.

The original claim was that human behavior does not conform to optimizing a utility function, and I offered the trivial counterexample. You're talking like you disagree with me, but you aren't actually doing so.

If the only goal is to predict human behavior, you can probably do it better without using a utility function. If the goal is to help someone get what they want, so far as I can tell you have to model them as though they want something, and unless there's something relevant in that Wikipedia article about the Allais paradox that I don't understand yet, that requires modeling them as though they have a utility function.

You'll surely want a prior distribution over utility functions. Since they are computable functions, the usual Universal Prior works fine here, so far as I can tell. With this prior, TSUF-like utility functions aren't going to dominate the set of utility functions consistent with the person's behavior, but mentioning them makes it obvious that the set is not empty.

Comment by TimFreeman on Morality is not about willpower · 2011-10-09T05:20:16.623Z · LW · GW

The Utility Theory folks showed that behavior of an agent can be captured by a numerical utility function iff the agent's preferences conform to certain axioms, and Allais and others have shown that human behavior emphatically does not.

A person's behavior can always be understood as optimizing a utility function; it's just that if they are irrational (as in the Allais paradox) the utility functions start to look ridiculously complex. If all else fails, a utility function can be used that has a strong dependency on time in whatever way is required to match the observed behavior of the subject. "The subject had a strong preference for sneezing at 3:15:03pm October 8, 2011."

From the point of view of someone who wants to get FAI to work, the important question is, if the FAI does obey the axioms required by utility theory, and you don't obey those axioms for any simple utility function, are you better off if:

  • the FAI ascribes to you some mixture of possible complex utility functions and helps you to achieve that, or

  • the FAI uses a better explanation of your behavior, perhaps one of those alternative theories listed in the wikipedia article, and helps you to achieve some component of that explanation?

I don't understand the alternative theories well enough to know if the latter option even makes sense.

Comment by TimFreeman on A Rationalist's Tale · 2011-09-30T00:25:12.725Z · LW · GW

Before my rejection of faith, I was plagued by a feeling of impending doom.

I was a happy atheist until I learned about the Friendly AI problem and estimated the likely outcome. I am now plagued by a feeling of impending doom.

Comment by TimFreeman on Secrets of the eliminati · 2011-09-23T20:11:50.322Z · LW · GW

If everyone's inferred utility goes from 0 to 1, and the real-life utility monster cares more than the other people about one thing, the inferred utility will say he cares less than other people about something else. Let him play that game until the something else happens, then he loses, and that's a fine outcome.

That's not the situation I'm describing; if 0 is "you and all your friends and relatives getting tortured to death" and 1 is "getting everything you want," the utility monster is someone who puts "not getting one thing I want" at, say, .1 whereas normal people put it at .9999.

You have failed to disagree with me. My proposal exactly fits your alleged counterexample.

Suppose Alice is a utility monster where:

  • U(Alice, torture of everybody) = 0
  • U(Alice, everything) = 1
  • U(Alice, no cookie) = 0.1
  • U(Alice, Alice dies) = 0.05

And Bob is normal, except he doesn't like Alice:

  • U(Bob, torture of everybody) = 0
  • U(Bob, everything) = 1
  • U(Bob, Alice lives, no cookie) = 0.8
  • U(Bob, Alice dies, no cookie) = 0.9

If the FAI has a cookie it can give to Bob or Alice, it will give it to Alice: U(cookie to Bob) = U(Bob, everything) + U(Alice, everything but a cookie) = 1 + 0.1 = 1.1, while U(cookie to Alice) = U(Bob, everything but a cookie) + U(Alice, everything) = 0.8 + 1 = 1.8. Thus Alice gets her intended reward for being a utility monster.

However, if there are no cookies available and the FAI can kill Alice, it will do so for the benefit of Bob: U(Bob, Alice lives, no cookie) + U(Alice, Alice lives, no cookie) = 0.8 + 0.1 = 0.9, while U(Bob, Alice dies, no cookie) + U(Alice, Alice dies) = 0.9 + 0.05 = 0.95. The basic problem is that Alice's cookie fixation ate up so much of her utility range that her desire to live in the absence of the cookie is outweighed by Bob finding her irritating.

Another problem with Alice's utility is that it supports the FAI doing lotteries that Alice would apparently prefer but a normal person would not. For example, assuming the outcome for Bob does not change, the FAI should prefer 50% Alice dies + 50% Alice gets a cookie (adds to 0.525) over 100% Alice lives without a cookie (which is 0.1). This is a different issue from interpersonal utility comparison.
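
For concreteness, here is the arithmetic above written out as a small script (the utility values are copied from the example; the variable names are mine):

```python
# The Alice/Bob arithmetic from the example above, written out explicitly.
u_alice = {"everything": 1.0, "no_cookie": 0.1, "dies": 0.05}
u_bob = {"everything": 1.0, "alice_lives_no_cookie": 0.8, "alice_dies_no_cookie": 0.9}

# Who gets the cookie: compare the total utility of each allocation.
cookie_to_bob = u_bob["everything"] + u_alice["no_cookie"]                # 1.1
cookie_to_alice = u_bob["alice_lives_no_cookie"] + u_alice["everything"]  # 1.8
assert cookie_to_alice > cookie_to_bob  # Alice gets the cookie.

# No cookie available: compare Alice living with Alice dying.
alice_lives = u_bob["alice_lives_no_cookie"] + u_alice["no_cookie"]       # 0.9
alice_dies = u_bob["alice_dies_no_cookie"] + u_alice["dies"]              # 0.95
assert alice_dies > alice_lives  # The FAI prefers the outcome where Alice dies.

# The lottery Alice's utility function endorses but a normal person would not.
lottery = 0.5 * u_alice["dies"] + 0.5 * u_alice["everything"]             # 0.525
assert lottery > u_alice["no_cookie"]  # 0.525 > 0.1
```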

How do you add two utilities together?

They are numbers. Add them.

And if humans turn out to be adaption-executers, then utility is going to look really weird, because it'll depend a lot on framing and behavior.

Yes. So far as I can tell, if the FAI is going to do what people want, it has to model people as though they want something, and that means ascribing utility functions to them. Better alternatives are welcome. Giving up because it's a hard problem is not welcome.

If people dislike losses more than they like gains and status is zero-sum, does that mean the reasonable result of average utilitarianism when applied to status is that everyone must be exactly the same status?

No. If Alice has high status and Bob has low status, and the FAI takes action to lower Alice's status and raise Bob's, and people hate losing, then Alice's utility decrease will exceed Bob's utility increase, so the FAI will prefer to leave the status as it is. Similarly, the FAI isn't going to want to increase Alice's status at the expense of Bob. The FAI just won't get involved in the status battles.

I have not found this conversation rewarding. Unless there's an obvious improvement in the quality of your arguments, I'll drop out.

Edit: Fixed the math on the FAI-kills-Alice scenario. Vaniver continued to change the topic with every turn, so I won't be continuing the conversation.

Comment by TimFreeman on Moral enhancement · 2011-09-20T03:59:32.874Z · LW · GW

There seems to be an assumption here that empathy leads to morality. Sometimes, at least, empathy leads to being jerked around by the stupid goals of others instead of pursuing your own stupid goals, and in this case it's not all that likely to lead to something fitting any plausible definition of "moral behavior". Chogyam Trungpa called this "idiot compassion".

Thus it's important to distinguish caring about humanity as a whole from caring about individual humans. I read some of the links in the OP and did not see this distinction mentioned.

Comment by TimFreeman on Decision Fatigue, Rationality, and Akrasia. · 2011-09-19T21:22:45.190Z · LW · GW

I procrastinated when in academia, but did not feel particularly attracted to the job, so option 1 is not always true. Comparison with people not in academia makes it seem that option 3 is not true for me either.

Comment by TimFreeman on Questions for a Friendly AI FAQ · 2011-09-19T20:46:24.482Z · LW · GW

More questions to perhaps add:

What is self-modification? (In particular, does having one AI build another bigger and more wonderful AI while leaving "itself" intact count as self-modification? The naive answer is "no", but I gather the informed answer is "yes", so you'll want to clarify this before using the term.)

What is wrong with the simplest decision theory? (That is, enumerate the possible actions and pick the one for which the expected utility of the outcome is best. I'm not sure what the standard name for that is.) It's important to answer this so at some point you state the problem that timeless decision theory etc. are meant to solve.
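
(For concreteness, a minimal sketch of that decision rule, with made-up function names:)

```python
# The "simplest decision theory" described above: enumerate the possible actions
# and pick the one whose outcome has the best expected utility.
def choose(actions, outcome_distribution, utility):
    # outcome_distribution(a) -> list of (probability, outcome) pairs (a stand-in).
    def expected_utility(a):
        return sum(p * utility(outcome) for p, outcome in outcome_distribution(a))
    return max(actions, key=expected_utility)
```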

I gather one of the problems with the simplest decision theory is that it gives the AI an incentive to self-modify under certain circumstances, and there's a perceived need for the AI to avoid routine self-modification. The FAQ question might be "How can we avoid giving the AI an incentive to self-modify?" and perhaps "What are the risks of allowing the AI to self-modify?"

What problem is solved by extrapolation? (This goes in the CEV section.)

What are the advantages and disadvantages of having a bounded utility function?

Can we just upload a moral person? (In the "Need for FAI" section. IMO the answer is a clear "no".)

I suggest rephrasing "What powers might it have?" in 1.10 to "What could we reasonably expect it to be able to do?". The common phrase "magical powers" gives the word "powers" undesired connotations in this context and makes us sound like loonies.

Comment by TimFreeman on Secrets of the eliminati · 2011-08-23T20:28:59.929Z · LW · GW

A common tactic in human interaction is to care about everything more than the other person does, and explode (or become depressed) when they don't get their way. How should such real-life utility monsters be dealt with?

If everyone's inferred utility goes from 0 to 1, and the real-life utility monster cares more than the other people about one thing, the inferred utility will say he cares less than other people about something else. Let him play that game until the something else happens, then he loses, and that's a fine outcome.

I doubt it can measure utilities

I think it can, in principle, estimate utilities from behavior. See http://www.fungible.com/respect.

simple average utilitarianism is so wracked with problems I'm not even sure where to begin.

The problems I'm aware of have to do with creating new people. If you assume a fixed population and humans who have comparable utilities as described above, are there any problems left? Creating new people is a more interesting use case than status conflicts.

Why do you find status uninteresting?

As I said, because maximizing average utility seems to get a reasonable result in that case.

Comment by TimFreeman on Secrets of the eliminati · 2011-08-18T03:51:14.037Z · LW · GW

Its understanding of you doesn't have to be more rigorous than your understanding of you.

It does if I want it to give me results any better than I can provide for myself.

No. For example, if it develops some diet drug that lets you safely enjoy eating and still stay skinny and beautiful, that might be a better result than you could provide for yourself, and it doesn't need any special understanding of you to make that happen. It just makes the drug, makes sure you know the consequences of taking it, and offers it to you. If you choose to take it, that tells the AI more about your preferences, but there's no profound understanding of psychology required.

I also provided the trivial example of internal conflicts- external conflicts are much more problematic.

Putting an inferior argument first is good if you want to try to get the last word, but it's not a useful part of problem solving. You should try to find the clearest problem where solving that problem solves all the other ones.

How will a FAI deal with the status conflicts that develop?

If it can do a reasonable job of comparing utilities across people, then maximizing average utility seems to do the right thing here. Comparing utilities between arbitrary rational agents doesn't work, but comparing utilities between humans seems to -- there's an approximate universal maximum (getting everything you want) and an approximate universal minimum (you and all your friends and relatives getting tortured to death). Status conflicts are not one of the interesting use cases. Do you have anything better?
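
A minimal sketch of the comparison I have in mind, with the two anchoring points as described above (everything else here is illustrative):

```python
# Rescale each person's utility so the approximate universal minimum (you and
# everyone you care about tortured to death) maps to 0 and the approximate
# universal maximum (getting everything you want) maps to 1, then rank outcomes
# by the average of the rescaled utilities.
def normalized(u, worst, best):
    return (u - worst) / (best - worst)

def average_utility(outcome, people):
    # people: list of (utility_function, worst_value, best_value) triples.
    return sum(normalized(u(outcome), worst, best)
               for u, worst, best in people) / len(people)
```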

Comment by TimFreeman on Secrets of the eliminati · 2011-08-16T17:57:37.026Z · LW · GW

In some sense, the problem of FAI is the problem of rigorously understanding humans, and evo psych suggests that will be a massively difficult problem.

I think that bar is unreasonably high. If you have a conflict between enjoying eating a lot and being skinny and beautiful, and the FAI helps you do one or the other, then you aren't in a position to complain that it did the wrong thing. Its understanding of you doesn't have to be more rigorous than your understanding of you.

Comment by TimFreeman on Beware Trivial Inconveniences · 2011-08-13T20:56:10.725Z · LW · GW

For example, maybe you could chill the body rapidly to organ-donation temperatures, garrote the neck,..

It's worse than I said, by the way. If the patient is donating kidneys and is brain dead, the cryonics people want the suspension to happen as soon as possible to minimize further brain damage. The organ donation people want the organ donation to happen when the surgical team and recipient are ready, so there will be conflict over the schedule.

In any case, the fraction of organ donors is small, and the fraction of cryonics cases is much smaller, and the two groups do not have a history of working with each other. Thus even if the procedure is technically possible, I don't know of an individual who would be interested in developing the hybrid procedure. There's lots of other stuff that is more important to everyone involved.

Comment by TimFreeman on Secrets of the eliminati · 2011-08-12T20:53:02.109Z · LW · GW

I would think that knowing evo psych is enough to realize [having an FAI find out human preferences, and then do them] is a dodgy approach at best.

I don't see the connection, but I do care about the issue. Can you attempt to state an argument for that?

Human preferences are an imperfect abstraction. People talk about them all the time and reason usefully about them, so either an AI could do the same, or you found a counterexample to the Church-Turing thesis. "Human preferences" is a useful concept no matter where those preferences come from, so evo psych doesn't matter.

Similarly, my left hand is an imperfect abstraction. Blood flows in, blood flows out, flakes of skin fall off, it gets randomly contaminated from the environment, and the boundaries aren't exactly defined, but nevertheless it generally does make sense to think in terms of my left hand.

If you're going to argue that FAI defined in terms of inferring human preferences can't work, I hope that isn't also going to be an argument that an AI can't possibly use the concept of my left hand, since the latter conclusion would be absurd.

Comment by TimFreeman on Beware Trivial Inconveniences · 2011-08-11T21:07:53.828Z · LW · GW

The process of vitrifying the head makes the rest of the body unsuitable for organ donations. If the organs are extracted first, then the large resulting leaks in the circulatory system make perfusing the brain difficult. If the organs are extracted after the brain is properly perfused, they've been perfused too, and with the wrong substances for the purposes of organ donation.

Comment by TimFreeman on The Proper Use of Humility · 2011-08-11T13:04:49.200Z · LW · GW

If "humility" can be used to justify both activities and their opposites so easily, perhaps it's a useless concept and should be tabooed.

Comment by TimFreeman on [deleted post] 2011-08-09T12:49:15.304Z

PMing or emailing official SIAI people should get to link to safer avenues to discussing these kinds of basilisks.

Hmm, should I vote you up because what you're saying is true, or should I vote you down because you are attracting attention to the parent post, which is harmful to think about?

If an idea is guessable, then it seems irrational to think it is harmful to communicate it to somebody, since they could have guessed it themselves. Given that this is a website about rationality, IMO we should be able to talk about the chain of reasoning that leads to the decision that this guessable idea is harmful to communicate, since there's clearly a flaw in there somewhere.

Upvoted the parent because I think the harm here is imaginary. Absurdly large utilities do not describe non-absurdly-large brains, but they are not a surprising output from humans displaying fitness. (Hey, I know a large number! Look at me!)

These ideas have come up and were suppressed before, so this is not a specific criticism of the original post.

Comment by TimFreeman on Towards a New Decision Theory for Parallel Agents · 2011-08-09T12:13:32.584Z · LW · GW

Make sure that each CSA above the lowest level actually has "could", "should", and "would" labels on the nodes in its problem space, and make sure that those labels, their values, and the problem space itself can be reduced to the managing of the CSAs on the level below.

That statement would be much more useful if you gave a specific example. I don't see how labels on the nodes are supposed to influence the final result.

There's a general principle here that I wish I could state well. It's something like "general ideas are easy, specific workable proposals are hard, and you're probably wasting people's time if you're only describing a solution to the easy parts of the problem".

One cause of this is that anyone who can solve the hard part of the problem can probably already guess the easy part, so they don't benefit much from you saying it. Another cause is that the solutions to the hard parts of the problem tend to have awkward aspects to them that are best dealt with by modifying the easy part, so a solution to just the easy part is sure to be unworkable in ways that can't be seen if that's all you have.

I have this issue with your original post, and most of the FAI work that's out there.

Comment by TimFreeman on Towards a New Decision Theory for Parallel Agents · 2011-08-08T04:48:04.694Z · LW · GW

Well, one story is that humans and brains are irrational, and then you don't need a utility function or any other specific description of how it works. Just figure out what's really there and model it.

The other story is that we're hoping to make a Friendly AI that might make rational decisions to help people get what they want in some sense. The only way I can see to do that is to model people as though they actually want something, which seems to imply having a utility function that says what they want more and what they want less. Yes, it's not true, people aren't that rational, but if a FAI or anyone else is going to help you get what you want, it has to model you as wanting something (and as making mistakes when you don't behave as though you want something).

So it comes down to this question: If I model you as using some parallel decision theory, and I want to help you get what you want, how do I extract "what you want" from the model without first somehow converting that model to one that has a utility function?

Comment by TimFreeman on Dark Arts: Schopenhauer wrote The Book on How To Troll · 2011-08-04T04:59:29.162Z · LW · GW

Okay, I watched End of Evangelion and a variety of the materials leading up to it. I want my time back. I don't recommend it.

Comment by TimFreeman on Best career models for doing research? · 2011-07-13T17:17:45.430Z · LW · GW

So many people might be willing to go be a health worker in a poor country where aid workers are commonly (1 in 10,000) raped or killed, even though they would not be willing to be certainly attacked in exchange for 10,000 times the benefits to others.

I agree with your main point, but the thought experiment seems to be based on the false assumption that the risk of being raped or murdered is smaller than 1 in 10,000 if you stay at home. Wikipedia guesstimates that 1 in 6 women in the US are on the receiving end of attempted rape at some point, so someone who goes to a place with a 1 in 10,000 chance of being raped or murdered has probably improved their personal safety. To make a better thought experiment, I suppose you have to talk about the marginal increase in the rape or murder rate when working in the poor country compared to staying home, and perhaps you should stick to murder since the rape rate is so high.

Comment by TimFreeman on Dark Arts: Schopenhauer wrote The Book on How To Troll · 2011-07-13T16:53:45.084Z · LW · GW

The story isn't working for me. A boy, or novice soldier depending on how you define it, is inexplicably given the job of piloting a huge and difficult-to-use robot to fight a sequence of powerful, similarly huge aliens while trying not to do too much collateral damage to Tokyo in the process. In the original, I gather he was an unhappy boy. In this story, he's a relatively well-adjusted boy who hallucinates conversations with his Warhammer figurines. I don't see why I should care about this scenario or any similar scenarios, but maybe I'm missing something.

Can someone who read this or watched the original say something interesting that happens in it? Wikipedia mentions profound philosophical questions about the nature of reality, but it also mentions that the ending is widely regarded as incomprehensible. The quote about how every possible statement sounds profound if you get the rhetoric right seems to apply here. I don't want to invest multiple hours to end up reading (or watching) some pseudo-profound nonsense.

Comment by TimFreeman on [SEQ RERUN] Your Strength as a Rationalist · 2011-07-10T18:16:47.654Z · LW · GW

Your strength as a rationalist is your ability to be more confused by fiction than by reality.

Does that lead to the conclusion that Newcomb's problem is irrelevant? Mind-reading aliens are pretty clearly fiction. Anyone who says otherwise is much more likely to be schizophrenic than to have actual information about mind-reading aliens.

Comment by TimFreeman on Dark Arts: Schopenhauer wrote The Book on How To Troll · 2011-07-10T18:13:00.609Z · LW · GW

When dealing with trolls, whether on the Internet or in Real Life, no matter how absolutely damn sure you are of your point, you have no time to unravel their bullshit for what it is, and if you try it you will only bore your audience and exhaust their patience. Debates aren't battles of truth: there's publishing papers and articles for that. Debates are battles of status.

I agree. There's also the scenario where you're talking to a reasonable person for the purpose of figuring out the truth better than either of you could do alone. That's useful, and it's important to be able to distinguish that from debating with trolls for the purpose of gaining status. Trolls can be recognized by how often they use rhetoric that obviously isn't truth-seeking, and Schopenhauer is very good for that.

Well, actually, on the Internet you never gain status by debating with trolls. Even if I win an argument, I lose status to the extent my behavior justifies the conclusion "Tim wastes time posting to (LessWrong|SlashDot|whatever) instead of doing anything useful."

My ability to identify and stonewall trolls varies. Sometimes I catch them saying something silly and refuse to continue unless they correct themselves, and that stops the time-waste pretty quickly. Sometimes I do three-strikes-and-you're-out, and the time-waste stops reasonably soon. Sometimes it takes me a long time to figure out if they're a troll, especially if they're hinting that they know something worthwhile. I wish I had a more stable rule of thumb for doing this right. Any suggestions?

Comment by TimFreeman on The Threat of Cryonics · 2011-07-09T00:41:21.814Z · LW · GW

Terror Management seems to explain the reactions to cryonics pretty well. I've only skimmed the OP enough to want to trot out the standard explanation, so I may have missed something, but so far as I can tell the Historical Death Meme and Terror Management make the same predictions.

It is in fact absolutely unacceptable, from a simple humanitarian perspective, that something as nebulous as the HDM -- however artistic, cultural, and deeply ingrained it may be -- should ever be substituted for an actual human life.

Accepting something is the first step to changing it, so you'll have to do better than that.

Comment by TimFreeman on Dark Arts: Schopenhauer wrote The Book on How To Troll · 2011-07-09T00:35:26.763Z · LW · GW

Please tell me you've at least read Methods Of Rationality and Shinji and Warhammer40k.

I read the presently existing part of MoR. I could read Shinji 40K. Why do you think it's worthwhile? Should I read or watch Neon Genesis Evangelion first?

Comment by TimFreeman on Dark Arts: Schopenhauer wrote The Book on How To Troll · 2011-07-08T23:13:35.947Z · LW · GW

I have a fear that becoming skilled at bullshitting others will increase my ability to bullshit myself. This is based on my informal observation that the people who bullshit me tend to be a bit confused even when manipulating me isn't their immediate goal.

However, I do find it very useful to be able to authoritatively call out someone who is using a well-known rhetorical technique, and for that reason reading "Art of Controversy" has paid off. The obviously useful skill is being able to recognize each rhetorical technique and find a suitable retort in real time; the default retort is to name the rhetorical technique.

Comment by TimFreeman on Behaviorism: Beware Anthropomorphizing Humans · 2011-07-06T22:03:02.461Z · LW · GW

minds are behavior-executors and not utility-maximizers

I think it would be more accurate to say that minds are more accurately and simply modeled as behavior-executors than as utility-maximizers.

There are situations where the most accurate and simple model isn't the one you want to use. For example, if I want to cooperate with somebody, one approach is to model them as a utility-maximizer, and then to search for actions that improve everybody's utility. If I model them as a behavior-executor, then I'll be perceived as manipulative if I don't get it exactly right.

Specifically, people don't like to hear an explanation of my behavior of the form "Yes, I reasonably guessed that you would have preferred to get to the grocery store, and I knew that the only grocery store was to the south, but you were driving north so I helped you drive north." Thus, if I model people as behavior-executors, I have a much more complicated game to play because I have to anticipate what they'll do when they discover that I helped them to make a mistake.

Comment by TimFreeman on Topics to discuss CEV · 2011-07-06T21:07:33.761Z · LW · GW

An alternative to CEV is CV, that is, leave out the extrapolation.

You have a bunch of non-extrapolated people now, and I don't see why we should think their extrapolated desires are morally superior to their present desires. Giving them their extrapolated desires instead of their current desires puts you into conflict with the non-extrapolated version of them, and I'm not sure what worthwhile thing you're going to get in exchange for that.

Nobody has lived 1000 years yet; maybe extrapolating human desires out to 1000 years gives something that a normal human would say is a symptom of having mental bugs when the brain is used outside the domain for which it was tested, rather than something you'd want an AI to enact. The AI isn't going to know what's a bug and what's a feature.

There's also a cause-effect cycle with it. My future desires depend on my future experiences, which depend on my interaction with the CEV AI if one is deployed, so the CEV AI's behavior depends on its estimate of my future desires, which I suppose depends on its estimate of my future experiences, which in turn depends on its estimate of its future behavior. The straightforward way of estimating that has a cycle, and I don't see why the cycle would converge.

The example in the CEV paper about Fred wanting to murder Steve is better dealt with by acknowledging that Steve wants to live now, IMO, rather than hoping that an extrapolated version of Fred wouldn't want to commit murder.

ETA: Alternatives include my Respectful AI paper, and Bill Hibbard's approach. IMO your list of alternatives should include alternatives you disagree with, along with statements about why. Maybe some of the bad solutions have good ideas that are reusable, and maybe pointers to known-bad ideas will save people from writing up another instance of an idea already known to be bad.

IMO, if SIAI really wants the problem to be solved, SIAI should publish a taxonomy of known-bad FAI solutions, along with what's wrong with them. I am not aware that they have done that. Can anyone point me to such a document?

Comment by TimFreeman on A summary of Savage's foundations for probability and utility. · 2011-07-05T04:31:13.072Z · LW · GW

Peter Wakker apparently thinks he found a way to have unbounded utilities and obey most of Savage's axioms. See Unbounded utility for Savage's "Foundations of Statistics," and other models. I'll say more if and when I understand that paper.

Comment by TimFreeman on Science Doesn't Trust Your Rationality · 2011-06-20T18:09:03.919Z · LW · GW

We can't use Solomonoff induction - because it is uncomputable.

Generating hypotheses is uncomputable. However, once you have a candidate hypothesis, if it explains the observations you can do a computation to verify that, and you can always measure its complexity. So you'll never know that you have the best hypothesis, but you can compare hypotheses for quality.
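
A sketch of what comparing hypotheses for quality could look like, assuming hypotheses are represented as program strings and `predicts` is a stand-in for running a candidate program against the observations:

```python
# Prefer a hypothesis that reproduces the observations; among those, prefer the
# shorter (lower-complexity) program.
def better_hypothesis(hyp_a, hyp_b, observations, predicts):
    # predicts(hyp, observations) -> True if the hypothesis reproduces them.
    def score(hyp):
        return (predicts(hyp, observations), -len(hyp))
    return hyp_a if score(hyp_a) >= score(hyp_b) else hyp_b
```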

I'd really like to know if there's anything to be known about the nature of the suboptimal predictions you'll make if you use suboptimal hypotheses, since we're pretty much certain to be using suboptimal hypotheses.

Comment by TimFreeman on Model Uncertainty, Pascalian Reasoning and Utilitarianism · 2011-06-17T22:33:28.553Z · LW · GW

I agree with jsteinhardt, thanks for the reference.

I agree that the reward functions will vary in complexity. If you do the usual thing in Solomonoff induction, where the plausibility of a reward function decreases exponentially with its size, then so far as I can tell you can infer reward functions from behavior, if you can infer behavior.
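
As a sketch of that inference (the finite candidate set, the description lengths, and the behavior-likelihood functions are all assumptions of the illustration):

```python
# Posterior over candidate reward functions: prior weight 2^-(description length),
# multiplied by how well each reward function explains the observed behavior.
def reward_posterior(candidates, observed_behavior):
    # candidates: list of (description_length_in_bits, likelihood_fn) pairs, where
    # likelihood_fn(behavior) -> P(behavior | this reward function).
    weights = [2.0 ** -bits * likelihood(observed_behavior)
               for bits, likelihood in candidates]
    total = sum(weights) or 1.0
    return [w / total for w in weights]
```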

We need to infer a utility function for somebody if we're going to help them get what they want, since a utility function is the only reasonable description I know of what an agent wants.

Comment by TimFreeman on Model Uncertainty, Pascalian Reasoning and Utilitarianism · 2011-06-17T22:23:09.747Z · LW · GW

Surely we can talk about rational agents in other ways that are not so confusing?

Who is sure? If you're saying that, I hope you are. What do you propose?

Either way, just because something is mathematically proven to exist doesn't mean that we should have to use it.

I don't think anybody advocated what you're arguing against there.

The nearest thing I'm willing to argue for is that one of the following possibilities hold:

  • We use something that has been mathematically proven to exist, now.

  • We might be speaking nonsense, depending on whether the concepts we're using can be mathematically proven to make sense in the future.