Stupid Questions Open Thread Round 4

post by lukeprog · 2012-08-27T00:04:37.740Z · LW · GW · Legacy · 181 comments

Previously: round 1, round 2, round 3

From the original thread:

This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high threshold for understanding science and technology. Relevant questions in those areas are welcome as well.  Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent.  If a questioner is not sure whether a question is relevant, ask it, and also ask if it's relevant.

Ask away!

comment by JoshuaFox · 2012-08-27T06:16:24.696Z · LW(p) · GW(p)

Can anyone attest to getting real instrumental benefit from SI/LW rationality training, whether from SI bootcamps or just from reading LessWrong?

I don't just mean "feeling better about myself," but identifiable and definite improvements, like getting a good job in one week after two years without success.

Replies from: amitpamin, cousin_it, John_Maxwell_IV, Viliam_Bur, palladias, lukeprog, orthonormal, Dias, drethelin
comment by amitpamin · 2012-08-28T00:09:37.162Z · LW(p) · GW(p)

At the moment, LW has provided negative benefit to my life. I recently quit my job to start learning positive psychology. My initial goal was to blog about positive psychology, and eventually use my blog as a platform to sell a book.

LW has made me deeply uncertain of the accuracy of the research I read, the words I write on my blog, and the advice I am writing in the book I intend to sell. Long-term, the uncertainty will probably help me by making me more knowledgeable than my peers, but in the short term it demotivates me (e.g., if I were sure that what I was learning was correct, I would enthusiastically proselytize, which is a much more effective blogging strategy).

Still, I read on, because I've passed the point of ignorance.

Replies from: coffeespoons, lukeprog, DaFranker
comment by coffeespoons · 2012-08-29T12:32:21.468Z · LW(p) · GW(p)

I also think that LW has provided negative benefit to my life. Since I decided that I wanted my beliefs to be true, rather than pleasing to me, I've felt less connected to my friendship group. I used to have certain political views that a lot of my friends approved of. Now, I think I was wrong about many things (not totally wrong, but I'm far less confident of the views that I continue to hold). Overall, I'd rather believe true things, but I think so far it's made me less happy.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-08-29T21:09:30.217Z · LW(p) · GW(p)

Why would you rather believe true things?

Replies from: coffeespoons
comment by coffeespoons · 2012-08-30T08:41:45.104Z · LW(p) · GW(p)

1. I would just rather know the right answer!

2. I think believing true things has better consequences than the reverse, for many people. I'm not sure if it will for me.

3. It's too late. I can't decide to go back to believing things that aren't true to make me feel better, because I'd know that's what I was doing.

Would you not prefer to believe true things?

Replies from: TheOtherDave
comment by TheOtherDave · 2012-08-30T14:24:22.561Z · LW(p) · GW(p)

No, I would not not-prefer to believe true things.

That said, I also don't experience believing true things as making me unhappy the way you describe.

It's the combination of those statements that intrigues me: X makes you unhappy and you would rather do X. So I was curious as to why you would rather do it.

I have to admit, though, your answers leave me even more puzzled.

Replies from: coffeespoons
comment by coffeespoons · 2012-08-30T15:27:39.022Z · LW(p) · GW(p)

Here are a couple of other reasons:

4. So, I suppose in some ways, feeling that my beliefs are more accurate has given me some sort of satisfaction. I don't know if it outweighs feeling disconnected socially, though.

5. Altruism. I used to put a lot of energy into UK politics. I gained moral satisfaction and approval from my friends for this, but I've come to think that it's really not a very effective way of improving the world. I would rather learn about more effective ways of making the world better (e.g., donating to efficient charity).

Does that make sense? If you did feel that believing true things made you unhappy, would you try to make yourself believe not-true but satisfying things?

Replies from: TheOtherDave
comment by TheOtherDave · 2012-08-30T15:50:50.595Z · LW(p) · GW(p)

Altruism makes some sense to me as an answer... if you're choosing to sacrifice your own happiness in order to be more effective at improving the world, and believing true things makes you more effective at improving the world, then that's coherent.

Unrelatedly, if the problem is social alienation, one approach is to find a community in which the things you want to do (including believe true things) are socially acceptable.

If you did feel that believing true things made you unhappy, would you try to make yourself believe not-true but satisfying things?

There are areas in which I focus my attention on useful and probably false beliefs, like "I can make a significant difference in the world if I choose to take action." It's not clear to me that I believe those things, though. It's also not clear to me that it matters whether I believe them or not, if they are motivating my behavior just the same.

comment by lukeprog · 2012-09-02T18:51:39.558Z · LW(p) · GW(p)

That's how I felt for the first few months after discovering that Jesus wasn't magic after all. At that moment, all I could see was that (1) my life up to that point had largely been wasted on meaningless things, (2) my current life plans were pointless, (3) my closest relationships were now strained, and (4) much of my "expertise" was useless.

Things got better after a while.

comment by DaFranker · 2012-08-28T18:36:22.363Z · LW(p) · GW(p)

I'm tempted to conclude that your current accumulated utility given LW is lower than it would be given the counterfactual no-LW, but that in compensation your future expected utility has risen considerably, by unknown margins, with relatively high confidence.

Is this an incorrect interpretation of the subtext? Am I reading too much into it?

Replies from: amitpamin
comment by amitpamin · 2012-08-29T02:13:18.071Z · LW(p) · GW(p)

That interpretation is correct.

I've noticed that I don't even need to be knowledgeable to gain utility - there is a strong correlation between the signaling of my 'knowledgeableness' and post popularity - the most popular post had the largest number of references (38), and so on. When writing the post, I just hide the fact that I researched so much because of my uncertainty :)

comment by cousin_it · 2012-08-27T10:32:41.634Z · LW(p) · GW(p)

Absence of evidence is evidence of absence :-) Most of us don't seem to get such benefits from reading LW, so learning about an individual case of benefit shouldn't influence your decisions much. It will probably be for spurious reasons anyway. Not sure about the camps, but my hopes aren't high.

comment by John_Maxwell (John_Maxwell_IV) · 2012-08-28T18:47:46.935Z · LW(p) · GW(p)

Can anyone attest to getting real instrumental rationality benefits from reading Wikipedia? (As a control question; everyone seems to think that Wikipedia is obviously useful and beneficial, so is anyone getting "real instrumental rationality benefits" from it?)

I suspect that the "success equation", as it were, is something like expected_success = drive × intelligence × rationality, and for most people the limiting factor is drive, or maybe intelligence. Also, I suspect that changes in your "success equation" parameters can take years to manifest as substantial levels of success, where people regard you as "successful" and not just "promising". And I don't think anyone is going to respond to a question like this with "reading Less Wrong made me more promising" because that would be dopey, so there's an absence of data. (And promising folks may also surf the internet, and LW, less.)

It's worth differentiating between these two questions, IMO: "does reading LW foster mental habits that make you better at figuring out what's true?" and "does being better at figuring out what's true make you significantly more successful?" I tend to assign more credence to the first than the second.

Replies from: JoshuaFox
comment by JoshuaFox · 2012-08-28T19:50:37.533Z · LW(p) · GW(p)

John, Wikipedia is generally valued for epistemic benefit, i.e., it teaches you facts. Only rarely does it give you practically useful facts, like the fact that lottery tickets are a bad buy. I agree that LW-rationality gives epistemic benefits.

And as for "years to manifest": Diets can make you thinner in months. Likewise, PUA lessons get you laid, weightlifting makes you a bit stronger, bicycle repair workshops get you fixing your bike, and Tim Ferris makes you much better at everything, in months -- if each of these is all it's cracked up to be.

Some changes do take years, but note also that LW-style rationality has been around for years, so at least some people should be reporting major instrumental improvements.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-08-29T00:03:12.775Z · LW(p) · GW(p)

And as for "years to manifest": Diets can make you thinner in months. Likewise, PUA lessons get you laid, weightlifting makes you a bit stronger, bicycle repair workshops get you fixing your bike, and Tim Ferris makes you much better at everything, in months -- if each of these is all it's cracked up to be.

One point is that if a specific diet helps you, it's easy to give credit to that diet. But if LW changes your thinking style, and you make a decision differently years later, it's hard to know what decision you would have made if you hadn't found LW.

Another point is that rationality should be most useful for domains where there are long feedback cycles--where there are shorter feedback cycles, you can just futz around and get feedback, and people who study rationality won't have as much of an advantage.

Some changes do take years, but note also that LW-style rationality has been around for years, so at least some people should be reporting major instrumental improvements.

I think I've gotten substantial instrumental benefits from reading LW. It makes me kind of uncomfortable to share personal details, but I guess I'll share one example: When I was younger, I was very driven and ambitious. I wanted to spend my time teaching myself programming, etc., but in actuality I would spend my time reading reddit and feeling extremely guilty that I wasn't teaching myself programming. At a certain point I started to realize that my feeling of guilt was counterproductive, and if I actually wanted to accomplish my goals then I should figure out what emotions would be useful for accomplishing my goals and try to feel those. I think it's likely that if I hadn't read LW I wouldn't have had this realization, or would've had this realization but not taken it seriously. And this realization, along with others in the same vein, seems to have been useful for helping me get more stuff done.

comment by Viliam_Bur · 2012-08-29T12:24:24.703Z · LW(p) · GW(p)

I was at the July rationality minicamp, and in addition to many "epiphanies", one idea that seems to work for me is this, very simplified -- forget the mysterious "willpower" and use self-reductionism: instead of speaking in far mode about what you should and want to do, observe in near mode the little (irrational) causes that really make you do things. Then design your environment to contain more of those causes which make you do things you want to do. And then, if the theory is correct, you find yourself doing more of what you want to do, without having to suffer the internal conflict traditionally called "willpower".

Today it's almost one month since the minicamp, and here are the results so far. I list the areas where I wanted to improve myself, and assign a score from 0 to 5, where 5 means "works like a miracle; awesome" and 0 means "no change at all". (I started to work on all these goals in parallel, which may be a good or bad idea. The bad part is that there is probably no chance of succeeding in all of them at once. The good part is that if there is success in any part, then there is a success.)

  • (5) avoiding sugar and soda
  • (4) sleeping regularly, avoiding sleep deprivation
  • (2) spending less time procrastinating online
  • (2) exercising regularly
  • (2) going to sleep early, waking up early
  • (1) following my long-term plans
  • (1) spending more time with friends
  • (1) being organized, planning, self-reflecting
  • (0) writing on blog, improving web page
  • (0) learning a new language
  • (0) being more successful at work
  • (0) improving social skills and expanding comfort zone
  • (0) spending more time outside

So far it seems like a benefit, although of course I would be happier with greater/faster improvement. The mere fact that I'm measuring (not very exactly) my progress is surprising enough. I'm curious about the long-term trends: will those changes gradually increase (as parts of my life get fixed and turn into habit) or decrease (as happened with my previous attempts at self-improvement)? Expect a more detailed report at the end of December 2012.

How exactly did I achieve this? (Note: This is strongly tailored to my personality; it may not work for other people.) Gamification -- I have designed a set of rules that allow me to get "points" during the day. There are points e.g. for: having enough sleep, having an afternoon nap, meeting a friend, exercising (a specific amount), publishing a blog article, spending a day without consuming sugar, spending a day without browsing the web, etc. Most of these rules allow only one point of a given type per day, to avoid failure modes like "I don't feel like getting this point now, but I can get two of these points tomorrow". So each day I collect my earned points, literally small squares of colored paper (this makes them feel more real), and glue them on a paper form, which is always on my desk, and provides me quick visual feedback on how "good" the previous days were. It's like a computer game (exact rules, quick visual feedback), which is exactly why I like it.
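
Here is a minimal sketch of the bookkeeping behind such a system, with hypothetical rule names standing in for the actual rules; the one-point-of-a-given-type-per-day cap is the important constraint:

```python
from collections import defaultdict
from datetime import date

# Hypothetical rule names; the real list is whatever rules you commit to.
RULES = {"enough_sleep", "afternoon_nap", "met_friend", "exercised",
         "blog_article", "no_sugar_day", "no_web_day"}

earned = defaultdict(set)  # day -> set of rules already rewarded that day

def award(rule, day=None):
    """Record a point for `rule`, at most one point of a given type per day."""
    day = day or date.today()
    if rule not in RULES:
        raise ValueError("unknown rule: %s" % rule)
    if rule in earned[day]:
        return False  # cap hit: no saving up "two of these points tomorrow"
    earned[day].add(rule)
    return True

def daily_score(day=None):
    return len(earned[day or date.today()])
```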

This specific form of gamification was not literally taught at Minicamp, and I had been considering something similar for years before. Yet I never did it, mostly because I was stopped by my monkey-tribe-belonging instincts. Doing something that other people don't do is weird. I tried to convince some friends to join me in doing this, but all my attempts failed; now I guess it's because admitting that you need some kind of help is low status, while speaking about willpower in far mode is high status. Being with people at Minicamp messed with my tribe instincts; meeting a community of people with a social norm of doing "weird" things reduced my elephant's opposition to doing a weird thing. Sigh, I'm just a monkey, and I'm scared of doing things that other monkeys never do, even if it means being rational or winning.

Replies from: koning_robot
comment by koning_robot · 2012-08-31T18:21:09.777Z · LW(p) · GW(p)

Hm, I've been trying to get rid of one particular habit (drinking while sitting at my computer) for a long time. Recently I've considered the possibility of giving myself a reward every time I go to the kitchen to get a beer and come back with something else instead. The problem was that I couldn't think of a suitable reward (there's not much that I like). I hadn't thought of just making something up, like pieces of paper. Thanks for the inspiration!

comment by palladias · 2012-08-27T21:05:03.628Z · LW(p) · GW(p)

I was a July minicamp attendee. I did the big reading through the Sequences thing when lukeprog was doing it at Common Sense Atheism, so I'd say fewer of the benefits were rationality level-ups and more were life hacking. Post-minicamp I am:

  • doing sit-ups, push-ups, and squats every day (using the apps from the 200 situps guy), up from not doing this at all
  • martial arts training four times a week (aikido and krav), again up from not doing this at all
  • using RTM to manage tasks, which means:
    • dropping way fewer small tasks
    • breaking tasks down into steps more efficiently
    • knocked off about three lagging tasks (not timebound, so I was making no progress on them) in the month since I got back
  • stopped using my inbox as a task manager, so I could keep only the emails I was replying to in there
  • using Beeminder to get down to inbox zero (currently three)
  • working in pomodoros has sped up my writing to the point where:
    • I miss doing a daily post to my blog more rarely (one miss over two weeks, compared to 0-2 a week) and have had more double-post days than previously (which translates into higher page views and more money for me)
    • less time spent writing left me more time for leisure reading
Replies from: palladias, JoshuaFox, gwern
comment by palladias · 2012-08-27T21:43:56.472Z · LW(p) · GW(p)

I should add that I had a bit of a crestfallen feeling for the first few days of minicamp, since being more efficient and organized feels like a really lame superpower. I expected a bit more of it to be about choosing awesome goals. But then I realized that I'd always be grateful for a genie that magically gave me an extra hour, and I shouldn't look a gift genie in the mouth, just because it wasn't magic.

So, now that I've got more time, it's up to me to do superheroic things with it. Once I finish my Halloween costume.

Replies from: orthonormal
comment by orthonormal · 2012-09-04T02:01:06.793Z · LW(p) · GW(p)

I should add that I had a bit of a crestfallen feeling for the first few days of minicamp, since being more efficient and organized feels like a really lame superpower.

This. Holy cow, I worried I was the only one who felt a bit of a letdown during minicamp and then started noticing afterwards that my ways of dealing with problems had suddenly become more effective.

comment by JoshuaFox · 2012-08-28T06:33:03.482Z · LW(p) · GW(p)

OK, those count as benefits. We shouldn't just give all the credit to the lifehacking community, since LW/SI successfully got you to implement lifehacking techniques.

Of course, anything can be called instrumentally rational if it works, but I wonder how other approaches compare to explicit rationality in successfully convincing oneself to lifehack. For example, the sort of motivational techniques used for salespeople.

Replies from: palladias
comment by palladias · 2012-08-28T06:41:53.790Z · LW(p) · GW(p)

I'm not sure. One thing that worked pretty well for me at minicamp was that the instructors were pretty meticulous about describing levels of confidence in different hacks. Everything from "Here are some well-regarded, peer reviewed studies you can look at" to "It's worked pretty well for us, and most of the people who've tried, and here's how we think it fits into what we know about the brain" to "we don't know why this works, but it has for most people, so we think it's worth trying out, so make sure you tell us if you try and get bupkis so we're hearing about negative data" to "this is something that worked for me that you might find useful."

I think this is a pretty audience-specific selling point, but it did a great job of mitigating the suspicious-seeming levels of enthusiasm most lifehackers open with.

comment by gwern · 2012-08-27T21:11:22.119Z · LW(p) · GW(p)

How are you both posting more to your blog, and spending less time writing?

Replies from: palladias
comment by palladias · 2012-08-27T21:39:52.586Z · LW(p) · GW(p)

I'm writing faster when I work in pomodoros and when I write on the train on the long schlep to aikido.

Replies from: palladias
comment by palladias · 2012-08-28T14:18:58.440Z · LW(p) · GW(p)

Where I just broke my toe. Oh no, negative utility alert!

comment by lukeprog · 2012-08-27T17:40:08.389Z · LW(p) · GW(p)

This topic has been raised dozens of times before, but the stories are scattered. Here's a sampling:

But also see this comment from Carl Shulman.

Replies from: cousin_it, JoshuaFox, JoshuaFox
comment by cousin_it · 2012-08-27T21:23:20.711Z · LW(p) · GW(p)

That comment of mine was from 2010 and I disagree with it now. My current opinion is better expressed in the "Epiphany addiction" post and comments.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-08-28T03:08:55.444Z · LW(p) · GW(p)

Are you saying you now don't think LW is "useful for noticing bullshit and cutting it away from my thoughts", or that the value of doing this isn't as high as you thought?

Replies from: cousin_it
comment by cousin_it · 2012-08-28T09:48:40.369Z · LW(p) · GW(p)

Looking back today, the improvement seems smaller than I thought then, and LW seems to have played a smaller role in it.

Replies from: Wei_Dai, DaFranker
comment by Wei Dai (Wei_Dai) · 2012-08-28T18:53:01.128Z · LW(p) · GW(p)

I used to be very skeptical of Eliezer's ideas about improving rationality when he was posting the Sequences, but one result that's hard to deny is that all of a sudden there is a community of people who I can discuss my decision theory ideas with, whereas before that I seemingly couldn't get them across to anyone except maybe one or two people, even though I had my own highly active mailing list.

I'd say that being able to achieve this kind of subtle collective improvement in philosophical ability is already quite impressive, even if the effect is not very dramatic in any given individual. (Of course ultimately the improvement has to be graded against what's needed to solve FAI and not against my expectations, and it seems to still fall far short of that.)

Replies from: cousin_it
comment by cousin_it · 2012-08-28T20:04:39.455Z · LW(p) · GW(p)

It's indeed nice to have a community that discusses decision-theoretic ideas, but a simpler explanation is that Eliezer's writings attracted many smart folks and also happened to make these ideas salient, not that Eliezer's writings improved people's philosophical ability.

Replies from: Wei_Dai, Vladimir_Nesov
comment by Wei Dai (Wei_Dai) · 2012-08-28T21:15:54.900Z · LW(p) · GW(p)

Attracting many smart folks and making some particular ideas salient to them is no mean feat in itself. But do you think that's really all it took? That any group of smart people, if they get together and become interested in some philosophical topic, could likely make progress instead of getting trapped in a number of possible ways?

Replies from: palladias
comment by palladias · 2012-08-28T21:54:21.742Z · LW(p) · GW(p)

I think it's always helpful when a community has a vernacular and a common library of references. It's better if the references are unusually accurate, but even bland ones might still speed up progress on projects.

comment by Vladimir_Nesov · 2012-08-28T20:28:24.910Z · LW(p) · GW(p)

an easier explanation is that Eliezer's writings attracted many smart folks and also happened to make these ideas salient, not that Eliezer's writings improved people's philosophical ability

Eliezer's writings were certainly the focus of my own philosophical development. The current me didn't exist before processing them, and was historically caused by them, even though it might have formed on its own a few years later.

comment by DaFranker · 2012-08-28T18:25:14.727Z · LW(p) · GW(p)

Hmm. Thanks for that update.

Earlier today I had been thinking that since I started reading LessWrong I had noticed a considerable increase in my ability to spot and discern bullshit and flawed arguments, without paying much attention to whether I was really asking myself the right questions, in order to favor other things I considered more important to think about.

Reading this made me realize that I've drawn a conclusion too early. Perhaps I should re-read those "epiphany addiction" posts with this in mind.

comment by JoshuaFox · 2012-08-27T19:49:35.043Z · LW(p) · GW(p)

Thanks. In most of those links, the author says that he gained some useful mental tools, and maybe that he feels better. That's good. But no one said that rationality helped them achieve any goal other than the goal of being rational.

For example:

  • Launch a successful startup
  • Get a prestigious job
  • Break out of a long-term abusive relationship.
  • Lose weight (Diets are discussed, but I don't see that a discussion driven by LW/SI-rationality is any more successful in this area than any random discussion of diets.)
  • Get lucky in love (and from what I can tell, the PUAs do have testimonials for their techniques)
  • Avoid akrasia (The techniques discussed are gathered from elsewhere; so to the extent that rationality means "reading up on the material," the few successes attested in this area can count as confirmation.)
  • Break an addiction to drugs/gambling.

... and so on.

Religious deconversion doesn't count for the purpose of my query unless the testimonial describes some instrumental benefit.

Carl's comment about the need for an experiment is good; but if someone can just give a testimonial, that would be a good start!

Replies from: lukeprog, NancyLebovitz, ShardPhoenix
comment by lukeprog · 2012-08-28T00:12:12.511Z · LW(p) · GW(p)

There's also Zvi losing weight with TDT. :)

comment by NancyLebovitz · 2012-08-28T05:32:04.546Z · LW(p) · GW(p)

Losing weight is a core human value?

Replies from: JoshuaFox
comment by JoshuaFox · 2012-08-28T06:27:29.322Z · LW(p) · GW(p)

Thanks, I edited it.

comment by ShardPhoenix · 2012-08-28T13:14:27.631Z · LW(p) · GW(p)

I think LW-style thinking may have helped me persist better at going to the gym (which has been quite beneficial for me) than I otherwise would have, but obviously it's hard to know for sure.

comment by JoshuaFox · 2012-08-28T17:49:22.195Z · LW(p) · GW(p)

Or even better:

  • "I used to buy lottery tickets every day but now I understand the negative expectation of the gamble and the diminishing marginal utility of the ticket, so I don't."
  • A doctor says "I now realize that I was giving my patients terrible advice about what it meant when a test showed positive for a disease. Now that I have been inducted into the Secret Order of Bayes, my advice on that is much better."

.... etc.

comment by orthonormal · 2012-09-04T02:11:13.641Z · LW(p) · GW(p)

July minicamper here. My own life has had enough variance in the past few months across many variables (location, job, romantic status), with too many exogenous factors, for me to be very confident about the effect of minicamp, aside from a few things (far fewer totally wasted days than I used to suffer as a result of what I saw as inescapable moodiness).

But I've gained an identifiable superpower in the realm of talking helpfully to other people by modeling their internal conflicts more accurately, by steering them toward "making deals with themselves" rather than ridiculous memes like "using willpower", and by noticing confusion and getting to the root of it via brainstorming and thought experiments. And the results have absolutely floored people, in three different cases. If you're worried about epiphany addiction, then I suppose you might label me a "carrier" (although there's the anomalous fact that friends have followed through on my advice to them after talking to me).

Replies from: JoshuaFox
comment by JoshuaFox · 2012-09-04T06:49:39.442Z · LW(p) · GW(p)

Great, I'd love to have that superpower!

comment by Dias · 2012-09-08T23:40:38.542Z · LW(p) · GW(p)

I think the probability of my having got my current job without LW etc. is under 20%.

comment by drethelin · 2012-08-28T20:21:28.606Z · LW(p) · GW(p)

Subjectively I feel happier and more effective, but there's not reliable external evidence for this. I've gotten better at talking to people and interacting in positive ways thanks to using metacognition, and my views have become more consistent. Timeless thinking has helped me adopt a diet and stick to it, as well as made me start wearing my seatbelt.

comment by AlexMennen · 2012-08-27T01:08:08.671Z · LW(p) · GW(p)

While we're on the subject of decision theory... what is the difference between TDT and UDT?

Replies from: Wei_Dai, Oscar_Cunningham, lukeprog
comment by Wei Dai (Wei_Dai) · 2012-08-27T20:11:29.115Z · LW(p) · GW(p)

Maybe the easiest way to understand UDT and TDT is:

  • UDT = EDT without updating on sensory inputs, with "actions" to be understood as logical facts about the agent's outputs
  • TDT = CDT with "causality" to be understood as Pearl's notion of causality plus additional arrows for logical correlations

Comparing UDT and TDT directly, the main differences seem to be that UDT does not do Bayesian updating on sensory inputs and does not make use of causality. There seems to be general agreement that Bayesian updating on sensory inputs is wrong in a number of situations, but disagreement and/or confusion about whether we need causality. Gary Drescher put it this way:

Plus, if you did have a general math-counterfactual-solving module, why would you relegate it to the logical-dependency-finding subproblem in TDT, and then return to the original factored causal graph? Instead, why not cast the whole problem as a mathematical abstraction, and then directly ask your math-counterfactual-solving module whether, say, (Platonic) C's one-boxing counterfactually entails (Platonic) $1M? (Then do the argmax over the respective math-counterfactual consequences of C's candidate outputs.)

(Eliezer didn't give an answer. ETA: He did answer a related question here.)

Replies from: AlexMennen
comment by AlexMennen · 2012-08-27T20:28:28.725Z · LW(p) · GW(p)

I can see what updating on sensory inputs does to TDT (causing it to fail counterfactual mugging). But what does it mean to say that TDT makes use of causality and UDT doesn't? Are there any situations where this causes them to give different answers?

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-08-27T20:49:33.571Z · LW(p) · GW(p)

(I added a link at the end of the grandparent comment where Eliezer does give some of his thoughts on this issue.)

Are there any situations where this causes them to give different answers?

Eliezer seems to think that causality can help deal with Gary Drescher's "5-and-10" problem:

But you would still have to factor out your logical uncertainty in a way which prevented you from concluding "if I choose A6, it must have had higher utility than A7" when considering A6 as an option (as Drescher observes).

But it seems possible to build versions of UDT that are free from such problems (such as the proof-based ones that cousin_it and Nesov have explored), although there are still some remaining issues with "spurious proofs" which may be related. In any case, it's unclear how to get help from the notion of causality, and as far as I know, nobody has explored in that direction and reported back any results.

comment by Oscar_Cunningham · 2012-08-27T09:34:13.713Z · LW(p) · GW(p)

I'm not an expert but I think this is how it works:

Both decision theories (TDT and UDT) work by imagining the problem from the point of view of themselves before the problem started. They then think "From this point of view, which sequence of decisions would be the best one?", and then they follow that sequence of decisions. The difference is in how they react to randomness in the environment. When the algorithm is run, the agent is already midway through the problem, and so might have some knowledge that it didn't have at the start of the problem (e.g. whether a coinflip came up heads or tails). When visualising themselves at the start of the problem, TDT assumes they have this knowledge; UDT assumes they don't.

An example is Counterfactual Mugging:

Imagine that one day, Omega comes to you and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don't want to give up your $100. But see, the Omega tells you that if the coin came up heads instead of tails, it'd give you $10000, but only if you'd agree to give it $100 if the coin came up tails.

TDT visualises itself before the problem started, knowing that the coin will come up tails. From this point of view, the kind of agent that does well is the kind that refuses to give $100, and so that's what TDT does.

UDT visualises itself before the problem started, and pretends it doesn't know what the coin does. From this point of view the kind of agent that does well is the kind that gives $100 in the case of tails, so that's what UDT does.
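
To make the numbers explicit, here is a rough toy calculation (using the payoffs from the problem statement above) of both ways of evaluating the situation:

```python
P_HEADS = 0.5

def policy_value(pays_on_tails):
    """Ex-ante expected value of a policy, evaluated before the coin is seen
    (the UDT-style view): Omega gives $10000 on heads only to agents whose
    policy is to hand over $100 on tails."""
    heads_payoff = 10000 if pays_on_tails else 0
    tails_payoff = -100 if pays_on_tails else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff

print(policy_value(True))    # 4950.0 -> the paying policy wins ex ante
print(policy_value(False))   # 0.0

# The updated (TDT-style) view conditions on the coin having come up tails,
# so the heads branch drops out and the comparison is just -100 vs. 0:
print({"pay": -100, "refuse": 0})
```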

Replies from: AlexMennen
comment by AlexMennen · 2012-08-27T15:23:17.976Z · LW(p) · GW(p)

Why do we still reference TDT so much if UDT is better?

Replies from: lukeprog
comment by lukeprog · 2012-08-27T17:59:02.153Z · LW(p) · GW(p)

Many people think of UDT as being a member of the "TDT branch of decision theories." And in fact, much of what is now discussed as "UDT" (e.g. in A model of UDT with a halting oracle) is not Wei Dai's first or second variant of UDT but instead a new variant of UDT sometimes called Ambient Decision Theory or ADT.

comment by lukeprog · 2012-08-27T02:49:23.584Z · LW(p) · GW(p)

Follow-up: Is it in how they compute conditional probabilities in the decision algorithm? As I understand it, that's how CDT and EDT and TDT differ.

Replies from: AlexMennen
comment by AlexMennen · 2012-08-27T03:38:02.908Z · LW(p) · GW(p)

I don't think that is how CDT and EDT differ, actually. Instead, it's that EDT cares about conditional probabilities and CDT doesn't. For instance, in Newcomb's problem, a CDT agent could agree that his expected utility is higher conditional on him one-boxing than it is conditional on him two-boxing. But he two-boxes anyway because the correlation isn't causal. I guess TDT/UDT does compute conditional probabilities differently in the sense that they don't pretend that their decisions are independent of the outputs of similar algorithms.
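
As a toy illustration of that difference (assuming, say, a 99%-accurate predictor; the numbers are made up), the two ways of computing expectations come apart like this:

```python
ACC = 0.99         # assumed predictor accuracy (made-up number)
BOX_B = 1000000    # opaque box, filled iff one-boxing was predicted
BOX_A = 1000       # transparent box, always present

# EDT conditions on the action; the predictor's accuracy makes the
# conditional probabilities differ between the two actions.
edt_one_box = ACC * BOX_B + (1 - ACC) * 0
edt_two_box = ACC * BOX_A + (1 - ACC) * (BOX_B + BOX_A)
print(edt_one_box, edt_two_box)   # 990000.0 vs 11000.0 -> EDT one-boxes

# CDT treats the box contents as causally fixed before the choice: for any
# fixed probability q that box B is full, two-boxing gains exactly BOX_A.
q = 0.5                            # any q gives the same ranking
cdt_one_box = q * BOX_B
cdt_two_box = q * BOX_B + BOX_A
print(cdt_one_box, cdt_two_box)    # two-boxing comes out ahead for CDT
```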

comment by Wei Dai (Wei_Dai) · 2012-08-28T22:49:26.012Z · LW(p) · GW(p)

Why haven't SI and LW attracted or produced any good strategists? I've been given to understand (from someone close to SI) that various people within SI have worked on Singularity strategy but only produced lots of writings that are not of an organized, publishable form. Others have attempted to organize them but also failed, and there seems to be a general feeling that strategy work is bogged down or going in circles and any further effort will not be very productive. The situation on LW seems similar, with people arguing in various directions without much feeling of progress. Why are we so bad at this, given that strategic thinking must be a core part of rationality?

Replies from: lukeprog, Epiphany
comment by lukeprog · 2012-08-30T20:00:02.381Z · LW(p) · GW(p)

There are some but not lots of "writings" produced internally by SingInst that are not available to the public. There's lots of scribbled notes and half-finished models and thoughts in brains and stuff like that. We're working to push them out into written form, but that takes time, money, and people — and we're short on all three.

The other problem is that to talk about strategy we first have to explain lots of things that are basic (to a veteran like you but not to most interested parties) in clear, well-organized language for the first time, since much of this hasn't been done yet (these SI papers definitely help, though: 1, 2, 3, 4, 5, 6). To solve this problem we are (1) adding/improving lots of articles on the LW wiki like you suggested a while back (you'll see a report on what we did, later), and (2) working on the AI risk wiki (we're creating the map of articles right now). Once those resources are available it will be easier to speak clearly in public about strategic issues.

We hit a temporary delay in pushing out strategy stuff at SI because two of our most knowledgable researchers & strategists became unavailable for different reasons: Anna took over launching CFAR and Carl took an extended (unpaid) leave of absence to take care of some non-SI things. Also, I haven't been able to continue my own AI risk strategy series due to other priorities, and because I got to the point where it was going to be a lot of work to continue that sequence if I didn't already have clear, well-organized write-ups of lots of standard material. (So, it'll be easier for me to continue once the LW wiki has been improved and once the AI risk wiki exists, both of which we've got people working on right now.)

Moreover, there are several papers in the works — mostly by Kaj (who is now a staff researcher), with some help from myself — but you won't see them for a while. You did see this and this, however. Those are the product of months of part-time work from several remote researchers, plus analysis by Kaj and Stuart. Remote researchers are currently doing large lit reviews for other paper projects (at the pace of part-time work), but we aren't at the stage of analyzing those large data sets yet so that we can write papers about what we found.

Also, a lot of work currently scattered around in papers and posts by SI and FHI people is being collected and tightly organized in Nick Bostrom's forthcoming monograph on superintelligence, which will be a few hundred pages entirely about singularity strategy. Having much of "the basics" organized in that way will also make it easier to produce additional strategy work.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-08-31T01:42:10.436Z · LW(p) · GW(p)

Luke, with the existing people at SI and FHI's disposal, how long do you think it would take (assuming they're not busy with other projects) to produce a document that lays out a cogent argument for some specific Singularity strategy? An argument that takes into account all of the important considerations that have already been raised (for example my comment that Holden quoted)? I will concede that strategy work is not bogged down if you think it can be done in a reasonable time frame. (2 years, perhaps?) But if SI and FHI are merely producing writings that explain the strategic considerations, but which we can't foresee forming into an overall argument for some specific strategy, that seems very weak evidence at best against my claim that we are bad at strategic thinking.

Replies from: lukeprog
comment by lukeprog · 2012-08-31T04:17:46.182Z · LW(p) · GW(p)

I know that FHI plans to produce a particular set of policy recommendations relevant to superintelligence upon the release of Nick's book or shortly thereafter. FHI has given no timeline for Nick's book but I expect it to be published in mid or late 2013.

The comparably detailed document from SI will be the AI risk wiki. We think the wiki format makes even more sense than a book for these purposes, though an OUP book on superintelligence from Nick Bostrom sounds great to us. Certainly, we will be busy with other projects, but even still I think the AI risk wiki (a fairly comprehensive version 1.0, anyway) could be finished within 2 years. I'm not that confident it will be finished in 2 years, though, given that we've barely begun. Six months from now I'll be more confidently able to predict the likelihood of finishing the AI risk wiki version 1.0 within 2 years.

Despite this, I would describe the current situation as "bogged down" when it comes to singularity strategy. Luckily, the situation is changing due to 2 recent game-shifting events: (1) FHI decided to spend a few years focusing on AI risk strategy while Nick wrote a monograph on the subject, and (2) shortly thereafter, SI began to rapidly grow its research team (at first, mostly through part-time remote researchers) and use that team to produce a lot more research writing than before (only a small fraction of which you've seen thus far).

And no, I don't know in advance what strategic recommendations FHI will arrive at, nor which strategic recommendations SI's scholarly AI risk wiki will arrive at, except to say that SI's proposals will probably include Friendly AI research as one of the very important things humanity should be doing right now about AI risk.

ETA: My answer to your original question — "Why haven't SI and LW attracted or produced any good strategists?" — is that it's very difficult and time-consuming to acquire all the domain knowledge required to be good at singularity strategy, especially when things like a book-length treatment of AI risk and an AI risk wiki don't yet exist. It will be easier for someone to become good at singularity strategy once those things exist, but even still they'll have to know a lot about technological development and forecasting, narrow AI, AGI, FAI open problems, computer science, maths, the physical sciences, economics, philosophy of science, value theory, anthropics, and quite a bit more.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-09-01T11:14:12.236Z · LW(p) · GW(p)

What methodology will be used to produce SI's strategic recommendations (and FHI's, if you know the answer)? As far as I can tell, we currently don't have a way to make the many known strategic considerations/arguments commensurable (e.g., suitable for integrating into a quantitative strategic framework) except by using our intuitions which seem especially unreliable on matters related to Singularity strategy. The fact that you think the AI risk wiki can be finished in 2 years seems to indicate that you either disagree with this evaluation of the current state of affairs, or think we can make very rapid progress in strategic reasoning. Can you explain?

Replies from: lukeprog
comment by lukeprog · 2012-09-01T19:22:30.049Z · LW(p) · GW(p)

We certainly could integrate known strategic arguments into a quantitative framework like this, but I'm worried that, for example, "putting so many made-up probabilities into a probability tree like this is not actually that helpful."

I think for now both SI and FHI are still in the qualitative stage that normally precedes quantitative analysis. Big projects like Nick's monograph and SI's AI risk wiki will indeed constitute "rapid progress" in strategic reasoning, but it will be rapid progress toward more quantitative analyses, not rapid progress within a quantitative framework that we have already built.

Of course, some of the work on strategic sub-problems is already at the quantitative/formal stage, so quantitative/formal progress can be made on them immediately if SI/FHI can raise the resources to find and hire the right people to work on them. Two examples: (1) What do reasonable economic models of past jumps in optimization power imply about what would happen once we get self-improving AGI? (2) If we add lots more AI-related performance curve data to Nagy's Performance Curve Database and use his improved tech forecasting methods, what does it all imply about AI and WBE timelines?

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-09-02T08:28:31.503Z · LW(p) · GW(p)

I think for now both SI and FHI are still in the qualitative stage that normally precedes quantitative analysis.

There are many strategic considerations that greatly differ in nature from one another. It seems to me that at best they will require diverse novel methods to analyze quantitatively, and at worst a large fraction may resist attempts at quantitative analysis until the Singularity occurs.

For example we can see that there is an upper bound on how confident a small FAI team, working in secret and with limited time, can be (assuming it's rational) about the correctness of an FAI design, due to the issue raised in my comment quoted by Holden, and this is of obvious strategic importance. But I have no idea what method we can use to derive this bound, other than to "make it up". Solving this problem alone could easily take a team several years to accomplish, so how do you hope to produce the strategic recommendations, which must take into account many such issues, in 2 years?

Replies from: lukeprog
comment by lukeprog · 2012-09-02T17:50:22.223Z · LW(p) · GW(p)

Solving this problem alone could easily take a team several years to accomplish, so how do you hope to produce the strategic recommendations, which must take into account many such issues, in 2 years?

Two answers:

  1. Obviously, our recommendations won't be final, and we'll try to avoid being overconfident — especially where the recommendations depend on highly uncertain variables.
  2. In many (most?) cases, I suspect our recommendations will be for policies that play a dual role of (1) making progress in directions that look promising from where we stand now, and also (2) purchasing highly valuable information, like how feasible an NGO FAI team is, how hard FAI really is, what the failure modes look like, how plausible alternative approaches are, etc.

SI, FHI, you, others — we're working on tough problems with many unknown and uncertain strategic variables. Those challenges are not unique to AI risk. Humans have many tools for doing the best they can while running on spaghetti code and facing decision problems under uncertainty, and we're gaining new tools all the time.

I don't mean to minimize your concerns, though. Right now I expect to fail. I expect us all to get paperclipped (or turned off), though I'll be happy to update in favor of positive outcomes if (1) research shows the problem isn't as hard as I now think, (2) financial support for x-risk reduction increases, (3) etc.

Replies from: Wei_Dai, hairyfigment
comment by Wei Dai (Wei_Dai) · 2012-09-04T23:10:20.202Z · LW(p) · GW(p)

I don't mean to minimize your concerns, though. Right now I expect to fail. I expect us all to get paperclipped (or turned off), though I'll be happy to update in favor of positive outcomes if (1) research shows the problem isn't as hard as I now think, (2) financial support for x-risk reduction increases, (3) etc.

I think you may have misunderstood my intent here. I'm not trying to make you more pessimistic about our overall prospects but arguing (i.e., trying to figure out) the absolute and relative importance of solving various strategic problems.

Another point was to suggest that perhaps SI ought to give higher priority to recruiting/training "hero strategists" as opposed to "hero mathematicians". For example your So You Want to Save the World says:

No, the world must be saved by mathematicians, computer scientists, and philosophers.

which fails to credit the importance of strategic contributions (even though later in the post there is a large section on strategic problems).

Replies from: lukeprog
comment by lukeprog · 2012-09-05T03:58:34.384Z · LW(p) · GW(p)

which fails to credit the importance of strategic contributions

Sorry if that was unclear; I mean to identify the strategists as "philosophers", like this. As you say, I went on to include a large section on strategy.

I certainly agree on the importance of strategy. Most of the research SI and FHI have done is strategic, after all — and most of the work in progress is strategic, too.

I do tend to talk a lot about "hero mathematicians," though. Maybe that's because "hero mathematician" is more concrete (to me) than "hero strategist."

Anyway, it seems like we may be failing to disagree on anything, here.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-09-05T22:32:48.594Z · LW(p) · GW(p)

Sorry if that was unclear; I mean to identify the strategists as "philosophers", like this.

I see. I had interpreted you to mean philosophers as part of a team to build FAI.

I do tend to talk a lot about "hero mathematicians," though. Maybe that's because "hero mathematician" is more concrete (to me) than "hero strategist."

What do you mean by "more concrete", and do you think it's a good reason to talk a lot more about "hero mathematicians"?

Replies from: lukeprog, hairyfigment
comment by lukeprog · 2012-09-09T00:06:31.800Z · LW(p) · GW(p)

I had interpreted you to mean philosophers as part of a team to build FAI.

That could also be true, but I'm not sure.

Re: "hero mathematicians" and "hero strategists", here's a more detailed version of what I currently think.

Result of saying we need "hero mathematicians"? A few mathematicians (perhaps primed by HPMoR to be rationality heroes) come to us and learn what the technical research program looks like, help put our memes into the math community, etc.

Result of saying we need "hero strategists"? I'm inundated with people who say they can contribute to singularity strategy after thinking about the issues for one month and reading less than 100 pages on the subject. SI staff wastes valuable time trying to steer amateur strategists along more valuable paths before giving up due to low ROI.

Basically, the recruiting problem is different for mathematicians and strategists, and I think these problems can be tackled more effectively by tackling them separately. Mathematicians can prove themselves useful rather quickly, by offering constructive comments on the problems we will (in the next 12 months) have written up somewhat formally, or by spreading our memes in their research communities.

But to tell whether someone can be a useful strategist they need to read 500 pages of material and spend months chatting regularly with SI and/or FHI, and that's very costly for both them and for SI+FHI.

The best result might be if some of the mathematicians themselves turn out to be good strategists. I don't know that I can count on that, but for example I already count both you and Paul Christiano as among the few strategists whose strategy work I would spend my time reading, even though your primary life work has been in math and compsci (and not, say, civil engineering, business management, political science, or economics).

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-09-11T23:06:04.724Z · LW(p) · GW(p)

Result of saying we need "hero strategists"? I'm inundated with people who say they can contribute to singularity strategy after thinking about the issues for one month and reading less than 100 pages on the subject. SI staff wastes valuable time trying to steer amateur strategists along more valuable paths before giving up due to low ROI.

But to tell whether someone can be a useful strategist they need to read 500 pages of material and spend months chatting regularly with SI and/or FHI, and that's very costly for both them and for SI+FHI.

You could direct them to LW and let them prove their mettle here?

comment by hairyfigment · 2012-09-07T06:56:38.315Z · LW(p) · GW(p)

I just tried to picture what "hero strategist" could mean, if distinct from 'person who knows LW rationality' or 'practical guy like Luke'. I came up with someone who could hire the world's best mathematicians plus a professional cat-herder and base the strategy on the result.

comment by hairyfigment · 2012-09-08T02:12:42.646Z · LW(p) · GW(p)

Right now I expect to fail. I expect us all to get paperclipped

So, you're currently thinking hard about the best way to approach someone like Terence Tao? (Doesn't have to be him, someone else's blog might also have comments and give you a better opportunity to raise the issue.)

Replies from: lukeprog
comment by lukeprog · 2012-09-08T02:50:05.809Z · LW(p) · GW(p)

Actually, yes. We had a meeting about that a couple weeks ago. Tao was specifically named. :)

comment by Epiphany · 2012-08-29T01:23:51.487Z · LW(p) · GW(p)

I love working on problems like these. If the specifics of the "circles" are written down anywhere, or if you care to describe them, I'd happily give it a whack. I won't claim to be an expert, but I enjoy complex problem solving tasks like this one too much not to offer.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-08-29T18:16:49.364Z · LW(p) · GW(p)

My understanding is that there are a lot of writings produced internally by SingInst that are not available to the public. If you think you are up to the task of organizing and polishing them into publishable form (which I guess probably also requires filling in lots of missing pieces), you should contact them and volunteer. (But I guess they probably don't want to hand their unfinished work to just anyone, so you'll have to prove yourself somehow.)

If you just want to get an idea of the issues involved, here are some of my writings on the topic:

And see also the unfinished AI risk sequence by lukeprog.

Replies from: Bruno_Coelho
comment by Bruno_Coelho · 2012-08-29T18:55:38.437Z · LW(p) · GW(p)

From time to time people ask lukeprog about SI writings. He talks about the AI papers they are working on, and at some point, for the sake of security, he stops and says "it's confidential" or something similar.

Evaluating who can do good work is important (60% fail), besides the historical aversion to formatted papers.

Note: I'm still waiting for EY's books.

comment by lukeprog · 2012-08-27T00:20:24.434Z · LW(p) · GW(p)

I finally decided it's worth some of my time to try to gain a deeper understanding of decision theory...

Question: Can Bayesians transform decisions under ignorance into decisions under risk by assuming the decision maker can at least assign probabilities to outcomes using some kind of ignorance prior(s)?

Details: "Decision under uncertainty" is used to mean various things, so for clarity's sake I'll use "decision under ignorance" to refer to a decision for which the decision maker does not (perhaps "cannot") assign probabilities to some of the possible outcomes, and I'll use "decision under risk" to refer to a decision for which the decision maker does assign probabilities to all of the possible outcomes.

There is much debate over which decision procedure to use when facing a decision under ignorance when there is no act that dominates the others. Some proposals include: the leximin rule, the optimism-pessimism rule, the minimax regret rule, the info-gap rule, and the maxipok rule.

However, there is broad agreement that when facing a decision under risk, rational agents maximize expected utility. Because we have a clearer procedure for dealing with decisions under risk than we do for dealing with decisions under ignorance, many decision theorists are tempted to transform decisions under ignorance into decisions under risk by appealing to the principle of insufficient reason: "if you have literally no reason to think that one state is more probable than another, then one should assign equal probability to both states."
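
As a made-up example of how these rules can disagree, and of what the insufficient-reason transformation does, consider three acts and two states whose probabilities the decision maker cannot assess:

```python
# Payoffs for three acts across two states whose probabilities are unknown.
payoffs = {
    "a1": [0, 10],
    "a2": [4, 4],
    "a3": [1, 8],
}

def maximin(p):        # pessimism: pick the best worst-case payoff
    return max(p, key=lambda a: min(p[a]))

def minimax_regret(p): # minimize the largest regret across states
    best = [max(p[a][s] for a in p) for s in range(2)]
    regret = {a: max(best[s] - p[a][s] for s in range(2)) for a in p}
    return min(regret, key=regret.get)

def eu_uniform(p):     # principle of insufficient reason: equal weights
    return max(p, key=lambda a: sum(p[a]) / 2)

print(maximin(payoffs))         # 'a2' (worst case 4)
print(minimax_regret(payoffs))  # 'a3' (maximum regret 3)
print(eu_uniform(payoffs))      # 'a1' (expected payoff 5 under the uniform prior)
```

Assigning the uniform ignorance prior turns the third rule into ordinary expected utility maximization, which is exactly the transformation in question.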

And if you're a Bayesian decision-maker, you presumably have some method for generating ignorance priors, whether or not that method always conforms to the principle of insufficient reason, and even if you doubt you've found the final, best method for assigning ignorance priors.

So if you're a Bayesian decision-maker, doesn't that mean that you only ever face decisions under risk, because at the very least you're assigning ignorance priors to the outcomes for which you're not sure how to assign probabilities? Or have I misunderstood something?

Replies from: paulfchristiano, cousin_it, AlexMennen, Decius, Manfred
comment by paulfchristiano · 2012-08-27T05:50:01.412Z · LW(p) · GW(p)

You could always choose to manage ignorance by choosing a prior. It's not obvious whether you should. But as it turns out, we have results like the complete class theorem, which imply that EU maximization with respect to an appropriate prior is the only "Pareto efficient" decision procedure (any other decision can be changed so as to achieve a higher reward in every possible world).
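
A toy illustration of that claim (my own numbers, stated in terms of losses rather than rewards): with two possible worlds and three acts, the act that is not a Bayes act for any prior turns out to be dominated by a randomized mixture of the other two.

```python
import numpy as np

# Losses of three acts in two possible worlds (rows: acts, columns: worlds).
loss = np.array([
    [0.0, 10.0],   # act a
    [10.0, 0.0],   # act b
    [6.0, 6.0],    # act c
])

# Bayes loss of each act under a prior (p, 1-p) on the two worlds.
for p in (0.3, 0.5, 0.7):
    bayes = loss @ np.array([p, 1 - p])
    print(p, bayes, "best:", "abc"[int(bayes.argmin())])
# For every prior the best act is a or b; c is never a Bayes act
# (its Bayes loss is always 6, while min(10p, 10(1-p)) <= 5).

# And c is indeed inadmissible: the fair mixture of a and b has loss (5, 5),
# strictly better than c's (6, 6) in both possible worlds.
print(0.5 * loss[0] + 0.5 * loss[1], loss[2])
```

That is the sense in which a non-Bayes procedure fails to be "Pareto efficient" across possible worlds.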

This analysis breaks down in the presence of computational limitations; in that case it's not clear that a "rational" agent should have even an implicit representation of a distribution over possible worlds (such a distribution may be prohibitively expensive to reason about, much less integrate exactly over), so maybe a rational agent should invoke some decision rule other than EU maximization.

The situation is sort of analogous to defining a social welfare function. One approach is to take a VNM utility function for each individual and then maximize total utility. At face value it's not obvious if this is the right thing to do--choosing an exchange rate between person A's preferences and person B's preferences feels pretty arbitrary and potentially destructive (just like choosing prior odds between possible world A and possible world B). But as it turns out, if you do anything else then you could have been better off by picking some particular exchange rate and using it consistently (again, modulo practical limitations).

Replies from: lukeprog
comment by lukeprog · 2012-08-30T22:56:13.255Z · LW(p) · GW(p)

as it turns out, we have results like the complete class theorem, which imply that EU maximization with respect to an appropriate prior is the only "Pareto efficient" decision procedure (any other decision can be changed so as to achieve a higher reward in every possible world).

I found several books which give technical coverage of statistical decision theory, complete classes, and admissibility rules (Berger 1985; Robert 2001; Jaynes 2003; Liese & Miescke 2010), but I didn't find any clear explanation of exactly how the complete class theorem implies that "EU maximization with respect to an appropriate prior is the only 'Pareto efficient' decision procedure (any other decision can be changed so as to achieve a higher reward in every possible world)."

Do you know any source which does so, or are you able to explain it? This seems like a potentially significant argument for EUM that runs independently of the standard axiomatic approaches, which have suffered many persuasive attacks.

Replies from: paulfchristiano
comment by paulfchristiano · 2012-08-31T05:37:13.669Z · LW(p) · GW(p)

The formalism of the complete class theorem applies to arbitrary decisions; the Bayes decision procedures correspond to EU maximization with respect to an appropriate choice of prior. An inadmissible decision procedure is not Pareto efficient, in the sense that a different decision procedure does better in all possible worlds (which feels analogous to making all possible people happier). Does that make sense?
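A toy numerical illustration of that admissibility point (the two-world, two-action, binary-observation setup and all the numbers are assumptions invented for this example, not anything from the comment): enumerating the deterministic decision rules shows that the only dominated one is the rule that guesses against the evidence, and each admissible rule maximizes expected utility under some prior.

```python
from itertools import product

worlds = ["w1", "w2"]
p_obs1 = {"w1": 0.8, "w2": 0.3}                  # P(X = 1 | world), made up

def utility(world, action):
    return 1.0 if action == world else 0.0       # reward 1 for guessing right

def expected_utility_per_world(rule):
    # rule = (action when X = 0, action when X = 1)
    return tuple(
        (1 - p_obs1[w]) * utility(w, rule[0]) + p_obs1[w] * utility(w, rule[1])
        for w in worlds
    )

rules = list(product(worlds, repeat=2))          # actions are world-guesses
payoffs = {r: expected_utility_per_world(r) for r in rules}

for r, v in payoffs.items():
    dominated = any(
        all(o >= x for o, x in zip(payoffs[s], v)) and payoffs[s] != v
        for s in rules
    )
    print(r, v, "dominated" if dominated else "admissible")
# Only ("w1", "w2") -- guess w1 on the observation that favors w2 and vice
# versa -- comes out dominated; each of the other three rules is the
# EU-maximizing (Bayes) rule for some prior over the two worlds.
```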

There is a bit of weasel room, in that the complete class theorem assumes that the data is generated by a probabilistic process in each possible world. This doesn't seem like an issue, because you just absorb the observation into the choice of possible world, but this points to a bigger problem:

If you define "possible worlds" finely enough, such that e.g. each (world, observation) pair is a possible world, then the space of priors is very large (e.g., you could put all of your mass on one (world, observation) pair for each observation) and can be used to justify any decision. For example, if we are in the setting of AIXI, any decision procedure can trivially be described as EU maximization under an appropriate prior: if the decision procedure outputs f(X) on input X, it corresponds to EU maximization against a prior which has the universe end after N steps with probability 2^(-N), and when the universe ends after you see X, you receive an extra reward if your last output was f(X).

So the conclusion of the theorem isn't so interesting, unless there are few possible worlds. When you argue for EUM, you normally want some stronger statement than saying that any decision procedure corresponds to some prior.

Replies from: lukeprog
comment by lukeprog · 2012-08-31T16:16:53.202Z · LW(p) · GW(p)

That was clear. Thanks!

comment by cousin_it · 2012-08-27T01:17:54.336Z · LW(p) · GW(p)

What AlexMennen said. For a Bayesian there's no difference in principle between ignorance and risk.

One wrinkle is that even Bayesians shouldn't have prior probabilities for everything, because if you assign a prior probability to something that could indirectly depend on your decision, you might lose out.

A good example is the absent-minded driver problem. While driving home from work, you pass two identical-looking intersections. At the first one you're supposed to go straight, at the second one you're supposed to turn. If you do everything correctly, you get utility 4. If you goof and turn at the first intersection, you never arrive at the second one, and get utility 0. If you goof and go straight at the second, you get utility 1. Unfortunately, by the time you get to the second one, you forget whether you'd already been at the first, which means at both intersections you're uncertain about your location.

If you treat your uncertainty about location as a probability and choose the Bayesian-optimal action, you'll get demonstrably worse results than if you'd planned your actions in advance or used UDT. The reason, as pointed out by taw and pengvado, is that your probability of arriving at the second intersection depends on your decision to go straight or turn at the first one, so treating it as unchangeable leads to weird errors.
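A minimal sketch of the planning-stage calculation for the payoffs above, assuming the driver's only degree of freedom is a single probability p of going straight, applied at whichever intersection he finds himself at:

```python
# Expected utility of "go straight with probability p at any intersection":
# turn at the first -> 0, straight then turn -> 4, straight at both -> 1.
def expected_utility(p):
    return p * (1 - p) * 4 + p * p * 1

best_p = max((i / 1000 for i in range(1001)), key=expected_utility)
print(best_p, expected_utility(best_p))   # ~2/3 and ~4/3
```

The planning optimum is p = 2/3 with expected utility 4/3; that is the benchmark which, per the comment above, naive intersection-by-intersection optimization fails to reproduce.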

Replies from: Vladimir_Nesov, DanielLC
comment by Vladimir_Nesov · 2012-08-27T14:22:35.172Z · LW(p) · GW(p)

One wrinkle is that even Bayesians shouldn't have prior probabilities for everything, because if you assign a prior probability to something that could indirectly depend on your decision, you might lose out.

... your probability of arriving at the second intersection depends on your decision to go straight or turn at the first one, so treating it as unchangeable leads to weird errors.

"Unchangeable" is a bad word for this, as it might well be thought of as unchangeable, if you won't insist on knowing what it is. So a Bayesian may "have probabilities for everything", whatever that means, if it's understood that those probabilities are not logically transparent and some of the details about them won't necessarily be available when making any given decision. After you do make a decision that controls certain details of your prior, those details become more readily available for future decisions.

In other words, the problem is not in assigning probabilities to too many things, but in assigning them arbitrarily and thus incorrectly. If the correct assignment of probability is such that the probability depends on your future decisions, you won't be able to know this probability, so if you've "assigned" it in such a way that you do know what it is, you must have assigned a wrong thing. Prior probability is not up for grabs etc.

comment by DanielLC · 2012-08-27T04:27:53.167Z · LW(p) · GW(p)

so treating it as unchangeable leads to weird errors.

The prior probability is unchangeable. It's just that you make your decision based on the posterior probability taking into account each decision. At least, that's what you do if you use EDT. I'm not entirely familiar with the other decision theories, but I'm pretty sure they all have prior probabilities for everything.

comment by AlexMennen · 2012-08-27T00:55:49.169Z · LW(p) · GW(p)

So if you're a Bayesian decision-maker, doesn't that mean that you only ever face decisions under risk, because at they very least you're assigning ignorance priors to the outcomes for which you're not sure how to assign probabilities?

Correct. A Bayesian always has a probability distribution over possible states of the world, and so cannot face a decision under ignorance as you define it. Coming up with good priors is hard, but to be a Bayesian, you need a prior.

comment by Decius · 2012-08-27T08:54:55.680Z · LW(p) · GW(p)

A Bayesian decision cannot be made if you are unable to assign a probability distribution to the outcomes.

As mentioned, you can consider a Bayesian probability distribution over what the correct distributions will be; if you have no reason to say which state, if any, is more probable, then they have the same meta-distribution as each other: if you know that a coin is unfair, but have no information about which way it is biased, then you should divide the first bet evenly between heads and tails (assuming logarithmic payoffs).

It might make sense to consider the probability distribution over the coin's fairness as a graph: the X axis, from 0 to 1, is the chance of each flip coming up heads, and the Y axis is the odds that the coin has that particular bias; because of our prior information, there is a removable discontinuity at x=1/2. Initially the graph is flat (having no prior information on how weighted the coin is, you could assume that all weightings except fair are equally likely), but after the first flip it changes: if it came up tails, the odds of a two-headed coin are now 0, the odds of a 99.99%-heads coin are infinitesimal, and the odds of a tail-weighted coin are significantly greater. After the second flip you have further information about what the bias of the coin was, but still no information about whether the bias is time-variable, such that it is always heads on prime flips and always tails on composite flips.

If you consider a coin rigged to follow a fixed sequence to be just as likely as a coin whose result is randomly determined on each flip, then you have a problem: no amount of information can fill in certain gaps in a prior probability.
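A small sketch of the kind of updating described above, assuming a discretized grid over the coin's heads-probability with the fair value excluded (the grid size and the single observed tail are arbitrary choices for illustration):

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 1001)        # possible values of P(heads)
prior = np.ones_like(grid)
prior[np.isclose(grid, 0.5)] = 0.0        # the coin is known to be unfair
prior /= prior.sum()

def update(dist, flip):
    # flip: 1 for heads, 0 for tails
    likelihood = grid if flip == 1 else (1.0 - grid)
    post = dist * likelihood
    return post / post.sum()

post = update(prior, 0)                   # observe one tail
print(post[grid > 0.5].sum())             # mass on head-biased coins: ~0.25
```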

comment by Manfred · 2012-08-27T00:52:49.652Z · LW(p) · GW(p)

This reminds me of a recent tangent on Kelly betting. Apparently it's claimed that the unusualness of this optimum betting strategy shows that you should treat risk and ignorance differently - but of course the difference between the two situations is entirely accounted for by two different conditional probability distributions. So you can sort of think of situations (that is, the probability distribution describing possible outcomes) as "risk-like" or "ignorance-like."

Replies from: AlexMennen
comment by AlexMennen · 2012-08-27T01:05:34.869Z · LW(p) · GW(p)

If you're talking about what I think you're talking about, then by "risk", you mean "frequentist probability distribution over outcomes", and by "ignorance", you mean "Bayesian probability distribution over what the correct frequentist probability distribution over outcomes is", which is not the way Luke was defining the terms.

comment by loup-vaillant · 2012-08-28T08:11:36.308Z · LW(p) · GW(p)

This question may come off as a bit off topic: people often say cryonics is a scam. What is the evidence for that, and to the contrary? How should I gather it?

The thing is, cryonics is a priori awfully suspect. It appeals to one of our deepest motives (not dying), is very expensive, has unusual payment plans, and is just plain weird. So the prior of it being a scam designed to rip us off is quite high. On the other hand, reading about it here, I acquired a very strong intuition that it is not a scam, or at least that Alcor and CI are serious. The problem is, I don't have solid evidence I can tell others about.

Now, I doubt the scam argument is the main reason why people don't buy it. But I'd like to get that argument out of the way.

Replies from: NancyLebovitz, drethelin
comment by NancyLebovitz · 2012-08-28T18:49:10.911Z · LW(p) · GW(p)

I think cryonics is more likely to be a mistake than a scam, but that might just be my general belief that incompetence is much more common than malice.

comment by drethelin · 2012-08-28T08:50:27.419Z · LW(p) · GW(p)

I think there is a very good chance some cryonics organizations are in fact scams.

Replies from: loup-vaillant
comment by loup-vaillant · 2012-08-28T09:59:55.223Z · LW(p) · GW(p)

Good. Is this just an intuition, or can you communicate more precise reasons? A list of red flags could be useful (whether they are present or not).

Replies from: Eudoxia, drethelin
comment by Eudoxia · 2012-08-28T18:16:56.410Z · LW(p) · GW(p)

Alcor: Improperly trained personnel, unkempt and ill-equipped facilities.

[...] Saul Kent invited me over to his home in Woodcrest, California to view videotapes of two Alcor cases which troubled him – but he couldn’t quite put his finger on why this was so.[...] Patients were being stabilized at a nearby hospice, transported to Alcor (~20 min away) and then CPS was discontinued, the patients were placed on the OR table and, without any ice on their heads, they were allowed to sit there at temperatures a little below normal body temperature for 1 to 1.5 hours, while burr holes were drilled, [...] smoke could be seen coming from the burr wound! Since the patient had no circulation to provide blood to carry away the enormous heat generated by the action of the burr on the bone, the temperature of the underlying bone (and brain) must have been high enough to literally cook an egg. In one case, a patient’s head was removed in the field and, because they had failed to use a rectal plug, the patient had defecated in the PIB. The result was that feces had contaminated the neck wound, and Alcor personnel were seen pouring saline over the stump of the neck whilst holding the patient’s severed head over a bucket trying to wash the fecal matter off the stump. These are just a few of the grotesque problems I observed.[...] The operating room was unkempt. The floors were scuffed, stained, dirty, and had obviously not been waxed in a long time. [...] I wouldn’t consider medical treatment in a facility with this appearance – nor for that matter would I like to dine in a restaurant with a kitchen in such a state.

Source

Cryonics Institute: Patient experimentation. No need to say anything else.

It was a snotty, and probably inappropriate remark. Basically I was commenting on the operational paradigm at CI, which is pretty much “ritual.” You sign up, you get frozen and it’s pretty much kumbaya, no matter how badly things go. And they go pretty badly. Go to: http://cryonics.org/refs.html#cases and start reading the case reports posted there. That’s pretty much my working definition of horrible. It seems apparent to me that “just getting frozen” is now all that is necessary for a ticket to tomorrow, and that anything else that is done is “just gravy,” and probably unnecessary to a happy outcome. ...Even in cases that CI perfuses, things go horribly wrong – often – and usually for to me bizarre and unfathomable (and careless) reasons. My dear friend and mentor Curtis Henderson was little more than straight frozen because CI President Ben Best had this idea that adding polyethylene glycol to the CPA solution would inhibit edema. Now the thing is, Ben had been told by his own researchers that PEG was incompatible with DMSO containing solutions, and resulted in gel formation. Nevertheless, he decided he would try this out on Curtis Henderson. He did NOT do any bench experiments, or do test mixes of solutions, let alone any animal studies to validate that this approach would in fact help reduce edema (it doesn’t). Instead, he prepared a batch of this untested mixture, and AFTER it gelled, he tried to perfuse Curtis with it. See my introduction to Thus Spake Curtis Henderson on this blog for how this affected me psychologically and emotionally. Needless to say, as soon as he tried to perfuse this goop, perfusion came to a screeching halt. They have pumped air into patient’s circulatory systems… I could go on and on, but all you need to do is really look at those patient case reports and think about everything that is going on in those cases critically.

Source

Trans Time:

The principal criticism against Trans Time was their for-profit model, in which, if funding ran out, the patients would be thawed and conventionally interred (this is what would've happened to Janice Foote and the Mills couple), unlike other organizations with a pay-once model in which the storage costs for the patients are covered in perpetuity.

I should add, Ray Mills was actually removed from suspension and placed in a chest full of dry ice.

You can also consider the now-defunct Cryonics Society of California, though I don't think any of the above organizations would go as far as talking about a non-existent facility in the present tense while the patients lay on the floor, rotting.

Replies from: loup-vaillant, None
comment by loup-vaillant · 2012-08-29T09:56:00.479Z · LW(p) · GW(p)

Okay, looks like I have to lower my probability that « Alcor and CI are serious ». Now this is from over a year ago. Maybe there's some sign things have changed since? I guess not, unless they acquired some Lukeprog-like leadership.

I'll read the whole thing to try and determine to what extent this is incompetence, and to what extent this is scammy (for instance, dust and dirt look like incompetence, but the hardened doors with a plywood roof look a bit more suspect).

Replies from: V_V
comment by V_V · 2012-09-05T00:28:28.872Z · LW(p) · GW(p)

It might be difficult to tell incompetence apart from malice; moreover, it is possible to transition from one to the other:

Let's say you start a cryonics organization with all good intentions, then you start running into problems: costs are higher than expected, mishaps occur during the cryopreservation process, evidence that your process is flawed starts to accumulate and you have no idea on how to fix it, etc. So what do you do?

Apologize for the bad service you sold, thaw and bury the frozen corpses (since you know they are already damaged beyond repair), disband the organization and find a new job, risking legal action? That's what a perfectly honest person would do.

But if you are not perfectly honest, you might find yourself hiding or downplaying technical issues, cutting the costs at the expense of service quality, using deceitful marketing strategies, and so on.

Maybe you could rationalize that the continued existence of your organization is so important that it should be preserved even at the cost of deceiving some people; maybe you could even deceive yourself into ignoring your essentially fraudulent behavior and maintain a positive self-image (if you were attracted to cryonics in the first place, chances are high that you are prone to wishful thinking). But, whatever your intentions are, at this point your business has become a de facto scam.

comment by [deleted] · 2012-08-29T21:52:46.539Z · LW(p) · GW(p)

I don't think any of the above organizations would go as far as talking about a non-existent facility in the present tense while the patients lay on the floor, rotting.

That's a mighty low bar to clear. Thank goodness CI and Alcor have standards.

Replies from: Eudoxia
comment by Eudoxia · 2012-08-29T21:57:02.389Z · LW(p) · GW(p)

Thank goodness CI and Alcor have standards.

Well, I have this theory that CI stores its neuropatients in the dewar with the dead cats in it.

Replies from: None
comment by [deleted] · 2012-08-30T00:04:32.558Z · LW(p) · GW(p)

In seriousness, it just floors me the degree to which every player worth speaking of in the field of cryonics seems to be managed (and micromanaged, at that) by Bad Decision Dinosaur. The concept of suspended animation is not inherently crackpot material; the idea that clinical death and information-theoretic death are different things (with implications for comparative medical treatment in different eras) is actually kind of profound -- yet the history of cryonics is a sordid tale full of expensive boondoggles, fraud, ethical nightmares and positively macabre events. And that's the stuff cryonicists will admit to! Look at that Alcor case: the only way I can avoid shuddering is by imagining it set to Yakety Sax.

Replies from: Bakkot
comment by Bakkot · 2012-08-30T00:59:03.736Z · LW(p) · GW(p)Replies from: Eudoxia, None
comment by Eudoxia · 2012-08-30T01:17:06.004Z · LW(p) · GW(p)

To the best of my knowledge, doctors don't experiment on patients without their consent, drill burr holes without circulation, or generally just do anything they want without fear of prosecution (since cryonics is considered a form of interment, whether the person was completely turned into a glass sculpture or straight-frozen like so many people were does not affect the organizations). Doctors may forget rectal plugs or leave patients if funds are unavailable, though.

What do you define as 'very recently'?

comment by [deleted] · 2012-08-30T01:48:55.037Z · LW(p) · GW(p)

Sure, if you leave out the much longer history and ignore that it was substantially leavened with good faith efforts to restore health, arrest decline and reduce suffering, a substantial number of which also succeed.

(As for "until very recently" -- flagrant abuse still happens in medicine; that's not a thing that recently stopped happening. What I'm saying is that this simply means medicine isn't special as an endeavor... whereas cryonics seems to have little to show for it other than that some bodies are, in fact, vitrified or just garden-variety frozen, depending, many of them even standing a good chance of being reasonably intact after going through the handling process. There's such a vast asymmetry between the two fields; if they were really that comparable, most doctors would be this guy.)

comment by drethelin · 2012-08-28T10:10:42.574Z · LW(p) · GW(p)

Things people are willing to pay lots of money for are a strong signal to unscrupulous people. Examples abound of people running scams as investment advice, counterfeiting art, or selling knock-off designer jewelry. Cryonics is something where you pay a lot of money for a service many years down the line. Someone could easily take in cryonics payments for years without ever having to perform a cryopreservation, and only have it become known after they've disappeared with the profits. Alternately, the impossibility of checking results means that a cryonics provider can profit off of shoddy service and equipment, and you might never realize. On these lines, any organization that is unwilling to let you inspect their preservation equipment etc. is suspect in my eyes. Cryonics organizations are also susceptible to drift in the motives of their owners. Maybe the creators 10 years ago were serious about cryonics, but if the current CEO or board of directors cares more about cutting equipment costs and maximizing profits, then that group might become a de facto scam.

Replies from: loup-vaillant
comment by loup-vaillant · 2012-08-28T13:34:10.343Z · LW(p) · GW(p)

If I understand correctly, I can extract those flags, in descending order of redness:

  • Their cryopreservation facility does not exist (yet).
  • Their cryopreservation facility is not open to scrutiny.
  • Governance shows signs of "for-profit" behaviour, or fails to demonstrate "non-profit" behaviour.
  • Governance merely changed, while you trusted the previous one.

That also suggests signs of trustworthiness:

  • Their cryopreservation facility exists and is open to scrutiny.
  • This is a non-profit with open and clean accounts.
  • They are researching or implementing technical improvements.

I'd like to have more such green and red flags, but this is starting to look actionable. Thank you.

Replies from: FeepingCreature, drethelin
comment by FeepingCreature · 2012-08-28T19:00:18.908Z · LW(p) · GW(p)

One strong signal that I think some cryonics orgs implement is preferentially hiring people who have family members in storage.

Replies from: gwern
comment by gwern · 2012-08-28T19:51:23.293Z · LW(p) · GW(p)

Or pets.

comment by drethelin · 2012-08-28T18:14:04.232Z · LW(p) · GW(p)

In the longer run, the governance of a cryo organization should be designed to try and prevent drift. I like how Alcor requires board members to be signed up as well as to have relatives or significant others signed up, but this still doesn't work against someone who's actually unscrupulous.

comment by Xachariah · 2012-08-27T05:26:37.003Z · LW(p) · GW(p)

Question: Why don't people talk about Ems / Uploads as just as disastrous as uncontrolled AGI? Has there been work done or discussion about the friendliness of Ems / Uploads?

Details: Robin Hanson seems to describe the Em age as a new industrial revolution. Eliezer seems to, well, he seems wary of them but doesn't seem to treat them like an existential threat. Though Nick Bostrom sees them as an existential threat. A lot of people on LessWrong seem to talk of it as the next great journey for humanity, and not just a different name for uFAI. For my part, I can't imagine uploads ending up good. I literally can't imagine it. Every scenario I've tried to imagine ends up with a bad end.

As soon as the first upload is successful then patient zero will realize he's got unimaginable (brain)power, will start talking in ALL CAPS, and go FOOM on the world, bad end. For the sake of argument, let's say we get lucky and the first upload is incredibly nice, and just wants to help people. Eventually the second, or the third, or the twenty-fifth upload decides to FOOM over everybody. It's still bad end. We need to have some way to restrain Ems from FOOM-ing, and we need to figure it out before we start uploading. Okay, let's pretend we could even invent a restraint that works against a determined transhuman who is unimaginably more intelligent than us...

Maybe we'll get as far as, say, Hanson's Em society. Ems make copies of themselves tailored to situations to complete work. Some of these copies will choose to / be able to replicate more than others; these copies will inherit this propensity to replicate; eventually, processor-time / RAM-time / hard-disk space will become scarce and things won't be able to copy as well and will have to fight to not have their processes terminated. Welp... that sounds like the 3 ingredients required to invoke the evolution fairy. Except instead of it being the Darwinian evolution we're used to, this new breed will employ a terrifying mix of uFAI self-modification and Lamarckian super-evolution. Bad end. Okay, but let's say we find some way to stop THAT...

What about other threats? Ems can still talk to one another and convince one another of things. How do we know they won't all be hijacked by meme-viruses, and transformed Agent Smith style? That's a bad end. Or hell, how do we know they won't be hijacked by virus-viruses? Bad end there too. Or one of the trillions of Ems could build a uFAI and it goes FOOM into a Bad End. Or... The potential for Bad Ends is enormous and you only need one for the end of humanity.

It's not like flesh-based humans can monitor the system. Once ems are in the 1,000,000x era, they'll be effectively decoupled from humanity. A revolution could start at 10pm after the evening shift goes home, and by the time the morning shift gets in, it's been 1,000 years in Em subjective time. Hell, in the time it takes to swing an axe and cut the network/power cable, they've had about a month to manage their migration and dissemination to every electronic device in the world. Any regulation has to be built inside the Em system and, as mentioned before, it has to be built before we make the first successful upload.

Maybe we can build an invincible regulator or regulation institution to control it all. But we can't let it self-replicate or we'll be right back at the evolution problem again. And we can't let it be modified by the outside world or it'll be the hijacking problem again. And we can't let it self-modify, or it'll evolve in ways we can't predict (and we've already established that it'll be outside of everything else's control). So now we have an invulnerable regulator/regulation system that needs to control a world of trillions. And once our Ems start living in 1,000,000x space, it needs to keep order for literally millions of years without ever making a mistake once. So we need to design a system perfect enough to never make a single error while handling trillions of agents for millions of years?

That strikes me as a problem that's just as hard as FAI. There seems like no way to solve it that doesn't involve a friendly AGI controlling the upload world.

Can anyone explain to me why Ems are looked at as a competing technology to FAI instead of an existential risk with probability of 1.0?

Replies from: Wei_Dai, Kaj_Sotala, Mitchell_Porter, Eudoxia, Bruno_Coelho
comment by Wei Dai (Wei_Dai) · 2012-08-27T23:12:49.657Z · LW(p) · GW(p)

Has there been work done or discussion about the friendliness of Ems / Uploads?

As soon as the first upload is successful then patient zero will realize he's got unimaginable (brain)power, will start talking in ALL CAPS, and go FOOM on the world, bad end. For the sake of argument, let's say we get lucky and the first upload is incredibly nice, and just wants to help people. Eventually the second, or the third, or the twenty-fifth upload decides to FOOM over everybody. It's still bad end.

Why can't the first upload FOOM, but in a nice way?

That strikes me as a problem that's just as hard as FAI. There seems like no way to solve it that doesn't involve a friendly AGI controlling the upload world.

Some people suggest uploads only as a stepping stone to FAI. But if you read Carl's paper (linked above) there are also ideas for how to create stable superorganisms out of uploads that can potentially solve your regulation problem.

Replies from: Xachariah
comment by Xachariah · 2012-08-28T05:44:18.475Z · LW(p) · GW(p)

Thank you for the links, they were exactly what I was looking for.

As for friendly upload FOOMs, I consider the chance of them happening at random about equivalent to FIA happening at random.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-08-28T07:33:49.437Z · LW(p) · GW(p)

As for friendly upload FOOMs, I consider the chance of them happening at random about equivalent to FIA happening at random.

(I guess "FIA" is a typo for "FAI"?) Why talk about "at random" if we are considering which technology to pursue as the best way to achieve a positive Singularity? From what I can tell, the dangers involved in an upload-based FOOM are limited and foreseeable, and we at least have ideas to solve all of them:

  1. unfriendly values in scanned subject (pick the subject carefully)
  2. inaccurate scanning/modeling (do a lot of testing before running upload at human/superhuman speeds)
  3. value change as a function of subjective time (periodic reset)
  4. value change due to competitive evolution (take over the world and form a singleton)
  5. value change due to self-modification (after forming a singleton, research self-modification and other potentially dangerous technologies such as FAI thoroughly before attempting to apply them)

Whereas FAI could fail in a dangerous way as a result of incorrectly solving one of many philosophical and technical problems (a large portion of which we are still thoroughly confused about) or due to some seemingly innocuous but erroneous design assumption whose danger is hard to foresee.

Replies from: Benja
comment by Benya (Benja) · 2012-08-31T09:27:00.298Z · LW(p) · GW(p)

Wei, do you assume uploading capability would stay local for long stretches of subjective time? If yes, why? (WBE seems to require large-scale technological development, which I'd expect to be fueled by many institutions buying the tech and thus fueling progress -- compare genome sequencing -- so I'd expect multiple places to have the same currently-most-advanced systems at any point in time, or at least to be close to the bleeding edge.) If no, why expect the uploads that go FOOM first to be ones that work hard to improve chances of friendliness, rather than primarily working hard to be the first to FOOM?

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-08-31T17:40:20.304Z · LW(p) · GW(p)

Wei, do you assume uploading capability would stay local for long stretches of subjective time?

No, but there are ways for this to happen that seem more plausible to me than what's needed for FAI to be successful, such as a Manhattan-style project by a major government that recognizes the benefits of obtaining a large lead in uploading technology.

Replies from: Benja
comment by Benya (Benja) · 2012-08-31T19:28:25.227Z · LW(p) · GW(p)

Ok, thanks for clarifying!

comment by Kaj_Sotala · 2012-08-27T07:04:03.635Z · LW(p) · GW(p)

Ems can still talk to one another and convince one another of things. How do we know they won't all be hijacked by meme-viruses, and transformed Agent Smith style?

This one is a little silly. Humans get hijacked by meme-viruses as well, all the time; it does cause problems, but mostly other humans manage to keep them in line.

But as for the rest, yes, I agree with you that an upload scenario would have huge risks as well. Not to mention the fact that there might be a considerable pressure towards uploads merging together and ceasing to be individuals in any meaningful sense of the term. Humanity's future seems pretty hopeless to me.

comment by Mitchell_Porter · 2012-08-27T07:06:53.305Z · LW(p) · GW(p)

Human uploads have been discussed as dangerous. But a friendly AI is viewed as an easier goal than a friendly upload, because an AI can be designed.

comment by Eudoxia · 2012-08-28T02:16:58.345Z · LW(p) · GW(p)

As soon as the first upload is successful then patient zero will realize he's got unimaginable (brain)power, will start talking in ALL CAPS, and go FOOM on the world, bad end.

Now, I have to admit I'm not too familiar with the local discourse re: uploading, but if a functional upload requires emulation down to individual ion channels (PSICS-level) and the chemical environment, I find it hard to believe we'll have the computing power to do that, a million times faster, and in a volume of space small enough that we don't have to put it under a constant waterfall of liquid helium.

I don't expect femtotechnology or rod logic any time soon, the former may not even be possible at all and the latter is based on some dubious math from Nanosystems; so where does that leave us in terms of computing power? (Assuming, of course, that Clarke's law is a wish-fulfilling fantasy). I understand the reach of Bremermann's Limit, but it may not be possible to reach it, or there may be areas in between zero and the Limit that are unreachable for lack of a physical substrate for them.

comment by Bruno_Coelho · 2012-08-28T00:58:09.885Z · LW(p) · GW(p)

Ems have a psychology similar to humans', with additions. I presume they can't escalate as well as AIs, even in coalescence cases.

The possible dangers are much the same as today's, but with more structural changes: they may have artificial agents in their realm, some cheap nanotech, etc. Conflicts have costs too.

comment by moreLytes · 2012-08-27T22:39:58.806Z · LW(p) · GW(p)

Person A and B hold a belief about proposition X.

Person A has purposively sought out, and updated on, evidence related to X since childhood.

Person B has sat on her couch and played video games.

Yet both A and B have arrived at the same degree-of-belief in proposition X.

Does the Bayesian framework equip its adherents with an adequate account of how Person A should be more confident in her conclusion than Person B?

The only viable answer I can think of is that every reasoner should multiply every conclusion by some measure of epistemic confidence, and re-normalize. But I have not yet encountered such a pervasive account of confidence-measurement from leading Bayesian theorists.

Replies from: Kindly
comment by Kindly · 2012-08-28T00:29:39.274Z · LW(p) · GW(p)

If X is just a binary proposition that can be true or false once and for all, and A and B have arrived at the same degree-of-belief, they are equally confident. A has updated on evidence related to X since childhood, and found that it's perfectly balanced in either direction. The only way A can be said to be "more confident" than B is that A has seen a lot of evidence already, so she won't update her conclusion upon seeing the same evidence again; on the other hand, all evidence is new to B.

Things get more interesting if X is some sort of random variable. Let's say we have a bag of black and white marbles. A has seen people draw from the bag 100 times, and 50 of them ended up with white marbles. B only knows the general idea. Now, both of them expect a white marble to come up with 50% probability. But actually, they each have a probability distribution on the fraction of white marbles in the bag. The mean is 1/2 for both of them, but the distribution is flat for B, and has a sharp peak at 1/2 for A. This is what determines how confident they are. If C comes along and says "well, I drew a white marble", then B will update to a new distribution, with mean 2/3, but A's distribution will barely shift at all.
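A quick numerical version of this example, assuming a uniform Beta(1, 1) prior over the fraction of white marbles for both A and B:

```python
def posterior_mean(white_seen, draws_seen):
    # Beta(1 + white, 1 + non-white) posterior over the white fraction
    return (1 + white_seen) / (2 + draws_seen)

# Before C's report, both expect a white marble with probability 1/2:
print(posterior_mean(50, 100), posterior_mean(0, 0))     # 0.5 and 0.5

# After C reports drawing one white marble:
print(posterior_mean(51, 101), posterior_mean(1, 1))     # ~0.505 vs ~0.667
```

A's mean barely moves because her posterior over the fraction is sharply peaked; B's jumps from 1/2 to 2/3.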

Replies from: moreLytes
comment by moreLytes · 2012-08-28T01:51:55.405Z · LW(p) · GW(p)

The example of stochastic evidence is indeed interesting. But I find myself stuck on the first example.

If a new reasoner C were to update Pc(X) based on the testimony of A, and had an extremely high degree of confidence in her ability to generate correct opinions, he would presumably strongly gravitate towards Pa(X).

Alternatively, suppose C is going to update Pc(X) based on the testimony of B. Further, C has evidence outlining B's apathetic proclivities. Therefore, he would presumably only weakly gravitate towards Pb(X).

The above account may be shown to be confused. But if it is not, why can C update based on evidence of informed belief, but A and B are precluded from similarly reflecting on their own testimony? Or, if such introspective activity is not non-normative, should they not strive to perform such an activity consistently?

Replies from: wnoise, alex_zag_al
comment by wnoise · 2012-08-30T05:51:50.479Z · LW(p) · GW(p)

They essentially have already updated on their own testimony.

comment by alex_zag_al · 2012-09-06T15:54:24.648Z · LW(p) · GW(p)

Okay. I'm assuming everyone has the same prior. I'm going to start by comparing the case where C talks to A and learns everything A knows, to the case where C talks to B and learns everything B knows; that is, when C ends up conditioning on all the same things. If you already see why those two cases are very different, you can skip down to the second section, where I talk about what this implies about how C updates when just hearing that A knows a lot and what Pa(X) is, compared to how he updates when learning what B thinks. It's the same scenario as you described: knowlegable A, ignorant B, Pa(X) = Pb(X).

What happens when C learns everything B knows depends on what evidence C already has. If C knows nothing, then after talking to B, Pc(X) = Pb(X), because he'll be conditioning on exactly the same things.

In other words, if C knows nothing, then C is even more ignorant than B is. When he talks to B, he becomes exactly as ignorant as B is, and assigns the probability that you have in that state of ignorance.

It's only if C already has some evidence that talking to A and talking to B becomes different. As Kindly said, Pa(X) is very stable. So once C learns everything that A knows, C ends up with the probability Pa(X|whatever C knew), which is probably a lot like Pa(X). To take an extreme case, if A is well-informed enough, then she already knows everything C knows, and Pa(X|whatever C knew) is equal to Pa(X), and C comes out with exactly the same probability as A. But if C's info is new to A, then it's probably a lot like telling your biochemistry professor about a study that you read weighing in on one side of a debate: she's seen plenty of evidence for both sides, and unless this new study is particularly conclusive, it's not going to change her mind a whole lot.

However, B's probability is not stable. That biochemistry study might change B's mind a lot, because for all she knows, there isn't even a debate, and she has this pretty good evidence for one side of it. So, once C talks to B and learns everything B knows, C will be using the probability that incorporates all of B's knowledge, plus his own: Pb(X|whatever C knew). This is probably farther from Pb(X) aka Pa(X) than Pa(X|whatever C knew).

This is just how it would typically go. I say A's probability is more "stable", but there's actually some evidence that A would recognize as extremely significant that would mean nothing to B. In this case, once C has learned everything A knows, he would also recognize the significance of the little bit of knowledge that he came in with, and end up with a probability far different from Pa(X).


So that's how it would probably go if C actually sits down and learns everything they know. So, what if C just knows that A is knowledgable, and Pa(X)? Well, suppose that C is convinced by my reasoning, that if he sat down with A and learned everything she knew, then her probability of X would end up pretty close to Pa(X).

Here's the key thing: If C expects that, then his probability is already pretty close to Pa(X). All C knows is that A is knowledgable and has Pa(X), but if he expects to be convinced after learning everything A knows, then he already is convinced.

For any event Q, P(X) is equal to the expected value of P(X|the outcome of Q). That is, you don't know the outcome of Q, but if there's N mutually exclusive possible outcomes O_1... O_N, then P(X) = P(X|O_1)P(O_1) + ... + P(X|O_N)P(O_N). This is one way of stating Conservation of Probability. If the expected value of Pc(X|the outcome of learning everything A knows) is pretty close to Pa(X), then, well, Pc(X) must be pretty close too, because the expected value of Pc(X|the outcome of learning everything A knows) is equal to Pc(X).
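A quick numeric check of that identity, with made-up numbers for three possible outcomes:

```python
p_outcome = [0.2, 0.5, 0.3]          # P(O_1), P(O_2), P(O_3)
p_x_given = [0.9, 0.4, 0.1]          # P(X|O_1), P(X|O_2), P(X|O_3)

p_x = sum(px * po for px, po in zip(p_x_given, p_outcome))
print(p_x)   # 0.41: P(X) equals the expected value of P(X | outcome of Q)
```

If your unconditional P(X) differed from this weighted sum, your beliefs would be inconsistent.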

Likewise, if C learns about B's knowledge and Pb(X), and he doesn't think that learning everything B knows would make much of a difference, then he also doesn't end up matching Pb(X) unless he started out matching before he even learned B's testimony.

I've been assuming that A's knowledge makes her probability more "stable"; Pa(X|one more piece of evidence) is close to Pa(X). What if A is knowledgable but unstable? I think it still works out the same way but I haven't worked it out and I have to go.

PS: This is a first attempt on my part. Hopefully it's overcomplicated and overspecific, so we can work out/receive a more general/simple answer. But I saw that nobody else had replied so here ya go.

comment by MileyCyrus · 2012-08-27T11:20:53.979Z · LW(p) · GW(p)

When discussing the repugnant conclusion, Eliezer commented:

I have advocated that "lives barely worth living" always be replaced with "lives barely worth celebrating" in every discussion of the 'Repugnant' Conclusion, to avoid equilibrating between "lives almost but not quite horrible enough to imply that a pre-existing person should commit suicide despite their intrinsic desire to live" versus "lives which we celebrate as good news upon learning about them, and hope to hear more such news in the future, but only to a very slight degree".

In a Big World, it's impossible to create anyone; all you can decide is where to allocate measure among experiences. My utilons for novelty are saturated by the size of reality, and that makes me an average utilitarian. As an average utilitarian, I do indeed accept that "mere addition", i.e., allocation of measure to experiences below-average for the global universe, is bad. If it were, unimaginably, to be demonstrated to me that Earth and its descendants were the only sentient beings in all of Tegmark levels I through IV, then I would embrace the actual creation of new experiences, and accept the Repugnant Conclusion without a qualm.

As I understand it, a "Big World" is a world where every possible person exists in infinite copies. But how does this defeat total utilitarianism? These infinite copies of us exist too far away for us to interact with. If my actions cannot affect these people, why should I consider them when I face an ethical dilemma?

Replies from: Kaj_Sotala, pengvado
comment by Kaj_Sotala · 2012-08-28T10:48:57.613Z · LW(p) · GW(p)

Nick Bostrom in Infinite Ethics terms this "the causal approach" to the problem of infinities, and comments:

An advocate for the causal approach might point out that, according to relativity theory, nobody can influence events outside their future light cone. Cosmology suggests that the number of value-bearing locations (such as lives, or seconds of consciousness etc.) in our future light cone is finite. Given our best current physics, therefore, the causal approach appears to avoid paralysis.

Not so fast. Basing our ethics on an empirical fact about the laws of nature means that it cannot satisfy the highest methodological standard (cf. section 1). Well, we might be able to live with that. But the situation is much worse: the causal approach fails even in the situation we are actually in, thus failing to meet even the lowest possible acceptability criterion for a moral theory. This is because reasonable agents might—in fact, should—assign a finite non-zero probability to relativity theory and contemporary cosmology being wrong. When a finite positive probability is assigned to scenarios in which it is possible for us to exert a causal effect on an infinite number of value-bearing locations (in such a way that there is a real number r>0 such that we change the value of each of these locations by at least r), then the expectation value of the causal changes that we can make is undefined. Paralysis will thus strike even when the domain of aggregation is restricted to our causal sphere of influence.

We could attempt to avoid this problem arising from our subjective uncertainty about the correctness of current physics by stipulating that the domain of aggregation should be restricted to our future light cone even if, contrary to special relativity, we could causally affect locations outside it. With this stipulation, we could ignore the physically far-fetched scenarios in which faster-than-light influencing is possible.

This tweak is not as good as it may appear. If, contrary to what current physics leads us to believe, it is in fact possible for us (or for somebody else, perhaps a technologically more advanced civilization) to causally influence events outside our (their) future light cone, moral considerations would still apply to such influencing. According to the present proposal, we should not factor in such considerations even if we thought superluminal propagation of causal influence to be quite likely; and that is surely wrong.

Moreover, even if the propagation of our causal effects is limited by the speed of light, it could still be possible for us to influence an infinite number of locations. This could happen, for instance, in a spatially infinite cyclic spacetime or in a steady-state cosmology.

...though this might not be relevant to Eliezer's actual reasons to reject total utilitarianism, because infinite ethics a la Bostrom would make average utilitarianism just as infeasible:

The threat is not limited to hedonistic utilitarianism. Utilitarian theories that have a broader conception of the good—happiness, preference-satisfaction, virtue, beauty-appreciation, or some objective list of ingredients that make for a good life—face the same problem. So, too, does average utilitarianism, mixed total/average utilitarianism, and prioritarian views that place a premium on the well-being of the worst off. In a canonically infinite world, average utility and most weighted utility measures are just as imperturbable by human agency as is the simple sum of utility.

comment by pengvado · 2012-08-27T17:24:05.609Z · LW(p) · GW(p)

The domain of a utility function is possible states of the world. The whole world, not just the parts you can physically affect. Some utility functions (such as total utilitarianism) can be factored into an integral over spacetime (and over other stuff for Tegmark IV) of some locally-supported function, and some can't. If you have a non-factorable utility function, then even if the world is partitioned into non-interacting pieces x and y and you're in x, the value of y still affects ∂U/∂x, and is thus relevant to decisions.
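A small sketch of the distinction, with made-up welfare numbers: a factorable (total) utility function gives the same marginal value for adding a person to region x regardless of what the unreachable region y looks like, while a non-factorable (average) utility function does not.

```python
def total_utility(x, y):
    return sum(x) + sum(y)                  # factors into local terms

def average_utility(x, y):
    return sum(x + y) / len(x + y)          # does not factor

for y in ([10.0, 10.0, 10.0], [0.0, 0.0, 0.0]):   # two versions of region y
    x_before, x_after = [5.0, 5.0], [5.0, 5.0, 5.0]
    d_total = total_utility(x_after, y) - total_utility(x_before, y)
    d_avg = average_utility(x_after, y) - average_utility(x_before, y)
    print(d_total, d_avg)   # +5.0 either way vs. -0.5 / +0.5 depending on y
```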

comment by kilobug · 2012-08-27T10:45:20.843Z · LW(p) · GW(p)

Do you know any game (video or board game, singleplayer or multiplayer, for adults or kids, I'm interested in all) that makes good use of rationality skills, and trains them?

For example, we could imagine a "Trivial Pursuit" game in which you give your answer, and how confident you're in it. If you're confident in it, you earn more if you're right, but you lose more if you're wrong.

Role-playing games do teach quite a bit about probabilities; they help you "feel" what a 1% chance is, or what it means to have a higher expectation but a higher deviation. Card games like poker probably do too, even if I never played much poker.
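A minimal sketch of the confidence-weighted Trivial Pursuit scoring idea above, using a logarithmic scoring rule; the specific payoff scheme is an assumption for illustration, not part of the original suggestion:

```python
import math

def score(confidence, correct):
    # Pays most when you are confident and right, costs most when you are
    # confident and wrong; stating 50% scores zero either way.
    p = confidence if correct else 1.0 - confidence
    return math.log2(p) + 1.0

print(score(0.9, True), score(0.9, False))   # ~0.85 vs ~-2.32
print(score(0.5, True), score(0.5, False))   # 0.0 either way
```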

Replies from: bradm, Emile, MileyCyrus
comment by bradm · 2012-08-27T19:36:25.320Z · LW(p) · GW(p)

The board game "Wits and Wagers" might qualify for what you are looking for. Game play is roughly as follows: A trivia question is asked and the answer is always a number (e.g., "How many cups of coffee does the average American drink each year?", "How wide, in feet, is an American football field?"). All the players write their estimate on a slip of paper and then they are arranged in numerical order on the board. Everybody then places a bet on the estimate they like the best (it doesn't have to be your own). The estimates near the middle have a low payback (1:1, 2:1) and the estimates near the outside have a larger payback (4:1). If your estimate is closest to the actual number or if you bet on that one, you will get a payback on your bet.

Replies from: MileyCyrus
comment by MileyCyrus · 2012-08-28T04:36:15.864Z · LW(p) · GW(p)

I'll second Wits and Wagers. Great for learning how to calibrate yourself.

comment by Emile · 2012-08-27T11:14:40.237Z · LW(p) · GW(p)

Zendo, Nomic, Eleusis, Master Mind ... and yeah, probably Poker for probability.

Petals around the rose: http://www.borrett.id.au/computing/petals-j.htm

I asked a similarish question here: http://stats.stackexchange.com/questions/28925/good-games-for-learning-statistical-thinking - for example, these games for simple stats.

comment by MileyCyrus · 2012-08-28T04:59:53.131Z · LW(p) · GW(p)

Settlers of Catan isn't a rationality game, but it's great for teaching economics. I play Catan with my little brothers and it has helped them understand concepts like comparative advantage, supply and demand, cartels, opportunity cost, time value of money, and derivatives markets. Just make sure you play with everyone showing what resource cards they have, instead of keeping their resource cards hidden. More interesting trades that way.

comment by Benya (Benja) · 2012-09-01T07:41:17.444Z · LW(p) · GW(p)

In the discussion about AI-based vs. upload-based singularities, and the expected utility of pushing for WBE (whole-brain emulation) first, has it been taken into account that an unfriendly AI is unlikely to do something worse than wiping out humanity, while the same isn't necessarily true in an upload-based singularity? I haven't been able to find discussion of this point, yet (unless you think that Robin's Hardscrapple Frontier scenario would be significantly worse than nonexistence, which it doesn't feel like, to me).

[ETA: To be clear, I'm not trying to argue anything at this point, I'm honestly asking for more info to help me figure out how to think about this.]

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-09-01T10:43:56.406Z · LW(p) · GW(p)

In the discussion about AI-based vs. upload-based singularities, and the expected utility of pushing for WBE (whole-brain emulation) first, has it been taken into account that an unfriendly AI is unlikely to do something worse than wiping out humanity, while the same isn't necessarily true in an upload-based singularity?

"Yes" in the sense that people are aware of the argument, which goes back at least as far as Vernor Vinge, 1993, but "no" in the sense that there are also arguments that it may not be highly unlikely that a failed attempt at FAI will be worse than extinction (especially since some of the FAI proposals, such as Paul Christiano's, are actually very closely related to uploading), and also "no" in the sense that we don't know how to take into account considerations like this one except by using our intuitive judgments which seem extremely unreliable.

Replies from: lukeprog, Benja
comment by lukeprog · 2012-09-01T23:58:12.186Z · LW(p) · GW(p)

The non-negligible chance of waking up to a personal hell-world (including partial or failed revivification) is the main non-akratic reason I'm not signed up for cryonics. I currently think AGI is coming sooner than WBE, but if WBE starts pulling ahead then I would be even more disinclined to sign up for cryonics.

Wei, do you know of any arguments better than XiXiDu's that a failed attempt at FAI could very well be worse than extinction?

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-09-03T23:32:08.870Z · LW(p) · GW(p)

Wei, do you know of any arguments better than XiXiDu's that a failed attempt at FAI could very well be worse than extinction?

I'm not aware of an especially good writeup, but here's a general argument. Any attempt to build an AGI induces a distribution of possible outcomes, and specifically the distribution induced by an attempt at FAI can be thought of as a circle of uncertainty around an FAI in design space. AGIs that cause worse-than-extinction outcomes are clustered around FAIs in design space. So an attempt at FAI may be more likely to hit one of these worse-than-extinction AGIs than an attempt to build an AGI without consideration of Friendliness.

Replies from: lukeprog
comment by lukeprog · 2012-09-03T23:51:59.168Z · LW(p) · GW(p)

AGIs that cause worse-than-extinction outcomes are clustered around FAIs in design space.

Yes, that's the part I'd like to see developed more. Maybe SI or FHI will get around to it eventually, but in the meantime I wouldn't mind somebody like Wei Dai taking a crack at it.

Replies from: Mitchell_Porter, Wei_Dai
comment by Mitchell_Porter · 2012-09-04T00:40:35.081Z · LW(p) · GW(p)

Part of the problem in developing the argument is that you need a detailed concept of what a successful FAI design would look like, in order to then consider what similar-but-failed designs are like.

One approach is to think in terms of the utility function or goal system. Suppose that a true FAI has a utility function combining some long list of elemental values with a scheme for rating their importance. Variations away from this miss an essential value, add a false value, and/or get the recipe for combining elementary values wrong.

Another way to fail is to have the values right in principle but then to apply them wrongly in practice. My favorite example was, what if the AI thinks that some class of programs is conscious, when actually they aren't. It might facilitate the creation of an upload civilization which is only a simulation of utopia and not actually a utopia. It might incorrectly attach moral significance to the nonexistent qualia of programs which aren't conscious but which fake it. (Though neither of these is really "worse than extinction". The first one, taken to its extreme, just is extinction, while the worst I can see coming from the second scenario is a type of "repugnant conclusion" where the conscious beings are made to endure privation for the sake of vast sim-populations that aren't even conscious.)

Still another way to conceptualize "successful FAI design", in order to then think about unsuccessful variations, is to think of the FAI as a developmental trajectory. The FAI is characterized by a set of initial conditions, such as a set of specific answers to the questions: how does it select its utility function, how does it self-modify, how does it obtain appropriate stability of values under self-modification. And then you would consider what goes wrong down the line, if you get one or more of those answers wrong.

comment by Wei Dai (Wei_Dai) · 2012-09-04T20:27:58.409Z · LW(p) · GW(p)

I'm not sure what more can be said about "AGIs that cause worse-than-extinction outcomes are clustered around FAIs in design space". It's obvious, isn't it?

I guess I could write about some FAI approaches being more likely to cause worse-than-extinction outcomes than others. For example, FAIs that are closely related to uploading or try to automatically extract values from humans seem riskier in this regard than FAIs where the values are coded directly and manually. But this also seems obvious and I'm not sure what I can usefully say beyond a couple of sentences.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-09-04T20:56:29.529Z · LW(p) · GW(p)

FWIW, that superhuman environment-optimizers (e.g. AGIs) that obtain their target values from humans using an automatic process (e.g., uploading or extraction) are more likely to cause worse-than-extinction outcomes than those using a manual process (e.g. coding) is not obvious to me.

comment by Benya (Benja) · 2012-09-01T10:53:24.563Z · LW(p) · GW(p)

Thanks!

comment by John_Maxwell (John_Maxwell_IV) · 2012-08-28T19:30:54.786Z · LW(p) · GW(p)

WRT CEV: What happens if my CEV is different than yours? What's the plan for resolving differences between different folks' CEVs? Does the FAI put us all in our own private boxes where we each think we're getting our CEVs, take a majority vote, or what?

Replies from: DanArmak, Dias, shminux
comment by DanArmak · 2012-08-28T20:11:34.585Z · LW(p) · GW(p)

I've asked this several times before. As far as I can make out, no (published) text answers this question. (If I'm wrong I am very interested in learning about it.)

The CEV doc assumes without any proof, not just that we (or a superintelligent FAI) will find a reconciling strategy for CEV, but that such a strategy exists to be found. It assumes that there is a unique such strategy that can be defined in some way that everyone could agree about. This seems to either invite a recursion (everyone does not agree about metaethics, CEV is needed to resolve this, but we don't agree about the CEV algorithm or inputs); or else to involve moral realism.

comment by Dias · 2012-09-09T00:29:35.182Z · LW(p) · GW(p)

Individuals have Volitions and (hopefully) Extrapolatable Volitions. If many people have EVs that 'agree', they interfere constructively (like waves), and that becomes part of the group's Coherent Extrapolated Volition. If they 'disagree' on some issue, they interfere destructively, and CEV has nothing to say on the issue.

(It'd be nice to be able to explain this by saying that individuals have EVs but not CEVs, except clearly they have a degenerate case of CEV if they have an EV.)
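For what it's worth, here is a minimal sketch of the interference picture (issue names and positions are hypothetical, and a dict of stated positions is a huge simplification of what an extrapolated volition would actually be):

```python
# Issues where every extrapolated volition agrees go into the group's CEV;
# issues with any disagreement are left unspecified.

def coherent_extrapolated_volition(extrapolated_volitions):
    """Each EV is a dict mapping an issue to the position that person's EV takes."""
    all_issues = set().union(*(ev.keys() for ev in extrapolated_volitions))
    cev = {}
    for issue in sorted(all_issues):
        positions = {ev[issue] for ev in extrapolated_volitions if issue in ev}
        if len(positions) == 1:      # constructive interference: everyone agrees
            cev[issue] = positions.pop()
        # else: destructive interference, CEV stays silent on this issue
    return cev

alice = {"suffering": "minimize", "pizza_topping": "pineapple"}
bob   = {"suffering": "minimize", "pizza_topping": "anchovies"}
print(coherent_extrapolated_volition([alice, bob]))
# -> {'suffering': 'minimize'}; the pizza question interferes destructively
```

A single EV passed to this function comes straight back out unchanged, which is the degenerate one-person case mentioned in the parenthetical above.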

Replies from: TheOtherDave
comment by TheOtherDave · 2012-09-09T01:30:28.142Z · LW(p) · GW(p)

It needn't be as degenerate as all that, actually, depending on just how coherent the mechanisms generating an individual's volition(s) is/are.

comment by shminux · 2012-08-29T22:21:18.348Z · LW(p) · GW(p)

What happens if my CEV is different than yours?

Then the "coherent" qualifier does not apply, does it? Are you asking how to construct CEV from the multitude of PEVs (P for personal)?

Does the FAI put us all in our own private boxes

Presumably those folks whose PEV does not mind boxing will get boxed, and the rest will have to be reconciled into the CEV, if possible. Or maybe commensurate PEVs get boxed together into partial CEV worlds.

The hard part is what to do with those whose PEV is incompatible with other people having different ideas from theirs. Eh, maybe not that hard. Trickery or termination is always an option when nothing better is available.

comment by Epiphany · 2012-08-29T01:20:46.579Z · LW(p) · GW(p)

Is there a complete list of known / theoretical AI risks anywhere? I searched and couldn't find one.

comment by lukeprog · 2012-08-27T04:11:48.253Z · LW(p) · GW(p)

I can see how the money pump argument demonstrates the irrationality of an agent with cyclic preferences. Is there a more general argument that demonstrates the irrationality of an agent with intransitive preferences of any kind (not merely one with cyclic preferences)?

Replies from: DanielLC, vi21maobk9vp
comment by DanielLC · 2012-08-27T05:01:57.123Z · LW(p) · GW(p)

I don't understand what you mean. Can you give me an example of preferences that are intransitive but not cyclic?

Replies from: Unnamed
comment by Unnamed · 2012-08-27T06:42:22.875Z · LW(p) · GW(p)

A little bit of googling turned up this paper by Gustafsson (2010) on the topic, which says that indifference allows for intransitive preferences that do not create a strict cycle. For instance, A>B, B>C, and C=A.

The obvious move is to add a small bonus e to break the indifference. If A>B, then there exists e>0 such that A>B+e. And if e>0 and C=A, then C+e>A. Since adding the same bonus to both options preserves B>C, we also have B+e>C+e. So A>B+e, B+e>C+e, and C+e>A, which gives you a strict cycle that allows for money pumping. Gustafsson calls this the small-bonus approach.
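Here is a quick illustration of the resulting money pump (the fee, the number of rounds, and the hard-coded cycle are arbitrary choices for the sketch, not anything from Gustafsson's paper):

```python
# Money pump implied by the cycle A > B+e, B+e > C+e, C+e > A.

# prefers[x] is the item the agent strictly prefers to x, and so will pay
# a small fee to trade up to while holding x.
prefers = {"A": "C+e", "C+e": "B+e", "B+e": "A"}

def run_money_pump(start, fee, rounds):
    holding, paid = start, 0
    for _ in range(rounds):
        holding = prefers[holding]   # every trade looks like a strict improvement
        paid += fee
    return holding, paid

holding, paid = run_money_pump("A", fee=1, rounds=9)
print(holding, paid)   # back to 'A', but 9 fee-units poorer, and it never has to stop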

Gustafsson suggests an alternative, using lotteries and applying the principle of dominance. Consider the 4 lotteries:

Lottery 1: heads you get A, tails you get B
Lottery 2: heads you get A, tails you get C
Lottery 3: heads you get B, tails you get A
Lottery 4: heads you get C, tails you get A

Lottery 1 > Lottery 2, because if it comes up tails you prefer Lottery 1 (B>C) and if it comes up heads you are indifferent (A=A).
Lottery 2 > Lottery 3, because if it comes up heads you prefer Lottery 2 (A>B) and if it comes up tails you are indifferent (C=A).
Lottery 3 > Lottery 4, because if it comes up heads you prefer Lottery 3 (B>C) and if it comes up tails you are indifferent (A=A).
Lottery 4 > Lottery 1, because if it comes up tails you prefer Lottery 4 (A>B) and if it comes up heads you are indifferent (C=A).
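So dominance alone gives a strict cycle over the lotteries (1 > 2 > 3 > 4 > 1), without needing any bonus, and the agent can be money-pumped with lottery tickets even though its base preferences contain no strict cycle. A small sketch that just mechanizes those four comparisons (the encoding of the preference relation is mine, for illustration only):

```python
# Base relation: A > B, B > C, C = A (indifference).

from itertools import permutations

def better(x, y):
    """Strict preference over outcomes."""
    return (x, y) in {("A", "B"), ("B", "C")}

def at_least_as_good(x, y):
    return x == y or {x, y} == {"A", "C"} or better(x, y)

def dominates(lx, ly):
    """lx, ly are (heads, tails) outcome pairs; statewise dominance."""
    weakly = all(at_least_as_good(a, b) for a, b in zip(lx, ly))
    strictly_somewhere = any(better(a, b) for a, b in zip(lx, ly))
    return weakly and strictly_somewhere

lotteries = {1: ("A", "B"), 2: ("A", "C"), 3: ("B", "A"), 4: ("C", "A")}
for i, j in permutations(lotteries, 2):
    if dominates(lotteries[i], lotteries[j]):
        print(f"Lottery {i} > Lottery {j}")
# Prints exactly 1 > 2, 2 > 3, 3 > 4, and 4 > 1: a strict cycle among the lotteries.
```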

Replies from: lukeprog
comment by lukeprog · 2012-08-27T17:48:34.765Z · LW(p) · GW(p)

This is the kind of thing I was looking for; thanks!

comment by vi21maobk9vp · 2012-08-27T06:45:11.961Z · LW(p) · GW(p)

Just in case - synchronising the definitions.

I usually consider a relation transitive if "if X≥Y and Y≥Z, then X≥Z" holds for all X, Y, Z.

If this holds, preferences are transitive. Otherwise (assuming preferences are complete), there are some X, Y, Z with X≥Y, Y≥Z, and Z>X. I would call that cyclical.

comment by JQuinton · 2012-08-31T23:20:03.488Z · LW(p) · GW(p)

Don't know if this has been answered, or where to even look for it, but here goes.

Once FAI is achieved and we are into the Singularity, how would we stop this superintelligence from rewriting its "friendly" code to something else and becoming unfriendly?

Replies from: Alicorn
comment by Alicorn · 2012-08-31T23:46:10.830Z · LW(p) · GW(p)

We wouldn't. However, the FAI knows that if it changed its code to unFriendly code, then unFriendly things would happen. It's Friendly, so it doesn't want unFriendly things to happen, so it doesn't want to change its code in such a way as to cause those things - so a proper FAI is stably Friendly. Unfortunately, this works both ways: an AI that wants something else will want to keep wanting it, and will resist attempts to change what it wants.

There's more on this in Omohundro's paper "Basic AI Drives"; relevant keyword is "goal distortion". You can also check out various uses of the classic example of giving Gandhi a pill that would, if taken, make him want to murder people. (Hint: he does not take it, 'cause he doesn't want people to get murdered.)

comment by philh · 2012-08-27T18:03:37.791Z · LW(p) · GW(p)

Dragging up anthropic questions and quantum immortality: suppose I am Schrodinger's cat. I enter the box ten times (each time it has a .5 probability of killing me), and survive. If I started with a .5 belief in QI, my belief is now 1024/1025.

But if you are watching, your belief in QI should not change. (If QI is true, the only outcome I can observe is surviving, so P_me(I survive | QI) = 1. But someone else can observe my death even if QI is true, so P_you(I survive | QI) = 1/1024 = P_you(I survive | ~QI).)
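As a sanity check on the numbers, here is the arithmetic for both updates, using the 0.5 prior and ten box-entries from above (this is only a sketch of the Bayesian bookkeeping; it takes no stand on whether the participant's conditioning on survival is legitimate, which is the point in dispute):

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H | E)."""
    joint_h = p_evidence_given_h * prior
    return joint_h / (joint_h + p_evidence_given_not_h * (1 - prior))

prior = 0.5
p_survive = 0.5 ** 10                          # 1/1024

# Participant: if QI, survival is observed with probability 1.
print(posterior(prior, 1.0, p_survive))        # 1024/1025 ~ 0.99902

# Outside observer: survival has probability 1/1024 either way, so no update.
print(posterior(prior, p_survive, p_survive))  # 0.5
```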

Aumann's agreement theorem says that if we share priors and have mutual knowledge of each others' posteriors, we should each update to hold the same posteriors. Aumann doesn't require that we share observations, but in this case we're doing that too. So, what should we each end up believing? If you update in my direction, then every time anybody does something risky and survives, your belief in QI should go up. But if not, then I'm not allowed to update my belief in QI even if I survive the box once a day for a thousand years. Neither of those seems sensible.

Does Aumann make an implicit assumption that we agree on all possible values of P(evidence | model)? If so, is that a safe assumption to make even in normal applications? (Granted, the assumptions of "common priors" and "Bayesian rationalists" are already unsafe, so this might not cost much.)

Replies from: Kindly
comment by Kindly · 2012-08-27T21:55:04.630Z · LW(p) · GW(p)

I don't think Aumann's agreement theorem is the problem here.

If QI is true, the only outcome I can observe is surviving.

What does it mean for QI to be true or false? What would you expect to happen differently? Certainly, whether or not QI is true, the only outcome you can observe is surviving, so I don't see how you're updating your belief.

Replies from: philh
comment by philh · 2012-08-27T23:16:55.546Z · LW(p) · GW(p)

If QI is true, I expect to observe myself surviving. If QI is false, I expect not to be able to observe anything. I don't know exactly what that means, but I don't feel like this confusion is the problem. I think that surviving thousand-to-one odds must be strong evidence that I am somehow immortal (if you disagree, we can make it 3^^^^3-to-one), and QI is the only form of immortality that I currently assign non-negligible probability to.

I briefly thought that this made QI a somehow privileged hypothesis, because I can't observe the strongest evidence against it (my death). But I don't think that's the case, because there are other observations that would reduce my belief in QI. For example, if wavefunction collapse turns out to be a thing, I understand that would make QI much less likely. (But I don't actually know quantum mechanics beyond Eliezer's sequence, so the actual observations would be along the lines of "people who know QM saying that QI is incompatible with other observations that have been made, and appearing to know what they're talking about".)

Replies from: Kindly
comment by Kindly · 2012-08-27T23:56:03.331Z · LW(p) · GW(p)

If QI is true, you still don't observe anything in 1023/1024 of all worlds. Nothing makes the 1-in-1024 event happen in any case; you just happen to only wake up in the situation where you legitimately get to be surprised about it happening.

Replies from: philh
comment by philh · 2012-08-28T11:11:30.936Z · LW(p) · GW(p)

If QI is true then my probability of observing myself survive is 1. That's pretty much what QI is. It is true that most of my measure does not survive, but I don't think it's relevant in this case.

Replies from: FeepingCreature
comment by FeepingCreature · 2012-08-28T18:58:18.911Z · LW(p) · GW(p)

In 1023/1024 worlds your observer doesn't update on QI, and neither do you. In 1/1024 worlds, you update on QI and so does the version of the person you interact with. ;)

Replies from: philh
comment by philh · 2012-08-28T20:47:07.928Z · LW(p) · GW(p)

The person watching me gives 1/1024 chance of my survival, regardless of whether QI is true or false. So if I survive, he does not update his belief in QI.

(That said, if I observed a 1/3^^^^3 probability, that might well increase my belief in MWI (I'm not sure if it should do, but it would be along the lines of "there's no way I would have observed that unless all possible outcomes were observed by some part of my total measure"). And I'm not sure how MWI could be true but QI false, so it would also increase my belief in QI.

So maybe 1/1024 would do the same, but certainly not to anything like the same extent as personally surviving those odds.)

comment by amcknight · 2012-08-27T01:53:23.436Z · LW(p) · GW(p)

What is anthropic information? What is indexical information? Is there a difference?

comment by Incorrect · 2012-08-30T02:50:33.086Z · LW(p) · GW(p)

The use of external computation (like a human using a computer to solve a math problem or an AI expanding its computational resources) is a special case of inferring information about mathematical statements from your observations about the universe.

What is the general algorithm for accomplishing this in terms of pure observations (no action-observation cycles)? How does the difficulty of the mathematical statements you can infer to be probably true relate to the amount of computation you have expended approximating Solomonoff induction?

comment by shminux · 2012-08-29T22:11:10.335Z · LW(p) · GW(p)

In MWI, do different Everett worlds share the same spacetime?

Replies from: vi21maobk9vp
comment by vi21maobk9vp · 2012-08-30T15:50:28.538Z · LW(p) · GW(p)

As far as I understand, there is still no satisfactory theory that would include both quantum mechanics and general relativity (i.e. the possibility for spacetime not to be the same).

I would expect that in a unified theory the spacetime structure would be part of the state undergoing quantum superposition.

Replies from: shminux
comment by shminux · 2012-08-30T16:30:37.477Z · LW(p) · GW(p)

there is still no satisfactory theory that would include both quantum mechanics and general relativity

That's true. I was wondering what the standard claim that "MWI is just decoherence" has to say about spacetime. Does it also decohere into multiple outcomes? If so, how? Does it require quantum gravity to understand? If it does, then "just decoherence" is not a valid claim on its own.

Replies from: vi21maobk9vp
comment by vi21maobk9vp · 2012-08-31T07:49:36.011Z · LW(p) · GW(p)

Probably it does lose coherence. What specifically that means will have to be shown by a future working theory that has GR and QM as its limit cases...

Whether that will be one of the current research directions called quantum gravity, or something else, is hard to predict.

I have no intuitions here: I am somewhere between a mathematician and a programmer, and have far too little knowledge of physics to try to predict its unknown areas.

comment by tmgerbich · 2012-08-28T07:41:03.736Z · LW(p) · GW(p)

I'm a bit late on this, obviously, but I've had a question that I've always felt was a bit too nonsensical (and no doubt addressed somewhere in the sequences that I haven't found) to bring up but it kinda bugs me.

Do we have any ideas/guesses/starting points about whether "self-awareness" is some kind of weird quirk of our biology and evolution, or whether it would be an inevitable consequence of any general AI?

I realize that's not a super clear definition - I guess I'm talking about that feeling of "existing is going on here" that can't be taken away, even if it turned out that all the evidence I thought I was getting was really just artificial stimulation of a culture of neurons, even if I'm just a whole brain emulation on some computer, even if I'm really locked up in a psych ward somewhere on antipsychotics. My first-hand experience of existing is irrefutable evidence for existence, even if I'm completely wrong about everything besides that.

Since I assume that basically everything about me has a physical correlate, I assume there's some section of my brain that's responsible for processing that. I imagine it would be useful to have awareness of myself in order to simulate future situations, etc. - building models in our heads is something human brains seem quite good at. So could an AI be built without that? Obviously it would have access to its own source code and such... but do we have any information on whether self-awareness/sense of self is just a trick our brains play on us and an accident of evolution, or whether it would be a basic feature of basically any general AI?

Sorry if this question doesn't really make sense!

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2012-08-28T08:56:58.873Z · LW(p) · GW(p)

The question makes sense, but the answers probably won't.

Questions like this are usually approached in an upside-down way. People assume, as you are doing, that reality is "just neurons" or "just atoms" or "just information", then they imagine that what they are experiencing is somehow "just that", and then they try to live with that belief. They will even construct odd ways of speaking, in which elements of the supposed "objective reality" are substituted for subjective or mentalistic terms, in order to affirm the belief.

You're noticing that "self-awareness" or "the feeling that something is happening" or "the feeling that I exist" doesn't feel like it's the same thing as "neurons"; though perhaps you will tell yourself - as Wittgenstein may have done - that you don't actually know what being a pack of neurons should feel like, so how do you know that it wouldn't feel exactly like this?

But if you pay attention to the subjective component of your thought, even when you're thinking objectively or scientifically, you'll notice that the reduction actually goes in the other direction. You don't have any direct evidence of the "objective existence" of neurons or atoms or "information". The part of reality that you do know about is always some "experience" that is "happening", which may include thoughts about an objective world, that match up in some way with elements of the experience. In other words, you don't know that there are neurons or atoms, but you can know that you are having thoughts about these hypothetical objects. If you're really good at observing and analyzing your thoughts, you may even be able to say a lot about the conscious mental activity which goes into making the thought and applying it to experience.

The fundamental problem is that the physical concept of reality is obtained by taking these conscious states and amputating the subjective part, leaving only the "object" end. Clearly there is a sense in which the conscious subject is itself an "object", it's something that exists. But all the dimly apprehended ontological peculiarities which make consciousness what it is, and which give rise to these nebulous names like intentionality and qualia, are not part of the ontology of "objects". If you think about another person, and think of them as possessing a consciousness just as you do, then your object-of-thought is still ontologically a subject, but of course that's not what we do in physics. We just focus on a few attributes which we have learnt to think about rigorously, like space, time, quantity, and then we assume that they are the whole of reality.

That is the essence of the upside-down approach to the problem of physics and consciousness. You have to go in the other direction, acknowledge that consciousness is what it is, and then try to understand physics so that something in physics has all those properties. This isn't easy and it's why I promote certain kinds of quantum brain theories, because quantum mechanics, under certain interpretations, can contain complex "wholes" that might be consciousness. But even if that were true, you would still have one more challenge: find a way to think of the usual mathematical ontology of physics as the superficial functional description, and the lived experience of existing as a glimpse of the true reality, rather than the other way around.

Replies from: Alejandro1
comment by Alejandro1 · 2012-08-28T15:46:56.479Z · LW(p) · GW(p)

Your labeling of physicalism as an "upside-down" approach reminded me of this quote from Schopenhauer, which you would no doubt approve of:

Of all systems of philosophy which start from the object, the most consistent, and that which may be carried furthest, is simple materialism. It regards matter, and with it time and space, as existing absolutely, and ignores the relation to the subject in which alone all this really exists. It then lays hold of the law of causality as a guiding principle or clue, regarding it as a self-existent order (or arrangement) of things, Veritas aeterna, and so fails to take account of the understanding, in which and for which alone causality is. It seeks the primary and most simple state of matter, and then tries to develop all the others from it; ascending from mere mechanism, to chemism, to polarity, to the vegetable and to the animal kingdom. And if we suppose this to have been done, the last link in the chain would be animal sensibility — that is knowledge — which would consequently now appear as a mere modification or state of matter produced by causality. Now if we had followed materialism thus far with clear ideas, when we reached its highest point we would suddenly be seized with a fit of the inextinguishable laughter of the Olympians. As if waking from a dream, we would all at once become aware that its final result — knowledge, which it reached so laboriously, was presupposed as the indispensable condition of its very starting-point, mere matter; and when we imagined that we thought matter, we really thought only the subject that perceives matter; the eye that sees it, the hand that feels it, the understanding that knows it. Thus the tremendous petitio principii reveals itself unexpectedly; for suddenly the last link is seen to be the starting-point, the chain a circle, and the materialist is like Baron Munchausen who, when swimming in water on horseback, drew the horse into the air with his legs.

I still think it is a confused philosophy, but it is a memorable and powerful passage.

Replies from: Mitchell_Porter, NancyLebovitz
comment by Mitchell_Porter · 2012-08-29T06:53:38.911Z · LW(p) · GW(p)

Although Schopenhauer set himself against people like Hegel, his outlook still seems to have been a sort of theistic Berkeleyan idealism, in which everything that exists owes its existence to being the object of a universal consciousness; the difference between Hegel and Schopenhauer being that Hegel calls this universal consciousness rational and good, whereas Schopenhauer calls it irrational and evil, a cosmic Will whose local manifestation in oneself should be annulled through the pursuit of indifference.

But I'm much more like a materialist, in that I think of the world as consisting of external causal interactions between multiple entities, some of which have a mindlike interior, but which don't owe their existence to their being posited by an overarching cosmic mind. I say a large part of the problem is just that physicalism employs an insufficiently rich ontology. Its categories don't include the possibility of "entity with a mindlike interior".

To put it another way: Among all the entities that the world contains, are entities which we can call subjects or persons or thinking beings, and these entities themselves "contain" "ideas of objects" and "experiences of objects". My problem with physicalism is not that it refuses to treat all actual objects as ideas, or otherwise embed all objects into subjects; it is just that it tries to do without the ontological knowledge obtained by self-reflection, which is the only way we know that there are such things as conscious beings, with their specific properties.

Somehow, we possess the capacity to conceive of a self, as well as the capacity to conceive of objects independent of the self. Physicalism tries to understand everything using only this second capacity, and as such is methodologically blind to the true nature of anything to do with consciousness, which can only be approached through the first capacity. This bias produces a "mechanistic, materialistic" concept of the universe, and then we wonder where the ghost in the machine is hiding. The "amputation of the subjective part" really refers to the attempt to do without knowledge of the first kind, when understanding reality.

comment by NancyLebovitz · 2012-08-28T18:47:34.262Z · LW(p) · GW(p)

Have you read Zen and the Art of Motorcycle Maintenance?

The climactic realization is

gung vzzrqvngr rkcrevrapr vf havgnel, ohg gur zvaq dhvpxyl qvivqrf vg vagb jung vf vafvqr gur frys naq bhgfvqr gur frys.

Replies from: DanArmak, Alejandro1
comment by DanArmak · 2012-08-28T20:13:01.435Z · LW(p) · GW(p)

The climactic realization is gung vzzrqvngr rkcrevrapr vf havgnel, ohg gur zvaq dhvpxyl qvivqrf vg vagb jung vf vafvqr gur frys naq bhgfvqr gur frys.

...That's the sound made by a poorly maintained motorcycle.

comment by Alejandro1 · 2012-08-28T19:03:20.703Z · LW(p) · GW(p)

No, I haven't read it; thanks for the recommendation.

comment by William_Quixote · 2012-08-31T21:10:25.297Z · LW(p) · GW(p)

Question on posting norms: What is the community standard for opening a discussion thread about an issue discussed in the sequences? Are there strong norms regarding minimum / maximum length? Is formalism required, or frowned on, or just optional? Thanks

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2012-09-02T06:56:30.478Z · LW(p) · GW(p)

The general rule seems to be "if your post is interesting and well-written enough, it's fine"; hard to say anything more specific than that. No strong norms about length (even quite short posts have been heavily upvoted), and formalism is optional, but a post that's heavy on math or formal logic will probably get fewer readers.

comment by Carinthium · 2013-08-11T12:56:32.315Z · LW(p) · GW(p)

Say you start from merely the axioms of probability. From those, how do you get to the hypothesis that "the existence of the world is probable"? I'm curious to look at it in more detail because I'm not sure if it's philosophically sound or not.

comment by Wei Dai (Wei_Dai) · 2012-09-30T23:05:25.586Z · LW(p) · GW(p)

Has Eliezer written about what theory of meaning he prefers? (Or does anyone want to offer a guess?)

comment by JQuinton · 2012-09-10T20:19:51.730Z · LW(p) · GW(p)

I've also been doing searches for topics related to the singularity and space travel (this thought came up after playing a bit of Mass Effect ^ _ ^). It would seem to me that biological restrictions on space travel wouldn't apply to a sufficiently advanced AI. This AI could colonize other worlds using near speed of light travel with minimal physical payload and harvest the raw materials on some new planet using algorithms programmed in small harvesting bots. If this is possible then it seems to me that unfriendly AI might not be that much of a threat since they would have many more "habitable" worlds to harvest/live on (like Venus or Mars, comets, asteroids, or extra-solar planets).

Another thing. If this is possible it sort of leads to a paradox: Why hasn't it happened already with other intelligent life on other planets?

Replies from: None
comment by [deleted] · 2012-09-10T21:10:29.818Z · LW(p) · GW(p)

It would seem to me that biological restrictions on space travel wouldn't apply to a sufficiently advanced AI. This AI could colonize other worlds using near speed of light travel with minimal physical payload and harvest the raw materials on some new planet using algorithms programmed in small harvesting bots.

Pretty much right.

If this is possible then it seem to me that unfriendly AI might not be that much of a threat since they would have many more "habitable" worlds to harvest/live on (like Venus or Mars, comets, asteroids, or extra-solar planets).

We would eventually like to inhabit the currently uninhabitable planets. Terraforming, self modification, sealed colonies, or some combination of those will eventually make this feasible. At that time, we would rather that those planets not fight back.

Symmetrically, an unfriendly process will not be satisfied with taking merely Mercury, Venus, Mars, Jupiter, Saturn, Uranus, Neptune, and the rest of the universe; it will want to do its thing on earth as well. The choice between "kill the humans and take over earth" and "don't kill the humans and don't take over earth" is independent of the existence of other territory, so the extra territory changes nothing, and it will kill us.

(the short answer is that there is no "satisfied" or "enough" among nonhuman agents.)

Another thing. If this is possible it sort of leads to a paradox: Why hasn't it happened already with other intelligent life on other planets?

You mean the Fermi paradox? You'll have to expand on that, but note that a singularity will expand at near-lightspeed (= we wouldn't see it until it was here), and it will consume all resources (= if it had already been here, we wouldn't exist).

comment by The_Duck · 2012-09-07T06:31:02.308Z · LW(p) · GW(p)

Do people in these parts think that creating new people is a moral good? If so, is it because of utilitarian considerations; i.e. "overall utility is the sum of all people's utility; therefore we should create more people?" Conversely, if you are a utilitarian who sums the utilities of all people, why aren't you vigorously combating falling birth-rates in developed countries? Perhaps you are? Perhaps most people here are not utilitarians of this sort?

Replies from: None
comment by [deleted] · 2012-09-10T21:13:57.141Z · LW(p) · GW(p)

That topic is full of a lot of confusion and everyone seems to have different intuitions.

I for one am not a proper utilitarian, because utilitarianism seems unnaturally simple (when we have no reason to suspect that human value should be simple).

But an additional awesome person seems like a good thing, if it doesn't make the world suck more.

I am (as much as I can) vigorously trying to make lots of money to fund positive-singularity research.

comment by lukeprog · 2012-09-02T18:49:08.803Z · LW(p) · GW(p)

What's that quote, from an Ancient Greek I think, about how the first question in argument should be "What do you mean by that?" and the second question should be "And how do you know that?"

Replies from: siodine
comment by siodine · 2012-09-02T19:01:05.018Z · LW(p) · GW(p)

Sounds like Socrates.

Replies from: lukeprog
comment by lukeprog · 2012-09-02T23:23:04.734Z · LW(p) · GW(p)

I thought there was some pithy quote from him (well, Plato) about it, but I can't find the pithy quote version of the idea.

comment by lukeprog · 2012-08-27T04:09:17.441Z · LW(p) · GW(p)

[duplicate comment; deleted]