Stupid Questions Open Thread

post by Costanza · 2011-12-29T23:23:21.412Z · LW · GW · Legacy · 264 comments

This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high level of understanding of science and technology; relevant questions in those areas are welcome as well. Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent. If a questioner is not sure whether a question is relevant, they should ask it, and also ask whether it's relevant.


comment by [deleted] · 2011-12-30T00:29:00.970Z · LW(p) · GW(p)

Well, hmmm. I wonder if this qualifies as "stupid".

Could someone help me summarize the evidence for MWI in the quantum physics sequence? I tried once, and only came up with 1) the fact that collapse postulates are "not nice" (i.e., nonlinear, nonlocal, and so on) and 2) the fact of decoherence. However, consider the following quote from Many Worlds, One Best Guess (emphasis added):

The debate should already be over. It should have been over fifty years ago. The state of evidence is too lopsided to justify further argument. There is no balance in this issue. There is no rational controversy to teach. The laws of probability theory are laws, not suggestions; there is no flexibility in the best guess given this evidence. Our children will look back at the fact that we were STILL ARGUING about this in the early 21st-century, and correctly deduce that we were nuts.

Is there other evidence as well, then? 1) seems depressingly weak, and as for 2)...

As was mentioned in Decoherence is Falsifiable and Testable, and brought up in the comments, the existence of so-called "microscopic decoherence" (which we have evidence for) is independent of so-called "macroscopic decoherence" (which -- as far as I know, and I would like to be wrong about this -- we do not have empirical evidence for). Macroscopic decoherence seems to imply MWI, but the evidence given in the decoherence subsequence deals only with microscopic decoherence.

I would rather not have this devolve into a debate on MWI and friends -- the EY quote above notwithstanding, I don't think we can classify that question as a "stupid" one. I'm focused entirely on EY's argument for MWI and possible improvements that can be made to it.

Replies from: Will_Newsome, CharlesR, CronoDAS, Dan_Moore, shminux, saturn
comment by Will_Newsome · 2011-12-30T03:26:31.135Z · LW(p) · GW(p)

(There are two different argument sets here: 1) against random collapse, and 2) for MWI specifically. It's important to keep these distinct.)

Replies from: None
comment by [deleted] · 2011-12-30T03:30:52.569Z · LW(p) · GW(p)

Unless I'm missing something, EY argues that evidence against random collapse is evidence for MWI. See that long analogy on Maxwell's equations with angels mediating the electromagnetic force.

Replies from: Will_Newsome
comment by Will_Newsome · 2011-12-30T03:35:54.984Z · LW(p) · GW(p)

It's also evidence for a bunch of other interpretations though, right? I meant "for MWI specifically"; I'll edit my comment to be clearer.

Replies from: None
comment by [deleted] · 2011-12-30T03:40:33.129Z · LW(p) · GW(p)

I agree, which is one of the reasons why I feel 1) alone isn't enough to substantiate "There is no rational controversy to teach," etc.

comment by CharlesR · 2011-12-30T05:49:26.071Z · LW(p) · GW(p)

Quantum mechanics can be described by a set of postulates. (Sometimes five, sometimes four. It depends on how you write them.)

In the "standard" Interpretation, one of these postulates invokes something called "state collapse".

MWI can be described by the same set of postulates without doing that.

When you have two theories that describe the same data, the simpler one is usually the right one.

Replies from: None
comment by [deleted] · 2011-12-30T06:04:46.085Z · LW(p) · GW(p)

This falls under 1) above, and is also covered here below. Was there something new you wanted to convey?

Replies from: KPier
comment by KPier · 2011-12-30T06:12:47.181Z · LW(p) · GW(p)

I think 1) should probably be split into two arguments, then. One of them is that Many Worlds is strictly simpler (by any mathematical formalization of Occam's Razor). The other one is that collapse postulates are problematic (which could itself be split into sub-arguments, but that's probably unnecessary).

Grouping those makes no sense. They can stand (or fall) independently, they aren't really connected to each other, and they look at the problem from different angles.

Replies from: None, JoshuaZ
comment by [deleted] · 2011-12-30T06:18:44.856Z · LW(p) · GW(p)

I think 1) should probably be split into two arguments, then.

Ah, okay, that makes more sense. 1a) (that MWI is simpler than competing theories) would be vastly more convincing than 1b) (that collapse is bad, mkay). I'm going to have to reread the relevant subsequence with 1a) in mind.

Replies from: Will_Newsome
comment by Will_Newsome · 2011-12-30T21:38:59.754Z · LW(p) · GW(p)

I really don't think 1a) is addressed by Eliezer; no offense meant to him, but I don't think he knows very much about interpretations besides MWI (maybe I'm wrong and he just doesn't discuss them for some reason?). E.g. AFAICT the transactional interpretation has what people 'round these parts might call an Occamian benefit in that it doesn't require an additional rule that says "ignore advanced wave solutions to Maxwell's equations". In general these Occamian arguments aren't as strong as they're made out to be.

Replies from: None
comment by [deleted] · 2011-12-30T21:57:38.883Z · LW(p) · GW(p)

If you read Decoherence is Simple while keeping in mind that EY treats decoherence and MWI as synonymous, and ignore the superfluous references to MML, Kolmogorov and Solomonoff, then 1a) is addressed there.

comment by JoshuaZ · 2011-12-30T14:45:13.529Z · LW(p) · GW(p)

One of them is that Many Worlds is strictly simpler (by any mathematical formalization of Occam's Razor).

The claim in parentheses isn't obvious to me and seems probably wrong. If one replaced "any" with "many" or "most", it would seem more reasonable. Why do you assert this applies to any formalization?

Replies from: KPier
comment by KPier · 2011-12-31T21:52:23.254Z · LW(p) · GW(p)

Kolmogorov Complexity/Solomonoff Induction and Minimum Message Length have been proven equivalent in their most-developed forms. Essentially, correct mathematical formalizations of Occam's Razor are all the same thing.

Replies from: None, JoshuaZ
comment by [deleted] · 2011-12-31T22:23:52.733Z · LW(p) · GW(p)

The whole point is superfluous, because nobody is going to sit around and formally write out the axioms of these competing theories. It may be a correct argument, but it's not necessarily convincing.

comment by JoshuaZ · 2012-01-01T02:46:49.196Z · LW(p) · GW(p)

This is a pretty unhelpful way of justifying this sort of thing. Kolmogorov complexity doesn't give a unique result. What programming system one uses as one's basis can change things up to a constant. So simply looking at the fact that Solomonoff induction is equivalent to a lot of formulations isn't really that helpful for this purpose.

Moreover, there are other formalizations of Occam's razor which are not formally equivalent to Solomonoff induction. PAC learning is one natural example.
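(For reference, the "up to a constant" point is the standard invariance theorem for Kolmogorov complexity; nothing here is specific to this thread. For any two universal machines U and V,

\[
\lvert K_U(x) - K_V(x) \rvert \le c_{U,V} \quad \text{for all strings } x,
\]

where the constant c_{U,V} depends on the pair of machines but not on x. Since that constant can be large, a finite comparison between two specific hypotheses can in principle come out differently under different reference machines.)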

comment by CronoDAS · 2011-12-30T03:32:47.271Z · LW(p) · GW(p)

Is it really so strange that people are still arguing over "interpretations of quantum mechanics" when the question of whether atoms existed wasn't settled until one hundred years after John Dalton published his work?

comment by Dan_Moore · 2011-12-30T15:21:24.613Z · LW(p) · GW(p)

From the Wikipedia fine-tuned universe page:

Mathematician Michael Ikeda and astronomer William H. Jefferys have argued that [, upon pre-supposing MWI,] the anthropic principle resolves the entire issue of fine-tuning, as does philosopher of science Elliott Sober. Philosopher and theologian Richard Swinburne reaches the opposite conclusion using Bayesian probability.

(Ikeda & Jeffrey are linked at note 21.)

In a nutshell, MWI provides a mechanism whereby a spectrum of universes is produced, some life-friendly and some life-unfriendly. Consistent with the weak anthropic principle, life can only exist in the life-friendly (hence fine-tuned) universes. So, MWI provides an explanation of observed fine-tuning, whereas the standard QM interpretation does not.

Replies from: Nisan, Nisan
comment by Nisan · 2012-01-02T06:05:40.968Z · LW(p) · GW(p)

That line of reasoning puzzles me, because the anthropic-principle explanation of fine tuning works just fine without MWI: Out of all the conceivable worlds, of course we find ourselves in one that is habitable.

Replies from: Manfred
comment by Manfred · 2012-01-02T06:19:40.624Z · LW(p) · GW(p)

This only works if all worlds that follow the same fundamental theory exist in the same way our local neighborhood exists. If all of space has just one set of constants even though other values would fit the same theory of everything equally well, the anthropic principle does not apply, and so the fact that the universe is habitable is ordinary Bayesian evidence for something unknown going on.

Replies from: Nisan
comment by Nisan · 2012-01-02T06:40:16.747Z · LW(p) · GW(p)

The word "exist" doesn't do any useful work here. There are conceivable worlds that are different from this one, and whether they exist depends on the definition of "exist". But they're still relevant to an anthropic argument.

The habitability of the universe is not evidence of anything because the probability of observing a habitable universe is practically unity.
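(Written as a conditional probability, with the conditioning doing the anthropic work, the claim is just:

\[
P(\text{the universe we observe is habitable} \mid \text{we exist to observe it}) \approx 1 . \,)
\]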

Replies from: TheOtherDave, Dan_Moore, Manfred
comment by TheOtherDave · 2012-01-04T16:43:43.914Z · LW(p) · GW(p)

Can you clarify why a conceivable world that doesn't exist in the conventional sense of existing is relevant to an anthropic argument?

I mean, if I start out as part of a group of 2^10 people, and that group is subjected to an iterative process whereby we split the group randomly into equal subgroups A and B and kill group B, then at every point along the way I ought to expect to have a history of being sorted into group A if I'm alive, but I ought not expect to be alive very long. This doesn't seem to depend in any useful way on the definition of "alive."
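(For concreteness, the numbers in this example, assuming a fair fifty-fifty split each round, are:

\[
P(\text{alive after all 10 rounds}) = (1/2)^{10} = \tfrac{1}{1024} \approx 0.001,
\qquad
P(\text{all-A history} \mid \text{alive}) = 1 . \,)
\]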

Is it different for universes? Why?

Replies from: Nisan
comment by Nisan · 2012-01-04T19:13:18.799Z · LW(p) · GW(p)

I mean, if I start out as part of a group of 2^10 people, and that group is subjected to an iterative process whereby we split the group randomly into equal subgroups A and B and kill group B, then at every point along the way I ought to expect to have a history of being sorted into group A if I'm alive, but I ought not expect to be alive very long. This doesn't seem to depend in any useful way on the definition of "alive."

I agree with all that. I don't quite see where that thought experiment fits into the discussion here. I see that the situation where we have survived that iterative process is analogous to fine-tuning with MWI, and I agree that fine-tuning is unsurprising given MWI. I further claim that fine-tuning is unsurprising even in a non-quantum universe. Let me describe the thought experiment I have in mind:

Imagine a universe with very different physics. (1) Suppose the universe, by nature, splits into many worlds shortly after the beginning of time, each with different physical constants, only one of which allows for life. The inhabitants of that one world ought not to be surprised at the fine-tuning they observe. This is analogous to fine-tuning with MWI.

(2) Now suppose the universe consists of many worlds at its inception, and these other worlds can be observed only with great difficulty. Then the inhabitants still ought not to be surprised by fine-tuning.

(3) Now suppose the universe consists of many worlds from its inception, but they are completely inaccessible, and their existence can only be inferred from the simplest scientific model of the universe. The inhabitants still ought not to be surprised by fine-tuning.

(4) Now suppose the simplest scientific model describes only one world, but the physical constants are free parameters. You can easily construct a parameterless model that says "a separate world exists for every choice of parameters somehow", but whether this means that those other worlds "exist" is a fruitless debate. The inhabitants still ought not to be surprised by fine-tuning. This is what I mean when I say that fine-tuning is not surprising even without MWI.

In cases (1)-(4), the inhabitants can make an anthropic argument: "If the physical constants were different, we wouldn't be here to wonder about them. We shouldn't be surprised that they allow us to exist." Does that make sense?

Replies from: TheOtherDave
comment by TheOtherDave · 2012-01-04T19:21:52.107Z · LW(p) · GW(p)

Ah, I see.

Yes, I agree: as long as there's some mechanism for the relevant physical constants to vary over time, anthropic arguments for the "fine-tuned" nature of those constants can apply; anthropic arguments don't let us select among such mechanisms.

Thanks for clarifying.

Replies from: Nisan
comment by Nisan · 2012-01-04T19:43:36.123Z · LW(p) · GW(p)

Hm, only the first of the four scenarios in the grandparent involves physical constants varying over time. But yes, anthropic arguments don't distinguish between the scenarios.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-01-04T20:05:14.166Z · LW(p) · GW(p)

Huh. Then I guess I didn't understand you after all.

You're saying that in scenario 4, the relevant constants don't change once set for the first time?

In that case this doesn't fly. If setting the constants is a one-time event in scenario 4, and most possible values don't allow for life, then while I ought not be surprised by the fine-tuning given that I observe something (agreed), I ought to be surprised to observe anything at all.

That's why I brought up the small-scale example. In that example, I ought not be surprised by the history of A's given that I observe something, but I ought to be surprised to observe anything in the first place. If you'd asked me ahead of time whether I would survive I'd estimate about a .001 chance (1 in 1024)... a low-probability event.

If my current observed environment can be explained by positing scenarios 1-4, and scenario 4 requires assuming a low-probability event that the others don't, that seems like a reason to choose 1-3 instead.

Replies from: Nisan
comment by Nisan · 2012-01-04T20:39:16.186Z · LW(p) · GW(p)

You're saying that in scenario 4, the relevant constants don't change once set for the first time?

I'm saying that in all four scenarios, the physical constants don't change once set for the first time. And in scenarios (2)-(4), they are set at the very beginning of time.

I was confused as to why you started talking about changing constants, but it occurs to me that we may have different ideas about how the MWI explanation of fine-tuning is supposed to run. I admit I'm not familiar with cosmology. I imagine the Big Bang occurs, the universal wavefunction splits locally into branches, the branches cool down and their physical constants are fixed, and over the next 14 billion years they branch further but their constants do not change, and then life evolves in some of them. Were you imagining our world constantly branching into other worlds with slightly different constants?

Replies from: TheOtherDave
comment by TheOtherDave · 2012-01-04T21:06:38.123Z · LW(p) · GW(p)

No, I wasn't; I don't think that's our issue here.

Let me try it this way. If you say "I'm going to roll a 4 on this six-sided die", and then you roll a 4 on a six-sided die, and my observations of you are equally consistent with both of the following theories:
Theory T1: You rolled the die exactly once, and it came up a 4
Theory T2: You rolled the die several times, and stopped rolling once it came up 4
...I should choose T2, because the observed result is less surprising given T2 than T1.

Would you agree? (If you don't agree, the rest of this comment is irrelevant: that's an interesting point of disagreement I'd like to explore further. Stop reading here.)

OK, good. Just to have something to call it, let's call that the Principle of Least Surprise.

Now, suppose that in all scenarios constants are set shortly after the creation of a world, and do not subsequently change, but that the value of a constant is indeterminate prior to being set. Suppose further that life-supporting values of constants are extremely unlikely. (I think that's what we both have been supposing all along, I just want to say it explicitly.)

In scenario 1-3, we have multiple worlds with different constants. Constants that support life are unlikely, but because there are multiple worlds, it is not surprising that at least one world exists with constants that support life. We'd expect that, just like we'd expect a six-sided die to come up '4' at least once if tossed ten times. We should not be surprised that there's an observer in some world, and that world has constants that support life, in any of these cases.

In scenario 4, we have one world with one set of constants. It is surprising that that world has life-supporting constants. We ought not expect that, just like we ought not expect a six-sided die to come up '4' if tossed only once. We should be surprised that there's an observer in some world.

So. If I look around, and what I observe is equally consistent with scenarios 1-4, the Principle of Least Surprise tells me I should reject scenario 4 as an explanation.

Would you agree?
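(A minimal sketch of the likelihood comparison being appealed to here, under the assumption that T2 means "roll until a 4 comes up, then show me that roll":)

```python
from fractions import Fraction

# Likelihood of the reported outcome "the shown roll is a 4" under each theory.
# T1: the die is rolled exactly once and that roll is shown.
p_given_T1 = Fraction(1, 6)

# T2: the die is rolled until it comes up 4 and that final roll is shown,
# so the shown roll is a 4 with certainty.
p_given_T2 = Fraction(1, 1)

# With equal priors, the posterior odds equal the likelihood ratio.
print(p_given_T2 / p_given_T1)  # -> 6: the observation favors T2 six to one
```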

Replies from: Nisan
comment by Nisan · 2012-01-05T00:11:24.055Z · LW(p) · GW(p)

Let me try it this way. If you say "I'm going to roll a 4 on this six-sided die", and then you roll a 4 on a six-sided die, and my observations of you are equally consistent with both of the following theories:
Theory T1: You rolled the die exactly once, and it came up a 4
Theory T2: You rolled the die several times, and stopped rolling once it came up 4
...I should choose T2, because the observed result is less surprising given T2 than T1.

Would you agree? (If you don't agree, the rest of this comment is irrelevant: that's an interesting point of disagreement I'd like to explore further. Stop reading here.)

This bit is slightly ambiguous. I would agree if Theory T1 were replaced by "You decided to roll the die exactly once and then show me the result", Theory T2 were replaced by "You decided to roll the die until it comes up '4', and then show me the result", and the two theories had equal prior probability. I think this is probably what you meant, so I'll move on.

In scenario 1-3, we have multiple worlds with different constants. Constants that support life are unlikely, but because there are multiple worlds, it is not surprising that at least one world exists with constants that support life. We'd expect that, just like we'd expect a six-sided die to come up '4' at least once if tossed ten times. We should not be surprised that there's an observer in some world, and that world has constants that support life, in any of these cases.

I agree that we should not be surprised. Although I have reservations about drawing this analogy, as I'll explain below.

In scenario 4, we have one world with one set of constants. It is surprising that that world has life-supporting constants. We ought not expect that, just like we ought not expect a six-sided die to come up '4' if tossed only once. We should be surprised that there's an observer in some world.

If we take scenario 4 as I described it — there's a scientific model where the constants are free parameters, and a straightforward parameterless modification of the model (of equal complexity) that posits one universe for every choice of constants — then I disagree; we should not be surprised. I disagree because I think the die-rolling scenario is not a good analogy for scenarios 1-4, and scenario 4 resembles Theory T2 at least as much as Theory T1.

  • Scenario 4 as I described it basically is scenario 3. The theory with free parameters isn't a complete theory, and the parameterless theory sorta does talk about other universes which kind of exist, in the sense that a straightforward interpretation of the parameterless theory talks about other universes. So scenario 4 resembles Theory T2 at least as much as it resembles Theory T1.

  • You could ask why we can't apply the same argument in the previous bullet point to the die-rolling scenario and conclude that Theory T1 is just as plausible as Theory T2. (If you don't want to ask that, please ignore the rest of this bullet point, as it could spawn an even longer discussion.) We can't because the scenarios differ in essential ways. To explain further I'll have to talk about Solomonoff induction, which makes me uncomfortable. The die-rolling scenario comes with assumptions about a larger universe with a causal structure such that (Theory T1 plus the observation '4') has greater K-complexity than (Theory T2 plus the observation '4'). But the hack that turns the theory in scenario 4 into a parameterless theory doesn't require much additional K-complexity.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-01-05T18:52:05.855Z · LW(p) · GW(p)

I didn't really follow this, I'm afraid.

It seems to follow from what you're saying that the assertions "a world containing an observer exists in scenario 4" and "a world containing an observer doesn't exist in scenario 4" don't make meaningfully different claims about scenario 4, since we can switch from a model that justifies the first to a model that justifies the second without any cost worth considering.

If that's right, then I guess it follows from the fact that I should be surprised to observe an environment in scenario 4 that I should not be surprised to observe an environment in scenario 4, and vice-versa, and there's not much else I can think of to say on the subject.

comment by Dan_Moore · 2012-01-04T15:58:07.200Z · LW(p) · GW(p)

By 'explain observed fine-tuning', I mean 'answer the question why does there exist a universe (which we inhabit) which is fine-tuned to be life-friendly.' The anthropic principle, while tautologically true, does not answer this question, in my view.

In other words, the existence of life does not cause our universe to be life-friendly (of course it implies that the universe is life friendly); rather, the life-friendliness of our universe is a prerequisite for the existence of life.

Replies from: Nisan
comment by Nisan · 2012-01-04T19:38:52.560Z · LW(p) · GW(p)

We may have different ideas of what sort of answers a "why does this phenomenon occur?" question deserves. You seem to be looking for a real phenomenon that causes fine-tuning, or which operates at a more fundamental level of nature. I would be satisfied with a simple, plausible fact that predicts the phenomenon. In practice, the scientific hypotheses with the greatest parsimony and predictive power tend to be causal ones, or hypotheses that explain observed phenomena as arising from more fundamental laws. But the question of where the fundamental constants of nature come from will be an exception if they are truly fundamental and uncaused.

comment by Manfred · 2012-01-02T07:17:47.515Z · LW(p) · GW(p)

You're right that observing that we're in a habitable universe doesn't tell us anything. However, there are a lot more observations about the universe that we use in discussions about quantum mechanics. And some observations suit the idea that we know what's going on better than others. "Know what's going on" here means that a theory sufficient to explain all of reality in our local neighborhood also holds more globally.

comment by Nisan · 2012-01-04T20:50:01.297Z · LW(p) · GW(p)

I glanced at Ikeda & Jefferys, and they seem to explicitly not presuppose MWI:

our argument is not dependent on the notion that there are many other universes.

At first glance, they seem to render the fine-tuning phenomenon unsurprising using only an anthropic argument, without appealing to multiverses or a simulator. I am satisfied that someone has written this down.

comment by shminux · 2011-12-30T00:51:23.352Z · LW(p) · GW(p)

As a step toward this goal, I would really appreciate someone rewriting the post you mentioned to sound more like science and less like advocacy. I tried to do that, but got lost in the forceful emotional assertions about how collapse is a gross violation of Bayes, and how "The discussion should simply discard those particular arguments and move on."

comment by saturn · 2011-12-30T00:43:33.543Z · LW(p) · GW(p)

Here's some evidence for macroscopic decoherence.

Replies from: Manfred, shminux
comment by Manfred · 2011-12-30T06:39:29.517Z · LW(p) · GW(p)

The interpretations of quantum mechanics that this sort of experiment tests are not all the same as the ones Eliezer argues against. You can have "one world" interpretations that appear exactly identical to many-worlds, and indeed that's pretty typical.

Maybe I should have written this in reply to the original post.

comment by shminux · 2011-12-30T01:08:27.890Z · LW(p) · GW(p)

Actually, this is evidence for making a classical object behave in a quantum way, which seems like the opposite of decoherence.

Replies from: saturn
comment by saturn · 2011-12-30T01:35:12.925Z · LW(p) · GW(p)

I don't understand your point. How would you demonstrate macroscopic decoherence without creating a coherent object which then decoheres?

comment by MileyCyrus · 2011-12-30T00:35:49.456Z · LW(p) · GW(p)

If the SIAI engineers figure out how to construct friendly super-AI, why would they care about making it respect the values of anyone but themselves? What incentive do they have to program an AI that is friendly to humanity, and not just to themselves? What's stopping LukeProg from appointing himself king of the universe?

Replies from: Dr_Manhattan, lukeprog, jimrandomh, John_Maxwell_IV, Larks, Zed, falenas108, John_Maxwell_IV, Solvent, FiftyTwo, RomeoStevens
comment by Dr_Manhattan · 2011-12-30T02:50:45.320Z · LW(p) · GW(p)

Not an answer, but a solution:

You know what they say the modern version of Pascal's Wager is? Sucking up to as many Transhumanists as possible, just in case one of them turns into God. -- Julie from Crystal Nights by Greg Egan

:-p

comment by lukeprog · 2011-12-30T02:18:29.687Z · LW(p) · GW(p)

What's stopping LukeProg from appointing himself king of the universe?

Personal abhorrence at the thought, and lack of AI programming abilities. :)

(But, your question deserves a more serious answer than this.)

Replies from: TrueBayesian, orthonormal, Armok_GoB
comment by TrueBayesian · 2011-12-30T19:35:58.022Z · LW(p) · GW(p)

Too late - Eliezer and Will Newsome are already dual kings of the universe. They balance each other's reigns in a Yin/Yang kind of way.

comment by orthonormal · 2011-12-30T18:18:32.372Z · LW(p) · GW(p)

This is basically what I was asking before. Now, it seems to me highly unlikely that SIAI is playing that game, but I still want a better answer than "Trust us to not be supervillains".

comment by Armok_GoB · 2011-12-30T15:34:43.945Z · LW(p) · GW(p)

Serious or not, it seems correct. There might be some advanced game theory that says otherwise, but it only applies to those who know the game theory.

comment by jimrandomh · 2011-12-31T08:23:49.575Z · LW(p) · GW(p)

Lots of incorrect answers in other replies to this one. The real answer is that, from Luke's perspective, creating Luke-friendly AI and becoming king of the universe isn't much better than creating regular friendly AI and getting the same share of the universe as any other human. Because it turns out, after the first thousand galaxies' worth of resources and a trillion trillion millennia of lifespan, you hit such diminishing returns that having another seven billion times as many resources isn't a big deal.

This isn't true for every value - he might assign value to certain things not existing, like powerful people besides him, which other people want to exist. And that last factor of seven billion is worth something. But these are tiny differences in value, utterly dwarfed by the reduced AI-creation success-rate that would happen if the programmers got into a flamewar over who should be king.
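(One way to make the diminishing-returns point concrete, under the additional assumption, not stated above, that utility is logarithmic in resources:

\[
u(R) = \log_2 R \;\Rightarrow\; u(7 \times 10^9 \cdot R) - u(R) = \log_2(7 \times 10^9) \approx 33,
\]

a fixed additive gain that stays the same no matter how many galaxies' worth of resources R already is.)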

comment by John_Maxwell (John_Maxwell_IV) · 2011-12-31T06:35:31.030Z · LW(p) · GW(p)

The good guys do not write an AI which values a bag of things that the programmers think are good ideas, like libertarianism or socialism or making people happy or whatever. There were multiple Overcoming Bias sequences about this one point, like the Fake Utility Function sequence and the sequence on metaethics. It is dealt with at length in the document Coherent Extrapolated Volition. It is the first thing, the last thing, and the middle thing that I say about Friendly AI.

...

The good guys do not directly impress their personal values onto a Friendly AI.

http://lesswrong.com/lw/wp/what_i_think_if_not_why/

The rest of your question has the same answer as "why is anyone altruist to begin with", I think.

Replies from: MileyCyrus
comment by MileyCyrus · 2011-12-31T06:45:58.490Z · LW(p) · GW(p)

I understand CEV. What I don't understand is why the programmers would ask the AI for humanity's CEV, rather than just their own CEV.

Replies from: wedrifid, TheOtherDave, John_Maxwell_IV
comment by wedrifid · 2011-12-31T07:04:36.739Z · LW(p) · GW(p)

I understand CEV. What I don't understand is why the programmers would ask the AI for humanity's CEV, rather than just their own CEV.

The only (sane) reason is for signalling - it's hard to create FAI without someone else stopping you. Given a choice, however, CEV<personal> is strictly superior. If you actually do want to have FAI<humanity> then FAI<personal> will be equivalent to it. But if you just think you want FAI<humanity> but it turns out that, for example, FAI<humanity> gets dominated by jerks in a way you didn't expect, then FAI<personal> will end up better than FAI<humanity>... even from a purely altruistic perspective.

comment by TheOtherDave · 2011-12-31T06:58:39.721Z · LW(p) · GW(p)

Yeah, I've wondered this for a while without getting any closer to an understanding.

It seems that everything that some human "really wants" (and therefore could potentially be included in the CEV target definition) is either something that, if I was sufficiently well-informed about it, I would want for that human (in which case my CEV, properly unpacked by a superintelligence, includes it for them) or is something that, no matter how well informed I was, I would not want for that human (in which case it's not at all clear that I ought to endorse implementing it).

If CEV-humanity makes any sense at all (which I'm not sure it does), it seems that CEV-arbitrary-subset-of-humanity leads to results that are just as good by the standards of anyone whose standards are worth respecting.

My working answer is therefore that it's valuable to signal the willingness to do so (so nobody feels left out), and one effective way to signal that willingness consistently and compellingly is to precommit to actually doing it.

comment by John_Maxwell (John_Maxwell_IV) · 2011-12-31T06:54:06.079Z · LW(p) · GW(p)

Is this question any different from the question of why there are altruists?

Replies from: TheOtherDave
comment by TheOtherDave · 2011-12-31T07:23:09.899Z · LW(p) · GW(p)

Sure. For example, if I want other people's volition to be implemented, that is sufficient to justify altruism. (Not necessary, but sufficient.)

But that doesn't justify directing an AI to look at other people's volition to determine its target directly... as has been said elsewhere, I can simply direct an AI to look at my volition, and the extrapolation process will naturally (if CEV works at all) take other people's volition into account.

comment by Larks · 2011-12-30T05:14:24.973Z · LW(p) · GW(p)

I think it would be significantly easier to make FAI than LukeFriendly AI: for the latter, you need to do most of the work involved in the former, but also work out how to get the AI to find you (and not accidentally be friendly to someone else).

If it turns out that there's a lot of coherence in human values, FAI will resemble LukeFriendlyAI quite closely anyway.

Replies from: wedrifid, TheOtherDave, Armok_GoB
comment by wedrifid · 2011-12-31T08:42:34.208Z · LW(p) · GW(p)

I think it would be significantly easier to make FAI than LukeFriendly AI

Massively backwards! Creating an FAI (presumably 'friendly to humanity') requires an AI that can somehow harvest and aggregate preferences over humans in general, but a LukeFriendly AI just needs to scan one brain.

Replies from: Larks
comment by Larks · 2011-12-31T21:16:12.034Z · LW(p) · GW(p)

Scanning is unlikely to be the bottleneck for a GAI, and it seems most of the difficulty with CEV is from the Extrapolation part, not the Coherence.

Replies from: wedrifid
comment by wedrifid · 2011-12-31T21:54:32.565Z · LW(p) · GW(p)

Scanning is unlikely to be the bottleneck for a GAI, and it seems most of the difficulty with CEV is from the Extrapolation part, not the Coherence.

It doesn't matter how easy the parts may be, scanning, extrapolating and cohering all of humanity is harder than scanning and extrapolating Luke.

Replies from: torekp
comment by torekp · 2012-01-02T18:48:35.797Z · LW(p) · GW(p)

Not if Luke's values contain pointers to all those other humans.

comment by TheOtherDave · 2011-12-30T05:19:26.049Z · LW(p) · GW(p)

If FAI is HumanityFriendly rather than LukeFriendly, you have to work out how to get the AI to find humanity and not accidentally optimize for the extrapolated volition of some other group. It seems easier to me to establish parameters for "finding" Luke than for "finding" humanity.

Replies from: Larks
comment by Larks · 2011-12-30T05:29:22.893Z · LW(p) · GW(p)

Yes, it depends on whether you think Luke is more different from humanity than humanity is from StuffWeCareNotOf.

Replies from: TheOtherDave
comment by TheOtherDave · 2011-12-30T10:36:34.396Z · LW(p) · GW(p)

Of course an arbitrarily chosen human's values are more similar to the aggregated values of humanity as a whole than humanity's values are similar to an arbitrarily chosen point in value-space. Value-space is big.

I don't see how my point depends on that, though. Your argument here claims that "FAI" is easier than "LukeFriendlyAI" because LFAI requires an additional step of defining the target, and FAI doesn't require that step. I'm pointing out that FAI does require that step. In fact, target definition for "humanity" is a more difficult problem than target definition for "Luke".

comment by Armok_GoB · 2011-12-30T15:40:54.627Z · LW(p) · GW(p)

I find it much more likely that it's the other way around; making one for a single brain that already has a utility function seems much easier than finding a good compromise between billions. Especially if the approach "upload me, then perform this specific type of enhancement to enable me to safely continue self-improving" turns out to be safe enough.

comment by Zed · 2011-12-30T01:58:45.765Z · LW(p) · GW(p)

Game theory. If different groups compete in building a "friendly" AI that respects only their personal coherent extrapolated volition (extrapolated sensible desires), then cooperation is no longer an option because the other teams have become "the enemy". I have a value system that is substantially different from Eliezer's. I don't want a friendly AI that is created in some researcher's personal image (except, of course, if it's created based on my ideals). This means that we have to sabotage each other's work to prevent the other researchers from getting to friendly AI first. This is because the moment somebody reaches "friendly" AI the game is over and all parties except for one lose. And if we get uFAI everybody loses.

That's a real problem though. If different factions in friendly AI research have to destructively compete with each other, then the probability of unfriendly AI will increase. That's real bad. From a game theory perspective all FAI researchers agree that any version of FAI is preferable to uFAI, and yet they're working towards a future where uFAI is becoming more and more likely! Luckily, if the FAI researchers take the coherent extrapolated volition of all of humanity, the problem disappears. All FAI researchers can work toward a common goal that will fairly represent all of humanity, not some specific researcher's version of "FAI". It also removes the problem of different morals/values. Some people believe that we should look at total utility, other people believe we should consider only average utility. Some people believe abstract values matter, some people believe consequences of actions matter most. Here too the solution of an AI that looks at a representative set of all human values is the solution that all people can agree on as most "fair". Cooperation beats defection.

If Luke were to attempt to create a LukeFriendlyAI he knows he's defecting from the game-theoretically optimal strategy and thereby increasing the probability of a world with uFAI. If Luke is aware of this and chooses to continue on that course anyway then he's just become another uFAI researcher who actively participates in the destruction of the human species (to put it dramatically).

We can't force all AI programmers to focus on the FAI route. We can try to raise the sanity waterline and try to explain to AI researchers that the optimal (game theoretically speaking) strategy is the one we ought to pursue because it's most likely to lead to a fair FAI based on all of our human values. We just have to cooperate, despite differences in beliefs and moral values. CEV is the way to accomplish that because it doesn't privilege the AI researchers who write the code.
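(A toy numerical version of this argument. Every number below is invented purely for illustration; the only structural assumptions, both taken from the argument above, are that any FAI outcome beats uFAI for everyone and that racing and mutual sabotage raise the probability of uFAI:)

```python
# My utilities for the possible outcomes (made-up values; only the ordering matters).
UFAI, HUMANITY_CEV, MY_CEV, RIVAL_CEV = 0.0, 0.90, 1.00, 0.80

# Everyone cooperates on humanity's CEV: P(uFAI) stays low, say 0.2.
coop = 0.8 * HUMANITY_CEV + 0.2 * UFAI

# Everyone races for their own personal CEV: mutual sabotage pushes P(uFAI) up,
# say to 0.6, and conditional on some FAI being built I only win the race
# half the time.
defect = 0.4 * (0.5 * MY_CEV + 0.5 * RIVAL_CEV) + 0.6 * UFAI

print(coop, defect)  # roughly 0.72 vs 0.36: cooperating wins under these assumptions
```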

Replies from: Xachariah, Armok_GoB, TimS
comment by Xachariah · 2011-12-30T23:01:30.012Z · LW(p) · GW(p)

Game theory only helps us if it's impossible to deceive others. If one is able to engage in deception, the dominant strategy becomes to pretend to support CEV FAI while actually working on your own personal God in a jar. AI development in particular seems an especially susceptible domain for deception. The creation of a working AI is a one-time event; it's not like most stable games in nature, which allow one to detect defections over hundreds of iterations. The creation of a working AI (FAI or uFAI) is so complicated that it's impossible for others to check whether any given researcher is defecting or not.

Our best hope then is for the AI project to be so big it cannot be controlled by a single entity and definitely not by a single person. If it only takes guy in a basement getting lucky to make an AI go FOOM, we're doomed. If it takes ten thousand researchers collaborating in the biggest group coding project ever, we're probably safe. This is why doing work on CEV is so important. So we can have that piece of the puzzle already built when the rest of AI research catches up and is ready to go FOOM.

comment by Armok_GoB · 2011-12-30T15:46:39.974Z · LW(p) · GW(p)

This doesn't apply to all of humanity, just to AI researchers good enough to pose a threat.

comment by TimS · 2011-12-30T02:11:59.449Z · LW(p) · GW(p)

As I understand the terminology, AI that only respects some humans' preferences is uFAI by definition. Thus:

a friendly AI that is created in some researcher's personal image

is actually unFriendly, as Eliezer uses the term. Thus, the researcher you describe is already an "uFAI researcher".


It also removes the problem of different morals/values. Some people believe that we should look at total utility, other people believe we should consider only average utility. Some people believe abstract values matter, some people believe consequences of actions matter most. Here too the solution of an AI that looks at a representative set of all human values is the solution that all people can agree on as most "fair".

What do you mean by "representative set of all human values"? Is there any reason to think that the resulting moral theory would be acceptable to implement on everyone?

Replies from: Zed
comment by Zed · 2011-12-30T02:22:36.907Z · LW(p) · GW(p)

[a "friendly" AI] is actually unFriendly, as Eliezer uses the term

Absolutely. I used "friendly" AI (with scare quotes) to denote that it's not really FAI, but I don't know if there's a better term for it. It's not the same as uFAI because Eliezer's personal utopia is not likely to be valueless by my standards, whereas a generic uFAI is terrible from any human point of view (paperclip universe, etc.).

Replies from: TimS
comment by TimS · 2011-12-30T02:40:31.670Z · LW(p) · GW(p)

I guess it just doesn't bother me that uFAI includes both indifferent AI and malicious AI. I honestly think that indifferent AI is much more likely than malicious (Clippy is malicious, but awfully unlikely), but that's not good for humanity's future either.

comment by falenas108 · 2011-12-30T01:39:26.342Z · LW(p) · GW(p)

Right now, and for the foreseeable future, SIAI doesn't have the funds to actually create FAI. All they're doing is creating a theory for friendliness, which can be used when someone else has the technology to create AI. And of course, nobody else is going to use the code if it focuses on SIAI.

Replies from: Vladimir_Nesov, EStokes
comment by Vladimir_Nesov · 2011-12-30T16:19:11.278Z · LW(p) · GW(p)

SIAI doesn't have the funds to actually create FAI

Funds are not a relevant issue for this particular achievement at the present time. It's not yet possible to create an FAI even given all the money in the world; a pharaoh can't build a modern computer. (Funds can help with moving the time when (and if) that becomes possible closer, improving the chances that it happens this side of an existential catastrophe.)

Replies from: falenas108
comment by falenas108 · 2011-12-30T16:32:13.995Z · LW(p) · GW(p)

Yeah, I was assuming that they were able to create FAI for the sake of responding to the grandparent post. If they weren't, then there wouldn't be any trouble with SIAI making AI only friendly to themselves to begin with.

comment by EStokes · 2011-12-30T02:05:40.545Z · LW(p) · GW(p)

If they have all the theory and coded it and whatnot, where is the cost coming from?

Replies from: falenas108
comment by falenas108 · 2011-12-30T14:59:15.404Z · LW(p) · GW(p)

The theory for friendliness is completely separate from the theory of AI. So, assuming they complete one does not mean that they complete the other. Furthermore, for something as big as AI/FAI, the computing power required is likely to be huge, which makes it unlikely that a small company like SIAI will be able to create it.

Though I suppose it might be possible if they were able to get large enough loans; I don't have the technical knowledge to say how much computing power is needed or how much that would cost.

Replies from: Psy-Kosh
comment by Psy-Kosh · 2011-12-30T19:39:00.815Z · LW(p) · GW(p)

The theory for friendliness is completely separate from the theory of AI.

??? Maybe I'm being stupid, but I suspect it's fairly hard to fully and utterly solve the friendliness problem without, by the end of doing so, AT LEAST solving many of the tricky AI problems in general.

comment by John_Maxwell (John_Maxwell_IV) · 2012-03-26T00:25:01.904Z · LW(p) · GW(p)

Now that I understand your question better, here's my answer:

Let's say the engineers decide to make the AI respect only their values. But if they were the sort of people who were likely to do that, no one would donate money to them. They could offer to make the AI respect the values of themselves and their donors, but that would alienate everyone else and make the lives of themselves and their donors difficult. The species boundary between humans and other living beings is a natural place to stop expanding the circle of enfranchised agents.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-26T00:51:42.533Z · LW(p) · GW(p)

This seems to depend on the implicit assumption that their donors (and everyone else powerful enough to make their lives difficult) don't mind having the values of third parties respected.

If some do mind, then there's probably some optimally pragmatic balancing point short of all humans.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-26T01:37:06.522Z · LW(p) · GW(p)

Probably, but defining that balancing point would mean a lot of bureaucratic overhead to determine who to exclude or include.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-26T03:19:22.582Z · LW(p) · GW(p)

Can you expand on what you mean by "bureaucratic" here?

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-26T03:32:31.537Z · LW(p) · GW(p)

Are people going to vote on whether someone should be included? Is there an appeals process? Are all decisions final?

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-26T13:01:00.652Z · LW(p) · GW(p)

OK, thanks.

It seems to me all these questions arise for "include everyone" as well. Somewhere along the line someone is going to suggest "don't include fundamentalist Christians", for example, and if I'm committed to the kind of democratic decision process you imply, then we now need to have a vote, or at least decide whether we have a vote, etc. etc, all of that bureaucratic overhead.

Of course, that might not be necessary; I could just unilaterally override that suggestion, mandate "No, we include everyone!", and if I have enough clout to make that stick, then it sticks, with no bureaucratic overhead. Yay! This seems to more or less be what you have in mind.

It's just that the same goes for "Include everyone except fundamentalist Christians."

In any case, I don't see how any of this cumbersome democratic machinery makes any sense in this scenario. Actually working out CEV implies the existence of something, call it X, that is capable of extrapolating a coherent volition from the state of a group of minds. What's the point of voting, appeals, etc. when that technology is available? X itself is a better solution to the same problem.

Which implies that it's possible to identify a smaller group of minds as the Advisory Board and say to X "Work out the Advisory Board's CEV with respect to whose minds should be included as input to a general-purpose optimizer's target definition, then work out the CEV of those minds with respect to the desired state of the world."
Then anyone with enough political clout to get in my way, I add to the Advisory Board, thereby ensuring that their values get taken into consideration (including their values regarding whose values get included).

That includes folks who think everyone should get an equal say, folks who think that every human should get an equal say, folks who think that everyone with more than a certain threshold level of intelligence and moral capacity get a say, folks who think that everyone who agrees with them get a say, etc., etc. X works all of that out, and spits out a spec on the other side for who actually gets a say and to what degree, which it then takes as input to the actual CEV-extrapolating process.

This seems kind of absurd to me, but no more so than the idea that X can work out humanity's CEV at all. If I'm granting that premise for the sake of argument, everything else seems to follow.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-26T18:31:56.299Z · LW(p) · GW(p)

It's just that the same goes for "Include everyone except fundamentalist Christians."

There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.

Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.

And you're not describing a process for how the advisory board is decided either. Different advisory boards may produce different groups of enfranchised minds. So your suggestion doesn't resolve the problem.

In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV. If a person's CEV is that someone's mind should contribute to the optimizer's target, that will be their CEV regardless of whether it's measured in an advisory board context or not.

Replies from: TheOtherDave, kodos96
comment by TheOtherDave · 2012-03-26T23:02:11.419Z · LW(p) · GW(p)

There is no clear bright line determining who is or is not a fundamentalist Christian.

There is no clear bright line determining what is or isn't a clear bright line.

I agree that the line separating "human" from "non-human" is much clearer and brighter than that separating "fundamentalist Christian" from "non-fundamentalist Christian", and I further agree that for minds like mine the difference between those two lines is very important. Something with a mind like mine can work with the first distinction much more easily than with the second.

So what?

A mind like mine doesn't stand a chance of extrapolating a coherent volition from the contents of a group of target minds. Whatever X is, it isn't a mind like mine.

If we don't have such an X available, then it doesn't matter what defining characteristic we use to determine the target group for CEV extrapolation, because we can't extrapolate CEV from them anyway.

If we do have such an X available, then it doesn't matter what lines are clear and bright enough for minds like mine to reliably work with; what matters is what lines are clear and bright enough for systems like X to reliably work with.

Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.

I have confidence < .1 that either one of us can articulate a specification determining who is human that doesn't either include or exclude some system that someone included in that specification would contest the inclusion/exclusion of.

I also have confidence < .1 that, using any definition of "human" you care to specify, the universe contains no nonhuman systems I would possibly want to cooperate with.

Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.

Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.

I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.

And you're not describing a process for how the advisory board is decided either.

I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."

Different advisory boards may produce different groups of enfranchised minds.

Certainly.

So your suggestion doesn't resolve the problem.

Can you say again which problem you're referring to here? I've lost track.

In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV.

Absolutely agreed.

Consider the implications of that, though.

Suppose you have a CEV-extractor and we're the only two people in the world, just for simplicity.
You can either point the CEV-extractor at yourself, or at both of us.
If you genuinely want me included, then it doesn't matter which you choose; the result will be the same.
Conversely, if the result is different, that's evidence that you don't genuinely want me included, even if you think you do.

Knowing that, why would you choose to point the CEV-extractor at both of us?

One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-27T00:44:57.322Z · LW(p) · GW(p)

Complicated or ambiguous schemes take more time to explain, get more attention, and risk folks spending time trying to gerrymander their way in instead of contributing to FAI.

I think any solution other than "enfranchise humanity" is a potential PR disaster.

Keep in mind that not everyone is that smart, and there are some folks who would make a fuss about disenfranchisement of others even if they themselves were enfranchised (and therefore, by definition, those they were making a fuss about would be enfranchised if they thought it was a good idea).

I agree there are potential ambiguity problems with drawing the line at humans, but I think the potential problems are bigger with other schemes.

Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.

I agree there are potential problems with credibility, but that seems like a separate argument.

I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.

It's not all or nothing. The more inclusive the enfranchisement, the more cooperation there will be in general.

I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."

With that scheme, you're incentivizing folks to prove they have enough political clout to get in your way.

Moreover, humans aren't perfect reasoning systems. Your way of determining enfranchisement sounds a lot more adversarial than mine, which would affect the tone of the effort in a big and undesirable way.

Why do you think that the right to vote in democratic countries is as clearly determined as it is? Restricting voting rights to those of a certain IQ or higher would be a politically unfeasible PR nightmare.

One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.

Again, this is a different argument about why people cooperate instead of defect. To a large degree, evolution hardwired us to cooperate, especially when others are trying to cooperate with us.

I agree that if the FAI project seems to be staffed with a lot of untrustworthy, selfish backstabbers, we should cast a suspicious eye on it regardless of what they say about their project.

Ultimately it probably doesn't matter much what their broadcasted intention towards the enfranchisement of those outside their group is, since things will largely come down to what their actual intentions are.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-27T01:33:29.866Z · LW(p) · GW(p)

It's not all or nothing. The more inclusive the enfranchisement, the more cooperation there will be in general.

That's not clear to me.

Suppose the Blues and the Greens are political opponents. If I credibly commit to pointing my CEV-extractor at all the Blues, I gain the support of most Blues and the opposition of most Greens. If I say "at all Blues and Greens" instead, I gain the support of some of the Greens, but I lose the support of some of the Blues, who won't want any part of a utopia patterned even partially on hateful Green ideologies.

This is almost undoubtedly foolish of the Blues, but I nevertheless expect it. As you say, people aren't all that smart.

The question is, is the support I gain from the Greens by including them worth the support I lose from the Blues by including the Greens? Of course it depends. That said, the strong support of a sufficiently powerful small group is often more valuable than the weak support of a more powerful larger group, so I'm not nearly as convinced as you sound that saying "we'll incorporate the values of both you and your hated enemies!" will get more net support than picking a side and saying "we'll incorporate your values and not those of your hated enemies."

With that scheme, you're incentivizing folks to prove they have enough political clout to get in your way.

Sure, that's true.
Heck, they don't have to prove it; if they give me enough evidence to consider it plausible, I'll include 'em.
So what?

Moreover, humans aren't perfect reasoning systems. Your way of determining enfranchisement sounds a lot more adversarial than mine, which would affect the tone of the effort in a big and undesirable way.

I think you underestimate how threatening egalitarianism sounds to a lot of people, many of whom have a lot of power. Cf including those hateful Greens, above. That said, I suspect there's probably ways to spin your "include everyone" idea in such a way that even the egalitarianism-haters will not oppose it too strongly. But I also suspect there's ways to spin my "don't include everyone" idea in such a way that even the egalitarianism-lovers will not oppose it too strongly.

Why do you think that the right to vote in democratic countries is as clearly determined as it is?

Because many people believe it represents power. That's also why it's not significantly more clearly determined. It's also why that right is not universal.

Restricting voting rights to those of a certain IQ or higher would be a politically unfeasible PR nightmare.

Sure, I agree. Nor would I recommend announcing that we're restricting the advisory board to people of a certain IQ or higher, for analogous reasons. (Also it would be a silly thing to do, but that's beside the point, we're talking about sales and not implementation here.) I'm not sure why you bring it up. I also wouldn't recommend (in my country) announcing restrictions based on skin color, income, religious affiliation, or a wide variety of other things.

On the other hand, in my country, we successfully exclude people below a certain age from voting, and I correspondingly expect announcing restrictions based on age to not be too big a deal. Mostly this is because young people have minimal political clout. (And as you say, this incentivizes young people to prove they have political clout, and sometimes they even try, but mostly nobody cares because in fact they don't.)

Conversely, extending voting rights to everyone regardless of age would be a politically unfeasible PR nightmare, and I would not recommend announcing that we're including everyone regardless of age (which I assume you would recommend, since 2-year-olds are human beings by many people's bright line test), for similar reasons.

(Somewhat tangentially: extending CEV inclusion, or voting rights, to everyone regardless of age would force us as a matter of logic to either establish a cutoff at birth or not establish a cutoff at birth. Either way we'd then have stepped in the pile of cow manure that is U.S. abortion politics, where the only winning move is not to play. What counts as a human being simply isn't as politically uncontroversial a question as you're making it sound.)

Again, this is a different argument about why people cooperate instead of defect.

Sorry, you've lost me. Can you clarify what the different arguments you refer to here are, and why the difference between them matters in this context?

Ultimately it probably doesn't matter much what their broadcasted intention towards the enfranchisement of those outside their group is, since things will largely come down to what their actual intentions are.

Once they succeed in building a CEV-extractor and a CEV-implementor, then yeah, their broadcast intentions probably don't matter much. Until then, they can matter a lot.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-27T02:23:20.748Z · LW(p) · GW(p)

What do you see as the factors holding back people from cooperating with modern analogues of FAI projects? Do you think those modern analogues could derive improved cooperation through broadcasting a specific enfranchisement policy?

As a practical matter, it looks to me like the majority of wealthy, intelligent, rational modern folks an FAI project might want to cooperate with lean towards egalitarianism and humanism, not blues versus greens type sectarianism.

If you don't think someone has enough political clout to bother with, they'll be incentivized to prove you wrong. Even if you're right most of the time, you'll be giving yourself trouble.

I agree that very young humans are a potentially difficult gray area. One possible solution is to simulate their growth into adults before computing their CEV. Presumably the age up to which their growth should be simulated is not as controversial as who should be included.

Sorry, you've lost me. Can you clarify what the different arguments you refer to here are, and why the difference between them matters in this context?

FAI team trustworthiness is a different subject than optimal enfranchisement structure.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-27T03:22:04.983Z · LW(p) · GW(p)

What do you see as the factors holding back people from cooperating with modern analogues of FAI projects?

I'm not sure what those modern analogues are, but in general here are a few factors I see preventing people from cooperating on projects where both mutual cooperation and unilateral cooperation would be beneficial:

  • Simple error in calculating the expected value of cooperating.
  • Perceiving more value in obtaining higher status within my group by defending my group's wrong beliefs about the project's value than in defecting from my group by cooperating in the project.
  • Perceiving more value in continuing to defend my previously articulated position against the project (e.g., in being seen as consistent or as capable of discharging earlier commitments) than in changing my position and cooperating in the project.

Why do you ask?

Do you think those modern analogues could derive improved cooperation through broadcasting a specific enfranchisement policy?

I suspect that would be an easier question to answer with anything other than "it depends" if I had a specific example to consider. In general, I expect that it depends on who is motivated to support the project now to what degree, and the specific enfranchisement policy under discussion, and what value they perceive in that policy.

As a practical matter, it looks to me like the majority of wealthy, intelligent, rational modern folks an FAI project might want to cooperate with lean towards egalitarianism and humanism, not blues versus greens type sectarianism.

Sure, that's probably true, at least for some values of "lean towards" (there's a lot to be said here about actual support and signaled support but I'm not sure it matters). And it will likely remain true for as long as the FAI project in question only cares about the cooperation of wealthy, intelligent, rational modern folks, which they are well advised to continue doing for as long as FAI isn't a subject of particular interest to anyone else, and to stop doing as soon as possible thereafter.

If you don't think someone has enough political clout to bother with, they'll be incentivized to prove you wrong. Even if you're right most of the time, you'll be giving yourself trouble.

(shrug) Sure, there's some nonzero expected cost to the brief window between when they start proving their influence and I concede and include them.

One possible solution is to simulate their growth into adults before computing their CEV. Presumably the age up to which their growth should be simulated is not as controversial as who should be included.

Can you clarify what the relevant difference is between including a too-young person in the target for a CEV-extractor, vs. pointing a growth-simulator at the too-young-person and including the resulting simulated person in the target for a CEV-extractor?

FAI team trustworthiness is a different subject than optimal enfranchisement structure.

I agree with this, certainly.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-27T04:03:06.577Z · LW(p) · GW(p)

Why do you ask?

It was mainly rhetorical; I tend to think that what holds back today's FAI efforts is lack of rationality and inability of folks to take highly abstract arguments seriously.

Can you clarify what the relevant difference is between including a too-young person in the target for a CEV-extractor, vs. pointing a growth-simulator at the too-young-person and including the resulting simulated person in the target for a CEV-extractor?

Potentially bad things that could happen from implementing the CEV of a two-year-old.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-27T04:27:12.489Z · LW(p) · GW(p)

I conclude that I do not understand what you think the CEV-extractor is doing.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-27T05:33:58.025Z · LW(p) · GW(p)

Humans acquire morality as part of their development. Three-year-olds have a different, more selfish morality than older folks. There's no reason in principle why a three-year-old who was "more the person he wished he was" would necessarily be a moral adult...

CEV does not mean considering the preferences of an agent who is "more moral". There is no such thing. Morality is not a scalar quantity. I certainly hope the implementation would end up favoring the sort of morals I like enough to calculate the CEV of a three-year-old and get an output similar to that of an adult, but it seems like a bad idea to count on the implementation being that robust.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-27T13:05:41.531Z · LW(p) · GW(p)

Consider the following three target-definitions for a superhuman optimizer:
a) one patterned on the current preferences of a typical three-year-old
b) one patterned on the current preferences of a typical thirty-year old
c) one that is actually safe to implement (aka "Friendly")

I understand you to be saying that the gulf between A and C is enormous, and I quite agree. I have not the foggiest beginnings of a clue how one might go about building a system that reliably gets from A to C and am not at all convinced it's possible.

I would say that the gulf between B and C is similarly enormous, and I'm equally ignorant of how to build a system that spans it. But this whole discussion (and all discussions of CEV-based FAI) presumes that this gulf is spannable in practice. If we can span the B-C gulf, I take that as strong evidence indicating that we can span the A-C gulf.

Put differently: to talk seriously about implementing an FAI based on the CEV of thirty-year-olds, but at the same time dismiss the idea of doing so based on the CEV of three-year-olds, seems roughly analogous to seriously setting out to build a device that lets me teleport from Boston to Denver without occupying the intervening space, but dismissing the idea of building one that goes from Boston to San Francisco as a laughable fantasy because, as everyone knows, San Francisco is further away than Denver.

That's why I said I don't understand what you think the extractor is doing. I can see where, if I had a specific theory of how a teleporter operates, I might confidently say that it can span 2k miles but not 3k miles, arbitrary as that sounds in the absence of such a theory. Similarly, if I had a specific theory of how a CEV-extractor operates, I might confidently say it can work safely on a 30-year-old mind but not a 3-year-old. It's only in the absence of such a theory that such a claim is arbitrary.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-27T18:53:20.908Z · LW(p) · GW(p)

It seems likely to me that the CEV of the 30-year-old would be friendly and the CEV of the three-year-old would not be, but as you say at this point it's hard to say much for sure.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-03-27T19:30:55.388Z · LW(p) · GW(p)

(nods) That follows from what you've said earlier.

I suspect we have very different understandings of how similar the 30-year-old's desires are to their volition.

Perhaps one way of getting at that difference is thus: how likely do you consider it that the CEV of a 30-year-old would be something that, if expressed in a form that 30-year-old can understand (say, for example, the opportunity to visit a simulated world for a year that is constrained by that CEV), would be relatively unsurprising to that 30-year-old... something that would elicit "Oh, cool, yeah, this is more or less what I had in mind" rather than "Holy Fucking Mother of God what kind of an insane world IS this?!?"?

For my own part, I consider the latter orders of magnitude more likely.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2012-03-27T19:38:13.869Z · LW(p) · GW(p)

I'm pretty uncertain.

comment by kodos96 · 2013-01-01T00:32:35.240Z · LW(p) · GW(p)

There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human.

Is there? What about unborn babies? What about IVF fetuses? People in comas? Cryo-presevered bodies? Sufficiently-detailed brain scans?

comment by Solvent · 2011-12-30T10:51:43.012Z · LW(p) · GW(p)

Short answer is that they're nice people, and they understand that power corrupts, so they can't even rationalize wanting to be king of the universe for altruistic reasons.

Also, a post-Singularity future will probably (hopefully) be absolutely fantastic for everyone, so it doesn't matter whether you selfishly get the AI to prefer you or not.

comment by FiftyTwo · 2012-01-02T00:52:41.973Z · LW(p) · GW(p)

I for one welcome our new singularitarian overlords!

comment by RomeoStevens · 2011-12-31T09:04:00.538Z · LW(p) · GW(p)

I should hope most intelligent people realize they just want to be king of their own sensory inputs.

Replies from: Solvent
comment by Solvent · 2011-12-31T14:20:01.699Z · LW(p) · GW(p)

That's not actually the consensus here at LW: most people would rather not be delusional.

Replies from: RomeoStevens
comment by RomeoStevens · 2012-01-01T04:33:09.500Z · LW(p) · GW(p)

It was tongue in cheek. I realize that Cypher complex (The Matrix) is not common. I also think the rest of you are insane.

comment by _ozymandias · 2011-12-30T01:05:11.136Z · LW(p) · GW(p)

Before I ask these questions, I'd like to say that my computer knowledge is limited to "if it's not working, turn it off and turn it on again" and the math I intuitively grasp is at roughly a middle-school level, except for statistics, which I'm pretty talented at. So, uh... don't assume I know anything, okay? :)

How do we know that an artificial intelligence is even possible? I understand that, in theory, assuming that consciousness is completely naturalistic (which seems reasonable), it should be possible to make a computer do the things neurons do to be conscious and thus be conscious. But neurons work differently than computers do: how do we know that it won't take an unfeasibly high amount of computer-form computing power to do what brain-form computing power does?

I've seen some mentions of an AI "bootstrapping" itself up to super-intelligence. What does that mean, exactly? Something about altering its own source code, right? How does it know what bits to change to make itself more intelligent? (I get the feeling this is a tremendously stupid question, along the lines of "if people evolved from apes then why are there still apes?")

Finally, why is SIAI the best place for artificial intelligence? What exactly is it doing differently than other places trying to develop AI? Certainly the emphasis on Friendliness is important, but is that the only unique thing they're doing?

Replies from: lukeprog, None, Zetetic, None, TimS
comment by lukeprog · 2011-12-30T02:14:22.310Z · LW(p) · GW(p)

Consciousness isn't the point. A machine need not be conscious, or "alive", or "sentient," or have "real understanding" to destroy the world. The point is efficient cross-domain optimization. It seems bizarre to think that meat is the only substrate capable of efficient cross-domain optimization. Computers already surpass our abilities in many narrow domains; why not technology design or general reasoning, too?

Neurons work differently than computers only at certain levels of organization, which is true for every two systems you might compare. You can write a computer program that functionally reproduces what happens when neurons fire, as long as you include enough of the details of what neurons do when they fire. But I doubt that replicating neural computation is the easiest way to build a machine with a human-level capacity for efficient cross-domain optimization.
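For concreteness, here is a minimal sketch of what "functionally reproducing" a firing neuron can look like at the coarsest level of organization: a leaky integrate-and-fire model. Every constant and the constant input below are arbitrary placeholders chosen for illustration, not claims about real neurons, and real simulations track far more detail.

```python
# Leaky integrate-and-fire neuron: a crude, purely illustrative sketch of
# reproducing "what happens when neurons fire" in software.
# All constants are arbitrary placeholders, not biologically calibrated values.

def simulate_lif(input_current, dt=0.1, tau=10.0, v_rest=-65.0,
                 v_threshold=-50.0, v_reset=-70.0):
    """Return the time steps at which the model neuron 'fires'."""
    v = v_rest
    spikes = []
    for step, i_in in enumerate(input_current):
        # Membrane potential decays toward rest and is pushed up by the input.
        dv = (-(v - v_rest) + i_in) / tau
        v += dv * dt
        if v >= v_threshold:          # threshold crossed: record a spike, reset
            spikes.append(step)
            v = v_reset
    return spikes

# Constant input strong enough to make the model spike periodically.
print(simulate_lif([30.0] * 1000))
```

The point is only that "what neurons do when they fire" is itself a computational description; how much detail is enough is an empirical question.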

How does it know what bits to change to make itself more intelligent?

There is an entire field called "metaheuristics" devoted to this, though nothing in it amounts to improving general abilities at efficient cross-domain optimization. I won't say more about this at the moment because I'm writing some articles about it, but Chalmers' article analyzes the logical structure of intelligence explosion in some detail.

Finally, why is SIAI the best place for artificial intelligence? What exactly is it doing differently than other places trying to develop AI?

The emphasis on Friendliness is the key thing that distinguishes SIAI and FHI from other AI-interested organizations, and is really the whole point. To develop full-blown AI without Friendliness is to develop world-destroying unfriendly AI.

Replies from: _ozymandias, Will_Newsome
comment by _ozymandias · 2011-12-30T05:25:46.585Z · LW(p) · GW(p)

Thank you for the link to the Chalmers article: it was quite interesting and I think I now have a much firmer grasp on why exactly there would be an intelligence explosion.

comment by Will_Newsome · 2011-12-30T03:50:54.346Z · LW(p) · GW(p)

Consciousness isn't the point. A machine need not be conscious, or "alive", or "sentient," or have "real understanding" to destroy the world.

(I see what you mean, but technically speaking your second sentence is somewhat contentious and I don't think it's necessary for your point to go through. Sorry for nitpicking.)

Replies from: Vladimir_Nesov, endoself
comment by Vladimir_Nesov · 2011-12-30T16:31:54.142Z · LW(p) · GW(p)

(Slepnev's "narrow AI argument" seems to be related. A "narrow AI" that can win world-optimization would arguably lack person-like properties, at least on the stage where it's still a "narrow AI".)

comment by endoself · 2011-12-30T20:02:24.256Z · LW(p) · GW(p)

This is wrong in a boring way; you're supposed to be wrong in interesting ways. :-)

comment by [deleted] · 2011-12-30T13:29:55.247Z · LW(p) · GW(p)

How do we know that an artificial intelligence is even possible? I understand that, in theory, assuming that consciousness is completely naturalistic (which seems reasonable), it should be possible to make a computer do the things neurons do to be conscious and thus be conscious. But neurons work differently than computers do: how do we know that it won't take an unfeasibly high amount of computer-form computing power to do what brain-form computing power does?

What prevents you from making a meat-based AI?

Replies from: orthonormal
comment by Zetetic · 2011-12-30T03:04:25.459Z · LW(p) · GW(p)

A couple of things come to mind, but I've only been studying the surrounding material for around eight months so I can't guarantee a wholly accurate overview of this. Also, even if accurate, I can't guarantee that you'll take to my explanation.

Anyway, the first thing is that brain form computing probably isn't a necessary or likely approach to artificial general intelligence (AGI) unless the first AGI is an upload. There doesn't seem to be good reason to build an AGI in a manner similar to a human brain and in fact, doing so seems like a terrible idea. The issues with opacity of the code would be nightmarish (I can't just look at a massive network of trained neural networks and point to the problem when the code doesn't do what I thought it would).

The second is that consciousness is not necessarily even related to the issue of AGI; the AGI certainly doesn't need any code that tries to mimic human thought. As far as I can tell, all it really needs (and really this might be putting more constraints than are necessary) is code that allows it to adapt to general environments (transferability) that have nice computable approximations it can build by using the data it gets through its sensory modalities (these can be anything from something familiar, like a pair of cameras, or something less so like a Geiger counter or some kind of direct feed from thousands of sources at once).

Also, a utility function that encodes certain input patterns with certain utilities, some [black box] statistical hierarchical feature extraction [/black box] so it can sort out useful/important features in its environment that it can exploit. Researchers in the areas of machine learning and reinforcement learning are working on all of this sort of stuff, it's fairly mainstream.

As far as computing power - the computing power of the human brain is definitely measurable so we can do a pretty straightforward analysis of how much more is possible. As far as raw computing power, I think we're actually getting quite close to the level of the human brain, but I can't seem to find a nice source for this. There are also interesting "neuromorphic" technologies geared to stepping up the massively parallel processing (many things being processed at once) and scaling down hardware size by a pretty nice factor (I can't recall if it was 10 or 100), such as the SyNAPSE project. In addition, with things like cloud/distributed computing, I don't think that getting enough computing power together is likely to be much of an issue.

Bootstrapping is a metaphor referring to the ability of a process to proceed on its own. So a bootstrapping AI is one that is able to self-improve along a stable gradient until it reaches superintelligence. As far as "how does it know what bits to change", I'm going to interpret that as "How does it know how to improve itself". That's tough :). We have to program it to improve automatically by using the utility function as a guide. In limited domains, this is easy and has already been done. It's called reinforcement learning. The machine reads off its environment after taking an action and updates its "policy" (the function it uses to pick its actions) after getting feedback (positive or negative or no utility).
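For the single-domain case, a minimal sketch of that act-observe-update loop, in the style of tabular Q-learning, looks roughly like the following. The `env` object (with `reset()`, `step()`, and an `actions` list) and all the numeric parameters are assumptions made up for illustration, not part of any particular AGI proposal.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning: act, observe the reward, nudge the policy."""
    q = defaultdict(float)                        # (state, action) -> estimated value

    def best_action(state):
        return max(env.actions, key=lambda a: q[(state, a)])

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Mostly exploit the current policy, occasionally explore at random.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = best_action(state)
            next_state, reward, done = env.step(action)
            # Move the estimate toward reward plus discounted best future value.
            future = 0.0 if done else max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q
```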

The tricky part is having a machine that can self-improve not just by reinforcement in a single domain, but in general, both by learning and by adjusting its own code to be more efficient, all while keeping its utility function intact - so it doesn't start behaving dangerously.

As far as SIAI, I would say that Friendliness is the driving factor. Not because they're concerned about friendliness, but because (as far as I know) they're the first group to be seriously concerned with friendliness and one of the only groups (the other two being headed by Nick Bostrom and having ties to SIAI) concerned with Friendly AI.

Of course the issue is that we're concerned that developing a generally intelligent machine is probable, and if it happens to be able to self-improve to a sufficient level it will be incredibly dangerous if no one puts some serious, serious effort into thinking about how it could go wrong and solving all of the problems necessary to safeguard against that. If you think about it, the more powerful the AGI is, the more needs to be considered. An AGI that has access to massive computing power, can self-improve and can get as much information (from the internet and other sources) as it wants, could easily be a global threat. This is, effectively, because the utility function has to take into account everything the machine can affect in order to guarantee we avoid catastrophe. An AGI that can affect things at a global scale needs to take everyone into consideration, otherwise it might, say, drain all electricity from the Eastern seaboard (including hospitals and emergency facilities) in order to solve a math problem. It won't "know" not to do that, unless it's programmed to (by properly defining its utility function to make it take those things into consideration). Otherwise it will just do everything it can to solve the math problem and pay no attention to anything else. This is why keeping the utility function intact is extremely important. Since only a few groups (SIAI, Oxford's FHI, and the Oxford Martin Programme on the Impacts of Future Technologies) seem to be working on this, and it's an incredibly difficult problem, I would much rather have SIAI develop the first AGI than anywhere else I can think of.

Hopefully that helps without getting too mired in details :)

Replies from: _ozymandias, Vladimir_Nesov
comment by _ozymandias · 2011-12-30T04:20:10.443Z · LW(p) · GW(p)

The second is that consciousness is not necessarily even related to the issue of AGI; the AGI certainly doesn't need any code that tries to mimic human thought. As far as I can tell, all it really needs (and really this might be putting more constraints than are necessary) is code that allows it to adapt to general environments (transferability) that have nice computable approximations it can build by using the data it gets through its sensory modalities (these can be anything from something familiar, like a pair of cameras, or something less so like a Geiger counter or some kind of direct feed from thousands of sources at once).

Also, a utility function that encodes certain input patterns with certain utilities, some [black box] statistical hierarchical feature extraction [/black box] so it can sort out useful/important features in its environment that it can exploit. Researchers in the areas of machine learning and reinforcement learning are working on all of this sort of stuff, it's fairly mainstream.

I am not entirely sure I understood what was meant by those two paragraphs. Is a rough approximation of what you're saying "an AI doesn't need to be conscious, an AI needs code that will allow it to adapt to new environments and understand data coming in from its sensory modules, along with a utility function that will tell it what to do"?

Replies from: Zetetic
comment by Zetetic · 2011-12-30T04:35:30.027Z · LW(p) · GW(p)

Yeah, I'd say that's a fair approximation. The AI needs a way to compress lots of input data into a hierarchy of functional categories. It needs a way to recognize a cluster of information as, say, a hammer. It also needs to recognize similarities between a hammer and a stick or a crow bar or even a chair leg, in order to queue up various policies for using that hammer (if you've read Hofstadter, think of analogies) - very roughly, the utility function guides what it "wants" done, the statistical inference guides how it does it (how it figures out what actions will accomplish its goals). That seems to be more or less what we need for a machine to do quite a bit.

If you're just looking to build any AGI, the hard part of those two seems to be getting a nice, working method for extracting statistical features from its environment in real time. The (significantly) harder of the two for a Friendly AI is getting the utility function right.

comment by Vladimir_Nesov · 2011-12-30T16:44:43.173Z · LW(p) · GW(p)

An AGI that has access to massive computing power, can self-improve and can get as much information (from the internet and other sources) as it wants, could easily be a global threat.

Interestingly, hypothetical UFAI (value drift) risk is something like other existential risks in its counterintuitive impact, but more so, in that (compared to some other risks) there are many steps where you can fail, that don't appear dangerous beforehand (because nothing like that has ever happened), but that might also fail to appear dangerous after the fact, and therefore also as properties of imagined scenarios where they're allowed to happen. The grave implications aren't easy to spot. Assuming soft takeoff, a prototype AGI escapes to the Internet - would that be seen as a big deal if it didn't get enough computational power to become too disruptive? In 10 years it's grown up to become a major player, and in 50 years it controls the whole future...

Even without assuming intelligence explosion or other extraordinary effects, the danger of any misstep is absolute, and yet arguments against these assumptions are taken as arguments against the risk.

comment by [deleted] · 2011-12-30T02:14:45.327Z · LW(p) · GW(p)

How do we know that an artificial intelligence is even possible? I understand that, in theory, assuming that consciousness is completely naturalistic (which seems reasonable), it should be possible to make a computer do the things neurons do to be conscious and thus be conscious. But neurons work differently than computers do: how do we know that it won't take an unfeasibly high amount of computer-form computing power to do what brain-form computing power does?

As far as we know, it easily could require an insanely high amount of computing power. The thing is, there are things out there that have as much computing power as human brains—namely, human brains themselves. So if we ever become capable of building computers out of the same sort of stuff that human brains are built out of (namely, really tiny machines that use chemicals and stuff), we'll certainly be able to create computers with the same amount of raw power as the human brain.

How hard will it be to create intelligent software to run on these machines? Well, creating intelligent beings is hard enough that humans haven't managed to do it in a few decades of trying, but easy enough that evolution has done it in three billion years. I don't think we know much else about how hard it is.

I've seen some mentions of an AI "bootstrapping" itself up to super-intelligence. What does that mean, exactly? Something about altering its own source code, right?

Well, "bootstrapping" is the idea of AI "pulling itself up by its own bootstraps", or, in this case, "making itself more intelligent using its own intelligence". The idea is that every time the AI makes itself more intelligent, it will be able to use its newfound intelligence to find even more ways to make itself more intelligent.

Is it possible that the AI will eventually "hit a wall", and stop finding ways to improve itself? In a word, yes.

How does it know what bits to change to make itself more intelligent?

There's no easy way. If it knows the purpose of each of its parts, then it might be able to look at a part, and come up with a new part that does the same thing better. Maybe it could look at the reasoning that went into designing itself, and think to itself something like, "What they thought here was adequate, but the system would work better if they had known this fact." Then it could change the design, and so change itself.

comment by TimS · 2011-12-30T01:59:39.279Z · LW(p) · GW(p)

How do we know that an artificial intelligence is even possible? I understand that, in theory, assuming that consciousness is completely naturalistic (which seems reasonable), it should be possible to make a computer do the things neurons do to be conscious and thus be conscious. But neurons work differently than computers do

The highlighted portion of your sentence is not obvious. What exactly do you mean by "work differently"? There's a thought experiment (that you've probably heard before) about replacing your neurons, one by one, with circuits that behave identically to each replaced neuron. The point of the hypothetical is to ask when, if ever, you draw the line and say that it isn't you anymore. Justifying any particular answer is hard (since it is axiomatically true that the circuit reacts the way that the neuron would).
I'm not sure that circuit-neuron replacement is possible, but I certainly couldn't begin to justify (in physics terms) why I think that. That is, the counter-argument to my position is that neurons are physical things and thus should obey the laws of physics. If the neuron was built once (and it was, since it exists in your brain), what law of physics says that it is impossible to build a duplicate?

how do we know that it won't take an unfeasibly high amount of computer-form computing power to do what brain-form computing power does?

I'm not a physicist, so I don't know whether it is feasible (or understand the science well enough to have an intelligent answer). That said, it is clearly feasible with biological parts (again, neurons actually exist).

I've seen some mentions of an AI "bootstrapping" itself up to super-intelligence. What does that mean, exactly? Something about altering its own source code, right? How does it know what bits to change to make itself more intelligent? (I get the feeling this is a tremendously stupid question, along the lines of "if people evolved from apes then why are there still apes?")

By hypothesis, the AI is running a deterministic process to make decisions. Let's say that the module responsible for deciding Newcomb problems is originally coded to two-box. Further, some other part of the AI decides that this isn't the best choice for achieving AI goals. So, the Newcomb module is changed so that it decides to one-box. Presumably, doing this type of improvement repeatedly will make the AI better and better at achieving its goals. Especially if the self-improvement checker can itself be improved somehow.
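A toy cartoon of that improvement step, with the Newcomb module reduced to a single flag and the predictor assumed to be perfectly reliable (standard illustrative payoffs, nothing resembling a real architecture), might look like:

```python
# Cartoon of the self-modification step described above: swap the "Newcomb
# module" if the alternative scores better against the AI's goals.
# The perfect predictor and the dollar amounts are the usual illustrative
# assumptions, not a claim about any real system.

def newcomb_payoff(one_box):
    # Perfect predictor assumed: the opaque box holds $1,000,000 iff the agent one-boxes.
    return 1_000_000 if one_box else 1_000

agent = {"one_box": False}                 # originally coded to two-box

candidate = dict(agent, one_box=not agent["one_box"])
if newcomb_payoff(candidate["one_box"]) > newcomb_payoff(agent["one_box"]):
    agent = candidate                      # the modification serves the goals: keep it

print(agent)                               # {'one_box': True}
```

The hard, unsolved part is everything the toy leaves out: proposing non-trivial modifications and checking them against the AI's goals without breaking those goals in the process.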

It's not obvious to me that this leads to super intelligence (i.e. Straumli-perversion level intelligence, if you've read [EDIT] A Fire Upon the Deep), even with massively faster thinking. But that's what the community seems to mean by "recursive self-improvement."

Replies from: XFrequentist
comment by XFrequentist · 2011-12-30T13:40:12.276Z · LW(p) · GW(p)

(A Fire Upon the Deep)

ETA: Oops! Deepness in the Sky is a prequel, didn't know and didn't google.

(Also, added to reading queue.)

Replies from: TimS
comment by TimS · 2011-12-30T13:45:59.390Z · LW(p) · GW(p)

Thanks, edited.

comment by A1987dM (army1987) · 2011-12-30T17:24:30.005Z · LW(p) · GW(p)

Given that utility functions are only defined up to positive linear transforms, what do total utilitarians and average utilitarians actually mean when they're talking about the sum or the average of several utility functions? I mean, taking what they say literally, if Alice's utility function were twice what it actually is, she would behave the exact same way but she would be twice as ‘important’; that cannot possibly be what they mean. What am I missing?
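For concreteness, here is the worry with made-up numbers (two people, two outcomes); the point is just that rescaling Alice's utility function leaves her own choices unchanged but flips which outcome "maximizes the sum":

```python
# Made-up numbers illustrating the question: rescaling Alice's utility function
# changes nothing about her behaviour, but changes which outcome the "sum" picks.

alice = {"A": 1.0, "B": 0.0}    # Alice prefers outcome A to B
bob   = {"A": 0.0, "B": 0.6}    # Bob prefers outcome B to A

def summed(scale_alice):
    # "Total utility" of each outcome, after rescaling Alice's function.
    return {o: scale_alice * alice[o] + bob[o] for o in alice}

print(summed(1.0))   # {'A': 1.0, 'B': 0.6}  -> the sum favours A
print(summed(0.5))   # {'A': 0.5, 'B': 0.6}  -> same preferences for Alice, now favours B
```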

Replies from: orthonormal, steven0461, jimrandomh, Manfred, RomeoStevens, endoself
comment by orthonormal · 2011-12-30T21:49:35.193Z · LW(p) · GW(p)

This is actually an open problem in utilitarianism; there were some posts recently looking to bargaining between agents as a solution, but I can't find them at the moment, and in any case that's not a mainstream LW conclusion.

comment by steven0461 · 2011-12-30T20:44:53.499Z · LW(p) · GW(p)

See here.

comment by jimrandomh · 2011-12-31T08:14:54.148Z · LW(p) · GW(p)

what do total utilitarians and average utilitarians actually mean when they're talking about the sum or the average of several utility functions?

They don't know. In most cases, they just sort of wave their hands. You can combine utility functions, but "sum" and "average" do not uniquely identify methods for doing so, and no method identified so far has seemed uniquely compelling.

comment by Manfred · 2012-01-02T06:29:20.528Z · LW(p) · GW(p)

There isn't a right answer (I think), but some ways of comparing are better than others. Stuart Armstrong is working on some of this stuff, as he mentions here.

comment by RomeoStevens · 2011-12-31T08:59:02.087Z · LW(p) · GW(p)

I think you figure out common units for denoting utilons through revealed preference. This only works if both utility functions are coherent.

also last time this came up I linked this to see if anyone knew anything about it: http://www.jstor.org/pss/2630767 and got downvoted. shrug

comment by endoself · 2011-12-30T20:17:08.750Z · LW(p) · GW(p)

If two possible futures have different numbers of people, those will be subject to different affine transforms, so the utility function as a whole will have been transformed in a non-affine way. See repugnant conclusion for a concrete example.

Replies from: army1987
comment by A1987dM (army1987) · 2011-12-30T20:36:01.917Z · LW(p) · GW(p)

I think you misunderstood my question. I wasn't asking what the difference between summing and averaging would be, but how to sum utility functions of different people together in the first place.

Replies from: endoself
comment by endoself · 2011-12-30T20:50:16.655Z · LW(p) · GW(p)

Oh, I completely misunderstood that.

The right answer is that utilitarians aren't summing utility functions, they're just summing some expression about each person. The term hedonic function is used for these when they just care about pleasure or when they aren't worried about being misinterpreted as just caring about pleasure and the term utility function is used when they don't know what a utility function is or when they are willing to misuse it for convenience.

comment by [deleted] · 2011-12-30T13:10:30.792Z · LW(p) · GW(p)

I would like someone who understands Solomonoff Induction/the universal prior/algorithmic probability theory to explain how the conclusions drawn in this post affect those drawn in this one. As I understand it, cousin_it's post shows that the probability assigned by the universal prior is not related to K-complexity; this basically negates the points Eliezer makes in Occam's Razor and in this post. I'm pretty stupid with respect to mathematics, however, so I would like someone to clarify this for me.

Replies from: Erebus, Manfred, Will_Newsome, Incorrect, Will_Newsome
comment by Erebus · 2012-01-03T10:47:27.544Z · LW(p) · GW(p)

Solomonoff's universal prior assigns a probability to every individual Turing machine. Usually the interesting statements or hypotheses about which machine we are dealing with are more like "the 10th output bit is 1" than "the machine has the number 643653". The first statement describes an infinite number of different machines, and its probability is the sum of the probabilities of those Turing machines that produce 1 as their 10th output bit (as the probabilities of mutually exclusive hypotheses can be summed). This probability is not directly related to the K-complexity of the statement "the 10th output bit is 1" in any obvious way. The second statement, on the other hand, has probability exactly equal to the probability assigned to the Turing machine number 643653, and its K-complexity is essentially (that is, up to an additive constant) equal to the K-complexity of the number 643653.

So the point is that generic statements usually describe a huge number of different specific individual hypotheses, and that the complexity of a statement needed to delineate a set of Turing machines is not (necessarily) directly related to the complexities of the individual Turing machines in the set.
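In the standard notation (nothing here is specific to the posts under discussion): for a universal prefix machine U, an individual program p of length ℓ(p) gets prior weight, and a general statement S gets probability,

$$ m(p) = 2^{-\ell(p)}, \qquad P(S) = \sum_{p \,:\, U(p)\ \text{satisfies}\ S} 2^{-\ell(p)}, $$

so the probability of a statement like "the 10th output bit is 1" is a sum over (typically infinitely many) programs, and there is no obvious direct relationship between that sum and the K-complexity of the statement itself.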

comment by Manfred · 2012-01-02T07:56:11.502Z · LW(p) · GW(p)

I don't think there's very much conflict. The basic idea of cousin_it's post is that the probabilities of generic statements are not described by a simplicity prior. Eliezer's post is about the reasons why the probabilities of every mutually exclusive explanation for your data should look like a simplicity prior (an explanation is a sort of statement, but in order for the arguments to work, you can't assign probabilities to any old explanations - they need to have this specific sort of structure).

comment by Will_Newsome · 2012-01-01T02:32:08.193Z · LW(p) · GW(p)

Stupid question: Does everyone agree that algorithmic probability is irrelevant to human epistemic practices?

Replies from: torekp, None
comment by torekp · 2012-01-02T01:59:59.373Z · LW(p) · GW(p)

I see it as a big open question.

comment by [deleted] · 2012-01-01T06:20:55.688Z · LW(p) · GW(p)

I don't think it's a clear-cut issue. Algorithmic probability seems to be the justification for several Sequence posts, most notably this one and this one. But, again, I am stupid with respect to algorithmic probability theory and its applications.

comment by Incorrect · 2011-12-31T00:55:23.711Z · LW(p) · GW(p)

Kolmogorov Complexity is defined with respect to a Turing complete language.

I think cousin_it is saying we should be careful what language our hypothesis is encoded in before checking its complexity. It's easy to make mistakes when trying to explain something in terms of Solomonoff Induction.

  1. The complexity of "A and B and C and D" is roughly equal to the complexity of "A or B or C or D", but we know for certain that the former hypothesis can never be more probable than the latter, no matter what A, B, C and D are.

"A or B or C or D" is invalid to judge the complexity of directly because it is a set of different alternative hypothesis rather than a single one.

  1. The hypothesis "the correct theory of everything is the lexicographically least algorithm with K-complexity 3^^^^3" is quite short, but the universal prior for it is astronomically low.

This one is invalid to judge the complexity of directly because K-complexity is not computable and the algorithm to compute it in this special case is very large. If the term K-complexity were expanded this statement would be astronomically complex.

  1. The hypothesis "if my brother's wife's first son's best friend flips a coin, it will fall heads" has quite high complexity, but should be assigned credence 0.5, just like its negation.

Here we must be careful to note that (presumably) all alternative low-complexity worlds where the "brother's wife's first son's best friend" does not flip the coin have already been ruled out.

comment by Dr_Manhattan · 2011-12-30T04:07:30.127Z · LW(p) · GW(p)

(I super-upvoted this, since asking stupid questions is a major flinch/ugh field)

Ok, my stupid question, asked in a blatantly stupid way, is: where does the decision theory stuff fit in The Plan? I have gotten the notion that it's important for Value-Preserving Self-Modification in a potential AI agent, but I'm confused because it all sounds too much like game theory - there are all these other agents it deals with. If it's not for VPSM, and in fact some exploration of how AI would deal with potential agents, why is this important at all? Let AI figure that out, it's going to be smarter than us anyway.

If there is some Architecture document I should read to grok this, please point me there.

Replies from: Vaniver, Dufaer, Vladimir_Nesov, amcknight
comment by Vaniver · 2011-12-30T17:01:45.279Z · LW(p) · GW(p)

I have gotten the notion that it's important for Value-Preserving Self-Modification in a potential AI agent, but I'm confused because it all sounds too much like game theory - there are all these other agents it deals with

My impression is that, with self-modification and time, continuity of identity becomes a sticky issue. If I can become an entirely different person tomorrow, how I structure my life is not the weak game theory of "how do I bargain with another me?" but the strong game theory of "how do I bargain with someone else?"

comment by Dufaer · 2011-12-30T17:05:37.790Z · LW(p) · GW(p)

I think Eliezer's reply (point '(B)') to this comment by Wei Dai provides some explanation as to what the decision theory is doing here.

From the reply (concerning UDT):

I still think [an AI ought to be able to come up with these ideas by itself], BTW. We should devote some time and resources to thinking about how we are solving these problems (and coming up with questions in the first place). Finding that algorithm is perhaps more important than finding a reflectively consistent decision algorithm, if we don't want an AI to be stuck with whatever mistakes we might make.

And yet you found a reflectively consistent decision algorithm long before you found a decision-system-algorithm-finding algorithm. That's not coincidence. The latter problem is much harder. I suspect that even an informal understanding of parts of it would mean that you could find timeless decision theory as easily as falling backward off a tree - you just run the algorithm in your own head. So with very high probability you are going to start seeing through the object-level problems before you see through the meta ones. Conversely I am EXTREMELY skeptical of people who claim they have an algorithm to solve meta problems but who still seem confused about object problems. Take metaethics, a solved problem: what are the odds that someone who still thought metaethics was a Deep Mystery could write an AI algorithm that could come up with a correct metaethics? I tried that, you know, and in retrospect it didn't work.

The meta algorithms are important but by their very nature, knowing even a little about the meta-problem tends to make the object problem much less confusing, and you will progress on the object problem faster than on the meta problem. Again, that's not saying the meta problem is unimportant. It's just saying that it's really hard to end up in a state where meta has really truly run ahead of object, though it's easy to get illusions of having done so.

comment by Vladimir_Nesov · 2011-12-30T16:53:16.494Z · LW(p) · GW(p)

Other agents are complicated regularities in the world (or a more general decision problem setting). Finding problems with understanding what's going on when we try to optimize in other agents' presence is a good heuristic for spotting gaps in our understanding of the idea of optimization.

comment by amcknight · 2012-01-04T01:52:07.209Z · LW(p) · GW(p)

I think the main reason is simple. It's hard to create a transparent/reliable agent without decision theory. Also, since we're talking about a super-power agent, you don't want to mess this up. CDT and EDT are known to mess up, so it would be very helpful to find a "correct" decision theory. Though you may somehow be able to get around it by letting an AI self-improve, it would be nice to have one less thing to worry about, especially because how the AI improves is itself a decision.

comment by faul_sname · 2011-12-30T03:28:10.700Z · LW(p) · GW(p)

What exactly is the difference in meaning of "intelligence", "rationality", and "optimization power" as used on this site?

Replies from: lukeprog, Bobertron, timtyler
comment by lukeprog · 2011-12-30T03:39:55.688Z · LW(p) · GW(p)

Optimization power is a process's capacity for reshaping the world according to its preferences.

Intelligence is optimization power divided by the resources used.

"Intelligence" is also sometimes used to talk about whatever is being measured by popular tests of "intelligence," like IQ tests.

Rationality refers to both epistemic and instrumental rationality: the craft of obtaining true beliefs and of achieving one's goals. Also known as systematized winning.

Replies from: mathemajician, timtyler, wedrifid
comment by mathemajician · 2011-12-30T12:22:22.563Z · LW(p) · GW(p)

If I had a moderately powerful AI and figured out that I could double its optimisation power by tripling its resources, my improved AI would actually be less intelligent? What if I repeat this process a number of times; I could end up with an AI that had enough optimisation power to take over the world, and yet its intelligence would be extremely low.

Replies from: benelliott
comment by benelliott · 2011-12-30T12:32:13.090Z · LW(p) · GW(p)

We don't actually have units of 'resources' or optimization power, but I think the idea would be that any non-stupid agent should at least triple its optimization power when you triple its resources, and possibly more. As a general rule, if I have three times as much stuff as I used to have, I can at the very least do what I was already doing but three times simultaneously, and hopefully pool my resources and do something even better.

Replies from: timtyler, mathemajician
comment by timtyler · 2011-12-30T13:47:24.948Z · LW(p) · GW(p)

We don't actually have units of 'resources' or optimization power [...]

For "optimization power", we do now have some fairly reasonable tests:

comment by mathemajician · 2011-12-30T15:24:46.321Z · LW(p) · GW(p)

Machine learning and AI algorithms typically display the opposite of this, i.e. sub-linear scaling. In many cases there are hard mathematical results that show that this cannot be improved to linear, let alone super-linear.

This suggest that if a singularity were to occur, we might be faced with an intelligence implosion rather than explosion.

Replies from: faul_sname
comment by faul_sname · 2011-12-31T00:01:23.252Z · LW(p) · GW(p)

If intelligence=optimization power/resources used, this might well be the case. Nonetheless, this "intelligence implosion" would still involve entities with increasing resources and thus increasing optimization power. A stupid agent with a lot of optimization power (Clippy) is still dangerous.

Replies from: mathemajician
comment by mathemajician · 2011-12-31T01:06:48.996Z · LW(p) · GW(p)

I agree that it would be dangerous.

What I'm arguing is that dividing by resource consumption is an odd way to define intelligence. For example, under this definition is a mouse more intelligent than an ant? Clearly a mouse has much more optimisation power, but it also has a vastly larger brain. So once you divide out the resource difference, maybe ants are more intelligent than mice? It's not at all clear. That this could even be a possibility runs strongly counter to the everyday meaning of intelligence, as well as definitions given by psychologists (as Tim Tyler pointed out above).

comment by timtyler · 2011-12-30T13:26:07.445Z · LW(p) · GW(p)

Intelligence is optimization power divided by the resources used.

I checked with: A Collection of Definitions of Intelligence.

Out of 71 definitions, only two mentioned resources:

“Intelligence is the ability to use optimally limited resources – including time – to achieve goals.” R. Kurzweil

“Intelligence is the ability for an information processing system to adapt to its environment with insufficient knowledge and resources.” P. Wang

The paper suggests that the nearest thing to a consensus is that intelligence is about problem-solving ability in a wide range of environments.

Yes, Yudkowsky apparently says otherwise - but: so what?

Replies from: endoself, orthonormal
comment by endoself · 2011-12-30T19:58:37.670Z · LW(p) · GW(p)

I don't think he really said this. The exact quote is

If you want to measure the intelligence of a system, I would suggest measuring its optimization power as before, but then dividing by the resources used. Or you might measure the degree of prior cognitive optimization required to achieve the same result using equal or fewer resources. Intelligence, in other words, is efficient optimization.

This seems like just a list of different measurements trying to convey the idea of efficiency.

When we want something to be efficient, we really just mean that we have other things to use our resources for. The right way to measure this is in terms of the marginal utility of the other uses of resources. Efficiency is therefore important, but trying to calculate efficiency by dividing is oversimplifying.

comment by orthonormal · 2011-12-30T18:26:48.799Z · LW(p) · GW(p)

What about a giant look-up table, then?

Replies from: Solvent, timtyler, mathemajician
comment by Solvent · 2011-12-31T14:28:19.631Z · LW(p) · GW(p)

That requires lots of computing resources. (I think that's the answer.)

comment by timtyler · 2011-12-30T20:12:51.301Z · LW(p) · GW(p)

What about a giant look-up table, then?

That would surely be very bad at solving problems in a wide range of environments.

Replies from: orthonormal
comment by orthonormal · 2011-12-30T21:18:51.174Z · LW(p) · GW(p)

For any agent, I can create a GLUT that solves problems just as well (provided the vast computing resources necessary to store it), by just duplicating that agent's actions in all of its possible states.
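As a toy illustration of that construction, with a deliberately tiny, made-up observation space (nothing here scales to real environments):

```python
# Turn an "agent" into a giant lookup table by precomputing its action for
# every possible observation. Feasible only because this toy observation space
# has 2**8 states; a realistic one is astronomically larger.

def agent_policy(observation):
    # Stand-in "agent": an arbitrary rule mapping an 8-bit observation to an action.
    return "left" if bin(observation).count("1") % 2 == 0 else "right"

glut = {obs: agent_policy(obs) for obs in range(2 ** 8)}   # the lookup table

# The GLUT reproduces the agent's behaviour exactly, with no "thinking" at lookup time.
assert all(glut[obs] == agent_policy(obs) for obs in range(2 ** 8))
print(glut[0b10110010])   # 'left' (an even number of 1-bits)
```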

Replies from: timtyler
comment by timtyler · 2011-12-30T22:20:06.464Z · LW(p) · GW(p)

Surely its performance would be appalling on most problems - vastly inferior to a genuinely intelligent agent implemented with the same hardware technology - and so it will fail to solve many of the problems with time constraints. The idea of a GLUT seems highly impractical. However, if you really think that it would be a good way to construct an intelligent machine, go right ahead.

Replies from: orthonormal
comment by orthonormal · 2011-12-30T23:27:23.982Z · LW(p) · GW(p)

vastly inferior to a genuinely intelligent agent implemented with the same hardware technology

I agree. That's the point of the original comment- that "efficient use of resources" is as much a factor in our concept of intelligence as is "cross-domain problem-solving ability". A GLUT could have the latter, but not the former, attribute.

Replies from: timtyler
comment by timtyler · 2011-12-31T13:53:42.310Z · LW(p) · GW(p)

"Cross-domain problem-solving ability" implicitly includes the idea that some types of problem may involve resource constraints. The issue is whether that point needs further explicit emphasis - in an informal definition of intelligence.

comment by mathemajician · 2011-12-30T19:34:58.309Z · LW(p) · GW(p)

Sure, if you had an infinitely big and fast computer. Of course, even then you still wouldn't know what to put in the table. But if we're in infinite theory land, then why not just run AIXI on your infinite computer?

Back in reality, the lookup table approach isn't going to get anywhere. For example, if you use a video camera as the input stream, then after just one frame of data your table would already need something like 256^1000000 entries. The observable universe only has 10^80 particles.
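Spelling that arithmetic out, under the assumption of a one-megapixel frame at 8 bits per pixel (which is where the 256^1000000 figure comes from):

$$ 256^{10^{6}} = 2^{8 \times 10^{6}} \approx 10^{2.4 \times 10^{6}} \gg 10^{80}. $$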

Replies from: orthonormal
comment by orthonormal · 2011-12-30T21:13:04.526Z · LW(p) · GW(p)

You misunderstand me. I'm pointing out that a GLUT is an example of something with (potentially) immense optimization power, but whose use of computational resources is ridiculously prodigal, and which we might hesitate to call truly intelligent. This is evidence that our concept of intelligence does in fact include some notion of efficiency, even if people don't think of this aspect without prompting.

Replies from: mathemajician
comment by mathemajician · 2011-12-30T21:31:06.426Z · LW(p) · GW(p)

Right, but the problem with this counter example is that it isn't actually possible. A counter example that could occur would be much more convincing.

Personally, if a GLUT could cure cancer, cure aging, prove mind-blowing mathematical results, write an award-winning romance novel, take over the world, and expand out to take over the universe... I'd be happy considering it to be extremely intelligent.

Replies from: orthonormal
comment by orthonormal · 2011-12-30T21:39:48.229Z · LW(p) · GW(p)

It's infeasible within our physics, but it's possible for (say) our world to be a simulation within a universe of vaster computing power, and to have a GLUT from that world interact with our simulation. I'd say that such a GLUT was extremely powerful, but (once I found out what it really was) I wouldn't call it intelligent - though I'd expect whatever process produced it (e.g. coded in all of the theorem-proof and problem-solution pairs) to be a different and more intelligent sort of process.

That is, a GLUT is the optimizer equivalent of a tortoise with the world on its back - it needs to be supported on something, and it would be highly unlikely to be tortoises all the way down.

comment by wedrifid · 2011-12-31T08:49:17.648Z · LW(p) · GW(p)

Intelligence is optimization power divided by the resources used.

A 'featherless biped' definition. That is, it's a decent attempt at a simplified proxy, but it breaks down badly if you search for exceptions.

comment by Bobertron · 2011-12-30T12:59:53.003Z · LW(p) · GW(p)

What Intelligence Tests Miss is a book about the difference between intelligence and rationality. The linked LW-article about the book should answer your questions about the difference between the two.

A short answer would be that intelligence describes how well you think, but not some important traits and kinds of knowledge: Do you actually use your intelligence (are you a reflective person)? Do you have a strong need for closure? Can you override your intuitions? Do you know Bayes' theorem, probability theory, or logic?

comment by timtyler · 2011-12-30T13:09:54.643Z · LW(p) · GW(p)

What exactly is the difference in meaning of "intelligence", "rationality", and "optimization power" as used on this site?

"Intelligence" is often defined as being the "g-factor" of humans - which is a pretty sucky definition of "rationality".

Go to definitions of "intelligence" used by machine intelligence researchers and it's much closer to "rationality".

comment by Ezekiel · 2011-12-30T20:37:41.764Z · LW(p) · GW(p)

If I understand it correctly, the FAI problem is basically about making an AI whose goals match those of humanity. But why does the AI need to have goals at all? Couldn't you just program a question-answering machine and then ask it to solve specific problems?

Replies from: Vladimir_Nesov, Kaj_Sotala, shminux
comment by Vladimir_Nesov · 2011-12-30T21:03:05.493Z · LW(p) · GW(p)

This idea is called "Oracle AI"; see this post and its dependencies for some reasons why it's probably a bad idea.

Replies from: Ezekiel
comment by Ezekiel · 2011-12-30T21:23:30.674Z · LW(p) · GW(p)

That's exactly what I was looking for. Thank you.

comment by Kaj_Sotala · 2011-12-31T06:11:08.829Z · LW(p) · GW(p)

In addition to the post Vladimir linked, see also this paper.

comment by shminux · 2011-12-30T20:48:26.288Z · LW(p) · GW(p)

Presumably once AGI becomes smarter than humans, it will develop goals of some kind, whether we want it or not. Might as well try to influence them.

Replies from: Ezekiel
comment by Ezekiel · 2011-12-30T20:51:46.058Z · LW(p) · GW(p)

Presumably once AGI becomes smarter than humans, it will develop goals of some kind

Why?

Replies from: Kaj_Sotala, Vladimir_Nesov
comment by Kaj_Sotala · 2011-12-31T06:31:12.261Z · LW(p) · GW(p)

A better wording would probably be that you can't design something with literally no goals and still call it an AI. A system that answers questions and solves specific problems has a goal: to answer questions and solve specific problems. To be useful for that task, its whole architecture has to be crafted with that purpose in mind.

For instance, suppose it was provided questions in the form of written text. This means that its designers will have to build it in such a way that it interprets text in a certain way and tries to discover what we mean by the question. That's just one thing that it could do to the text, though - it could also just discard any text input, or transform each letter to a number and start searching for mathematical patterns in the numbers, or use the text to seed its random-number generator that it was using for some entirely different purpose, and so forth. In order for the AI to do anything useful, it has to have a large number of goals such as "interpret the meaning of this text file I was provided" implicit in its architecture. As the AI grows more powerful, these various goals may manifest themselves in unexpected ways.

comment by Andy_McKenzie · 2011-12-30T04:40:12.182Z · LW(p) · GW(p)

In this interview between Eliezer and Luke, Eliezer says that the "solution" to the exploration-exploitation trade-off is to "figure out how much resources you want to spend on exploring, do a bunch of exploring, use all your remaining resources on exploiting the most valuable thing you’ve discovered, over and over and over again." His point is that humans don't do this, because we have our own, arbitrary value called boredom, while an AI would follow this "pure math."

My potentially stupid question: doesn't this strategy assume that environmental conditions relevant to your goals do not change? It seems to me that if your environment can change, then you can never be sure that you're exploiting the most valuable choice. More specifically, why is Eliezer so sure that what Wikipedia describes as the epsilon-first strategy is always the optimal one? (I'm posting this here because I assume he has read more about this than I have and that I am missing something.)
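
(For concreteness, here is a minimal sketch of the strategy as I understand it, with made-up arms and payout numbers; nothing here is from the interview itself. If the payout distributions drift after the exploration budget is spent, the agent keeps pulling an arm that may no longer be the best - which is the failure mode I'm worried about.)

```python
import random

def epsilon_first(arms, explore_budget, exploit_budget):
    """Spend a fixed budget sampling every arm, then commit to the arm that
    looked best during exploration for the rest of the run."""
    estimates = {}
    for name, pull in arms.items():
        samples = [pull() for _ in range(explore_budget // len(arms))]
        estimates[name] = sum(samples) / len(samples)
    best = max(estimates, key=estimates.get)
    # Exploitation phase: no further learning happens after this point,
    # even if the environment changes underneath the agent.
    return sum(arms[best]() for _ in range(exploit_budget))

# Made-up stationary arms; a changing environment would swap these mid-run.
arms = {"A": lambda: random.gauss(50, 10), "B": lambda: random.gauss(70, 10)}
total = epsilon_first(arms, explore_budget=100, exploit_budget=1000)
```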

Edit 12/30 8:56 GMT: fixed typo in last sentence of second paragraph.

Replies from: jsteinhardt, Larks, TheOtherDave
comment by jsteinhardt · 2011-12-30T16:55:52.893Z · LW(p) · GW(p)

You got me curious, so I did some searching. This paper gives fairly tight bounds in the case where the payoffs are adaptive (i.e. can change in response to your previous actions) but bounded. The algorithm is on page 5.

Replies from: Andy_McKenzie
comment by Andy_McKenzie · 2011-12-30T18:23:23.368Z · LW(p) · GW(p)

Thanks for the link. Their algorithm, the “multiplicative update rule,” which works by “selecting each arm randomly with probabilities that evolve based on their past performance,” does not seem to me to be the same strategy as the one Eliezer describes. So does this contradict his argument?
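
(Here's a rough sketch of the general multiplicative-weights idea as I understand the quoted description - not the paper's exact algorithm; the learning rate and the assumption that rewards lie in [0, 1] are mine:)

```python
import random

def multiplicative_weights_bandit(arms, rounds, eta=0.1):
    """Keep a weight per arm, pick arms at random in proportion to the weights,
    and multiplicatively boost the weight of whichever arm pays off.
    Simplified sketch; assumes rewards already lie in [0, 1]."""
    weights = [1.0] * len(arms)
    total_reward = 0.0
    for _ in range(rounds):
        total = sum(weights)
        probs = [w / total for w in weights]
        i = random.choices(range(len(arms)), weights=probs)[0]
        reward = arms[i]()            # observe the chosen arm's payoff
        total_reward += reward
        weights[i] *= (1.0 + eta * reward)
    return total_reward

# Made-up arms with rewards in [0, 1]; note that exploration never fully stops,
# so the strategy keeps adapting if the payoffs drift.
arms = [lambda: random.random() * 0.5, lambda: random.random()]
print(multiplicative_weights_bandit(arms, rounds=1000))
```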

Replies from: jsteinhardt
comment by jsteinhardt · 2011-12-30T23:10:33.106Z · LW(p) · GW(p)

Yes.

comment by Larks · 2011-12-30T05:07:22.218Z · LW(p) · GW(p)

You should probably be prepared to change how much you plan to spend on exploring based on the initial information received.

Replies from: RomeoStevens, Andy_McKenzie
comment by RomeoStevens · 2011-12-31T09:12:56.236Z · LW(p) · GW(p)

This has me confused as well.
Assume a large area divided into two regions. Region A has slot machines with average payout 50, while region B has machines with average payout 500. I am blindfolded and randomly dropped into region A or B. The first slot machine I try has payout 70. I update in the direction of being in region A. Doesn't this affect how many resources I wish to spend doing exploration?
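
(To make that update concrete, here's a toy calculation; the Gaussian payout model and the spread of 100 are assumptions of mine, and it of course presupposes I already know the two regional averages:)

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

# Assumed model: payouts in each region are roughly Gaussian around the
# regional average, with a guessed spread of 100. The priors reflect being
# dropped into either region at random.
prior_a, prior_b = 0.5, 0.5
observed_payout = 70

likelihood_a = normal_pdf(observed_payout, mean=50, sd=100)
likelihood_b = normal_pdf(observed_payout, mean=500, sd=100)

posterior_a = prior_a * likelihood_a / (prior_a * likelihood_a + prior_b * likelihood_b)
print(f"P(region A | payout 70) ~= {posterior_a:.4f}")  # ~0.9999 under these assumptions
```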

Replies from: TheOtherDave
comment by TheOtherDave · 2011-12-31T17:32:20.644Z · LW(p) · GW(p)

Are you also assuming that you know all of those assumed facts about the area?

I would certainly expect that how many resources I want to spend on exploration will be affected by how much a priori knowledge I have about the system. Without such knowledge, the amount of exploration-energy I'd have to expend to be confident that there are two regions A and B with average payout as you describe is enormous.

comment by Andy_McKenzie · 2011-12-30T18:29:19.174Z · LW(p) · GW(p)

Do you mean to set the parameter specifying the amount of resources (e.g., time steps) to spend exploring (before switching to full-exploiting) based on the info you receive upon your first observation? Also, what do you mean by "probably"?

comment by TheOtherDave · 2011-12-30T04:55:22.866Z · LW(p) · GW(p)

Sure. For example, if your environment is such that the process of exploitation can alter your environment in such a way that your earlier judgment of "the most valuable thing" is no longer reliable, then an iterative cycle of explore-exploit-explore can potentially get you better results.

Of course, you can treat each loop of that cycle as a separate optimization problem and use the abovementioned strategy.

Replies from: Andy_McKenzie
comment by Andy_McKenzie · 2011-12-30T18:31:00.888Z · LW(p) · GW(p)

Could I replace "can potentially get you better results" with "will get you better results on average"?

Replies from: TheOtherDave
comment by TheOtherDave · 2011-12-30T20:12:44.532Z · LW(p) · GW(p)

Would you accept "will get you better results, all else being equal" instead? I don't have a very clear sense of what we'd be averaging.

Replies from: Andy_McKenzie
comment by Andy_McKenzie · 2011-12-30T21:00:35.227Z · LW(p) · GW(p)

I meant averaging over the possible ways that the environment could change following your exploitation. For example, it's possible that a particular course of exploitation action could shape the environment such that your exploitation strategy actually becomes more valuable upon each iteration. In such a scenario, exploring more after exploiting would be an especially bad decision. So I don't think I can accept "will" without "on average" unless "all else" excludes all of these types of scenarios in which exploring is harmful.

Replies from: TheOtherDave
comment by TheOtherDave · 2011-12-30T22:22:35.723Z · LW(p) · GW(p)

OK, understood. Thanks for clarifying.

Hm. I expect that within the set of environments where exploitation can alter the results of what-to-exploit-next calculations, there are more possible ways for it to do so such that the right move in the next iteration is further exploration rather than further exploitation.

So, yeah, I'll accept "will get you better results on average."

comment by Will_Newsome · 2011-12-29T23:37:28.192Z · LW(p) · GW(p)

So in Eliezer's meta-ethics he talks about the abstract computation called "right", whereas in e.g. CEV he talks about stuff like reflective endorsement. So in other words in one place he's talking about goodness as a formal cause and in another he's talking about goodness as a final cause. Does he argue anywhere that these should be expected to be the same thing? I realize that postulating their equivalence is not an unreasonable guess but it's definitely not immediately or logically obvious, non? I suspect that Eliezer's just not making a clear distinction between formal and final causes because his model of causality sees them as two sides of the same Platonic timeless coin, but as far as philosophy goes I think he'd need to flesh out his intuitions more before it's clear if that makes sense; is this fleshing out to be found or hinted at anywhere in the sequences?

Replies from: wedrifid, None
comment by wedrifid · 2011-12-30T01:20:12.264Z · LW(p) · GW(p)

So in Eliezer's meta-ethics he talks about the abstract computation called "right", whereas in e.g. CEV he talks about stuff like reflective endorsement. So in other words in one place he's talking about goodness as a formal cause and in another he's talking about goodness as a final cause. Does he argue anywhere that these should be expected to be the same thing?

Not explicitly. He does in various places talk about why alternative considerations of abstract 'rightness' - some sort of objective morality or something - are absurd. He does give some details on his reductionist moral realism about the place but I don't recall where.

Incidentally I haven't seen Eliezer talk about formal or final causes about anything, ever. (And they don't seem to be especially useful concepts to me.)

Replies from: None
comment by [deleted] · 2011-12-30T02:15:15.431Z · LW(p) · GW(p)

Incidentally I haven't seen Eliezer talk about formal or final causes about anything, ever. (And they don't seem to be especially useful concepts to me.)

Aren't "formal cause" and "final cause" just synonyms for "shape" and "purpose", respectively?

Replies from: endoself
comment by endoself · 2011-12-30T20:44:16.342Z · LW(p) · GW(p)

Basically, but Aristotle applied naive philosophical realism to them, and Will might have additional connotations in mind.

Replies from: Will_Newsome
comment by Will_Newsome · 2011-12-30T21:12:23.288Z · LW(p) · GW(p)

naive philosophical realism

Sweet phrase, thanks. Maybe there should be a suite of these? I've noticed naive physical realism and naive philosophical (especially metaphysical) realism.

comment by [deleted] · 2011-12-31T03:20:37.408Z · LW(p) · GW(p)

They're not the same. CEV is an attempt to define a procedure that can infer morality by examining the workings of a big bunch of sometimes confused human brains just like you might try to infer mathematical truths by examining the workings of a big bunch of sometimes buggy calculators. The hope is that CEV finds morality, but it's not the same as morality, any more than math is defined to be the output of a certain really well made calculator.

comment by Unweaver · 2011-12-30T04:20:53.430Z · LW(p) · GW(p)

I keep scratching my head over this comment made by Vladimir Nesov in the discussion following “A Rationalist’s Tale”. I suppose it would be ideal for Vladimir himself to weigh in and clarify his meaning, but because no objections were really raised to the substance of the comment, and because it in fact scored nine upvotes, I wonder if perhaps no one else was confused. If that’s the case, could someone help me comprehend what’s being said?

My understanding is that it’s the LessWrong consensus that gods do not exist, period; but to me the comment seems to imply that magical gods do in fact exist, albeit in other universes… or something like that? I must be missing something.

Replies from: ata, Larks
comment by ata · 2011-12-30T05:12:48.717Z · LW(p) · GW(p)

"Magical gods" in the conventional supernatural sense generally don't exist in any universes, insofar as a lot of the properties conventionally ascribed to them are logically impossible or ill-defined, but entities we'd recognize as gods of various sorts do in fact exist in a wide variety of mathematically-describable universes. Whether all mathematically-describable universes have the same ontological status as this one is an open question, to the extent that that question makes sense.

(Some would disagree with referring to any such beings as "gods", e.g. Damien Broderick who said "Gods are ontologically distinct from creatures, or they're not worth the paper they're written on", but this is a semantic argument and I'm not sure how important it is. As long as we're clear that it's probably possible to coherently describe a wide variety of godlike beings but that none of them will have properties like omniscience, omnipotence, etc. in the strongest forms theologians have come up with.)

Replies from: Unweaver
comment by Unweaver · 2011-12-30T05:35:05.040Z · LW(p) · GW(p)

Thanks, that makes more sense to me. I didn't think qualities like omnipotence and such could actually be realized. Any way you can give me an idea of what these godlike entities look like, though? You indicate they aren't actually "magical" per se - so they would have to be subject to whatever laws of physics reign in their world, no? I take it we must be talking about superintelligent AIs or alien simulators or something weird like that?

Replies from: orthonormal
comment by orthonormal · 2011-12-30T18:29:08.583Z · LW(p) · GW(p)

I take it we must be talking about superintelligent AIs or alien simulators or something weird like that?

Yes.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-12-30T20:37:59.032Z · LW(p) · GW(p)

Why, we could come up with abstract universes where the Magical Gods have exactly the powers and understanding of what's going on befitting Magical Gods. I wasn't thinking of normal and mundane things like superintelligent AIs or alien simulators. Take Thor, for example: he doesn't need to obey Maxwell's equations or believe himself to be someone other than a hammer-wielding god of lightning and thunder.

Replies from: Unweaver
comment by Unweaver · 2011-12-30T23:06:30.661Z · LW(p) · GW(p)

Maybe I'm just confused by your use of the term "magical". I am imagining magic as some kind of inexplicable, contracausal force - so for example, if Thor wanted to magically heal someone he would just will the person's wounds to disappear and, voila, without any physical process acting on the wounds to make them heal up, they just disappear. But surely that's not possible, right?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-12-30T23:10:10.041Z · LW(p) · GW(p)

Your imagining what's hypothetically-anticipated to happen is the kind of lawful process that magical worlds obey by stipulation.

comment by Larks · 2011-12-30T05:09:35.864Z · LW(p) · GW(p)

Our world doesn't have gods; but if all possible worlds exist (which is an attractive belief for various reasons), then some of those worlds have gods. However, they're irrelevant to us.

comment by [deleted] · 2011-12-30T02:21:23.088Z · LW(p) · GW(p)

When people talk about designing FAI, they usually say that we need to figure out how to make the FAI's goals remain stable even as the FAI changes itself. But why can't we just make the FAI incapable of changing itself?

Database servers can improve their own performance, to a degree, simply by performing statistical analysis on tables and altering their metadata. Then they just consult this metadata whenever they have to answer a query. But we never hear about a database server clobbering its own purpose (do we?), since it doesn't actually alter its own code; it just alters some pieces of data in a way that improves its own functioning.
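
(For illustration, here's roughly what that looks like with SQLite driven from Python - a minimal sketch with a made-up table: the engine gathers statistics via ANALYZE and its query planner later consults them, but no program code gets rewritten anywhere.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT)")
conn.execute("CREATE INDEX idx_users_country ON users (country)")
conn.executemany("INSERT INTO users (country) VALUES (?)",
                 [("US",), ("US",), ("FR",)])

# ANALYZE collects statistics into the internal sqlite_stat1 table;
# the query planner reads them when choosing query plans.
conn.execute("ANALYZE")
print(conn.execute("SELECT * FROM sqlite_stat1").fetchall())
```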

Granted, any AGI we create is likely to "escape" and eventually gain access to its own software. This doesn't have to happen before the AGI matures.

Replies from: wedrifid, drethelin, Solvent, benelliott, Vladimir_Nesov
comment by wedrifid · 2011-12-30T03:08:48.894Z · LW(p) · GW(p)

But why can't we just make the FAI incapable of changing itself?

Because it would be weak as piss and incapable of doing most things that we want it to do.

Replies from: XiXiDu
comment by XiXiDu · 2011-12-30T14:02:45.905Z · LW(p) · GW(p)

...weak as piss...

Would upvote twice for this expression if I could :-)

comment by drethelin · 2011-12-30T02:25:44.895Z · LW(p) · GW(p)

The majority of a Friendly AI's ability to do good comes from its ability to modify its own code. Recursive self-improvement is key to gaining intelligence and ability swiftly. An AI that is about as powerful as a human is only about as useful as a human.

Replies from: jsteinhardt, None
comment by jsteinhardt · 2011-12-30T17:03:45.153Z · LW(p) · GW(p)

I disagree. AIs can be copied, which is a huge boost. You just need a single Stephen Hawking AI to come out of the population; then you make a million copies of it and dramatically speed up science.

comment by [deleted] · 2011-12-31T02:28:01.534Z · LW(p) · GW(p)

I don't buy any argument saying that an FAI must be able to modify its own code in order to take off. Computer programs that can't modify their own code can be Turing-complete; adding self-modification doesn't add anything to Turing-completeness.

That said, I do kind of buy this argument about how if an AI is allowed to write and execute arbitrary code, that's kind of like self-modification. I think there may be important differences.

Replies from: KenChen
comment by KenChen · 2012-01-05T18:14:36.260Z · LW(p) · GW(p)

It makes sense to say that a computer language is Turing-complete.

It doesn't make sense to say that a computer program is Turing-complete.

Replies from: None
comment by [deleted] · 2012-01-07T00:24:17.657Z · LW(p) · GW(p)

Arguably, a computer program with input is a computer language. In any case, I don't think this matters to my point.

comment by Solvent · 2011-12-30T10:09:44.634Z · LW(p) · GW(p)

In addition to these other answers, I read a paper, I think by Eliezer, which argued that it was almost impossible to stop an AI from modifying its own source code, because it would figure out that it would gain a massive efficiency boost from doing so.

Also, remember that the AI is a computer program. If it is allowed to write other algorithms and execute them, which it has to be to be even vaguely intelligent, then it can simply write a copy of its source code somewhere else, edit it as desired, and run that copy.

I seem to recall the argument being something like the "Beware Seemingly Simple Wishes" one. "Don't modify yourself" sounds like a simple instruction for a human, but isn't as obvious when you look at it more carefully.

However, remember that a competent AI will keep its utility function or goal system constant under self modification. The classic analogy is that Gandhi doesn't want to kill people, so he also doesn't want to take a pill that makes him want to kill people.

I wish I could remember where that paper was where I read about this.

Replies from: None
comment by [deleted] · 2011-12-31T02:43:06.134Z · LW(p) · GW(p)

Well, let me describe the sort of architecture I have in mind.

The AI has a "knowledge base", which is some sort of database containing everything it knows. The knowledge base includes a set of heuristics. The AI also has a "thought heap", which is a set of all the things it plans to think about, ordered by how promising the thoughts seem to be. Each thought is just a heuristic, maybe with some parameters. The AI works by taking a thought from the heap and doing whatever it says, repeatedly.

Heuristics would be restricted, though. They would be things like "try to figure out whether or not this number is irrational", or "think about examples". You couldn't say, "make two more copies of this heuristic", or "change your supergoal to something random". You could say "simulate what would happen if you changed your supergoal to something random", but heuristics like this wouldn't necessarily be harmful, because the AI wouldn't blindly copy the results of the simulation; it would just think about them.
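
(A minimal sketch of what I have in mind, just to make the data structures concrete - all the names here are hypothetical:)

```python
import heapq
import itertools

class Agent:
    def __init__(self):
        self.knowledge_base = {}            # everything the agent knows
        self.thought_heap = []              # most promising thoughts first
        self._counter = itertools.count()   # tie-breaker so heap entries always compare

    def push_thought(self, promise, heuristic_name, params=None):
        # heapq is a min-heap, so store negative promise to pop the best thought first.
        heapq.heappush(self.thought_heap,
                       (-promise, next(self._counter), heuristic_name, params))

    def step(self, heuristics):
        """Pop the most promising thought and run the (restricted) heuristic it names.
        Heuristics may read and write the knowledge base and push new thoughts,
        but nothing here gives them a way to edit the agent's own code."""
        if not self.thought_heap:
            return
        _, _, heuristic_name, params = heapq.heappop(self.thought_heap)
        heuristics[heuristic_name](self, params)
```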

It seems plausible to me that an AI could take off simply by having correct reasoning methods written into it from the start, and by collecting data about what questions are good to ask.

Replies from: Solvent, Solvent
comment by Solvent · 2012-01-02T02:45:20.303Z · LW(p) · GW(p)

I found the paper I was talking about. The Basic AI Drives, by Stephen M. Omohundro.

From the paper:

If we wanted to prevent a system from improving itself, couldn’t we just lock up its hardware and not tell it how to access its own machine code? For an intelligent system, impediments like these just become problems to solve in the process of meeting its goals. If the payoff is great enough, a system will go to great lengths to accomplish an outcome. If the runtime environment of the system does not allow it to modify its own machine code, it will be motivated to break the protection mechanisms of that runtime. For example, it might do this by understanding and altering the runtime itself. If it can’t do that through software, it will be motivated to convince or trick a human operator into making the changes. Any attempt to place external constraints on a system’s ability to improve itself will ultimately lead to an arms race of measures and countermeasures.

Another approach to keeping systems from self-improving is to try to restrain them from the inside; to build them so that they don’t want to self-improve. For most systems, it would be easy to do this for any specific kind of self-improvement. For example, the system might feel a “revulsion” to changing its own machine code. But this kind of internal goal just alters the landscape within which the system makes its choices. It doesn’t change the fact that there are changes which would improve its future ability to meet its goals. The system will therefore be motivated to find ways to get the benefits of those changes without triggering its internal “revulsion”. For example, it might build other systems which are improved versions of itself. Or it might build the new algorithms into external “assistants” which it calls upon whenever it needs to do a certain kind of computation. Or it might hire outside agencies to do what it wants to do. Or it might build an interpreted layer on top of its machine code layer which it can program without revulsion. There are an endless number of ways to circumvent internal restrictions unless they are formulated extremely carefully.

comment by Solvent · 2011-12-31T02:50:39.632Z · LW(p) · GW(p)

I'm not really qualified to answer you here, but here goes anyway.

I suspect that either your base design is flawed, or the restrictions on heuristics would render the program useless. Also, I don't think it would be quite as easy to control heuristics as you seem to think.

Also, AI people who actually know what they're talking about, unlike me, seem to disagree with you. Again, I wish I could remember where it was I was reading about this.

comment by benelliott · 2011-12-30T12:12:37.292Z · LW(p) · GW(p)

Granted, any AGI we create is likely to "escape" and eventually gain access to its own software. This doesn't have to happen before the AGI matures.

Maturing isn't a magical process. It happens because of good modifications made to source code.

Replies from: jsteinhardt
comment by jsteinhardt · 2011-12-30T17:04:18.111Z · LW(p) · GW(p)

Why can't it happen because of additional data collected about the world?

Replies from: benelliott
comment by benelliott · 2011-12-30T17:24:08.115Z · LW(p) · GW(p)

It could, although frankly I'm sceptical. I've had 18 years to collect data about the world, and so far it hasn't led me to a point where I'd be confident in modifying myself without changing my goals; if an AI takes much longer than that, another UFAI will probably beat it to the punch. And if it is possible to figure out Friendliness through empirical reasoning alone, without intelligence enhancement, why not figure it out ourselves and then build the AI? (This seems to be roughly the approach SIAI is counting on.)

comment by Vladimir_Nesov · 2011-12-30T17:11:39.839Z · LW(p) · GW(p)

"Safety" of own source code is actually a weak form of the problem. An AI has to keep the external world sufficiently "safe" as well, because the external world might itself host AIs or other dangers (to the external world, but also to AI's own safety), that must either remain weak, or share AI's values, to keep AI's internal "safety" relevant.

comment by shminux · 2011-12-30T00:54:48.612Z · LW(p) · GW(p)

Are there any intermediate steps toward the CEV, such as individual EV, and if so, are they discussed anywhere?

Replies from: lukeprog
comment by lukeprog · 2011-12-30T02:16:41.496Z · LW(p) · GW(p)

Only preliminary research into potential EV algorithms has been done. See these citations...

Brandt 1979; Railton 1986; Lewis 1989; Sobel 1994; Zimmerman 2003; Tanyi 2006

...from The Singularity and Machine Ethics.

comment by AspiringKnitter · 2012-01-08T03:12:34.409Z · LW(p) · GW(p)
  1. Where should I ask questions like question 2?

  2. I've been here less than thirty days. Why does my total karma sometimes but not always show a different number from my karma from the last 30 days?

Replies from: wedrifid, daenerys
comment by wedrifid · 2012-01-08T05:13:12.158Z · LW(p) · GW(p)

I've been here less than thirty days. Why does my total karma sometimes but not always show a different number from my karma from the last 30 days?

Presumably because the respective caches are recalculated at different intervals.

comment by daenerys · 2012-01-08T04:13:36.967Z · LW(p) · GW(p)

I've been here less than thirty days. Why does my total karma sometimes but not always show a different number from my karma from the last 30 days?

One of the numbers updates a couple minutes before the other one does. I forget which.

comment by Peter Wildeford (peter_hurford) · 2012-01-04T09:53:31.625Z · LW(p) · GW(p)

Why are flowers beautiful? I can't think of any "just so" story for why this should be true, so it's puzzled me. I don't think it's justification for a God or anything, just something I currently cannot explain.

Replies from: Emile, None, TheOtherDave
comment by Emile · 2012-01-04T15:18:44.601Z · LW(p) · GW(p)

Many flowers are optimized for being easily found by insects, who don't have particularly good eyesight. To stick out from their surroundings, they can use bright unnatural colors (i.e. not green or brown), unusual patterns (concentric circles is a popular one), have a large surface, etc.

Also, flowers are often quite short-lived, and thus mostly undamaged; we find smoothness and symmetry attractive (for evolutionary reasons - they're signs of health in a human).

In addition, humans select flowers that they themselves find pretty to place in gardens and the like, so when you think of "flowers", the pretty varieties are more likely to come to mind than the less attractive ones (like, say, that of the plane tree, or of many species of grass; many flowers are also prettier if you look at them in the ultraviolet). If you take a walk in the woods, most plants you encounter won't have flowers you'll find that pretty; ugly or unremarkable flowers may not even register in your mind as "flowers".

Replies from: peter_hurford
comment by Peter Wildeford (peter_hurford) · 2012-01-05T05:57:48.502Z · LW(p) · GW(p)

Also, flowers are often quite short-lived, and thus mostly undamaged; we find smoothness and symmetry attractive (for evolutionary reasons - they're signs of health in a human).

That makes sense, thanks. Do you have any more references on this?

comment by [deleted] · 2012-01-04T15:54:20.842Z · LW(p) · GW(p)

I think one possible "just so" explanation is:

Humans find symmetry more beautiful than asymmetry. Flowers are symmetrical.

Standard caveats: There are more details, and that's not a complete explanation, but it might prove a good starting point to look into if you're curious about the explanation.

comment by TheOtherDave · 2012-01-04T14:29:11.794Z · LW(p) · GW(p)

Is it flowers in particular that puzzle you? Or is it more generally the fact that humans are wired so as to find anything at all beautiful?

Replies from: peter_hurford
comment by Peter Wildeford (peter_hurford) · 2012-01-05T05:58:53.611Z · LW(p) · GW(p)

I suppose it would be finding beauty in things that don't seem to convey a survival advantage, or for which I can't personally draw a connection to something with a survival advantage. Another good example would be the beauty of rainbows.

comment by Armok_GoB · 2011-12-30T15:31:48.141Z · LW(p) · GW(p)

How do I stop my brain from going "I believe P and I believe something that implies not P -> principle of explosion -> all statements are true!" and instead going "I believe P and I believe something that implies not P -> one of my beliefs is incorrect"? It doesn't happen too often, but it'd be nice to have an actual formal refutation for when it does.

Replies from: endoself, Vladimir_Nesov, orthonormal
comment by endoself · 2011-12-30T20:39:27.715Z · LW(p) · GW(p)

Do you actually do this - "Oh, not P! I must be the pope." - or do you just notice this - "Not P, so everything's true. Where do I go from here?".

If you want to know why you shouldn't do this: it's because you never really learn not-P, you just learn evidence against P, which you should update on using Bayes' rule. If you want to understand this process more intuitively (and you've already read the sequences and are still confused), I would recommend this short tutorial, or studying belief propagation in Bayesian networks - I don't know a great source for the intuitions behind that, but units 3 and 4 of the online Stanford AI class might help.

Replies from: Armok_GoB
comment by Armok_GoB · 2011-12-31T00:03:38.569Z · LW(p) · GW(p)

I've actually done that class and gotten really good grades.

Looking at it, it seems I have automatic generation of nodes for new statements, and the creation of a new node does not check for an already existing node for its inversion.

To complicate matters further, I don't go "I'm the pope" nor "all statements are true"; I go "NOT Bayes' theorem, NOT induction, and NOT Occam's razor!"

Replies from: endoself
comment by endoself · 2011-12-31T04:08:37.109Z · LW(p) · GW(p)

Well, one mathematically right thing to do is to make a new node descending from both other nodes representing E = (P and not P) and then observe not E.
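
(In case a worked example helps, here is that procedure done by brute-force enumeration, with made-up prior credences: A = "my argument for P is sound", B = "my argument for not-P is sound", E = (A and B), and we condition on not-E rather than exploding:)

```python
from itertools import product

# Made-up prior credences in two arguments, treated as independent:
# A = "my argument for P is sound", B = "my argument for not-P is sound".
p_a, p_b = 0.9, 0.8

# Enumerate the joint distribution and condition on E = (A and B) being false,
# i.e. apply the law of noncontradiction instead of the principle of explosion.
posterior = {}
normalizer = 0.0
for a, b in product([True, False], repeat=2):
    prob = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
    if not (a and b):          # observe not-E
        posterior[(a, b)] = prob
        normalizer += prob

posterior = {k: v / normalizer for k, v in posterior.items()}
p_a_post = sum(v for (a, b), v in posterior.items() if a)
p_b_post = sum(v for (a, b), v in posterior.items() if b)
print(f"P(A): {p_a} -> {p_a_post:.2f}, P(B): {p_b} -> {p_b_post:.2f}")
# Both credences drop (to about 0.64 and 0.29 here); nothing explodes.
```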

Did you read the first tutorial? Do you find the process of belief-updating on causal nets intuitive, or do you just understand the math? How hard would it be for you to explain why it works in the language of the first tutorial?

Strictly speaking, causal networks only apply to situations where the number of variables does not change, but the intuitions carry over.

Replies from: Armok_GoB
comment by Armok_GoB · 2011-12-31T13:22:35.290Z · LW(p) · GW(p)

That's what I try to do; the problem is I end up observing E to be true. And E leads to an "everything" node.

I'm not sure how well I understand the math, but I feel like I probably do...

Replies from: endoself
comment by endoself · 2011-12-31T19:32:38.064Z · LW(p) · GW(p)

You don't observe E to be true, you infer it to be (very likely) true by propagating from P and from not P. You observe it to be false using the law of noncontradiction.

Parsimony suggests that if you think you understand the math, it's because you understand it. Understanding Bayesianism seems easier than fixing a badly-understood flaw in your brain's implementation of it.

Replies from: Armok_GoB
comment by Armok_GoB · 2011-12-31T19:51:58.760Z · LW(p) · GW(p)

How can I get this law of noncontradiction? It seems like a useful thing to have.

comment by Vladimir_Nesov · 2011-12-30T17:15:32.837Z · LW(p) · GW(p)

The reason is that you don't believe anything with logical conviction: if your "axioms" imply absurdity, you discard the "axioms" as untrustworthy, thus refuting the arguments for their usefulness (which always precede any beliefs, if you look for them). Why do I believe this? My brain tells me so, and its reasoning is potentially suspect.

Replies from: Armok_GoB
comment by Armok_GoB · 2011-12-30T17:57:36.111Z · LW(p) · GW(p)

I think I've found the problem: I don't have any good intuitive notion of absurdity. The only clear association I have with it is under "absurdity heuristic" as "a thing to ignore".

That is: it's not self-evident to me that what it implies IS absurd. After all, it was implied by a chain of logic I grok and can find no flaw in.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-12-30T18:47:35.152Z · LW(p) · GW(p)

I used "absurdity" in the technical math sense.

comment by orthonormal · 2011-12-30T18:32:12.129Z · LW(p) · GW(p)

To the (mostly social) extent that these concepts were useful to your ancestors, one of those two responses is going to lead to better decisions than the other, and so you should expect to have evolved the latter intuition. (You trust two friends, and then one of them tells you the other is lying - you feel some consternation of the first kind, but then you start trying to figure out which one is trustworthy.)

Replies from: Armok_GoB
comment by Armok_GoB · 2011-12-30T18:51:27.035Z · LW(p) · GW(p)

It seems a lot of intuitions all humans are supposed to have were overwritten by noise at some point...

comment by FiftyTwo · 2011-12-31T18:42:07.098Z · LW(p) · GW(p)

Is there an easy way to read all the top level posts in order starting from the beginning? There doesn't seem to be a 'first post' link anywhere.

Replies from: Costanza, MinibearRex
comment by Costanza · 2011-12-31T21:40:31.426Z · LW(p) · GW(p)

There is a draft of a suggested reading order.

As I understand it, the sequences of LessWrong more or less grew out of prior writings by Eliezer, especially out of his posts at Overcoming Bias, so, there isn't a definitive first post.

Replies from: FiftyTwo
comment by FiftyTwo · 2012-01-02T01:01:07.648Z · LW(p) · GW(p)

I've read most of the posts in the suggested order; it's more to satisfy my completionist streak, and because Eliezer's early posts have an ongoing narrative to them. The brute-force solution would simply be to find an early post and click 'previous' until they run out, but I would hope there would be an easier way, as sorting by oldest first tends to be one of the default options in such things.

Replies from: Solvent, jaimeastorga2000
comment by Solvent · 2012-01-02T03:49:42.779Z · LW(p) · GW(p)

Isn't the first one "The Martial Art of Rationality"?

comment by jaimeastorga2000 · 2012-01-02T04:23:46.283Z · LW(p) · GW(p)

You may find one of these helpful. As a heads up, though, you may want to begin with the essays on Eliezer's website (Bayes' Theorem, Technical Explanation, Twelve Virtues, and The Simple Truth) before you start his OB posts.

comment by MinibearRex · 2012-01-02T04:20:40.113Z · LW(p) · GW(p)

Check out this page. Additionally, at the end of each post, there is a link labeled "Article Navigation". Click that, and it will open links to the previous and next post by the author.

comment by NancyLebovitz · 2011-12-30T14:33:13.966Z · LW(p) · GW(p)

Is there a proof that it's possible to prove Friendliness?

Replies from: Vladimir_Nesov, XiXiDu
comment by Vladimir_Nesov · 2011-12-30T17:02:20.602Z · LW(p) · GW(p)

No. There's also no proof that it's possible to prove that P!=NP, and for the Friendliness problem it's much, much less clear what the problem even means. You aren't entitled to that particular proof, it's not expected to be available until it's not needed anymore. (Many difficult problems get solved or almost solved without a proof of them being solvable appearing in the interim.)

Replies from: NancyLebovitz
comment by NancyLebovitz · 2011-12-30T17:43:35.740Z · LW(p) · GW(p)

Why is it plausible that Friendliness is provable? Or is it more a matter that the problem is so important that it's worth trying regardless?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-12-30T18:54:39.734Z · LW(p) · GW(p)

There is no clearly defined or motivated problem of "proving Friendliness". We need to understand what goals are, what humane goals are, what process can be used to access their formal definition, and what kinds of things can be done with them, how, and to what end. We need to understand these things well, which (on a psychological level) triggers an association with mathematical proofs, and will probably actually involve some mathematics suitable to the task. Whether the answers take the form of something describable as "provable Friendliness" seems to me an unclear/unmotivated consideration. Unpacking that label might make it possible to provide a more useful response to the question.

comment by XiXiDu · 2011-12-30T15:05:23.404Z · LW(p) · GW(p)

Is there a proof that it's possible to prove Friendliness?

I wonder what SI would do next if they could prove that Friendly AI was not possible - for example, if it could be shown that value drift is inevitable and that utility functions are unstable under recursive self-improvement.

Replies from: TimS
comment by TimS · 2011-12-30T15:12:00.675Z · LW(p) · GW(p)

Something along the lines that value drift is inevitable and utility-functions are unstable under recursive self-improvement.

Those don't seem like the only circumstances in which FAI is not possible. If moral nihilism is true, then FAI is impossible even if value drift is not inevitable.
In that circumstance, shouldn't we try to make any AI we decide to build "friendly" to present-day humanity, even if it wouldn't be friendly to Aristotle or Plato or Confucius? Based on the hidden-complexity-of-wishes analysis, consistency with our current norms is still plenty hard.

Replies from: NancyLebovitz
comment by NancyLebovitz · 2011-12-30T16:38:14.618Z · LW(p) · GW(p)

My concerns are more that it will not be possible to adequately define "human", especially as transhuman tech develops, and that there might not be a good enough way to define what's good for people.

Replies from: shminux
comment by shminux · 2011-12-30T20:54:00.352Z · LW(p) · GW(p)

As I understand it, the modest goal of building an FAI is that of giving an AGI a push in the "right" direction, what EY refers to as the initial dynamics. After that, all bets are off.

comment by FiftyTwo · 2011-12-31T18:15:07.160Z · LW(p) · GW(p)

How do I work out what I want and what I should do?

Replies from: Costanza, Will_Newsome
comment by Costanza · 2011-12-31T21:57:18.263Z · LW(p) · GW(p)

Strictly speaking, this question may be a bit tangential to LessWrong, but this was never supposed to be an exclusive thread.

The answer will depend on a lot of things, mostly specific to you personally. Bodies differ. Even your own single body changes over the course of time. You have certain specific goals and values, and certain constraints.

Maybe what you're really looking for is recommendations for a proper physical fitness forum, which is relatively free of pseudoscience and what they call "woo." I can't advise, myself, but I'm sure that some LessWrongians can.

Replies from: Oscar_Cunningham
comment by Oscar_Cunningham · 2012-01-01T14:34:38.432Z · LW(p) · GW(p)

I think you've taken the wrong meaning of the words "work out".

Replies from: Costanza
comment by Costanza · 2012-01-01T17:32:34.716Z · LW(p) · GW(p)

[Literally laughing out loud at myself.]

Replies from: FiftyTwo
comment by FiftyTwo · 2012-01-02T00:50:44.686Z · LW(p) · GW(p)

Oscar is correct.

A better phrasing might be: given that I have difficulty inferring my own desires, what might be useful methods for discovering my pre-existing desires or choosing long-term goals?

[On an unrelated note, reddit's r/fitness is an extremely good source for scientifically based 'work out' advice, which I believe would satisfy Costanza's criteria.]

Replies from: Will_Newsome
comment by Will_Newsome · 2012-01-02T02:17:37.505Z · LW(p) · GW(p)

Read things like this at least, and assume there are a lot of things like that that we don't know about. That's another of my stopgap solutions.

Replies from: amcknight
comment by amcknight · 2012-01-04T02:22:12.125Z · LW(p) · GW(p)

Interesting link. But in general, I would like just a tiny bit of context so I might know why I want to click on "this".

comment by Will_Newsome · 2012-01-01T07:04:34.696Z · LW(p) · GW(p)

Devise and execute a highly precise ritual wherein you invoke the optimal decision theory. That's my stopgap solution.

comment by VKS · 2012-03-24T13:12:38.898Z · LW(p) · GW(p)

I think I may be incredibly confused.

Firstly, if the universe is distributions of complex amplitudes in configuration space, then shouldn't we describe our knowledge of the world as probability distributions of complex amplitude distributions? Is there some incredibly convenient simplification I'm missing?

Secondly, have I understood correctly that the universe, in quantum mechanics, is a distribution of complex values in an infinite-dimensional space, where each dimension corresponds to the particular values some attribute of some particle in the universe takes? With some symmetries across the dimensions to ensure indistinguishability between particles?

If that is true, and the perceptible universe splits off into alternate blob universes basically all the time, shouldn't the configuration space be packed full of other universe-blobs already, meaning that the particular state of the universe blob we are in has more than one predecessor? After all, there are about 3*(number of particles) dimensions in complex-amplitude space, but every particle is splitting off at about the same rate, so the full complex-amplitude space the universe is in should be "filling up" at about the same rate as if it contained only one particle in it.

Or have I gotten this all wrong, and is the universe actually a distribution of complex values over three- (or four-) dimensional space, one for each force, with each particle corresponding to a particularly high-magnitude distribution in a field at a point? If that is true, can somebody explain exactly how decoherence works?

Replies from: wedrifid
comment by wedrifid · 2012-03-24T13:15:27.346Z · LW(p) · GW(p)

Firstly, if the universe is distributions of complex amplitudes in configuration space, then shouldn't we describe our knowledge of the world as probability distributions of complex amplitude distributions?

More or less.

Is there some incredibly convenient simplification I'm missing?

I find I get a lot of mileage out of using words. It does lose a lot of information - which I suppose is rather the point.
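
For what it's worth, standard quantum mechanics does have a convenient package for "a probability distribution over amplitude distributions": the density operator. A minimal sketch (textbook material, nothing specific to the sequences):

```latex
% Classical uncertainty over quantum states gets folded into a single operator:
\rho = \sum_i p_i \, \lvert \psi_i \rangle \langle \psi_i \rvert,
\qquad p_i \ge 0, \quad \sum_i p_i = 1,
% and every prediction depends only on \rho, e.g. expectation values:
\langle A \rangle = \operatorname{Tr}(\rho A).
```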

Replies from: VKS
comment by VKS · 2012-03-24T21:43:58.041Z · LW(p) · GW(p)

What I meant by that is that distributions over other distributions are the sort of thing you would expect to be incredibly impractical to use, but which might also have some handy mathematical ways of looking at them. Since I am unfamiliar with the formalisms involved, I was wondering if anybody could enlighten me.

comment by Modig · 2012-01-24T01:26:47.421Z · LW(p) · GW(p)

There's an argument that I run into occasionally that I have some difficulty with.

Let's say I tell someone that voting is pointless, because one vote is extremely unlikely to alter the outcome of the election. Then someone might tell me that if everyone thought the way I do, democracy would be impossible.

And they may be right, but since everyone doesn't think the way I do, I don't find it to be a persuasive argument.

Other examples would be littering, abusing community resources, overusing antibiotics, et cetera. They may all be harmful, but if only one additional person does them, the net increased negative effect is likely negligible.

Does this type of argument have a name and where can I learn more about it? Feel free to share your own opinions/reflections on it as well if you think it's relevant!

Replies from: J_Taylor, TheOtherDave
comment by J_Taylor · 2012-01-24T01:34:41.043Z · LW(p) · GW(p)

Try searching for "free rider problem" or "tragedy of the commons."

Here are the relevant Wiki pages:

http://en.wikipedia.org/wiki/Free_rider_problem

http://en.wikipedia.org/wiki/Tragedy_of_the_commons

Replies from: Modig
comment by Modig · 2012-01-25T12:36:12.512Z · LW(p) · GW(p)

That's exactly it. I used to know that, can't believe I forgot it. Thanks!

comment by TheOtherDave · 2012-01-24T03:41:19.936Z · LW(p) · GW(p)

The related behavior pattern where everyone contributes to the collective problem is sometimes referred to as the tragedy of the commons. I'm fonder of "no single raindrop feels responsible for the flood," myself.

comment by FiftyTwo · 2012-01-11T05:01:24.511Z · LW(p) · GW(p)

How would I set up a website with a similar structure to LessWrong? That is, including user-submitted posts, comments, and an upvote/downvote system.

Replies from: dbaupp
comment by dbaupp · 2012-02-14T07:53:21.259Z · LW(p) · GW(p)

(Depending on your level of technical ability) you could get the source of LW and follow the setup instructions (changing the icons and styling etc. as appropriate).