Comment by Phil_Goetz5 on Is That Your True Rejection? · 2008-12-07T18:35:11.000Z · LW · GW

"You didn't know, but the predictor knew what you'll do, and if you one-box, that is your property that predictor knew, and you'll have your reward as a result."

No. That makes sense only if you believe that causality can work backwards. It can't.

"If predictor can verify that you'll one-box (after you understand the rules of the game, yadda yadda), your property of one-boxing is communicated, and it's all it takes."

Your property of one-boxing can't be communicated backwards in time.

We could get bogged down in discussions of free will; I am assuming free will exists, since arguing about the choice to make doesn't make sense unless free will exists. Maybe the Predictor is always right. Maybe, in this imaginary universe, rationalists are screwed. I don't care; I don't claim that rationality is always the best policy in alternate universes where causality doesn't hold and 2+2=5.

What if I've decided I'm going to choose based on a coin flip? Is the Predictor still going to be right? (If you say "yes", then I'm not going to argue with you anymore on this topic; because that would be arguing about how to apply rules that work in this universe in a different universe.)

Comment by Phil_Goetz5 on Is That Your True Rejection? · 2008-12-07T17:00:59.000Z · LW · GW

Vladimir, I understand the PD and similar cases. I'm just saying that the Newcomb paradox is not actually a member of that class. Any agent faced with either version - being told ahead of time that they will face the Predictor, or being told only once the boxes are on the ground - has a simple choice to make; there's no paradox and no PD-like situation. It's a puzzle only if you believe that there really is backwards causality.

Comment by Phil_Goetz5 on ...Recursion, Magic · 2008-11-25T19:17:00.000Z · LW · GW

"You speculate about why Eurisko slowed to a halt and then complain that Lenat has wasted his life with CYC, but you ignore that Lenat has his own theory which he gives as the reason he's been pursuing CYC. You should at least explain why you think his theory wrong; I find his theory quite plausible."

  • Around 1990, Lenat predicted that Cyc would go FOOM by 2000. In 1999, he told me he expected it to go FOOM within a couple of years. Where's the FOOM?
  • Cyc has no cognitive architecture. It's a database. You can ask it questions. It has templates for answering specific types of questions. It has (last I checked, about 10 years ago) no notion of goals, actions, plans, learning, or its own agenthood.

Comment by Phil_Goetz5 on Observing Optimization · 2008-11-21T21:27:08.000Z · LW · GW

If I want to predict that the next growth curve will be an exponential and put bounds around its doubling time, I need a much finer fit to the data than if I only want to ask obvious questions like..."Do the optimization curves fall into the narrow range that would permit a smooth soft takeoff?"
This implies that you have done some quantitative analysis giving a probability distribution of possible optimization curves, and finding that only a low-probability subset of that distribution allows for soft takeoff.

Presenting that analysis would be an excellent place to start.

Comment by Phil_Goetz5 on Ends Don't Justify Means (Among Humans) · 2008-10-15T23:15:22.000Z · LW · GW

Note for readers: I'm not responding to Phil Goetz and Jef Allbright. And you shouldn't infer my positions from what they seem to be arguing with me about - just pretend they're addressing someone else.
Is that on this specific question, or a blanket "I never respond to Phil or Jef" policy?

Huh. That doesn't feel very nice.
Nor very rational, if one's goal is to communicate.

Comment by Phil_Goetz5 on Ends Don't Justify Means (Among Humans) · 2008-10-15T21:37:43.000Z · LW · GW

All the discussion so far indicates that Eliezer's AI will definitely kill me, and some others posting here, as soon as he turns it on.

It seems likely, if it follows Eliezer's reasoning, that it will kill anyone who is overly intelligent. Say, the top 50,000,000 or so.

(Perhaps a special exception will be made for Eliezer.)

Hey, Eliezer, I'm working in bioinformatics now, okay? Spare me!

Eliezer: If you create a friendly AI, do you think it will shortly thereafter kill you? If not, why not?

Comment by Phil_Goetz5 on Ends Don't Justify Means (Among Humans) · 2008-10-15T00:24:35.000Z · LW · GW

He may have some model of an AI as a perfect Bayesian reasoner that he uses to justify neglecting this. I am immediately suspicious of any argument invoking perfection.
It may also be that what Eliezer has in mind is that any heuristic that can be represented to the AI, could be assigned priors and incorporated into Bayesian reasoning.

Eliezer has read Judea Pearl, so he knows how computational time for Bayesian networks scales with the domain, particularly if you don't ever assume independence when it is not justified, so I won't lecture him on that. But he may want to lecture himself.

(Constructing the right Bayesian network from sense-data is even more computationally demanding. Of course, if you never assume independence, then the only right network is the fully-connected one. I'm pretty certain that suggesting that a non-narrow AI will be reasoning over all of its knowledge with a fully-connected Bayesian network is computationally implausible. So all arguments that require AIs to be perfect Bayesian reasoners are invalid.)
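To put rough numbers on that scaling worry, here is a minimal sketch that counts the free parameters in the conditional probability tables of a network over n binary variables. The function names are illustrative (not from any library), and it assumes binary variables only. It compares a fully connected DAG, where each node conditions on every earlier node, with a simple chain.

```python
def fully_connected_params(n: int) -> int:
    """Free CPT parameters when node i has all i earlier nodes as parents."""
    # Node i (0-based) has i binary parents, so 2**i parent configurations,
    # each needing one free probability.
    return sum(2 ** i for i in range(n))


def chain_params(n: int) -> int:
    """Free CPT parameters when each node depends only on its predecessor."""
    if n == 0:
        return 0
    # The root needs 1 parameter; every later node needs 2 (one per parent value).
    return 1 + 2 * (n - 1)


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, fully_connected_params(n), chain_params(n))
```

The fully connected count is 2^n - 1, so it doubles with every added variable, while the chain grows linearly. This only counts table entries; inference cost is a separate (and also unfavorable) question.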

I'd like to know how much of what Eliezer says depends on the AI using Bayesian logic as its only reasoning mechanism, and whether he believes that is the best reasoning mechanism in all cases, or only one that must be used in order to keep the AI friendly.

Kaj: I will restate my earlier question this way: "Would AIs also find themselves in circumstances such that game theory dictates that they act corruptly?" It doesn't matter whether we say that the behavior evolved from accumulated mutations, or whether an AI reasoned it out in a millisecond. The problem is still there, if circumstances give corrupt behavior an advantage.

Comment by Phil_Goetz5 on Ends Don't Justify Means (Among Humans) · 2008-10-14T22:29:56.000Z · LW · GW

Good point, Jef - Eliezer is attributing the validity of "the ends don't justify the means" entirely to human fallibility, and neglecting that part accounted for by the unpredictability of the outcome.

He may have some model of an AI as a perfect Bayesian reasoner that he uses to justify neglecting this. I am immediately suspicious of any argument invoking perfection.

I don't know what "a model of evolving values increasingly coherent over increasing context, with effect over increasing scope of consequences" means.

Comment by Phil_Goetz5 on Ends Don't Justify Means (Among Humans) · 2008-10-14T21:46:48.000Z · LW · GW

The tendency to be corrupted by power is a specific biological adaptation, supported by specific cognitive circuits, built into us by our genes for a clear evolutionary reason. It wouldn't spontaneously appear in the code of a Friendly AI any more than its transistors would start to bleed.
This is critical to your point. But you haven't established this at all. You made one post with a just-so story about males in tribes perceiving those above them as corrupt, and then assumed, with no logical justification that I can recall, that this meant that those above them actually are corrupt. You haven't defined what corrupt means, either.

I think you need to sit down and spell out what 'corrupt' means, and then Think Really Hard about whether those in power actually are more corrupt than those not in power; and if so, whether the mechanisms that lead to that result are a result of the peculiar evolutionary history of humans, or of general game-theoretic / evolutionary mechanisms that would apply equally to competing AIs.

You might argue that if you have one Sysop AI, it isn't subject to evolutionary forces. This may be true. But if that's what you're counting on, it's very important for you to make that explicit. I think that, as your post stands, you may be attributing qualities to Friendly AIs, that apply only to Solitary Friendly AIs that are in complete control of the world.

Comment by Phil_Goetz5 on Why Does Power Corrupt? · 2008-10-14T16:20:04.000Z · LW · GW
Eliezer: I don't get your altruism. Why not grab the crown? All things being equal, a future where you get to control things is preferable to a future where you don't, regardless of your inclinations. Even if altruistic goals are important to you, it would seem like you'd have better chances of achieving them if you had more power. ... If all people, including yourself, become corrupt when given power, then why shouldn't you seize power for yourself? On average, you'd be no worse than anyone else, and probably at least somewhat better; there should be some correlation between knowing that power corrupts and not being corrupted. ... Benevolence itself is a trap. The wise treat men as straw dogs; to lead men, you must turn your back on them.
These are all Very Bad Things to say to someone who wants to construct the first AI.
Do we know that not-yet-powerful Stalin would have disagreed (internally) with a statement like "preserving Communism is worth the sacrifice of sending a lot of political opponents to gulags"?

Let's think about the Russian revolution. You have 3 people, arrayed in order of increasing corruption before coming to power: Trotsky, Lenin, Stalin. Lenin was nasty enough to oust Trotsky. Stalin was nasty enough to dispose of everybody who was a threat to him. Steven's point is good - that these people were all pre-corrupted - but we also see the corrupt rise to the top.

In the Cuban revolution, Fidel was probably more corrupt than Che from the start. I imagine Fidel would likely have had Che killed, if he in fact didn't.

So we now have 4 hypotheses:

  1. Males are inclined to perceive those presently in power as corrupt. (Eliezer)

  2. People are corrupted by power.

  3. People are corrupt. (Steven)

  4. Power selects for people who are corrupt.

How can we select from among these?

Comment by Phil_Goetz5 on Why Does Power Corrupt? · 2008-10-14T00:32:21.000Z · LW · GW

I'm unclear whether you're saying that we perceive those in power to be corrupt, or that they actually are corrupt. The beginning focuses on the former; the second half, on the latter.

The idea that we have evolved to perceive those in power over us as being corrupt faces the objection that the statement, "Power corrupts", is usually made upon observing all known history, not just the present.

Comment by Phil_Goetz5 on AIs and Gatekeepers Unite! · 2008-10-09T20:22:46.000Z · LW · GW

Has Eliezer explained somewhere (hopefully on a web page) why he doesn't want to post a transcript of a successful AI-box experiment?

Have the successes relied on a meta-approach, such as saying, "If you let me out of the box in this experiment, it will make people take the dangers of AI more seriously and possibly save all of humanity; whereas if you don't, you may doom us all"?

Comment by Phil_Goetz5 on My Bayesian Enlightenment · 2008-10-07T16:36:54.000Z · LW · GW

David - Yes, a human-level AI could be very useful. Politics and economics alone would benefit greatly from the simulations you could run.

(Of course, all of us but manual laborers would soon be out of a job.)

Comment by Phil_Goetz5 on Excluding the Supernatural · 2008-10-06T23:02:40.000Z · LW · GW

could you elaborate on the psychology of mythical creatures? That some creatures are "spiritual" sounds to me like a plausible distinction. I count vampires, but not unicorns. To me, a unicorn is just another chimera. Why do you think they're more special than mermaids? magic powers? How much of a consensus do you think exists?
Sorry I missed this!

I think it may have to do with how heavy a load of symbolism the creature carries. Unicorns were used a lot to symbolize purity, and acquired magical and non-magical properties appropriate to that symbolism. Dragons, vampires, and werewolves are also used symbolically. Mermaids, basilisks, not so much. Centaurs have lost their symbolism (a Greek Apollo/Dionysus dual-nature-of-man thing, I think), and C. S. Lewis did much to destroy the symbolism associated with fauns by making them nice chaps who like tea and dancing.

Now that I think about it, Lewis and Tolkien both wrote fantasy that was very literal-minded, and replaced symbolism with allegory.

Comment by Phil_Goetz5 on On Doing the Impossible · 2008-10-06T21:31:06.000Z · LW · GW

Thousands of years ago, philosophers began working on "impossible" problems. Science began when some of them gave up working on the "impossible" problems, and decided to work on problems that they had some chance of solving. And it turned out that this approach eventually led to the solution of most of the "impossible" problems.

Comment by Phil_Goetz5 on My Bayesian Enlightenment · 2008-10-06T21:06:11.000Z · LW · GW


If you tried to approximate The Rules because they were too computationally expensive to use directly, then, no matter how necessary that compromise might be, you would still end doing less than optimal.
You say that like it's a bad thing. Your statement implies that something that is "necessary" is not necessary. Just this morning I gave a presentation on the use of Bayesian methods for automatically predicting the functions of newly sequenced genes. The authors of the method I presented used the approximation P(A, B, C) ~ P(A) x P(B|A) x P(C|A) because it would have been difficult to compute P(C | B, A), and they didn't think B and C were correlated. Your statement condemns them as "less than optimal". But a sub-optimal answer you can compute is better than an optimal answer that you can't.
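As a minimal sketch of what that approximation does, the snippet below builds two small joint distributions over binary A, B, C and compares each with the factored form P(A) x P(B|A) x P(C|A). Both joints and all names are hypothetical illustrations, not the method from the presentation. Strictly, the factorization is exact when B and C are independent given A, which is the assumption modeled here.

```python
from itertools import product


def marginal(joint, keep):
    """Sum a joint over {0,1}^3 down to the variable positions in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def factored_approximation(joint):
    """Return P(A) * P(B|A) * P(C|A) for every (a, b, c)."""
    p_a = marginal(joint, (0,))
    p_ab = marginal(joint, (0, 1))
    p_ac = marginal(joint, (0, 2))
    approx = {}
    for a, b, c in product((0, 1), repeat=3):
        # P(A) * P(B|A) * P(C|A) simplifies to P(A,B) * P(A,C) / P(A).
        approx[(a, b, c)] = p_ab[(a, b)] * p_ac[(a, c)] / p_a[(a,)]
    return approx


def max_error(joint):
    """Largest absolute gap between the true joint and the factored form."""
    approx = factored_approximation(joint)
    return max(abs(joint[k] - approx[k]) for k in joint)


# Hypothetical joint in which B and C are independent given A.
p_a1, p_b1, p_c1 = 0.3, {0: 0.2, 1: 0.7}, {0: 0.5, 1: 0.1}
independent = {}
for a, b, c in product((0, 1), repeat=3):
    pa = p_a1 if a else 1 - p_a1
    pb = p_b1[a] if b else 1 - p_b1[a]
    pc = p_c1[a] if c else 1 - p_c1[a]
    independent[(a, b, c)] = pa * pb * pc

# Hypothetical joint in which C always equals B, so they are strongly correlated.
correlated = {
    (a, b, c): (0.25 if b == c else 0.0)
    for a, b, c in product((0, 1), repeat=3)
}

if __name__ == "__main__":
    print(max_error(independent), max_error(correlated))
```

When the independence assumption holds, the cheap factorization loses nothing; when it fails badly, the error is large. Whether the trade is worth it depends on how close the real data are to the first case.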
Do only that which you must do, and which you cannot do in any other way.

I am willing to entertain the notion that this is not utter foolishness, if you can provide us with some examples - say, ten or twenty - of scientists who had success using this approach. I would be surprised if the ratio of important non-mathematical discoveries made by following this maxim, to those made by violating it, was greater than .05. Even mathematicians often have many possible ways of approaching their problems.


Building an AGI and setting it at "human level" would be of limited value. Setting it at "human level" plus epsilon could be dangerous. Humans on their own are intelligent enough to develop dangerous technologies with existential risk. (Which prompts the question: Are we safer with AI, or without AI?)

Comment by Phil_Goetz5 on The Magnitude of His Own Folly · 2008-09-30T20:53:23.000Z · LW · GW

If the probability of AI (or grey goo, or some other exotic risk) existential risks were low enough (neglecting the creation of hell-worlds with negative utility), then you could neglect in favor of those other risks.
Asteroids don't lead to a scenario in which a paper-clipping AI takes over the entire light-cone and turns it into paper clips, preventing any interesting life from ever arising anywhere, so they aren't quite comparable.

Still, your point only makes me wonder how we can justify not devoting 10% of GDP to deflecting asteroids. You say that we don't need to put all resources into preventing unfriendly AI, because we have other things to prevent. But why do anything productive? How do you compare the utility of preventing possible annihilation to the utility of improvements in life? Why put any effort into any of the mundane things that we put almost all of our efforts into? (Particularly if happiness is based on the derivative of, rather than absolute, quality of life. You can't really get happier, on average; but action can lead to destruction. Happiness is problematic as a value for transhumans.)

This sounds like a straw man, but it might not be. We might just not have reached (or acclimatized ourselves to) the complexity level at which the odds of self-annihilation should begin to dominate our actions. I suspect that the probability of self-annihilation increases with complexity. Rather like how the probability of an individual going mad may increase with their intelligence. (I don't think that frogs go insane as easily as humans do, though it would be hard to be sure.) Depending on how this scales, it could mean that life is inherently doomed. But that would result in a universe where we were unlikely to encounter other intelligent life... uh...

It doesn't even need to scale that badly; if extinction events have a power law (they do), there are parameters for which a system can survive indefinitely, and very similar parameters for which it has a finite expected lifespan. Would be nice to know where we stand. The creation of AI is just one more point on this road of increasing complexity, which may lead inevitably to instability and destruction.

I suppose the only answer is to say that destruction is acceptable (and possibly inevitable); total area under the utility curve is what counts. Wanting an interesting world may be like deciding to smoke and drink and die young - and it may be the right decision. The AIs of the future may decide that dooming all life in the long run is worth it.

In short, the answer to "Eliezer's wager" may be that we have an irrational bias against destroying the universe.

But then, deciding what are acceptable risk levels in the next century depends on knowing more about cosmology, the end of the universe, and the total amount of computation that the universe is capable of.

I think that solving aging would change people's utility calculations in a way that would discount the future less, bringing them more in line with the "correct" utility computations.

Re. AI hell-worlds: SIAI should put "I Have No Mouth, and I Must Scream" by Harlan Ellison on its list of required reading.

Comment by Phil_Goetz5 on The Magnitude of His Own Folly · 2008-09-30T17:09:22.000Z · LW · GW

We are entering into a Pascal's Wager situation.

"Pascal's wager" is the argument that you should be Christian, because if you compute the expected value of being a Christian vs. of being an atheist, then for any finite positive probability that Christianity is correct, that finite probability multiplied by (infinite +utility minus infinite -utility) outweighs the other side of the equation.

The similar Yudkowsky wager is the argument that you should be an FAIer, because the negative utility of destroying the universe outweighs the other side of the equation, whatever the probabilities are. It is not exactly analogous, unless you believe that the universe can support infinite computation (if it isn't destroyed), because the negative utility isn't actually infinite.
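The difference between the infinite and the merely enormous case can be sketched numerically. This is only an illustration under assumed numbers: the function name, the probabilities, and the utilities below are all hypothetical, and the model is a bare two-outcome expected-value comparison.

```python
import math


def expected_gain(p_true: float, payoff_if_true: float, cost_if_false: float) -> float:
    """Expected gain of taking the wager over declining it.

    p_true: probability the claim is correct (must be > 0).
    payoff_if_true: utility difference between the two choices if the claim is true.
    cost_if_false: utility given up by taking the wager if the claim is false.
    """
    return p_true * payoff_if_true - (1 - p_true) * cost_if_false


# Pascal's version: infinite stakes swamp any positive probability.
pascal = expected_gain(1e-12, math.inf, 1000.0)

# Finite-stakes version: the sign now depends on the probability.
large_but_finite = 1e9
finite_low_p = expected_gain(1e-12, large_but_finite, 1000.0)
finite_high_p = expected_gain(1e-3, large_but_finite, 1000.0)

if __name__ == "__main__":
    print(pascal, finite_low_p, finite_high_p)
```

With infinite stakes the result is infinite for every positive probability, so the probability drops out of the decision. With finite stakes, however large, the probability matters again, which is why the two wagers are not exactly analogous.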

I feel that Pascal's wager is not a valid argument, but have a hard time articulating a response.

Comment by Phil_Goetz5 on Friedman's "Prediction vs. Explanation" · 2008-09-29T16:59:46.000Z · LW · GW

I've seen too many cases of overfitting data to trust the second theory. Trust the validated one more.

The question would be more interesting if we said that the original theory accounted for only some of the new data.

If you know a lot about the space of possible theories and "possible" experimental outcomes, you could try to compute which theory to trust, using (surprise) Bayes' law. If it were the case that the first theory applied to only 9 of the 10 new cases, you might find parameters such that you should trust the new theory more.

In the given case, I don't think there is any way to deduce that you should trust the 2nd theory more, unless you have some a priori measure of a theory's likelihood, such as its complexity.
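A rough sketch of that Bayesian comparison, under loudly stated assumptions: each theory is treated as predicting every observation correctly with a fixed probability `p_fit` (a made-up noise model), and the prior odds stand in for something like a complexity penalty. None of these numbers come from the original puzzle.

```python
def likelihood(n_fit: int, n_total: int, p_fit: float) -> float:
    """Probability of seeing n_fit matches out of n_total under the noise model."""
    return p_fit ** n_fit * (1 - p_fit) ** (n_total - n_fit)


def posterior_odds(prior_odds: float, fit_1: int, fit_2: int,
                   n_total: int, p_fit: float = 0.95) -> float:
    """Posterior odds of theory 1 over theory 2 = prior odds * likelihood ratio."""
    return (prior_odds
            * likelihood(fit_1, n_total, p_fit)
            / likelihood(fit_2, n_total, p_fit))


if __name__ == "__main__":
    # Both theories fit all 20 observations: only the prior separates them.
    print(posterior_odds(3.0, 20, 20, 20))
    # Theory 1 misses one new case: the data can now outweigh the prior.
    print(posterior_odds(3.0, 19, 20, 20))
```

When both theories fit everything, the likelihood ratio is 1 and the prior decides, which matches the point that some a priori measure is needed. When the first theory misses one of the new cases, the odds fall below 1 for these particular parameters.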

Comment by Phil_Goetz5 on Competent Elites · 2008-09-28T01:21:31.000Z · LW · GW

It's true that we don't like to think people better-off than us might be better than us. But two caveats:

  1. Just because the cream is concentrated at the top, doesn't mean that most of the cream (or the best cream) is at the top.

  2. Causation probably runs both ways on this one. There is a lot of evidence that richer and more-respected people are happier and healthier. Various explanations have been offered for this, including the explanation that health causes career success. That explanation turned out to have serious problems, although I can't now remember what they are. I heard them summarized in a talk at a SAGE (anti-aging) conference circa 2004, which I can no longer find via Google, because a different organization called SAGE now holds conferences on LGBT aging and totally dominates the search results.

I think that, if we could measure the degree to which a culture is able to promote based on merit, it would turn out to be a powerful economic indicator - particularly for knowledge-based economies.

Comment by Phil_Goetz5 on The Level Above Mine · 2008-09-26T19:39:07.000Z · LW · GW
You're probably among the top 10, certainly in the top 20, most-intelligent people I've met. That's good enough for anything you could want to do.

Okay, I realize you're going to read that and say, "It's obviously not good enough for things requiring superhuman intelligence!"

I meant that, if you compare your attributes to those of other humans, and you sort those attributes, with the one that presents you the most trouble in attaining your goal at the top, intelligence will not be near the top of that list for you, for any goal.

Comment by Phil_Goetz5 on The Level Above Mine · 2008-09-26T15:53:36.000Z · LW · GW

Wow, chill out, Eliezer. You're probably among the top 10, certainly in the top 20, most-intelligent people I've met. That's good enough for anything you could want to do. You are ranked high enough that luck, money, and contacts will all be more important factors for you than some marginal increase in intelligence.

Comment by Phil_Goetz5 on That Tiny Note of Discord · 2008-09-24T22:02:17.000Z · LW · GW

What I think is a far more likely scenario than missing out on the mysterious essence of rightness by indulging the collective human id, is that what 'humans' want as a compiled whole is not what we'll want as individuals. Phil might be aesthetically pleased by a coherent metamorality, and distressed if the CEV determines what most people want is puppies, sex, and crack. Remember that the percentage of the population that actually engages in debates over moral philosophy is diminishingly small, and everyone else just acts, frequently incoherently.
Ooh! I vote for puppies, sex, and crack.

(Just not all at the same time.)

Comment by Phil_Goetz5 on That Tiny Note of Discord · 2008-09-24T21:54:16.000Z · LW · GW

Eliezer says:

As far as I can tell, Phil Goetz is still pursuing a mysterious essence of rightness - something that could be right, when the whole human species has the wrong rule of meta-morals.


I have made this point twice now, and you've failed to comprehend it either time, and you're smart enough to comprehend it, so I conclude that you are overconfident. :)

The human species does not consciously have any rule of meta-morals. Neither do they consciously follow rules to evolve in a certain direction. Evolution happens because the system dynamics cause it to happen. There is a certain subspace of possible (say) genomes that is, by some objective measures, "good".

Likewise, human morality may have evolved in ways that are "good", without humans knowing how that happened. I'm not going to try to figure out here what "good" might mean; but I believe the analogy I'm about to make is strong enough that you should admit this as a possibility. And if you don't, you must admit (which you haven't) my accusation that CEV is abandoning the possibility that there is such a thing as "good".

(And if you don't admit any possibility that there is such a thing as goodness, you should close up shop, go home, and let the paperclipping AIs take over.)

If we seize control over our physical and moral evolution, we'd damn well better understand what we're replacing. CEV means replacing evolution with a system whereby people vote on what feature they'd like to evolve next.

I know you can understand this next part, so I'm hoping to hear some evidence of comprehension from you, or some point on which you disagree:

  • Dynamic systems can be described by trajectories through a state space. Suppose you take a snapshot of a bunch of particles traveling along these trajectories. For some open systems, the entropy of the set of particles can decrease over time. (You might instead say that, for the complete closed system, the entropy of the projection of a set of particles onto a manifold of its space can decrease. I'm not sure this is equivalent, but my instinct is that it is.) I will call these systems "interesting".
  • For a dynamic system to be interesting, it must have dimensions or manifolds in its space along which trajectories contract; in a bounded state space, this means that trajectories will end at a point, or in a cycle, or in a chaotic attractor.
  • We desire, as a rule of meta-ethics, for humanity to evolve according to rules that are interesting, in the sense just described. This is equivalent to saying that the complexity of humanity/society, by some measure, should increase. (Agree? I assume you are familiar enough with complex adaptive systems that I don't need to justify this.)
  • A system can be interesting only if there is some dynamic causing these attractors. In evolution, this dynamic is natural selection. Most trajectories for an organism's genome, without selection, would lead off of the manifold in which that genome builds a viable creature. Without selection, mutation would simply increase the entropy of the genome. Natural selection is a force pushing these trajectories back towards the "good" manifold.
  • CEV proposes to replace natural selection with (trans)human supervision. You want to do this even though you don't know what the manifold for "good" moralities is, nor what aspects of evolution have kept us near that manifold in the past. The only way you can NOT expect this to be utterly disastrous, is if you are COMPLETELY CERTAIN that morality is arbitrary, and there is no such manifold.

Since there OBVIOUSLY IS such a manifold for "fitness", I think the onus is on you to justify your belief that there is no such manifold for "morality". We don't even need to argue about terms. The fact that you put forth CEV, and that you worry about the ethics of AIs, proves that you do believe "morality" is a valid concept. We don't need to understand that concept; we need only to know that it exists, and is a by-product of evolution. "Morality" as developed further under CEV is something different than "morality" as we know it, by which I mean, precisely, that it would depart from the manifold. Whatever the word means, what CEV would lead to would be something different.

CEV makes an unjustified, arbitrary distinction between levels. It considers the "preferences" (which I, being a materialist, interpret as "statistical tendencies") of organisms, or of populations; but not of the dynamic system. Why do you discriminate against the larger system?

Carl writes,

If Approach 2 fails to achieve the aims of Approach 1, then humanity generally wouldn't want to pursue Approach 1 regardless. Are you asserting that your audience would tend to diverge from the rest of humanity if extrapolated, in the direction of Approach 1?
Yes; but reverse the way you say that. There are already forces in place that keep humanity evolving in ways that may be advantageous morally. CEV wants to remove those forces without trying to understand them first. Thus it is CEV that will diverge from the way human morality has evolved thus far.

Comment by Phil_Goetz5 on That Tiny Note of Discord · 2008-09-23T21:42:42.000Z · LW · GW

It sounds to me like this is leading towards collective extrapolated volition, and that you are presenting it as "patching" your previous set of beliefs so as to avoid catastrophic results in case life is meaningless.

It's not a patch. It's throwing out the possibility that life is not meaningless. Or, at least, it now opens up a big security hole for a set of new paths to catastrophe.

Approach 1: Try to understand morality. Try to design a system to be moral, or design a space for that system in which the gradient of evolution is similar to the gradient for morality.

Approach 2: CEV.

If there is some objective aspect to morality - perhaps not a specific morality, but let us say there are meta-ethics, rules that let us evaluate moral systems - then approach 1 can optimize above and beyond human morality.

Approach 2 can optimize accomplishment of our top-level goals, but can't further-optimize the top-level goals. It freezes-in any existing moral flaws at that level forever (such flaws do exist if there is an objective aspect to morality). Depending on the nature of the search space, it may inevitably lead to moral collapse (if we are at some point in moral space that has been chosen by adaptive processes that keep that point near some "ideal" manifold, and trajectories followed through moral space via CEV diverge from that manifold).

Comment by Phil_Goetz5 on Optimization · 2008-09-15T16:26:19.000Z · LW · GW

Eliezer - Consider maximizing y in the search space y = - vector_length(x). You can make this space as large as you like, by increasing the range or the dimensionality of x. But it does not get any more difficult, whether you measure by difficulty, power needed, or intelligence needed.
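A minimal sketch of the dimensionality half of this claim: fixed-length gradient ascent on y = -||x||, started at the same distance from the optimum in every dimension. The function name, starting point, and step size are illustrative assumptions. It does not cover increasing the range, where a fixed step size would need more steps even though the direction stays obvious.

```python
import math


def steps_to_optimum(dim: int, radius: float = 10.0, step: float = 0.3) -> int:
    """Count fixed-length gradient-ascent steps needed to maximize y = -||x||."""
    # Start at distance `radius` from the optimum, spread evenly over all axes.
    x = [radius / math.sqrt(dim)] * dim
    steps = 0
    while True:
        norm = math.sqrt(sum(v * v for v in x))
        if norm <= step:  # within one step of the maximum at x = 0
            return steps
        # The gradient of -||x|| is -x / ||x||; move `step` along it.
        scale = 1 - step / norm
        x = [v * scale for v in x]
        steps += 1


if __name__ == "__main__":
    for dim in (2, 10, 1000):
        print(dim, steps_to_optimum(dim))
```

The step count depends on the starting distance and step length, not on the number of dimensions, because the gradient always points straight at the optimum.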

Comment by Phil_Goetz5 on Excluding the Supernatural · 2008-09-12T15:48:31.000Z · LW · GW

I thought about this a bit more last night. I think the right justification for religion - which is not one that any religious person would consciously agree with - is that it does not take on faith the idea that truth is always good.

Reductionism aims at learning the truth. Religion is inconsistent and false - and that's a feature, not a bug. Its social purpose is to grease the wheels of society where bare truth would create friction.

For example: In Rwanda, people who slaughtered the families of other people in their village, are now getting out of jail and coming back to live with the surviving relatives of their victims in the same villages. Rwanda needs this to happen; there are so many killers and conspirators, that they can't keep them in jail or kill them - these killers are a significant part of their nation's work force. Also, this would start the war all over again.

I have heard a few accounts of how they persuade the surviving relatives to forgive and live with the killers. They agree that the only way to do this is by using religious arguments.

Perhaps a true rationalist could be persuaded to leave the killer of their family alone, on grounds of self-interest. I'm easily more rational than 99.9% of the population, but I don't think I'm that rational.

If we had a population of purely rational thinking machines, perhaps we would need no religion. But since we have only humans to work with, it may play a valid role where the irrational nature of humans and the rational truth of science would, together, lead to disaster.

Comment by Phil_Goetz5 on Excluding the Supernatural · 2008-09-12T01:45:51.000Z · LW · GW

Once, in a LARP, I played Isaac Asimov on a panel which was arguing whether vampires were real. It went something like this (modulo my memory): I asked the audience to define "vampire", and they said that vampires were creatures that lived by drinking blood.

I said that mosquitoes were vampires. So they said that vampires were humanoids who lived by drinking blood.

I said that Masai who drank the blood of their cattle were vampires. So they said that vampires were humanoids who lived by drinking blood, and were burned by sunlight.

I (may have) said that a Masai with xeroderma pigmentosum was a vampire. And so on.

My point was that vampires were by definition not real - or at least, not understandable - because any time we found something real and understandable that met the definition of a vampire, we would change the definition to exclude it.

(Strangely, some mythical creatures, such as vampires and unicorns, seem to be defined in a spiritual way; whereas others, such as mermaids and centaurs, do not. A horse genetically engineered to grow a horn would probably not be thought of as a "real" unicorn; a genenged mermaid probably would be admitted to be a "real" mermaid.)

Comment by Phil_Goetz5 on Excluding the Supernatural · 2008-09-12T01:29:26.000Z · LW · GW

I had a similar, shorter conversation with a theologian. He had hired me to critique a book he was writing, which claimed that reductionist science had reached its limits, and that it was time to turn to non-reductionist science.

The examples he gave were all phenomena which science had difficulty explaining, and which he claimed to explain as being irreducibly complex. For instance, because people had difficulty explaining how cells migrate in a developing fetus, he suggested (as Aristotle might have) that the cells had an innate fate or desire that led them to the right location.

What he really meant by non-reductionist science was that a "non-reductionist scientist" is allowed to throw up his hands and say that there is no explanation for something. A claim that a phenomenon is supernatural is always the assertion that something has no explanation. (I don't know that it needs to be presented as a mental phenomenon, as Eliezer says.) So to "do" non-reductionist science is simply to not do science.

It should be possible, then, for a religious person to rightly claim that their point of view is outside the realm of science. If they said, for instance, that lightning is a spirit, that is not a testable hypothesis.

In practice, religions build up webs of claims, and of connections to the non-spiritual world, that can be tested for consistency. If someone claims not just that lightning is a spirit, but that an anthropomorphic God casts lightning bolts at sinners, that is a testable hypothesis. Once, when I was a Christian, lightning struck the cross behind my church. This struck me as strong empirical evidence against the idea that God directed every bolt. (I suppose one could interpret it as divine criticism of the church. The church elders did not, however, pursue that angle.)

Comment by Phil_Goetz5 on Points of Departure · 2008-09-09T22:35:08.000Z · LW · GW

Perhaps this is how we generally explain the actions of others. The notion of a libertarian economist who wants to deregulate industry because he has thought about it and decided it is good for everyone in the long run would be about as alien to most people as an AI. They find it much more believable that he is a tool of corporate oppression.

Whether this heuristic reduction to the simplest explanation is wrong more often than it is right, is another question.

Comment by Phil_Goetz5 on Magical Categories · 2008-08-25T14:45:28.000Z · LW · GW

There are several famous science fiction stories about humans who program AIs to make humans happy, which then follow the letter of the law and do horrible things. The earliest is probably "With Folded Hands", by Jack Williamson (1947), in which AIs are programmed to protect humans, and they do this by preventing humans from doing anything or going anywhere. The most recent may be the movie "I, Robot."

I agree with E's general point - that AI work often presupposes that the AI magically has the same concepts as its inventor, even outside the training data - but the argument he uses is insidious and has disastrous implications:

"Which is the correct classification? This is not a property of the training data; it is a property of your preferences (or, if you prefer, a property of the idealized abstract dynamic you name 'right')."

This is the most precise assertion of the relativist fallacy that I've ever seen. It's so precise that its wrongness should leap out at you. (It's a shame that most relativists don't have the computational background for me to use it to explain why they're wrong.)

By "relativism", I mean (at the moment) the view that almost everything is just a point of view: There is no right or wrong, no beauty or ugliness. (Pure relativism would also claim that 2+2=5 is as valid as 2+2=4. There are people out there who think that. I'm not including that claim in my temporary definition.)

The argument for relativism is that you can never define anything precisely. You can't even come up with a definition for the word "game". So, the argument goes, whatever definition you use is okay. Stated more precisely, it would be Eliezer's claim that, given a set of instances, any classifier that agrees with the input set is equally valid.

The counterargument is, in part, that some classifiers are better than others, even when all of them satisfy the training data completely. The most obvious criterion to use is the complexity of the classifier.
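To make that concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy data, the two candidate classifiers, and the use of a stored-parameter count as a crude stand-in for "complexity". It is not anyone's actual method. It only shows that two classifiers can both fit the training data perfectly, disagree on an unseen input, and still be ranked by a simplicity criterion.

```python
# Hypothetical toy training set: (input, label) pairs.
TRAIN = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]


def threshold_classifier(x):
    """Simple hypothesis: a single cut point (one parameter)."""
    return 1 if x >= 5 else 0


# Memorizing hypothesis: one stored entry per training example.
LOOKUP = {x: y for x, y in TRAIN}


def lookup_classifier(x):
    """Returns the memorized label, with an arbitrary default of 0 for unseen inputs."""
    return LOOKUP.get(x, 0)


def fits_training_data(classifier):
    """True if the classifier reproduces every training label."""
    return all(classifier(x) == y for x, y in TRAIN)


CANDIDATES = {"threshold": threshold_classifier, "lookup": lookup_classifier}

# Crude complexity proxy (an assumption): number of stored parameters.
COMPLEXITY = {"threshold": 1, "lookup": len(LOOKUP)}


def prefer_simplest(candidates):
    """Among classifiers consistent with TRAIN, pick the least complex one."""
    consistent = [name for name, clf in candidates.items() if fits_training_data(clf)]
    return min(consistent, key=lambda name: COMPLEXITY[name])


if __name__ == "__main__":
    for name, clf in CANDIDATES.items():
        print(name, "fits training data:", fits_training_data(clf), "| label for 6:", clf(6))
    print("preferred:", prefer_simplest(CANDIDATES))
```

Both candidates agree with every training example, yet they give different labels for the unseen input 6, so the training data alone cannot choose between them. The complexity proxy can.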

Eliezer's argument, if he followed it through, would conclude that neural networks, and induction in general, can never work. The fact is that they often do.

Comment by Phil_Goetz5 on Mirrors and Paintings · 2008-08-25T14:00:45.000Z · LW · GW

"Phil Goetz, why should I care what sort of creatures the universe 'tends to produce'? What makes this a moral argument that should move me? Do you think that most creatures the universe produces must inevitably evolve to be moved by such an argument?"

I stated the reason:

"We MUST make this meta-level argument that the universe inherently produces creatures with pretty-valuable values. We have no other way of claiming to be better than pebble-sorters."

I don't think that we can argue for our framework of ideas from within our framework of ideas. If we continue to insist that we are better than pebble-sorters, we can justify it only by claiming that the processes that lead to our existence tend to produce good outcomes, whereas the hypothetical pebble-sorters are chosen from a much larger set of possible beings, with a much lower average moral acceptability.

A problem with this is that all sorts of insects and animals exist with horrifying "moral systems". We might convince ourselves that morals improve as a society becomes more complex. (That's just a thought in postscript.)

One possible conclusion - not one that I have reached, but one that you might conclude if the evidence comes out a certain way - is that the right thing to do is not to make any attempt to control the morals of AIs, because general evolutionary processes may be better at designing morals than we are.

Comment by Phil_Goetz5 on Mirrors and Paintings · 2008-08-24T17:52:03.000Z · LW · GW

Thinking about this post leads me to conclude that CEV is not the most right thing to do. There may be a problem with my reasoning, in that it could also be used by pebble-sorters to justify continued pebble-sorting. However, my reasoning includes the consequence that pebble-sorters are impossible, so that is a non-issue.

Think about our assumption that we are in fact better than pebble-sorters. It seems impossible for us to construct an argument concluding this, because any argument we make presumes the values we are trying to conclude.

Yet we continue to use the pebble-sorters, not as an example of another, equally-valid ethical system, but as an example of something wrong.

We can justify this by making a meta-level argument that the universe is biased to produce organisms with relatively valuable values. (I'm worried about the semantics of that statement, but let me continue.) Pebble-sorting, and other futile endeavors, are non-adaptive, and will lose any evolutionary race to systems that generate increased complexity (from some energy input).

We MUST make this meta-level argument that the universe inherently produces creatures with pretty-valuable values. We have no other way of claiming to be better than pebble-sorters.

Given this, we could use CEV to construct AIs... but we can also try to understand WHY the universe produces good values. Once we understand that, we can use the universe's rules to direct the construction of AIs. This could result in AIs with wildly different values than our own, but it may be more likely to result in non-futile AIs, or to produce more-optimal AIs (in terms of their values).

It may, in fact, be difficult or impossible to construct AIs that aren't eventually subject to the universe's benevolent, value-producing bias - since these AIs will be in the universe. But we have seen in human history that, although there are general forces causing societies with some of our values to prosper, we nonetheless find societies in local minima in which they are in continual warfare, pain, and poverty. So some effort on our part may increase the odds of, or decrease the time until, a good result.