**pengvado**on Open Problems Related to Solomonoff Induction · 2015-01-07T12:58:28.635Z · score: 2 (2 votes) · LW · GW

Use a prefix-free encoding for the hypotheses. There are *not* 2^n hypotheses of length n: some of the length-n bitstrings are incomplete, and you'd need to add more bits in order to get a hypothesis; others are actually a length-<n hypothesis plus some gibberish on the end.

Then the sum of the probabilities of all programs of all lengths combined is 1.0. After excluding the programs that don't halt, the normalization constant is Chaitin's Omega.
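The prefix-free property is what makes the probabilities well-behaved: Kraft's inequality says the weights 2^-len of any prefix-free code sum to at most 1. A toy check (the example codes are made up for illustration, not from the original discussion):

```python
def is_prefix_free(codewords):
    # no codeword may be a proper prefix of another
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

def kraft_sum(codewords):
    # sum of 2^-len(w); Kraft's inequality bounds this by 1 for prefix-free codes
    return sum(2.0 ** -len(w) for w in codewords)

complete = ["0", "10", "110", "111"]   # a complete prefix-free code
print(is_prefix_free(complete), kraft_sum(complete))  # True 1.0

not_pf = ["0", "01", "11"]             # "0" is a prefix of "01"
print(is_prefix_free(not_pf))          # False
```

A complete code hits the bound exactly; adding "gibberish on the end" of a codeword would violate prefix-freeness and double-count probability mass.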

**pengvado**on Open thread, Dec. 1 - Dec. 7, 2014 · 2014-12-05T10:56:04.628Z · score: 0 (0 votes) · LW · GW

The true point of no return has to be indeed much later than we believe it to be now.

Who is "we", and what do "we" believe about the point of no return? Surely you're not talking about ordinary doctors pronouncing medical death, because that's just irrelevant (pronouncements of medical death are assertions about what current medicine can repair, not about information-theoretic death). But I don't know what other consensus you could be referring to.

**pengvado**on Stupid Questions (10/27/2014) · 2014-12-03T18:48:18.704Z · score: 1 (1 votes) · LW · GW

I think your answer is in The Domain of Your Utility Function. That post isn't specifically about cryonics, but is about how you can care about possible futures in which you will be dead. If you understand both of the perspectives therein and are still confused, then I can elaborate.

**pengvado**on Causal decision theory is unsatisfactory · 2014-09-18T08:42:44.171Z · score: 3 (3 votes) · LW · GW

Why would a self-improving agent not improve its own decision-theory to reach an optimum without human intervention, given a "comfortable" utility function in the first place?

A self-improving agent does improve its own decision theory, but it uses its current decision theory to predict which self-modifications would be improvements, and broken decision theories can be wrong about that. Not all starting points converge to the same answer.

**pengvado**on Causal decision theory is unsatisfactory · 2014-09-16T05:00:22.492Z · score: 2 (2 votes) · LW · GW

That strategy is optimal if and only if the probability of success was reasonably high after all. Otoh, if you put an unconditional extortioner in an environment mostly populated by decision theories that refuse extortion, then the extortioner will start a war and end up on the losing side.

**pengvado**on Confused as to usefulness of 'consciousness' as a concept · 2014-07-15T15:17:36.786Z · score: 1 (1 votes) · LW · GW

Jbay didn't specify that the drug has to leave people able to answer questions about their own emotional state. And in fact there are some people who can't do that, even though they're otherwise functional.

**pengvado**on Friendly AI ideas needed: how would you ban porn? · 2014-03-27T00:10:21.583Z · score: 0 (0 votes) · LW · GW

There are many such operators, and different ones give different answers when presented with the same agent. Only a human utility function distinguishes the right way of interpreting a human mind as having a utility function from all of the wrong ways of interpreting a human mind as having a utility function. So you need to get a bunch of Friendliness Theory right before you can bootstrap.

**pengvado**on Open thread, January 25- February 1 · 2014-01-30T12:14:48.219Z · score: 0 (0 votes) · LW · GW

fanficdownloader. I haven't tried the webapp version of it, but I'm happy with the CLI.

**pengvado**on Why CFAR? · 2014-01-07T08:54:39.300Z · score: 54 (54 votes) · LW · GW

I donated $40,000.00

**pengvado**on Open thread for January 1-7, 2014 · 2014-01-04T13:42:57.673Z · score: 2 (2 votes) · LW · GW

If you can encode microstate s in n bits, that implies that you have a prior that assigns P(s)=2^-n. The set of all possible microstates is countably infinite. There is no such thing as a uniform distribution over a countably infinite set. Therefore, even the ignorance prior can't assign equal length bitstrings to all microstates.

**pengvado**on Walkthrough of "Definability of Truth in Probabilistic Logic" · 2013-12-11T04:09:45.387Z · score: 1 (1 votes) · LW · GW

- Can we instead do "probability distribution over equivalence classes of models of L", where equivalence is determined by agreement on the truth-values of all first order sentences? There's only 2^ℵ₀ of those, and the paper never depends on any distinction within such an equivalence class.

**pengvado**on The Relevance of Advanced Vocabulary to Rationality · 2013-11-30T18:18:09.696Z · score: 3 (3 votes) · LW · GW

Yes, that's the usual application, but it's the wrong level of generality to make them synonyms. "Fully general counterargument" is one particular absurdity that you can reduce things to. Even after you've specified that you're performing a reductio ad absurdum against the proposition "argument X is sound", you still need to say what the absurd conclusion is, so you still need a term for "fully general counterargument".

**pengvado**on Quantum versus logical bombs · 2013-11-18T08:50:13.670Z · score: 11 (11 votes) · LW · GW

Why should you not have preferences about something just because you can't observe it? Do you also not care whether an intergalactic colony-ship survives its journey, if the colony will be beyond the cosmological horizon?

**pengvado**on Looking for opinions of people like Nick Bostrom or Anders Sandberg on current cryo techniques · 2013-10-20T05:20:42.449Z · score: 3 (3 votes) · LW · GW

Here's a citation for the claim of DRAM persisting with >99% accuracy for seconds at operating temperature or hours at LN2. (The latest hardware tested there is from 2007. Did something drastically change in the last 6 years?)

**pengvado**on Timeless Identity · 2013-10-01T18:09:33.703Z · score: 2 (2 votes) · LW · GW

What relevance does personal identity have to TDT? TDT doesn't depend on whether the other instances of TDT are in copies of you, or in other people who merely use the same decision theory as you.

**pengvado**on Open thread, September 2-8, 2013 · 2013-09-21T11:13:14.216Z · score: 3 (3 votes) · LW · GW

That works with caveats: You can't just publish the seed in advance, because that would allow the player to generate the coin in advance. You can't just publish the seed in retrospect, because the seed is an ordinary random number, and if it's unknown then you're just dealing with an ordinary coin, not a logical one. So publish in advance the first k bits of the pseudorandom stream, where k > seed length, thus making it information-theoretically possible but computationally intractable to derive the seed; use the k+1st bit as the coin; and then publish the seed itself in retrospect to allow verification.
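A sketch of that commit-then-reveal protocol in code. The pseudorandom stream here is an assumed construction (SHA-256 in counter mode) and the parameters are hypothetical; the original comment doesn't pin these down:

```python
import hashlib
import secrets

def stream_bits(seed: bytes, nbits: int) -> str:
    # pseudorandom bitstream: SHA-256 of (seed, counter) blocks, concatenated
    out = ""
    counter = 0
    while len(out) < nbits:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out += "".join(f"{byte:08b}" for byte in block)
        counter += 1
    return out[:nbits]

seed = secrets.token_bytes(16)           # 128-bit seed, kept secret for now
k = 256                                  # publish k > seed-length stream bits
commitment = stream_bits(seed, k)        # published in advance: determines the
                                         # seed information-theoretically, but
                                         # deriving it is intractable
coin = stream_bits(seed, k + 1)[k]       # the (k+1)st bit is the coin

# later: publish the seed itself; anyone recomputes the stream and verifies
assert stream_bits(seed, k) == commitment
print("coin:", coin)
```

The player can't predict the coin from the commitment without breaking the hash, yet once the seed is revealed, the whole transcript is checkable.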

Possible desiderata that are still missing: If you take multiple coins from the same pseudorandom stream, then you can't allow verification until the end of the whole experiment. You could allow intermediate verification by committing to N different seeds and taking one coin from each, but that fails wedrifid's desideratum of a single indexable problem (which I assume is there to prevent Omega from biasing the result via nonrandom choice of seed?).

I can get both of those desiderata at once using a different protocol: Pick a public key cryptosystem, a key, and a hash function with a 1-bit output. You need a cryptosystem where there's only one possible signature of any given input+key, i.e. one that doesn't randomize encryption. To generate the Nth coin: sign N, publish the signature, then hash the signature.
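A toy instance of that signature-based protocol. It uses textbook (unpadded, deterministic) RSA with tiny, insecure, made-up parameters purely for illustration; a real deployment would need a serious key and a deterministic scheme such as RSA-FDH:

```python
import hashlib

# hypothetical toy RSA key (small primes; insecure, for illustration only)
p, q = 1000003, 1000033
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+ mod inverse)

def digest(message: int) -> int:
    return int.from_bytes(hashlib.sha256(str(message).encode()).digest(), "big") % n

def sign(message: int) -> int:
    # textbook RSA is deterministic: exactly one signature per message and key
    return pow(digest(message), d, n)

def coin(index: int) -> int:
    sig = sign(index)                        # publish sig alongside the coin
    assert pow(sig, e, n) == digest(index)   # anyone can verify the signature
    return hashlib.sha256(str(sig).encode()).digest()[0] & 1   # 1-bit hash

print([coin(i) for i in range(8)])
```

Each coin is individually verifiable as soon as its signature is published, and the seed (here, the private key) never needs to be revealed at all, which addresses both desiderata at once.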

**pengvado**on The genie knows, but doesn't care · 2013-09-07T19:22:29.453Z · score: 9 (9 votes) · LW · GW

In fact, the question itself seems superficially similar to the halting problem, where "running off the rails" is the analogue for "halting"

If you want to draw an analogy to halting, then what that analogy actually says is: There are lots of programs that provably halt, and lots that provably don't halt, and lots that aren't provable either way. The impossibility of the halting problem is irrelevant, because we don't need a fully general classifier that works for every possible program. We only need to find a single program that provably has behavior X (for some well-chosen value of X).

If you're postulating that there are some possible friendly behaviors, and some possible programs with those behaviors, but that they're *all* in the unprovable category, then you're postulating that friendliness is *dissimilar* to the halting problem in that respect.

**pengvado**on Yet more "stupid" questions · 2013-09-04T01:29:52.751Z · score: 1 (1 votes) · LW · GW

Then what you should be asking is "which problems are in BQP?" (if you just want a summary of the high level capabilities that have been proved so far), or "how do quantum circuits work?" (if you want to know what role individual qubits play). I don't think there's any meaningful answer to "a qubit's specs" short of a tutorial in the aforementioned topics. Here is one such tutorial I recommend.

**pengvado**on Yet more "stupid" questions · 2013-08-31T02:18:47.136Z · score: 3 (3 votes) · LW · GW

"do not affect anything outside of this volume of space"

Suppose you, standing outside the specified volume, observe the end result of the AI's work: Oops, that's an example of the AI affecting you. Therefore, the AI isn't allowed to do anything at all. Suppose the AI does nothing: Oops, you can see that too, so that's also forbidden. More generally, the AI is made of matter, which will have gravitational effects on everything in its future lightcone.

**pengvado**on Open thread, August 19-25, 2013 · 2013-08-21T17:37:45.561Z · score: 1 (1 votes) · LW · GW

Suppose I say "I prefer state X to Z, and don't express a preference between X and Y, or between Y and Z." I am not saying that X and Y are equivalent; I am merely refusing to judge.

If the result of that partial preference is that you start with Z and then decline the sequence of trades Z->Y->X, then you got dutch booked.

Otoh, maybe you want to accept the sequence Z->Y->X if you expect both trades to be offered, but decline each in isolation? But then your decision procedure is dynamically inconsistent: Standing at Z and expecting both trade offers, you have to precommit to using a different algorithm to evaluate the Y->X trade than you will want to use once you have Y.

**pengvado**on Does Checkers have simpler rules than Go? · 2013-08-19T04:38:19.979Z · score: 1 (1 votes) · LW · GW

I interpret Daniel_Burfoot's idea as: "import java.util.*" makes subsequent mentions of List longer, since there are more symbols in scope that it has to be distinguished from.

But I don't think that idea actually works. You can decompose the probability of a conjunction into a product of conditional probabilities, and you get the same number regardless of the order of said decomposition. Whatever probability (and corresponding total compressed size) you assign to a certain sequence of imports and symbols, you could just as well record the symbols first and then the imports. In which case by the time you get to the imports you already know that the program didn't invoke any utils other than List and Map, so even being able to represent the distinction between the two forms of import is counterproductive for compression.

The intuition of "there are more symbols in scope that it has to be distinguished from" goes wrong because it fails to account for updating a probability distribution over what symbol comes next based on what symbols you've seen. Such an update can include knowledge of which symbols come in the same header, if that's correlated with which symbols are likely to appear in the same program.
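The order-independence of the chain-rule decomposition can be checked directly. With any joint distribution over (import-line, symbol) pairs (the one below is a made-up toy, not from any real corpus), the total code length −log₂ P is the same whichever factor you encode first:

```python
import math

# hypothetical toy joint distribution over (import-line, symbol) pairs
joint = {
    ("import_util", "List"): 0.4,
    ("import_util", "Map"): 0.2,
    ("no_import", "List"): 0.1,
    ("no_import", "Foo"): 0.3,
}

def marginal(pos, value):
    return sum(p for k, p in joint.items() if k[pos] == value)

# encoding imports-then-symbols or symbols-then-imports gives the same total
# code length, because both products telescope to the joint probability
for (imp, sym), p in joint.items():
    imports_first = -math.log2(marginal(0, imp)) - math.log2(p / marginal(0, imp))
    symbols_first = -math.log2(marginal(1, sym)) - math.log2(p / marginal(1, sym))
    assert abs(imports_first - symbols_first) < 1e-9
    assert abs(imports_first - (-math.log2(p))) < 1e-9
print("code length is independent of decomposition order")
```

So any compression advantage has to come from the joint distribution itself, not from which field you happen to write down first.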

**pengvado**on Open thread, July 29-August 4, 2013 · 2013-07-30T20:44:40.801Z · score: 1 (1 votes) · LW · GW

Eliezer's proposal was a different notation, not an actual change in the strength of Solomonoff Induction. The usual form of SI with deterministic hypotheses is already equivalent to one with probabilistic hypotheses. Because a single hypothesis with prior probability P that assigns uniform probability to each of 2^N different bitstrings makes the same predictions as an ensemble of 2^N deterministic hypotheses each of which has prior probability P*2^-N and predicts one of the bitstrings with certainty; and a Bayesian update in the former case is equivalent to just discarding falsified hypotheses in the latter. Given any computable probability distribution, you can with O(1) bits of overhead convert it into a program that samples from that distribution when given a uniform random string as input, and then convert that into an ensemble of deterministic programs with different hardcoded values of the random string. (The other direction of the equivalence is obvious: a computable deterministic hypothesis is just a special case of a computable probability distribution.)
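The stochastic-hypothesis/deterministic-ensemble equivalence is easy to check in miniature (toy values N = 4 and P = 1/8 below; any prefix gives matching predictions):

```python
from itertools import product

N, P = 4, 0.125
strings = ["".join(bits) for bits in product("01", repeat=N)]

def stochastic_next_bit(prefix):
    # one hypothesis, prior P, assigning uniform probability to all 2^N strings
    compat = [s for s in strings if s.startswith(prefix)]
    return sum(s[len(prefix)] == "1" for s in compat) / len(compat)

def ensemble_next_bit(prefix):
    # 2^N deterministic hypotheses with prior P * 2^-N each; a Bayesian update
    # is just discarding falsified hypotheses and renormalizing the rest
    weights = {s: P / 2 ** N for s in strings if s.startswith(prefix)}
    total = sum(weights.values())
    return sum(w for s, w in weights.items() if s[len(prefix)] == "1") / total

for prefix in ("", "0", "01", "110"):
    assert stochastic_next_bit(prefix) == ensemble_next_bit(prefix)
print(stochastic_next_bit("01"), ensemble_next_bit("01"))  # 0.5 0.5
```

The two representations assign identical predictive probabilities at every step, which is the sense in which the notations have equal strength.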

Yes, if you put a Solomonoff Inductor in an environment that contains a fair coin, it would come up with increasingly convoluted Turing machines. This is a problem only if you care about the value of an intermediate variable (posterior probability assigned to individual programs), rather than the variable that SI was actually designed to optimize, namely accurate predictions of sensory inputs. This manifests in AIXI's limitation to using a sense-determined utility function. (Granted, a sense-determined utility function really isn't a good formalization of my preferences, so you couldn't build an FAI that way.)

**pengvado**on Open thread, July 23-29, 2013 · 2013-07-25T01:16:18.233Z · score: 1 (1 votes) · LW · GW

Is there a benefit from doing that server-side rather than client-side? I've long since configured my web browser to always use my favorite font rather than whatever is suggested by any website.

**pengvado**on Harry Potter and the Methods of Rationality discussion thread, part 24, chapter 95 · 2013-07-19T08:17:18.620Z · score: 1 (1 votes) · LW · GW

"Oh," said Professor Quirrell, "don't worry about a little rough handling. You could toss this diary in a fireplace and it would emerge unscathed.

That isn't necessarily the same level of indestructibility as a horcrux. It could just be a standard charm placed on rare books.

**pengvado**on Evidential Decision Theory, Selection Bias, and Reference Classes · 2013-07-19T03:14:10.786Z · score: 1 (1 votes) · LW · GW

If I already know "I am EDT", then "I saw myself doing X" does imply "EDT outputs X as the optimal action". Logical omniscience doesn't preclude imagining counterfactual worlds, but imagining counterfactual worlds is a different operation than performing Bayesian updates. CDT constructs counterfactuals by severing some of the edges in its causal graph and then assuming certain values for the nodes that no longer have any causes. TDT does too, except with a different graph and a different choice of edges to sever.

**pengvado**on Evidential Decision Theory, Selection Bias, and Reference Classes · 2013-07-18T10:03:12.993Z · score: 1 (1 votes) · LW · GW

The way EDT operates is to perform the following three steps for each possible action in turn:

- Assume that I saw myself doing X.
- Perform a Bayesian update on this new evidence.
- Calculate and record my utility.

Ideal Bayesian updates assume logical omniscience, right? Including knowledge about logical fact of what EDT would do for any given input. If you know that you are an EDT agent, and condition on all of your past observations and also on the fact that you do X, but X is not in fact what EDT does given those inputs, then as an ideal Bayesian you will know that you're conditioning on something impossible. More generally, what update you perform in step 2 depends on EDT's input-output map, thus making the definition circular.

So, is EDT really underspecified? Or are you supposed to search for a fixed point of the circular definition, if there is one? Or does it use some method other than Bayes for the hypothetical update? Or does an EDT agent really break if it ever finds out its own decision algorithm? Or did I totally misunderstand?

**pengvado**on New report: Intelligence Explosion Microeconomics · 2013-07-17T04:01:09.096Z · score: 0 (0 votes) · LW · GW

Yes. (At least that's the general consensus among complexity theorists, though it hasn't been proved.) This doesn't contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:

P⊂BQP⊂EXP

P⊂NP⊂EXP

BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊂NP is at least plausible, while NP⊆BQP is not.)

If you don't measure any distinctions finer than P vs EXP, then you're using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren't.

**pengvado**on Open Thread, July 1-15, 2013 · 2013-07-14T00:32:47.217Z · score: 0 (0 votes) · LW · GW

If instead the simulator can read the real probability on an infinite tape... obviously it can't read the whole tape before producing an output. So it has to read, then output, then read, then output. It seems intuitive that with this strategy, it can place an absolute limit on the advantage that any attacker can achieve, but I don't have a proof of that yet.

In this model, a simulator can exactly match the desired probability in O(1) expected time per sample. (The distribution of possible running times extends to arbitrarily large values, but its expected value is finite. If this were a decision problem rather than a sampling problem, I'd call it ZPP.)

Algorithm:

1. Start with an empty string S.
2. Flip a coin and append the result to S.
3. If S is exactly equal to the corresponding-length prefix of your probability tape P, then go to step 2.
4. Otherwise, output the result of comparing (S < P).
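A sketch of that sampler in code (the bit-source and termination details are my reconstruction of what the comment leaves unstated). The first position where S differs from the binary expansion of p decides the comparison, so the expected number of flips per sample is 2, independent of p:

```python
import random

def sample(p_bits, rng):
    """p_bits iterates over the binary expansion of p; True with probability p."""
    for p_bit in p_bits:
        s_bit = rng.randrange(2)
        if s_bit != p_bit:
            return s_bit < p_bit   # S < p iff S's bit is 0 where p's bit is 1
    return False                   # expansions matched exactly (probability ~0)

def bits_of(p, n=64):
    # first n bits of the binary expansion of p in [0, 1)
    for _ in range(n):
        p *= 2
        bit = int(p)
        yield bit
        p -= bit

rng = random.Random(0)
freq = sum(sample(bits_of(0.3), rng) for _ in range(100_000)) / 100_000
print(freq)   # ≈ 0.3
```

Each loop iteration terminates with probability 1/2, which is where the O(1) expected time (with an unbounded tail) comes from.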

**pengvado**on For FAI: Is "Molecular Nanotechnology" putting our best foot forward? · 2013-06-24T10:45:39.510Z · score: 2 (2 votes) · LW · GW

Answering "how will this protein most likely fold?" is computationally much easier (as far as we can tell) than answering "what protein will fold like this?"

Got a reference for that? It's not obvious to me (CS background, not bio).

What if you have an algorithm that attempts to solve the "how will this protein most likely fold?" problem, but is only tractable on 1% of possible inputs, and just gives up on the other 99%? As long as the 1% contains enough interesting structures, it'll still work as a subroutine for the "what protein will fold like this?" problem. The search algorithm just has to avoid the proteins that it doesn't know how to evaluate. That's how human engineers work, anyway: "what does this pile of spaghetti code do?" is uncomputable in the worst case, but that doesn't stop programmers from solving "write a program that does X".

**pengvado**on Open Thread, June 16-30, 2013 · 2013-06-23T11:28:51.237Z · score: 2 (2 votes) · LW · GW

The selection effect you mention only applies to offering bets, not accepting them. If Alice announces her betting odds and then Bob decides which side of the bet to take, Alice might be doing something irrational there (if she didn't have a bid-ask spread), but we can still talk about dutch books from Bob's perspective. If you want to eliminate the effect whereby Bob updates on the existence of Alice's offer before making his decision, then replace Alice with an automated market maker (setup by someone who expects to lose money in exchange for outsourcing the probability estimate). Or assume some natural process with a naturally occurring payoff ratio that isn't determined by the payoff frequencies nor by anyone's state of knowledge.

**pengvado**on Prisoner's Dilemma (with visible source code) Tournament · 2013-06-11T07:03:44.689Z · score: 1 (1 votes) · LW · GW

I had in mind an automated wrapper generator for the "passed own sourcecode" version of the contest:

```
;; CliqueBot cooperates iff the opponent's source is identical to its own.
(define CliqueBot
  (lambda (self opponent)
    (if (eq? self opponent) 'C 'D)))
;; Wrapper converts an agent into one that ignores the "self" argument it is
;; passed, substituting the wrapped agent's own source instead.
(define Wrapper
  (lambda (agent)
    (lambda (self opponent)
      (agent agent opponent))))
(define WrappedCliqueBot
  (Wrapper CliqueBot))
```

Note that for all values of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there's no possible code you could add to CliqueBot that would break this identity. Now I just realized that the very fact that WrappedCliqueBot *doesn't* depend on its "self" argument, provides a way to distinguish it from the unmodified CliqueBot using only blackbox queries, so in that sense it's not quite functionally identical. Otoh, if you consider it unfair to discriminate against agents just because they use old-fashioned quine-type self-reference rather than exploiting the convenience of a "self" argument, then this transformation is fair.

**pengvado**on Prisoner's Dilemma (with visible source code) Tournament · 2013-06-10T13:23:04.378Z · score: 2 (2 votes) · LW · GW

How does that help? A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the "passed its own sourcecode" format. You could mess with that if you're smart enough to locate and delete the "verify the signature" step, but then you could do that in the real "passed its own sourcecode" format too.

Conversely, even if the tournament program itself is honest, contestants can lie to their simulations of their opponents about what sourcecode the simulation is of.

**pengvado**on Estimates vs. head-to-head comparisons · 2013-05-09T03:11:14.398Z · score: 1 (1 votes) · LW · GW

Assume you have noisy measurements X1, X2, X3 of physical quantities Y1, Y2, Y3 respectively; variables 1, 2, and 3 are independent; X2 is much noisier than the others; and you want a point-estimate of Y = Y1+Y2+Y3. Then you shouldn't use either X1+X2+X3 or X1+X3. You should use E[Y1|X1] + E[Y2|X2] + E[Y3|X3]. Regression to the mean is involved in computing each of the conditional expectations. Lots of noise (relative to the width of your prior) in X2 means that E[Y2|X2] will tend to be close to the prior E[Y2] even for extreme values of X2, but E[Y2|X2] is still a better estimate of that portion of the sum than E[Y2] is.
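A quick Monte Carlo sketch of the point, with hypothetical numbers (standard-normal priors for each Yi, and X2 given much larger measurement noise). For Gaussians, E[Yi | Xi] = Xi · prior_var / (prior_var + noise_var), which is exactly the regression to the mean described above:

```python
import random

rng = random.Random(1)
noise_sd = [0.5, 3.0, 0.5]                        # X2 is the noisy measurement
shrink = [1 / (1 + sd * sd) for sd in noise_sd]   # prior variance = 1

err_sum = err_drop = err_shrunk = 0.0
for _ in range(50_000):
    ys = [rng.gauss(0, 1) for _ in range(3)]
    xs = [y + rng.gauss(0, sd) for y, sd in zip(ys, noise_sd)]
    y = sum(ys)
    err_sum += (sum(xs) - y) ** 2                        # X1 + X2 + X3
    err_drop += (xs[0] + xs[2] - y) ** 2                 # drop the noisy X2
    err_shrunk += (sum(x * s for x, s in zip(xs, shrink)) - y) ** 2
print(err_shrunk < err_drop < err_sum)   # True
```

With these numbers the expected squared errors are about 1.3, 1.5, and 9.5 per trial respectively: the heavily-shrunk E[Y2|X2] term is still worth more than either keeping X2 raw or discarding it.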

**pengvado**on Can somebody explain this to me?: The computability of the laws of physics and hypercomputation · 2013-04-22T14:07:44.511Z · score: 2 (2 votes) · LW · GW

unless you are allowed to pose infinitely many problems

Or one selected at random from an infinite class of problems.

Also, if the universe is spatially infinite, it can solve the halting problem in a deeply silly way, namely there could be an infinite string of bits somewhere, each a fixed distance from the next, that just hardcodes the solution to the halting problem.

That's why both computability theory and complexity theory require algorithms to have finite sized sourcecode.

**pengvado**on Time turners, Energy Conservation and General Relativity · 2013-04-22T00:05:07.504Z · score: 2 (2 votes) · LW · GW

Novikov consistency is synonymous with Stable Time Loop, where all time travelers observe the same events as they remember from their subjectively-previous iteration. This is as opposed to MWI-based time travel, where the no-paradox rule merely requires that the overall distribution of time travelers arriving at t0 equals the overall distribution of people departing in time machines at t1.

Yes, Novikov talked about QM. He used the sum-over-histories formulation, restricted to the subset of histories that each singlehandedly form a classical stable time loop. This allows some form of multiple worlds, but not standard MWI: This forbids any Everett branching from happening during the time loop (if any event that affects the time traveler's state branched two ways, one of them would be inconsistent with your memory), and instead branches only on the question of what comes out of the time machine.

**pengvado**on Bitcoins are not digital greenbacks · 2013-04-21T11:45:48.106Z · score: 5 (5 votes) · LW · GW

The US government made Tor? Awesome. I wonder which part of the government did it.

**pengvado**on Time turners, Energy Conservation and General Relativity · 2013-04-18T13:50:15.306Z · score: 3 (3 votes) · LW · GW

You can certainly postulate a physics that's both MWI and contains something sorta like Time-Turners except without the Novikov property. The problem with that isn't paradox, it just doesn't reproduce the fictional experimental evidence we're trying to explain. What's impossible is MWI with something exactly like Time-Turners including Novikov.

**pengvado**on Fermi Estimates · 2013-04-07T05:07:26.916Z · score: 5 (5 votes) · LW · GW

More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding's inequality).

If success of a Fermi estimate is defined to be "within a factor of 10 of the correct answer", then that's a *constant* bound on the allowed error of the logarithm. No "compared to the expected value of the logarithm" involved. Besides, I wouldn't expect the value of the logarithm to grow with the number of pieces either: the log of an individual piece can be negative, and the true answer doesn't get bigger just because you split the problem into more pieces.

So, assuming independent errors, using either Hoeffding's inequality or the central limit theorem to estimate the error of the result says that you're better off using as few inputs as possible. The reason Fermi estimates even involve more than one step is that you can make the per-step error smaller by choosing pieces that you're somewhat confident of.
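A simulation sketch of that tradeoff (the per-piece error model is hypothetical: lognormal with sigma = 0.5 in natural log). The log-errors add, so their spread grows like sqrt(k) with the number of pieces k, while "within a factor of 10" stays a fixed bound on |log error|:

```python
import math
import random

rng = random.Random(2)

def p_within_10x(k, trials=20_000, sigma=0.5):
    # fraction of estimates landing within a factor of 10 of the truth,
    # when the estimate is a product of k independent lognormal error factors
    hits = 0
    for _ in range(trials):
        log_err = sum(rng.gauss(0, sigma) for _ in range(k))
        hits += abs(log_err) <= math.log(10)
    return hits / trials

print(p_within_10x(1))    # nearly 1
print(p_within_10x(16))   # around 0.75: the log-error spread is 4x larger
```

Holding per-piece error fixed, fewer pieces always wins; more pieces only help if each piece's sigma shrinks faster than sqrt(k) grows.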

**pengvado**on Reflection in Probabilistic Logic · 2013-03-24T02:52:38.086Z · score: 0 (0 votes) · LW · GW

What if, in building a non-Löb-compliant AI, you've already failed to give it part of your inference ability / trust-in-math / whatever-you-call-it? Even if the AI figures out how to not lose any more, that doesn't mean it's going to get back the part you missed.

Possibly related question: Why try to solve decision theory, rather than just using CDT and let it figure out what the right decision theory is? Because CDT uses its own impoverished notion of "consequences" when deriving what the consequence of switching decision theories is.

**pengvado**on Bayesian Adjustment Does Not Defeat Existential Risk Charity · 2013-03-19T01:20:42.136Z · score: 0 (0 votes) · LW · GW

"50:4" in the post refers to "P(V=1|A=100)*1 : P(V=100|A=100)*100", not "EV(A=1) : EV(A=100)". EV(A=1) is irrelevant, since we know that A is in fact 100.

**pengvado**on Save the princess: A tale of AIXI and utility functions · 2013-02-02T10:53:06.360Z · score: 4 (4 votes) · LW · GW

IIUC this addresses the ontology problem in AIXI by assuming that the domain of our utility function already covers every possible computable ontology, so that whichever one turns out to be correct, we already know what to do with it. If I take the AIXI formalism to be a literal description of the universe (i.e. not just dualism, but also that the AI is running on a hypercomputer, the environment is running on a Turing machine, and the utility function cares only about the environment, not the AI's internals), then I think the proposal works.

But under the more reasonable assumption of an environment at least as computationally complex as the AI itself (whether AIXI in an uncomputable world or AIXI-tl in an exptime world or whatever), Solomonoff Induction still beats any computable sensory prediction, but the proposed method of dealing with utility functions fails: The Solomonoff mixture doesn't contain a simulation of the true state of the environment, so the utility function is never evaluated with the true state of the environment as input. It's evaluated on various approximations of the environment, but those approximations are selected so as to correctly predict sensory inputs, not to correctly predict utilities. If the proposal was supposed to work for arbitrary utility functions, then there's no reason to expect those two approximation problems to be at all related even before Goodhart's law comes into the picture; or if you intended to assume some smoothness-under-approximation constraint on the utility function, then that's exactly the part of the problem that you handwaved.

**pengvado**on 2012 Survey Results · 2013-01-10T23:50:31.359Z · score: 1 (1 votes) · LW · GW

As far as I can tell from wikipedia's description of admissibility, it makes the same assumptions as CDT: That the outcome depends only on your action and the state of the environment, and not on any other properties of your algorithm. This assumption fails in multi-player games.

So your quote actually means: *If you're going to use CDT* then Bayes is the optimal way to derive your probabilities.

**pengvado**on So you think you understand Quantum Mechanics · 2013-01-02T13:28:55.807Z · score: 3 (3 votes) · LW · GW

The intuition: For a high dimensional ball, most of the volume is near the surface, and most of the surface is near the equator (for any given choice of equator). The extremity of "most" and "near" increases with number of dimensions. The intersection of two equal-size balls is a ball minus a slice through the equator, and thus missing most of its volume even if it's a pretty thin slice.

The calculation:

Let v(n, r) = 2 ∫_{y=0}^{r} v(n−1, √(r² − y²)) dy = (2 π^{n/2} r^n) / (n Γ(n/2)), which is the volume of an n-dimensional ball of radius r.

Then the fraction of overlap between two balls whose centers are displaced by x is (2 ∫_{y=x/2}^{r} v(n−1, √(r² − y²)) dy) / v(n, r). (The integrand is a cross-section of the intersection (which is a lower-dimensional ball), and y proceeds along the axis of displacement.) Numeric result.
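The closed form and the recurrence v(n, r) = 2∫₀^r v(n−1, √(r² − y²)) dy can be cross-checked numerically (a sketch with a simple midpoint rule; the integration details are mine, not the original numeric-result link):

```python
import math

def ball_volume(n, r):
    # closed form: v(n, r) = 2 * pi^(n/2) * r^n / (n * Gamma(n/2))
    return 2 * math.pi ** (n / 2) * r ** n / (n * math.gamma(n / 2))

def ball_volume_by_recurrence(n, r, steps=100_000):
    # v(n, r) = 2 * integral_{y=0}^{r} v(n-1, sqrt(r^2 - y^2)) dy, midpoint rule
    h = r / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += ball_volume(n - 1, math.sqrt(r * r - y * y))
    return 2 * total * h

print(ball_volume(3, 1.0))                # ≈ 4.18879 (= 4*pi/3)
print(ball_volume_by_recurrence(3, 1.0))  # ≈ 4.18879
```

The same integrand with the lower limit moved from 0 to x/2 (and the result divided by v(n, r)) gives the overlap fraction of two balls displaced by x.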

**pengvado**on So you think you understand Quantum Mechanics · 2013-01-02T11:54:05.253Z · score: 2 (2 votes) · LW · GW

First consider a 10000-dimensional unit ball. If we shift this ball by two units in one of the dimensions, it would no longer intersect at all with the original volume. But if we were to shift it by 1/1000 units in each of the 10000 dimensions, the shifted ball would still mostly overlap with the original ball even though we've shifted it by a total of 10 units (because the distance between the centers of the balls is only sqrt(10000)/1000 = 0.1).

Actually no, it doesn't mostly overlap. If we consider a hypercube of radius 1 (displaced along the diagonal) instead of a ball, for simplicity, then the overlap fraction is 0.9995^10000 = 0.00673. If we hold the manhattan distance (10) constant and let number of dimensions go to infinity, then overlap converges to 0.00674 while euclidean distance goes to 0. If we hold the euclidean distance (0.1) constant instead, then overlap converges to 0 (exponentially fast).
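The hypercube numbers are one line to check (side-2 cubes standing in for unit-radius balls, displaced by d along each of n axes, so the per-axis overlap fraction is (2 − d)/2):

```python
import math

def cube_overlap(n, d):
    # overlap fraction of two side-2 hypercubes shifted by d along each axis
    return ((2 - d) / 2) ** n

print(cube_overlap(10_000, 1 / 1000))   # 0.9995 ** 10000 ≈ 0.00673
print(math.exp(-5))                     # n → ∞ limit at L1 distance 10: ≈ 0.00674
```

Holding the L1 displacement at 10 while n grows drives the per-axis shift to 0 but keeps the product pinned near e^−5, which is why the overlap converges to 0.00674 even as the Euclidean distance vanishes.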

For the ball, I calculate an overlap fraction of 5.6×10^-7, and the same asymptotic behaviors.

(No comment on the physics part of your argument.)

**pengvado**on 2012 Winter Fundraiser for the Singularity Institute · 2012-12-09T03:32:38.270Z · score: 10 (10 votes) · LW · GW

you're too young (and didn't have much income before anyway) to have significant savings.

Err, I haven't yet earned as much from the lazy entrepreneur route as I would have if I had taken a standard programming job for the past 7 years (though I'll pass that point within a few months at the current rate). So don't go blaming my cohort's age if they haven't saved and/or donated as much as me. I'm with Rain in spluttering at how people can have an income and not have money.

**pengvado**on 2012 Winter Fundraiser for the Singularity Institute · 2012-12-08T21:51:53.982Z · score: 22 (22 votes) · LW · GW

I value my free time far too much to work for a living. So your model is correct on that count. I had planned to be mostly unemployed with occasional freelance programming jobs, and generally keep costs down.

But then a couple years ago my hobby accidentally turned into a business, and it's doing well. "Accidentally" because it started with companies contacting me and saying "We know you're giving it away for free, but free isn't good enough for us. We want to buy a bunch of copies." And because my co-founder took charge of the negotiations and other non-programming bits, so it still feels like a hobby to me.

Both my non-motivation to work and my willingness to donate a large fraction of my income have a common cause, namely thinking of money in far-mode, i.e. not alieving The Unit of Caring on either side of the scale.

**pengvado**on 2012 Winter Fundraiser for the Singularity Institute · 2012-12-07T02:46:21.226Z · score: 100 (102 votes) · LW · GW

I donated $20,000 now, in addition to $110,000 earlier this year.

**pengvado**on Philosophy Needs to Trust Your Rationality Even Though It Shouldn't · 2012-11-30T23:55:38.963Z · score: 3 (3 votes) · LW · GW

On your account, how do you learn causal models from observing someone else perform an experiment? That doesn't involve any interventions or counterfactuals. You only see what actually happens, in a system that includes a scientist.

**pengvado**on Launched: Friendship is Optimal · 2012-11-29T11:24:19.550Z · score: 4 (4 votes) · LW · GW

The idea that "it wouldn't be you" isn't something I thought would be a problem

It probably doesn't help that Celestia implies "it wouldn't be you" when explaining why Hanna uploaded. If the shut-down authority was tied to her biological body, then Celestia fails to say so, and talks instead about identity. If it was tied to her name, then conflating that with the uploading is misleading. If the point of uploading was to protect her against coercion, then that would be sufficient even without any change in authority, and "Hanna no longer exists" / "Now that she is not Hanna" are misleading. Either way, I endorse the pattern theory of identity, but I don't see any plausible way to interpret that exchange in support of it.

**pengvado**on Causal Universes · 2012-11-28T20:41:41.623Z · score: 0 (0 votes) · LW · GW

Now, were their experiences real? Did we make them real by marking them with a 1 - by applying the logical filter using a causal computer?

You can apply the brute-force/postselection method to CGoL without timetravel too... But in that case verifying that a proposed history obeys the laws of CGoL involves all the same arithmetic ops as simulating forwards from the initial state. (The ops can, but don't have to, be in the same order.) Likewise if there are any linear-time subregions of CGoL+timetravel. So I might guess that the execution of such a filter could generate observers in some of the rejected worlds too.

There are laws of which verification is easier than simulation, but CGoL isn't one of them.
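To make the verification-equals-simulation point concrete, here is a minimal sketch (a toroidal Game of Life grid; my own toy implementation, not anything from the post). The history checker literally reuses the forward-step function, so it performs the same per-cell neighbor counts as running the simulation:

```python
def step(grid):
    # one generation of Conway's Game of Life on a toroidal grid
    h, w = len(grid), len(grid[0])
    def live_neighbors(r, c):
        return sum(grid[(r + dr) % h][(c + dc) % w]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))
    return [[1 if (k := live_neighbors(r, c)) == 3 or (k == 2 and grid[r][c])
             else 0
             for c in range(w)] for r in range(h)]

def verify(history):
    # checking that a proposed history obeys the laws = re-running every step
    return all(step(a) == b for a, b in zip(history, history[1:]))

blinker = [[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 1, 1, 1, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]]
print(verify([blinker, step(blinker), step(step(blinker))]))  # True
print(verify([blinker, blinker]))  # False: a blinker oscillates
```

With time travel added, the filter would check candidate histories the same way, which is why running it performs all the arithmetic of the rejected worlds too.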