An Informal Introduction to Solomonoff Induction and Pascal Mugging

post by Michaël Trazzi (mtrazzi) · 2021-04-05T22:02:08.784Z · LW · GW · 1 comment

[Epistemic status: explaining the intuition behind math I don't fully understand to consolidate my knowledge. Below is my attempt at distilling Solomonoff's first conjectures about what would later become known as Solomonoff induction. I find the first linked paper (1960) easier to understand (although less formal) than the later ones from 1964 (pt1, pt2). I started learning about it while trying to understand Pascal's Mugging, so I'll briefly explain that as well for illustrative purposes.]


(The following is a dialogue between me and you, the reader.)


me: essentially, Solomonoff says that the prior of a string should decrease exponentially with the length of the code needed to generate that string

you: this strongly depends on the programming language, right?

me: we assume a turing complete programming language with fixed memory. it could simulate (modulo memory) any turing machine M which takes as input a string of symbols c and outputs some string of symbols T. cf. the abstract:

"extrapolation probabilities obtained will be largely independent of just which universal Turing machine was used, providing that the sequence to be extrapolated has an adequate amount of information in it"

you: so it would be "universal turing machine" dependent in some way?

me: it doesn't need to! your code for one machine M1 should be able to be transpiled into a code for the simplest machine M2, which encodes a string like "aaaabbabb" by assigning integers to permutations of its symbols, repeating the process until there's no more unexploited statistical property.

you: hmm, and how do you measure the "amount of code" required? what's the formula?

me: first you come up with the shortest code c such that M(c) = T. for instance, if T = "ababababab", a short code says "print 'ab' five times".

The "complexity" is the length N of that shortest code, so we define your prior as P(T) = 2^(-N).

you: uh, but then my distribution over possible outputs does not sum to 1!

me: just assume the code itself is an automata-generated binary string, where there's a small chance ε of halting at each step. otherwise you get digit 0 or 1, each with probability (1-ε)/2. your prior for a code of length N is then P = ε((1-ε)/2)^N, which does sum to 1 over all finite codes
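
(a quick numerical check in Python that this prior over codes sums to 1; the value ε = 0.1 is an arbitrary choice)

```python
# Halt with probability eps at each step, otherwise emit 0 or 1 with
# probability (1 - eps) / 2 each, so a code of length N has
# P = eps * ((1 - eps) / 2) ** N.
eps = 0.1  # arbitrary small halting probability

def code_prior(length: int) -> float:
    return eps * ((1 - eps) / 2) ** length

# there are 2**N codes of length N, so summing over lengths:
total = sum(2 ** N * code_prior(N) for N in range(200))
print(total)  # ~0.9999999994: the distribution sums to 1 in the limit
```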

you: but isn't the shortest restriction weird? do we really want our AI to code golf?

me: sure. say we want to predict whether a or b comes after T: we sum the priors of all codes, of any size, that output T+a, do the same for T+b (where + is concatenation), and compare the ratio
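
(a toy Python sketch of that comparison; the table of codes below is invented for illustration, standing in for an enumeration of all programs on a universal machine)

```python
# Sum 2^(-|c|) over every code c that outputs the target string,
# not just the shortest one, then compare the two continuations.

def semimeasure(target, codes):
    # codes: iterable of (bit length, output string) pairs
    return sum(2.0 ** (-n) for n, out in codes if out == target)

codes = [
    (5, "ababa"),   # a short code: "repeat 'ab', cut to length 5"
    (9, "ababa"),   # a longer, redundant code for the same string
    (11, "ababb"),  # the pattern-breaking string only has long codes
]

T = "abab"
p_a = semimeasure(T + "a", codes)
p_b = semimeasure(T + "b", codes)
print(p_a / (p_a + p_b))  # ~0.99: bet on 'a' coming next
```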

you: but then we don't test our model on characters after a or b... so maybe we overfit?

me: precisely. to fix this we add some binary noise afterwards, and compare how well our models perform on T+a+noise vs. T+b+noise. if T+b doesn't make sense, it will be hard to write short code for T+b+noise, where the noise comes from an alphabet of 2 letters
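
(a loose Python illustration of the noise test, using zlib compression as a very crude stand-in for shortest-code length; to make the effect visible at this toy scale, the continuation is several symbols rather than one, and all parameters are arbitrary)

```python
import random
import zlib

# Append the same random noise after two candidate continuations of T
# and compare how hard the results are to describe.

random.seed(0)
T = "ab" * 50
noise = "".join(random.choice("ab") for _ in range(200))

fits = "ab" * 10                                          # continues the pattern
breaks = "".join(random.choice("ab") for _ in range(20))  # breaks the pattern

def description_length(s: str) -> int:
    return len(zlib.compress(s.encode()))

# the pattern-consistent continuation should typically compress shorter
print(description_length(T + fits + noise))
print(description_length(T + breaks + noise))
```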


you: i get the intuition... but I don't see where this is going

me: let's start with our universal prior over all "theories" A (our programs, we could say). now, assume some random datapoint D appears out of nowhere. what do you do?

(crowd starts cheering)

'do the trick do the trick'

(you comply, computing your posterior using Bayes)

(crowd magically disappears)

you: that's just bayes over models right? I read LW. I know that.

me: a bit more than that. it implements bayes, gives you universal priors, and lets you predict future events F given the posteriors over theories you computed above!
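
(a minimal Python sketch of that whole loop; the theories, their description lengths, and the likelihoods are made-up numbers for illustration)

```python
# Universal prior 2^(-complexity) over toy "theories", a Bayes update
# on a datapoint D, then prediction of a future event F by mixing
# over the posterior.

theories = {
    # name:      (bits, P(D | theory), P(F | theory))
    "always-a":  (3, 0.90, 0.90),
    "alternate": (5, 0.60, 0.40),
    "uniform":   (8, 0.50, 0.50),
}

# universal prior, normalized over this finite toy set
prior = {t: 2.0 ** (-bits) for t, (bits, _, _) in theories.items()}
Z = sum(prior.values())
prior = {t: p / Z for t, p in prior.items()}

# Bayes update on D
posterior = {t: prior[t] * theories[t][1] for t in theories}
Z = sum(posterior.values())
posterior = {t: p / Z for t, p in posterior.items()}

# predict F by averaging over theories, weighted by the posterior
p_F = sum(posterior[t] * theories[t][2] for t in theories)
print(posterior)
print(p_F)
```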

you: I see, so why don't we just use that?

me: there's actually a problem with such a prior. An agent using Solomonoff induction would give a high prior to 3^^^3 (where ^ is Knuth's up-arrow notation) because a simple automaton computes it.
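
(for instance, here is one way to write Knuth's up-arrow in a few lines of Python; the program is tiny, so the number it names is "simple" in the description-length sense)

```python
# a ↑^n b, with ↑^1 being ordinary exponentiation.
# Don't call up_arrow(3, 3, 3): that is 3^^^3 and would never finish.

def up_arrow(a: int, n: int, b: int) -> int:
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))  # 3^3  = 27
print(up_arrow(3, 2, 3))  # 3^^3 = 3^(3^3) = 7625597484987
```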

you: well, that just means 3^^^3 is simple

me: true. but in a situation of "mugger-less Pascal Mugging [LW · GW]" our Solomonoff agent would judge the hypothesis "this person could kill 3^^^3 people in another universe" probable enough (its description is short) that the threat carries astronomically negative expected value. We wouldn't really want that.

you: sure, but I imagine you have another patch

me: not mine, but Hanson's "leverage penalty [LW · GW]" uses an anthropic principle to discount the prior of an agent controlling the fates of 3^^^3 others by a factor linear in that number, so the prior is now about 1/3^^^3 and the expected value about -1.
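
(to spell out the arithmetic: a prior of 1/3^^^3 on a threat to 3^^^3 lives gives an expected value of (1/3^^^3) × (-3^^^3) = -1, i.e. about one life, so the threat no longer dominates the calculation)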

you: ah! this sounds plausible

me: However, under such a penalty the agent would not update on evidence of magical powers when facing a mugger who, this time, wants to save lives: it becomes a "Pascal's Muggle".

(after staring at the wall for a minute, you start writing a patch in the comments)

1 comment


comment by Michaël Trazzi (mtrazzi) · 2021-04-05T22:05:33.576Z · LW(p) · GW(p)

(note to mods: Ideally I would prefer to have larger LaTeX equations, not sure how to do that. If someone could just make those bigger, or even replace the equation screenshots with real LaTeX, that would be awesome.)