**sil-ver**on Insights from Linear Algebra Done Right · 2019-07-17T19:07:00.464Z · score: 2 (3 votes) · LW · GW

I honestly don't think the tradeoff is real (but please tell me if you don't find my reasons compelling). If I study category theory next and it does some cool stuff with the base map, I won't reject that on the basis of it contradicting this book. Ditto if I actually use LA and want to do calculations. The philosophical understanding that matrix-vector multiplication isn't ultimately a thing can peacefully coexist with me doing matrix-vector multiplication whenever I want to. Just like the understanding that the natural number 1 is a different object from the integer number 1 peacefully coexists with me treating them as equal in any other context.

I don't agree that this view is theoretically limiting (if you were meaning to imply that), because it allows any calculation that was possible before. It's even compatible with the base map.

**sil-ver**on Insights from Linear Algebra Done Right · 2019-07-17T18:59:06.996Z · score: 1 (1 votes) · LW · GW

I wouldn't be heartbroken if it was defined like that, but I wouldn't do it if I were writing a textbook myself. I think the LADR approach makes the most sense – vectors and matrices are fundamentally different – and if you want to bring a vector into the matrix world, then why not demand that you do it explicitly?

If you actually use LA in practice, there is nothing stopping you from writing . You can be 'sloppy' in practice if you know what you're doing, while still thinking that drawing this distinction is a good idea in a theoretical textbook.

**sil-ver**on Insights from Linear Algebra Done Right · 2019-07-16T19:18:59.515Z · score: 1 (1 votes) · LW · GW

That looks like it also works. It's a different philosophy I think, where LADR says "vectors and matrices are fundamentally different objects and vectors aren't dependent on bases, ever" and your view says "each basis defines a bijective function that maps vectors from the no-basis world into the basis-world (or from the basis1 world into the basis2 world)" but it doesn't insist on them being fundamentally different objects. Like if then they're the same kind of object, and you just need to know which world you're in (i.e. relative to which basis, if any, you need to interpret your vector).

I don't think not having matrix-vector multiplication is an issue. The LADR model still allows you to do everything you can do in normal LA. If you want to multiply a matrix with a vector , you just make into the n-by-1 matrix and then multiply two matrices. So you multiply rather than . It forces you to be explicit about which basis you want the vector to be relative to, which seems like a good thing to me. If is the standard basis, then will have the same entries as , it'll just be written as rather than .
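In NumPy terms, the distinction looks like this (the matrix and vector values below are my own example, not from the discussion):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([5.0, 6.0])

# The "sloppy" route: NumPy happily multiplies a matrix by a 1-D vector.
direct = A @ v                      # shape (2,)

# The LADR-style route: first turn v into an explicit n-by-1 matrix
# (a column), then multiply two matrices.
M_v = v.reshape(-1, 1)              # shape (2, 1)
explicit = A @ M_v                  # shape (2, 1)

# Same entries, different kinds of object: shape (2,) vs shape (2, 1).
assert np.allclose(direct, explicit.ravel())
```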

**sil-ver**on No nonsense version of the "racial algorithm bias" · 2019-07-15T16:20:15.824Z · score: 1 (1 votes) · LW · GW

Afaik, in ML, the term bias is used to describe any move away from the uniform / mean case. But in common speech, such a move would only be called a bias if it's inaccurate. So if the algorithm learns a true pattern in the data (X is more likely to be classified as 1 than Y is) that wouldn't be called a bias. Unless I misunderstand your point.

**sil-ver**on Insights from Linear Algebra Done Right · 2019-07-14T22:56:36.507Z · score: 2 (2 votes) · LW · GW

Ow. Yes, you do. This wasn't a typo either, I remembered the result incorrectly. Thanks for pointing it out, and props for being attentive enough to catch it.

Or to be more precise, you only need one scalar, but the scalar is for not , because isn't given. The theorem says that, given and , there is a scalar and a vector such that and is orthogonal to .
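Numerically, the decomposition is easy to check: the scalar is the standard projection coefficient, and subtracting the projection leaves an orthogonal remainder (the vectors below are my own example):

```python
import numpy as np

def decompose(u, v):
    """Split u into c*v + w with w orthogonal to v (requires v != 0)."""
    c = np.dot(u, v) / np.dot(v, v)   # projection coefficient
    w = u - c * v                     # remainder, orthogonal to v
    return c, w

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
c, w = decompose(u, v)
assert np.allclose(u, c * v + w)      # the decomposition holds
assert abs(np.dot(w, v)) < 1e-12      # w is orthogonal to v
```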

**sil-ver**on Insights from Linear Algebra Done Right · 2019-07-14T21:02:41.828Z · score: 1 (1 votes) · LW · GW

I wonder, what do you think about the chapter about dual spaces, dual maps, annihilator, etc.?

Nothing, because it wasn't in the material. I worked through the second edition of the book, and the parts on duality seem to be new to the third edition.

I believe when mathematicians say that in general P(x) holds, they mean that for any x in the domain of interest P(x) holds. Perhaps you want to use *typical* instead of *general* here. E.g. there is a notion called *typical* tensor rank of tensors of given shape, which means a tensor rank which occurs with non-zero probability when a random tensor of given shape is sampled.

Thanks for that, I changed it.

## Insights from Linear Algebra Done Right

2019-07-13T18:24:50.753Z · score: 52 (22 votes)

**sil-ver**on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-07-04T14:27:11.833Z · score: 1 (1 votes) · LW · GW

It seems to me that one quality of a good question for the LBO is that a correct answer has high utility, while another is that one can judge the oracle's answer. If we take only the intersection of the two sets of questions meeting these criteria, it'll be a relatively small set. But if there is a set of questions where one might or might not be able to judge the answer but which are high utility, one could ask the LBO all of them. Then, once the episode is over, give reward on those where one can evaluate the answers, and give null reward on the others.

Not really a submission, just thinking out loud.

**sil-ver**on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-07-02T19:20:47.403Z · score: 2 (2 votes) · LW · GW

Submission for LBO:

Input a corpus of text (could be multiple posts) describing technical approaches to align a powerful AI. Split this into a finite number of items that are relatively short (such as paragraphs). Ask the oracle to choose the part that is most worth spending more time on. (For example, there might be a paragraph with a dangerous hidden assumption in an otherwise promising approach, and thinking more about it might reveal that and lead to conceptual progress.)

Have a team of researchers look into it for an adequate amount of time which is fixed (and told to the oracle) in advance (maybe three months?). After the time is over, have them rate the progress they made compared to some sensible baseline. Use this as the oracle's reward.

Of course this has the problem of maximizing for apparent insight rather than actual insight.

**sil-ver**on On pointless waiting · 2019-06-11T09:32:13.421Z · score: 4 (2 votes) · LW · GW

Yes. Well put. This is related (though not identical) to the excellent Rest in Motion post from Nate Soares.

**sil-ver**on Drowning children are rare · 2019-05-29T21:44:39.504Z · score: 4 (1 votes) · LW · GW

Either charities like the Gates Foundation and Good Ventures are hoarding money at the price of millions of preventable deaths

My assumption before reading this has been that this is the case. Given that, does a reason remain to update away from the position that the GiveWell claim is basically correct?

For the rest of this post, let's suppose the true amount of money needed to save a life through GiveWell's top charities is $50,000. I don't think anything about Singer's main point changes.

For one, it's my understanding that decreasing animal suffering is at least an order of magnitude more effective than decreasing human suffering. If the arguments you make here apply equally to that (which I don't think they do), and we take the above number, that's $5,000 for a benefit as large as one life saved, which is still sufficient for Singer's argument.

Secondly, I don't think your arguments apply to existential risk prevention and even if they did and we decrease effectiveness there by one order of magnitude, that'd also still validate Singer's argument if we take my priors.

I notice that I'm very annoyed at your on-the-side link to the article about OpenAI with the claim that they're doing the opposite of what the argument justifying the intervention recommends. It's my understanding that the article, though plausible at the time, was very speculative and has been falsified since it was written. In particular, OpenAI has pledged not to take part in an arms race under reasonable conditions, which directly contradicts one of the points of that article. Quote:

Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project. We will work out specifics in case-by-case agreements, but a typical triggering condition might be “a better-than-even chance of success in the next two years.”

That, and they seem to have an ethics board with significant power (this is based on deciding not to release the full version of GPT). I believe they also said that they won't publish capability results in the future, which also contradicts one of the main concerns (which, again, was reasonable at the time). Please either reply or amend your post.

**sil-ver**on Rationalist Vipassana Meditation Retreat · 2019-04-21T17:00:29.110Z · score: 1 (1 votes) · LW · GW

I'll also be attending the full 10 day version. I've only been meditating for a couple of months so the prospect of such a long retreat feels fairly threatening, but looking at the mean outcome, I think it's the correct call.

**sil-ver**on Open Thread April 2019 · 2019-04-06T09:09:29.732Z · score: 10 (6 votes) · LW · GW

What is the best textbook on analysis out there?

My go-to source is Miri's guide, but analysis seems to be the one topic that's missing. TurnTrout mentioned this book which looks decent on first glance. Are there any competing opinions?

**sil-ver**on Humans Who Are Not Concentrating Are Not General Intelligences · 2019-03-28T22:50:19.852Z · score: 7 (4 votes) · LW · GW

I’ve noticed that I cannot tell, from casual conversation, whether someone is intelligent in the IQ sense.

I can't really do anything except to state this as a claim: I think a few minutes of conversation with anyone almost always gives me significant information about their intelligence in an IQ sense. That is, I couldn't tell you the exact number, and probably not even reliably predict it with an error of less than 20 (maybe more), but nonetheless, I know significantly more than zero. Like, if I talked to 9 people evenly spaced within [70, 130], I'm pretty confident that I'd get most of them into the correct half.

This does not translate into any kind of disagreement with GPT's texts seeming normal if I just skim them, or with Robin Hanson's thesis.

**sil-ver**on Ask LW: Have you read Yudkowsky's AI to Zombie book? · 2019-03-23T10:37:46.041Z · score: 3 (2 votes) · LW · GW

No, but I've read almost all of the sequences on the website, I think. I didn't do it systematically, so it's almost a guarantee that I missed a few, but not many. I read some of it twice, but again, not systematically.

I think they're amazing, and they've had a profound impact on me.

**sil-ver**on [Question] Tracking accuracy of personal forecasts · 2019-03-21T09:22:29.838Z · score: 1 (1 votes) · LW · GW

I do have a spreadsheet where I keep track of predictions, though only tracking the prediction, my confidence, and whether it came true or false. It's low effort and I think worth doing, but I can't confidently say that it has improved my calibration.
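For anyone wanting a single number out of such a spreadsheet, a common low-effort summary is the Brier score over the (confidence, outcome) pairs; the records below are made-up illustration values, not my actual predictions:

```python
# Each record is (stated confidence that the event happens, whether it did).
records = [(0.9, True), (0.7, False), (0.8, True), (0.6, True), (0.95, True)]

# Brier score: mean squared gap between confidence and outcome.
# 0 is perfect; always saying 0.5 scores 0.25.
brier = sum((p - (1.0 if hit else 0.0)) ** 2 for p, hit in records) / len(records)
print(brier)
```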

## Insights from Munkres' Topology

2019-03-17T16:52:46.256Z · score: 27 (9 votes)

**sil-ver**on What kind of information would serve as the best evidence for resolving the debate of whether a centrist or leftist Democratic nominee is likelier to take the White House in 2020? · 2019-02-02T21:01:31.224Z · score: 4 (3 votes) · LW · GW

This does not answer the question, but it seems plausible to me that the leftist-centrist axis only has a very small impact on who is likely to win, which would be consistent with PredictIt's estimates.

**sil-ver**on Drexler on AI Risk · 2019-02-01T17:36:55.916Z · score: 13 (4 votes) · LW · GW

6.7 Systems composed of rational agents need not maximize a utility function: There is no canonical way to aggregate utilities over agents, and game theory shows that interacting sets of rational agents need not achieve even Pareto optimality.

Is [underlined] true? I know it's true if you have agents following CDT, but does it still hold if agents follow FDT? (I think if you say 'rational' it should not mean 'CDT' since CDT is strictly worse than FDT).
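For reference, the textbook example behind the quoted claim (for CDT-style agents, at least) is the prisoner's dilemma, where the dominant-strategy equilibrium is Pareto-dominated; the payoff values below are the usual illustrative ones:

```python
C, D = 0, 1  # Cooperate, Defect

# (row action, column action) -> (row payoff, column payoff)
payoff = {
    (C, C): (3, 3), (C, D): (0, 5),
    (D, C): (5, 0), (D, D): (1, 1),
}

# Defect strictly dominates Cooperate for the row player (symmetric game)...
assert all(payoff[D, other][0] > payoff[C, other][0] for other in (C, D))

# ...yet the resulting equilibrium (D, D) is Pareto-dominated by (C, C).
assert payoff[C, C][0] > payoff[D, D][0] and payoff[C, C][1] > payoff[D, D][1]
print("equilibrium is not Pareto optimal")
```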

**sil-ver**on Topological Fixed Point Exercises · 2018-12-04T13:00:24.452Z · score: 1 (1 votes) · LW · GW

is defined just for one particular graph. It's the first edge in that graph such that . (So it could have been called ). Then for the next graph, it's a different . Basically, looks at where the first graph skips over the zero mark, then picks the last vertex before that point, then looks at the next larger graph, and if that graph skips later, it updates to the last vertex before that point in that graph, etc. I think the reason I didn't add indices to was just that there are already the with two indices, but I see how it can be confusing since having no index makes it sound like it's the same value all throughout.

**sil-ver**on How rapidly are GPUs improving in price performance? · 2018-11-25T22:52:51.044Z · score: 4 (3 votes) · LW · GW

That makes perfect sense. Thanks.

**sil-ver**on How rapidly are GPUs improving in price performance? · 2018-11-25T21:47:55.077Z · score: 7 (5 votes) · LW · GW

When trying to fit an exponential curve, don't weight all the points equally.

Um... why?

**sil-ver**on Topological Fixed Point Exercises · 2018-11-24T23:23:25.357Z · score: 9 (2 votes) · LW · GW

Ex 5 (fixed version)

Let denote the triangle. For each , construct a 2-d simplex with nodes in , where the color of a point corresponds to the place in the disk that carries that point to, then choose to be a point within a trichromatic triangle in the graph. Then is a bounded sequence having a limit point . Let be the center of the disc; suppose that . Then there is at least one region of the disc that doesn't touch. Let be the distance to the furthest side, that is, let . Since the sides are closed regions, we have . Using continuity of , choose small enough such that . Then choose large enough so that (1) all triangles in have diameter less than and (2) . Then, given any other point in the triangle around in , we have that , so that . This proves that the triangle in does not map points to all three sides of the disc, contradicting the fact that it is trichromatic.

Ex 6

(This is way easier to demonstrate in a picture in a way that leaves no doubt that it works than it is to write down, but I tried to do it anyway considering that to be part of the difficulty.)

(Assume the triangle is equilateral.) Imbed into such that , , . Let be continuous. Then given by is also continuous. If then . Let be the circle with radius 2 around ; then because (it is in fact contained in the circle with radius 1, but the size of the circle is inconsequential). We will use exercise 5 to show that maps a point to the center, which is , from which the desired result follows. For this, we shall show that it has the needed properties, with the modification that points on any side may map precisely into the center. It's obvious that weakening the requirement in this way preserves the result.

Rotate the disk so that the red shape is on top. In polar coordinates, the green area now contains all points with angles between and , the blue area contains those between and , and the red area those between and and those between and . We will show that does not intersect the red area, except at the origin. First, note that we have

Since both and are convex combinations of finitely many points, it suffices to check all combinations that result by taking a corner from each. This means we need to check the points

and and and and and .

Which are easily computed to be

and and and and and

Two of those are precisely the origin, the other four have angles and and and . Indeed, they are all between and .

Now one needs to do the same for the sets and , but it goes through analogously.

**sil-ver**on Topological Fixed Point Exercises · 2018-11-23T21:22:41.367Z · score: 13 (4 votes) · LW · GW

I'm late, but I'm quite proud of this proof for #4:

Call the large triangle a graph and the triangles simply triangles. First, note that for any size, there is a graph where the top node is colored red, the remaining nodes on the right diagonal are colored green, and all nodes not on the right diagonal are colored blue. This graph meets the conditions, and has exactly one trichromatic triangle, namely the one at the top (no other triangle contains a red node). It is trivial to see that this graph can be changed into an arbitrary graph by re-coloring finitely many nodes. This will affect up to six triangles; we say that a triangle has changed iff it was trichromatic before the recoloring but not after, or vice versa, and we shall show that re-coloring any node leads to an even number of triangles being changed. This proves that the number of trichromatic triangles never stops being odd.

Label the three colors , and . Let be the node being recolored, wlog from to . Suppose first that has six neighbors. It is easy to see that a triangle between and two neighbors has changed if and only if one of the neighbors has color and the other has color or . Thus, we must show that the number of such triangles is even. If all neighbors have color , or if none of them do, then no triangles have changed. If exactly one node has color , then the two adjacent triangles have changed. Otherwise, let and denote two different neighbors of color . There are two paths using only neighbors of between and . Both start and end at a node of color . By the 1-D Sperner Lemma (assuming the more general result), it follows that both paths have an even number of edges between nodes of color and ; these correspond to the triangles that have changed.

If is a node on one of the graph's boundaries changing color from to , it has exactly 4 neighbors and three adjacent triangles. The two neighbors that are also on the boundary cannot have color , so either none, one, or both of the ones that aren't do. If it's none, no triangle has changed; if it's one, the two neighboring triangles have changed; and if it's both, then the two triangles with two nodes on the graph's boundary have changed.
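The parity claim can be sanity-checked by brute force: for random admissible (Sperner) colorings of a triangular grid, the number of trichromatic small triangles always comes out odd. The grid encoding below is my own (vertices (i, j) with i + j ≤ n, corners colored 0, 1, 2):

```python
import random

def random_sperner_coloring(n, rng):
    """Random coloring satisfying the Sperner boundary conditions."""
    color = {}
    for i in range(n + 1):
        for j in range(n + 1 - i):
            if (i, j) == (0, 0):
                color[i, j] = 0
            elif (i, j) == (n, 0):
                color[i, j] = 1
            elif (i, j) == (0, n):
                color[i, j] = 2
            elif j == 0:
                color[i, j] = rng.choice([0, 1])      # bottom edge
            elif i == 0:
                color[i, j] = rng.choice([0, 2])      # left edge
            elif i + j == n:
                color[i, j] = rng.choice([1, 2])      # hypotenuse
            else:
                color[i, j] = rng.choice([0, 1, 2])   # interior: anything
    return color

def trichromatic_count(n, color):
    """Count small triangles whose three vertices carry all three colors."""
    count = 0
    for i in range(n):
        for j in range(n - i):
            if {color[i, j], color[i + 1, j], color[i, j + 1]} == {0, 1, 2}:
                count += 1                            # upward triangle
            if i + j <= n - 2 and \
               {color[i + 1, j], color[i, j + 1], color[i + 1, j + 1]} == {0, 1, 2}:
                count += 1                            # downward triangle
    return count

rng = random.Random(0)
for trial in range(200):
    n = rng.randint(1, 7)
    assert trichromatic_count(n, random_sperner_coloring(n, rng)) % 2 == 1
print("odd in every trial")
```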

**sil-ver**on Topological Fixed Point Exercises · 2018-11-22T20:12:15.427Z · score: 10 (3 votes) · LW · GW

Ex 1

Let and . Given an edge , let denote the map that maps the color of the left to that of the right node. Given a , let . Let denote the color blue and the color green. Let be 1 if edge is bichromatic, and 0 otherwise. Then we need to show that . We'll show , which is a strictly stronger statement than the contrapositive.

For , the LHS is equivalent to , and indeed equals if is bichromatic, and otherwise. Now let and let it be true for . Suppose . Then, if , that means , so that , and if , then , so that . Now suppose . If , then , so that , and if , then , so that . This proves the lemma by induction.
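The 1-D lemma can also be verified exhaustively on small paths (the encoding is mine: colors 0 and 1, endpoints colored differently, interior arbitrary):

```python
from itertools import product

def bichromatic_edges(coloring):
    """Number of edges whose two endpoints have different colors."""
    return sum(1 for a, b in zip(coloring, coloring[1:]) if a != b)

# On every path that starts with color 0 and ends with color 1,
# the number of bichromatic edges is odd, regardless of the interior.
for n in range(2, 9):
    for interior in product([0, 1], repeat=n):
        coloring = (0,) + interior + (1,)
        assert bichromatic_edges(coloring) % 2 == 1
print("1-D Sperner holds on all small paths")
```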

Ex 2

Ordinarily I'd prove by contradiction, using sequences, that can neither be greater nor smaller than 0. I didn't find a short way to do it using the two lemmas, but here's a long way.

Set . Given , let be a path graph of vertices , where . If for any and we have , then we're done, so assume we don't. Define to be 1 if and have different signs, and 0 otherwise. Sperner's Lemma says that the number of edges with is odd; in particular, there's at least one. Let the first one be denoted , then set .

Now consider the sequence . It's bounded because . Using the Bolzano-Weierstrass theorem, let be a convergent subsequence. Since for all , we have . On the other hand, if , then, using continuity of , we find a number such that . Choose and such that , then for all , so that and then for all , so that , a contradiction.
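The grid-refinement step of the argument can be sketched computationally: put a grid on the interval, take the left vertex of the first sign-change edge, and the function value there shrinks as the grid is refined. The function and interval below are my own example (f continuous, f(a) < 0 < f(b)):

```python
def first_sign_change_vertex(f, a, b, n):
    """Left vertex of the first edge of an n-step grid where f changes sign."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    for left, right in zip(xs, xs[1:]):
        if f(left) <= 0 <= f(right) or f(right) <= 0 <= f(left):
            return left
    raise ValueError("no sign change found")

f = lambda x: x * x - 2.0             # continuous, root at sqrt(2)
z = first_sign_change_vertex(f, 0.0, 2.0, 100_000)
assert abs(f(z)) < 1e-3               # grid vertex is nearly a root
```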

Ex 3

Given such a function , let be defined by . We have . If either inequality isn't strict, we're done. Otherwise, . Taking for granted that the intermediate value theorem generalizes to this case, find a root of , then .
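The reduction in Ex 3 is easy to run numerically: a fixed point of f is a root of g(x) = f(x) − x, found here by plain bisection standing in for the intermediate value theorem (f below is my own example, mapping [0, 1] into itself):

```python
import math

f = lambda x: math.cos(x)             # maps [0, 1] into [0, 1]
g = lambda x: f(x) - x                # g(0) >= 0 and g(1) <= 0

# Bisection maintains g(lo) >= 0 > g(hi); g is strictly decreasing here.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) >= 0:
        lo = mid
    else:
        hi = mid

assert abs(f(lo) - lo) < 1e-9         # lo is (numerically) a fixed point of f
```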

**sil-ver**on Preschool: Much Less Than You Wanted To Know · 2018-11-21T18:03:21.591Z · score: 3 (3 votes) · LW · GW

Mostly agreed. But I think the obvious counterpoint is that you're arguing for a slightly different standard. Like, if the question is 'does pre-school basically make sense' then you're right, it doesn't, and the black box approach is weird. But if the question is 'should you send your children to pre-school' then the black box approach seems solid. Even if you could come up with something better in five minutes, you can't implement it, so the standard for it being worthwhile might be really low.

**sil-ver**on Diagonalization Fixed Point Exercises · 2018-11-18T23:11:24.616Z · score: 13 (4 votes) · LW · GW

Ex 4

Given a computable function , define a function by the rule . Then is computable, however, because for , we have that and .

Ex 5:

We show the contrapositive: given a function halt, we construct a surjective function from to as follows: enumerate all Turing machines, such that each corresponds to a string. Given a , if does not decode to a Turing machine, set . If it does, let denote that Turing machine. Let with input first run halt; if halt returns , put out , otherwise will halt on input ; run on and put out the result.

Given a computable function , there is a string such that implements (if the Turing thesis is true). Then , so that is surjective.

Ex 6:

Let be a parametrization of the circle given by . Given and , write to denote the point , where . First, note that, regardless of the topology on , it holds true that if is continuous, then so is for any , because given a basis element of the circle, we have which is open because is continuous.

Let be a continuous function from to . Then is continuous, and so is the diagonal function . Fix any , then given by is also continuous, but given any , one has and thus . It follows that is not surjective.

Ex 7:

I did it in Java. There are probably easier ways to do this, especially in other languages, but it still works. It was incredibly fun to do. My basic idea was to have a loop iterate 2 times, the first time printing the program, the second time printing the printing command. Escaping the " characters is the biggest problem; the main idea here was to have a string q that equals " in the first iteration and " + q + " in the second. That second string (as part of the code in an expression where a string is printed) will print itself in the console output. Code:

```java
package maths;public class Quine{public static void main(String[]args){for(int i=0;i<2;i++){String o=i==1?""+(char)34:"";String q=""+(char)34;q=i==1?q+"+q+"+q:q;String e=i==1?o+"+e);}}}":"System.out.print(o+";System.out.print(o+"package maths;public class Quine{public static void main(String[]args){for(int i=0;i<2;i++){String o=i==1?"+q+""+q+"+(char)34:"+q+""+q+";String q="+q+""+q+"+(char)34;q=i==1?q+"+q+"+q+"+q+"+q:q;String e=i==1?o+"+q+"+e);}}}"+q+":"+q+"System.out.print(o+"+q+";"+e);}}}
```

**sil-ver**on Diagonalization Fixed Point Exercises · 2018-11-18T20:57:20.965Z · score: 11 (3 votes) · LW · GW

Ex 1

Exercise 1: Let and let . Suppose that , then let be an element such that . Then by definition, and . So , a contradiction. Hence , so that is not surjective.
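The diagonal construction in the proof can be checked exhaustively on a small finite set: for every f : S → P(S), the set D = {x ∈ S : x ∉ f(x)} is missed by f. The encoding below is my own:

```python
from itertools import chain, combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

S = [0, 1, 2]
P = powerset(S)                       # 8 subsets

# Check every one of the 8^3 functions f : S -> P(S).
for images in product(P, repeat=len(S)):
    f = dict(zip(S, images))
    D = frozenset(x for x in S if x not in f[x])   # the diagonal set
    assert all(f[x] != D for x in S)               # D is never in the image
print("no f from S onto P(S) exists")
```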

Ex 2

Exercise 2: Since is nonempty, it contains at least one element . Let be a function without a fixed point, then , so that and are two different elements in (this is the only thing we shall use the function for).

Let for nonempty. Suppose by contradiction that is surjective. Define a map by the rule . Given any subset , let be given by Since is surjective, we find a such that . Then . This proves that is surjective, which contradicts the result from (a).

Ex 3

Exercise 3: By (2) we know that , and so and where . That means for any . and .

**sil-ver**on Sam Harris and the Is–Ought Gap · 2018-11-16T19:32:26.013Z · score: 5 (3 votes) · LW · GW

Good post. I think Carroll's PoV is correct and Sam's is probably correct. Thinking about it, I would have phrased that one very differently, but I think there'd be zero difference on substance.

Edit: Having caught myself referencing this to explain Harris's position, I amend my post to say that the way you put it is actually exactly right, and the way I would have put it would at best have been a mildly confused version of the same thing.

**sil-ver**on Open Thread September 2018 · 2018-09-04T13:04:03.502Z · score: 2 (2 votes) · LW · GW

I wouldn't call the answer obvious. I'm not even sure if I could have guessed the majority view on this beforehand. Why do you think it's obvious? Are there no upsides to changing or are the downsides too significant?

**sil-ver**on Turning Up the Heat: Insights from Tao's 'Analysis II' · 2018-08-24T19:29:00.870Z · score: 7 (4 votes) · LW · GW

There's a lot I wanted to say here about topology, but I don't think my understanding is good enough to break things down - I'll have to read an actual book on the subject.

I'm working through Munkres' book on topology at the moment, which is part of Miri's research guide. It's super awesome; rigorous, comprehensive, elegant, and quite long (with *lots* of exercises). I'm planning to do a similar post once I'm done, but it's taking me a while. If you get to it eventually, you'll probably beat me to it.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-28T15:22:32.092Z · score: 1 (1 votes) · LW · GW

1000:1 on tails (with tails -> create large universe). It's a very good question. My answer is late because it made me think about some stuff that confused me at first, and I wanted to make sure that everything I say now is coherent with everything I said in the post.

If god flipped enough logical coins for you to be able to make the approximation that half of them came up heads, you can update on the color of your logical coin based on the fact that your current version is green. This is a thousand times as likely if the green coin came up tails vs heads. You can't do the same if god only created one universe.

If god created more than one but still only a few universes, let's say two, then the chance that your coin came up heads is a bit more than a quarter, which comes from the heads-heads case. The heads-tails case is possible but highly unlikely.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-25T14:35:55.610Z · score: 1 (1 votes) · LW · GW

Yes.

I'm not used to the concept of a logical coin, but yes, that's equivalent.

You need the consciousness condition & that god only does this once. Then my theory outputs the SSA answer.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-24T21:28:18.410Z · score: 1 (1 votes) · LW · GW

No; like I said, procedures tend to be repeatable. Maybe there is one, but I haven't come up with one yet. What's wrong with the presumptuous philosopher problem (about two possible universes) as an example?

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-24T14:56:42.344Z · score: 1 (1 votes) · LW · GW

Ok, so I think our exchange can be summarized like this: I am operating on the assumption that numerical non-identity given qualitative identity is not a thing, and you doubt that assumption. We both agree that the assumption is necessary for the argument I made to be convincing.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-24T14:23:14.516Z · score: 1 (1 votes) · LW · GW

I was under the impression that the opposite was the case, that numerical non-identity given qualitative identity is moonshine. I'm not a physicist though, so I can't argue with you on the object level. Do you think that your position would be a majority view on LW?

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-24T13:27:12.339Z · score: 1 (1 votes) · LW · GW

You mean "The two bodies aren't the selfsame body (numerical identity)"?

You mean identity of particles? I was just assuming that there is no such thing. I agree that if there was, that would be a simpler explanation.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-24T13:22:28.484Z · score: 1 (1 votes) · LW · GW

Ok, but I said those two things you quoted only in a short argument why I think individual consciousness is not true. That's not required for anything relating to the theory. All I need there is that there *are* different ways that consciousness could work, and that they can play a role for probability. I think that can be kept totally separate from a discussion about which of them is true.

So the argument I made was meant to illustrate that individual consciousness requires a mechanism by which the universe remembers that your conscious experience is anchored to your body in particular, and that it's hard to see how such a mechanism could exist. People generally fear death not because they are afraid of losing the particular conscious experience of their mind, but because they are afraid of losing all conscious experience, period. This only makes sense if there is such a mechanism.

The reductio ad absurdum is making a perfect clone of someone. Either both versions are subjectively different people, so that if one of them died it wouldn't be any consolation for her that the other one is still alive; or they are one person living in two different bodies, and either one would have to care about the other as much as about herself, even on a purely egoistical level. One of those two things has to be the case.

If it's the former, that means the universe somehow knows which one is which even though they are identical on a material level. That's what I meant by super-material information. There must be something not encoded in particles that the universe can use to tell them apart. I think many of us would agree that such a thing doesn't exist.

If it's the latter, then that begs the question what happens if you have one copy be slightly imperfect. Is it a different person once you change one atom? Maybe not. But there must be some number such that if you change that many atoms, then they are subjectively different people. If there is such a number, there's also a smallest number that does this. What follows is that if you change 342513 atoms they are subjectively the same, but if you change 342514 they're subjectively different. Or alternatively, could it turn on a few particular atoms?

Either way seems ridiculous, so my conclusion is that there most likely is no mechanism for conscious individuality, period. That means I rationally have no reason to care about my own well-being any more than about anyone else's, because anyone else is really just another version of myself. I think most people find this super unintuitive, but it's actually a simpler theory, it doesn't give you any trouble with the cloning experiment because now both clones are always the same person no matter how much you change, and it solves the problem of "what a surprise that I happen to be born instead of person-X-who-never-existed!". It seems to be the far more plausible theory.

But again, you don't need to agree that one theory of consciousness is more plausible for any of the probability stuff, you only need to agree that there are two different ways it could work.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-24T08:17:26.328Z · score: 1 (1 votes) · LW · GW

No, I don't think there is. The examples I already gave you were my actual best guesses for the cleanest case. Anything purely procedural seems like it will inevitably come up one way sometimes and another way other times, if we lump it together with similar-seeming procedures, which we have to do. In those cases SIA is always correct. You could probably come up with something not involving consciousness, but you do need some logically uncertain fact to check, and it needs to be very distinct.

Arguing against SSA is definitely part of the point of this post. Anything that's covered by the model of randomness I laid out seems very clear-cut to me. That includes normal Sleeping Beauty.

But I really want to know, what is confusing about the consciousness distinction? Is it unclear what the difference is, or do you just doubt that it is allowed to matter?

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-23T21:17:12.291Z · score: 1 (1 votes) · LW · GW

The view of consciousness that most people who have thought about it at all hold is that there's some subjective experience that they have and some that other people have, and that the two are different. They don't imagine that if they die they keep living as other people; they imagine that if they die it's lights out. I call that individual consciousness (I don't know if there's a more common term). In that case, existence would be subjectively surprising. The alternative theory, or at least one alternative theory, is that "you are everyone": the observer sitting in your brain and the one sitting in my brain are in fact the same. If you can imagine that there's an alternate you in another universe with the same consciousness, then all you need to do is extend that concept to everyone. Or if you can imagine that you could wake up as me tomorrow, then all you need to do is imagine that you wake up as everyone tomorrow. I call that singular consciousness.

If individual consciousness is in fact true, then it gets very hard to claim that a smaller universe is as likely as a large one, independent of SIA or SSA. I know most people would probably claim both things, but that leads to some pretty absurd consequences if you think it through.

But if singular consciousness is true there's no problem. And my honest opinion is that it probably is true. Individual consciousness seems incredibly implausible. If I put you in a coma and make a perfect clone, either that clone and you are the same person or not. If not, then the universe has super-material information, and if so, then there has to be a qualitative difference between a perfect clone and a clone with one atom in the wrong place. Either way seems ridiculous.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-23T20:30:36.170Z · score: 1 (1 votes) · LW · GW

If existing isn't subjectively surprising, and if there's only one universe (or if all universes are equally large), then my theory is indifferent between a universe with N observers and one with a trillion N, whereas SIA says the latter is a trillion times as likely. The SIA Doomsday argument which avturchin mentioned is also a good one. If the filter is always at the same position and if, again, existing isn't subjectively surprising, my theory rejects it but SIA obviously doesn't.

The assumptions are necessary. If there are lots of different (simulated) universes, some large and some small, then living in a larger universe is more likely. If existence is subjectively surprising, i.e. if it's actually a trillion times more likely in the larger universe, then the smaller universe is unlikely. That's the same as updating downward on extinction risk if Many Worlds is false.

There might be a cleaner example I haven't thought of yet. You'd need something where every similar observation is guaranteed to refer to the same proposition, and where you can't update on having subjective experience at all.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-23T14:07:13.228Z · score: 1 (1 votes) · LW · GW

No.

The principled distinction is not about the type of coin; that was a summary. It is about sets of observations and how frequently they correspond to which outcome. And because we don't have perfect deductive skills, sets of observations that are indistinguishable to the observer with respect to the proposition in question are summarized into one equivalence class.

If you set up the experiment that way, then the equivalence class of the agent's set of observations is something like "I'm doing a Sleeping Beauty experiment, and the experimenter gave me a hash of the coin's outcome". This observation is made many times across different worlds, and the outcome varies: it behaves randomly (= the SIA answer is correct).

It also behaves randomly if you choose the number of interviews based on the chromatic number of the plane, because Sleeping Beauty cannot differentiate that from other logically uncertain problems with other outcomes. That was the example I used in the post.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-23T08:25:00.402Z · score: 4 (1 votes) · LW · GW

I added links, but I don't want to explain what they are in the post.

I can do it here, though. There are two major ideas, SSA and SIA. The simplest example where they differ: God tosses a coin; if it's heads he creates two copies of you, if it's tails he creates one. You are one of the copies created. What's the probability that the coin came up heads? One view says it's one half, because a priori we don't count observers or observations. That's the SSA answer. The other says it's two thirds, roughly because there are more copies of you in that case. That's the SIA answer.
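The arithmetic behind the two answers can be sketched in a few lines (my own illustration, not from the original discussion; the dictionary encoding of outcomes is just a convenient framing). SSA keeps the prior weights, while SIA reweights each outcome by the number of copies of you it contains, so the world with more copies gets more probability.

```python
from fractions import Fraction

# God's coin toss: heads -> two copies of you, tails -> one copy.
copies = {"heads": 2, "tails": 1}
prior = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

# SSA: don't count observers; each outcome keeps its prior weight.
ssa = dict(prior)

# SIA: weight each outcome by prior * number of copies, then normalize.
raw = {w: prior[w] * n for w, n in copies.items()}
total = sum(raw.values())
sia = {w: v / total for w, v in raw.items()}

print(ssa["heads"])                 # 1/2
print(sia["heads"], sia["tails"])   # 2/3 1/3
```

Counting observations instead of observers, as the post does, gives the same numbers here because each copy makes exactly one observation.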

The way SSA and SIA are phrased according to the LW wiki is as follows:

SSA: All other things equal, an observer should reason as if they are randomly selected from the set of all *actually existent* observers (past, present and future) in their reference class.

SIA: All other things equal, an observer should reason as if they are randomly selected from the set of all *possible* observers.

The term reference class in this case refers to the set of all observers that exist given the coin fell a certain way. So if the coin fell heads, there's one observer, if it fell tails there are two. The heads-reference class consists of one observer, the tails-reference class consists of two.

The way they were originally phrased and understood was different, so if you read older papers it might be confusing. Originally, SSA just said that given one reference class, you should assume you are randomly selected from that class, and SIA said that reference classes become more likely the more observers are in them. Taken together, you get what the above formulation of SIA states: that all possible observers are equally likely. What we now call SIA is what used to be SSA + SIA, and what we now call SSA used to be SSA without SIA.

Generally, neither is formulated very well. Full non-indexical conditioning is an entire paper with an entirely new name just to give SIA a better justification; it outputs the same results as SIA, so really there are still only two theories.

And in this post, I argue that sometimes it's valid to count observers and sometimes it's not, and which it is depends on what kind of experiment it is. Roughly: if an experiment goes one way half the time and another way half the time, then both happen a lot, so there really are more observers on one side. Then counting is legitimate (the SIA answer). If an experiment only ever goes one way but you don't know which, then you can't compare numbers because you don't know them, so you just take the prior probabilities (the SSA answer). This distinction is, I think, extremely important and absolutely must be made, but neither existing theory makes it.

Also, I don't count observers but observations, which strikes me as a much cleaner concept: it removes any discussion of what qualifies as an observer. Similarly, there are no reference classes in my theory.

**sil-ver**on Let's Discuss Functional Decision Theory · 2018-07-23T07:50:07.319Z · score: 6 (4 votes) · LW · GW

I read it several months ago and was wondering the same thing. It's more than a branding change, though: FDT is much more clearly presented and much easier to understand.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-21T13:54:43.335Z · score: 2 (2 votes) · LW · GW

No, in God's coin toss the outcome is random. At least that's how I took it, since it's described as a coin toss. The reason the answer is 1/2 there is that the number of observations of being in rooms 1-10 is equal in the heads case and the tails case (10 in both). This is the image of the experiment I made in the post. If it were 2000 people in the tails case, 2 in every room, then the answer would be 1/3 for heads.
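The observation-counting here can be sketched directly (my own illustration of the setup described above; `heads_obs` and `tails_obs` are assumed names for the counts of the observation "I'm in one of rooms 1-10" under each outcome):

```python
from fractions import Fraction

def p_heads(heads_obs, tails_obs):
    # With the two coin outcomes equally likely a priori, weight
    # heads by its share of the total observation count.
    return Fraction(heads_obs, heads_obs + tails_obs)

print(p_heads(10, 10))  # original setup: 10 observations either way -> 1/2
print(p_heads(10, 20))  # 2000 people on tails, 2 per room -> 20 vs 10 -> 1/3
```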

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-21T12:20:49.362Z · score: 2 (2 votes) · LW · GW

The answer should follow from this post, so everyone who has already read it can try to predict it before reading further.

So you'd have to input your probability distribution over possible universes; in this case it just has to specify where the filter is for different species. If you think the filter is in the same place for all species, then your distribution should look something like *1/3 × filter always late, 1/3 × filter always middle, 1/3 × filter always early*, and the SIA doomsday doesn't apply (you'd have 3 trivial experiments). If you think it's early for some, middle for some, and late for some, then your distribution would just be *1 × filter varies across species*; then you'd have just one experiment which rolls a die at the beginning to decide where it puts the filter for us. Then the argument works. You could also mix those, if you think maybe it's the same for everyone and maybe not. Then the argument kinda works.

But plausibly the filter, if it exists, is at the same place for everyone. So my theory mostly rejects the argument.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-21T07:45:16.163Z · score: 2 (2 votes) · LW · GW

I can't do it in one line, but I made a shorter summary and put it at the beginning.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-21T07:43:46.058Z · score: 1 (1 votes) · LW · GW

That's a valid question. I think both kinds behave identically, but I admit that's not a compelling reason to stretch the term in this way.

**sil-ver**on Anthropic Probabilities: a different approach · 2018-07-21T07:25:46.983Z · score: 1 (1 votes) · LW · GW

Ok. I added a one-paragraph summary at the start of the post.

## Anthropic Probabilities: a different approach

2018-07-20T13:02:29.349Z · score: 5 (4 votes)

**sil-ver**on Probability is fake, frequency is real · 2018-07-11T07:34:54.687Z · score: 1 (1 votes) · LW · GW

I think you would be right if we lived in a classical universe. But given many worlds, there is a principled way in which a coin flip can be random, and a principled difference between flipping a coin and checking the trillionth digit of the decimal expansion of π.

Edit: I know you acknowledge this, but you don't seem to draw the above conclusion.

**sil-ver**on Open Thread July 2018 · 2018-07-10T18:07:55.768Z · score: 3 (2 votes) · LW · GW

In general, are we also encouraged to ask questions here?

I have one about layout: is the page width of posts the same for everyone, given that the window is wide enough? That would mean everyone has line breaks etc at the same positions.

**sil-ver**on An introduction to worst-case AI safety · 2018-07-05T16:34:57.360Z · score: 4 (3 votes) · LW · GW

This sounds totally convincing to me.

Do you think that ethical questions could be more relevant for this than they are for alignment? For example, the difference between [getting rid of all humans] and [uploading all humans and making them artificially incredibly happy] isn't important for AI alignment since they're both cases of unaligned AI, but it might be important when the goal is to navigate between different modes of unaligned AI.