**will_sawin**on The "I Already Get It" Slide · 2017-08-13T18:33:13.640Z · score: 0 (0 votes) · LW · GW

Just a quick note on your main example - in math, and I'm guessing in theoretical areas of CS as well, we often find that searching for fundamental obstructions to a solution is the very thing that allows us to find the solution. This is true for a number of reasons. First, if we find no obstructions, we are more confident that there is some way to find a solution, which always helps. Second, if we find a partial obstruction to solutions of a certain sort, we learn something crucial about how a solution must look. Third, and perhaps most importantly, when we seek to find obstructions and fail, we may find our way blocked by some kind of obstruction to an obstruction, which is a shadow of the very solution we seek to find, and by feeling it out we can find our way to the solution.

**will_sawin**on Beyond Statistics 101 · 2015-06-27T23:29:34.207Z · score: 0 (0 votes) · LW · GW

Thank you for all these interesting references. I enjoyed reading all of them, and rereading in Thurston's case.

Do people pathologize Grothendieck as having gone crazy? I mostly think people think of him as being a little bit strange. The story I heard was that because of philosophical disagreements with military funding and personal conflicts with other mathematicians he left the community and was more or less refusing to speak to anyone about mathematics, and people were sad about this and wished he would come back.

**will_sawin**on Beyond Statistics 101 · 2015-06-27T22:01:32.700Z · score: 1 (1 votes) · LW · GW

One thing that most scientists in these soft sciences already have a good grasp on, but a lot of laypeople do not, is the idea of appropriately normalizing parameters. For instance, dividing something by the mass of the body, or the population of a nation, to do comparisons between individuals/nations of different sizes.

People will often make bad comparisons where they don't normalize properly. But hopefully most people reading this article are not at risk for that.
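As a minimal sketch of the normalization idea above: the nation names and figures below are hypothetical, made up purely for illustration, and the helper `per_capita` is not from any library. The point is only that a ranking by raw totals can flip once you divide by population.

```python
# All names and numbers here are hypothetical, chosen only to illustrate
# how normalizing by population can reverse a comparison.
nations = {
    "Smallland": {"population": 5_000_000, "total_output": 300_000_000_000},
    "Bigland": {"population": 200_000_000, "total_output": 2_000_000_000_000},
}


def per_capita(total, population):
    """Normalize a national total by the size of the population."""
    return total / population


# Ranking by the raw total versus ranking by the normalized quantity.
by_total = max(nations, key=lambda n: nations[n]["total_output"])
by_per_capita = max(
    nations,
    key=lambda n: per_capita(nations[n]["total_output"], nations[n]["population"]),
)

print("Largest total output:", by_total)
print("Largest output per person:", by_per_capita)
```

With these made-up numbers the larger nation wins on the raw total while the smaller one wins per person, which is the kind of comparison that goes wrong without normalization.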

**will_sawin**on Beyond Statistics 101 · 2015-06-27T21:58:14.006Z · score: 0 (0 votes) · LW · GW

Conservation gives a local symmetry but there may not be a global symmetry.

For instance, you can imagine a physical system with no forces at all, so everything is conserved. But there are still some parameters that define the location of the particles. Then the physical system is locally very symmetric, but it may still have some asymmetric global structure where the particles are constrained to lie on a surface of nontrivial topology.

**will_sawin**on Rationality Quotes July 2014 · 2014-08-06T18:48:36.339Z · score: 1 (1 votes) · LW · GW

Do you often read physicists' responses to claims of FTL signalling? It seems to me like there is not much value in reading these, per the quote.

**will_sawin**on Is my view contrarian? · 2014-03-11T21:33:23.973Z · score: 2 (4 votes) · LW · GW

No, you should focus on founding a research field, which mainly requires getting other people interested in the research field.

**will_sawin**on Rationality Quotes March 2014 · 2014-03-04T02:42:39.991Z · score: 2 (2 votes) · LW · GW

I don't think that's really relevant to the original quote.

**will_sawin**on Rationality Quotes March 2014 · 2014-03-03T17:31:05.658Z · score: 1 (1 votes) · LW · GW

True, but that doesn't mean we're laboring in the dark. It just means we've got our eyes closed.

**will_sawin**on Political Skills which Increase Income · 2014-03-03T04:00:35.535Z · score: 8 (8 votes) · LW · GW

I would be interested in a post about how to acquire political knowledge!

**will_sawin**on Rationality Quotes March 2014 · 2014-03-01T18:57:46.942Z · score: 5 (5 votes) · LW · GW

10% isn't that bad as long as you continue the programs that were found to succeed and stop the programs that were found to fail. Come up with 10 intelligent-sounding ideas, obtain expert endorsements, do 10 randomized controlled trials, get 1 significant improvement. Then repeat.

**will_sawin**on Link: Poking the Bear (Podcast) · 2014-03-01T02:10:05.528Z · score: 0 (0 votes) · LW · GW

Yes.

**will_sawin**on Link: Poking the Bear (Podcast) · 2014-02-28T22:51:04.246Z · score: 1 (1 votes) · LW · GW

None of those sound like they require military intervention?

**will_sawin**on Lifestyle interventions to increase longevity · 2014-02-28T14:54:30.560Z · score: 1 (1 votes) · LW · GW

Difficult math is SNS-heavy?

**will_sawin**on Polling Thread · 2014-01-24T10:45:27.375Z · score: 0 (0 votes) · LW · GW

I rated the second question as more likely than the first because I think "most traits" means something different in the two questions.

**will_sawin**on Rationality Quotes January 2014 · 2014-01-22T02:15:31.701Z · score: 0 (0 votes) · LW · GW

Only this particular thing.

**will_sawin**on Dangers of steelmanning / principle of charity · 2014-01-16T22:50:58.938Z · score: 0 (0 votes) · LW · GW

That's what the Great Filter is, no?

**will_sawin**on Dangers of steelmanning / principle of charity · 2014-01-16T18:30:27.631Z · score: 3 (3 votes) · LW · GW

It would be amusing if the single primary reason that the universe is not buzzing with life and civilization is that any sufficiently advanced society develops terminology and jargon too complex to be comprehensible, and inevitably collapses because of that.

**will_sawin**on Results from MIRI's December workshop · 2014-01-16T02:03:26.511Z · score: 0 (0 votes) · LW · GW

For that purpose a better example is a computationally difficult statement, like "There are at least X twin primes below Y". We could place bets, and then acquire more computing power, and then resolve bets.

The mathematical theory of statements like the twin primes conjecture should be essentially the same, but simpler.
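To make the kind of statement concrete, here is a minimal sketch of how a bet on "There are at least X twin primes below Y" could be resolved by brute force once enough computing power is available. The function names are my own, and I am assuming "twin primes below Y" means pairs (p, p + 2) with both members below Y.

```python
def count_twin_prime_pairs_below(limit):
    """Count pairs (p, p + 2) of primes with p + 2 < limit, using a sieve."""
    if limit < 6:
        # The smallest twin prime pair is (3, 5), so nothing fits below 6.
        return 0
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n starting from n * n.
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return sum(1 for p in range(2, limit - 2) if is_prime[p] and is_prime[p + 2])


def bet_resolves_true(x, y):
    """Resolve the claim 'there are at least x twin prime pairs below y'."""
    return count_twin_prime_pairs_below(y) >= x


print(count_twin_prime_pairs_below(100))
print(bet_resolves_true(30, 1000))
```

The cost of the sieve grows with Y, so for large enough Y the truth value is fixed but unknown to a bounded reasoner until the computation is actually run, which is what makes it a reasonable target for a bet.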

**will_sawin**on A basis for pattern-matching in logical uncertainty · 2014-01-16T02:00:33.421Z · score: 0 (0 votes) · LW · GW

Nope. The key point is that as computing power becomes lower, Abram's process allows more and more inconsistent models.

So does every process.

The probability of a statement appearing first in the model-generating process is not equal to the probability that it's modeled by the end.

True. But for two very strong statements that contradict each other, there's a close relationship.

**will_sawin**on A basis for pattern-matching in logical uncertainty · 2014-01-15T14:40:52.410Z · score: 0 (0 votes) · LW · GW

What is "the low computing power limit"? If our theories behave badly when you don't have computing power, that's unsurprising. Do you mean "the large computing power limit"?

I think probability ( "the first 3^^^3 odd numbers are 'odd', then one isn't, then they go back to being 'odd'." ) / probability ("all odd numbers are 'odd'") is approximately 2^(length of 3^^^3) in Abram's system, because the probability of them appearing in the random process is supposed to be this ratio. I don't see anything about the random process that would make the first one more likely to be contradicted before being stated than the second.

**will_sawin**on Results from MIRI's December workshop · 2014-01-15T14:35:59.813Z · score: 0 (0 votes) · LW · GW

Yeah, updating probability distributions over models is believed to be good. The problem is, sometimes our probability distributions over models are wrong, as demonstrated by bad behavior when we update on certain info.

The kind of data that would make you want to zero out non-90% models is when you observe a bunch of random data points and 90% of them are true, but there are no other patterns you can detect.

The other problem is that updates can be hard to compute.

**will_sawin**on Results from MIRI's December workshop · 2014-01-15T06:05:42.580Z · score: 0 (0 votes) · LW · GW

It's actually not too hard to demonstrate things about the limit for Abram's original proposal, unless there's another one that's original-er than the one I'm thinking of. It limits to the distribution of outcomes of a certain incomputable random process which uses a halting oracle to tell when certain statements are contradictory.

You are correct that it doesn't converge to a limit of assigning 1 to true statements and 0 to false statements. This is of course impossible, so we don't have to accept it. But it seems like we should not have to accept divergence - believing something with high probability, then disbelieving with high probability, then believing again, etc. Or perhaps we should?

**will_sawin**on A basis for pattern-matching in logical uncertainty · 2014-01-15T06:01:08.500Z · score: 0 (0 votes) · LW · GW

Abram Demski's system does exactly this if you take his probability distribution and update on the statements "3 is odd", "5 is odd", etc. in a Bayesian manner. That's because his distribution assigns a reasonable probability to statements like "odd numbers are odd". Updating gives you reasonable updates on evidence.
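A toy illustration of the updating described above, which is not Abram Demski's actual distribution: assume just two hypotheses, one saying every odd number is 'odd' (given some prior weight), and an alternative under which each instance independently holds with probability 1/2. Both the two-hypothesis setup and the 1/2 are my assumptions for the sketch.

```python
from fractions import Fraction


def posterior_universal(prior, confirmations):
    """Posterior weight on 'all odd numbers are odd' after n confirming instances.

    Toy model: under the alternative hypothesis each instance is confirmed
    independently with probability 1/2 (an assumption made for illustration).
    """
    likelihood_alternative = Fraction(1, 2) ** confirmations
    return prior / (prior + (1 - prior) * likelihood_alternative)


def predictive_next(prior, confirmations):
    """Probability that the next odd number is also 'odd', given the data so far."""
    post = posterior_universal(prior, confirmations)
    return post + (1 - post) * Fraction(1, 2)


prior = Fraction(1, 100)
for n in (0, 5, 10, 20):
    print(n, float(posterior_universal(prior, n)), float(predictive_next(prior, n)))
```

Even a small prior on the general statement is enough: each confirmed instance halves the relative weight of the alternative, so the prediction for the next instance moves toward 1.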

**will_sawin**on The Inefficiency of Theoretical Discovery · 2013-12-19T20:41:25.196Z · score: 0 (0 votes) · LW · GW

Doesn't the non-apocryphal version of that story have some relevance?

http://en.wikipedia.org/wiki/Space_Pen

http://www.snopes.com/business/genius/spacepen.asp

Using a space pencil could cause your spaceship to light on fire. Sometimes it pays to be careful.

**will_sawin**on Chocolate Ice Cream After All? · 2013-12-12T04:26:54.451Z · score: 0 (0 votes) · LW · GW

Suppose I am deciding now whether to one-box or two-box on the problem. That's a reasonable supposition, because I am deciding now whether to one-box or two-box. There are a couple possibilities for what Omega could be doing:

- Omega observes my brain, and predicts what I am going to do accurately.
- Omega makes an inaccurate prediction, probabilistically independent from my behavior.
- Omega modifies my brain to a being it knows will one-box or will two-box, then makes the corresponding prediction.

If Omega uses predictive methods that aren't 100% effective, I'll treat it as a combination of case 1 and 2. If Omega uses very powerful mind-influencing technology that isn't 100% effective, I'll treat it as a combination of case 2 and 3.

In case 1, I should decide now to one-box. In case 2, I should decide now to two-box. In case 3, it doesn't matter what I decide now.

If Omega is 100% accurate, I know for certain I am in case 1 or case 3. So I should definitely one-box. This is true even if case 1 is vanishingly unlikely.

If Omega is even 99.9% accurate, then I am in some combination of case 1, case 2, or case 3. Whether I should decide now to one-box or two-box depends on the relative probability of case 1 and case 2, ignoring case 3. So even if Omega is very accurate, ensuring that the probability of case 2 is small, if the probability of case 1 is even smaller, I should decide now to two-box.

I mean, I am describing a very specific forecasting technique that you can use to make forecasts right now. Perhaps a more precise version is, you observe children in one of two different preschools, and observe which school they are in. You observe that almost 100% of the children in one preschool end up richer than the children in the other preschool. You are then able to forecast that future children observed in preschool A will grow up to be rich, and future children observed in preschool B will grow up to be poor. You then have a child. Should you bring them to preschool A? (Here I don't mean have them attend the school. They can simply go to the building at whatever time of day the study was conducted, then leave. That is sufficient to make highly accurate predictions, after all!)

I don't really know what you mean by "the set A of all factors involved".

**will_sawin**on Chocolate Ice Cream After All? · 2013-12-11T05:52:35.054Z · score: 0 (0 votes) · LW · GW

We believe in the forecasting power, but we are uncertain as to what mechanism that forecasting power is taking advantage of to predict the world.

Analogously, I know Omega will defeat me at Chess, but I do not know which opening move he will play.

In this case, the TDT decision depends critically on which causal mechanism underlies that forecasting power. Since we do not know, we will have to apply some principles for decision under uncertainty, which will depend on the payoffs, and on other features of the situation. The EDT decision does not. My intuitions and, I believe, the intuitions of many other commenters here, are much closer to the TDT approach than the EDT approach. Thus your examples are not very helpful to us - they lump things we would rather split, because our decisions in the sort of situation you described would depend in a fine-grained way on what causal explanations we found most plausible.

Suppose it is well-known that the wealthy in your country are more likely to adopt a certain distinctive manner of speaking due to the mysterious HavingRichParents gene. If you desire money, could you choose to have this gene by training yourself to speak in this way?

**will_sawin**on A critique of effective altruism · 2013-12-02T21:10:12.722Z · score: 5 (5 votes) · LW · GW

Finding good nudges!

**will_sawin**on A critique of effective altruism · 2013-12-02T01:54:17.035Z · score: 9 (9 votes) · LW · GW

Arguably trying for apostasy, failing due to motivated cognition, and producing only nudging is a good strategy that should be applied more broadly.

**will_sawin**on Book Review: Computability and Logic · 2013-11-22T22:32:20.059Z · score: 6 (6 votes) · LW · GW

So that I can better update on this information, can you tell me what the first exercise is?

**will_sawin**on Yes, Virginia, You Can Be 99.99% (Or More!) Certain That 53 Is Prime · 2013-11-09T22:39:15.307Z · score: 6 (6 votes) · LW · GW

Even if true announcements are just 9 times more likely than false announcements, then a true announcement should raise your confidence that the lottery numbers were 429793 to 90%. This is because the probability P(429793 announced | 429793 is the number) is just the probability of a true announcement, but the probability P(429793 announced | 429793 is not the number) is the probability of a false announcement, divided by a million.

A false announcer would have little reason to fake the number 429793. This already completely annihilates the prior probability.
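The arithmetic behind the 90% figure can be checked directly. This is a sketch under one assumption of mine: a false announcement names one of the wrong numbers uniformly at random (the comment's "divided by a million" is the same idea, up to the difference between 1,000,000 and 999,999). The function name is hypothetical.

```python
from fractions import Fraction


def posterior_after_announcement(p_true, num_outcomes):
    """P(announced number is the real one | that number was announced).

    p_true is the probability that an announcement is truthful. A false
    announcement is assumed to pick uniformly among the wrong numbers.
    """
    p_false = 1 - p_true
    prior = Fraction(1, num_outcomes)
    # Likelihood of hearing this particular number in each case.
    likelihood_if_correct = p_true
    likelihood_if_wrong = p_false / (num_outcomes - 1)
    numerator = prior * likelihood_if_correct
    return numerator / (numerator + (1 - prior) * likelihood_if_wrong)


# True announcements 9 times as likely as false ones, a million possible numbers.
print(posterior_after_announcement(Fraction(9, 10), 10**6))
```

The one-in-a-million prior and the one-in-a-million chance of a false announcer landing on this exact number cancel, leaving only the 9-to-1 ratio of true to false announcements.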

**will_sawin**on Very Basic Model Theory · 2013-11-01T18:39:28.622Z · score: 1 (1 votes) · LW · GW

You said arbitrarily large *finite* models, however. First-order arithmetic has no finite models. : )

**will_sawin**on Very Basic Model Theory · 2013-11-01T03:16:42.179Z · score: 1 (1 votes) · LW · GW

I don't see how 3 and 4 are stronger than 1 and 2. They are just the special cases of 1 and 2 where the sentence is a contradiction.

Arbitrarily large finite models are certainly not allowed in the theory of arithmetic.

**will_sawin**on Gains from trade: Slug versus Galaxy - how much would I give up to control you? · 2013-07-20T20:08:43.524Z · score: 2 (2 votes) · LW · GW

Due to the Pareto improvement problem, I don't think this actually describes what people mean by the word "trade".

**will_sawin**on Gains from trade: Slug versus Galaxy - how much would I give up to control you? · 2013-07-20T19:55:41.106Z · score: 0 (0 votes) · LW · GW

which is itself a special case of a Nash equilibrium.

**will_sawin**on Pascal's Muggle: Infinitesimal Priors and Strong Evidence · 2013-07-16T18:12:29.710Z · score: 1 (1 votes) · LW · GW

You do get to pick the languages first because there is a large but finite (say no more than 10^6) set of reasonable languages-modulo-trivial-details that could form the basis for such a measurement.

**will_sawin**on Recent MIRI workshop results? · 2013-07-16T07:02:06.397Z · score: 6 (6 votes) · LW · GW

This sample may be unrepresentative. At least one researcher would have been perfectly happy talking about the researchers' lives.

**will_sawin**on Evidential Decision Theory, Selection Bias, and Reference Classes · 2013-07-08T05:34:20.806Z · score: 2 (4 votes) · LW · GW

How useful is it to clarify EDT until it becomes some decision theory with a different, previously determined name?

**will_sawin**on Robust Cooperation in the Prisoner's Dilemma · 2013-06-24T04:50:28.255Z · score: 0 (0 votes) · LW · GW

This is clearly not true for proposal 2. No matter the formal system, you will find a proof (YouDefect => OpponentCooperate), and therefore defect.

**will_sawin**on Robust Cooperation in the Prisoner's Dilemma · 2013-06-22T16:38:44.471Z · score: 0 (0 votes) · LW · GW

You can search for reasons to cooperate in a much stronger formal system than you search for reasons to defect in. Is there any decision-theoretic justification for this?

**will_sawin**on Giving Now Currently Seems to Beat Giving Later · 2013-06-20T02:38:47.933Z · score: 1 (3 votes) · LW · GW

I would guess that current laws do not allow this, and that changing the laws would not be a good way to increase total donations, because it would strike people as a bad signal and they wouldn't want to do it. If you want more gifts next holiday season, should you offer your relatives the ability to give you refundable gifts?

**will_sawin**on Do Earths with slower economic growth have a better chance at FAI? · 2013-06-15T17:30:52.347Z · score: 0 (0 votes) · LW · GW

but it can't AIXI the other halting oracles?

**will_sawin**on Earning to Give vs. Altruistic Career Choice Revisited · 2013-06-15T04:06:56.835Z · score: 0 (0 votes) · LW · GW

I would worry more about negative flow-through effects of a decline in trust and basic decency in society. I think those are much more clear than flow-through effects of positive giving. I'm not sure if this outweighs the 20-to-1 ratio.

**will_sawin**on Many Weak Arguments vs. One Relatively Strong Argument · 2013-06-14T03:21:10.827Z · score: 1 (1 votes) · LW · GW

Most physicists most of the time aren't Dirac, Pauli, Yang, Mills, Feynman, Witten, etc.

**will_sawin**on Many Weak Arguments vs. One Relatively Strong Argument · 2013-06-14T02:08:32.293Z · score: 2 (2 votes) · LW · GW

But mathematicians also frequently dream up highly nontrivial things that are true, that mathematicians (and physicists) don't understand sufficiently well to be able to prove even after dozens of years of reflection. The Riemann hypothesis is almost three times as old as quantum field theory. There are also the Langlands conjectures, Hodge conjecture, etc., etc. So it's not clear that something fundamentally different is going on here.

**will_sawin**on Robust Cooperation in the Prisoner's Dilemma · 2013-06-13T21:46:10.505Z · score: 0 (0 votes) · LW · GW

One way I would think about PrudentBot is as not trying to approximate a decision theory, but rather a special consideration to the features of this particular format, where diverse intelligent agents are examining your source code. Rather than submitting a program to make optimal decisions, you submit a program which is simplified somewhat in a way that errs on the side of cooperation, to make it easier for people who are trying to cooperate with you.

But something about my reasoning is wrong, as it doesn't fit very well with the difference between the actual code of ADTBot and PrudentBot.

**will_sawin**on The Use of Many Independent Lines of Evidence: The Basel Problem · 2013-06-13T18:47:56.368Z · score: 1 (1 votes) · LW · GW

I've just seen the claim that von Neumann had a fake proof in a couple places, and it always bothers me, since it seems to me like one can construct a hidden variable theory that explains any set of statistical predictions. Just have the hidden variables be the response to every possible measurement! Or various equivalent schemes. One needs a special condition on the type of hidden variable theory, like Bell's nonlocality.

**will_sawin**on Many Weak Arguments vs. One Relatively Strong Argument · 2013-06-13T07:07:57.779Z · score: 1 (1 votes) · LW · GW

There might just be a terminological distinction here. When I think of the reasoning used by mathematicians/physicists, I think of the reasoning used to guess what is true - in particular to produce a theory with >50% confidence. I don't think as much of the reasoning used to get you from >50% to >99%, because this is relatively superfluous for a mathematician's utility function - at best, it doubles your efficiency in proving theorems. Whereas you are concerned more with getting >99%.

This is sort of a stupid point but Euler's argument does not have very many parts, and the parts themselves are relatively strong. Note that if you take away the first, conceptual point, the argument is not very convincing at all - although this depends on how much calculation of how many even zeta values Euler does. It's still a pretty far cry from the arguments frequently used in the human world.

Finally, while I can see why Euler's reasoning may be representative of the sort of reasoning that physicists use, I would like to see more evidence that it is representative. If all you have is the advice of this chauffeur, that's perfectly alright and I will go do something else.

**will_sawin**on The Use of Many Independent Lines of Evidence: The Basel Problem · 2013-06-13T06:51:30.084Z · score: 0 (0 votes) · LW · GW

BTW what was John von Neumann's "proof"?

**will_sawin**on Many Weak Arguments vs. One Relatively Strong Argument · 2013-06-13T06:31:06.738Z · score: 2 (2 votes) · LW · GW

But that still doesn't tell you whether to invest in the startup. If an ORSA-er is just paralyzed by indecision here and decides to leave VC and go into theoretical math or whatever, he or she is not really winning.

Unrelatedly, a fun example of MWA triumphing over ORSA could be geologists vs. physicists on the age of the Earth.

**will_sawin**on Many Weak Arguments vs. One Relatively Strong Argument · 2013-06-13T06:27:20.024Z · score: 2 (2 votes) · LW · GW

Personally I found the quantitative majors example a very vivid introduction to this style of argument, and much more vivid than the Penrose example. I think the quantitative majors example does a very good job of illustrating the kind of reasoning you are supporting, and why it is helpful. I don't understand the relevance of many weak arguments to the Penrose debate - it seems like a case of some strong and some weak arguments vs. one weak argument or something. If others are like me, a different example might be more helpful.