## Posts

## Comments

**oscar_cunningham**on Wolf's Dice · 2019-07-17T13:06:40.118Z · score: 7 (3 votes) · LW · GW

Right. But also we would want to use a prior that favoured biases which were near fair, since we know that Wolf at least thought they were a normal pair of dice.

**oscar_cunningham**on Open Thread April 2019 · 2019-04-04T07:04:15.002Z · score: 4 (2 votes) · LW · GW

Suppose I'm trying to infer probabilities about some set of events by looking at betting markets. My idea was to visualise the possible probability assignments as a high-dimensional space, and then for each bet being offered remove the part of that space for which the bet has positive expected value. The region remaining after doing this for all bets on offer should contain the probability assignment representing the "market's beliefs".

My question is about the situation where there is no remaining region. In this situation for every probability assignment there's some bet with a positive expectation. Is it a theorem that there is always an arbitrage in this case? In other words, can one switch the quantifiers from "for all probability assignments there exists a positive expectation bet" to "there exists a bet such that for all probability assignments the bet has positive expectation"?
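A toy case may make the quantifier question concrete. The sketch below uses a made-up market with two outcomes and two bets (the payoffs are assumptions chosen for illustration, not taken from any real market). For every probability assignment on the grid at least one bet has positive expectation, and no single bet is an arbitrage, but taking both bets together is. It does not settle the general question; it only suggests that "bet" in the second quantifier may need to include combinations of the bets on offer.

```python
from fractions import Fraction

# Hypothetical example: two outcomes (heads, tails) and two bets on offer.
# Each bet is a tuple of payoffs, one per outcome.
BET_ON_HEADS = (2, -1)   # win 2 on heads, lose 1 on tails
BET_ON_TAILS = (-1, 2)   # lose 1 on heads, win 2 on tails
BETS = [BET_ON_HEADS, BET_ON_TAILS]


def expected_value(bet, probs):
    """Expected payoff of a bet under a probability assignment."""
    return sum(p * payoff for p, payoff in zip(probs, bet))


def combine(bets):
    """Payoff vector from taking every bet in the list at once."""
    return tuple(sum(payoffs) for payoffs in zip(*bets))


def is_arbitrage(bet):
    """True if the bet pays out strictly more than zero in every outcome."""
    return all(payoff > 0 for payoff in bet)


# Scan a grid of probability assignments (p, 1 - p).
grid = [(Fraction(k, 100), 1 - Fraction(k, 100)) for k in range(101)]
no_region_left = all(
    any(expected_value(bet, probs) > 0 for bet in BETS) for probs in grid
)
single_bet_arbitrage = any(is_arbitrage(bet) for bet in BETS)
combined_arbitrage = is_arbitrage(combine(BETS))

print(no_region_left, single_bet_arbitrage, combined_arbitrage)
# prints: True False True
```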

**oscar_cunningham**on The Kelly Criterion · 2018-10-17T00:38:14.810Z · score: 3 (2 votes) · LW · GW

I believe you missed one of the rules of Gurkenglas' game, which was that there are at most 100 rounds. (Although it's possible I misunderstood what they were trying to say.)

If you assume that play continues until one of the players is bankrupt then in fact there are lots of winning strategies. In particular, betting any constant proportion less than 38.9% works. The Kelly criterion isn't unique among them.
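As a rough check on the 38.9% figure, here is a minimal sketch that assumes the game is an even-money bet won with probability 0.6 (the comment doesn't state the rules; this is inferred from the numbers, and the function names are illustrative). A constant-proportion strategy grows in the long run when its expected log-growth per round is positive, so the threshold is the point where that growth rate crosses zero.

```python
import math

# Assumed game (not stated in the comment): even-money bets won with p = 0.6.
P_WIN = 0.6


def growth_rate(fraction, p=P_WIN):
    """Expected log-growth per round when betting a fixed fraction of wealth."""
    return p * math.log(1 + fraction) + (1 - p) * math.log(1 - fraction)


def kelly_fraction(p=P_WIN):
    """Kelly fraction for an even-money bet: p - (1 - p)."""
    return 2 * p - 1


def break_even_fraction(p=P_WIN, tolerance=1e-9):
    """Largest fraction with non-negative growth, found by bisection."""
    low, high = kelly_fraction(p), 1 - 1e-12  # growth > 0 at low, < 0 at high
    while high - low > tolerance:
        mid = (low + high) / 2
        if growth_rate(mid, p) > 0:
            low = mid
        else:
            high = mid
    return low


print(round(kelly_fraction(), 3), round(break_even_fraction(), 3))
```

Under these assumptions the Kelly fraction is 20% and the break-even fraction comes out close to 38.9%, so every constant proportion between the two still has positive growth, just slower than Kelly.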

My program doesn't assume anything about the strategy. It just works backwards from the last round and calculates the optimal bet and expected value for each possible amount of money you could have, on the basis of the expected values in the next round which it has already calculated. (Assuming each bet is a whole number of cents.)
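The backward pass described above can be sketched as follows. This is not the original program: the rules here (even-money stakes won with probability 3/5, a hard cap on holdings, final utility equal to money held) and the name `solve_game` are assumptions for illustration. It uses exact rationals, since the later edits suggest floating-point precision mattered.

```python
from fractions import Fraction


def solve_game(rounds, cap, p_win=Fraction(3, 5)):
    """Backward induction over (rounds left, money held).

    Hypothetical rules: each round you stake a whole number of units at even
    odds, win with probability p_win, and holdings are capped at `cap`.
    Returns (value, best_bet) tables for the first round, indexed by money.
    """
    value = [Fraction(m) for m in range(cap + 1)]  # after the last round
    best_bet = [0] * (cap + 1)
    for _ in range(rounds):
        next_value = value
        value, best_bet = [], []
        for money in range(cap + 1):
            # Expected value of each legal stake, using next round's table.
            options = [
                (p_win * next_value[min(money + bet, cap)]
                 + (1 - p_win) * next_value[money - bet], bet)
                for bet in range(money + 1)
            ]
            # Highest expected value wins; ties go to the smaller stake.
            best = max(options, key=lambda option: (option[0], -option[1]))
            value.append(best[0])
            best_bet.append(best[1])
    return value, best_bet


value, best_bet = solve_game(rounds=1, cap=15)
print(value[10], best_bet[10])
# prints: 11 5
```

With one round left, 10 units in hand and a cap of 15, staking more than 5 only adds downside, so the table picks a stake of 5. Running more rounds with a finer unit (cents) gives the kind of table the comment describes, at a cost that grows with rounds × cap².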

**oscar_cunningham**on The Kelly Criterion · 2018-10-16T18:39:28.929Z · score: 6 (4 votes) · LW · GW

If you wager one buck at a time, you win almost certainly.

But that isn't the Kelly criterion! Kelly would say I should open by betting *two* bucks.

In games of that form, it seems like you should be more and more careful as the number of bets gets larger. The optimal strategy doesn't tend to Kelly in the limit.

EDIT: In fact my best opening bet is $0.64, leading to expected winnings of $19.561.

EDIT2: I reran my program with higher precision, and got the answer $0.58 instead. This concerned me so I reran again with infinite precision (rational numbers) and got that the best bet is $0.21. The expected utilities were very similar in each case, which explains the precision problems.

EDIT3: If you always use Kelly, the expected utility is only $18.866.

**oscar_cunningham**on The Kelly Criterion · 2018-10-16T16:33:39.971Z · score: 3 (2 votes) · LW · GW

Can you give a concrete example of such a game?

**oscar_cunningham**on The Kelly Criterion · 2018-10-16T14:06:19.033Z · score: 3 (2 votes) · LW · GW

even if your utility outside of the game is linear, inside of the game it is not.

Are there any games where it's a wise idea to use the Kelly criterion even though your utility outside the game is linear?

**oscar_cunningham**on The Kelly Criterion · 2018-10-16T13:34:57.081Z · score: 2 (1 votes) · LW · GW

Marginal utility is decreasing, but in practice falls off far less than geometrically.

I think this is only true if you're planning to give the money to charity or something. If you're just spending the money on yourself then I think marginal utility is literally zero after a certain point.

**oscar_cunningham**on Open Thread September 2018 · 2018-09-26T12:34:52.460Z · score: 3 (2 votes) · LW · GW

Yeah, I think that's probably right.

I thought of that before but I was a bit worried about it because Löb's Theorem says that a theory can never prove this axiom schema about itself. But I think we're safe here because we're assuming "If T proves φ, then φ" while not actually working in T.

**oscar_cunningham**on Open Thread September 2018 · 2018-09-26T11:00:25.121Z · score: 2 (1 votes) · LW · GW

I'm arguing that, for a theory T and Turing machine P, "T is consistent" and "T proves that P halts" aren't together enough to deduce that P halts. And as a counterexample I suggested T = PA + "PA is inconsistent" and P = "search for an inconsistency in PA". This P doesn't halt even though T is consistent and proves it halts.

So if it doesn't work for that T and P, I don't see why it would work for the original T and P.

**oscar_cunningham**on Open Thread September 2018 · 2018-09-25T18:07:25.927Z · score: 2 (1 votes) · LW · GW

Consistency of T isn't enough, is it? For example the theory (PA + "The program that searches for a contradiction in PA halts") is consistent, even though that program doesn't halt.

**oscar_cunningham**on Quantum theory cannot consistently describe the use of itself · 2018-09-25T09:53:31.606Z · score: 10 (6 votes) · LW · GW

https://www.scottaaronson.com/blog/?p=3975

**oscar_cunningham**on Open Thread September 2018 · 2018-09-20T20:11:55.066Z · score: 4 (3 votes) · LW · GW

This is a good point. The Wikipedia pages for other sites, like Reddit, also focus unduly on controversy.

**oscar_cunningham**on Zut Allais! · 2018-09-06T20:35:20.377Z · score: 3 (2 votes) · LW · GW

And the fact that situations like that occurred in humanity's evolution explains why humans have the preference for certainty that they do.

**oscar_cunningham**on Open Thread September 2018 · 2018-09-03T09:10:35.767Z · score: 14 (5 votes) · LW · GW

As well as ordinals and cardinals, Eliezer's construction also needs concepts from the areas of computability and formal logic. A good book to get introduced to these areas is Boolos' "Computability and Logic".

**oscar_cunningham**on Open Thread September 2018 · 2018-09-03T08:08:29.260Z · score: 2 (1 votes) · LW · GW

being unable to imagine a scenario where something is possible

This isn't an accurate description of the mind projection fallacy. The mind projection fallacy happens when someone thinks that some phenomenon occurs in the real world but in fact the phenomenon is a part of the way their mind works.

But yes, it's common to almost all fallacies that they are in fact weak Bayesian evidence for whatever they were supposed to support.

**oscar_cunningham**on Open Thread September 2018 · 2018-09-01T17:12:14.313Z · score: 9 (5 votes) · LW · GW

Eliezer made this attempt at naming a large number computable by a small Turing machine. What I'm wondering is exactly what axioms we need to use in order to prove that this Turing machine does indeed halt. The description of the Turing machine uses a large cardinal axiom ("there exists an I0 rank-into-rank cardinal"), but I don't think that assuming this cardinal is enough to prove that the machine halts. Is it enough to assume that this axiom is consistent? Or is something stronger needed?

**oscar_cunningham**on You Play to Win the Game · 2018-08-31T19:10:06.529Z · score: 4 (2 votes) · LW · GW

games are a specific case where the utility (winning) is well-defined

Lots of board games have badly specified utility functions. The one that springs to mind is Diplomacy; if a stalemate is negotiated then the remaining players "share equally in a draw". I'd take this to mean that each player gets utility 1/n (where there are n players, and 0 is a loss and 1 is a win). But it could also be argued that they each get 1/(2n), sharing a draw (1/2) between them (to get 1/n each wouldn't they have to be "sharing equally in a win"?).

Another example is Castle Panic. It's allegedly a cooperative game. The players all "win" or "lose" together. But in the case of a win one of the players is declared a "Master Slayer". It's never stated how much the players should value being the Master Slayer over a mere win.

Interesting situations occur in these games when the players have different opinions about the value of different outcomes. One player cares more about being the Master Slayer than everyone else, so everyone else lets them be the Master Slayer. They think that they're doing much better than everyone else, but everyone else is happy so long as they all keep winning.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-16T19:31:54.613Z · score: 26 (8 votes) · LW · GW

I actually learnt quantum physics from that sequence, and I'm now a mathematician working in Quantum Computing. So it can't be too bad!

The explanation of quantum physics is the best I've seen anywhere. But this might be because it explained it in a style that was particularly suited to me. I really like the way it explains the underlying reality first and only afterwards explains how this corresponds with what we perceive. A lot of other introductions follow the historical discovery of the subject, looking at each of the famous experiments in turn, and only building up the theory in a piecemeal way. Personally I hate that approach, but I've seen other people say that those kinds of introductions were the only ones that made sense to them.

The sequence is especially good if you don't want a math-heavy explanation, since it manages to explain exactly what's going on in a technically correct way, while still not using any equations more complicated than addition and multiplication (as far as I can remember).

The second half of the sequence talks about interpretations of quantum mechanics, and advocates for the "many-worlds" interpretation over "collapse" interpretations. Personally I found it sufficient to convince me that collapse interpretations were bullshit, but it didn't quite convince me that the many-worlds interpretation is obviously true. I find it plausible that the true interpretation is some third alternative. Either way, the discussion is very interesting and worth reading.

As far as "holding up" goes, I once read through the sequence looking for technical errors and only found one. Eliezer says that the wavefunction can't become more concentrated because of Liouville's theorem. This is completely wrong (QM is time-reversible, so if the wavefunction can become more spread out it must also be able to become more concentrated). But I'm inclined to be forgiving to Eliezer on this point because he's making exactly the mistake that he repeatedly warns us about! He's confusing the distribution described by the wavefunction (the uncertainty that we *would* have if we performed a measurement) with the uncertainty we *do* have *about* the wavefunction (which is what Liouville's theorem actually applies to).

**Oscar_Cunningham**on [deleted post] 2018-08-15T13:34:41.626Z

Really, the fact that different sizes of moral circle can incentivize coercion is just a trivial corollary of the fact that value differences in general can incentivize coercion.

**Oscar_Cunningham**on [deleted post] 2018-08-15T07:04:31.147Z

When people have a wide circle of concern and advocate for its widening as a norm, this makes me nervous because it implies huge additional costs forced on me, through coercive means like taxation or regulations

At the moment I (and many others on LW) are experiencing the opposite. We would prefer to give money to people in Africa, but instead we are forced by taxes to give to poor people in the same country as us. Since charity to Africa is much more effective, this means that (from our point of view) 99% of the taxed money is being wasted.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-14T13:06:38.396Z · score: 4 (3 votes) · LW · GW

Okay, sure. But an idealized rational reasoner wouldn't display this kind of uncertainty about its own beliefs, yet it would still have the phenomenon you were originally asking about (where statements assigned the same probability update by different amounts after the introduction of evidence). So this kind of second-order probability can't be used to answer the question you originally asked.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-13T11:22:04.778Z · score: 3 (2 votes) · LW · GW

It seems like you're describing a Bayesian probability distribution over a frequentist probability estimate of the "real" probability.

Right. But I was careful to refer to f as a frequency rather than a probability, because f isn't a description of our beliefs but rather a physical property of the coin (and of the way it's being thrown).

Agreed that this works in cases which make sense under frequentism, but in cases like "Trump gets reelected" you need some sort of distribution over a Bayesian credence, and I don't see any natural way to generalise to that.

I agree. But it seems to me like the other replies you've received are mistakenly treating all propositions as though they do have an f with an unknown distribution. Unnamed suggests using the beta distribution; the thing which it's the distribution of would have to be f. Similarly rossry's reply, containing phrases like "something in the ballpark of 50%" and "precisely 50%", talks as though there is some unknown percentage to which 50% is an estimate.

A lot of people (like in the paper Pattern linked to) think that our distribution over f is a "second-order" probability describing our beliefs about our beliefs. I think this is wrong. The number f doesn't describe our beliefs at all; it describes a physical property of the coin, just like mass and diameter.

In fact, any kind of second-order probability must be trivial. We have introspective access to our own beliefs. So given any statement about our beliefs we can say for certain whether or not it's true. Therefore, any second-order probability will be equal to either 0 or 1.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-13T10:06:29.757Z · score: 2 (1 votes) · LW · GW

The Open Thread appears to no longer be stickied. Try pushing the pin in harder next time.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-12T06:43:20.191Z · score: 3 (2 votes) · LW · GW

It doesn't really matter for the point I was making, so long as you agree that the probability moves further for the second coin.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-11T22:13:19.688Z · score: 4 (3 votes) · LW · GW

This is related to the problem of predicting a coin with an unknown bias. Consider two possible coins: the first which you have inspected closely and which looks perfectly symmetrical and feels evenly weighted, and the second which you haven't inspected at all and which you got from a friend who you have previously seen cheating at cards. The second coin is much more likely to be biased than the first.

Suppose you are about to toss one of the coins. For each coin, consider the event that the coin lands on heads. In both cases you will assign a probability of 50%, because you have no knowledge that distinguishes between heads and tails.

But now suppose that before you toss the coin you learn that the coin landed on heads for each of its 10 previous tosses. How does this affect your estimate?

- In the case of the first coin it doesn't make very much difference. Since you see no way in which the coin could be biased you assume that the 10 heads were just a coincidence, and you still assign a probability of 50% to heads on the next toss (maybe 51% if you are beginning to be suspicious despite your inspection of the coin).
- But when it comes to the second coin, this evidence would make you very suspicious. You would think it likely that the coin had been tampered with. Perhaps it simply has two heads. But it would also still be possible that the coin was fair. Two headed coins are pretty rare, even in the world of degenerate gamblers. So you might assign a probability of around 70% to getting heads on the next toss.

This shows the effect that you were describing; both events had a prior probability of 50%, but the probability changes by different amounts in response to the same evidence. We have a lot of knowledge about the first coin, and compared to this knowledge the new evidence is insignificant. We know much less about the second coin, and so the new evidence moves our probability much further.

Mathematically, we model each coin as having a fixed but unknown frequency with which it comes up heads. This is a number 0 ≤ f ≤ 1. If we knew f then we would assign a probability of f to any coin-flip except those about which we have direct evidence (i.e. those in our causal past). Since we don't know f we describe our knowledge about it by a probability distribution P(f). The probability of the next coin-flip coming up heads is then the expected value of f, the integral of P(f)f.

Then in the above example our knowledge about the first coin would be described by a function P(f) with a sharp peak around 1/2 and almost zero probability everywhere else. Our knowledge of the second coin would be described by a much broader distribution. When we find out that the coin has come up heads 10 times before, our probability distribution updates according to Bayes' rule. It changes from P(f) to P(f)f^10 (or rather the normalisation of P(f)f^10). This doesn't affect the sharply pointed distribution very much because the function f^10 is approximately constant over the sharp peak. But it pushes the broad distribution strongly towards 1 because 1^10 is 1024 times larger than (1/2)^10 and P(f) isn't 1024 times taller near 1/2 than near 1.
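The update from P(f) to the normalisation of P(f)f^10 is easy to try numerically. The sketch below replaces the continuous distribution with a few discrete candidate frequencies, and the prior weights are made-up numbers chosen only so that both coins start at 50%; `predictive_heads` is an illustrative name.

```python
def predictive_heads(prior, heads_seen):
    """Probability of heads on the next toss after `heads_seen` straight heads.

    `prior` maps each candidate frequency f to its prior weight P(f).
    Bayes' rule multiplies each weight by the likelihood f ** heads_seen.
    """
    posterior = {f: weight * f ** heads_seen for f, weight in prior.items()}
    total = sum(posterior.values())
    return sum(f * weight for f, weight in posterior.items()) / total


# Illustrative priors (made-up numbers), both with mean 1/2.
inspected_coin = {0.45: 0.01, 0.5: 0.98, 0.55: 0.01}   # sharp peak at 1/2
suspect_coin = {0.0: 0.0005, 0.5: 0.999, 1.0: 0.0005}  # small chance of a trick coin

for name, prior in [("inspected", inspected_coin), ("suspect", suspect_coin)]:
    before = predictive_heads(prior, 0)
    after = predictive_heads(prior, 10)
    print(f"{name}: {before:.3f} -> {after:.3f}")
```

With these particular weights the inspected coin barely moves (to roughly 50.1%), while the suspect coin jumps to roughly 67%, in line with the figures guessed at in the bullet points above. Different prior weights would of course give different numbers.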

So this is a nice case where it is possible to compare between two cases how much a given piece of evidence moves our probability estimate. However I'm not sure whether this can be extended to the general case. A proposition like "Trump gets reelected" can't be thought of as being like a flip of a coin with a particular frequency. Not only are there no "previous flips" we can learn about, it's not clear what another flip would even look like. The election that Trump won doesn't count, because we had totally different knowledge about that one.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-04T19:28:29.654Z · score: 2 (1 votes) · LW · GW

I see, thanks. I had been looking at the page https://www.lesswrong.com/daily, linked to from the sidebar under the same phrase "All Posts".

**oscar_cunningham**on Open Thread August 2018 · 2018-08-04T11:59:37.965Z · score: 2 (1 votes) · LW · GW

I don't see it there. Have you done the update yet?

**oscar_cunningham**on Open Thread August 2018 · 2018-08-03T20:59:30.993Z · score: 2 (1 votes) · LW · GW

What does "stickied" do?

**oscar_cunningham**on What are your plans for the evening of the apocalypse? · 2018-08-02T09:42:00.526Z · score: 7 (4 votes) · LW · GW

The financial effects would be immediate and extreme. All sorts of mad things would happen to stock prices, inflation, interest rates, etc. The people who quit their jobs to live off their savings might well find that their savings don't stretch as far as they thought, which is probably a good thing since the whole system would collapse much faster than five years if a significant proportion of people were to quit their jobs.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-01T20:20:49.510Z · score: 3 (2 votes) · LW · GW

Okay, great.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-01T11:45:15.744Z · score: 6 (4 votes) · LW · GW

Is it possible to subscribe to a post so you get notifications when new comments are posted? I notice that individual *comments* have subscribe buttons.

**oscar_cunningham**on Open Thread August 2018 · 2018-08-01T11:16:46.942Z · score: 6 (3 votes) · LW · GW

Old LW had a link to the open thread in the sidebar. Would it be good to have that here so that comments later in the month still get some attention?

**oscar_cunningham**on Applying Bayes to an incompletely specified sample space · 2018-07-30T22:01:44.566Z · score: 4 (3 votes) · LW · GW

I've always thought that chapter was a weak point in the book. Jaynes doesn't treat probabilities of probabilities in quite the right way (for one thing they're really probabilities of frequencies). So take it with a grain of salt.

**oscar_cunningham**on Bayesianism (Subjective or Objective) · 2018-07-30T14:42:59.792Z · score: 2 (1 votes) · LW · GW

I'm not quite sure what you mean here, but I don't think the idea of calibration is directly related to the subjective/objective dichotomy. Both subjective and objective Bayesians could desire to be well calibrated.

**oscar_cunningham**on Bayesianism (Subjective or Objective) · 2018-07-30T12:48:53.195Z · score: 2 (1 votes) · LW · GW

Also, here's Eliezer on the subject: Probability is Subjectively Objective

Under his definitions he's subjective. But he would definitely say that agents with the same state of knowledge must assign the same probabilities, which rules him out of the very subjective camp.

**oscar_cunningham**on Bayesianism (Subjective or Objective) · 2018-07-30T12:38:21.300Z · score: 2 (1 votes) · LW · GW

I think everyone agrees on the directions "more subjective" and "more objective", but they use the words "subjective"/"objective" to mean "more subjective/objective than me".

A very subjective position would be to believe that there are no "right" prior probabilities, and that it's okay to just pick any prior depending on personal choice. (i.e. Agents with the same knowledge can assign different probabilities)

A very objective position would be to believe that there are some probabilities that must be the same even for agents with different knowledge. For example they might say that you must assign probability 1/2 to a fair coin coming up heads, no matter what your state of knowledge is. (i.e. Agents with different knowledge must (sometimes) assign the same probabilities)

Jaynes and Yudkowsky are somewhere in between these two positions (i.e. agents with the same knowledge must assign the same probabilities, but the probability of any event can vary depending on your knowledge of it), so they get called "objective" by the maximally subjective folk, and "subjective" by the maximally objective folk.

The definitions in the SEP above would definitely put Jaynes and Yudkowsky in the objective camp, but there's a lot of room on the scale past the SEP definition of "objective".

**oscar_cunningham**on Bayesianism (Subjective or Objective) · 2018-07-29T14:41:41.883Z · score: 13 (7 votes) · LW · GW

The SEP is quite good on this subject:

Subjective and Objective Bayesianism. Are there constraints on prior probabilities other than the probability laws? Consider a situation in which you are to draw a ball from an urn filled with red and black balls. Suppose you have no other information about the urn. What is the prior probability (before drawing a ball) that, given that a ball is drawn from the urn, that the drawn ball will be black? The question divides Bayesians into two camps:

(a) Subjective Bayesians emphasize the relative lack of rational constraints on prior probabilities. In the urn example, they would allow that any prior probability between 0 and 1 might be rational (though some Subjective Bayesians (e.g., Jeffrey) would rule out the two extreme values, 0 and 1). The most extreme Subjective Bayesians (e.g., de Finetti) hold that the only rational constraint on prior probabilities is probabilistic coherence. Others (e.g., Jeffrey) classify themselves as subjectivists even though they allow for some relatively small number of additional rational constraints on prior probabilities. Since subjectivists can disagree about particular constraints, what unites them is that their constraints rule out very little. For Subjective Bayesians, our actual prior probability assignments are largely the result of non-rational factors—for example, our own unconstrained, free choice or evolution or socialization.

(b) Objective Bayesians (e.g., Jaynes and Rosenkrantz) emphasize the extent to which prior probabilities are rationally constrained. In the above example, they would hold that rationality requires assigning a prior probability of 1/2 to drawing a black ball from the urn. They would argue that any other probability would fail the following test: Since you have no information at all about which balls are red and which balls are black, you must choose prior probabilities that are invariant with a change in label (“red” or “black”). But the only prior probability assignment that is invariant in this way is the assignment of prior probability of 1/2 to each of the two possibilities (i.e., that the ball drawn is black or that it is red).

In the limit, an Objective Bayesian would hold that rational constraints uniquely determine prior probabilities in every circumstance. This would make the prior probabilities logical probabilities determinable purely a priori.

Under these definitions, Eliezer and LW in general fall under the Objective category. We tend to believe that two agents with the same knowledge should assign the same probability.

**oscar_cunningham**on Open Thread July 2018 · 2018-07-17T15:54:16.771Z · score: 2 (1 votes) · LW · GW

Sure, the inductor doesn't know which systems are consistent, but nevertheless it eventually starts believing the proofs given by any system which is consistent.

**oscar_cunningham**on Open Thread July 2018 · 2018-07-11T12:43:15.305Z · score: 2 (1 votes) · LW · GW

Is there a preferred way to flag spam posts like this one: https://www.lesswrong.com/posts/g7LgqmEhaoZnzggzJ/teaching-is-everything-and-more ?

**oscar_cunningham**on Open Thread July 2018 · 2018-07-10T21:41:21.485Z · score: 5 (3 votes) · LW · GW

Could logical inductors be used as a partial solution to Hilbert's Second Problem (of putting mathematics on a sure footing)? Thanks to Gödel we know that there are lots of things that any given theory can't prove. But by running a logical inductor we could at least say that these things are true with some probability. Of course a result proved in the "Logical Induction" paper is that the probability of an undecidable statement tends to a value that is neither 0 nor 1, so we can't use this approach to justify belief in a stronger theory. But I noticed a weaker result that does hold. There's a certain class of statements such that (assuming ZF is consistent) an inductor over PA will think that they're very likely as soon as it finds a proof for them in ZF.

This class of statements is those with only bounded quantifiers; those where every "∀" and "∃" are restricted to a predefined range. This class of statements is decidable, meaning that there's a Turing machine that will take a bounded sentence and will always halt and tell you whether or not it holds in *ℕ*. Because of this every bounded sentence has a proof (or a proof of its negation) in both PA and ZF (and PA and ZF agree which it is).
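To see why bounded sentences are decidable, note that each bounded quantifier is just a finite loop. A minimal sketch (the helper names `forall_below` and `exists_below` are made up for this illustration):

```python
def forall_below(bound, predicate):
    """Bounded universal quantifier: predicate holds for every x < bound."""
    return all(predicate(x) for x in range(bound))


def exists_below(bound, predicate):
    """Bounded existential quantifier: predicate holds for some x < bound."""
    return any(predicate(x) for x in range(bound))


# "For all x < 50 and y < 50, x + y = y + x."
commutes = forall_below(50, lambda x: forall_below(50, lambda y: x + y == y + x))

# "Every x < 50 is either 2y or 2y + 1 for some y <= x."
even_or_odd = forall_below(
    50, lambda x: exists_below(x + 1, lambda y: x in (2 * y, 2 * y + 1))
)

# "There is an x < 50 with x * x = 50" (false: 50 is not a perfect square).
has_root = exists_below(50, lambda x: x * x == 50)

print(commutes, even_or_odd, has_root)
# prints: True True False
```

Every such check terminates, although the running time can be enormous when the bounds are large, which is where the gap in proof lengths discussed below comes from.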

But the proofs of a bounded sentence in PA and ZF can have very different lengths. Consider the self-referential bounded sentence "PA cannot prove this sentence in fewer than 1000000 symbols". This must have a proof in PA, since we can just check all candidate proofs with fewer than 1000000 symbols by brute force, but its proof must be longer than 1000000 symbols, or else we would get a contradiction. But the preceding sentences constitute a proof in ZF with far fewer than 1000000 symbols. So the sentence is provable in both PA and ZF, but the ZF proof is much shorter.

It might seem like the bounded sentences can't express many interesting concepts. But in fact I'd contend that they can express most (if not all) things that you might actually need to know. For example, it seems like the fact "For all x and y, x + y = y + x" is a useful unbounded sentence. But whenever you face a situation where you would want to use it there are always some particular x and y that apply in that situation. So then we can use the bounded sentence "x + y = y + x" instead, where x and y stand for whichever values actually occurred.

Now I'll show that logical inductors over PA eventually trust proofs in ZF of bounded sentences (assuming ZF is consistent). Consider the Turing machine that takes as input a number n and searches through all strings in length order, keeping track of any that are a ZF proof of a bounded sentence. When it's been searching for n timesteps it stops and outputs whichever bounded sentence it found a proof for last. Call this sentence ϕ_n. Now let P be a logical inductor over PA. Assuming that ZF is consistent, the sentences ϕ_n are all theorems of PA, and by construction there's a polynomial time Turing machine that outputs them. So by a theorem in the logical inductor paper, we have that P_n(ϕ_n) tends to 1 as n goes to infinity, meaning that for large n the logical inductor becomes confident in ϕ_n sometime around day n. If a bounded statement ϕ has a ZF proof in m symbols then it's equal to ϕ_n for n ~ exp(m). So P begins to think that ϕ is very likely from day exp(m) onward.

Assuming that the logical inductor is working with a deductive process that searches through PA proofs in length order, this can occur long before the deductive process actually proves that ϕ is true. The exponential doesn't really make a difference here, since we don't know exactly how fast the deductive process is working. But it hardly matters, because the length of ZF proofs can be arbitrarily better than those in PA. For example the shortest proof of the sentence "PA cannot prove this sentence in fewer than exp(exp(exp(n))) symbols" in PA is longer than exp(exp(exp(n))) symbols, whereas the length of the shortest proof in ZF is about log(n).

So in generality what we have proved is that weak systems will accept as very good evidence proofs given in stronger theories, so long as the target of the proof is a bounded sentence, and so long as the stronger theory is in fact consistent. This is an interesting partial answer to Hilbert's question, since it explains why we would care about proofs in ZF, even if we only believe in PA.

**oscar_cunningham**on What could be done with RNA and DNA sequencing that's 1000x cheaper than it's now? · 2018-06-26T19:17:14.270Z · score: 9 (2 votes) · LW · GW

If we can do testing quickly then we could use it for security. Perhaps (further into the future) your phone will test your DNA when you try to use it?

**oscar_cunningham**on UDT can learn anthropic probabilities · 2018-06-25T19:39:02.806Z · score: 7 (1 votes) · LW · GW

Can I actually do this experiment, and thereby empirically determine (for myself but nobody else) which of SIA and SSA is true?

**oscar_cunningham**on Set Up for Success: Insights from 'Naïve Set Theory' · 2018-02-28T09:00:45.042Z · score: 15 (4 votes) · LW · GW

This was valuable feedback for calibration, and I intend to continue this practice. I'm still worried that down the line and in the absence of teachers, I may believe that I've learnt the research guide with the necessary rigor, go to a MIRIx workshop, and realize I hadn't been holding myself to a sufficiently high standard. Suggestions for ameliorating this would be welcome.

I think if you read more textbooks you'll naturally get used to the correct level of rigour.

**oscar_cunningham**on ProbDef: a game about probability and inference · 2018-01-02T07:01:33.240Z · score: 4 (2 votes) · LW · GW

Fun game! And the music is *really* nice. By the way you have a typo in there somewhere. It says you refresh to ten shields in a level where you only get three.

**oscar_cunningham**on Can we see light? · 2017-12-08T18:48:13.081Z · score: 1 (1 votes) · LW · GW

What about photon-photon interactions? :-)

**oscar_cunningham**on Simple refutation of the ‘Bayesian’ philosophy of science · 2017-11-01T14:45:56.176Z · score: 5 (5 votes) · LW · GW

However, if T is an explanatory theory (e.g. ‘the sun is powered by nuclear fusion’), then its negation ~T (‘the sun is not powered by nuclear fusion’) is not an explanation at all.

The words "explanatory theory" seem to me to have a lot of fuzziness hiding behind them. But to the extent that "the sun is powered by nuclear fusion" is an explanatory theory I would say that the proposition ~T is just the union of many explanatory theories: "the sun is powered by oxidisation", "the sun is powered by gravitational collapse", and so on for all explanatory theories except "nuclear fusion".

Therefore, suppose (implausibly, for the sake of argument) that one could quantify ‘the property that science strives to maximise’. If T had an amount q of that, then ~T would have none at all, not 1-q as the probability calculus would require if q were a probability.

There are lots of negative facts that are worth knowing and that scientists did good work to discover. When Michelson and Morley discovered that light did *not* travel through luminiferous aether, that was a fact worth knowing, and it led to the discovery of special relativity. So even if you don't call ~T an explanatory theory, it seems like it still has a lot of "the property that science strives to maximise".

Also, the conjunction (T₁ & T₂) of two mutually inconsistent explanatory theories T₁ and T₂ (such as quantum theory and relativity) is provably false, and therefore has zero probability. Yet it embodies some understanding of the world and is definitely better than nothing.

A Bayesian might instead define theories T₁' = "quantum theory leads to approximately correct results in the following circumstances ..." and T₂' = "relativity leads to approximately correct results in the following circumstances ...". Then T₁' and T₂' would both have a high probability and be worth knowing, and so would their conjunction. The original conjunction, T₁ & T₂, would mean "both quantum theory and relativity are exactly true". This of course is provably false, and so has probability 0.

Furthermore if we expect, with Popper, that all our best theories of fundamental physics are going to be superseded eventually, and we therefore believe their negations, it is still those false theories, not their true negations, that constitute all our deepest knowledge of physics.

Right, right. The statement T₁ is false; but the statement T₁' is true.

What science really seeks to ‘maximise’ (or rather, create) is explanatory power.

Does Deutsch write anywhere about what a precise definition of "explanation" would be?

**oscar_cunningham**on Just a photo · 2017-10-20T11:11:15.691Z · score: 0 (0 votes) · LW · GW

It's also similar to this image:

It's difficult to see it as anything until it "snaps" and then it's impossible to not see it.

**oscar_cunningham**on Stupid Questions - September 2017 · 2017-09-27T15:59:49.144Z · score: 1 (1 votes) · LW · GW

This is a good question. The answer is that it shouldn't take any energy to hold something in place, but your arms are very inefficient. When you keep one of your muscles contracted, the individual cells in that muscle are all contracting and relaxing repeatedly. This burns energy. So for a human, holding a dumbbell takes energy. But this is just an unfortunate consequence of the way muscles work. If the human body had some way to "lock" the skeleton into place then you would be able to hold a dumbbell for as long as you wanted.

**oscar_cunningham**on Open thread, September 25 - October 1, 2017 · 2017-09-25T10:03:03.833Z · score: 1 (1 votes) · LW · GW

If you fail to get your n flips in a row, your expected number of flips on that attempt is the sum from i = 1 to n of i*2^-i, divided by (1-2^-n). This gives (2-(n+2)/2^n)/(1-2^-n). Let E be the expected number of flips needed in total. Then:

E = (2^-n)n + (1-2^-n)[(2-(n+2)/2^n)/(1-2^-n) + E]

Hence (2^-n)E = (2^-n)n + 2 - (n+2)/2^n, so E = n + 2^(n+1) - (n+2) = 2^(n+1) - 2
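The algebra above can be checked with exact arithmetic. What follows is only a sketch: the function name `expected_flips_from_recurrence` is made up for illustration, and it assumes (as the derivation does) a fair coin and that an attempt ends at the first flip that breaks the run.

```python
from fractions import Fraction


def expected_flips_from_recurrence(n):
    """Solve E = (2^-n) n + (1 - 2^-n) (L + E) exactly for E.

    L is the expected length of a failed attempt, as in the derivation:
    the sum of i * 2^-i for i = 1..n, divided by (1 - 2^-n).
    """
    p_success = Fraction(1, 2 ** n)   # all n flips go the right way
    p_fail = 1 - p_success            # the run is broken at some flip i <= n
    failed_attempt_length = (
        sum(Fraction(i, 2 ** i) for i in range(1, n + 1)) / p_fail
    )
    # Rearranging the recurrence: p_success * E = p_success * n + p_fail * L
    return (p_success * n + p_fail * failed_attempt_length) / p_success


# Compare against the closed form 2^(n+1) - 2 for a few small n.
for n in range(1, 11):
    assert expected_flips_from_recurrence(n) == 2 ** (n + 1) - 2
```

Using `Fraction` rather than floats means the comparison with 2^(n+1) - 2 is exact, so the loop confirms the closed form for those values of n instead of merely approximating it.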

**oscar_cunningham**on Open thread, August 28 - September 3, 2017 · 2017-09-01T09:57:53.785Z · score: 0 (0 votes) · LW · GW

I think you must just have an error in your code somewhere. Consider going round 3. Let the probability you say "3" be p_3. Then according to your numbers

164/512 = 15/64 + (1 - 15/64)*(1/2)*p_3

This is because the probability of escaping by round 3 is the probability of escaping by round 2, plus the probability that you don't escape by round 2, multiplied by the probability the coin lands tails, multiplied by the probability you say "3".

But then p_3 = 11/49, and 49 is not a power of two!
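The arithmetic can be reproduced exactly in a few lines. This is just a sketch using the two figures quoted above (15/64 and 164/512); the variable names and the helper `is_power_of_two` are made up for illustration.

```python
from fractions import Fraction

# Figures quoted above from the other commenter's output.
escape_by_round_2 = Fraction(15, 64)
escape_by_round_3 = Fraction(164, 512)

# Rearranging
#   escape_by_round_3 = escape_by_round_2 + (1 - escape_by_round_2) * (1/2) * p_3
# for p_3:
p_3 = (escape_by_round_3 - escape_by_round_2) / (
    (1 - escape_by_round_2) * Fraction(1, 2)
)


def is_power_of_two(k):
    """True when the positive integer k is 1, 2, 4, 8, ..."""
    return k > 0 and k & (k - 1) == 0


print(p_3)                                # the implied probability of saying "3"
print(is_power_of_two(p_3.denominator))   # whether its denominator is a power of two
```

`Fraction` reduces to lowest terms automatically, so `p_3.denominator` is the denominator that matters for the power-of-two check.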