Hiring decisions are not suitable for prediction markets 2024-01-08T21:11:14.304Z
Prediction Markets aren't Magic 2023-12-21T12:54:07.754Z
How to Interpret Prediction Market Prices as Probabilities 2023-05-09T14:12:27.394Z
Is Metaculus Slow to Update? 2022-03-25T19:44:33.624Z
Scott Alexander 2021 Predictions: Market Prices - Resolution 2022-01-02T11:55:47.553Z
Risk Premiums vs Prediction Markets 2021-07-28T23:03:50.398Z
Scott Alexander 2021 Predictions: Market Prices 2021-04-27T14:03:10.995Z
Never Go Full Kelly 2021-02-25T12:53:50.618Z
Kelly isn't (just) about logarithmic utility 2021-02-23T12:12:24.999Z


Comment by SimonM on Why have insurance markets succeeded where prediction markets have not? · 2024-01-22T11:51:26.827Z · LW · GW

I wrote about exactly this recently-

Comment by SimonM on Hiring decisions are not suitable for prediction markets · 2024-01-09T17:23:29.251Z · LW · GW

I don't give much weight to his diagnosis of problematic group decision mechanisms

I have quite a lot of time for it personally.

The world is dominated by a lot of large organizations that have a lot of dysfunction. Anybody over the age of 40 will just agree with me on this. I think it's pretty hard to find anybody who would disagree about that who's been around the world. Our world is full of big organizations that just make a lot of bad decisions because they find it hard to aggregate information from all the different people.

This is roughly Hanson's reasoning, and you can spell out the details a bit more (poor communication between high-level decision makers and shop-floor workers, incentives at all levels dissuading truth-telling, etc.). Fundamentally, though, I find it hard to make a case that this isn't true in /any/ large organization. Maybe the big tech companies can make a case for this, but I doubt it. Office politics and self-interest are powerful forces.

For employment decisions, it's not clear that there is usable (legally and socially tolerated) information which a market can provide

I roughly agree - this is the point I was trying to make. All the information is already there in interview evaluations. I don't think Robin is expecting new information though - he's expecting to combine the information more effectively. I just don't expect that to make much difference in this case.

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-24T12:14:07.878Z · LW · GW

So the first question is: "how much should we expect the sample mean to move?". 

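If the current state is the sample mean $\bar{x}_n$ after $n$ draws, and we see a sample $x$ (where $x$ is going to be 0 or 1 based on whether or not we have heads or tails), then the expected (squared) change is:

$$\mathbb{E}\left[(\bar{x}_{n+1} - \bar{x}_n)^2\right] = \mathbb{E}\left[\left(\frac{x - \bar{x}_n}{n+1}\right)^2\right] = \frac{\sigma^2}{(n+1)^2}$$

In these steps we are using the facts that $x$ is independent of the previous samples, and that the distribution of $x$ is Bernoulli with $p = \bar{x}_n$. (So $\mathbb{E}[x] = \bar{x}_n$ and $\mathrm{Var}(x) = \bar{x}_n(1 - \bar{x}_n) = \sigma^2$.)

To do the proper version of this, we would be interested in how our prior changes, and our distribution for $x$ wouldn't purely be a function of $\bar{x}_n$. This will reduce the difference, so I have glossed over this detail.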
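As a rough numerical check of how far one more draw moves a sample mean, here is a minimal sketch. It assumes, as in the simplification above, that the next draw is Bernoulli with probability equal to the current sample mean; the names `mean_shift_stats`, `mean_n` and `n` are illustrative and not taken from the post.

```python
import math


def mean_shift_stats(mean_n, n):
    """Return (expected absolute move, root-mean-square move) of the sample
    mean after one more 0/1 draw, assuming the draw is Bernoulli(mean_n)."""
    p = mean_n                        # assumed probability that the next draw is 1
    up = (1.0 - mean_n) / (n + 1)     # change in the mean if the draw is 1
    down = (0.0 - mean_n) / (n + 1)   # change in the mean if the draw is 0
    expected_abs = p * abs(up) + (1 - p) * abs(down)
    rms = math.sqrt(p * up**2 + (1 - p) * down**2)
    return expected_abs, rms


# The move shrinks roughly like 1/n as more draws accumulate.
for n in (10, 100, 1000):
    abs_move, rms_move = mean_shift_stats(0.5, n)
    print(n, abs_move, rms_move)
```

The root-mean-square move equals the Bernoulli standard deviation divided by (n + 1), so each extra sample moves the estimate less than the one before.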

The next question is: "given we shift the market parameter by a small amount, how much money (pnl) should we expect to be able to extract from the market in expectation?"

For this, I am assuming that our market is equivalent to a proper scoring rule. This duality is laid out nicely here. Expanding the proper scoring rule out locally, it must be quadratic in the shift to leading order, since we have to be at a local minimum. To use some classic examples, in a log scoring rule:

in a Brier scoring rule:

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-24T12:13:56.414Z · LW · GW

Whoops. Good catch. Fixing

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-23T17:03:00.986Z · LW · GW

  • x is the result of the (n+1)th draw
  • sigma is the standard deviation after the first n draws
  • pnl is the profit and loss the bettor can expect to earn

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-22T09:19:00.190Z · LW · GW

Prediction markets generate information. Information is valuable as a public good. Failure of public good provision is not a failure of prediction markets.

I think you've slightly missed my point. My claim is narrower than this. I'm saying that prediction markets have a concrete issue which means you should expect them to be less efficient at gathering data than alternatives. Even if information is a public good, it might not be worth as much as prediction markets would charge to find that information. Imagine the cost of information via a prediction market were exponential in the cost of information gathering: that wouldn't mean the right answer is to subsidise prediction markets more.

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-22T09:15:32.617Z · LW · GW

If you have another suggestion for a title, I'd be happy to use it

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-22T09:14:13.791Z · LW · GW

Even if there is no acceptable way to share the data semi-anonymously outside of match group, the arguments for prediction markets still apply within match group. A well designed prediction market would still be a better way to distribute internal resources and rewards amongst competing data science teams within match group.

I used to think things like this, but now I disagree, and actually think it's fairly unlikely this is the case.

  1. Internal prediction markets have been tried (and have failed) at multiple large organisations who made serious efforts to create them
  2. As I've explained in this post, prediction markets are very inefficient at sharing rewards. Internal to a company you are unlikely to have the right incentives in place as much as just subsidising a single team who can share models etc. The added frictions of a market are substantial.
  3. The big selling points of prediction markets (imo) come from:
    1. Being able to share results without sharing information (ie I can do some research, keep the information secret, but have people benefit from the conclusions)
    2. Incentivising a wider range of people. At an organisation, you'd hire the most appropriate people into your data science team and let them run. There's no need to wonder if someone from marketing is going to outperform their algorithm.

People who actually match and meetup with another user will probably have important inside view information inaccessible to the algorithms of match group.

I strongly agree. I think people often confuse "market" and "prediction market". There is another (arguably better) model of dating apps which is that the market participants are the users, and the site is actually acting as a matching engine. Since I (generally) think markets are great, this also seems pretty great to me.

Comment by SimonM on Prediction Markets aren't Magic · 2023-12-21T14:12:49.678Z · LW · GW

Sure - but that answer doesn't explain their relative lack of success in other countries (eg the UK)

Additionally, where prediction markets work well (eg sports betting, political betting) there is a thriving offshore market catering to US customers. 

Comment by SimonM on Solving Two-Sided Adverse Selection with Prediction Market Matchmaking · 2023-12-21T12:53:43.775Z · LW · GW

This post triggered me a bit, so I ended up writing one of my own.

I agree the entire thing is about how to subsidise the markets, but I think you're overestimating how good markets are as a mechanism for subsidising forecasting (in general). Specifically for your examples:

  1. Direct subsidies are expensive relative to the alternatives (the point of my post)
  2. Hedging doesn't apply in lots of markets, and in the ones where it does make sense those markets already exist. (Eg insurance)
  3. New traders is a terrible idea, as you say. It will work in some niches (eg where there's lots of organic interest), but it won't work at scale for important things.

Comment by SimonM on Solving Two-Sided Adverse Selection with Prediction Market Matchmaking · 2023-11-27T11:22:18.861Z · LW · GW

I'm excited about the potential of conditional prediction markets to improve on them and solve two-sided adverse selection.

This applies to roughly the entire post, but I see an awful lot of magical thinking in this space. What is the actual mechanism by which you think prediction markets will solve these problems?

In order to get a good prediction from a market you need traders to put prices in the right places. This means you need to subsidise the markets. Whether or not a subsidised prediction market is going to be cheaper for the equivalent level of forecast than paying another 3rd party (as is currently the case in most of your examples) is very unclear to me.

Comment by SimonM on AI #39: The Week of OpenAI · 2023-11-23T17:25:52.414Z · LW · GW

A thing Larry Summers once said that seems relevant, from Elizabeth Warren:

He said something very similar to Yanis Varoufakis, and now I like to assume he goes around saying this to everyone.

Comment by SimonM on When and why should you use the Kelly criterion? · 2023-11-08T15:19:17.245Z · LW · GW

No, it's fairly straightforward to see this won't work

Let N be the random variable denoting the number of rounds. Let x = p*w+(1-p)*l, where p is the probability of winning and w=1-f+o*f, l=1-f are the factors our wealth is multiplied by when we win or lose betting a fraction f of our wealth.

Then the value we care about is E[x^N], which is the moment generating function of N evaluated at log(x). Since our mgf is increasing as a function of x, we want to maximise x, ie the linear-utility answer doesn't change.
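A minimal sketch of this argument, assuming the rounds are independent of each other and of N, and reading o as the multiple of the stake returned on a win (so o = 2 is an even-money bet). The distribution of N and all numbers are hypothetical, and the function names are illustrative only.

```python
def growth_factor(f, p, o):
    """Expected one-round wealth multiplier x = p*w + (1-p)*l."""
    w = 1 - f + o * f   # wealth multiplier after a win
    l = 1 - f           # wealth multiplier after a loss
    return p * w + (1 - p) * l


def expected_wealth(f, p, o, rounds_pmf):
    """E[x^N] for a random number of rounds N with pmf {n: P(N = n)}."""
    x = growth_factor(f, p, o)
    return sum(prob * x**n for n, prob in rounds_pmf.items())


# Hypothetical inputs: 60% win chance, even-money bet, N is 1, 2 or 5 rounds.
p, o = 0.6, 2.0
rounds_pmf = {1: 0.5, 2: 0.3, 5: 0.2}
fractions = [0.0, 0.2, 0.5, 1.0]
values = [expected_wealth(f, p, o, rounds_pmf) for f in fractions]
best_fraction = fractions[values.index(max(values))]
print(best_fraction)
```

With a positive edge, x grows with f, so expected wealth is highest at the largest fraction on the list whatever the distribution of N. The Kelly fraction for these inputs (0.2) is not the maximiser of expected wealth.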

Comment by SimonM on How to (hopefully ethically) make money off of AGI · 2023-11-08T07:13:55.776Z · LW · GW

Yes? 1/ it's not in their mandate; 2/ they've never done it before (I guess you could argue the UK did in 2022, but I'm not sure this is quite the same); 3/ it's not clear that this form of QE would have the effect you're expecting on long-end yields.

Comment by SimonM on How to (hopefully ethically) make money off of AGI · 2023-11-07T21:30:33.183Z · LW · GW

I absolutely do not recommend shorting long-dated bonds. However, if I did want to do so as a retail investor, I would maintain a rolling short in CME treasury futures. Longest future is UB. You'd need to roll your short once every 3 months, and you'd also want to adjust the size each time, given that the changing CTD means that the same number of contracts doesn't necessarily mean the same amount of risk each expiry.

Comment by SimonM on How to (hopefully ethically) make money off of AGI · 2023-11-07T21:24:34.175Z · LW · GW

Err... just so I'm clear lots of money being printed will devalue those long dated bonds even more, making the bond short an even better trade? (Or are you talking about some kind of YCC scenario?)

Comment by SimonM on When and why should you use the Kelly criterion? · 2023-11-06T09:08:41.011Z · LW · GW

average returns

I think the disagreement here is on what "average" means. All-in maximises the arithmetic average return. Kelly maximises the geometric average. Which average is more relevant is equivalent to the Kelly debate though, so hard to say much more
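To make the two averages concrete, here is a small sketch comparing them over a grid of bet fractions. It assumes a repeated binary bet with win probability p that returns o times the stake on a win; the numbers are hypothetical and the function names are illustrative.

```python
def arithmetic_mean_return(f, p, o):
    """Probability-weighted average of the per-round wealth multipliers."""
    w, l = 1 - f + o * f, 1 - f
    return p * w + (1 - p) * l


def geometric_mean_return(f, p, o):
    """Probability-weighted geometric average of the wealth multipliers."""
    w, l = 1 - f + o * f, 1 - f
    return w**p * l ** (1 - p)


# Hypothetical bet: 60% win chance at even money.
p, o = 0.6, 2.0
grid = [i / 100 for i in range(101)]
best_arithmetic = max(grid, key=lambda f: arithmetic_mean_return(f, p, o))
best_geometric = max(grid, key=lambda f: geometric_mean_return(f, p, o))
kelly_fraction = (p * o - 1) / (o - 1)   # standard Kelly formula for this bet
print(best_arithmetic, best_geometric, kelly_fraction)
```

Under these assumptions the arithmetic average is maximised by betting everything, while the geometric average is maximised at the Kelly fraction and drops to zero at all-in, because a single loss wipes out the bankroll.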

Comment by SimonM on AI #30: Dalle-3 and GPT-3.5-Instruct-Turbo · 2023-09-21T15:38:56.093Z · LW · GW

Wouldn’t You Prefer a Good Game of Chess?

I assume this was supposed to be a WarGames reference, in which case I think it should be a "nice" game of chess.

Comment by SimonM on My guess for why I was wrong about US housing · 2023-06-14T16:27:02.064Z · LW · GW

Yeah, and it doesn't adjust for taxes there either. I thought this was less of an issue when comparing rents to owning though, as the same error should affect both equally.

Comment by SimonM on My guess for why I was wrong about US housing · 2023-06-14T14:21:21.177Z · LW · GW

This doesn't seem to account for property taxes, which I expect would change the story quite a bit for the US.

Comment by SimonM on Do humans still provide value in correspondence chess? · 2023-05-24T09:21:56.515Z · LW · GW

This seems needlessly narrow minded. Just because AI is better than humans doesn't make it uniformly better than humans in all subtasks of chess.

I don't know enough about the specifics that this guy is talking about (I am not an expert) but I do know that until the release of NN-based algorithms most top players were still comfortable talking about positions where the computer was mis-evaluating positions soon out of the opening.

To take another more concrete example - computers were much better than humans in 2004, and yet Peter Leko still managed to refute a computer prepared line OTB in a world championship game.

Comment by SimonM on Do humans still provide value in correspondence chess? · 2023-05-23T20:05:19.957Z · LW · GW

Agreed - as I said, the most important things are compute and diligence. Just because a large fraction of the top games are draws doesn't really say much about whether or not there is an edge being given by the humans (a large fraction of elite chess games are draws, but no-one doubts there are differences in skill level there). Really you'd want to see Jon Edwards's setup vs a completely untweaked engine being administered by a novice.

Comment by SimonM on Do humans still provide value in correspondence chess? · 2023-05-23T15:05:19.667Z · LW · GW

I believe the answer is potentially. The main things which matter in high-level correspondence chess are:

  1. Total amount of compute available to players
  2. Not making errors

Although I don't think either of those are really relevant. The really relevant bit is (apparently) planning:

For me, the key is planning, which computers do not do well — Petrosian-like evaluations of where pieces belong, what exchanges are needed, and what move orders are most precise within the long-term plan.

(From this interview with Jon Edwards (reigning correspondence world champion) from New In Chess)

I would also highly recommend the interview with Jon Edwards on the Perpetual Chess podcast.

I'll leave you with this final quote, which has stuck with me for ages:

The most important game in the Final was my game against Osipov. I really hoped to win in order to extend my razor-thin lead, and the game’s 119 moves testify to my determination. In one middlegame sequence, to make progress, I had to find a way to force him to advance his b-pawn one square, all while avoiding the 50-move rule. I accomplished the feat in 38 moves, in a sequence that no computer would consider or find. Such is the joy of high-level correspondence chess. Sadly, I did not subsequently find a win. But happily, I won the Final without it!

Comment by SimonM on On AI and Interest Rates · 2023-01-17T17:51:05.215Z · LW · GW

I agree, as I said here

Comment by SimonM on GraphQL tutorial for LessWrong and Effective Altruism Forum · 2022-12-28T22:22:33.713Z · LW · GW

Just in case anyone is struggling to find the relevant bits of the codebase, my best guess is the link for the collections folder in github is now here.

You are looking in "views.ts" eg .../collections/comments/views.ts

The best thing to search for (I found) was ".addView(" and see what fits your requirements

Comment by SimonM on Log-odds are better than Probabilities · 2022-12-13T10:23:41.487Z · LW · GW

I feel in all these contexts odds are better than log-odds.

Log-odds simplifies Bayesian calculations: so does odds. (The addition becomes multiplication)

Every number is meaningful: every positive number is meaningful and the numbers are clearer. I can tell you intuitively what 4:1 or 1:4 means. I can't tell you what -2.4 means quickly, especially if I have to keep specifying a base.

Certainty is infinite: same is true for odds

Negation is the complement and 0 is neutral: Inverse is the complement and 1 is neutral. 1:1 means "I don't know" and 1:x is the inverse of x:1. Both of these are intuitive to me.
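A short sketch of the comparison, using a hypothetical prior of 1:4 and a hypothetical likelihood ratio of 8. The helper names are illustrative; the point is only that the odds update is a multiplication where the log-odds update is an addition.

```python
import math


def prob_to_odds(p):
    """Convert a probability to odds in favour (p : 1-p as a single number)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds in favour back to a probability."""
    return odds / (1 + odds)


prior_odds = 1 / 4            # 1:4, hypothetical prior
likelihood_ratio = 8.0        # hypothetical strength of evidence

# Bayes in odds form: multiply.
posterior_odds = prior_odds * likelihood_ratio

# Bayes in log-odds form: add.
posterior_log_odds = math.log(prior_odds) + math.log(likelihood_ratio)

print(posterior_odds, odds_to_prob(posterior_odds), posterior_log_odds)
```

Both routes give the same posterior, 2:1. The complement property also shows up directly: the odds for an event and for its negation multiply to 1.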

Comment by SimonM on Is Metaculus Slow to Update? · 2022-04-22T17:19:22.000Z · LW · GW

No - I think probability is the thing supposed to be a martingale, but I might be being dumb here.

Comment by SimonM on Thoughts on the SPIES Forecasting Method? · 2022-03-19T18:33:43.958Z · LW · GW

So, what do you think? Does this method seem at all promising? I'm debating with myself whether I should begin using SPIES on Metaculus or elsewhere.

I'm not super impressed tbh. I don't see "give a 90% confidence interval for x" as a question which comes up frequently? (At least in the context of eliciting forecasts and estimates from humans - it comes up quite a bit in data analysis).

For example, I don't really understand how you'd use it as a method on Metaculus. Metaculus has 2 question types - binary and continuous. For binary you have to give the probability an event happens - not sure how you'd use SPIES to help here. For continuous you are effectively doing the first step of SPIES - specifying the full distribution.

If I was to make a positive case for this, it would be: forcing people to give a full distribution results in better forecasts for sub-intervals. This seems an interesting (and plausible) claim, but I don't find anything beyond that insight especially valuable.

Comment by SimonM on 2022 ACX predictions: market prices · 2022-03-08T11:14:11.610Z · LW · GW

17. Unemployment below five percent in December: 73 (Kalshi said 92% that unemployment never goes above 6%; 49 from Manifold)

I'm not sure exactly how you're converting 92% unemployment < 6% to < 5%, but I'm not entirely convinced by your methodology?

15. The Fed ends up doing more than its currently forecast three interest rate hikes: None (couldn't find any markets)

Looking at the SOFR Dec-22 3M futures 99.25/99.125 put spread on the 14-Feb, I put this probability at ~84%. 
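The comment does not spell out how the ~84% was derived. One common back-of-the-envelope approach, shown here purely as an assumed method, is to divide the price of a narrow put spread by its width. The premium below is a hypothetical number, not market data, and the sketch ignores discounting and any risk premium.

```python
def put_spread_implied_probability(premium, high_strike, low_strike):
    """Approximate probability that the future settles below the strikes.

    A put spread (long the high-strike put, short the low-strike put) pays
    the full strike width if the future settles at or below the low strike
    and nothing at or above the high strike, so premium / width lies between
    P(settle < low_strike) and P(settle < high_strike).
    """
    width = high_strike - low_strike
    if width <= 0:
        raise ValueError("high_strike must be above low_strike")
    return premium / width


# Hypothetical premium for the 99.25/99.125 put spread.
premium = 0.105
probability = put_spread_implied_probability(premium, 99.25, 99.125)
print(probability)
```

Because the spread is only 12.5bps wide, the gap between the two bounding probabilities is small, which is why a tight spread is a convenient proxy for a digital payoff.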

Thanks for doing this, I started doing it before I saw your competition and then decided against since it would have made cheating too easy. (Also why I didn't enter)  

Comment by SimonM on Capturing Uncertainty in Prediction Markets · 2022-02-25T16:51:30.112Z · LW · GW

And one way to accomplish that would be to bet on what percentage of bets are on "uncertainty" vs. a prediction.

How do you plan on incentivising people to bet on "uncertainty"? All the ways I can think of lead to people either gaming the index, or turning uncertainty into a Keynesian beauty contest (KBC).

Comment by SimonM on Capturing Uncertainty in Prediction Markets · 2022-02-25T16:48:28.644Z · LW · GW

The market and most of the indicators you mentioned would be dominated by the 60 that placed large bets

I disagree with this. Volatility, liquidity, # predictors, spread of forecasts will all be affected by the fact that 20 people aren't willing to get involved. I'm not sure what information you think is being lost by people stepping away? (I guess the difference between "the market is wrong" and "the market is uninteresting"?)

Comment by SimonM on Capturing Uncertainty in Prediction Markets · 2022-02-25T11:51:38.039Z · LW · GW

There are a bunch of different metrics which you could look at on a prediction market / prediction platform to gauge how "uncertain" the forecast is:

  • Volatility - if the forecast is moving around quite a bit, there are two reasons:
    • Lots of new information arriving and people updating efficiently
    • There is little conviction around "fair value" so traders can move the price with little capital
  • Liquidity - if the market is 49.9 / 50.1 in millions of dollars, then you can be fairly confident that 50% is the "right" price. If the market is 40 / 60 with $1 on the bid and $0.50 on the offer, I probably wouldn't be confident the probability lies between 40 and 60, let alone that "50% is the right number". (The equivalent on prediction platforms might be number of forecasters, although CharlesD has done some research on this which suggests there's little additional value being added by large numbers of forecasters.)
  • "Spread of forecasts" - on Metaculus (for example) you can see a distribution of people's forecasts. If everyone is tightly clustered around 50%, that (usually) gives me more confidence that 50% is the right number than if they are widely spread out.

Comment by SimonM on Prediction Markets are for Outcomes Beyond Our Control · 2022-02-09T10:25:20.319Z · LW · GW

Prediction markets function best when liquidity is high, but they break completely if the liquidity exceeds the price of influencing the outcome. Prediction markets function only in situations where outcomes are expensive to influence.


There are a ton of fun examples of this failing:

Comment by SimonM on Money-generating environments vs. wealth-building environments (or "my thoughts on the stock market") · 2022-02-05T15:43:10.502Z · LW · GW

I don't know enough about how equities trade during earnings, but I do know a little about how some other products trade during data releases and while people are speaking.

In general, the vast, vast, vast majority of liquidity is withdrawn from the market before the release. There will be a few stale orders people have left by accident, plus a few orders left in at levels deemed ridiculously unlikely. As soon as the data is released, the fastest players will generally send quotes, making a (fairly wide) market around their estimate of the fair price. Over time (and here we're still talking very fast) more players will come in, firming up that new market.

The absolute level of money which is being made during this period is relatively small. It's not like the first person to see the report gets to trade at the old price, they get to trade with any stale orders - the market just reprices with very little trading volume.

All of the money-making value was redeemed before people like you and me even had a chance to trade. Right?

Correct, you absolutely did not have the chance to be involved in this trade unless you work at one of a handful of firms which have spent 9 figure sums on doing this really, really well.

Comment by SimonM on Use Normal Predictions · 2022-01-17T09:41:39.463Z · LW · GW

I agree identifying model failure is something people can be good at (although I find people often forget to consider it). Pricing it they are usually pretty bad at.

Comment by SimonM on Use Normal Predictions · 2022-01-15T19:52:58.345Z · LW · GW

I'd personally be more interested in asking someone for their 95% CI than their 68% CI, if I had to ask them for exactly one of the two. (Although it might again depend on what exactly I plain to do with this estimate.)

I'm usually much more interested in a 68% CI (or a 50% CI) than a 95% CI because:

  1. People in general aren't super calibrated, especially at the tails
  2. You won't find out for a while how good their intervals are anyway
  3. What happens most often is usually the main interest. (Although in some scenarios the tails are all that matters, so again, depends on context - emphasis usually). I would like people to normalise narrower confidence intervals more.
  4. (as you note) the tails are often dominated by model failure, so you're asking a question less about their forecast, and more about their estimate of model failure. I want information about their model of the world rather than their beliefs about where their beliefs break down.

Comment by SimonM on Use Normal Predictions · 2022-01-14T14:54:39.517Z · LW · GW

Under what assumption?

1/ You aren't "[assuming] the errors are normally distributed". (Since a mixture of two normals isn't normal) in what you've written above.

2/ If your assumption is that the standardised error $z \sim N(0, 1)$, then yes, I agree the median of $z^2$ is ~0.45 (although

from scipy import stats

# Median of a chi-squared distribution with 1 degree of freedom
print(stats.chi2.ppf(0.5, df=1))  # ~0.4549

would have been an easier way to illustrate your point). I think this is actually the assumption you're making. [Which is a horrible assumption, because if it were true, you would already be perfectly calibrated].

3/ I guess your new claim is "[assuming] the errors are a mixture of normal distributions, centered at 0", which okay, fine, that's probably true; I don't care enough to check because it seems a bad assumption to make.

More importantly, there's a more fundamental problem with your post. You can't just take some numbers from my post and then put them in a different model and think that's in some sense equivalent. It's quite frankly bizarre. The equivalent model would be something like:

Comment by SimonM on The Unreasonable Feasibility Of Playing Chess Under The Influence · 2022-01-13T16:59:43.832Z · LW · GW

I think the controversy is mostly irrelevant at this point. Leela performed comparably to Stockfish in the latest TCEC season and is based on Alpha Zero. It has most of the "romantic" properties mentioned in the post.

Comment by SimonM on Use Normal Predictions · 2022-01-13T15:39:22.074Z · LW · GW

That isn't a "simple" observation.

Consider an error which is 0.5 22% of the time and 1.1 78% of the time. The squared errors are 0.25 and 1.21. The median error is 1.1 > 1. (The mean squared error is ~1.)
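The arithmetic behind this counterexample can be checked in a few lines. The variable names are illustrative; the two error sizes and their probabilities are the ones given above.

```python
# Absolute error (in standard deviations) -> probability of that error.
errors = {0.5: 0.22, 1.1: 0.78}

# Mean squared error: probability-weighted average of the squared errors.
mean_squared_error = sum(prob * e**2 for e, prob in errors.items())

# Share of outcomes whose absolute error exceeds 1 standard deviation.
share_above_one = sum(prob for e, prob in errors.items() if e > 1)

print(mean_squared_error, share_above_one)
```

The mean squared error comes out very close to 1, yet well over half of the errors are larger than 1, so a mean squared error of 1 alone does not imply that most errors are below 1.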

Comment by SimonM on Use Normal Predictions · 2022-01-13T14:33:06.821Z · LW · GW

Metaculus uses the cdf of the predicted distribution which is better If you have lots of predictions, my scheme gives an actionable number faster

You keep claiming this, but I don't understand why you think this

Comment by SimonM on Use Normal Predictions · 2022-01-13T14:32:24.877Z · LW · GW

If you suck like me and get a prediction very close then I would probably say: that sometimes happen :) note I assume the average squared error should be 1, which means most errors are less than 1, because 0^2+2^2=2>1

I assume you're making some unspoken assumptions here, because $\mathbb{E}[z^2] = 1$ (with $z$ the standardised error) is not enough to say that. A naive application of Chebyshev's inequality would just say that $\mathbb{P}(|z| \geq 1) \leq 1$.

To be more concrete, if you were very weird, and either end up forecasting 0.5 s.d. or 1.1 s.d. away, (still with mean 0 and average squared error 1) then you'd find "most" errors are more than 1.

Comment by SimonM on Use Normal Predictions · 2022-01-10T20:56:17.471Z · LW · GW

Go to your profile page. (The URL will be something like .../{some number}/.) Then in the track record section, switch from Brier Score to "Log Score (continuous)".

Comment by SimonM on Use Normal Predictions · 2022-01-10T09:12:49.253Z · LW · GW

I'd be happy to.

Comment by SimonM on Two ominous charts on the financial markets · 2022-01-10T09:04:43.764Z · LW · GW

The 2000-2021 VIX has averaged 19.7, sp500 annualized vol 18.1.

I think you're trying to say something here like 18.1 <= 19.7, therefore VIX (and by extension) options are expensive. This is an error. I explain more in detail here, but in short you're comparing expected variance and expected volatility which aren't the same thing.
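One way to see why an average volatility and a volatility derived from average variance are not directly comparable is Jensen's inequality: the square root of an average variance is at least the average of the square roots. The yearly volatility figures below are hypothetical and chosen only to illustrate the gap.

```python
import math

# Hypothetical annualised volatilities (in vol points) for four years.
yearly_vol = [10.0, 15.0, 20.0, 40.0]

# Average of the volatilities themselves.
mean_vol = sum(yearly_vol) / len(yearly_vol)

# Average of the variances, converted back to volatility units.
mean_variance = sum(v**2 for v in yearly_vol) / len(yearly_vol)
vol_from_mean_variance = math.sqrt(mean_variance)

print(mean_vol, vol_from_mean_variance)
```

Even with no risk premium at all, a variance-based figure sits above the average volatility whenever volatility itself varies, so a gap between the two numbers is not on its own evidence that options are expensive.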

From a 2ndary source: "The mean of the realistic volatility risk premium since 2000 has been 11% of implied volatility, with a standard deviation of roughly 15%-points" from . So 1/3 of the time the premia is outside [-4%,26%], which swamps a lot of vix info about true expect vol.

I'm not going to look too closely at that, but anything which tries to say the VRP was solidly positive post 2015 just doesn't gel with my understanding of that market. (For example). (Also, fwiw anyone who quotes changes in volatility in percentages should be treated with suspicion at best)

-60% would the worst draw down ever, the prior should be <<1%. However, 8 years have been above 30% since 1928 (9%), seems you're using a non-symetric CI.

Yeah, it's not symmetric, but I wasn't the person who suggested it. All I'm saying is "OP says [interval] has probability 90%" "market says [interval] has probability 90%".

The reasoning for why there'd be such a drawdown is backwards in OP: because real rates are so low the returns for owning stocks has declined accordingly. If you expect 0% rates and no growth stocks are priced reasonably, yielding 4%/year more than bonds. Thinking in the level of rates not changes to rates makes more sense, since investments are based on current projected rates. A discounted cash flow analysis works regardless of how rates change year to year. Currently the 30yr is trading at 2.11% so real rates around the 0 bound is the consensus view.

OP being my post or arunto's?

There are several things unclear with this paragraph though:

  1. Stocks are currently 'yielding' 1.3% (dividend yield) or 3.9% ('earnings' yield). Not sure exactly what yield you think is 4% over bonds. (Or which maturity bond you're considering).
  2. "Thinking in the level of rates not changes to rates makes more sense, since investments are based on current projected rates.". The forward curve is upward sloping, yes, but if arunto thinks rates are going to change higher than what the market forecasts that will definitely change the price of equities. "A discounted cash flow analysis works regardless of how rates change year to year." Yes, but if you change the rates in your DCF you will change your price
  3. "Currently the 30yr is trading at 2.11% so real rates around the 0 bound is the consensus view.". Currently 30y real rates are -15bps after a steep sell-off after the start of the year. 30y real rates were as low as -60bps in December.

    10y real rates are more like -75bps (up from -110bps in December). 

    "the 0 bound" is something people talk about in nominal space because the yield on cash is somewhere in that ballpark. (These days people generally think that figure should be around -50 to -100bps depending on which euro rates trader you speak to). For real rates there's no particular reason to think there is any significant bound - 10y real rates in the US have been negative since the start of 2020; in the UK they've been negative since the early 2010s.
Comment by SimonM on Use Normal Predictions · 2022-01-10T08:31:48.637Z · LW · GW

I still think you're missing my point.

If you're making ~20 predictions a year, you shouldn't be doing any funky math to analyse your forecasts. Just go through each one after the fact and decide whether or not the forecast was sensible with the benefit of hindsight.

I am even explaining what an normal distribution is because I do not expect my audience to know...

I think this is exactly my point, if someone doesn't know what a normal distribution is, maybe they should be looking at their forecasts in a fuzzier way than trying to back fit some model to them.

All I propose that people sometimes make continuous predictions, and if they want to start doing that and track how much they suck, then I give them instructions to quickly getting a number for how well it is going.

I disagree that's all you propose. As I said in an earlier comment, I'm broadly in favour of people making continuous forecasts as they convey more information. You paired your article with what I believe is broadly bad advice around analysing those forecasts. (Especially if we're talking about a sample of ~20 forecasts)

Comment by SimonM on Use Normal Predictions · 2022-01-09T21:26:47.383Z · LW · GW

I disagree with that characterisation of our disagreement, I think it's far more fundamental than that.

  1. I think you misrepresent the nature of forecasting (in its generality) versus modelling in some specifics
  2. I think your methodology is needlessly complicated
  3. I propose what I think is a better methodology

To expand on 1. I think (although I'm not certain, because I find your writing somewhat convoluted and unclear) that you're making an implicit assumption that the error distribution is consistent from forecast to forecast. Namely your errors when forecasting COVID deaths and Biden's vote share come from some similar process. This doesn't really mirror my experience in forecasting. I think this model makes much more sense when looking at a single model which produces lots of forecasts. For example, if I had a model for COVID deaths each week, and after 5-10 weeks I noticed that my model was under or over confident then this sort of approach might make sense to tweak my model. 

To expand on 2. I've read your article a few times and I still don't fully understand what you're getting at. As far as I can tell, you're proposing a model for how to adjust your forecasts based on looking at their historic performance. Having a specific model for doing this seems to miss the point of what forecasting in the real world is like. I've never created a forecast and gone "hmm... usually when I forecast things with 20% they happen 15% of the time, so I'm adjusting my forecast down" (which is, I think, what you're advocating). It's more likely a notion of, "I am often over/under confident, when I create this model is there some source of variance I am missing / over-estimating?". Setting some concrete rules for this doesn't make much sense to me.

Yes, I do think it's much simpler for people to look at a list of percentiles of things happening, to plot them, and then think "am I generally over-confident / under-confident"? I think it's generally much easier for people to reason about percentiles than standard-deviations. (Yes, I know 68-95-99, but I don't know without thinking quite hard what 1.4 sd or 0.5 sd means). I think leaning too heavily on the math tends to make people make some pretty obvious mistakes.
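A minimal sketch of the percentile approach, assuming for convenience that each forecast was recorded as a normal distribution (mean, standard deviation). Any distribution with a CDF would work the same way. The forecasts and outcomes are hypothetical, and the function names are illustrative.

```python
from statistics import NormalDist


def outcome_percentiles(forecasts, outcomes):
    """Percentile of each outcome within its own forecast distribution."""
    return [NormalDist(mu, sd).cdf(y) for (mu, sd), y in zip(forecasts, outcomes)]


def central_coverage(percentiles, level):
    """Fraction of outcomes that landed inside the central `level` interval."""
    lower, upper = (1 - level) / 2, (1 + level) / 2
    return sum(lower <= u <= upper for u in percentiles) / len(percentiles)


# Hypothetical forecasts as (mean, sd) and the values that actually occurred.
forecasts = [(10.0, 2.0), (0.0, 1.0), (5.0, 1.0)]
outcomes = [10.0, 3.0, 5.5]

percentiles = outcome_percentiles(forecasts, outcomes)
print(percentiles, central_coverage(percentiles, 0.5))
```

A well-calibrated forecaster should see these percentiles spread roughly evenly between 0 and 1 over many forecasts. Too many near 0 or 1 suggests over-confidence, and too many bunched around 0.5 suggests under-confidence. With only a handful of forecasts this is something to eyeball, not fit.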

Comment by SimonM on Two ominous charts on the financial markets · 2022-01-09T19:32:13.406Z · LW · GW

d/ is actually completely consistent with the vol market (I point this out here), so it's not clear that's their recommendation.

Comment by SimonM on Use Normal Predictions · 2022-01-09T19:20:09.477Z · LW · GW

If you think 2 data points are sufficient to update your methodology to 3 s.f. of precision I don't know what to tell you. I think if I have 2 data points and one of them is 0.99 then it's pretty clear I should make my intervals wider, but how much wider is still very uncertain with very little data. (It's also not clear if I should be making my intervals wider or changing my mean too)

Comment by SimonM on Use Normal Predictions · 2022-01-09T18:33:42.658Z · LW · GW

you are missing the step where I am transforming arbitrary distribution to U(0, 1)

I am absolutely not missing that step. I am suggesting that should be the only step.

(I don't agree with your intuitions in your "explanation" but I'll let someone else deconstruct that if they want)

Comment by SimonM on Use Normal Predictions · 2022-01-09T18:06:23.551Z · LW · GW

you need less data to check whether your squared errors are close to 1 than whether your inverse CDF look uniform

I don't understand why you think that's true. To rephrase what you've written:

"You need less data to check whether samples are approximately N(0,1) than if they are approximately U(0,1)"

It seems especially strange when you think that transforming your U(0,1) samples to N(0,1) makes the problem soluble.