# Grading myself on SSC's 2020 predictions

post by knite · 2021-03-01T19:55:18.413Z · LW · GW · 8 comments

Mantic Monday: Judging April COVID Predictions

I privately recorded my own predictions on these questions. I felt uncomfortable posting these publicly at the time. I can't fix that, but I can post my answers now and do better next time.

Score: 3.34

Edit: Per @bucky, I'm pretty sure the scoring rule I used (ln(p) - ln(.5)) is wrong, but I'm not sure what would be the correct rule.

comment by Bucky · 2021-03-02T07:06:18.202Z · LW(p) · GW(p)

Welcome to the predictions fun!

I'm impressed with how little you put on 14 & 15; those were particularly good predictions IMO.

I think there might be an error on your calculation sheet - for instance your score for 3 should be the same as your score for 5?

Replies from: knite, knite
comment by knite · 2021-03-02T22:22:55.391Z · LW(p) · GW(p)

Regarding 14/15, I felt that we were probably under-reacting, but "general consensus" is tricky. We were in the home stretch of the Trump presidency so I figured the baseline odds of "consensus" on anything were extremely low.

I'm kicking myself on #16 - I don't know enough about epidemiology to make such a strong guess.

Replies from: Bucky
comment by Bucky · 2021-03-03T08:42:25.123Z · LW(p) · GW(p)

> I'm kicking myself on #16 - I don't know enough about epidemiology to make such a strong guess.

Yeah, I did the same thing on #38, where I was overconfident on an economy question, a subject I don't know nearly enough about.

On #16 itself, I was lower than I should have been because I used "virus" as the reference class rather than "respiratory virus", which looks like an obvious mistake in hindsight.

comment by knite · 2021-03-02T22:16:44.241Z · LW(p) · GW(p)

Is the rule supposed to be symmetric around 50%? I used ln(p) - ln(.5) because Scott wrote:

"I scored these using a logarithmic scoring rule, adjusted so that guessing 50-50 always gave zero points."

However, this doesn't square with his second statement:

"Getting everything maximally right gives a score of about 14; guessing 50-50 for everything gives a score of 0, getting everything maximally wrong gives a score of negative infinity."

Do you know what the correct scoring rule is?
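
One way to check whether the two quoted statements are compatible is to look at the range of the per-question score. This assumes that p is the probability placed on the outcome that actually happened, which the quotes don't spell out:

$$
s(p) = \ln p - \ln 0.5, \qquad s(0.5) = 0, \qquad \lim_{p \to 1} s(p) = \ln 2 \approx 0.693, \qquad \lim_{p \to 0^{+}} s(p) = -\infty
$$

Under that reading the rule is not symmetric in score. A total of "about 14" would correspond to roughly 14 / 0.693 ≈ 20 questions answered correctly with full confidence, while a single maximally wrong answer is enough to drive the total to negative infinity.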

Replies from: Bucky
comment by Bucky · 2021-03-03T08:33:18.832Z · LW(p) · GW(p)

It looks like you're using the correct formula but maybe misreading what the "p" in it means, so your scores on questions where the result was "false" are incorrect.

I think you may have used ln(probability put on "true")-ln(.5) and then multiplied the result by -1 if the actual answer was false?

The formulation Scott used was ln(probability put on the correct answer)-ln(.5)

So for q3 for example the calculation shouldn't be

but should be
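
As a generic illustration of the two readings described above, here is a minimal Python sketch. The function names and the 20% probability are hypothetical and chosen only for illustration; they are not knite's actual q3 entry or Scott's spreadsheet.

```python
import math


def log_score(p_true: float, outcome: bool) -> float:
    """ln(probability put on the correct answer) - ln(0.5)."""
    # Use the probability assigned to whichever outcome actually happened.
    p_correct = p_true if outcome else 1.0 - p_true
    return math.log(p_correct) - math.log(0.5)


def sign_flipped_score(p_true: float, outcome: bool) -> float:
    """The suspected mistake: ln(p_true) - ln(0.5), negated when the answer is false."""
    raw = math.log(p_true) - math.log(0.5)
    return raw if outcome else -raw


# Hypothetical example: 20% placed on "true", and the question resolved "false".
p_example = 0.2
print(log_score(p_example, False))           # ln(0.8) - ln(0.5), about 0.47
print(sign_flipped_score(p_example, False))  # -(ln(0.2) - ln(0.5)), about 0.92
```

The two functions agree when the answer is "true" and differ only when it is "false", which is where the sign-flipped version overstates the score in this example.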

Replies from: gjm
comment by gjm · 2021-03-03T13:54:58.257Z · LW(p) · GW(p)

That looks right to me. If so, and if I've done the calculations right, the actual score should be (not +3.34 but) -1.89, just a little bit better than Bucky's score according to Scott. (Except that #18 -- whether Scott went back to working in the office -- seems to be missing; perhaps you didn't bother predicting on that one because it seemed too Scott-specific? So comparison against others who did predict that one will be misleading unless you remove it from their score. Scott, Zvi and Bucky all lost quite a few points on #18.)

Replies from: Bucky
comment by Bucky · 2021-03-03T15:31:18.229Z · LW(p) · GW(p)

Yeah, I didn't actually answer q18 either, for exactly that reason (possibly knite used my list as a basis?). For that question Scott just gave me the same answer as his own to make an apples-to-apples comparison, which seemed fine. I have no idea what I would have put if I had answered!

comment by Zvi · 2021-03-03T11:27:04.097Z · LW(p) · GW(p)

Congratulations on fully participating and posting, even if you kept your initial predictions private. What I'd most encourage now is what my post did: explain your reasoning, especially where your numbers differed from mine or Scott's, and think about how good your logic was in each case and what you think your best prediction was given your knowledge at the time.

Also, question 18 is missing here?