Someone who is near the top of the leaderboard is both accurate and highly experienced
I think this unfortunately isn't true right now, and just copying the community prediction would place very highly (I'm guessing if made as soon as the community prediction appeared and updated every day, easily top 3 (edit: top 10)). See my comment below for more details.
You can look at someone's track record in detail, but we're also planning to roll out more ways to compare people with each other.
I'm very glad to hear this. I really enjoy Metaculus, but my main gripe with it has always been (as others have pointed out) the lack of a way to distinguish between quality and quantity. I'm looking forward to a more comprehensive selection of metrics to help with this!
Looking at my track record, for questions resolved in the last 3 months, evaluated at all times, here's how my log score looks compared to the community:
Binary questions (N=19): me: -.072 vs. community: -.045
Continuous questions (N=20): me: 2.35 vs. community: 2.33
So if anything, I've done a bit worse than the community overall, and am in 5th by virtue of predicting on all questions. It's likely that the predictors significantly in front of me are that far ahead in part due to having predicted on (a) questions that have resolved recently but closed before I was active and (b) a longer portion of the lifespan for questions that were open before I became active.
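For anyone unfamiliar with this kind of comparison, here is a minimal sketch of how a mean log score on binary questions can be computed for a forecaster and for a community baseline. The scoring convention (log base 2 of the probability given to the realised outcome, plus 1, so that a 50% forecast scores 0) is an assumption on my part rather than something confirmed in this thread, and the forecasts and outcomes below are made-up illustrative numbers, not anyone's actual track record.

```python
import math


def binary_log_score(p_yes: float, resolved_yes: bool) -> float:
    """Score one binary forecast.

    Assumed convention: log2(probability given to the outcome) + 1,
    so a 50% forecast scores 0 and a confident correct forecast approaches 1.
    """
    p_outcome = p_yes if resolved_yes else 1.0 - p_yes
    return math.log2(p_outcome) + 1.0


def mean_log_score(forecasts, outcomes):
    """Average the per-question scores over a set of resolved questions."""
    scores = [binary_log_score(p, o) for p, o in zip(forecasts, outcomes)]
    return sum(scores) / len(scores)


# Hypothetical data: how each question resolved, and the probability of "yes"
# given by one forecaster and by the community on the same questions.
outcomes = [True, False, False, True]
my_forecasts = [0.80, 0.30, 0.10, 0.60]
community_forecasts = [0.70, 0.20, 0.15, 0.65]

my_score = mean_log_score(my_forecasts, outcomes)
community_score = mean_log_score(community_forecasts, outcomes)
print(f"me: {my_score:.3f} vs. community: {community_score:.3f}")
```

Note that an average like this says nothing about how many questions were predicted, which is why it can diverge from a leaderboard that rewards volume.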
I discovered that the question set changes when I evaluate at "resolve time" and filter for the past 3 months; I'm not sure exactly why. Numbers at resolve time:
Binary questions (N=102): me: .598 vs. community: .566
Continuous questions (N=92): me: 2.95 vs. community: 2.86
I think this weakens my case substantially, though I still think a bot that just predicts the community as soon as it becomes visible and updates every day would currently be at least top 10.
Anything much worse than that, yes, people could have negative overall scores - which, if they've predicted on a decent number of questions, is pretty strong evidence that they really suck at forecasting
I agree that this should have some effect of being less welcoming to newcomers, but I'm curious to what extent. I have seen plenty of people with worse Brier scores than the median continuing to predict on GJO rather than being demoralized and quitting (disclaimer: survivorship bias).
Is there an option to have people "lock in" their answer? (Maybe they can still edit/delete for a short time after they submit or before a cutoff date/time)
Not planning on supporting this on our end in the near future, but could be a cool feature down the line.
Is there a way to see in one place all the predictions I've submitted an answer to?
As of right now, not if you make the predictions via LW. You can view questions that you've submitted a prediction on via Elicit at https://elicit.org/binary?binaryQuestions.hasPredicted=true if you're logged in, and we're working on allowing for account linking so your LW predictions would show up in the same place.
The first version of account linking will involve contacting someone at Ought, after which we will manually run a script.
Edit: the first version of account linking is ready, email firstname.lastname@example.org with your LW username and Elicit email and I can link them.
I must admit I haven't followed the discussions you're referring to but if I were to spend more time forecasting this question I would look into them.
I didn't include effects of COVID in my forecast, as it looks like the Zillow Home Value Index for Seattle has remained relatively steady since March (2% drop). I'm skeptical that there are likely to be large effects from COVID in the future when there hasn't been a large effect from COVID thus far.
A few reasons I could be wrong:
Zillow data is inaccurate or incomplete, or I'm interpreting it incorrectly.
COVID affects variation of individual housing prices much more than the trend of the city as a whole.
The COVID effects will be much bigger in the next half year than in the previous one. Perhaps there will be a second wave, much worse than the market is pricing in, which produces large effects.
I don't have a background in quantum computing, so there's a chance I'm misinterpreting the question in some way, but I learned a lot doing the research for the forecast (like that there's a lot of controversy regarding whether quantum supremacy has been achieved yet).
Amusingly, during my research I stumbled upon this Metaculus question about when a >49 qubit quantum computer would be created which resolved ambiguously due to the issue of how well-controlled the qubits are. For the purposes of this forecast I assumed it would resolve based on the raw number of qubits, without adjusting for control.
My forecast is based on historical data from Zillow. I explained my reasoning in the notes. The summary is that housing prices haven't changed very much in Seattle since April 2019 (on the whole they've risen 1%). On the other hand, prices in more expensive areas have stayed the same or declined slightly. I settled on a boring median of the price staying the same. Due to how stable the prices have been recently, I think most of the variation will come from the individual house and which neighborhood it's in, with an outside chance of large Seattle home value fluctuations.
5% chance given to Human Level Machine Intelligence (HLMI) having an extremely bad long run impact (e.g. human extinction)
Does Stuart Russell's argument for why highly advanced AI might pose a risk point at an important problem? 39% say at least important, 70% at least moderately important.
But on the other hand, only 8.4% said working on this problem now is more valuable than other problems in the field. 28% said as valuable as other problems.
47% agreed that society should prioritize "AI Safety Research" more than it currently was.
These seem like fairly safe lower bounds compared to the population of researchers Rohin would evaluate, since concern regarding safety has increased since 2015 and the survey included all AI researchers rather than only those whose work is related to AGI.
These responses are more directly related to the answer to Question 3 ("Does X agree that there is at least one concern such that we have not yet solved it and we should not build superintelligent AGI until we do solve it?") than Question 2 ("Does X broadly understand the main concerns of the safety community?"). I feel very uncertain about the percentage that would pass Question 2, but think it is more likely to be the "bottleneck" than Question 3.
Given these considerations, I increased the probability before 2023 to 10%, with 8% below the lower bound. I moved the median | not never up to 2035 as a higher probability pretty soon also means a sooner median. I decreased the probability of “never” to 20%, since the “not enough people update on it / consensus building takes forever / the population I chose just doesn't pay attention to safety for some reason” condition seems not as likely.
I also added an extra bin to ensure that the probability continues to decrease on the right side of the distribution.
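The arithmetic linking the conditional median to the overall distribution can be sketched as follows. It uses only the numbers stated above (10% before 2023, 20% "never", and a median of 2035 given "not never"); the function name is hypothetical and the snippet is an illustration of the reasoning, not the tool used to build the forecast.

```python
def mass_before_conditional_median(p_never: float) -> float:
    """Unconditional probability of resolving before the 'not never' median.

    Half of the 'not never' mass lies before its median, so the
    unconditional probability is 0.5 * (1 - p_never).
    """
    return 0.5 * (1.0 - p_never)


p_never = 0.20        # probability assigned to "never"
p_before_2023 = 0.10  # probability assigned to dates before 2023

# Median | not never is 2035, so this is the overall chance of "by 2035".
p_by_2035 = mass_before_conditional_median(p_never)

# Mass left for the 2023-2035 window and for "after 2035 but not never".
p_2023_to_2035 = p_by_2035 - p_before_2023
p_after_2035 = 1.0 - p_never - p_by_2035

print(f"by 2035: {p_by_2035:.0%}")
print(f"2023 to 2035: {p_2023_to_2035:.0%}")
print(f"after 2035 (not never): {p_after_2035:.0%}")
```

This makes the trade-off explicit: with "never" fixed, raising the near-term probability leaves less mass for the years before the median, which pulls the median earlier unless the later bins are reshaped.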