It's really hard to say because the alignment field might grow exponentially. I have the impression that this is already happening, with e.g. SERI MATS recently having trained a whole batch of new alignment researchers. So we should probably expect a lot of growth.
One thing I'm unsure about is the extent to which this growth all funnels into the existing research programs vs creates new research programs. Insofar as everyone creates new research programs, it seems like the answer should still be predictable when averaging over researchers.
So, how likely do I expect a skilled researcher on a promising-ish research program to be to come up with a big finding in four years? The smartest approach would probably be to go with base rates: what have we seen so far in alignment research? But I don't know the history well enough to list the major programs and achievements.
I tried thinking about each of the sub-directions I could think of in the above 4 research programs, estimated my gut-feeling probability that the researchers would find something in each sub-direction, and added those probabilities up. I got a result of 2.2 expected findings. Obviously this should be taken with a huge grain of salt.
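The back-of-the-envelope calculation above can be sketched as follows. By linearity of expectation, the expected number of findings is just the sum of the per-sub-direction probabilities. The sub-direction names and numbers below are hypothetical placeholders chosen to sum to 2.2, not the actual estimates:

```python
# Gut-feeling probabilities that each sub-direction yields a big finding.
# Names and values are illustrative placeholders, not real estimates.
p_finding = {
    "sub-direction A": 0.50,
    "sub-direction B": 0.40,
    "sub-direction C": 0.30,
    "sub-direction D": 0.25,
    "sub-direction E": 0.25,
    "sub-direction F": 0.20,
    "sub-direction G": 0.15,
    "sub-direction H": 0.15,
}

# Linearity of expectation: E[number of findings] = sum of probabilities,
# regardless of whether the sub-directions are independent.
expected_findings = sum(p_finding.values())
print(round(expected_findings, 2))  # 2.2
```

Note that this only gives the expected count; turning it into a probability of "at least one big finding" would additionally require an independence assumption across sub-directions.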
To help calibrate, if anyone wants to present a historical research result and ask whether I would consider it an achievement, I would be open to rating it.
You might also be interested in this comment, which lists some things I consider to be achievements and non-achievements: https://manifold.markets/tailcalled/will-tailcalled-think-that-the-infr#DzXOJ7Tw4EURocWdzjlX
Can you please make one for whether you think ELK will have been solved (or substantial progress will have been made) by 2026? I could do it, but it would be nice to have as many as possible centrally visible when browsing your profile.
My suspicion is that if ELK gets solved in practice, it will be by restricting the class of neural networks under consideration. Yet the ELK challenge adds a requirement that it has to work on just about any neural network.
Hm, do we even want that condition? It seems to me that the goal of the prediction markets is to indicate which research directions are worthwhile, and a research direction that can do something of general importance to science seems more likely to also be able to do something important for alignment.
Presumably the alignment problem isn't going to be solved in 4 years, so we're going to have to go with imperfect indicators anyway, and have to accept whichever indicators we get.