I'm a Ph.D. student doing research on Natural Language Processing.
My research focuses on developing question-answering methods that generalize to harder questions than we have supervision for. Learning from human examples (supervised learning) won't scale to these kinds of questions, so I am investigating other paradigms that recursively break down harder questions into simpler ones (i.e., Debate and Iterated Amplification). Check out my website for more information about me/my research: http://ethanperez.net/
What encourages the helper model to generate correct explanations as opposed to false/spurious ones?
E.g., let's say the text is a list of fruit, and the correct next word is Klingon for "pineapple". I'm imagining that the helper model could just say "The next word is [Klingon for pineapple]" or give an alternate/spurious explanation of the Klingon text ("The text is discussing a spiky fruit that goes on pizza"). Both of these unhelpful/spurious explanations would let me predict the next Klingon word correctly.
Your "Ethan computed" distribution matches the intended/described distribution from my original prediction comment. The tail now looks uniform, while my distribution had an unintentional decay that came from me using Elicit's smoothing.
Now that I see what the uniform tail actually looks like, it does seem slightly odd (without any decay towards zero), and a bit arbitrary that the uniform distribution ends at 2100. So I think it makes a lot of sense to use Datscilly's outside view as my outside-view prior, as you did! Overall, I think the ensembled distribution more accurately represents my beliefs after updating on the other distributions in the LessWrong AGI timelines post.
The above ensemble distribution looks pretty optimistic, which makes me wonder if there is some "double counting" of scenarios-that-lead-to-AGI between the inside and outside view distributions. I.e., Datscilly's outside view arguably does incorporate the possibility that we get AGI via "Prosaic AGI" as I described it.
Yes, the peak comes from (1) a relatively high (25%) confidence that current methods will lead to AGI and (2) my view that we'll achieve Prosaic AGI in a pretty small (~13-year) window if it's possible, after which it will be quite unlikely that scaling current methods will result in AGI (e.g., due to hitting scaling limits or a fundamental technical problem).
It would be awesome to easily ensemble Elicit distributions (e.g., take a weighted average). If ensembling were easy, I would have definitely updated my distribution more aggressively, e.g., averaging my inside view / prosaic AGI scenario distribution with datscilly's outside view distribution (instead of a uniform distribution as an outside view), and/or other distributions which weighed different considerations more heavily (e.g., hardware constraints). It'd be quite informative to see each commenter's independent/original/prior distribution (before viewing everyone else's), and then each commenter's ensembled/posterior distribution, incorporating or averaging with the distributions of others. I suspect in many cases these two distributions would look quite different, so it would be easy for people to quickly update their views based on the arguments/distributions of others (and see how much they updated).
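Mechanically, this kind of ensembling is just a weighted average of per-bin probabilities. Here's a minimal sketch in Python; the two distributions and the weights are toy stand-ins (not the actual Elicit snapshots), just to show the operation:

```python
import numpy as np

# Year bins for the forecast (an assumed granularity, not Elicit's internal one).
years = np.arange(2021, 2101)

# Toy stand-ins: an "inside view" peaked in the late 2020s,
# and a flat "outside view" over the whole range.
inside = np.exp(-0.5 * ((years - 2029) / 4.0) ** 2)
inside /= inside.sum()
outside = np.ones_like(years, dtype=float)
outside /= outside.sum()

def ensemble(dists, weights):
    """Weighted average of probability vectors, renormalized to sum to 1."""
    mix = sum(w * d for w, d in zip(weights, dists))
    return mix / mix.sum()

posterior = ensemble([inside, outside], [0.25, 0.75])
assert abs(posterior.sum() - 1.0) < 1e-9
```

In practice you'd replace the toy arrays with per-bin probabilities exported from each commenter's snapshot; the averaging step itself is the same.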
When I cite scaling limit numbers, I'm mostly deferring to my personal discussions with Tim Dettmers (whose research is on hardware, sparsity, and language models), so I'd check out his comment on this post for more details on his view of why we'll hit scaling limits soon!
Love this take. Tim, did you mean to put some probability on the >2100 bin? I think that would include the "no AGI ever" prediction, and I'm curious to know exactly how much probability you assign to that scenario.
I'll follow the definition of AGI given in this Metaculus challenge, which roughly amounts to a single model that can "see, talk, act, and reason." My predicted distribution is a weighted sum of two component distributions described below:
Prosaic AGI (25% probability). Timeline: 2024-2037 (Median: 2029): We develop AGI by scaling and combining existing techniques. The most probable path I can foresee loosely involves 3 stages: (1) developing a language model with human-level language ability, then (2) giving it visual capabilities (e.g., talking about pictures and videos, solving SAT math problems with figures), and then (3) giving it capabilities to act intelligently in the world (e.g., trading stocks or navigating webpages). Below are my timelines for the above stages:
Human-level Language Model: 1.5-4.5 years (Median: 2.5 years). We can predictably improve our language models by increasing model size (parameter count), which we can do in the following two ways:
Scaling Language Model Size by 1000x relative to GPT3. 1000x is pretty feasible, but as I understand it, we'll hit difficult hardware/communication-bandwidth constraints beyond 1000x.
Increasing Effective Parameter Count by 100x using modeling tricks (Mixture of Experts, Sparse Transformers, etc.)
+Visual Capabilities: 2-6 extra years (Median: 4 years). We'll need good representation learning techniques for learning from visual input (which I think we mostly have). We'll also need to combine vision and language models, but there are many existing techniques for combining vision and language models to try here, and they generally work pretty well. A main potential bottleneck time-wise is that the language+vision components will likely need to be pretrained together, which slows the iteration time and reduces the number of research groups that can contribute (especially for learning from video, which is expensive). For reference, Language+Image pretrained models like ViLBERT came out 10 months after BERT did.
+Action Capabilities: 0-6 extra years (Median: 2 years). GPT3-style zero-shot or few-shot instruction following is the most feasible/promising approach to me here; this approach could work as soon as we have a strong, pretrained vision+language model. Alternatively, we could use that model within a larger system, e.g. a policy trained with reinforcement learning, but this approach could take a while to get to work.
Breakthrough AGI (75% probability). Timeline: uniform probability over the next century: We need several fundamental breakthroughs to achieve AGI. Breakthroughs are hard to predict, so I'll assume a uniform distribution over which year (before 2100) we hit upon the necessary breakthroughs, with 15% total probability mass after 2100 (a rough estimate). The 15% breaks down as: 5% probability that we won't find the right insights by 2100, 5% that we'll have the right insights but not enough compute by 2100, and 5% to account for the planning fallacy, unknown unknowns, and the fact that a number of top AI researchers believe we are very far from AGI.
My probability for Prosaic AGI is based on an estimated probability of each of the 3 stages of development working (described above):
P(Prosaic AGI) = P(Stage 1) x P(Stage 2) x P(Stage 3) = 3/4 x 2/3 x 1/2 = 1/4
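As a sanity check on the bookkeeping, the two-component mixture above can be sketched as follows. The component weights (25%/75%) and the 15% ">2100" mass come from the prediction; the yearly binning and the shape of the prosaic component are my own placeholder assumptions:

```python
import numpy as np

years = np.arange(2021, 2101)

# Prosaic AGI: 25% total mass over the 2024-2037 window
# (flat shape is a placeholder; the comment only gives the window and a median of 2029).
prosaic = np.where((years >= 2024) & (years <= 2037), 1.0, 0.0)
prosaic *= 0.25 / prosaic.sum()

# Breakthrough AGI: 75% total mass, uniform over years up to 2100,
# minus the 15% (of total) that sits in the ">2100" bin.
breakthrough = np.ones_like(years, dtype=float)
breakthrough *= (0.75 - 0.15) / breakthrough.sum()

mixture = prosaic + breakthrough
p_after_2100 = 0.15  # mass in the ">2100 / possibly never" bin

# All the mass should account for exactly 1.
assert abs(mixture.sum() + p_after_2100 - 1.0) < 1e-9
```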
Updates/Clarification after some feedback from Adam Gleave:
Updated from 5% -> 15% probability that AGI won't happen by 2100 (see reasoning above). I've updated my Elicit snapshot appropriately.
There are other concrete paths to AGI, but I consider these fairly low probability to work first (<5%) and experimental enough that it's hard to predict when they will work. For example, I can't think of a good way to predict when we'll get AGI from training agents in a simulated, multi-agent environment (e.g., in the style of OpenAI's Emergent Tool Use paper). Thus, I think it's reasonable to group such other paths to AGI into the "Breakthrough AGI" category and model these paths with a uniform distribution.
I think you can do better than a uniform distribution for the "Breakthrough AGI" category, by incorporating the following information:
Breakthroughs will be less frequent as time goes on, as the low-hanging fruit/insights are picked first. Adam suggested an exponential decay over time / Laplacian prior, which sounds reasonable.
Growth of AI research community: Estimate the size of the AI research community at various points in time, and estimate the pace of research progress given that community size. It seems reasonable to assume that the pace of progress will increase logarithmically in the size of the research community, but I can also see arguments for why we'd benefit more or less from a larger community (or even have slower progress).
Growth of funding/compute for AI research: As AI becomes increasingly monetizable, there will be more incentives for companies and governments to support AI research, e.g., in terms of growing industry labs, offering grants to academic labs to support researchers, and funding compute resources - each of these will speed up AI development.
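To illustrate the first suggestion, here's a toy comparison of the uniform breakthrough prior against an exponentially decaying one; the decay rate is an illustrative assumption, not a number from the discussion:

```python
import numpy as np

years = np.arange(2021, 2101)
rate = 0.02  # assumed decay per year (illustrative only)

# Uniform prior over the century (the original "Breakthrough AGI" assumption).
uniform = np.ones_like(years, dtype=float)
uniform /= uniform.sum()

# Exponentially decaying prior: breakthroughs get less likely over time
# as the low-hanging insights are picked first.
decay = np.exp(-rate * (years - years[0]))
decay /= decay.sum()

# The decaying prior front-loads probability relative to uniform.
assert decay[0] > uniform[0] and decay[-1] < uniform[-1]
```

The other two considerations (community growth, funding/compute growth) would push the other way, so a more careful model might multiply the decay by a growth factor before normalizing.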
Cool! I feel like I should go into more detail on how I made the posterior prediction then - I just predicted relative increases/decreases in probability for each probability bucket in Rohin's prior:
4x increase in probability for 2020-2022
20% increase for 2023-2032
15% increase for 2033-2040
10% decrease for 2041-2099
10% decrease for 2100+
Then I just let Elicit renormalize the probabilities.
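The procedure above amounts to multiplying each bucket of the prior by its relative factor and renormalizing. A small sketch, where the prior masses are placeholders (not Rohin's actual numbers) and only the factors come from the list above:

```python
import numpy as np

buckets = ["2020-2022", "2023-2032", "2033-2040", "2041-2099", "2100+"]
prior = np.array([0.02, 0.20, 0.18, 0.40, 0.20])     # assumed prior masses
factors = np.array([4.0, 1.20, 1.15, 0.90, 0.90])    # 4x, +20%, +15%, -10%, -10%

# Apply the relative updates, then renormalize (what Elicit does automatically).
posterior = prior * factors
posterior /= posterior.sum()

assert abs(posterior.sum() - 1.0) < 1e-12
```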
I guess this process incorporates the "meta-prior" that Rohin won't change his prior much, and then I estimated the relative increase/decrease margins based on the number and upvotes of comments. E.g., there were a lot of highly upvoted comments arguing that Rohin should increase his probability in the <2022 range, so I predicted a larger change there.
In my experience, a highly predictive feature of "agreeing with safety concerns" is how much someone thinks about/works on RL (or decision-making more broadly). For example, many scientists in RL-driven labs (DeepMind, OpenAI) agree with safety concerns, while there is almost no understanding of safety concerns (let alone agreement that they are valid) among researchers in NLP, a field driven mainly by supervised and unsupervised learning; it's easier to intuitively motivate safety concerns from the perspective of RL and demos of where it fails (esp. with concerns like reward gaming and instrumental convergence). Thus, a useful way to decompose Rohin's question is:
1. How many top researchers doing AGI-related work think about RL?
1.1. How many of the above researchers agree with safety concerns?
2. How many top researchers doing AGI-related work don't think about RL?
2.1. How many of the above researchers agree with safety concerns?
We can add the numbers from 1.1 and 2.1 and divide by the total number of top AGI researchers (here, 2000).
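As a toy illustration of that arithmetic (all counts and agreement rates below are made-up placeholders, not estimates from this comment):

```python
# Hypothetical Fermi sketch of the decomposition above.
n_total = 2000        # total top AGI-related researchers (from the question)

n_rl = 800            # assumed: researchers who think about RL
p_rl_agree = 0.5      # assumed agreement rate among RL researchers (1.1)
n_non_rl = n_total - n_rl
p_non_rl_agree = 0.1  # assumed agreement rate among non-RL researchers (2.1)

# Add the agreeing researchers from both groups, divide by the total.
fraction_agree = (n_rl * p_rl_agree + n_non_rl * p_non_rl_agree) / n_total
# 800*0.5 + 1200*0.1 = 520 agreeing researchers, i.e. 0.26 of the total
assert abs(fraction_agree - 0.26) < 1e-9
```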
I'd argue that more researchers are in category 2 (non-RL researchers) than 1 (RL researchers). A lot of recent progress towards AGI has been driven by improvements in representation learning, supervised learning, and unsupervised learning. Work in these areas is AGI-related in the relevant sense to Rohin; if we develop AGI without RL (e.g., GPT-10), we'd need the non-RL researchers who develop these models to coordinate and agree with safety concerns about e.g. releasing the models. I think it will continue to be the case for the foreseeable future that >50% of AGI-related researchers aren't doing RL, as representation learning, supervised learning, unsupervised learning, etc. all seem quite important to developing AGI.
The upshot (downshot?) of the above is that we'll probably need a good chunk of non-RL but AGI-related researchers to agree with safety concerns. Folks in this group seem less receptive to safety concerns (probably because such concerns don't come up as obviously or as often as in RL work), so I think it'll take a pretty intuitively compelling demonstration/warning-shot to convince people in the non-RL group, visible enough to reach across several subfields in ML; preferably, these demos should be directly relevant to the work of people doing non-RL research (e.g., directly showing how NLP systems are Stuart-Russell-style dangerous, to convince NLP people). I think we'll need pretty advanced systems to get these kinds of demos, roughly +/-5 years from when we get AGI (vs. Rohin's prior estimate of 1-10 years before AGI). So overall, I think Rohin's posterior should be shifted right by ~5 years.
Here is my Elicit snapshot of what I think Rohin's posterior will be after updating on all comments here. It seems like all the other comments are more optimistic than Rohin's prior, so I predict that Rohin's posterior will become more optimistic, even though I think the concerns I've outlined above outweigh/override some of the considerations in the other comments. In particular, I think you'll get an overestimate of "agreement on safety concerns" by looking only at the NeurIPS/ICML community, which is pretty RL-heavy relative to e.g. the NLP and Computer Vision communities (which will still face AGI-related coordination problems). The same can be said about researchers who explicitly self-identify with "Artificial Intelligence" or "Artificial General Intelligence" (fields historically focused on decision-making and games). Looking at the 100 most cited NLP researchers on Google Scholar, I found only one who I could recognize as probably sympathetic to safety concerns (Wojciech Zaremba), and similarly for Computer Vision.