I agree that it would be better to say “adds five years under ‘Best Guess’ parameters,” or to just use “years” in the tagline. (Though I stand by the decision to compare worlds by using the Best Guess presets, if only to isolate the one variable under discussion.)
It makes sense that Aggressive parameters reduce the difference, since they reach full automation in significantly less time. At a parallelization penalty of 0.7, Aggressive gets you there in 2027, while Best Guess takes until 2040! With Conservative parameters, the same 0.7-to-0.1 shift has even larger consequences.
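For intuition on why that exponent matters so much, here is a toy sketch in Python (my own simplification for illustration, not the report's actual model), treating effective research output as parallel effort raised to the penalty:

```python
# Toy illustration only (not Davidson's full model): treat effective research
# output as effort ** penalty, so a lower exponent means harsher diminishing
# returns to adding parallel researchers.

def effective_output(effort: float, penalty: float) -> float:
    """Effective research output under a simple parallelization penalty."""
    return effort ** penalty

for penalty in (0.7, 0.1):
    gains = [effective_output(scale, penalty) for scale in (1, 10, 100)]
    print(f"penalty={penalty}: output at 1x/10x/100x effort = "
          f"{gains[0]:.2f} / {gains[1]:.2f} / {gains[2]:.2f}")
```

At a penalty of 0.7, a 100x jump in effort buys roughly 25x the output; at 0.1, barely 1.6x.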
Hey Daniel, thanks for your comments. The concern you bring up is really important. I went back through this section with an eye toward “uniqueness.”
Looking back, I agree that the argument would be strengthened by an explicit comparison to other R&D work. My thought is still that both (1) national-security technologies and (2) especially lucrative tech attract additional parallel effort, since every actor has a strong incentive to chase and win the prize. But you're right to point out that these factors aren't wholly unique. I'd love to see more research on this. (I have more to look into as well. It might already be out there!)
One factor that I think survives this concern, though, is investor behavior.
"Lots of R&D sectors involve lots of investors who have no idea what they are doing. (Indeed I'd argue that's the norm.)"
While there are always unsophisticated investors, I still think there are many more of them in a particularly lucrative hype cycle. Larger potential rewards attract more investors with no domain expertise. Plus, the signal from genuine experts gets diluted, since the hype cycle also attracts unsophisticated authors and information sources who want a piece of the pie. (Think about how many more people have become “AI experts” or started writing AI newsletters over the last few years than have done the same in, say, agriculture.) These factors are compounded by the overvaluation that occurs at the top of bubbles, as some investors try to “get in on the wave” even at prices others consider too high.
Hey Nathan, thanks for your comments. A few quick responses:
On Taiwan Supply Chain:
- Agreed that US fabs don’t become a huge factor for a few years, even if everything “goes right” in their scale-up.
- Important to note that even as the US fabs develop, other jurisdictions won't pause their own progress. Even with much still to be determined about future innovation, a lot has to “go right” to displace Taiwan from pole position.
On R&D Penalty:
- The “hive mind”/“One Giant Researcher” model might smooth out the inefficiency of communicating findings within research teams. However, this doesn't solve the problem of different R&D teams working toward the same goals, thus “duplicating” their work. (Microsoft and Google won't unite their “AI hive minds.” Nor will Apple and Huawei.) See the toy sketch after this list.
- Giving every researcher a super-smart assistant might help individual researcher productivity, but it doesn’t stop them from pursuing the same goals as their counterparts at other firms. It might accelerate progress without changing the parallelization penalty.
- Concerns about inefficient private-market investment also still contribute to a high parallelization penalty.
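To make the duplication point concrete, here's a toy Python sketch (the firm counts and overlap share are made up for illustration, not taken from the report) comparing one pooled “hive mind” of 1,000 researchers against four rival labs of 250 that each re-do half of one another's work:

```python
# Toy sketch (my own framing, not from the report): even if each firm's internal
# communication is frictionless (the "One Giant Researcher" case), rival firms
# still duplicate a share of each other's work, so society-level output shows a
# parallelization penalty anyway.

def societal_progress(n_firms: int, researchers_per_firm: int,
                      overlap: float) -> float:
    """Sum per-firm output, discounting the duplicated (overlapping) share."""
    per_firm = float(researchers_per_firm)           # perfect internal coordination
    duplicated = (n_firms - 1) * per_firm * overlap  # work re-done at rival firms
    return n_firms * per_firm - duplicated

pooled = societal_progress(1, 1000, overlap=0.0)   # one merged "hive mind"
split = societal_progress(4, 250, overlap=0.5)     # four rival labs, 50% overlap
print(pooled, split)  # 1000.0 vs 625.0: duplication wastes effort
```

Even with frictionless coordination inside each firm, the split world loses a large chunk of effective output to duplicated work.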
On Data and Reasoning:
“I actually think we are already in a data and compute overhang, and the thing holding us back is algorithmic development. I don't think we are likely to get to AGI by scaling existing LLMs.”
If new breakthroughs in algorithm design solve the abstract reasoning challenge, then I agree! Models will need less data and compute to do more. I just think we're a major breakthrough or two away from that.
Davidson’s initial report builds off of a compute-centric model where “2020-era algorithms are powerful enough to reach AGI, if only provided enough compute.”
If you think we’re unlikely to get to AGI—or just solve the common sense problem—by scaling existing LLMs, then we will probably need more than just additional compute.
(I’d also push back on the idea that we’re already in a “data overhang” in many contexts. Both (1) robotics and (2) teaching specialized knowledge come to mind as domains where a shortage of quality data limits progress. But given our agreement above, that concern is downstream.)
Hey Tom, thanks again for your work creating the initial report and for kicking off this discussion. Apologies for the Christmastime delay in reply.
Two quick responses, focused on points of disagreement that aren’t stressed in my original text.
On AI-Generated Synthetic Data:
Breakthroughs in synthetic data would definitely help overcome my dataset-quality concerns. Two main obstacles I'd want to see addressed: how will synthetic data retain (1) fidelity to individual ground-truth data points (how well each point represents the "real world" its simulation prepares models for) and (2) the higher-level distribution of those data points?
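As a rough sketch of the two checks I have in mind (entirely my own toy framing, with stand-in Gaussian data rather than any real dataset):

```python
# Toy sketch with stand-in data (nothing here is real training data): two
# separate checks, roughly mapping to the two obstacles above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, scale=1.0, size=10_000)        # stand-in for ground truth
synthetic = rng.normal(loc=0.05, scale=1.1, size=10_000)  # stand-in for generated data

# (1) Point-level fidelity: is each synthetic sample individually plausible
#     given the ground truth (here, simply: does it fall in the observed range)?
in_range = np.mean((synthetic >= real.min()) & (synthetic <= real.max()))

# (2) Distribution-level fidelity: does the synthetic set match the overall
#     shape of the real data (two-sample Kolmogorov-Smirnov test)?
ks_stat, p_value = stats.ks_2samp(real, synthetic)

print(f"share of synthetic points within the real range: {in_range:.3f}")
print(f"KS statistic: {ks_stat:.3f}, p-value: {p_value:.3g}")
```

A synthetic dataset could pass one check and fail the other, which is why I list them as separate obstacles.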
On Abstraction with Scale:
- Understanding causality deeply would definitely be useful for predicting next words. However, I don't think that potential utility implies that current models have such understanding. It might mean that algorithmic innovations that “figure this out” will outcompete others, but that moment may still be yet to come.
- I agree, though, that performance definitely improves with scale and with more data collection/feedback as models are deployed more often. Time will tell what level of sophistication scale can reach on its own.
On the latter two points (GDP Growth and Parallelization), the factors you flag are definitely also part of the equation. A higher percentage of GDP invested can increase total investment even if total GDP stays level. And additional talent coming into AI helps combat diminishing returns to the marginal researcher, even given duplicative efforts and bad investments.
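To put made-up round numbers on the GDP point: if GDP holds steady at $25T while the share invested in AI rises from 0.1% to 0.4%, annual investment still quadruples from $25B to $100B.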