Posts

Comments

Comment by Mark Schröder (mark-schroeder) on nikola's Shortform · 2025-02-09T10:06:08.277Z · LW · GW

That seems to imply that:

  • If current levels are around GPT-4.5, the compute increase from GPT-4 would be either 10× or 50×, depending on whether we use a log or linear scaling assumption.
  • The completion of Stargate would then push OpenAI’s compute to around GPT-5.5 levels. However, since other compute expansions (e.g., Azure scaling) are also ongoing, they may reach this level sooner.
  • Recent discussions have suggested that better base models are a key enabler for the current RL approaches, rather than major changes in RL architecture itself. This suggests that once the base model shifts from a GPT-4o-scale model to a GPT-5.5-scale model, there could be a strong jump in capabilities.
  • It’s unclear how much of a difference it makes to train the new base model (GPT-5) on reasoning traces from O3/O4 before applying RL. However, by the time the GPT-5 scale run begins, there will likely be a large corpus of filtered, high-quality reasoning traces, further edited for clarity, that will be incorporated into pretraining.
  • The change to a better base model for RL might enable longer horizon agentic work as an "emergent thing", combined with superhuman coding skills this might already be quite unsafe.
  • GPT-5’s reasoning abilities may be significantly more domain-specific than prior models.