Expectations for Gemini: hopefully not a big deal
post by Maxime Riché (maxime-riche) · 2023-10-02T15:38:32.834Z · LW · GW · 5 comments
Introduction
My goal is to register and share my expectations, and to hear others' expectations, for the relative performance of Gemini vs. GPT-4.
My expectations
GPT-4 to Gemini will likely not be as big a jump in capabilities as GPT-3 to GPT-4 was.
Gemini could bring surprises by being more agentic than GPT-4, that is, better at planning and at longer-horizon tasks. But this is likely difficult to achieve; otherwise strong LLM agents would already be generating buzz.
Comparison
From GPT-3 to GPT-4
- Scaling Factor: x100 more compute than GPT-3.
- Optimization: Chinchilla scaling laws (for MoE) over OpenAI/Kaplan scaling laws.
- MoE Over Dense: Utilizes Mixture of Experts (MoE) instead of dense layers.
- Data Quality: Likely higher-quality data, not sure.
- Image Generation: Not publicly released, possibly due to subpar performance or security risks.
- Tools are added during finetuning.
- Algorithmic Gains: 3 years between GPT-3 and GPT-4.
- GPT-4 may already employ process-based feedback.
- GPT-4 aimed for training compute efficiency. GPT-4 was not designed to be commercially deployed at scale.
GPT-4 to Gemini
- Scaling Factor: ~x5 (x20) more compute than GPT-4.
- Supercomputer Constraint: No existing supercomputer could feasibly provide x100 more compute than used for GPT-4. (Not sure but likely)
- Multimodal: maybe image, audio, speech.
- Data Efficiency: Possibly better quality data like Google Books, fewer epochs.
- Tools could be added either during finetuning or pretraining.
- Algorithmic Gains: ~1 year between GPT-4 and Gemini.
- Gemini more likely aims for inference efficiency, given its intended extensive usage by Google, possibly at the cost of training efficiency.
- Gemini trained to be more agentic, better at planning, etc. ("GPT-4 + AlphaGo").
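A rough way to compare the scaling factors in the two lists above, as a minimal sketch only. It assumes the commonly cited dense-model approximation that training compute is about 6 × parameters × tokens, and that compute-optimal training grows parameters and tokens in proportion (the Chinchilla-style result). Whether this carries over to MoE models, or to how GPT-4 and Gemini were actually trained, is an assumption, and the function name is hypothetical.

```python
import math


def chinchilla_scale(compute_multiplier: float) -> float:
    """Return the factor by which parameters AND tokens each grow.

    Assumption: compute C ~ 6 * N * D, with tokens D kept proportional
    to parameters N. Then C is proportional to N**2, so N and D each
    scale with the square root of the compute multiplier.
    """
    return math.sqrt(compute_multiplier)


# Compute multipliers taken from the two lists: x100, and ~x5 (x20).
for k in (100, 5, 20):
    s = chinchilla_scale(k)
    print(f"x{k} compute -> ~x{s:.1f} parameters and ~x{s:.1f} tokens")
```

Under these assumptions, x100 compute corresponds to roughly x10 parameters and x10 tokens, while x5 compute corresponds to only about x2.2 of each, which is one way to see why the second jump would be expected to be smaller.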
Note: I drafted that before news of Gemini's release and capabilities but failed to finish writing... Since then, there have been some reports of Gemini being roughly at the level of GPT-4...
5 comments
Comments sorted by top scores.
comment by Amal (asta-vista) · 2023-10-02T19:06:38.501Z · LW(p) · GW(p)
My guess is that it will be a scaled-up Gato - https://www.lesswrong.com/posts/7kBah8YQXfx6yfpuT/what-will-the-scaled-up-gato-look-like-updated-with. I think there might be some interesting features when the models are fully multimodal, e.g. being able to play games, perform simple actions on a computer, etc. Based on the announcement from Google, I would expect full multimodal training: image, audio, video, and text in/out. Based on DeepMind's hiring needs, I would expect they also want it to generate audio/video and to extend the model to robotics (the brain of something similar to a Tesla Bot) in the near future. Elon claims that training just from video input/output can result in full self-driving, so I'm very curious what training on YouTube videos can achieve. If they've managed to make solid progress in long-term planning/reasoning and can deploy the model with sufficiently low latency, it might be quite a significant release that could simplify many office jobs.
comment by p.b. · 2023-10-02T19:56:04.056Z · LW(p) · GW(p)
My current assumption is that extracting "intelligence" from images and even more so from videos is much less efficient than from text. Text is just extremely information dense.
So I wouldn't expect Gemini to initially feel more intelligent than GPT-4 even if it used 5 times the compute.
I mostly wonder about qualitative differences, maybe induced by algorithmic improvements like actually using RL or search components for a kind of self-supervised finetuning. That's one area where I can easily see DeepMind outcompeting OpenAI.
comment by Kaj_Sotala · 2023-10-02T16:01:20.417Z · LW(p) · GW(p)
"GPT-4 was not designed to be commercially deployed at scale."
What makes you say that?
Replies from: maxime-riche
↑ comment by Maxime Riché (maxime-riche) · 2023-10-02T16:35:36.800Z · LW(p) · GW(p)
This comes from OpenAI saying they didn't expect ChatGPT to be a big commercial success. It was not a top-priority project.
Replies from: gwern