Expectations for Gemini: hopefully not a big deal

post by Maxime Riché (maxime-riche) · 2023-10-02T15:38:32.834Z · LW · GW · 5 comments

Contents

  Introduction
  My expectations
  Comparison
    From GPT-3 to GPT-4
    GPT-4 to Gemini
5 comments

Introduction

My goal is to register and share my expectations for the relative performance of Gemini versus GPT-4, and to hear what others expect.

My expectations

GPT-4 to Gemini will likely not be as big a jump in capabilities as GPT-3 to GPT-4 was. 

Gemini could bring surprises by being more agentic than GPT-4, that is, better at planning and longer-horizon tasks. But this is likely difficult to achieve; otherwise strong LLM agents would already be generating buzz.

Comparison

From GPT-3 to GPT-4

GPT-4 to Gemini



Note: I drafted that before news of Gemini's release and capabilities but failed to finish writing... Since then, there have been some reports of Gemini being roughly at the level of GPT-4...

5 comments


comment by Amal (asta-vista) · 2023-10-02T19:06:38.501Z · LW(p) · GW(p)

My guess is that it will be a scaled-up Gato: https://www.lesswrong.com/posts/7kBah8YQXfx6yfpuT/what-will-the-scaled-up-gato-look-like-updated-with. I think there might be some interesting features when the models are fully multimodal, e.g. being able to play games, perform simple actions on a computer, etc. Based on the announcement from Google, I would expect full multimodal training: image, audio, video, text in/out. Based on DeepMind's hiring needs, I would expect they want it to also generate audio/video and to extend the model to robotics (the brain of something similar to a Tesla Bot) in the near future. Elon claims that training just from video input/output can result in full self-driving, so I'm very curious what training on YouTube videos can achieve. If they've managed to make solid progress in long-term planning/reasoning and can deploy the model with sufficiently low latency, it might be quite a significant release that could simplify many office jobs.

comment by p.b. · 2023-10-02T19:56:04.056Z · LW(p) · GW(p)

My current assumption is that extracting "intelligence" from images, and even more so from videos, is much less efficient than from text. Text is just extremely information-dense.

So I wouldn't expect Gemini to initially feel more intelligent than GPT-4 even if it used 5 times the compute.

I mostly wonder about qualitative differences, perhaps induced by algorithmic improvements such as actually using RL or search components for a kind of self-supervised finetuning. That's one area where I can easily see DeepMind outcompeting OpenAI.

comment by Kaj_Sotala · 2023-10-02T16:01:20.417Z · LW(p) · GW(p)

"GPT-4 was not designed to be commercially deployed at scale."

What makes you say that?

Replies from: maxime-riche
comment by Maxime Riché (maxime-riche) · 2023-10-02T16:35:36.800Z · LW(p) · GW(p)

This comes from OpenAI saying they didn't expect ChatGPT to be a big commercial success. It was not a top-priority project. 

Replies from: gwern
comment by gwern · 2023-10-02T18:14:02.399Z · LW(p) · GW(p)

ChatGPT was not GPT-4. It was GPT-3.5, a relatively minor fixup of GPT-3 with an improved RLHF variant, which they released while working on GPT-4's evaluations and productization. GPT-4 was the one that was supposed to be the big commercial success.