What's the Most Impressive Thing That GPT-4 Could Plausibly Do?

post by bayesed · 2022-08-26T15:34:51.675Z · LW · GW · 22 comments

Inspired by What's the Least Impressive Thing GPT-4 Won't be Able to Do [LW · GW].

What's the most impressive thing you can think of that you believe GPT-4 has around a 5% chance of being capable of doing? (i.e. your belief that GPT-4 will have the capability to do this thing is around 5%)

For this question, "impressive" should be interpreted to mean something different from "surprising". What I have in mind is "impressive" in the sense of "economically useful", "comparable to or better than human experts" or "jaw-droppingly creative", etc. For example, GPT-4 being able to reverse large text would be surprising but not impressive.

The reason I'm specifying a belief probability of 5% is that if your probability is higher than that, you can try to make the task/thing more impressive to reduce the probability to 5%. If it's less than 5%, well... things can get a bit crazy so maybe try to make the task less impressive.

But if you find this constraint too restrictive, feel free to specify your own combination of the most impressive thing and your probability that GPT-4 will be able to do it, as long as the probability is in the vicinity of 5% (something like 1 to 10% would be fine). You can also specify a probability range (e.g. 5-10%) if it's difficult to estimate.

22 comments

Comments sorted by top scores.

comment by Richard_Kennaway · 2022-08-26T15:59:11.005Z · LW(p) · GW(p)

Scare Eliezer into cutting his expected time to disaster by at least 75%.

Replies from: deepthoughtlife
comment by deepthoughtlife · 2022-08-26T16:21:04.354Z · LW(p) · GW(p)

But wouldn't that be easy? He seems to take every little advancement as a big deal.

Replies from: GuySrinivasan
comment by SarahNibs (GuySrinivasan) · 2022-08-26T18:40:38.348Z · LW(p) · GW(p)

How many times do you think he has changed his expected time to disaster to 25% of what it was?

Replies from: deepthoughtlife
comment by deepthoughtlife · 2022-08-26T19:41:34.313Z · LW(p) · GW(p)

It matches his pattern of behavior to freak out about AI every time there is an advance, and I'm basically accusing him of being susceptible to confirmation bias, perhaps the most common human failing even when trying to be rational.

He claims to think AI is bound to destroy us, and literally wrote about how everyone should just give up. (Which I originally thought was for April Fool's Day, but turned out not to be.) He can't be expected to carefully scrutinize the evidence to only give it the weight it deserves, or even necessarily the right sign. If you were to ask the same thing in reverse about a massive skeptic who thought there was no point even caring for the next fifty years, they wouldn't need to have quadrupled their expected time before for you to be unimpressed when they did so the next time AI failed to be what people claimed it was.

Replies from: Archimedes, GuySrinivasan
comment by Archimedes · 2022-08-28T02:51:32.913Z · LW(p) · GW(p)

He doesn’t want to give up but doesn’t expect to succeed either. The remaining option is “Dying with Dignity” by fighting for survival in the face of approaching doom.

comment by SarahNibs (GuySrinivasan) · 2022-08-27T23:10:40.917Z · LW(p) · GW(p)

My point was that (0.25)^n for large n is very small, so no, it would not be easy.

Replies from: deepthoughtlife
comment by deepthoughtlife · 2022-08-28T00:15:22.137Z · LW(p) · GW(p)

You're assuming that the updates are mathematical and unbiased, which is the opposite of how people actually work. If your updates are highly biased, it is very easy to just make large updates in that direction any time new evidence shows up. As you get more sure of yourself, these updates start getting larger and larger rather than smaller as they should.

Replies from: None
comment by [deleted] · 2022-08-31T18:24:59.300Z · LW(p) · GW(p)

Replies from: deepthoughtlife
comment by deepthoughtlife · 2022-08-31T19:44:08.680Z · LW(p) · GW(p)

I'm hardly missing the point. It isn't impressive to have it be exactly 75%, not more or less, so the fact that it can't always be that is irrelevant. His point isn't that that particular exact number matters, it's that the number eventually becomes very small. But since the number being very small compared to what it should be does not prevent it from being made smaller by the same ratio, his point is meaningless. It isn't impressive to fulfill an obvious bias toward updating in a certain direction.

comment by deepthoughtlife · 2022-08-26T16:28:30.660Z · LW(p) · GW(p)

If they chose to design it with effective long-term memory and a focus on novels (especially prompting via summary), maybe it could write some? They wouldn't be human level, but people would be interested enough in novels written on a whim to match some exact scenario that it could be valuable. It would also be good evidence of advancement, since losing track of things is a huge current weakness.

comment by lalaithion · 2022-08-26T21:11:53.368Z · LW(p) · GW(p)

GPT-4 (Edited because I actually realize I put way more than 5% weight on the original phrasing): SOTA on language translation for every language (not just English/French and whatever else GPT-3 has), without fine-tuning.

Not GPT-4 specifically, assuming they keep the focus on next-token prediction of all human text, but "around the time of GPT-4": Superhuman theorem proving. I expect one of the millennium problems to be solved by an AI sometime in the next 5 years.

Replies from: Archimedes
comment by Archimedes · 2022-08-28T03:12:11.895Z · LW(p) · GW(p)

AI solving a millennium problem within a decade would be truly shocking, IMO. That’s the kind of thing I wouldn’t expect to see before AGI is the world superpower. My best guess coming from a mathematics background is that dominating humanity is an easier problem for an AI.

Replies from: lalaithion
comment by lalaithion · 2022-08-28T22:50:56.853Z · LW(p) · GW(p)

That’s what people used to say about chess and go. Yes, mathematics requires intuition, but so does chess; the game tree’s too big to be explored fully.

Mathematics requires greater intuition and has a much broader and deeper “game” tree, but once we figure out the analogue to self-play, I think it will quickly surpass human mathematicians.

Replies from: Archimedes
comment by Archimedes · 2022-08-28T22:53:44.414Z · LW(p) · GW(p)

Sure. I’m not saying it won’t happen, just that an AI will already be transformative before it does happen.

Replies from: lalaithion
comment by lalaithion · 2022-08-28T23:20:48.511Z · LW(p) · GW(p)

I agree that before that point, an AI will be transformative, but not to the point of “AGI is the world superpower”.

comment by Qumeric (valery-cherepanov) · 2022-10-26T11:40:46.007Z · LW(p) · GW(p)

Getting grandmaster rating on Codeforces.

Upd after 4 months: I think I changed my opinion, now I am 95% sure no model will be able to achieve this in 2023 and it seems quite unlikely in 2024 too.

comment by Jay Bailey · 2022-09-05T10:35:35.150Z · LW(p) · GW(p)

I think there's a 50% or higher chance that GPT-4 will be sufficiently accurate that it can be used to teach new skills to autodidacts with basic prompt engineering skills. (I tried this on GPT-3 today, and it offered a combination of correct and incorrect insights when teaching things.)

So, to move down to 5% - GPT-4 is able to ask questions about your current understanding of most undergrad or lower level topics, correct specific misapprehensions you have, and select resources/exercises based on your current level of ability. In other words, GPT-4 can act as a middling-quality professional tutor for most topics.

Honestly, even this seems not impressive enough for 5%. I might think of it as more like 10-20%. Perhaps 5% would be "In addition to this, GPT-4 can provide nontrivial brainstorming advice on graduate-level research questions". Not enough to solve them on its own, but enough to point you in fruitful directions and improve your workflow.

comment by Hastings (hastings-greer) · 2022-08-29T18:39:55.926Z · LW(p) · GW(p)

2%: Solve the same problem as the product Wolfram Alpha, with the same style of inputs and outputs.

Replies from: hastings-greer
comment by Hastings (hastings-greer) · 2022-08-29T18:44:59.022Z · LW(p) · GW(p)

Now that I think about it, Wolfram Alpha might be sitting on a fairly valuable hunk of diverse math problem data. They get around 10 million visits a month, or about a billion diverse math problems; that's larger than some chunks of the Pile.

comment by Mitchell_Porter · 2022-08-26T15:59:24.757Z · LW(p) · GW(p)

It would be useful if it could solve alignment...

Replies from: pseud
comment by pseud · 2022-08-28T06:17:21.753Z · LW(p) · GW(p)

Why are people disagreeing with this statement?

Replies from: JBlack
comment by JBlack · 2022-08-29T04:22:59.002Z · LW(p) · GW(p)

In isolation, it's technically correct.

In the context of being a direct reply to the post, it's suggesting that "solve alignment" is something that GPT-4 could plausibly do. I certainly disagree with that and voted disagreement accordingly.

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2022-08-29T22:05:59.577Z · LW(p) · GW(p)

It actually wouldn't surprise me if it could be done by a human alignment theorist working with an existing GPT, where the GPT serves mostly as a source of ideas. 

comment by [deleted] · 2022-08-27T16:17:09.622Z · LW(p) · GW(p)