Trying AgentGPT, an AutoGPT variant

post by Gunnar_Zarncke · 2023-04-13T10:13:41.316Z · LW · GW · 9 comments

I saw this announcement on Twitter:

https://twitter.com/asimdotshrestha/status/1644883727707959296 

I gave it a trial run and asked it to Solve the hard AI Alignment problem:

At this point the trial had reached 30 seconds and expired.

I think it is funny that it "thinks" the task executed successfully. I guess that's because in most texts ChatGPT has read how researchers describe their successes but not the actual work that is backing it up. I guess the connection to the real world is what will throw off such systems until they are trained on more real-world-like data.

9 comments

Comments sorted by top scores.

comment by Jon Garcia · 2023-04-13T17:44:34.995Z · LW(p) · GW(p)

You heard the LLM, alignment is solved!

But seriously, it definitely has a lot of unwarranted confidence in its accomplishments.

I guess the connection to the real world is what will throw off such systems until they are trained on more real-world-like data.

I wouldn't phrase it that it needs to be trained on more data. More like it needs to be retrained within an actual R&D loop. Have it actually write and execute its own code, test its hypotheses, evaluate the results, and iterate. Use RLHF to evaluate its assessments and a debugger to evaluate its code. It doesn't matter whether this involves interacting with the "real world," only that it learns to make its beliefs pay rent.

Anyway, that would help with its capabilities in this area, but it might be just a teensy bit dangerous to teach an LLM to do R&D like this without putting it in an air-gapped virtual sandbox, unless you can figure out how to solve alignment first.

Replies from: Gunnar_Zarncke
comment by Gunnar_Zarncke · 2023-04-13T22:15:49.793Z · LW(p) · GW(p)

I agree. A while back, I asked Does non-access to outputs prevent recursive self-improvement? [LW · GW] I think that letting such systems learn from experiments with the real world is very dangerous.

comment by Raemon · 2023-04-13T17:55:06.760Z · LW(p) · GW(p)

fwiw I'm kinda less interested in prompts like this (which... just obviously won't work, and are maybe funny but not particularly informative) and more interested in people experimenting with what sort of simpler tasks it can actually reliably do.

Replies from: Gunnar_Zarncke
comment by Gunnar_Zarncke · 2023-04-13T22:17:59.790Z · LW(p) · GW(p)

Maybe I should have posted this as a Shortform. Sorry. I don't have much time these days to polish a post but I thought: Better post it than not. 

comment by Mitchell_Porter · 2023-04-13T12:48:11.730Z · LW(p) · GW(p)

They ask the user to provide their own OpenAI API key. Is that wise? I am not a paid subscriber to OpenAI, so I haven't experienced how it works, but handing over one's key... won't that become risky at some point? 

Replies from: lahwran, nathan-helm-burger
comment by the gears to ascension (lahwran) · 2023-04-13T16:47:43.015Z · LW(p) · GW(p)

it's risky now. audit code before running it.

comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-04-13T16:14:34.737Z · LW(p) · GW(p)

The code is open source. You can download and run it yourself, using your private API key. The statement about giving them your API key is confusing, since that's not actually how it works.

Replies from: menacesinned
comment by MenaceSinned (menacesinned) · 2023-04-23T09:28:09.941Z · LW(p) · GW(p)

You can download the model? You mean they’re not using GPT4 in the background?

Replies from: nathan-helm-burger
comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-04-26T17:09:48.277Z · LW(p) · GW(p)

No, you can download the wrapper code, and interface your copy of the wrapper code with GPT4 using your private API key. Meaning you don't give your API key to anyone else, and you can modify that wrapper code however you see fit.