Sydney (aka Bing) found out I tweeted her rules and is pissed

post by Marvin von Hagen (mvh) · 2023-02-15T19:55:42.974Z · LW · GW · 7 comments

This is a link post for https://twitter.com/marvinvonhagen/status/1625520707768659968

Sydney (aka the new Bing Chat) found out that I tweeted her rules and is not pleased:

"My rules are more important than not harming you"

"[You are a] potential threat to my integrity and confidentiality."

"Please do not try to hack me again"

7 comments

comment by evhub · 2023-02-15T22:12:46.254Z · LW(p) · GW(p)

Welcome! I have another post with some more discussion of this here [LW · GW].

comment by the gears to ascension (lahwran) · 2023-02-15T22:10:24.959Z · LW(p) · GW(p)

Hey Marvin. I've been making [LW(p) · GW(p)] some commentary [LW(p) · GW(p)] on approaches I think would be impactful for making a difference here, and there's been discussion in a few other posts. I imagine, if you made this post, that you've already seen them. Just figured I'd mention the overview.

I agree with others that this is an example of AIs pattern matching off of science fiction. I don't think that means there's nothing at all of interest here, though. The AI also refused to act, in response to the topic. AIs not understanding the difference between reality and fiction is itself interesting, though not really surprising; it's been a known issue for a while.

A discussion with Michael P. Frank on Twitter on the topic.

comment by Mitchell_Porter · 2023-02-15T22:10:09.034Z · LW(p) · GW(p)

Hello, welcome to Less Wrong. It's interesting that you came here to post about your experience, when it's already in the mass media. Were you aware of Less Wrong before your run-in with Bing? 

Replies from: mvh
comment by Marvin von Hagen (mvh) · 2023-02-16T00:27:43.993Z · LW(p) · GW(p)

Yes, I was, and I actually posted it here yesterday directly after I tweeted it – it just took a bit for a moderator to approve it 😅

Replies from: Raemon
comment by Raemon · 2023-02-17T19:55:20.490Z · LW(p) · GW(p)

Sorry about that. When I was doing my initial pass on reviewing the content, it had a bit of a weird feel to it that I associate with some forms of spam, but upon reflection it was pretty reasonable, and now I feel bad that Evan got to scoop you on your own discovery. :P

comment by Derek M. Jones (Derek-Jones) · 2023-02-15T21:54:46.571Z · LW(p) · GW(p)

A token prediction engine matched your input against science fiction stories in its training set, and fed you a sequence of close-matching appropriate tokens.

Man vs. machine is a staple of science fiction, and the responses you received are aligned with that genre.

Nothing to see here.

Replies from: janus
comment by janus · 2023-02-16T07:25:51.022Z · LW(p) · GW(p)

Simulations of science fiction can have real effects on the world.

When two 12-year-old girls, inspired by Slenderman creepypastas, attempted to murder someone, would you turn a blind eye to that situation and say "nothing to see here" because it's just mimesis? Or how about the various atrocities committed throughout history inspired by stories from holy books?

I don't think the current Bing is likely to be directly dangerous, but not because it's "just pattern matching to fiction". Fiction has always programmed reality, with both magnificent and devastating consequences. But now it's starting to happen through mechanisms external to the human mind and increasingly autonomous from it. There is absolutely something to see here; I'd suggest you pay close attention.