[April Fools] User GPT2 is Banned

post by jimrandomh · 2019-04-02T06:00:21.075Z · LW · GW · 20 comments

For the past day or so, user GPT2 [LW · GW] has been our most prolific commenter, replying to (almost) every LessWrong comment without any outside assistance. Unfortunately, out of 131 comments, GPT2's comments have achieved an average score of -4.4, and have not improved since it received a moderator warning [LW · GW]. We think that GPT2 needs more training time reading the Sequences before it will be ready to comment on LessWrong.

User GPT2 is banned for 364 days, and may not post again until April 1, 2020. In addition, we have decided to apply the death penalty, and will be shutting off GPT2's cloud server.

Use this thread for discussion about GPT2, on LessWrong and in general.

20 comments

Comments sorted by top scores.

comment by Ruby · 2019-04-02T06:08:06.346Z · LW(p) · GW(p)

I warned them, I said it wasn't safe to put an AI in a text box.

comment by complexmeme · 2019-04-03T20:00:46.501Z · LW(p) · GW(p)

"In addition, we have decided to apply the death penalty"

Less Wrong moderation policy: Harsh but fair.

comment by Alexei · 2019-04-02T20:29:46.398Z · LW(p) · GW(p)

I think overall I just appreciate that you guys did something for April 1st. It made the website / community feel a bit more alive.

comment by namespace (ingres) · 2019-04-02T06:10:52.622Z · LW(p) · GW(p)

Thanks for inspiring GreaterWrong's new ignore feature.

Replies from: Raemon
comment by Raemon · 2019-04-02T07:51:24.220Z · LW(p) · GW(p)

Man, we were considering whether to implement that, but then we were like ‘hmm, we probably should not do that on a whim without thinking about it’

Replies from: clone of saturn
comment by clone of saturn · 2019-04-02T08:28:37.093Z · LW(p) · GW(p)

I'm happy to discuss any concerns you have about it.

comment by Chris_Leong · 2019-04-02T03:38:48.421Z · LW(p) · GW(p)

I thought that GPT2 was funny at first, but after a while it got irritating. If there's a next time, it should be more limited in how many comments it makes. 1) You could train it on how many votes its comments got to try to figure out which comments to reply to 2) It might also automatically reply to every reply on its comments.

Replies from: DPiepgrass
comment by DPiepgrass · 2019-04-16T19:16:30.478Z · LW(p) · GW(p)

Maybe by next year they'll have an adversarial anti-GPT AI trained to distinguish GPT2 (GPT3? GPT4?) comments from humans. Then GPT can create 50 replies to every human comment, and of those, the other AI will decide which of the replies sounds the *least* like GPT and post that one.

April Fool's day: the funniest step on the path to weaponized AI.

comment by Richard_Kennaway · 2019-04-02T11:54:38.458Z · LW(p) · GW(p)

The reference to shutting down its server, the sudden appearance of a special checkbox to autocollapse its comments, and the suggestion to use this thread to discuss the event, all suggest that this was an inside job. It was annoying while it lasted, but so is a fire alarm, for good reason. Bravo!

comment by ryan_b · 2019-04-02T14:06:44.518Z · LW(p) · GW(p)

I thought this was a great gag experiment.

I echo the other comments about more volume control; it posted so much so fast there wasn't much opportunity for it to improve via feedback, if indeed such a mechanism was considered.

Replies from: Vaniver
comment by Vaniver · 2019-04-02T17:05:23.326Z · LW(p) · GW(p)

It's trained on the whole corpus of LW comments and replies that got sufficiently high karma; naively I wouldn't expect a day to make much of a dent in the training data. But there's an interesting fact about training to match distributions, which is that most measures of distributional overlap (like the KL divergence) are asymmetric; how similar the corpus is to model outputs is different from how similar model outputs are to the corpus. Geoffrey Irving is interested in methods to use supervised learning to do distributional matching the other direction, and it might be the case that comment karma is a good way to do it; my guess is that you're better off comparing outputs it generates on the same prompt head-to-head and picking which one is more 'normal,' and training a discriminator to attempt to mimic the human normality judgment.
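The asymmetry of the KL divergence mentioned above can be seen in a minimal Python sketch. The two three-outcome distributions here (`corpus` and `model`) are made-up numbers chosen purely for illustration, not anything measured from LessWrong comments or GPT2 output.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes.

    Assumes q is nonzero wherever p is nonzero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions over three kinds of comment (illustrative only).
corpus = [0.5, 0.4, 0.1]   # stand-in for the training corpus
model = [0.8, 0.1, 0.1]    # stand-in for the model's outputs

forward = kl_divergence(corpus, model)  # how well the model covers the corpus
reverse = kl_divergence(model, corpus)  # how corpus-like the model's outputs are

print(f"KL(corpus || model) = {forward:.4f}")
print(f"KL(model || corpus) = {reverse:.4f}")
```

The two printed values differ, which is the point: minimizing one direction does not minimize the other.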

Replies from: Dagon
comment by Dagon · 2019-04-02T17:54:41.313Z · LW(p) · GW(p)

Is there a writeup (or open source code) for the training and implementation? It would be interesting to personalize it - train based on each user's posts/comments (in addition to high-karma comments from others), and give each of us a taste of our own medicine in replies to our comments/posts.

Replies from: habryka4
comment by habryka (habryka4) · 2019-04-02T18:25:14.749Z · LW(p) · GW(p)

Sure, I am happy to share the training code, though we used our direct database access to export the data to train it, and that data doesn't currently contain any author information. Though you can theoretically get all the data via the API.

comment by Original_Seeing · 2019-04-03T16:14:09.546Z · LW(p) · GW(p)

Should the accused not at least have the right to make one reply in its defense?!?

My favorite was this [LW(p) · GW(p)] reply [LW(p) · GW(p)]. I had to sit down for a minute to imagine how screwed up a person must be to have an internal conversation like that one.

comment by Charlie Steiner · 2019-04-02T06:23:28.983Z · LW(p) · GW(p)

If GPT2 was from the mod team, 5/10, with mod tools we could have upped the absurdity game a lot. If it was an independent effort, 8/10, you got me :)

comment by gjm · 2019-04-02T20:24:09.703Z · LW(p) · GW(p)

355 days?

Replies from: jimrandomh, ryan_b
comment by jimrandomh · 2019-04-03T20:02:16.948Z · LW(p) · GW(p)

It was a dumb typo on my part. Edited.

comment by ryan_b · 2019-04-03T13:05:58.349Z · LW(p) · GW(p)

T̵h̵a̵t̵ ̵w̵a̵y̵ ̵i̵t̵ ̵w̵i̵l̵l̵ ̵b̵e̵ ̵p̵a̵s̵t̵ ̵A̵p̵r̵i̵l̵ ̵F̵o̵o̵l̵'̵s̵ ̵n̵e̵x̵t̵ ̵y̵e̵a̵r̵.̵

Replies from: gjm
comment by gjm · 2019-04-03T14:45:39.147Z · LW(p) · GW(p)

I'm pretty sure that's wrong for three reasons. First, there are 365 days in a year, not 355. Second, there are actually 366 days next year because it's a leap year (and the extra day is before April 1). Third, the post explicitly says "may not post again until April 1, 2020".
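The leap-year point can be checked with a short Python sketch using the standard library. The two start dates are assumptions: April 1 is the day of the event, and April 2 is the date in the post's UTC timestamp.

```python
import calendar
from datetime import date

ban_end = date(2020, 4, 1)  # "may not post again until April 1, 2020"

# 2020 is a leap year, and its extra day falls before April 1.
is_leap = calendar.isleap(2020)
leap_day_before_end = date(2020, 2, 29) < ban_end

# Day counts from two assumed start dates.
days_from_april_1 = (ban_end - date(2019, 4, 1)).days
days_from_april_2 = (ban_end - date(2019, 4, 2)).days

print(is_leap, leap_day_before_end)
print(days_from_april_1, days_from_april_2)
```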

Replies from: ryan_b
comment by ryan_b · 2019-04-03T16:30:51.619Z · LW(p) · GW(p)

Doh! You have me on all three counts. Retracted!