A plan for spam

post by cousin_it · 2011-01-26T02:22:15.209Z · LW · GW · Legacy · 16 comments

I'm getting tired of banning tons of similar articles about jewelry etc. in the discussion section. Our situation looks like a textbook-perfect use case for a Bayesian spam filter (ahem). Or just implement the 5-karma limit that was discussed earlier; that would help too.
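The Bayesian filter the post alludes to can be sketched as a minimal naive-Bayes text classifier. Everything below (tokenization, Laplace smoothing, the training data) is illustrative, not LessWrong's actual code:

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Minimal naive-Bayes spam classifier (illustrative sketch)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled document ('spam' or 'ham')."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """P(spam | text) under the naive independence assumption."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # log P(label) + sum of log P(word | label), Laplace-smoothed
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Convert log scores back to a probability (log-sum-exp for stability)
        m = max(log_scores.values())
        exp = {k: math.exp(v - m) for k, v in log_scores.items()}
        return exp["spam"] / (exp["spam"] + exp["ham"])
```

After training on a handful of labeled posts, `spam_probability("cheap jewelry sale")` scores well above 0.5 while ordinary discussion titles score below it; a real deployment would train on the actual banned-post corpus.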

Comments sorted by top scores.

comment by JoshuaZ · 2011-01-26T02:34:40.443Z · LW(p) · GW(p)

There's an additional problem: it seems that the banned posts are still showing up in Google. This means that Less Wrong's failure to deal with spam is harming people outside LW who search Google for certain classes of products. Any complete solution should also include the actual removal of these pages from LW.

I'm also worried about what this says about our actual level of instrumental rationality. We've now had this problem from a single set of spammers for a fair bit of time, we all agree that there are problems, we've had multiple threads about the problem, and we still have done absolutely nothing about it.

Replies from: topynate
comment by topynate · 2011-01-26T02:54:42.052Z · LW(p) · GW(p)

I just gave myself a deadline to write a patch for that problem.

Edit: Done!

comment by jimrandomh · 2011-01-26T02:53:05.353Z · LW(p) · GW(p)

People seem to agree that we need a karma minimum. Who actually has the administrative access necessary to implement it?

Replies from: Jack
comment by Jack · 2011-01-26T03:24:52.418Z · LW(p) · GW(p)

Shrug.

comment by taw · 2011-01-26T02:48:12.803Z · LW(p) · GW(p)

... or captchas.

The worst problem is that this spam stays in RSS readers even once it's deleted from Less Wrong itself. Right now the signal-to-noise ratio of the Less Wrong discussion RSS feed is very, very bad.

Replies from: Jack, wedrifid, beriukay
comment by Jack · 2011-01-26T03:01:08.094Z · LW(p) · GW(p)

They're getting past captchas to create their accounts.

Replies from: taw
comment by taw · 2011-01-26T21:29:49.662Z · LW(p) · GW(p)

In that case, disregard my suggestion.

comment by wedrifid · 2011-01-26T02:54:26.852Z · LW(p) · GW(p)

Don't we already have captchas? I must admit it has been an awfully long time since I created an account here.

comment by beriukay · 2011-01-26T09:31:53.169Z · LW(p) · GW(p)

I was getting similarly annoyed, so I signed up for Feed Rinse, pointed it at the LW discussion RSS feed, added some filters (all involving Pandora, though I'd be sorry if any great discussions about the music service arise), and subscribed to the filtered feed in my reader. Somehow, Google Reader can't filter anything, even though Gmail has a pretty amazing filtration system.

comment by nazgulnarsil · 2011-01-26T08:49:47.971Z · LW(p) · GW(p)

democracy sucks. isn't there a single person with the authority to simply make the change?

Replies from: Emile
comment by Emile · 2011-01-26T09:13:34.534Z · LW(p) · GW(p)

I don't think the problem has anything to do with democracy; it's just a question of someone who understands the system taking the time to implement it.

Replies from: XiXiDu, nazgulnarsil
comment by XiXiDu · 2011-01-26T10:43:09.903Z · LW(p) · GW(p)

...it's just a question of someone who understands the system taking the time to implement it.

We got some unfriendly AI here trying to tile LW with spam and nobody takes the time to implement a solution? If the SIAI fails this field test we're doomed.

Replies from: JoshuaZ
comment by JoshuaZ · 2011-01-26T13:25:40.507Z · LW(p) · GW(p)

We got some unfriendly AI here trying to tile LW with spam and nobody takes the time to implement a solution? If the SIAI fails this field test we're doomed.

I'm not sure the goal is complete tiling. Note that if a website is completely tiled with spam, then people will stop linking to it. The goal therefore should be to spam but not spam so much as to fill the website with just spam. This would in fact explain why we don't see a lot more of it placed: there's a deliberate limit on the rate of spamming.

Replies from: David_Gerard
comment by David_Gerard · 2011-01-26T18:05:58.328Z · LW(p) · GW(p)

I'm not sure the goal is complete tiling.

The adaptation being executed would certainly lead to complete tiling.

The goal therefore should be to spam but not spam so much as to fill the website with just spam.

Enough spammers observably don't behave like this; they fill their prey with nothing but spam.

comment by nazgulnarsil · 2011-01-26T10:31:30.443Z · LW(p) · GW(p)

isn't this a matter of changing an integer somewhere? i thought there was already a minimum karma threshold for posting to the main section in place.

Replies from: Emile
comment by Emile · 2011-01-26T10:59:35.985Z · LW(p) · GW(p)

I think it's mostly a question of knowing which integer to change where (it might be more than just an integer, perhaps an extra condition on an "if" or something, but I don't expect the change itself to be particularly big), committing the change to GitHub, and deploying the new version (without including any other risky, untested work-in-progress changes that may also be on GitHub). It's not just a config parameter that can be changed at runtime.
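The kind of change described here might be as small as a guard in the submission handler. The names below (`MIN_KARMA_TO_POST`, `can_submit_post`, the admin flag) are hypothetical, not the actual LessWrong/Reddit codebase:

```python
# Hypothetical submission check; all names here are illustrative,
# not taken from the real codebase.
MIN_KARMA_TO_POST = 5  # the "integer somewhere" under discussion

def can_submit_post(user_karma: int, is_admin: bool = False) -> bool:
    """Return True if this account may create a new post."""
    if is_admin:
        return True  # moderators bypass the threshold
    # The extra "if" condition: block accounts below the karma minimum.
    return user_karma >= MIN_KARMA_TO_POST
```

The hard part, as the comment notes, isn't writing this condition but deploying it safely alongside whatever else is in the repository.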

Trivial inconveniences and all that.

I think the last time someone tried to fix this problem, they did it not by setting a karma threshold but by adding a (better?) captcha to registration, or a captcha for posting when you have zero karma, something like that. It probably seemed like a fine idea at the time, but bots crowdsource captchas by reusing them on humans trying to get access to porn / downloads.