A plan for spam

cousin_it Jan 26, 2011, 2:22 AM

15 points

I’m getting tired of banning tons of similar articles about jewelry etc. in the discussion section. Our situation looks like a textbook-perfect use case for a Bayesian spam filter (ahem). Or just implement the 5 karma limit that was discussed earlier; that would help too.
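
As a rough illustration of the Bayesian filter idea, here is a minimal naive Bayes sketch in Python. It is not Less Wrong's code: the class name, the tokenizer, and the four training posts are all made up for the example, and a real filter would need far more labeled posts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled post; label must be 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label):
        # Assumes at least one training post per label (otherwise log(0)).
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Log prior: fraction of training posts carrying this label.
        score = math.log(self.doc_counts[label] / total_docs)
        for token in tokens:
            count = self.word_counts[label][token]
            # Smoothed log likelihood of the token under this label.
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_spam(self, text):
        """Return True if the spam score beats the ham score."""
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")


# Hypothetical training data, invented for this sketch.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("cheap pandora jewelry bracelets charms sale", "spam")
spam_filter.train("buy discount jewelry beads online", "spam")
spam_filter.train("bayesian reasoning and decision theory discussion", "ham")
spam_filter.train("karma threshold for posting in the discussion section", "ham")

print(spam_filter.is_spam("discount pandora jewelry sale"))
print(spam_filter.is_spam("decision theory discussion"))
```

The moderators' existing bans would be a natural source of labeled spam for training, though that is a suggestion rather than something the site is known to support.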

JoshuaZ Jan 26, 2011, 2:34 AM
21 points

There’s an additional problem: the banned posts are still showing up in Google. This means that Less Wrong’s failure to deal with spam is harming the utility of people outside LW who are searching Google for certain classes of products. Any final solution should also include the actual removal of these pages from LW.

I’m also worried about what this shows about our actual level of instrumental rationality given that we’ve now had this problem from a single set of spammers for a fair bit of time and all agree that there are problems, have had multiple threads about the problem, and still have done absolutely nothing about it.
- topynate Jan 26, 2011, 2:54 AM
  17 points
  Parent
  
  I just gave myself a deadline to write a patch for that problem.
  
  Edit: Done!
jimrandomh Jan 26, 2011, 2:53 AM
5 points

People seem to agree that we need a karma minimum. Who actually has the administrative access necessary to implement it?
- Jack Jan 26, 2011, 3:24 AM
  2 points
  Parent
  
  Shrug.
taw Jan 26, 2011, 2:48 AM
4 points

… or captchas.

The worst problem is that this spam stays in RSS readers even once it’s deleted from Less Wrong itself. Right now the signal-to-noise ratio of the Less Wrong discussion RSS feed is very bad.
- Jack Jan 26, 2011, 3:01 AM
  8 points
  Parent
  
  They’re getting past captchas to create their accounts.
  - taw Jan 26, 2011, 9:29 PM
    0 points
    Parent
    
    In that case, disregard my suggestion.
- wedrifid Jan 26, 2011, 2:54 AM
  2 points
  Parent
  
  Do we not already have captchas? I must admit it has been an awful long time since I created an account here.
- beriukay Jan 26, 2011, 9:31 AM
  1 point
  Parent
  
  I was getting similarly annoyed, so I signed up for Feed Rinse, put the LW discussion RSS on their site, added some filters, all involving Pandora (though I would be sorry if any great discussions arise about the music service), and put that into my RSS feed. Because, somehow, Google Reader can’t filter anything, even though Gmail has a pretty amazing filtration system.
nazgulnarsil Jan 26, 2011, 8:49 AM
0 points

Democracy sucks. Is there not a single person with the authority to simply make the change?
- Emile Jan 26, 2011, 9:13 AM
  1 point
  Parent
  
  I don’t think the problem has anything to do with democracy; it’s just a question of someone who understands the system taking the time to implement it.
  - XiXiDu Jan 26, 2011, 10:43 AM
    10 points
    Parent
    
    
    ...it’s just a question of someone who understands the system taking the time to implement it.
    
    We got some unfriendly AI here trying to tile LW with spam and nobody takes the time to implement a solution? If the SIAI fails this field test we’re doomed.
    - JoshuaZ Jan 26, 2011, 1:25 PM
      5 points
      Parent
      
      
      We got some unfriendly AI here trying to tile LW with spam and nobody takes the time to implement a solution? If the SIAI fails this field test we’re doomed.
      
      I’m not sure the goal is complete tiling. Note that if a website is completely tiled with spam then people will stop linking to it. The goal therefore should be to spam but not spam so much as to fill the website with just spam. This would in fact explain why we don’t get a lot more of them placed: there’s a deliberate limit on the rate of spamming.
      - David_Gerard Jan 26, 2011, 6:05 PM
        5 points
        Parent
        
        
        I’m not sure the goal is complete tiling.
        
        The adaptation being executed would certainly lead to complete tiling.
        
        The goal therefore should be to spam but not spam so much as to fill the website with just spam.
        
        Enough spammers observably don’t behave like this, but instead fill their prey with just spam.
  - nazgulnarsil Jan 26, 2011, 10:31 AM
    0 points
    Parent
    
    Isn’t this a matter of changing an integer somewhere? I thought there was already code in place enforcing a minimum threshold to post in the main section.
    - Emile Jan 26, 2011, 10:59 AM
      0 points
      Parent
      
      I think it’s mostly a question of knowing which integer to change where (it might be more than just an integer, like adding an extra condition to an “if” or something, but I don’t expect the change itself to be particularly big), committing the change to GitHub, and deploying the version with the change (and without including any other risky untested work-in-progress changes to the code that may also be on GitHub). It’s not just a config parameter that can be changed at runtime.
      
      Trivial inconveniences and all that.
      
      I think last time someone tried to fix this problem, they did it not by setting a karma threshold, but by adding a (better?) captcha to registration, or adding a captcha for posting when you have zero karma, something like that. It probably seemed like a fine idea at the time, but bots crowdsource captchas by reusing them on humans trying to get access to porn / downloads.
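
The karma minimum discussed in this thread amounts to one extra condition in the posting check. A minimal sketch in Python, not Less Wrong's actual code: `MIN_KARMA_TO_POST`, `can_post`, and the choice of `>=` over `>` are assumptions made for illustration, with the value 5 taken from the "5 karma limit" mentioned in the original post.

```python
# Hypothetical constant: the "5 karma limit" from the original post.
MIN_KARMA_TO_POST = 5


def can_post(user_karma, min_karma=MIN_KARMA_TO_POST):
    """Return True if the user has enough karma to post in the discussion section.

    Assumes the threshold is inclusive (karma equal to the minimum is enough).
    """
    return user_karma >= min_karma


# A freshly registered spam account starts at zero karma and is blocked.
print(can_post(0))
# An established commenter clears the bar.
print(can_post(12))
```

As noted above, the hard part is less the condition itself than finding where it belongs in the codebase and deploying it safely.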