After getting about 40 moderation requests a day, I figured I should spend some time finding some anti-comment-spam plugins for WordPress. After digging around a while, I found one that doesn’t require JavaScript, doesn’t perform vision tests, but works just fine for the kind of comment-spam-bot that seemed to have taken a liking to my blog (even though no spam ever appeared in my comments ever…)
I found lr2Spam which has a great setup, but an incomplete final step. I merged it with ideas I saw in the RBL measures plugin, and got some good results. By replacing lr2Spam’s comment_post
with pre_comment_content
(see the WordPress Plugin API), I was able to redirect spammers away from from my site with PHP’s header("Location: [URL]")
technique. (This is what I borrowed from the RBL plugin.) The patch is almost as big as lr2Spam itself (both are very small). Honestly, I’m surprised it works at all. Someone wrote a comment-spam bot that can’t correctly parse a totally valid HTML form, but does correctly handle a 302/Location redirect. Weird.
I thought briefly about redirecting all the spammers to http://fbi.gov/i-am-a-spammer/?ip=[IP] but then realized their requests’ referer header would show my URL still. On further thought, I realized that comment-spam is very different from email spam because the bot has to implement a much larger set of protocol elements. Since they must respect the 302/Location redirect, someone who is getting hit really hard with comment spam could effectively DDoS somone’s link by redirecting to somewhere with big files. Say, for example, instead of using fbi.gov above, I used http://mirrors.example.com/iso/DVD-distro-image.iso. Every spam bot in their network would start a giant-ass download from example.com after hitting my anti-spam system. Ewww.
Implemented early on May 20th, after 4 days, I’ve seen 250 comment spam attempts from 162 unique IP addresses (most in China — maybe they need to turn their firewall around). The volume of spam isn’t big when compared to my daily email spam statistics, but each one of those would have been an email in my inbox, asking for moderation. Interestingly, they all stopped on May 23rd. Maybe they got a clue.
© 2006, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.