Comment Spam Countermeasures

Posted on October 18, 2006 in Blogging
(last edited on April 29, 2014 at 1:27 am)

I’ve been noticing an increase in comment spam yet again. While Bad Behavior and Akismet are doing a good job of keeping the comment spam out of the blog, I’m also starting to hit some server performance issues. I’m not quite ready to move yet, so I thought I’d try an easy trick to see if I could alleviate server load due to heavy spamming activity.

Geeky notes follow.

Bad Behavior and Akismet Revisited

Someone asked me why I was using two spam plugins…didn’t they do the same thing? So here’s why:

Bad Behavior 2 is the first line of defense. From the Bad Behavior website:

Bad Behavior was designed and built by watching actual spambots which harvested email addresses, posted comment spam, and used fake referrers. By logging their entire HTTP requests and comparing them to HTTP requests of legitimate users, it is possible to detect most spambots.

In other words, it actually acts as a screening program to prevent suspicious software from “seeing” your website. It does this by doing a bit of behavioral profiling; software that does not play nice is not allowed any further. This means that the spambots can’t even see the comment form, which means they can’t leave spam. Legitimate users with a web browser, though, behave nicely, so they are allowed to see the web page in most cases. Advanced users who have tricked out their Internet to be anonymous, for example, fall into the “spooky” category and probably get bounced.
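To make the idea of behavioral profiling concrete, here is a rough PHP sketch of judging a request by its headers alone. The function name `looks_like_spambot` and the three checks are assumptions picked for illustration; they are not Bad Behavior's actual rules, which are far more extensive.

```php
<?php
// Hypothetical sketch: judge a request by how it behaves, not by what it posts.
// These checks are illustrative assumptions, not Bad Behavior's real rule set.
function looks_like_spambot(array $headers, string $method): bool
{
    // Normalize header names so lookups are case-insensitive.
    $h = array_change_key_case($headers, CASE_LOWER);

    // Real browsers virtually always identify themselves.
    if (empty($h['user-agent'])) {
        return true;
    }
    // Real browsers send an Accept header with every request.
    if (!isset($h['accept'])) {
        return true;
    }
    // A form POST with no Referer suggests the form page was never loaded.
    if (strtoupper($method) === 'POST' && empty($h['referer'])) {
        return true;
    }
    return false;
}

$browser = ['User-Agent' => 'Mozilla/5.0', 'Accept' => 'text/html', 'Referer' => 'http://example.com/post'];
$bot = ['Accept' => '*/*'];

var_dump(looks_like_spambot($browser, 'POST')); // bool(false)
var_dump(looks_like_spambot($bot, 'POST'));     // bool(true)
```

Note that the Referer check in this sketch would also bounce a real person whose privacy software strips that header, which is exactly the “spooky” category problem.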

Despite all the work that BB2 does, some spambots still get through. This is when Akismet comes into play. This plugin analyzes the comment text itself and determines whether it’s spammy or not. Bad Behavior, by comparison, just looks at how the entire website is accessed, not the content. If you were to think of Bad Behavior as law enforcement, it’s like an officer who arrests anyone that even looks like they might commit a crime. Akismet, however, waits for the act to be committed, then sees if it’s criminal by consulting its database of known acts of spam; these are collected and analyzed by the network of Akismet plugin users.
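For contrast, here is a rough PHP sketch of a content-based check. Akismet's real analysis runs on its own servers against spam reported by its users; the function name `comment_looks_spammy`, the phrase list, and the link limit below are all made-up stand-ins to show the difference in approach.

```php
<?php
// Hypothetical content check. Akismet's real analysis runs on its own
// servers; the phrase list and link limit here are made-up stand-ins.
function comment_looks_spammy(string $text, array $knownSpamPhrases, int $maxLinks = 2): bool
{
    $lower = strtolower($text);
    foreach ($knownSpamPhrases as $phrase) {
        // A match against previously reported spam is the strongest signal.
        if (strpos($lower, strtolower($phrase)) !== false) {
            return true;
        }
    }
    // Many links in one comment is a common spam trait.
    return substr_count($lower, 'http://') > $maxLinks;
}

$phrases = ['cheap pills', 'online casino'];
var_dump(comment_looks_spammy('Great post, thanks!', $phrases));        // bool(false)
var_dump(comment_looks_spammy('Visit my ONLINE CASINO now', $phrases)); // bool(true)
```

Unlike the behavioral screening, this check only runs once a comment has actually been submitted.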

So Bad Behavior 2, which vigorously beats off the majority of spam attempts, makes for a lighter workload on the server AND for Akismet, which does cleanup on a case-by-case basis. Since installing this combination of plugins several months ago, I’ve seen maybe 2 or 3 actual spam comments get to the moderation queue where I can give the final OK. A few legitimate comments have been caught, but even that has been pretty rare (maybe 5 or 6).

Nevertheless, I’m still experiencing some server sluggishness, so I thought maybe I’d tweak my setup.

Security through Obscurity

The phrase security through obscurity usually has negative connotations: any security system that relies on secrecy alone is pretty half-assed. You can’t rely on secrets to keep things safe; you also need active countermeasures. It occurred to me, though, that in this case security-through-obscurity thinking was appropriate.

Spambots aren’t personal; they just blindly go out and try to access the comment posting mechanisms on blogs based on the default settings. For example, the file that does the actual posting of comments to WordPress is wp-comments-post.php, which lives in the same place on every installation. A spambot knows this; it’s just like an opportunistic thief trying the doorknob on your car or house to see if it’s unlocked. It’s just the obvious thing to try. It’s easy enough to do quickly, and the payoff when it works is enough to justify the effort. It just takes one payoff.

Now, knowing this, imagine that you’ve hidden your doorknob. Or moved the door behind something. A thief who’s in a hurry will just walk right on by your house…there are always easier targets. Spambots are similar; they need to make thousands of posts to be able to turn any kind of profit for their overlords, and there just isn’t time to spend cracking your particular blog. There are lots of default installations of WordPress out there, and those are the ones that are being targeted. Spammers aren’t out to make a personal connection; they need thousands and thousands of links to make their dark search engine optimization tricks work for dozens of sites. That means they will target high-yield scenarios like “default installations” of blogging software. They will also target sites that are protected by solutions like Akismet and Bad Behavior, if they’ve figured out a way to get around the countermeasures. Even in this case, however, they’re not targeting your specific blog, but a particular plugin.

Filename Switcheroo

If you measure security by not allowing any access to your valuables, then security through obscurity is clearly not enough. A determined attacker will find your secret. However, since spambots aren’t determined attackers, and our goal is merely to reduce violations, applying a little secret sauce is entirely appropriate.

I found Yami McMoot’s write-up on renaming script files on the WordPress Codex, and have changed my comment and trackback files to a random name. Spambots that try to access the default pages will get served a blank page; they’ll just appear broken to the spambot, which will move on its merry way to the next target.
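Here is a rough PHP sketch of the rename idea, assuming PHP 7 or later for `random_bytes`. The helpers `random_script_name` and `comment_form_open_tag` and the 16-character name length are my own choices for illustration; they are not taken from the Codex write-up.

```php
<?php
// Hypothetical helper: build an unguessable replacement name for a script.
// 8 random bytes become 16 hex characters; the length is an arbitrary choice.
function random_script_name(string $prefix = 'c'): string
{
    return $prefix . '-' . bin2hex(random_bytes(8)) . '.php';
}

// Hypothetical helper: the comment form must post to the new name,
// otherwise legitimate visitors would hit the blank stub too.
function comment_form_open_tag(string $siteUrl, string $scriptName): string
{
    $action = rtrim($siteUrl, '/') . '/' . $scriptName;
    return '<form action="' . htmlspecialchars($action, ENT_QUOTES) . '" method="post">';
}

$name = random_script_name();
echo comment_form_open_tag('http://example.com/', $name), "\n";
```

The new name still appears in the form's action attribute, so this only hides the script from bots that go straight for the default filename without reading the page.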

Now, this won’t stop an attacker who’s actually using a browser to access and enter comment spam, and I imagine that there are attempts right now to create an automated browser-based spambot that can automatically find and navigate to the comment box. Spambots that parse HTML would require active countermeasures like Bad Behavior, or perhaps I would have to use an entirely different comment posting mechanism (Flash, anyone?).

Summing Up

So I’m hopeful that this will reduce the amount of incoming spam by a bit more. The anti-spam fortifications in place now look like this:

1. Secret Entrance: For automated spam scripts to access the comment posting mechanisms directly, they’ll have to guess what random name I’ve chosen for these two files. The OLD files are there, but they’re blank and will not invoke Bad Behavior. This procedural change is invisible to “regular” users.
2. Bad Behavior: All visitors, normal people and spambots alike, are subjected to the screening. Spammy behavior gets you booted before you even reach the blog. Everyone else is allowed to see the website.
3. Akismet: Any comments that are submitted will have been made by people (or spambots) that have made it through the first two lines of defense. Akismet now looks at what’s been written; spam is marked as such.
4. Moderation: Any legitimate comment that’s made it through steps 1 through 3 is held in moderation until I personally read and approve it. On approval, the comment is posted to the blog.
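
The four layers can be sketched as a single decision function in PHP. This is a simplified model, not how WordPress actually wires the plugins together: the function name `comment_outcome`, the boolean inputs, and the outcome strings are all hypothetical.

```php
<?php
// Hypothetical model of the four layers; each argument stands in for one
// layer's verdict. In reality these run in separate plugins and requests.
function comment_outcome(
    bool $usedSecretEntrance, // posted to the renamed script, not the default
    bool $passedScreening,    // Bad Behavior let the request through
    bool $contentLooksSpammy, // Akismet's verdict on the comment text
    bool $approvedByOwner     // manual moderation decision
): string {
    if (!$usedSecretEntrance) {
        return 'blank page';        // layer 1: old file is an empty stub
    }
    if (!$passedScreening) {
        return 'blocked';           // layer 2: rejected before reaching the blog
    }
    if ($contentLooksSpammy) {
        return 'marked as spam';    // layer 3: flagged by content analysis
    }
    // Layer 4: everything else waits for a human.
    return $approvedByOwner ? 'published' : 'held for moderation';
}

echo comment_outcome(false, true, false, false), "\n"; // blank page
echo comment_outcome(true, true, false, true), "\n";   // published
```

The ordering matters for server load: the cheapest check comes first, so a request that fails it never reaches the more expensive layers.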

In my WordPress dashboard, Akismet shows me how much spam it’s blocked since it was installed. I added a little hack to Bad Behavior 2 in bad-behavior-wordpress.php to display a similar message:

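    // Print Bad Behavior's blocked-request stats on the dashboard.
    function bb2_dashboard_stats() {
        // Passing true asks for the stats line to be shown here.
        bb2_insert_stats(true);
    }
    // Run the function at the end of the dashboard's activity box.
    add_action('activity_box_end', 'bb2_dashboard_stats');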

I’m not actually sure if adding countermeasure #1 will make a huge difference; I suspect my laggy server performance is just due to me starting to outgrow it. However, if there are a significant number of spambots just hammering the server that Bad Behavior has to deal with, it might add up. Bad Behavior 2 reports about 150-200 blocked posting attempts a day, and Akismet blocks maybe 20-30 that get through.

We’ll see what happens.

Tags: Gweeping

5 Comments

Ian Muir 19 years ago

I’ve been using Akismet and it’s great. Even my low-traffic blog gets a fair amount of spam.

Some of the spam that gets through is kind of funny. You can see that the people writing the spamming software are trying to counteract spamblockers, but in the process they’re making their own spam useless.

Some of the results can be kind of funny. I actually approved a comment that I knew was spam just because it struck me as funny.

Rosano 19 years ago

I use a slightly hacked version of Did You Pass Math?. It works pretty well and I’ve only had 1 or 2 spam comments sift through.

Michael Hampton 19 years ago

A few comments:

First, if you aren’t already using wp-cache, you should. This will dramatically reduce the load on your server from legitimate traffic as well as spammers (note it needs a slight change to work with Bad Behavior).

Oops, I just looked, and it seems you already are using wp-cache, but the server’s slow anyway? Try installing eAccelerator as well; the combination helped me survive being on the front page of digg.com and slashdot at the same time.

Second, I’ve gone to great lengths to ensure that people using browser privacy software aren’t blocked whenever possible. Unfortunately this isn’t always possible because some of the less well-written such software does some of the same dirty tricks that spammers do. When this happens I provide as much information as possible to the user, including the exact changes they need to make, when I know what they are.

Third, automated scripts which can read your comment form right off the web page and thereby determine exactly where to submit the spam already exist and are in wide use now. Where I know about these, Bad Behavior 2 already blocks them. Spambots which just attempt the defaults are much less common than they used to be.

Dave Seah 19 years ago

Ian: Heh, I never thought of allowing spam for humor value. I guess I should lighten up :-)

Rosano: Thanks for the link! Sounds like fun…I should give that a try sometime!

Michael: Thanks for the info, and for all your work with Bad Behavior! I think my slow server is due to my pages being formatted with Markdown (which I suspect is a pig), and I’m starting to hit the limits of pair.com’s DB servers (which are quite low, apparently). Or it could be something else that’s not me. I wonder if I can hide the name of the action for submit, and defeat some of the parsers.

Unsought Jason 19 years ago

About the performance problems you’re seeing – it could be due to Akismet’s aggressive use of OPTIMIZE TABLE when deleting comments. It’s not too hard to fix.