This is partly in response to Nancy Van House’s post on Blog Spam. I had planned to post this as a comment on her weblog, but she has become so inundated with blogspam that she has completely disabled comments. So, I’ll post it on my weblog instead, and hope TrackBack works.
Weblog spam is particularly effective because it does not go after the same audience as email spam. Email spam, as Nancy mentioned, is ultimately dependent on some number of the recipients purchasing the advertised product…
In contrast, as far as weblog spammers are concerned, the ‘product’ being sold is actually Google search rankings. How do you game search results for a site like Google that relies upon PageRank, a technology based upon the idea of democratic ‘voting’ via distributed hyperlinks? You get lots and lots of ‘trusted’ sites to link to you. How do you do that illegitimately? Weblog and wiki spam.
Thus, most weblog spammers are hired by companies that sell search result ‘optimization’ services. Since most weblogs and wikis are based upon well-established technical standards, it’s trivial to automate spamming them. Once you spam them with links to your client’s site, you of course submit the URLs of all the spammed sites to Google’s spider for indexing. Google ends up seeing hyperlinks to the client from all over the web, from thousands of ‘legitimate’ domains. It’s quite clever, in a devious way.
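To make the ‘voting’ idea concrete, here is a rough sketch of the simplified, textbook form of PageRank run on a toy web. It is not Google’s production algorithm, and the site names, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration. It only shows that a page’s score goes up once other pages start linking to it.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank; `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of inbound links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its 'vote' evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three blogs link to each other, nobody links to the client.
honest = {
    "blog_a": ["blog_b", "blog_c"],
    "blog_b": ["blog_a", "blog_c"],
    "blog_c": ["blog_a", "blog_b"],
    "client_site": [],
}

# The same web after a spammer drops a comment link on every blog.
spammed = {page: list(targets) for page, targets in honest.items()}
for blog in ("blog_a", "blog_b", "blog_c"):
    spammed[blog].append("client_site")

before = pagerank(honest)["client_site"]
after = pagerank(spammed)["client_site"]
print(f"client_site score before spam: {before:.3f}, after spam: {after:.3f}")
```

In this toy case the client’s score rises simply because three ‘trusted’ pages now vote for it, which is exactly the effect the spammer is selling.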
Nancy asks if weblog spam is going to “make blogging impossible.” I don’t believe so. There are a number of (imperfect) ways for bloggers to combat this on an individual level: CAPTCHA systems, email registration, regular expression blacklists, and moderation queues, all of which will increasingly become available as standard plugins to weblog publishing systems. However, these only solve the individual problem, not the distributed problem. For every weblogger who moves to these systems, there are still tens of thousands of abandoned weblogs and wikis out there, ripe for spamming. So perhaps a more pertinent question is not whether this spam will make blogging impossible, but rather whether it will make Google’s PageRank impossible.*
* my own theory: nope. It’s certainly making things problematic in the short term for them, but I have faith that if anyone will eventually figure out a way to do a better job of filtering out the crap, it’s Google.
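For the curious, one of the individual defenses mentioned above, the regular expression blacklist, can be sketched in a few lines. This is only an illustration: the patterns, the example domain, and the function name `looks_like_spam` are made up here and do not come from any particular weblog plugin.

```python
import re

# Hypothetical blacklist; a real one would be much longer and regularly updated.
BLACKLIST = [
    re.compile(r"\bcheap\s+(pills|meds)\b", re.IGNORECASE),
    re.compile(r"\bonline\s+casino\b", re.IGNORECASE),
    re.compile(r"https?://[^\s/]*spam-example\.invalid", re.IGNORECASE),
]


def looks_like_spam(comment):
    """Return True if the comment text matches any blacklisted pattern."""
    return any(pattern.search(comment) for pattern in BLACKLIST)


print(looks_like_spam("Great post, thanks for writing this."))
print(looks_like_spam("Visit our ONLINE casino today!"))
```

A blacklist like this is easy to evade by changing a few characters, which is part of why I call these defenses imperfect.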