Microsoft just released new research on blog spam that points a big finger at Google’s Blogspot for exacerbating the problem. Microsoft used a proprietary search tool to separate the good links from the bad, and stated that other free blog sites are subject to the same problems as Blogger.
Brian Krebs of The Washington Post noted that the research suggested Google’s anti-spam efforts are pretty weak. The “nofollow” tag attribute seems to be the biggest effort to date, and it is easy to see why that hasn’t worked.
The “nofollow” attribute is voluntary, meaning a blogger has to choose to put it in their comment and trackback functions, so all the nice bloggers put it in, and all the bad guys leave it out. Simple – and yes, some blog software comes with it turned on by default, but it is easily removed. The attribute does nothing for the individual blogger – all it does is cover the butts of the search engines. And with search engines in a constant bragging-rights battle over the number of pages they have indexed, it comes as no surprise that some might decide to go ahead and follow anyway. Throw ever-improving blog anti-spam measures into the mix, and legitimate blogs become even less inclined to use the attribute, as it undermines the cross-blog communication and linking that makes the blogosphere so…interesting.
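For anyone who hasn't looked under the hood, here is roughly what "putting it in" amounts to. This is only a sketch in Python, not the code any particular blog package ships: the function name `add_nofollow` and the regex approach are my own assumptions, and a real implementation would more likely use a proper HTML parser.

```python
import re

# Hypothetical helper for illustration; not taken from any blog platform.
# Matches an opening <a ...> tag and captures its attribute string.
_A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches an existing quoted rel="..." attribute inside that string.
_REL_ATTR = re.compile(r"\brel\s*=\s*(['\"])(.*?)\1", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" on every link."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        rel = _REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return "<a" + attrs + ' rel="nofollow">'
        values = rel.group(2).split()
        if "nofollow" in (v.lower() for v in values):
            # Already marked: leave the tag untouched.
            return match.group(0)
        # Keep existing rel values and add nofollow to them.
        values.append("nofollow")
        new_rel = 'rel="' + " ".join(values) + '"'
        return "<a" + attrs[:rel.start()] + new_rel + attrs[rel.end():] + ">"

    return _A_TAG.sub(fix, comment_html)
```

Run a comment body through it before saving and every link in it carries the attribute. Skip the call and none do, which is exactly how voluntary the whole thing is.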
Of course, Microsoft isn’t the first bigwig to bring this Blogspot (and other free blog service) thing to light. Mark Cuban has been screaming about it for a while (see here and here). Nonetheless, it is a problem that the search engines have to deal with, and the burden should not necessarily fall on the blogosphere to do their work for them. Microsoft knows this, and for the first time in a while I have to give them kudos, just for pointing it out once again.
There ought to be a mandate that all controversial ultra-biased blogs disable comment moderation. Yes, they would be required to put up with some blog spam, but what the hell. Half the reason I think political blogs are such a waste of drive space is that the individual authors simply delete the comments they don’t like, which is usually when the comment steers contrary to their ultra-right or ultra-left parroting.
Blogger, a fine amateur journalist’s platform but an unfortunate haven for red and blue feathers alike, is now providing comment moderation. And without clearly defined trackback linkage, who’s to really know when there is disagreement?
So much for new media. Then again, at least they won’t have to directly attack any negative reinforcement.
My solution – when in doubt regarding the authenticity of a blog entry or the grand phenomenon of perfect agreement on every count, I just review the “Top Ten Blogger Lies” and my mind is then clear.
Kathleen Fitzpatrick of Planned Obsolescence notes that blog spambots seem to be getting smarter. Recent trackback spam to one of Ms. Fitzpatrick’s blogs seemed to have extracted information from the site at hand, twisted it around, and sent it back hoping to evade filters. Well, it worked – manual intervention and a keen eye were required to keep it out.
We all know that fighting blog spam is a hassle no matter how good the filters you employ are, but nowadays it is getting a bit trickier. In this case, though, I find some irony. The trackback spam in question was displaying links claiming to be from the University of Virginia, so maybe the spambots are studying there? Ok, that wasn’t so funny. Could Planned Obsolescence be hinting that blog software purveyors will soon be coming out with “new and improved spam filters – upgrade now” announcements? Uh, that attempt was even worse.
Gimme a break – it’s early.
The Center for Citizen Media got a very stealthy piece of blog spam, and noted that Akismet (the WordPress blog spam filter) caught it anyway.
This wouldn’t be very interesting, except for the fact that automation was suggested. A blog spammer can be a pain in the ass, but a spammer with a tool that scrapes sites for familiar keywords, terms, and other defining factors that can then be used in the spam itself makes it really interesting. A small database of key phrases for injection, some word substitution scripts, and you’d have a real powerhouse for filling up junk comment and trackback tables.
Or maybe the spammer just read the “how to” before they did this to CCM?
Peter Kaminski discovered folks spamming his weblog with strange messages and five-digit numbers a while back, and I am curious to know if anyone figured out what it all means. Spamroll has been receiving them for the last few days, and they look like this:
Commenter: Janet Ashlow
Comment: I can’t believe it, my co-worker just bought a boat for $76917. Isn’t that silly!
There are no links in the comments, so they don’t get bounced by normal spam filtering procedures. My suspicion is someone, someplace wants those words and numbers available for others to find, but unfortunately I am no puzzle solver – the tinfoil hat scrambles too many of my internal brainwaves. And don’t try using the one above to solve it – I changed much of the original comment, just to muck up the process (if there is one). Peter has plenty of examples here for anyone still interested in figuring it all out.
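Since these comments slip past link-based filtering, one crude way to catch them is to look for exactly that combination: no link, plus a standalone five-digit number. The Python below is a sketch of that idea only. The names `LINK`, `FIVE_DIGITS`, and `looks_like_number_spam` are made up for illustration, and this is not the SpamLookup tweak Peter uses.

```python
import re

# Hypothetical patterns for illustration; not from SpamLookup or any real filter.
# Rough signs that a comment contains a link.
LINK = re.compile(r"https?://|www\.|<a\b", re.IGNORECASE)
# Exactly five digits, not part of a longer run of digits.
FIVE_DIGITS = re.compile(r"(?<!\d)\d{5}(?!\d)")


def looks_like_number_spam(comment: str) -> bool:
    """Flag link-free comments that contain a standalone five-digit number."""
    has_link = LINK.search(comment) is not None
    has_number = FIVE_DIGITS.search(comment) is not None
    return not has_link and has_number
```

The sample comment above gets flagged, but so would an honest commenter who mentions a US ZIP code, so a check like this is better suited to holding comments for review than to deleting them outright.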
Peter Kaminski is getting a lot of blog spam with five-digit numbers embedded in the messages. He’s made some tweaks to SpamLookup to take care of the problem, but he has me reeling with curiosity now.
What is the significance of the five digits in the spams? Someone please tell me before I pull what little hair I have left out!