Did that title grab your attention? Well, it was mostly meant as a joke but read on.
I think open source is great (well, except I think GPL is evil but that’s another topic). But, lately I’ve been trying to figure out a way to fix the comment spam problem. Comment spam is generally when someone writes a script to automatically scan the web, find blogs and post ads for crap. Various ways have been tried to stop it but like e-mail, the spam still gets through.
People who run movabletype seem to have some of the worst problems as they are one of the largest targets. Write a script to spam one movabletype blog and you’ve generally covered all of them. The really bad part is it makes the web less usable. It seems most people who run blogs are forced to turn off comments on older topics so that after a couple of weeks you can no longer leave a comment on one of their posts. That’s probably not a big deal if it’s just a “here’s what I had for breakfast” kind of post but it’s a much bigger deal if it’s the kind of post that would benefit from comments over time. For example my posts on Japanese CE or Japanese Input on XP have been up for years and still get relevant comments.
So, one proposed solution is to require Javascript. Most spammers write simple scripts, probably in Perl, to post their spam. If your blog requires Javascript in some form to post successfully that’s pretty good proof that there is an actual human posting and not some spam script. The amount of work to write your own Javascript interpreter would be high enough that I doubt any spammers would do it.
Except…
They don’t have to. All they have to do is download the source code to Firefox, since it’s open source, and compile it with their spam scripts. In probably a couple of days they’d have Javascript-enabled scripts. So much for Javascript being the answer.
Hopefully someone will figure out a better solution.
Spammers are people with lots of resources. They can write custom programs as they need them. If movabletype went to a solution such as “using javascript” (which is pretty vague – for what usage?), it would be defeated in a matter of weeks.
This spamming problem is very annoying. The best working way I’ve found so far is to hand-modify your scripts in some way (changing field names and order, filenames, etc.) and, unless someone is really willing to target you specifically, you’re safe. Of course only developers can do that, and it doesn’t make things easier for anyone.
I can’t think of a proper solution.
For MovableType I just found a great plugin that still (until the spammers figure out a way around it, damn them) works. It basically is a tiny bit of Javascript that verifies that a real person is entering comments. I’m sure it’ll get circumvented soon but hey, anything that stops the barrage for a little while is welcome in my book.
http://overstated.net/projects/mt-keystrokes/
I also agree that OSS isn’t the problem. I mean, by the same token, closed source software Internet Explorer is at fault for so much crappy web design on the internet these days…
Captcha and challenge/response systems have been interesting options. I like the one where you just have to do a simple math problem to enter a comment. Hmmm…. maybe a math problem encoded in a captcha? Either way, we need to take advantage of the fact that humans can do certain things extremely fast with little effort that typical computers and software cannot.
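To make the “simple math problem” idea concrete, here is a rough Python sketch of how a blog back end might issue and check one. Everything in it is an assumption for illustration: `SECRET_KEY`, `make_challenge` and `check_answer` are made-up names, not part of any real plugin. The signed token is only there so the server doesn’t have to store the expected answer.

```python
import hashlib
import hmac
import random
import secrets

# Hypothetical server-side secret; a real install would keep this private.
SECRET_KEY = b"change-me"


def _sign(nonce, answer):
    """Sign the nonce together with the expected answer."""
    message = f"{nonce}:{answer}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_challenge():
    """Return (question, nonce, signature) to embed in the comment form."""
    a, b = random.randint(1, 9), random.randint(1, 9)
    nonce = secrets.token_hex(8)
    question = f"What is {a} + {b}?"
    return question, nonce, _sign(nonce, a + b)


def check_answer(answer_text, nonce, signature):
    """Accept the comment only if the submitted answer matches the signature."""
    try:
        answer = int(answer_text.strip())
    except ValueError:
        return False
    return hmac.compare_digest(_sign(nonce, answer), signature)
```

A plain-text question like this is trivial for a script to parse, so on its own it is closer to obscurity than to a real test; rendering the question as an image, as suggested above, is what would make it harder for software.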
MT-Keystrokes doesn’t actually work because it’s solving the problem. It works because it’s mostly obscure. The spammers could just as easily send fake data to the service saying that keys were pressed. If MT-Keystrokes became popular that’s what the spammers would do. MT-Keystrokes is basically the same as customizing your script a little.
Also, spammers do not have larger resources. You can write a spam script in minutes. Writing a Javascript interpreter would take many months. I doubt any of them have the resources to write one but now they don’t have to as a free one is available.
Make sure one can’t use html tags and leave links in comments and you’ve got most of the problem solved. A more “bloated” solution to comment spam would be one central repository of “bad comments”. As soon as one person catches a spam comment, it gets flagged and put into the “bad comments” repository. Then, your blogging software checks every comment against the database which would contain the IP, the message, maybe the HTTP headers. It would be pretty slow, as there are quite a number of blogs, but it would work sort of like email black lists. The easy way would be to have your readers log in to post comments.
Unfortunately, not allowing links makes the web much less useful. Any solution which forbids linking is not a step in the right direction.
As for checking the content of the comments, the latest version of comment spam involves them scanning your page (or another page), copying an actual comment and then just adding a link to a word or two.
That could also be a problem with visitors using dynamic IPs.
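A rough sketch of the “bad comments” repository idea, in Python. This is an in-memory stand-in, not a real shared service: the class name `BadCommentRepository` and its methods are invented for illustration, and it only tracks the IP and a hash of the message, leaving out the HTTP headers.

```python
import hashlib


class BadCommentRepository:
    """In-memory stand-in for a shared database of flagged spam comments."""

    def __init__(self):
        self._message_hashes = set()
        self._ips = set()

    @staticmethod
    def _fingerprint(message):
        # Normalize case and whitespace so trivial edits still match.
        normalized = " ".join(message.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def flag(self, ip, message):
        """Record a comment that a blog owner caught as spam."""
        self._ips.add(ip)
        self._message_hashes.add(self._fingerprint(message))

    def is_known_spam(self, ip, message):
        """Check an incoming comment against everything flagged so far."""
        return ip in self._ips or self._fingerprint(message) in self._message_hashes


repo = BadCommentRepository()
repo.flag("203.0.113.7", "Buy   CHEAP pills now")
print(repo.is_known_spam("198.51.100.2", "buy cheap pills now"))  # same text, new IP
print(repo.is_known_spam("198.51.100.2", "Great post, thanks!"))
```

Matching on an exact message hash only catches verbatim repeats, and matching on IP alone will also catch anyone who later ends up with that address, so both checks are blunt instruments.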
Actually frumin, there is such a plug-in called MT-Blacklist. You keep a repository of regular expressions that catch spam. When you install the plug-in, it starts off with an impressive database. Then, as you run into more comment spam, you can add it to your database and occasionally send it back to the central Blacklist repository to be merged in. If a human enters a comment that gets picked up by MT-Blacklist, it will return a page that says, “Sorry, your comment looked like spam because of this text:” and so people’s comments don’t disappear without them knowing. It works quite well.
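For anyone curious what that kind of check looks like, here is a minimal Python sketch of the general technique. It is not MT-Blacklist’s actual code: the patterns and the function names `find_spam_match` and `handle_comment` are made up for illustration.

```python
import re

# Hypothetical blacklist entries; a real list would be much longer
# and merged periodically with a shared central list.
BLACKLIST_PATTERNS = [
    r"cheap-pills\.example",
    r"\bonline\s+casino\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in BLACKLIST_PATTERNS]


def find_spam_match(comment):
    """Return the text that matched a blacklist pattern, or None."""
    for pattern in COMPILED:
        match = pattern.search(comment)
        if match:
            return match.group(0)
    return None


def handle_comment(comment):
    """Reject a matching comment and tell the poster which text triggered it."""
    matched = find_spam_match(comment)
    if matched is not None:
        return f"Sorry, your comment looked like spam because of this text: {matched}"
    return "Comment accepted."
```

Showing the offending text is what lets a real person fix their comment and resubmit instead of wondering where it went.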
You know when you sign up for some free email account and it says “Please put in the numbers/letters shown in the graphic above”? There is a movable type plugin that does this, and I’m sure there are plugins for other blog back ends. Go here to check it out or read the Wikipedia entry.
Oops! Meant this instead of the same URL twice for my comment below
Why do you say that GPL is evil, btw ?
yah i’d like to know why gpl is evil too