The current state of the art in comment spam

Write, geek! gets a fair amount of spam replies. This surprised me at first, when it began happening almost immediately after the blog was set up and content was posted. I should have known better; there’s almost no cost to spammers in spamming even unpopular blogs, so why would they make an exception for mine?

I’m using the Akismet plugin for WordPress, so it’s not like any of these comments actually make it to my blog. In fact, I’d never even have to see them, if not for the fact that I regularly clean these comments out of my spam folder by hand. I do this partly to ensure that nothing legitimate gets filtered incorrectly (which happens sometimes) and partly because I like to sort of keep tabs on the current ‘state of the art’ in spamming.

The current state of the art in spamming is this: the comments are getting better. Comments jam-packed with dozens of links are no longer commonplace (one particular default WordPress setting probably made those almost 100% ineffective); they’ve been largely replaced with comments that masquerade as… actual comments!

The idea of noise disguised as signal is nothing new if you’ve used e-mail in the last 15 years, but that the noise is getting better (read: more difficult for humans to detect) is somewhat surprising. Of course, these comments are no match for a large, distributed system like Akismet, which all-knowingly sees what’s being posted to probably millions of blogs. The well-disguised, largely pseudo-flattering comments are probably now designed to get human blog authors to click the “Not Spam” button, freeing the comments from the spam box so that they can do their SEO-based dirty work.

Of course, gentle readers, I’m far too smart to fall for that, but not so blinded by my hatred for spam as to be unable to appreciate a well-crafted work of authorship, like this one I just found:

Spam that reads "Excellent read, I just passed this onto a colleague who was doing a little research on that. And he actually bought me lunch because I found it for him smile So let me rephrase that: Thanks for lunch!"

Sure, it’s not perfect, but someone out there put some modicum of thought into it, which is the least you could ask of the author of a work that’s going to be distributed on a massive scale.

Plus, it’s a lot better than this anti-gem I also just found:

Spam that reads "Why jesus allows this sort of thing to continue is a mystery"

Can you get more unintentionally self-referential than that? (No, you cannot… and yes, that was a challenge.)

Upgraded to WordPress 3.0

The old adage (which I think I made up) about spending more time geeking around with a WordPress installation than actually writing in the damned blog holds true, ladies and gentlemen.

I just finished upgrading this fine blog to the newly-stable WordPress 3.0.

In case you were wondering and/or sitting on the edge of your seats, I took great care to:

  1. Disable all of my plugins
  2. Dump a copy of my WordPress MySQL database using the aptly-titled mysqldump
  3. tar a copy of my WordPress directory
  4. Do the upgrade!
  5. Re-enable the plugins one-by-one, making sure each works (or at least doesn’t break anything)
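
For the curious, steps 2 and 3 might look something like the Python sketch below. It is only an illustration, not the exact commands used for this upgrade: the database name, database user, and both paths are made-up placeholders, and the function merely builds the `mysqldump` and `tar` command lines rather than running them.

```python
import shlex
from pathlib import PurePosixPath


def backup_commands(db_name, db_user, wp_dir, backup_dir):
    """Build (but do not run) the database-dump and directory-archive commands."""
    wp_dir = PurePosixPath(wp_dir)
    backup_dir = PurePosixPath(backup_dir)

    # Step 2: dump the WordPress database to a .sql file.
    dump = [
        "mysqldump",
        f"--user={db_user}",
        "--password",  # no value given, so mysqldump prompts for it
        f"--result-file={backup_dir / (db_name + '.sql')}",
        db_name,
    ]

    # Step 3: archive the WordPress directory, relative to its parent.
    archive = [
        "tar",
        "-czf", str(backup_dir / (wp_dir.name + ".tar.gz")),
        "-C", str(wp_dir.parent),
        wp_dir.name,
    ]
    return dump, archive


if __name__ == "__main__":
    # All four arguments are hypothetical placeholders.
    for cmd in backup_commands("wordpress", "wp_user", "/var/www/blog", "/tmp/backup"):
        print(shlex.join(cmd))
```

Each list could be handed to `subprocess.run(cmd, check=True)` once the placeholders are swapped for real values; printing them first makes it easy to eyeball the commands before anything touches the database.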

While I know not everyone is so lucky, I’m glad to see that everything appears to work here, because I’d be deathly embarrassed if, you know, Google or Bing’s webcrawler came by and things weren’t looking up to my usual standards.

Why I don’t worry about blog stats, not even a little bit

I don’t obsess over this blog’s traffic stats. Doing so would be an example of kicking my own ass.

This graph is unimportant.

So while I use both Google Analytics and the WordPress Stats plugin, I don’t care a whit about the numbers. I don’t even have to check them to know that they are meaningless; they’re close enough to zero that they might as well be. (Words I’ve never spoken: “I had 12 pageviews today, up from 10. High and to the right, baby!”)

I can’t separate bot traffic from human traffic, and for all I know, I’m probably responsible for some incidental pageviews… at least if I happen to load pages when not signed in to WordPress. And why should I care about pageviews, anyway? It’s not like I’m looking to sell ads.

So why do I continue to use not one, but two solutions to not give me numbers? For the qualitative data. I can’t get enough of those.

My two favorites are as follows: referrers and search terms (which are, themselves, referrers, anyway). Both of these give me information that is actually useful, right now. Search terms tell me about a case where someone was looking for something and found my post’s title and/or summary promising enough to actually click through. And referrers, clearly, show me who (if anyone) is driving people my way.
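
To make the "search terms are referrers" point concrete, here is a minimal Python sketch of pulling a search phrase out of a referrer URL. It assumes the search engine puts the query in a `q` parameter, which is common but not universal; the URLs and the function name are hypothetical, and this is not how either stats service actually does it.

```python
from urllib.parse import parse_qs, urlparse


def search_terms(referrer, param="q"):
    """Return the search phrase embedded in a referrer URL, or None if there is none."""
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param)
    return values[0] if values else None


# A referrer from a (hypothetical) search results page carries the phrase.
print(search_terms("http://search.example.com/search?q=vivitar+clipshot+os+x"))

# A plain referrer, such as a link from another blog, has no search phrase.
print(search_terms("http://other-blog.example.com/links.html"))
```

A referrer without the expected parameter simply yields `None`, which is the "who is driving people my way" case rather than the "what were they looking for" case.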

(Even in my past life on Multiply, I hooked my account up with Site Meter’s free service to see if they could show me any insightful stats. I took a look through what they offered and found that all I really cared about were the referrers… which were, more often than not, hilarious. Web browser, OS and screen resolution can be interesting for seeing how my visitors stack up against Web users as a whole, but what am I going to do with that sort of insight? Fix IE6 CSS issues? Ha.)

The qualitative data that these services collect from my blog have shown me that people have found my post about the crappy Vivitar Clipshot, some even wondering if it’s OS X-compatible. (Hint: it isn’t.) A bunch of different search terms brought people to my logo/visual puns post. And one search that didn’t even logically match up with content I’ve posted, “recently learned words reappearing,” gives me a great idea for a future post!

Should I be worrying more about appealing to the masses, or about creating the sort of content that people who actually do visit are interested in? That’s easy. The searches and referrers have shown me that (please cue the schmaltzy music) I’ve touched people’s lives… even if I didn’t necessarily give them anything of value, and perhaps even wasted their time with content that wasn’t relevant to their interests. I made a difference!

What’s all the PubSubHubBub hubbub?

Generally speaking, I’m a fan of emerging technologies and stuff like that. I just don’t always get it right off the bat.

I first heard of RSS/Atom in 2002 or 2003, whenever LiveJournal started actively pushing syndication, making feeds on journals discoverable. I looked upon these alien terms with interest, but some confusion. Wait, I can subscribe to a blog? Why would I want to do that?

I know what I probably sounded like back then. Perhaps in a couple of years, I’ll be laughing at myself, wondering what I’d do without PubSubHubBub. Just perhaps.

For now, though, I’m not quite sure I get it. Since Google Reader now supports the format, I went ahead and found a WordPress plugin to enable it here on writegeek. I understand that to an RSS subscriber, it means faster or near-instantaneous updates. And to a publisher, it means not only faster updates for one’s readers, but less load on the server, since millions of desktop feed-readers won’t be regularly requesting one’s RSS file. (Not that that applies to me… yet.)
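
As far as I can tell from the spec, the publisher’s side of this boils down to one small form-encoded POST to a hub, with `hub.mode` set to `publish` and `hub.url` set to the feed that changed. Here is a rough Python sketch of building that request. The hub and feed URLs are invented placeholders, the function name is mine, and this is not what the WordPress plugin literally does under the hood.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url, feed_url):
    """Build (without sending) the 'my feed has new content' notification for a hub."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Both URLs are hypothetical placeholders.
ping = build_publish_ping("https://hub.example.com/", "http://blog.example.com/feed/")
print(ping.get_method(), ping.full_url)
print(ping.data.decode("ascii"))
```

Passing `ping` to `urllib.request.urlopen` would actually send it. The hub is then supposed to fetch the feed itself and push the new entries out to subscribers, which is where the reduced polling load would come from.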

Yeah, I’m a bit intrigued by the instant publishing, but have a bunch of unanswered questions. Which servers should I be pinging? What motivates one to run a server? What are their business models? A couple of years down the road, when they realize that they’re running the most popular servers but still aren’t making money, will they be putting ads in my feed? And I think I read something about servers talking to each other; how does that work?

There seems to be nothing to lose, no lock-in or single baskets in which to place all of my proverbial eggs, so I’ll try it out. (That was basically the point of this post.)

Time to click Publish and start jabbing my F5 key…

An introduction

Hello, Internet. It’s Everett, and I’m blogging. I’m sort of new at this.

And at the same time, I’m not.

See, it was 2001 when I first became aware of the fact that people on the Web were writing regularly updated, reverse-chronological content about what they had for breakfast. I was a college freshman. I took up my keyboard and started a blog[1] that no longer exists, on a service that I didn’t like very much (but is still around today).

After a few months there, I started a LiveJournal that exists to this day, but hasn’t been regularly updated in a number of years. I was once a paid user of LiveJournal, an acknowledged contributor to the project and, simply, a humongous fan.

Something changed in my life, a few years later, around the time I finished college. Perhaps I no longer felt the need to tell the world what I was having for breakfast (of course, today that’s Twitter’s job), or maybe my life got a lot less noteworthy (if it had ever been). Maybe LiveJournal’s multiple changes in ownership tarnished its image. Or maybe all the cool kids moved on to pure social networking services, which were coming of age at that point.

It was probably a combination of these things, plus another big one: I was hired to work in a public-facing role at blogging/social networking/photo sharing/etc. service extraordinaire Multiply.com. To be clear, Multiply didn’t silence me; I made sure I was allowed to continue blogging elsewhere before taking the position. But having a real job, one that had me, among other things, blogging, simply wasn’t conducive to after-hours blogging.

With all of this in the past, I think it’s time I start blogging again. Everyone’s cat has a blog, in which they discuss what they ate for breakfast, so why don’t I?

Okay, now I do.

  1. Though I was at the time unaware of the term “blog,” which was by no means in common use in 2001.