The current state of the art in comment spam

Write, geek! gets a fair amount of spam replies. This sur­prised me at first, when it be­gan hap­pen­ing al­most im­me­di­ately af­ter the blog was set up and con­tent was posted. I should have known bet­ter; there’s al­most no cost to spam­mers in spam­ming even un­pop­u­lar blogs, so why would they make an ex­cep­tion for mine?

I’m us­ing the Akismet plugin for WordPress, so it’s not like any of these com­ments ac­tu­ally make it to my blog. In fact, I’d never even have to see them, if not for the fact that I reg­u­larly clean these com­ments out of my spam folder by hand. I do this partly to en­sure that noth­ing le­git­i­mate gets fil­tered in­cor­rectly (which hap­pens some­times) and partly be­cause I like to sort of keep tabs on the cur­rent ‘state of the art’ in spam­ming.

The cur­rent state of the art in spam­ming is this: the com­ments are get­ting bet­ter. No longer are com­ments jam-packed with dozens of links com­mon­place (one par­tic­u­lar de­fault WordPress set­ting prob­a­bly made those al­most 100% in­ef­fec­tive), but they’ve been largely re­placed with com­ments that mas­quer­ade as… ac­tual com­ments!

The idea of noise disguised as signal is nothing new if you’ve used e-mail in the last 15 years, but that the noise is getting better (read: more difficult for humans to detect) is somewhat surprising. Of course, these comments are no match for a large, distributed system like Akismet, which all-knowingly sees what’s being posted to probably millions of blogs, but the well-disguised, largely pseudo-flattering comments are probably now designed to get human blog authors to click the “Not Spam” button, freeing the comments from the spam box so that they can do their SEO-based dirty work.

Of course, gen­tle read­ers, I’m far too smart to fall for that, but not so blinded by my ha­tred for spam to be un­able to ap­pre­ci­ate a well-crafted work of au­thor­ship, like this one I just found:

Spam that reads "Excellent read, I just passed this onto a colleague who was doing a little research on that. And he actually bought me lunch because I found it for him smile So let me rephrase that: Thanks for lunch!"

Sure, it’s not per­fect, but some­one out there put some mod­icum of thought into it, which is the least you could ask of the au­thor of a work that’s go­ing to be dis­trib­uted on a mas­sive scale.

Plus, it’s a lot bet­ter than this anti-gem I also just found:

Spam that reads "Why jesus allows this sort of thing to continue is a mystery"

Can you get more un­in­ten­tion­ally self-referential than that? (No, you can­not… and yes, that was a chal­lenge.)

Upgraded to WordPress 3.0

The old adage (which I think I made up) about spend­ing more time geek­ing around with a WordPress in­stal­la­tion than ac­tu­ally writ­ing in the damned blog holds true, ladies and gen­tle­men.

I just fin­ished up­grad­ing this fine blog to the newly-stable WordPress 3.0.

In case you were won­der­ing and/or sit­ting on the edge of your seats, I took great care to:

  1. Disable all of my plu­g­ins
  2. Dump a copy of my WordPress MySQL data­base us­ing the aptly-titled mysql­dump
  3. tar a copy of my WordPress di­rec­tory
  4. Do the up­grade!
  5. Re-enable the plu­g­ins one-by-one, mak­ing sure each works (or at least doesn’t break any­thing)
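For anyone following along at home, steps 2 and 3 can be sketched roughly like this. It is a minimal sketch, not what I literally typed: the database name, user, paths and the `stamp` label are all made-up placeholders, and the function only builds the `mysqldump` and `tar` command lines rather than running them.

```python
from pathlib import PurePosixPath


def backup_commands(db_name, db_user, wp_dir, out_dir, stamp):
    """Build (but do not run) the two backup commands as argument lists.

    Every argument is a hypothetical placeholder; substitute your own values.
    """
    wp_dir = PurePosixPath(wp_dir)
    out_dir = PurePosixPath(out_dir)

    dump_file = out_dir / f"{db_name}-{stamp}.sql"
    archive = out_dir / f"{wp_dir.name}-{stamp}.tar.gz"

    # Step 2: dump the database. A bare --password makes mysqldump prompt
    # for the password instead of taking it on the command line.
    mysqldump = [
        "mysqldump",
        f"--user={db_user}",
        "--password",
        f"--result-file={dump_file}",
        db_name,
    ]

    # Step 3: archive the WordPress directory. -C switches to the parent
    # directory first so the archive holds relative paths.
    tar = ["tar", "-czf", str(archive), "-C", str(wp_dir.parent), wp_dir.name]

    return mysqldump, tar


if __name__ == "__main__":
    for cmd in backup_commands("wp", "wpuser", "/var/www/wordpress",
                               "/tmp/backups", "pre-3.0"):
        print(" ".join(cmd))
```

Each list could be handed to `subprocess.run(cmd, check=True)` once the placeholders are replaced with real values.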

While I know not every­one is so lucky, I’m glad to see that every­thing ap­pears to work here, be­cause I’d be deathly em­bar­rassed if, you know, Google or Bing’s we­bcrawler came by and things weren’t look­ing up to my usual stan­dards.

Why I don’t worry about blog stats, not even a little bit

I don’t ob­sess over this blog’s traf­fic stats. Doing so would be an ex­am­ple of kick­ing my own ass.

This graph is unim­por­tant.

So while I use both Google Analytics and the WordPress Stats plugin, I don’t care a whit about the num­bers. I don’t even have to check them to know that they are mean­ing­less; they’re close enough to zero that they might as well be. (Words I’ve never spo­ken: “I had 12 pageviews to­day, up from 10. High and to the right, baby!”)

I can’t sep­a­rate bot traf­fic from hu­man traf­fic, and for all I know, I’m prob­a­bly re­spon­si­ble for some in­ci­den­tal pageviews… at least if I hap­pen to load pages when not signed in to WordPress. And why should I care about pageviews, any­way? It’s not like I’m look­ing to sell ads.

So why do I con­tinue to use not one, but two so­lu­tions to not give me num­bers? For the qual­i­ta­tive data. I can’t get enough of those.

My two fa­vorites are as fol­lows: re­fer­rers and search terms (which are, them­selves, re­fer­rers, any­way). Both of these give me in­for­ma­tion that is ac­tu­ally use­ful, right now. Search terms tell me about a case where some­one was look­ing for some­thing and found my post’s ti­tle and/or sum­mary promis­ing enough to ac­tu­ally click through. And re­fer­rers, clearly, show me who (if any­one) is dri­ving peo­ple my way.
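To make the "search terms are referrers" point concrete, here is a small sketch of how a search phrase can be pulled out of a referrer URL. It assumes the search engine puts the query in a `q` parameter, which is an assumption about the referrer's format rather than anything my stats tools promise; the function name and the sample URL are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def search_terms(referrer, param="q"):
    """Return the search phrase embedded in a referrer URL, or None.

    `param` is the query-string key assumed to hold the search phrase.
    """
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param)
    return values[0] if values else None


if __name__ == "__main__":
    # A hypothetical referrer from a search results page.
    print(search_terms("http://www.google.com/search?q=vivitar+clipshot+os+x&hl=en"))
    # A plain referrer with no query string yields None.
    print(search_terms("http://example.com/some/page"))
```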

(Even in my past life on Multiply, I hooked my account up with Site Meter’s free service to see if they could show me any insightful stats. I took a look through what they offered and found that all I really cared about were the referrers… which were, more often than not, hilarious. Web browser, OS and screen resolution can be interesting for seeing how my visitors stack up against Web users as a whole, but what am I going to do with that sort of insight? Fix IE6 CSS issues? Ha.)

The qual­i­ta­tive data that these ser­vices col­lect from my blog have shown me that peo­ple have found my post about the crappy Vivitar Clipshot, some even won­der­ing if it’s OS X-compatible. (Hint: it isn’t.) A bunch of dif­fer­ent search terms brought peo­ple to my logo/visual puns post. And one search that didn’t even log­i­cally match up with con­tent I’ve posted, re­cently learned words reap­pear­ing, gives me a great idea for a fu­ture post!

Should I be wor­ry­ing more about ap­peal­ing to the masses, or about cre­at­ing the sort of con­tent that peo­ple who ac­tu­ally do visit are in­ter­ested in? That’s easy. The searches and re­fer­rers have shown me that (please cue the schmaltzy mu­sic) I’ve touched people’s lives… even if I didn’t nec­es­sar­ily give them any­thing of value, and per­haps even wasted their time with con­tent that wasn’t rel­e­vant to their in­ter­ests. I made a dif­fer­ence!

What’s all the PubSubHubBub hubbub?

Generally speak­ing, I’m a fan of emerg­ing tech­nolo­gies and stuff like that. I just don’t al­ways get it right off the bat.

I first heard of RSS/Atom in 2002 or 2003, whenever LiveJournal started actively pushing syndication, making feeds on journals discoverable. I looked upon these alien terms with interest, but some confusion. Wait, I can subscribe to a blog? Why would I want to do that?

I know what I prob­a­bly sounded like back then. Perhaps in a cou­ple of years, I’ll be laugh­ing at my­self, won­der­ing what I’d do with­out PubSubHubBub. Just per­haps.

For now, though, I’m not quite sure I get it. Since Google Reader now supports the format, I went ahead and found a WordPress plugin to enable it here on writegeek. I understand that to an RSS subscriber, it means faster or near-instantaneous updates. And to a publisher, it means not only faster updates for one’s readers, but less load on the server, since millions of desktop feed-readers won’t be regularly requesting one’s RSS file. (Not that that applies to me… yet.)
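As far as I can tell from the PubSubHubbub spec, the publisher's half of the bargain is a single form-encoded POST to the hub, saying "this feed changed, come fetch it." Here is a minimal sketch of that ping. The hub and feed URLs are placeholders, the function name is mine, and this is not necessarily what the WordPress plugin does internally. It builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url, feed_url):
    """Build the 'new content' notification a publisher sends to a hub.

    Both URLs are hypothetical placeholders supplied by the caller.
    """
    # The spec's publish notification carries hub.mode=publish plus the
    # URL of the feed that was updated (hub.url).
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    req = build_publish_ping("https://hub.example.com/", "https://example.com/feed/")
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Actually sending it would be a matter of passing the request to `urllib.request.urlopen`; the hub then fetches the feed itself and pushes the new entries to subscribers.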

Yeah, I’m a bit in­trigued at the in­stant pub­lish­ing, but have a bunch of unan­swered ques­tions. Which servers should I be ping­ing? What mo­ti­vates one to run a server? What are their busi­ness mod­els? A cou­ple of years down the road, when they re­al­ize that they’re run­ning the most pop­u­lar servers but still aren’t mak­ing money, will they be putting ads in my feed? And I think I read some­thing about servers talk­ing to each other; how does that work?

There seems to be nothing to lose, no lock-in or single baskets in which to place all of my proverbial eggs, so I’ll try it out. (That was basically the point of this post.)

Time to click Publish and start jab­bing my F5 key…

An introduction

Hello, Internet. It’s Everett, and I’m blog­ging. I’m sort of new at this.

And at the same time, I’m not.

See, it was 2001 when I first became aware of the fact that people on the Web were writing regularly updated, reverse-chronological content about what they had for breakfast. I was a college freshman. I took up my keyboard and started a blog[1] that no longer exists, on a service that I didn’t like very much (but is still around today).

After a few months there, I started a LiveJournal that ex­ists to this day, but hasn’t been reg­u­larly up­dated in a num­ber of years. I was once a paid user of LiveJournal, an ac­knowl­edged con­trib­u­tor to the project and, sim­ply, a hu­mon­gous fan.

Something changed in my life, a few years later, around the time I fin­ished col­lege. Perhaps I no longer felt the need to tell the world what I was hav­ing for break­fast (of course, to­day that’s Twitter’s job), or maybe my life got a lot less note­wor­thy (if it had ever been). Maybe LiveJournal’s mul­ti­ple changes in own­er­ship tar­nished its im­age. Or maybe all the cool kids moved on to pure so­cial net­work­ing ser­vices, which were com­ing of age at that point.

It was probably a combination of these things, plus another big one: I was hired to work in a public-facing role at blogging/social networking/photo sharing/etc. service extraordinaire Multiply.com. To be clear, Multiply didn’t silence me; I made sure I was allowed to continue blogging elsewhere before taking the position. But having a real job, one that had me, among other things, blogging, simply wasn’t conducive to after-hours blogging.

With all of this in the past, I think it’s time I start blog­ging again. Everyone’s cat has a blog, in which they dis­cuss what they ate for break­fast, so why don’t I?

Okay, now I do.

  1. Though I was at the time unaware of the term “blog,” which was by no means in common use in 2001.