Spam-bashing on Twitter – A True Story

Earlier this week, AppStoreHQ released Hottest Apps on Twitter, a ranking of iPhone apps based on the volume and quality of tweets about each app (if you’re curious, you can read more about the ranking methodology here). The project was a fun one for several reasons, but the biggest eye-opener was the sheer volume of “app spam” – tweets auto-generated by iPhone apps – compared with tweets written by actual users. We haven’t run a formal analysis, but a back-of-the-envelope estimate is that spam tweets outnumber user-written ones roughly 10 to 1.

Since the goal of AppStoreHQ’s service is to accurately reflect the aggregate sentiment of Twitter users about iPhone apps, the spam issue is more than an annoyance – it has the potential to grossly skew the results in favor of the worst “app spam” offenders. You can see this effect in action right now if you visit the results: the current leader is Chorus, an app for sharing iPhone app purchases with friends. By design, Chorus auto-generates a tweet for every new install, and for each new iPhone app that a user downloads to their phone. This may be a smart short-term marketing win for the app, but my bet is that users (and their friends) will tire pretty quickly of the tactic.

The good news is that these bots aren’t very well designed – the spammy tweets fail the Turing test almost immediately, revealing their pattern after two or three iterations. This makes it relatively easy (if an annoying waste of engineering cycles) to exclude spam tweets from our process. But it’s a troubling indication of where Twitter is headed as a platform. Twitter has made it so easy to script tweet generation that it’s almost inevitable that the ratio of machine- to human-generated tweets will tilt rapidly toward the machines. And even though the machines will get better at concealing themselves, the signal-to-noise ratio on Twitter (not so hot already) is certain to get a lot worse.
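
For the curious, here is a minimal sketch in Python of how a "same pattern after two or three iterations" check might work. It is an illustration only, not the filter AppStoreHQ actually runs: the function names (skeleton, flag_templated), the regular expressions, the threshold of three repeats, and the sample tweets are all assumptions made up for this example. The idea is to mask the parts of a tweet that a bot typically varies (links, @mentions, numbers) and then flag any tweet whose remaining "skeleton" shows up repeatedly.

```python
import re
from collections import Counter

# Patterns for the parts of a tweet that an auto-generator usually varies.
# These are illustrative guesses, not a complete list.
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")
NUMBER = re.compile(r"\d+")


def skeleton(tweet: str) -> str:
    """Reduce a tweet to its fixed wording by masking variable parts."""
    text = URL.sub("<url>", tweet)       # mask links first (they contain digits)
    text = MENTION.sub("<user>", text)   # then @mentions
    text = NUMBER.sub("<num>", text)     # then any remaining numbers
    return " ".join(text.lower().split())  # normalize case and whitespace


def flag_templated(tweets, min_repeats=3):
    """Return one bool per tweet: True if its skeleton occurs
    at least `min_repeats` times in the batch."""
    skeletons = [skeleton(t) for t in tweets]
    counts = Counter(skeletons)
    return [counts[s] >= min_repeats for s in skeletons]


# Hypothetical sample data: three bot-style tweets and one human one.
sample = [
    "I just beat level 3 in HypotheticalGame! http://example.com/a1",
    "I just beat level 12 in HypotheticalGame! http://example.com/b2",
    "I just beat level 7 in HypotheticalGame! http://example.com/c3",
    "Loving this new photo app, the filters are great",
]
flags = flag_templated(sample)
```

A check this simple only catches templates whose variable parts are links, mentions, or numbers. A bot that swaps in a different app name each time would need a looser comparison, and a popular catchphrase that real people retweet word for word could be flagged by mistake, so in practice a threshold like this would need tuning against real data.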