Applying Prediction Models to Twitter (28 Days of Hacking: Day 7)

So I started building yet another reverse engineering framework in Python today, but it is still much too far from done to discuss.

Today I created my first Twitter account, only so I could get access to the Twitter API. After a bit of playing around with it, the Tweepy wrapper seemed to be the best way to interface with it from Python. Once I had OAuth set up, I decided to start looking at what was trending on Twitter. That's pretty easy to do once you read the documentation for Yahoo! Where On Earth IDs (WOEIDs), which the API uses to identify locations. But just getting the current trends for a location really doesn't give me much to work with, so I wrote some code to set up a stream on whatever was trending in the US, refreshing the list of trends every 5 minutes.
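A minimal sketch of that kind of refresh loop, not the exact code I ran. The names trend_names and poll_trends are made up for illustration, and fetch is a stand-in for whatever call returns trends for a WOEID (in the Tweepy releases of this era that was, as far as I know, API.trends_place). The response shape assumed here is a one-item list whose dict holds a 'trends' list, so check it against what your client actually returns.

```python
import time

US_WOEID = 23424977  # Yahoo! WOEID for the United States


def trend_names(response):
    """Extract trend names from a trends-by-place style response.

    Assumed shape: [{"trends": [{"name": "..."}, ...]}].
    """
    if not response:
        return []
    return [trend["name"] for trend in response[0].get("trends", [])]


def poll_trends(fetch, woeid=US_WOEID, interval=300, rounds=None, sleep=time.sleep):
    """Yield the current trend names, waiting `interval` seconds between fetches.

    `fetch` is any callable taking a WOEID and returning the raw response.
    `rounds=None` polls forever; a number stops after that many fetches.
    """
    done = 0
    while rounds is None or done < rounds:
        yield trend_names(fetch(woeid))
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
```

With a real client you would pass its trends call as fetch and loop over poll_trends(fetch), pointing the stream at each fresh list of names. Passing sleep in as a parameter just makes the loop easy to try without actually waiting 5 minutes.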

Now I have data, and I need something decently cool to do with it. So I decided to write a Markov model to generate the most likely thing someone posting on each trend would say. One thing I noticed is that most of the tweets I was grabbing were retweets, which aren't really good for my implementation: with so many identical copies in the training data, the model was overfitting and mostly just reproducing a retweet verbatim. They're pretty easy to remove by just looking for an RT at the start of the string. After re-running with retweets filtered out, I produce a tweet for each trend based on the Markov model.
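Here is a small sketch of the idea, assuming a first-order, word-level Markov chain. The function names and the START/END sentinel tokens are hypothetical, and this version samples the next word in proportion to how often it was seen rather than always taking the single most likely one, so treat it as one reasonable variant and not necessarily what my script does.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentinel tokens marking tweet boundaries


def is_retweet(text):
    """Old-style retweets begin with 'RT ' followed by the original tweet."""
    return text.lstrip().startswith("RT ")


def build_chain(tweets):
    """Map each word to the list of words observed right after it.

    Repeated followers stay in the list, so a uniform choice from it
    samples in proportion to observed frequency.
    """
    chain = defaultdict(list)
    for text in tweets:
        if is_retweet(text):
            continue  # skip retweets so duplicates don't dominate the model
        words = [START] + text.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng=random, max_words=30):
    """Walk the chain from START until END, a dead end, or max_words."""
    word, out = START, []
    while len(out) < max_words:
        followers = chain.get(word)
        if not followers:
            break
        word = rng.choice(followers)
        if word == END:
            break
        out.append(word)
    return " ".join(out)
```

In practice you would build one chain per trend from the tweets collected for it and call generate on each. The max_words cap is there because a chain with loops could otherwise wander on well past anything that looks like a tweet.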

I plan on playing with some NLP on top of the Twitter API to do some other mildly interesting things.

Update - February 8, 2015:

The results are starting to get a little weird. I decided to run it overnight, went to lab this morning and came back to check on it. Here are some of the most recent tweets generated:

Can't wait till Tuesday to get only 2 or no siblings.

HI MOM IM ON DRUGS #BlueMichaelIsBack

Two golf plays in the future? What would you describe your fans the way to be killed so I don't watch the GRAMMYS and is nominated!

Falcao not get a Grammy 😕..but my baby 😂 goodnight cameron sweet dreams loml 💖😇

Wait she said all the blood buddies #KillLaKill She's asleep already?

Gators got the giggles on FaceTime then you realize youre about to cry with me and some gifts on Valentine's Day? What is your favorite song?

And since all the tweets I am currently processing seem to be due to the #AskNick trend:

@nickjonas if you had one word #AskNick #GRAMMYs @nickjonas WILL YOU EVER NOTICE ME PLS ILYSM BABY YOU IS ALL THAT A CHANGE OF COLOR IN THE GRAMMY'S