Code project: create a Python Twitter bot
Once upon a time, there was a person who decided that people needed more distractions in their lives, so he created Twitter. This may not be exactly how they tell the story at Twitter HQ, but that's probably because it would create a less than glamorous image (oh, and it's also wildly inaccurate). After all, Twitter is pretty much constantly in the news. If you want to catch up with where in the world Stephen Fry is now, what everybody in North America had for lunch or precisely how smugly great Jonathan Ross thinks he is today, there's really only one place to turn.
Amazingly, Twitter can be put to good use as well. As it happens, Twitter's application programming interface (API) is particularly convoluted - it seems to have evolved piecemeal, with many different ways of doing various things. That needn't worry us, though, because there are plenty of API wrappers for Python. The one that's most suited for us is the standard Python-Twitter, which is available through most repositories and also at http://code.google.com/p/python-twitter.
OK, we're going to pause the tutorial here to give you an option. Have you ever heard of Identi.ca? It's, erm, identical to Twitter, more or less, except it runs on free software and it's covered by the GPL. The content is covered by the Creative Commons licence, meaning that it's altogether more winsome and lovely than Twitter. Honestly.
Still with us? Then here's how to make everything work in Identi.ca too. There's no specific Python module for the service, but since the API is the same as Twitter's, we can cheat just by changing the server connections in the twitter.py file. In fact, someone has already done it for us - check out the file at www.dilella.org/foo/twitter.py_new. All this really does is enable you to pass in a value for the server, so later on when you see:
client = twitter.Api(username="foo",password="bar")
...you can substitute the following:
client = twitter.Api(username="foo", password="bar", twitterserver="identi.ca")
If you want to use this enhanced library, just download the file from the link. It needs to replace the Twitter-related library already installed (for system consistency purposes, it's better to install the original Python-Twitter package first). Depending on your version of Linux and Python, this should be in /usr/lib/python2.5/site-packages. Just replace the twitter.py file with the new one.
Identi.ca is pretty much identical to Twitter, just without all the fake celebs, 'social marketing gurus' and bots. Well, until you get there, anyway.
In order to make any real use of the Twitter system, first we'll need to have an account. As with previous projects, you could write a script to set up the account for you, but it's far easier to just go to the webpage, register a new account and take down the details for you to use. You may also want to add some friends while you're there (otherwise things could get a bit dull), or you could just use an existing account.
For the purposes of this tutorial, I've set up a profile for evilbotx and decided to follow the delicious tweets of the very lovely Britney Spears. Now fire up Python in a shell (by typing python) and we're ready to do some microblogging:
>>> import twitter
>>> client = twitter.Api(username="evilbotx", password="mypassword")
>>> client.PostUpdate("Hello World!")
<twitter.Status object at 0xb7c2f44c>
The above completes the ritual self-announcing of our application. First, we created an object called client, which connected to Twitter's server and authenticated itself, then the next line used object methods to post a status update.
If you were building a straightforward autonomous system, that's probably all you'd need to know - you can plug this functionality into another script and send out tweets whenever you like. However, we want to add more to this basic framework. The next thing to do is get a list of those people we've decided to follow on the service and retrieve their statuses too. The process isn't too complicated, because there are methods available for most functions:
>>> userlist = client.GetFriends()
>>> for username in userlist:
...     print username.screen_name, username.status.text
...
evilnick @tweeny4 it's hard to beat a poached egg
serenajwilliams @celebsdontreply. Of course, I reply.
britneyspears The Circus is coming back to the states -Britney
What we can see from this code is that the GetFriends() method returns a list of user objects. User is a class in the Twitter module with various bits and pieces attached to it, such as a user's bio, screen name and so on. All this is fetched from Twitter when the objects are created and can be accessed by us. Some useful properties include:
- user.id A unique ID number associated with the user on the Twitter service.
- user.name The user's real name. *
- user.screen_name Their Twitter username.
- user.description The short bio entered by the user. *
- user.profile_image_url A link to a user's profile picture.
- user.url A URL this user entered, often their homepage. *
- user.status The latest status object for this user.
* These properties may be null if the user hasn't supplied them to the service.
We could use these programmatically if we wanted to, fetching a graphic for a graphical Twitter client app, for example, or for grouping users by interest.
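As a rough illustration of coping with those optional properties, here's a minimal sketch. FakeUser is a hypothetical stand-in for the library's User objects (with made-up sample data) so that the example runs without a Twitter account, and describe_user is an invented helper name. It's written for Python 3, unlike the Python 2 listings in this tutorial.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's User class; only some of the
# properties listed above are modelled, and the data is made up.
FakeUser = namedtuple("FakeUser", "id name screen_name description url")


def describe_user(user):
    """Build a one-line summary, falling back when optional fields are None."""
    name = user.name or user.screen_name      # real name may be missing
    bio = user.description or "(no bio)"      # bio may be missing
    link = user.url or "(no homepage)"        # URL may be missing
    return "%s (@%s): %s - %s" % (name, user.screen_name, bio, link)


users = [
    FakeUser(1, "Evil Nick", "evilnick", "Likes poached eggs", None),
    FakeUser(2, None, "evilbotx", None, None),
]
for user in users:
    print(describe_user(user))
```

The same fallback approach should carry over to real User objects, assuming the starred properties come back as None when they haven't been supplied.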
How cool would it be to have an audio based Twitter client? Instead of having to look at a silly screen and glance away from the jolly important code you were writing, you could simply open your ears and have status updates read out to you. There are a few text-to-speech utilities available and there's even a speech dispatcher system for Linux.
You may have Festival or Espeak already installed as part of your distro, but if not, the packages are easily available from your usual repository. We're going to use Espeak for this, but it doesn't matter if you want to use something else, because the code is almost identical. We won't bother with complicated modules for this simple job, either - we'll just use our old favourite, the subprocess module. This, as you may remember, enables you to call shell executables from within Python. We'll use the call method here, which just takes a list containing the arguments you want to use. A simple example is:
>>> import subprocess
>>> subprocess.call(['espeak', '"Hello World!"'])
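If you'd rather not type the call out each time, it can be wrapped in a small helper that first checks whether the speech program is installed. This is only a sketch: build_command and speak are hypothetical names, and shutil.which needs Python 3.3 or later, so it won't run on the Python 2 used elsewhere in this tutorial.

```python
import shutil
import subprocess


def build_command(text, program="espeak"):
    """Return the argument list that gets handed to subprocess.call."""
    # Each list item is passed to the program as-is; no shell is involved.
    return [program, text]


def speak(text, program="espeak"):
    """Speak the text aloud; return False if the program isn't installed."""
    if shutil.which(program) is None:
        return False
    # A return code of 0 means the program exited normally.
    return subprocess.call(build_command(text, program)) == 0


print(build_command("Hello World!"))
```

Passing program="festival" or similar would only work if that program accepts the text as a plain argument, which is an assumption you'd need to check for your chosen utility.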
I spit your tweaks
You should be greeted with a friendly (if you like synthetic tones) voice. If you get a syntax error, take a close look at all the quote marks you've entered. The last element in the list here is a text string enclosed in double quotes, which is itself enclosed in single quotes. It's roughly the same as typing espeak "Hello World!" on the command line, although because call passes each list item straight to the program without involving a shell, the inner double quotes aren't strictly needed. So, a functional client would look something like this:
import twitter, subprocess, time
client = twitter.Api("evilbotx", "evilbot")
while True:
    list = client.GetFriends()
    for name in list:
        print name.screen_name, name.status.text, name.status.id
        texty = name.screen_name + name.status.text
        time.sleep(2)
        subprocess.call(["espeak", texty])
    time.sleep(60)
In the above, we connect, enter an endless loop and get the list of friends. A further loop processes the statuses and prints them, converts the information we want into a string and then uses subprocess.call to dump them out of your audio channels. There's also a delay at the end instigated by time.sleep(60), which prevents us from upsetting the server by hitting it too often.
You may ask yourself why we've decided to fetch the list of friends from within the main loop, though. Well, that's because it simplifies things for two reasons. Firstly, it automatically populates all the User objects in the list with their latest statuses. If we loaded the friends list just once, we would still have to iterate through it each loop to get the statuses, which would be messy and (don't quote me on this, because I have no evidence other than a sneaking suspicion) would probably result in more client-server communication overhead. The second reason is that we can safely run this and still run another Twitter client, or visit the web. Any changes you make to your list of friends will automatically be picked up by this script.
At this point the code does work, but there's a problem with it - statuses will be read out whether or not they've been updated in the time period. What we need to do is check the time the status message was created and compare that with the time now. If the message was created less than 60 seconds ago (or 61, say, to give some time for the rest of the code to work), then we should say it out loud. Unfortunately, time is relative. Python's time module can give you a representation of the current epoch time (seconds since the beginning of the universe - according to Unix that was midnight on 1 January 1970), but Twitter reports the time the status message was created in a text format.
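Once both times are expressed in seconds since the epoch, the freshness test described here comes down to comparing two numbers. A minimal sketch, with is_recent as a hypothetical helper name and the 61-second window taken from the text:

```python
import time


def is_recent(msg_epoch, now_epoch, window=61):
    """True if the message was created less than `window` seconds before now."""
    return msg_epoch + window > now_epoch


now = time.time()                   # current time in seconds since the epoch
print(is_recent(now - 30, now))     # a status posted 30 seconds ago
print(is_recent(now - 300, now))    # a status posted five minutes ago
```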
To compare the two, we need to do some wrangling. It appears that the Python-Twitter module makes an erroneous assumption in translation, because its methods for determining the time the status was retrieved and the time it was posted at seem to be adrift. This does complicate things, but it isn't impossible to get around. The Twitter API returns dates and times as text strings in the format Mon Jun 8 11:46:34 +0000 2009.
This is fine, because Python has a way of converting string formats back into its normal numeric format and then into a value in seconds since epoch. The only slight problem is that the Twitter date format doesn't name a time zone, just a numeric offset of +0000. However, we can see from experimenting that the time is, as you might expect, in UTC (or GMT if you remember that Britain invented time).
Thus we can simply add UTC to the end of the string and have Python parse it into a more convenient numeric format. The status object preserves the time direct from Twitter as a property called created_at, so we can use that to bypass the internals that are causing issues with other properties. The time.strptime function processes a given string into a set of numeric values in a standard form.
To achieve this, you have to pass it the string, and a string describing the format. This second string contains directives or describers according to a list of values accepted by the module. For us, these are - %a: the abbreviated day name, %b: the abbreviated month name, %d: the day of the month, %H: numeric hours, %M: numeric minutes, %S: numeric seconds, %Y: numeric year, and %Z : a three-character string representing the time zone.
What time is tweet?
As you can see, we added the last value ourselves so it can be picked up when we process the time in Python. The internal numeric format used by Python simply specifies all this type of data as numbers, which you can see by manually processing a string:
>>> time.strptime('Mon Jun 8 10:51:32 +0000 2009 UTC', '%a %b %d %H:%M:%S +0000 %Y %Z')
(2009, 6, 8, 10, 51, 32, 0, 159, 0)
These numbers are, respectively, the year, the month number, the day number, hours, minutes, seconds, day of the week (Monday is 0), day of the year and a flag for daylight savings. That's why it's crucial to add the time zone, because Python tries to resolve daylight savings itself if it isn't given, which can cause some odd platform-specific behaviour and uncertain results.
This time can then be converted again into the common Unix seconds since epoch format using time.mktime(). You can find out plenty more about the time module and its various methods by pointing your browser at http://docs.python.org/library/time.html.
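The parsing and conversion steps can be tried out on their own. The sketch below is a variation rather than the approach used in this tutorial: it swaps time.mktime() for calendar.timegm(), because mktime() interprets the values as local time whereas timegm() treats them as UTC. The name twitter_time_to_epoch is hypothetical, and the code is written for Python 3.

```python
import calendar
import time

# The +0000 offset is matched literally; %Z picks up the appended zone name.
TWITTER_FORMAT = "%a %b %d %H:%M:%S +0000 %Y %Z"


def twitter_time_to_epoch(created_at):
    """Convert a Twitter-style timestamp into seconds since the epoch."""
    # Append the zone name, as described above, so %Z has something to match.
    parsed = time.strptime(created_at + " UTC", TWITTER_FORMAT)
    # calendar.timegm treats the values as UTC, whatever the local zone is.
    return calendar.timegm(parsed)


stamp = "Mon Jun 8 10:51:32 +0000 2009"
epoch = twitter_time_to_epoch(stamp)
print(epoch)
print(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(epoch)))
```

If you go this way, the current time to compare against would be plain time.time(), which is already in seconds since the epoch.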
So, for now our adjusted code looks like this:
import twitter, subprocess, time
client = twitter.Api("evilbotx", "password")
while True:
    list = client.GetFriends()
    for name in list:
        texty = name.screen_name + name.status.text
        now = time.mktime(time.gmtime())
        stringmsgtime = name.status.created_at + ' UTC'
        msgtime = time.mktime(time.strptime(stringmsgtime,
            '%a %b %d %H:%M:%S +0000 %Y %Z'))
        if ((msgtime + 61) > now):
            subprocess.call(["espeak", texty])
    time.sleep(60)
So, now you have a functioning audio Twitter client in 13 lines of code. Not bad - the only slight problem with this is that if you follow more than a few dozen active people, you'll probably never get the script to shut up. If this is the case, you may want to go down the route of providing a list of specific screen names to follow. We'll then need to get the status of each member on the list, but only a few lines of code need to be changed:
import twitter, subprocess, time
client = twitter.Api("evilbotx", "password")
list = ['evilnick', 'evilbottester', 'tweeny4']
while True:
    for item in list:
        name = client.GetUser(item)
        texty = name.screen_name + name.status.text
        now = time.mktime(time.gmtime())
        stringmsgtime = name.status.created_at + ' UTC'
        msgtime = time.mktime(time.strptime(stringmsgtime,
            '%a %b %d %H:%M:%S +0000 %Y %Z'))
        if ((msgtime + 61) > now):
            subprocess.call(["espeak", texty])
    time.sleep(60)
In this version of our script, the inner loop iterates through the list and calls the GetUser() method for each screen name. This is returned as a User object with properties, which include the most recent status. Now you'll get a verbal alert only when your favourites post a new status message.
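To experiment with the watch-list logic without hitting the server, the filtering can be pulled out into a function and fed a stand-in client. Everything here is a sketch under assumptions: FakeClient, Status, User and fresh_updates are hypothetical names with made-up data, only the GetUser() call is mimicked, calendar.timegm() is used instead of time.mktime(), and the code targets Python 3.

```python
import calendar
import time
from collections import namedtuple

# Hypothetical stand-ins for the library's Status and User objects.
Status = namedtuple("Status", "text created_at")
User = namedtuple("User", "screen_name status")


class FakeClient(object):
    """Mimics only the GetUser() call used in the listing above."""

    def __init__(self, users):
        self._users = dict((user.screen_name, user) for user in users)

    def GetUser(self, screen_name):
        return self._users[screen_name]


def fresh_updates(client, screen_names, now, window=61):
    """Return the text to speak for each watched user with a new status."""
    fmt = "%a %b %d %H:%M:%S +0000 %Y %Z"
    updates = []
    for screen_name in screen_names:
        user = client.GetUser(screen_name)
        parsed = time.strptime(user.status.created_at + " UTC", fmt)
        if calendar.timegm(parsed) + window > now:
            # A space is added between name and text so they are read apart.
            updates.append(user.screen_name + " " + user.status.text)
    return updates


client = FakeClient([
    User("evilnick", Status("a fresh tweet", "Mon Jun 8 10:51:32 +0000 2009")),
    User("tweeny4", Status("an old tweet", "Mon Jun 8 09:00:00 +0000 2009")),
])
# Pretend "now" is 10:52:00 UTC, 28 seconds after the first status.
now = calendar.timegm((2009, 6, 8, 10, 52, 0, 0, 0, 0))
print(fresh_updates(client, ["evilnick", "tweeny4"], now))
```

Each string returned could then be handed to subprocess.call(["espeak", ...]) as in the main script.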
A full explanation of the Python Twitter module can be found at http://static.unto.net/python-twitter/0.5/doc/twitter.html.
If you want to go further, a useful addition to our script might be a GUI for quickly updating your own status. Using PyQt, wxWidgets or your favourite GUI toolkit, all you'd need to do is create a text input (limited to 140 characters) and connect a method to post status updates when Return is pressed. Alternatively, you could use your knowledge of Twitter to attach tweetability to anything you like - your servers could tweet their load averages and free disk space, or you could make Amarok tweet your music tracks as you played them. It's up to you!
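As a starting point for the server idea, here's a minimal sketch of building such a status message. The server_status name and the message wording are invented for illustration, shutil.disk_usage needs Python 3.3 or later, and os.getloadavg is only available on Unix-like systems, hence the fallback.

```python
import os
import shutil

LIMIT = 140  # the status length limit mentioned above


def server_status(path="/"):
    """Build a short status line with load averages and free disk space."""
    try:
        load = "%.2f %.2f %.2f" % os.getloadavg()
    except (AttributeError, OSError):
        load = "unavailable"    # e.g. on platforms without getloadavg
    free_gb = shutil.disk_usage(path).free / float(1024 ** 3)
    message = "load: %s, free on %s: %.1f GB" % (load, path, free_gb)
    return message[:LIMIT]      # trim so it fits in one status update


print(server_status())
```

The resulting string could then be passed to client.PostUpdate() in the same way as the "Hello World!" message earlier.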
First published in Linux Format magazine