The Kenyan Twittelection
While most Kenyans were awaiting the Presidential results this past week, our data analysts here at Odipo Dev were busy trying to make sense of it all.
The first step in this Twitter project was to collect and archive the Twitter data coming out of Kenya during the period. For this, we wrote a PHP script and ran it as a cron job. The script used the Twitter search API to find and filter tweets based on relevant hashtags, and dumped them into our own MySQL database. The cron job ran every 10 minutes for 30 days, collecting over 250,000 tweets during this time period.
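The filter-and-store step can be sketched roughly as follows. This is not the original PHP script: it is a minimal Python illustration that assumes tweets arrive as dictionaries with `id`, `created_at` and `text` keys, uses SQLite as a stand-in for MySQL, and uses placeholder hashtags, since the real list is not given here. Keying the table on the tweet id is also an assumption; it is one way to make overlapping 10-minute runs avoid storing the same tweet twice.

```python
import re
import sqlite3

# Placeholder hashtags for illustration only; not the project's actual list.
HASHTAGS = {"#kenya", "#election"}


def extract_hashtags(text):
    """Return the set of lowercased hashtags found in a tweet's text."""
    return {tag.lower() for tag in re.findall(r"#\w+", text)}


def store_matching_tweets(conn, tweets, hashtags=HASHTAGS):
    """Store tweets that carry a tracked hashtag; return how many were new."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, created_at TEXT, text TEXT)"
    )
    before = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    for tweet in tweets:
        if extract_hashtags(tweet["text"]) & hashtags:
            # The primary key on id means a repeated tweet is skipped.
            conn.execute(
                "INSERT OR IGNORE INTO tweets (id, created_at, text) "
                "VALUES (?, ?, ?)",
                (tweet["id"], tweet["created_at"], tweet["text"]),
            )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    return after - before
```

Each scheduled run would call `store_matching_tweets` with the latest batch of search results; running it twice on the same batch adds nothing the second time.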
Once the Twitter data was safely in our MySQL database, we queried it to generate 30 separate text files, one for each day of the period. Each “day” file consisted of just the tweet text from the thousands of tweets that belonged to that day (on average there were about 7,500 tweets per day).
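As a rough illustration of the day-file step, the sketch below groups rows by calendar day and writes one text file per day containing only the tweet text. It assumes each row is a `(created_at, text)` pair whose timestamp starts with a `YYYY-MM-DD` date; the function name and file naming scheme are hypothetical, not taken from the original pipeline.

```python
from collections import defaultdict
from pathlib import Path


def write_day_files(rows, out_dir):
    """Write one text file per day; return a {day: tweet_count} mapping."""
    by_day = defaultdict(list)
    for created_at, text in rows:
        day = created_at[:10]  # assumes a leading YYYY-MM-DD date
        # Collapse internal whitespace so each tweet occupies one line.
        by_day[day].append(" ".join(text.split()))

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    counts = {}
    for day, texts in sorted(by_day.items()):
        day_file = out_path / f"{day}.txt"
        day_file.write_text("\n".join(texts) + "\n", encoding="utf-8")
        counts[day] = len(texts)
    return counts
```

The returned counts give a quick check that every day of the collection period produced a file and that the daily volumes look plausible.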
We then arranged the data into a workable format and, using an emotions dictionary, analyzed the sentiment levels of the tweets.
Our major problem came with tweets containing Swahili words, which a standard emotions dictionary does not cover.
We therefore categorized most of these words into different emotions ourselves and, using WordSmith, ran the word lists against the day files to get the emotions for each day.
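The idea of matching a categorized word list against a day file can be sketched in a few lines. This is not how WordSmith works internally and not the dictionary we used: the lexicon below is a tiny hypothetical sample mixing English and Swahili words, and the matching is a plain whole-word lookup.

```python
import re
from collections import Counter

# Tiny illustrative lexicon (word -> emotion); not the project's dictionary.
EMOTION_LEXICON = {
    "happy": "joy",
    "furaha": "joy",    # Swahili: joy, happiness
    "angry": "anger",
    "hasira": "anger",  # Swahili: anger
    "afraid": "fear",
    "hofu": "fear",     # Swahili: fear
}


def emotion_counts(text, lexicon=EMOTION_LEXICON):
    """Count how many words in the text fall under each emotion."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(lexicon[word] for word in words if word in lexicon)
```

Calling `emotion_counts` on the full contents of a day file gives one emotion tally per day, which can then be compared across the 30 days.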
After hours of deliberation and work, we finally managed to put together an analysis of the election in relation to KOT (Kenyans on Twitter).