image.png

Visualizing Áras Election

(image from Wikipedia)

The Irish presidential election is just about a week away. As a non-citizen resident of Ireland, I can’t vote in this election (only in local elections). But I still find it interesting, so I took a look at some social media data on the topic to make some visualizations. These are not meant to be predictions; it’s just a bit of fun to see what people in Ireland are thinking today.

I am using a tool called ScraperWiki that I learned about earlier this year at a Hacks and Hackers Day in Dublin. ScraperWiki lets you scrape data from various sources such as a PDF or, in this case, Twitter. My scraper grabs any tweet originating from Ireland that mentions aras, aras11 or president.

 

JUST THE WORDS

We can use a tool like IBM’s Many Eyes to visualize the most frequently used words in these tweets. The visualization below, embedded from Many Eyes, shows that Norris and Gallagher are probably the two most discussed candidates in these tweets. You can right-click on the visualization to alter it, remove certain words (I removed things like “RT”, “QUOT” and “ARAS11” as they weren’t relevant), change colours, etc.

*NOTE: Many Eyes is a Java tool, so you will need Java to interact with the data. If you can’t view the visualizations, please scroll to the bottom where I have screenshots of the data instead*
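
If you would rather count the words yourself, here is a minimal Python sketch of the same idea. It is not how Many Eyes works internally; the function name, the tokenizing rule and the sample tweets are all made up for illustration. Only the removed words (RT, QUOT, ARAS11) come from what I did above.

```python
import re
from collections import Counter

# Tokens dropped before counting: the same ones removed from the word cloud.
NOISE = {"rt", "quot", "aras11"}


def word_frequencies(tweets, noise=NOISE):
    """Count how often each word appears across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Lowercase, then treat runs of letters, digits and apostrophes as words.
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in noise)
    return counts


# Made-up sample tweets, for illustration only (not real data).
sample_tweets = [
    "RT @someone: Norris on the radio now #aras11",
    "Gallagher and Norris in the debate tonight #aras11",
]

freqs = word_frequencies(sample_tweets)
print(freqs.most_common(3))
```

A word cloud is essentially this table of counts drawn with font sizes in proportion to each count.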

 

WHAT ABOUT ASSOCIATIONS?

More interesting than the individual words themselves, to me, are the associations they have. In other words, is one candidate’s name mentioned frequently in the context of certain other words or phrases?

The Customized Word Tree, another tool from Many Eyes, allows you to upload a text and then enter a specific word to find other terms associated with it. To use this interactive tool, simply type a name like “Gallagher”, “Norris”, “Dana”, etc. into the Search textbox and hit return. You’ll see a visualization of the words and phrases most frequently associated with that candidate.

*NOTE: Many Eyes is a Java tool, so you will need Java to interact with the data. If you can’t view the visualizations, please scroll to the bottom where I have screenshots of the data instead*
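
For the curious, the core of a word tree can be approximated in a few lines of Python: find each occurrence of a search term and count the short phrases that follow it. This is only a rough sketch under my own assumptions, not the Many Eyes implementation. The function name, the phrase length of two words and the sample tweets are hypothetical.

```python
from collections import Counter


def following_phrases(tweets, term, length=2):
    """Count the short phrases that come right after `term` in each tweet."""
    term = term.lower()
    phrases = Counter()
    for text in tweets:
        words = text.lower().split()
        for i, word in enumerate(words):
            # Strip surrounding punctuation so "Norris," still matches "norris".
            if word.strip(".,!?:;#@\"'") == term:
                phrase = " ".join(words[i + 1 : i + 1 + length])
                if phrase:
                    phrases[phrase] += 1
    return phrases


# Made-up sample tweets, for illustration only (not real data).
sample_tweets = [
    "Norris is on the radio",
    "Heard that Norris is on tv",
    "Gallagher is on the radio",
]

branches = following_phrases(sample_tweets, "Norris")
print(branches.most_common())
```

Each distinct phrase is one branch of the tree, and its count decides how prominently it is drawn.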

 

SENTIMENT

Does anyone care about sentiment analysis anymore? Sentiment analysis tries to gauge the general feeling, positive or negative, of a group of people toward a given topic. So if you did sentiment analysis on Twitter for the term “taxes”, you’d probably find that most people associate it with negative feelings and frown emoticons, giving an overall negative sentiment. Unless, of course, the government had announced huge tax refunds for everyone, in which case it would likely be overwhelmingly positive.
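
To make the idea concrete, here is a deliberately naive sketch of word-list sentiment scoring in Python. It is an assumption of mine about the simplest possible approach, not a description of how any particular tool works. The word lists, function names and example sentences are all invented for illustration.

```python
import re

# Tiny hand-picked word lists, purely illustrative; real tools use far larger ones.
POSITIVE = {"good", "great", "love", "happy", ":)"}
NEGATIVE = {"bad", "awful", "hate", "angry", ":("}


def sentiment_score(text):
    """Return (number of positive tokens) minus (number of negative tokens)."""
    # Tokens are lowercase words plus the two emoticons :) and :(
    tokens = re.findall(r"[a-z']+|:\)|:\(", text.lower())
    positive = sum(token in POSITIVE for token in tokens)
    negative = sum(token in NEGATIVE for token in tokens)
    return positive - negative


def label(text):
    """Turn the numeric score into a coarse positive/negative/neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("I hate taxes :("))
print(label("Great news, tax refunds for everyone :)"))
```

An approach this simple misses sarcasm, negation ("not good") and context, which is part of why results from sentiment tools can be underwhelming.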

twitrratr is an example of a tool that does sentiment analysis given a topic. It’s as simple to use as Twitter search, but the results in this case aren’t incredibly useful.  You can check the sentiment yourself easily by clicking here: http://twitrratr.com/search/aras11.

twitrratr

 

HOW TO CREATE YOUR OWN

ScraperWiki is great because you can use a variety of programming languages and it has support for lots of different sources, including PDFs, which are notoriously hard to parse.

I forked a basic Twitter scraper that looks for tweets containing keywords. You can see my scraper here: https://scraperwiki.com/scrapers/aras_election_data/. The Twitter search API supports search operators such as OR, so I edited the keyword to be ‘aras OR aras11 OR president’. Searching for president could definitely bring up irrelevant tweets for this purpose, so I also added a geolocation query. The Twitter search API lets you pass a latitude and longitude followed by a radius to find tweets in a particular area. I added some very simple Python code to the scraper to allow it to handle geolocation queries.
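
Here is a rough sketch of how a query like that can be put together in Python. This is not the code from my scraper. The function name is hypothetical, the endpoint is the search URL as I understand it at the time of writing (an assumption, and it may change), and the coordinates and radius are just a rough guess at a circle covering Ireland. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed search endpoint, for illustration only; check the current API docs.
SEARCH_URL = "http://search.twitter.com/search.json"


def build_search_url(keywords, lat, lon, radius):
    """Build a search URL that ORs the keywords and limits results by area."""
    params = {
        # Keywords joined with the OR search operator.
        "q": " OR ".join(keywords),
        # geocode takes "latitude,longitude,radius", with the radius in km or mi.
        "geocode": f"{lat},{lon},{radius}",
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


# Rough centre of Ireland and a generous radius (both assumptions).
url = build_search_url(["aras", "aras11", "president"], 53.4, -7.9, "300km")
print(url)
```

The geocode filter is what keeps a broad keyword like president from pulling in tweets from everywhere else.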

As you’re developing your scraper, you can run it every time you change something to make sure you are getting the results you expect. Once you’ve finished, you can schedule it to run daily, weekly, etc. If you get stuck, the ScraperWiki community is a great group of people; they have a very active Google Group and growing documentation.

Once you have the data you need, you can export it as a SQLite database or a CSV file. There are plenty of tools you can use with this data. Many Eyes is a good one to start with as it’s very user friendly. If you’re into programming, there are many good JavaScript libraries and other tools you can use to manipulate the data.  Just search online for things like “data visualization tools.”
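
As a starting point for working with the CSV export in Python, here is a small sketch that counts how many tweets mention each candidate. The column name `text` is an assumption about the export layout, the function name is hypothetical, and the sample rows are invented so the example runs without a real file.

```python
import csv
import io
from collections import Counter

# Candidate names used as search examples earlier in this post.
CANDIDATES = ["Gallagher", "Norris", "Dana"]


def count_mentions(csv_file, text_column="text"):
    """Count the tweets that mention each candidate (case-insensitive)."""
    mentions = Counter()
    for row in csv.DictReader(csv_file):
        text = row[text_column].lower()
        for name in CANDIDATES:
            if name.lower() in text:
                mentions[name] += 1
    return mentions


# Invented rows standing in for an exported file (not real data).
sample_csv = io.StringIO(
    "id,text\n"
    "1,Norris on the radio #aras11\n"
    "2,Gallagher and Norris debate tonight\n"
    "3,Who will be president?\n"
)

mentions = count_mentions(sample_csv)
print(mentions.most_common())
```

With a real export you would pass an opened file instead, for example `open("tweets.csv", newline="", encoding="utf-8")`, where `tweets.csv` is a placeholder name.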

 

MORE ON THIS

My scraper runs once a day, so I’ll be updating the interactive charts daily from now until October 27th, when the election is held. If there’s other information related to the candidates or the upcoming election that you think would be useful or interesting to look at, please leave a comment and I’ll take a look.

 

SCREENSHOTS FOR THOSE WHO CAN’T VIEW MANY EYES

Word Analysis:

Many Eyes Word Cloud

 

Candidate Association Examples:

Many Eyes Word Tree

Many Eyes Word Tree

Many Eyes Word Tree

3 Comments

  1. Hi Martha,
    That’s very interesting. Wouldn’t it be interesting to see what the data looks like relative to media sources? I.e. does ‘RTE’ reflect broadly negative sentiment or conversations and, for example, does ‘MRBI’ reflect positive sentiment? Taking it a step further, how about individual presenters… ‘Marian’, ‘Vincent’, ‘George’, ‘Sean’ etc.

    Just a thought.
    Cheers,
    Greg

  2. Hi Greg,

    Thanks very much for your comment and for taking a look. I agree, I think it would be really interesting to take a look at how the media outlets are perceiving the individual candidates. Great idea. I’ll take a look this week and see if I can put together some sentiment data from news sites.

    Thanks again Greg.
