Australians’ reaction to the election of Donald Trump as the next president of the United States was more “positive” than “negative”, according to a sentiment analysis of tweets posted at the time.
We analysed only tweets sent on November 10, 2016 (just after the result of the US election) that included the word “Trump” and came from an Australian capital city.
This yielded 32,908 tweets, including retweets. For the purpose of this analysis we classified the sentiment of each tweet as positive, negative or neutral.
The figures (above) display the sentiment for each capital city and show that in Sydney, Brisbane, Canberra and Hobart there were more positive tweets about Trump. In Darwin, Adelaide, Melbourne and Perth there were more negative tweets.
But counted overall, 48.63% of the tweets were considered positive compared to 44.65% negative and 6.72% neutral.
More detail in the tweets
To try to get a better understanding of the divided sentiment, individual tweets were investigated and it soon became apparent that the sentiment analysis had difficulty identifying sarcasm and humour.
For example, tweets that included “LOL” (laughing out loud) were interpreted as expressing a strong positive sentiment when in many cases they did not.
Another tweet, “Dear Harvard Business School: don’t normalize Trump’s rise to power. Fascism is not a “marketing strategy”. @HarvardHBS @HBSWK”, from Melbourne’s @creatrixtiara (above), was classified as a request rather than a sentiment.
The system used could not determine the sentiment of a tweet by Brisbane’s @JamesPinnell: “The only thing that gives me a tiny inkling of hope is that Trump’s kids seem to actually be fairly bright and also in his ear a lot.”
It’s the popular words that count
We performed an analysis of the most popular words and terms used in the positive and negative sentiment tweets, in order to get a better insight into the intended sentiments.
The analysis shows the disparity of views and sentiments which have characterised this election.
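A word-frequency count of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the study: the sample tweets are invented, and the stop-word list and the `top_terms` helper are hypothetical.

```python
from collections import Counter
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "is", "to", "of", "and", "in", "for", "he", "this"}

def top_terms(tweets, n=3):
    """Return the n most frequent non-stop-word terms across the given tweets."""
    counts = Counter()
    for text in tweets:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample tweets, already labelled by sentiment.
labelled = {
    "positive": ["Trump won, great night", "Great win for Trump"],
    "negative": ["Trump won, sad day", "Sad and worried about Trump"],
}

for sentiment, tweets in labelled.items():
    print(sentiment, top_terms(tweets))
```

Counting terms separately for each sentiment group is what makes the contrast between the two sets of tweets visible.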
The tools of the study
This study was made possible by recent advances in business analytic tools, with Victoria University’s Business Analytics and Big Data Lab working in partnership with SAP.
Traditionally, business analytic tools have focused on structured data to gain insight and support decision making. This type of data is held in databases and spreadsheets and is organised as records made up of fields. Structured data has the advantage of being easily entered, stored, queried and analysed.
But much of the data that is contained in social media, including tweets, is referred to as unstructured data. It doesn’t reside in fields and record structures and so it’s difficult to analyse using traditional methods.
Text Analysis is the process of analysing unstructured text to extract relevant information and then transforming that information into a structured format for analysis.
Text Analysis uses Natural Language Processing to linguistically understand the text and apply statistical techniques to facilitate the analyses.
SAP’s HANA database platform can search, analyse and mine text. It allowed us to perform a traditional exact string search for a phrase such as “Trump is wonderful”, or a fuzzy (Google-like) search in which text can be found irrespective of the sequence of words.
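The difference between the two kinds of search can be illustrated with a short Python sketch. It is not SAP HANA’s implementation: `exact_match` and `any_order_match` are hypothetical helpers, the sample tweet is invented, and a real fuzzy search would typically also tolerate misspellings and word variants.

```python
import re

def exact_match(text, phrase):
    """True if the phrase appears verbatim (ignoring case) in the text."""
    return phrase.lower() in text.lower()

def any_order_match(text, phrase):
    """True if every word of the phrase appears in the text, in any order."""
    text_words = set(re.findall(r"\w+", text.lower()))
    return all(word in text_words for word in phrase.lower().split())

tweet = "Wonderful news, Trump is the winner"  # invented example
print(exact_match(tweet, "Trump is wonderful"))      # False
print(any_order_match(tweet, "Trump is wonderful"))  # True
```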
It also allowed us to give meaning to the text through tokenisation, stemming and the identification of entities.
For example, in the text “Trump wins Florida in 2016”, SAP HANA would identify the entities of Trump as a Person, Florida as a State and 2016 as a Year.
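As a rough illustration of these steps, the Python sketch below tokenises the example sentence, applies a crude stemmer and labels entities from a small lookup table. Everything in it is an assumption made for illustration: the suffix rules, the `ENTITY_TYPES` table and the function names are hypothetical, and a production system relies on far richer dictionaries and linguistic rules.

```python
import re

def tokenise(text):
    """Split text into lower-case word and number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crude_stem(token):
    """Strip a few common English suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# Hypothetical lookup table, for illustration only.
ENTITY_TYPES = {"trump": "Person", "florida": "State"}

def tag_entities(tokens):
    """Label tokens found in the lookup table, and four-digit numbers as years."""
    tagged = {}
    for token in tokens:
        if token in ENTITY_TYPES:
            tagged[token] = ENTITY_TYPES[token]
        elif re.fullmatch(r"\d{4}", token):
            tagged[token] = "Year"
    return tagged

tokens = tokenise("Trump wins Florida in 2016")
print(tokens)                           # ['trump', 'wins', 'florida', 'in', '2016']
print([crude_stem(t) for t in tokens])  # ['trump', 'win', 'florida', 'in', '2016']
print(tag_entities(tokens))
```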
This form of analysis can be further enhanced through the use of fact extraction. This is where rules are used as a basis to determine relationships between the identified entities.
Finding the positives and negatives
The most common form of fact extraction is sentiment analysis. A statement like “I love Trump” would be identified as a strong positive sentiment in relation to Trump.
The strength of a statement (either strong or weak) can be identified in addition to its sentiment (positive, negative or neutral).
A number of pre-defined rules exist in SAP HANA to facilitate sentiment analysis but these can be further extended through customisation of keyword dictionaries depending on the scenario.
For example, the word “Trump” can be restricted to refer only to a person rather than an action (for instance, playing a trump card). The sentiment analysis can be applied in ten different languages and can also identify requests, emoticons and profanity.
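A dictionary-driven classifier of this general kind can be sketched as follows. This is a simplified illustration rather than SAP HANA’s rule set: the `LEXICON` entries and the `classify` function are hypothetical, and only the first matching keyword is considered. The sketch also shows why “LOL” can mislead such a system, and how editing the dictionary changes the result.

```python
import re

# Hypothetical keyword dictionary: word -> (sentiment, strength).
LEXICON = {
    "love": ("positive", "strong"),
    "like": ("positive", "weak"),
    "hate": ("negative", "strong"),
    "dislike": ("negative", "weak"),
    "lol": ("positive", "strong"),
}

def classify(text, lexicon=LEXICON):
    """Return (sentiment, strength) for the first lexicon word found, else neutral."""
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in lexicon:
            return lexicon[word]
    return ("neutral", None)

print(classify("I love Trump"))    # ('positive', 'strong')
print(classify("Trump won"))       # ('neutral', None)
print(classify("LOL, Trump won"))  # ('positive', 'strong'), even if meant sarcastically

# Customising the dictionary: drop "lol" so it no longer counts as positive.
custom = dict(LEXICON)
del custom["lol"]
print(classify("LOL, Trump won", custom))  # ('neutral', None)
```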
Clearly, any analysis needs to be treated with caution with regard to sarcasm, humour and other possible variations. As the tools improve, these issues will hopefully be addressed.
But the research provided an example of how natural language processing can be applied to social media to gain insight into the sentiment of a specific population with regard to an event, and of the technique’s potential limitations.
We were also mindful that we were looking at tweets in Australia of an event that was happening elsewhere, in the US. We look forward to analysing the next election, possibly in Australia.
It would also be interesting to see how people react once Trump is installed as the 45th US president. He has promised to continue using his personal Twitter handle @realDonaldTrump instead of @POTUS, used by the outgoing 44th president, Barack Obama.