Can we analyze emotions with Artificial Intelligence?

by Salvatore Sorrentino | 1 Mar 2021 | Blog

In recent years, thanks to the Internet, our opportunities to express opinions and make judgments have increased exponentially, in the form of both ‘thumbs up’ and longer reviews. Unlike cocktail hour banter, however, our comments do not float away but remain in digital storage forever. Is it possible to extract information, or even better, something of value, from this mass of data?

A few days ago a politician posted a farewell speech on a well-known social network, and over one hundred thousand users commented on it. Or think of a YouTube video by a pop group that can rack up hundreds of millions of views, but also millions of comments. It is inconceivable that a person, or even a team, could read all these comments and draw any conclusions from them. Is the politician still beloved? Did fans like the new song? Only an automatic analysis tool can provide these answers.

Sentiment Analysis

Sentiment Analysis is a computational process that seeks to identify and categorize the opinions expressed in a fragment of text. The main goal is to understand whether the author’s attitude towards a particular subject is positive, negative, or neutral. This type of analysis, which originated as a subfield of natural language processing (NLP), therefore specializes in interpreting and cataloging emotions.

What are the possible usage scenarios? Sentiment Analysis is a great tool for analyzing product reviews, conducting market research, monitoring customer service or a social media account, and also managing the reputation of an organization or person.

But when using it for statistics, is it reasonable to assume that the sample of people surfing the web is representative of world, national, or local public opinion? Many companies and organizations think so and rely on this tool to guide their activities. So how do you conduct a sentiment analysis?

Let’s start with the data: a sample of phrases, which could be tweets or reviews of any kind, to be analyzed with software libraries. The key steps are:

  • delete words that do not help determine a mood: articles, prepositions, conjunctions, etc. In jargon, these are called “stop words”;
  • create a table, called a frequency distribution, to count how many times each word appears in the text;
  • extract concordances, that is, find where a word appears in the text and which words come before and after it;
  • extract n-grams, that is, sets of words that often appear together in a sentence; we can have combinations of 2, 3, or 4 words.
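The steps above can be sketched with the standard library alone. This is a minimal illustration, with an invented two-review corpus and a deliberately tiny stop-word list; a real pipeline would use NLTK’s stop-word corpus, tokenizer, `FreqDist`, and collocation finders instead.

```python
from collections import Counter
from itertools import islice

# Tiny illustrative stop-word list (a real one, e.g. NLTK's, is far longer).
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "that", "is", "in"}

# Hypothetical sample reviews.
reviews = [
    "the battery life of the phone is great",
    "the screen is great and the battery is great too",
]

# Step 1: tokenize and delete stop words.
tokens = [w for text in reviews for w in text.split() if w not in STOP_WORDS]

# Step 2: frequency distribution — how many times each word appears.
freq = Counter(tokens)

# Step 3: concordance — where a word appears and which words surround it.
def concordance(word, words, window=1):
    """Return (before, word, after) context tuples for each occurrence."""
    hits = []
    for i, w in enumerate(words):
        if w == word:
            hits.append((words[max(0, i - window):i], w, words[i + 1:i + 1 + window]))
    return hits

# Step 4: n-grams — groups of 2 (or 3, 4) consecutive words.
def ngrams(words, n):
    return list(zip(*(islice(words, k, None) for k in range(n))))

bigram_freq = Counter(ngrams(tokens, 2))
```

For instance, `freq.most_common(1)` reveals that “great” dominates this toy corpus, and `bigram_freq` shows which word pairs recur.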

But these are just the initial steps, which lead us to the point where we need to choose a pre-trained model suited to the context of our problem. We could use, for example, NLTK (Natural Language ToolKit), a Python platform for writing programs that work with human language. Among its libraries, NLTK offers VADER (Valence Aware Dictionary and sEntiment Reasoner), a model specialized in the language used on social media, particularly suitable for short, slangy phrases, but which loses effectiveness on longer and more structured sentences.
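With NLTK installed, the call itself is short (download the `vader_lexicon` resource, then pass text to `SentimentIntensityAnalyzer().polarity_scores`). The sketch below instead reproduces the core idea behind a lexicon-based scorer like VADER with a tiny invented lexicon, so it runs with no downloads; the word valences and the negation rule are illustrative assumptions, not VADER’s real values.

```python
import math

# Invented valence lexicon, for illustration only (VADER's is ~7,500 entries).
LEXICON = {"great": 3.0, "love": 2.5, "tired": -1.5, "kills": -3.0, "unhappy": -2.0}
NEGATIONS = {"not", "no", "never"}

def polarity(text):
    """Toy lexicon-based polarity in [-1, 1], in the spirit of VADER."""
    words = text.lower().split()
    total = 0.0
    for i, w in enumerate(words):
        score = LEXICON.get(w, 0.0)
        # A negation immediately before a scored word flips its valence.
        if score and i > 0 and words[i - 1] in NEGATIONS:
            score = -score
        total += score
    # Squash the raw sum into [-1, 1], loosely like VADER's compound score.
    return total / math.sqrt(total * total + 15)

print(polarity("a job that slowly kills you"))  # negative
print(polarity("i love this great song"))       # positive
```

Real VADER adds many refinements this sketch omits: intensifiers (“very”), capitalization, punctuation emphasis, and emoticons.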

The next steps are the usual ones: evaluate the model’s accuracy, possibly improve it by adding features, and finally use it on new data.

As solutions become increasingly complex, it becomes convenient to move to the cloud to take advantage of how easily its services integrate with one another. Just think of the infrastructure needed to capture continuous streams of sentences from multiple sources, analyze them in near real time, and channel the results into other databases.

Let’s take an example

Major cloud providers offer ready-made services for natural language processing, as we saw in Salvatore’s article. These services can identify the language of a text; extract key phrases, places, people, brands, or events; and determine whether the text has a positive or negative connotation, all with near-immediate results. They all offer the ability to build a custom model for a given domain by defining specific custom entities.

At EllyCode we developed some multi-cloud libraries that let us run very simple tests on Azure, AWS, and GCP. I couldn’t resist the temptation to try them out by analyzing the lyrics of Radiohead’s song “No Surprises”.

Why Radiohead? Let’s say I wanted to make things easy for the machine learning algorithms by choosing what many (though not I) consider one of the most depressing bands in the history of music.

The first lines are:

A heart that’s full up like a landfill
A job that slowly kills you
Bruises that won’t heal

You look so tired, unhappy
Bring down the government
They don’t, they don’t speak for us

There’s little doubt here. The results of the algorithms are very clear:

Score: -0.8 | Magnitude: 0.8

In GCP, the score indicates the overall emotion found in the text, while the magnitude indicates how much emotional content is present in the document. For example, a clearly positive sentiment might have a score of 0.8 and a magnitude of 3.0, while a clearly negative one might have a score of -0.7 and a magnitude of 4.0.
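The score/magnitude distinction can be made concrete with a toy aggregation: if each sentence receives its own score, averaging gives a document score where mixed feelings cancel out, while summing absolute values gives a magnitude that keeps growing with emotional content. This is an illustrative model of the distinction, not GCP’s actual algorithm.

```python
def document_sentiment(sentence_scores):
    """Toy aggregation: score averages per-sentence scores (mixed
    feelings cancel), magnitude sums their absolute values (emotion
    accumulates). Illustrative only, not GCP's real computation."""
    score = sum(sentence_scores) / len(sentence_scores)
    magnitude = sum(abs(s) for s in sentence_scores)
    return score, magnitude

# A conflicted document: strong positives and negatives cancel in the
# score but pile up in the magnitude.
mixed_score, mixed_magnitude = document_sentiment([0.9, -0.8, 0.7, -0.8])

# A uniformly negative document: low score, moderate magnitude.
neg_score, neg_magnitude = document_sentiment([-0.8, -0.8])
```

This is why a near-zero score with a high magnitude signals an emotionally charged but conflicted text, while a near-zero score with a low magnitude signals a genuinely neutral one.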

The next verse is the following:

I’ll take a quiet life
A handshake of carbon monoxide
No alarms and no surprises

Interestingly, Amazon interprets it clearly positively, while the other two services have less certainty.

Score: -0.3 | Magnitude: 0.3

Of course, the verse is shorter and its real meaning is wrapped in seemingly positive words. These systems cannot detect sarcasm, however, and they failed to grasp a message even darker than that of the first verse.


It is clear that we are dealing with very powerful tools, but also with largely unexplored scenarios. A tweet is different from a poem, a chat-room message, or a political rally: context, as we will see again in upcoming articles, is always essential!

Keep following us!

Written by

Salvatore Sorrentino