Remember that video US president Donald Trump tweeted in which he wrestled someone to the ground and started punching them? It was genuine footage of Trump from a popular wrestling show but he had the image doctored to replace the victim's head with the CNN logo and added the hashtag #FraudNewsCNN, just in case we didn't get the memo that he really dislikes the news network.
A vocal fan of Fox News, Trump often tweets his praise while favouring the words “fake news” when mentioning CNN. But are these news networks as biased as he thinks? Do Fox News journalists say mostly nice things while those at CNN are busy portraying him in a negative light? Artificial intelligence (AI) in the form of sentiment analysis and stance detection can tell us what is really happening.
Dublin-based AI start-up Aylien, which recently secured €2 million in investment and announced the creation of 70 new jobs, specialises in understanding the meaning and sentiment in online news stories and social media posts.
Stance detection
Last week they launched a new feature called 360° stance detection, which can classify the attitude of a journalist as ‘for’, ‘against’, or ‘neutral’ towards the person, place or thing they are writing about. And what better way to explain how this kind of AI – machine learning and natural language processing (NLP) – works than to see if Trump is right about his beloved Fox News and nemesis CNN.
Aylien took on the task of analysing the stance of individual journalists towards Trump over the course of a year and a half. Despite Trump’s regular tweets claiming that CNN is negatively biased towards him, Aylien found that “the vast majority of the content from CNN about Trump is neutral, but that journalists also take non-neutral stances toward the president”.
Interestingly, they found that the vast majority of stories covering Trump on Fox News were also neutral. These were stories from foxnews.com, however, whose coverage appears to differ from that of the Fox News television channel, which Trump is known to watch frequently.
The purpose of this exercise was to demonstrate how far machine learning techniques have come in understanding the nuances of written language and the complexity of human emotion. Using neural networks, or computer networks inspired by the structure of the human brain, it is possible to scan websites, tweets or Facebook posts and know how a person or an entire crowd of people feels about a new brand of shampoo, a television show or even a political figure.
The Cambridge Analytica scandal has alerted us to the value of the information we post on social media platforms and what can happen when that information is abused. But this technology is also being used to gain further insights from things we want to be seen and understood such as hotel reviews on travel websites or media coverage on news sites. This is done through NLP.
“Language processing deals with giving computers the ability to understand human languages,” says Aylien founder and chief executive Parsa Ghaffari.
“It gives computer software the ability to understand the tone of text or the entities – people, companies, etcetera – that are mentioned in a piece of text. NLP is a set of techniques that have developed over the last couple of decades that try to replicate the human ability to understand language.”
“The latest trend is deep learning using neural networks: when you marry NLP with state-of-the-art machine learning, or statistics, you get really interesting results. It is capable of picking up subtle hints and understanding complex things about language like stance and sentiment,” says Ghaffari.
Neural networks
Although AI has been around for more than half a century, an explosion in the area has happened in the past 10 years, says Rob Fraser, senior director of software engineering for Microsoft, which is a client of Aylien and its text analysis technology.
“The expression we use is ‘modern AI’ to separate it from the 60 years of history around artificial intelligence. Classical AI was about writing programs to mimic intelligence, whereas modern AI is about using machine learning and actually learning from data in the same way humans do. That’s where neural networks come in,” says Fraser.
“The way to think about AI is that it is not one thing. You think about AI as a spectrum: at one end of it – the simpler end – are voice agents or AI assistants like Cortana and bot frameworks to create simple chatbots.”
“This is easy for us to do,” adds Fraser, who says the spectrum then moves towards more intelligent cognitive services such as understanding sentiment in text, detecting objects in photos, or indexing video content. Things that come naturally to people are extraordinarily complex undertakings for a computer. We can read a document and explain the gist of it to another person but this kind of deep understanding of language and concepts for machines means breaking new ground using neural networking.
This is where Aylien’s technology comes into play. Their customers usually look for insights into news coverage and social media content that might be about their company or product.
“You might try to determine the sentiment, what are people saying about topic X or person Y and what the sentiment is towards that topic – how has it changed over time? And then some deeper insights: what aspects of that thing are they positive or negative about. If they’re talking about a new phone, what aspects are positive, what are negative, the battery life or screen quality? It’s about going in deeper and getting a very holistic view of what people think of a product,” says Ghaffari.
If a client wants to know how their product is being received by media or social media users, Aylien needs to figure out what ‘aspects’ exist for the product, be it a smartphone or a hotel room. If it’s a review on TripAdvisor, the neural network needs to figure out the important words, ie the staff or beds, and then determine if positive or negative things are being said about them.
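To make the idea of ‘aspects’ concrete, here is a minimal sketch in Python. It is not Aylien's method, which relies on neural networks; it swaps in a much cruder technique, a hand-written word list, purely for illustration. The vocabularies, the function name `aspect_sentiment` and the two-word window are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical, hand-picked vocabularies; a real system would learn these from data.
ASPECTS = {"staff", "beds", "battery", "screen"}
POSITIVE = {"friendly", "comfortable", "great", "excellent"}
NEGATIVE = {"rude", "lumpy", "poor", "terrible"}


def aspect_sentiment(review: str, window: int = 2) -> dict:
    """Score each aspect by counting opinion words within `window` tokens of it."""
    tokens = re.findall(r"[a-z]+", review.lower())
    scores = defaultdict(int)
    for i, token in enumerate(tokens):
        if token not in ASPECTS:
            continue
        # Look at the words immediately around the aspect term.
        nearby = tokens[max(0, i - window): i + window + 1]
        scores[token] += sum(word in POSITIVE for word in nearby)
        scores[token] -= sum(word in NEGATIVE for word in nearby)
    return dict(scores)


result = aspect_sentiment("The staff were friendly but the beds were lumpy.")
print(result)  # {'staff': 1, 'beds': -1}
```

A positive score means kind words sit near the aspect, a negative score the opposite. A fixed word list breaks down quickly on sarcasm, negation or unfamiliar vocabulary, which is where learned models come in.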
How do these neural networks ‘know’ what they should be looking for? Like humans, they have to learn. This, Ghaffari says, is done through supervised learning: “You train your algorithms on a bunch of data in advance that are labelled by humans and the algorithm tries to mimic the same.”
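As a rough illustration of supervised learning, the sketch below trains a tiny naive Bayes text classifier on four human-labelled reviews and then labels an unseen sentence. Naive Bayes is a classical statistical method used here as a stand-in; it is not the neural network approach Ghaffari describes. The example reviews and the function names are invented for the purpose.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count how often each word appears under each human-assigned label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Pick the label with the highest naive Bayes log-probability (add-one smoothing)."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented examples standing in for data "labelled by humans".
labelled = [
    ("friendly staff and comfortable beds", "positive"),
    ("great location and excellent breakfast", "positive"),
    ("rude staff and lumpy beds", "negative"),
    ("terrible noise and poor service", "negative"),
]
model = train(labelled)
prediction = predict("the staff were rude", *model)
print(prediction)  # negative
```

The classifier has never seen the sentence “the staff were rude”, but it has seen “rude” in a review a human marked as negative, so it mimics that judgment.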
Unsupervised learning is also useful. It doesn’t use labelled data, so it doesn’t have to rely on the time-consuming task of being trained against human-vetted examples: “We use a hybrid of both called semi-supervised learning. We have tens of thousands of clearly annotated examples but then we can very cheaply and easily get hundreds of thousands of additional reviews from various websites and those reviews will have star ratings, for example a review alongside three out of five stars, that act as weak supervision.
“In addition to your smaller subset of carefully annotated examples you also leverage this larger collection of unsupervised data and this works pretty well in practice,” says Ghaffari.
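One way to picture weak supervision is the sketch below, which turns star ratings into noisy labels and mixes them with a small set of trusted annotations. The star thresholds, the `weak_weight` value of 0.3 and the function names are all assumptions made for illustration; the article does not say how Aylien maps ratings to labels or how it weights them.

```python
def weak_label(stars: int) -> str:
    """Map a star rating (1 to 5) to a noisy sentiment label. Thresholds are assumed."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


def build_training_set(annotated, scraped, weak_weight=0.3):
    """Combine trusted annotations (weight 1.0) with weakly labelled reviews."""
    rows = [(text, label, 1.0) for text, label in annotated]
    for text, stars in scraped:
        # Star ratings are only a rough proxy, so these rows count for less.
        rows.append((text, weak_label(stars), weak_weight))
    return rows


annotated = [("spotless room and helpful staff", "positive")]
scraped = [("it was fine, nothing special", 3), ("never again", 1)]
rows = build_training_set(annotated, scraped)
for row in rows:
    print(row)
```

Each row carries a weight that a training algorithm could use to trust the carefully annotated examples more than the cheap, scraped ones.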
This kind of deeper understanding using machine learning is not just in specialist products, it is becoming ubiquitous, says Fraser: “AI is starting to become deeply embedded in all of our products. Whether it’s Skype, where you can now have a conversation among 100 people in 10 different languages where there is simultaneous translation across that conversation, or our Office products, where the inbox helps manage which email comes to you, which ones you should focus on and it learns from emails you get and how you handle them.”
Comprehension
Machine learning means an algorithm can not only read text but can also understand the story being told. Using the example of Harry Potter and the Philosopher's Stone, Fraser explains that a system built by Canadian AI start-up Maluuba (recently acquired by Microsoft) was able to answer reading comprehension questions after 'reading' all about the boy wizard.
A lot of these technologies, as Fraser says, would have seemed like magic if we had demonstrated them to an audience seven or eight years ago but there is, of course, the fear that all this automation will replace even more human jobs.
“Microsoft has a very strong opinion around this and our position is that we very much think about AI as a way of empowering people, not replacing them. It is about enabling people by taking the robot out of their jobs,” he says, citing the example of its InnerEye technology that automates the mapping out of a patient’s tumour on X-rays, a task that normally takes hours by hand for an oncologist.
And in the wake of Cambridge Analytica, people are less worried about jobs than how much data they have inadvertently volunteered through social media posts and other online platforms and services – and the way this data can be used to reveal things about us that we never intended.
“You can run someone’s tweets through a detection model and come up with some judgments that this person is pro-this or anti-that. With Cambridge Analytica and so on people are worried and there is no reason not to be. It is the same technology, just applied or used for very different and evil purposes,” says Ghaffari.
"It sounds like an episode of Black Mirror but we've known this for years; people are warned all the time about these silos of data like Facebook and Instagram. The problem is when these data are put together and inferences are drawn from that: it becomes extremely powerful and potentially extremely dangerous.
“That’s the thing with technology: with great power comes great responsibility, it matters what use you put it to.”