PAPER: News stories’ heroes and villains can be detected automatically

Untitled by Marc Mueller, licence CC0 1.0

The way different actors are framed in news stories is an important part of news literacy, a team of Northwestern University researchers posits. Analysing each story carefully enough to detect these frames, however, is a tall order for the average reader. To assist critical reading, Diego Gomez-Zara, Miriam Boon and Larry Birnbaum have been developing automated role-detection software that could be easily installed as a browser plug-in.

The software works by first detecting the relevant entities in a news story. It takes into account all entities present in the headline, plus the three entities from the main text that have the highest relevance scores. These scores are calculated according to how often the entity is mentioned and how early in the text its first mention occurs.
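
The selection step can be illustrated with a minimal Python sketch. The summary above does not give the exact scoring formula, so the way frequency and earliness are combined here is an assumption, as are the function names, the input format (entity names mapped to token offsets) and the choice to exclude headline entities from the top three.

```python
def relevance_scores(body_mentions, text_length):
    """Score each entity by mention count and by how early it first appears.

    body_mentions maps an entity name to the token offsets of its mentions.
    The formula below is an illustrative assumption, not the paper's.
    """
    scores = {}
    for entity, offsets in body_mentions.items():
        if not offsets:
            continue
        frequency = len(offsets)
        # 1.0 for a first mention at the very start, approaching 0.0 at the end.
        earliness = 1.0 - min(offsets) / max(text_length, 1)
        scores[entity] = frequency * (1.0 + earliness)
    return scores


def select_entities(headline_entities, body_mentions, text_length, top_n=3):
    """Return all headline entities plus the top_n highest-scoring body entities."""
    scores = relevance_scores(body_mentions, text_length)
    # Sort by descending score; break ties alphabetically for stable output.
    ranked = sorted(scores, key=lambda e: (-scores[e], e))
    # Assumption: entities already taken from the headline are not counted again.
    top_body = [e for e in ranked if e not in headline_entities][:top_n]
    return list(headline_entities) + top_body


mentions = {
    "Trump": [0, 10, 50],
    "Macron": [5, 60],
    "Paris": [20],
    "NATO": [90],
    "Eiffel Tower": [95],
}
selected = select_entities(["Trump"], mentions, text_length=100)
```

In this toy input, an entity mentioned often and early outranks one mentioned once near the end, which is the behaviour the description implies.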

The detected entities’ roles are determined based on three “dictionaries” – lists of words that are commonly associated with “heroes”, “villains” or “victims”. Each of these lists is approximately 200 words long and hand-picked by the researchers.

The actual analysis combines sentiment and similarity analysis. Words determined to be negative are compared against the “villain” dictionary, while positive and neutral words are compared against both the “hero” and “victim” dictionaries. The model also takes into account the proximity of words to the entity in question: words that are very close are given more weight than words that are separated by many other words.
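
The routing and weighting described above can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the three tiny word lists stand in for the roughly 200-word dictionaries, the sentiment check is a toy lexicon lookup, the similarity function is an exact-match placeholder for a real similarity measure, and the inverse-distance weight is only one possible way to favour nearby words.

```python
# Toy stand-ins for the hand-picked dictionaries (hypothetical contents).
ROLE_WORDS = {
    "hero": {"rescued", "praised", "brave"},
    "villain": {"attacked", "lied", "corrupt"},
    "victim": {"stranded", "bystander", "survivor"},
}
# Toy sentiment lexicon; any word not listed is treated as positive or neutral.
NEGATIVE_WORDS = {"attacked", "lied", "corrupt"}


def similarity(word, role):
    """Placeholder similarity: 1.0 on an exact dictionary match, otherwise 0.0."""
    return 1.0 if word in ROLE_WORDS[role] else 0.0


def role_scores(tokens, entity_index):
    """Accumulate proximity-weighted similarity for each role around one entity."""
    scores = {"hero": 0.0, "villain": 0.0, "victim": 0.0}
    for i, token in enumerate(tokens):
        if i == entity_index:
            continue
        word = token.lower()
        # Assumed weighting: adjacent words count fully, distant words count less.
        weight = 1.0 / abs(i - entity_index)
        # Negative words are compared with "villain" only; others with both remaining roles.
        roles = ["villain"] if word in NEGATIVE_WORDS else ["hero", "victim"]
        for role in roles:
            scores[role] += weight * similarity(word, role)
    return scores


def assign_role(tokens, entity_index):
    """Return the highest-scoring role, or None if no dictionary word matched."""
    scores = role_scores(tokens, entity_index)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


sentence = "Firefighter bravely rescued the stranded hiker".split()
firefighter_role = assign_role(sentence, 0)  # entity at token 0
hiker_role = assign_role(sentence, 5)        # entity at token 5
```

Because of the distance weighting, the same sentence can yield different roles for different entities: “rescued” sits closer to the firefighter, while “stranded” sits next to the hiker.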

The researchers tested the software on real-life news stories with promising results. They present an example of two news articles about President Donald Trump’s visit to Paris, published by The New York Times and Fox News, respectively. The software detected President Trump as the “villain” of the NYT story and the “hero” of the Fox News story.

The program is still in development, but the researchers say they are planning to release it to the general public soon. The team, however, emphasises that the program’s accuracy should be tested against human assessments, which has not yet been done.

The paper “Who is the Hero, the Villain, and the Victim?” was presented at the 23rd International Conference on Intelligent User Interfaces. It is available online in the Association for Computing Machinery’s digital library (open access).

