  • AndroidForMobile Foundation at Harvard
    Aug. 16, 2019, 9:50 a.m.

    One potential route to flagging fake news at scale: Linguistic analysis

    It’s not perfect, but legitimate and faked news articles use language differently in ways that can be detected algorithmically: “On average, fake news articles use more expressions that are common in hate speech, as well as words related to sex, death, and anxiety.”

    Have you ever read something online and shared it among your networks, only to find out it was false?

    As a software engineer and computational linguist who spends most of her work (and even leisure) hours in front of a computer screen, I’m concerned about what I read online. In the age of social media, many of us consume unreliable news sources. We’re exposed to a wild flow of information in our social networks — especially if we spend a lot of time scanning our friends’ random posts on Twitter and Facebook.

    A study in the United Kingdom found that about two-thirds of the adults surveyed regularly read news on Facebook, and that half of those had initially believed a fake news story. Another study, conducted by researchers at MIT, focused on the cognitive aspects of exposure to fake news and found that, on average, newsreaders believe a false news headline at least 20 percent of the time.

    It’s often difficult to find the origin of a story after partisan groups, social media bots, and friends of friends have shared it thousands of times. Fact-checking sites such as Snopes and BuzzFeed can address only a small portion of the most popular rumors.

    The technology behind the internet and social media has enabled this spread of misinformation; maybe it’s time to ask what this technology has to offer in addressing the problem.

    My colleagues and I at the Discourse Processing Lab at Simon Fraser University have conducted research on the linguistic characteristics of fake news. Recent advances in machine learning have made it possible for computers to instantaneously complete tasks that would have taken humans much longer. When machine learning is applied to natural language processing, it is possible to build text classification systems that can distinguish one type of text from another.

    During the past few years, natural language processing scientists have become more active in building algorithms to detect misinformation; this helps us to understand the characteristics of fake news and develop technology to help readers. One approach finds relevant sources of information, assigns each source a credibility score, and then integrates them in order to confirm or debunk a given claim. This approach is heavily dependent on tracking down the original source of news and scoring its credibility based on a variety of factors.
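
    As a rough illustration of this source-based approach, the sketch below combines credibility-weighted votes into a verdict on a claim. The Evidence record, the credibility values, and the 0.2 margin are all assumptions made up for this example; real systems score sources on many more factors.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str         # hypothetical source name
    credibility: float  # assumed score in [0, 1]
    supports: bool      # True if the source backs the claim, False if it disputes it


def verdict(evidence, margin=0.2):
    """Combine credibility-weighted votes into a verdict on a claim."""
    total = sum(e.credibility for e in evidence)
    if total == 0:
        return "unverified"
    # Signed, credibility-weighted vote, normalized to the range [-1, 1].
    score = sum(e.credibility if e.supports else -e.credibility for e in evidence) / total
    if score > margin:
        return "confirmed"
    if score < -margin:
        return "debunked"
    return "unverified"


# Illustrative evidence only; the sources and scores are invented.
evidence = [
    Evidence("wire-service", 0.9, False),
    Evidence("anonymous-blog", 0.2, True),
    Evidence("fact-checker", 0.8, False),
]
print(verdict(evidence))
```

    Here two credible sources dispute the claim and one weak source supports it, so the weighted vote lands well below the margin and the claim is marked as debunked.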

    A second approach examines the writing style of a news article rather than its origin. The linguistic characteristics of a written piece can tell us a lot about the authors and their motives. For example, specific words and phrases tend to occur more frequently in a deceptive text than in one written honestly.

    Our research identifies linguistic characteristics to detect fake news using machine learning and natural language processing technology. Our analysis of a large collection of fact-checked news articles on a variety of topics shows that, on average, fake news articles use more expressions that are common in hate speech, as well as words related to sex, death, and anxiety. Genuine news, on the other hand, contains a larger proportion of words related to work (business) and money (economy).
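
    One simple way to turn word categories like these into numbers is to measure what share of an article's words falls into each category. The sketch below does that with a tiny, made-up lexicon; the categories echo the ones named above, but the word lists are placeholders, not the ones used in our study.

```python
import re

# Toy lexicon for illustration only; not the word lists used in the study.
LEXICON = {
    "anxiety": {"afraid", "panic", "worried", "fear"},
    "death": {"dead", "kill", "died", "fatal"},
    "money": {"budget", "tax", "price", "cost"},
    "work": {"job", "company", "hire", "office"},
}


def category_proportions(text):
    """Return the share of tokens in `text` that fall in each lexicon category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in LEXICON}
    return {
        name: sum(token in words for token in tokens) / len(tokens)
        for name, words in LEXICON.items()
    }


sample = "Panic spreads as fatal virus leaves thousands dead"
print(category_proportions(sample))
```

    Proportions like these can then serve as input features for a classifier, alongside many other stylistic signals.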

    This suggests that a stylistic approach combined with machine learning might be useful in detecting suspicious news.

    Our fake news detector is built on linguistic characteristics extracted from a large body of news articles. It takes a piece of text and shows how similar it is to the fake news and real news items that it has seen before. (Try it out!)
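
    To give a feel for what "how similar" can mean, here is a minimal sketch that compares a text's word counts against pooled word counts from known fake and real examples using cosine similarity. This is a stand-in technique chosen for brevity, not a description of how our detector works, and the example headlines are invented.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_profile(text, fake_examples, real_examples):
    """Compare a text with the pooled vocabulary of each collection."""
    query = bag_of_words(text)
    fake = sum((bag_of_words(t) for t in fake_examples), Counter())
    real = sum((bag_of_words(t) for t in real_examples), Counter())
    return {"fake": cosine(query, fake), "real": cosine(query, real)}


# Invented examples, far too few to mean anything in practice.
fake_examples = ["shocking secret cure they hide from you", "you will panic at this shocking truth"]
real_examples = ["council approves budget for new school", "company reports quarterly earnings growth"]

profile = similarity_profile("shocking truth about the secret cure", fake_examples, real_examples)
print(profile)
```

    With these toy examples, the query shares several words with the fake collection and none with the real one, so its "fake" similarity is the higher of the two.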

    The main challenge, however, is to build a system that can handle the vast variety of news topics and the rapid turnover of headlines online. Computer algorithms learn from samples, and if these samples are not sufficiently representative of online news, the model’s predictions will not be reliable.

    One option is to have human experts collect and label a large quantity of fake and real news articles. This data enables a machine-learning algorithm to find the features that keep recurring in each collection, regardless of topic or other variation among the articles. Ultimately, the algorithm will be able to distinguish with confidence between previously unseen real and fake news articles.
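
    Learning from labeled examples can be sketched with a small Naive Bayes text classifier written in plain Python. Naive Bayes is only one of many possible algorithms and is used here as an assumption for illustration; the four labeled "articles" are invented, and a real system would need thousands of expert-labeled examples.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_articles):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training articles
    for text, label in labeled_articles:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = set().union(*word_counts.values())
    return word_counts, doc_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for `text`."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocabulary:  # ignore words never seen in training
                # Add-one smoothing so an unseen word/label pair is not impossible.
                score += math.log((counts[token] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented training data for illustration only.
model = train([
    ("shocking secret cure they hide from you", "fake"),
    ("you will panic at this shocking truth", "fake"),
    ("council approves budget for new school", "real"),
    ("company reports quarterly earnings growth", "real"),
])
print(predict(model, "shocking cure they hide"))
print(predict(model, "council budget for school"))
```

    The model only knows the words it was trained on, which is exactly why the representativeness of the labeled collection matters so much.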

    Fatemeh Torabi Asr is a postdoctoral research fellow in the Discourse Processing Lab at Simon Fraser University. This article is republished from The Conversation under a Creative Commons license.
