Fact-Checking With Wikipedia

from Pacific Standard

Fact-checking a news article before it publishes is, by most accounts, important. Yet cash-strapped news outlets have been hiring fewer fact checkers over the years. Today, the task falls mainly on already busy reporters and their editors—and increasingly on average citizens, who could use some help sorting fact from fiction, too. In a study out today, researchers suggest an unexpected source of help: Wikipedia.

The idea to use Wikipedia as a fact checker came not from journalists or social network users, but instead from the combined interests of six computer scientists, led by Indiana University postdoctoral fellow Giovanni Ciampaglia. “I got into this project as the Wikipedia expert,” says Ciampaglia, whose research focuses mainly on how information diffuses through societies. Others on the team were thinking about things like recommendation algorithms. But the researchers shared an interest, Ciampaglia says, in the practicalities of differentiating the real McCoy from complete malarkey.

Their solution: Look to the links within Wikipedia’s infoboxes, the boxed summaries near the top of many wiki entries. Infoboxes connect to all kinds of information—for example, Barack Obama’s infobox connects to the United States Senate, Illinois, and Columbia University, among other things. By following the links, you’ll find more distant connections to places like Clinton, New York, and, yes, Islam.

The researchers found that two aspects of those connections reveal something about a claim’s reality. First, they reasoned, the more clicks it takes to get from one page to another, the more tenuous the connection between the ideas represented on those pages. Second, the more generic intermediate pages are, the less likely they are to represent meaningful connections.

For example, it takes seven steps to get from Obama to Islam, and that path traverses the infoboxes of such entities as the Association of American Universities and Canada. In other words, you can find a connection between Obama and Islam if you want, but it is about as weak as they come.
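To make those two ideas concrete, here is a minimal Python sketch of one way they could be combined into a single score. It is an illustration under stated assumptions, not the team's actual algorithm: it assumes a page's "genericness" can be approximated by how many links it has, and that each intermediate page adds a penalty of log(number of links) to a path. The function names (build_graph, path_score) and the toy pages are hypothetical.

```python
import heapq
import math


def build_graph(edges):
    """Build an undirected link graph from (page, page) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def path_score(graph, source, target):
    """Return (score, path) for the strongest connection found.

    Assumption: the cost of a path is the sum of log(link count) over
    its intermediate pages, so extra hops and heavily linked (generic)
    pages both raise the cost. Score = 1 / (1 + cost), so directly
    linked pages score 1.0 and unreachable pages score 0.0.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:  # Dijkstra's shortest-path search over page penalties
        cost, node = heapq.heappop(heap)
        if node == target:
            path = [node]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return 1.0 / (1.0 + cost), path[::-1]
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt in graph.get(node, ()):
            # The target itself is not an intermediate page, so it is free.
            step = 0.0 if nxt == target else math.log(len(graph[nxt]))
            new_cost = cost + step
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    return 0.0, []


# Toy, made-up graph: "Specific" links only two pages, "Hub" links many.
toy = build_graph([
    ("Person", "Specific"), ("Specific", "Claim A"),
    ("Person", "Hub"), ("Hub", "Claim B"),
    ("Hub", "Other 1"), ("Hub", "Other 2"), ("Hub", "Other 3"),
])
score_a, path_a = path_score(toy, "Person", "Claim A")
score_b, path_b = path_score(toy, "Person", "Claim B")
print(path_a, score_a)
print(path_b, score_b)
```

In this toy graph both claims sit two clicks from "Person", but the route through the heavily linked "Hub" page scores lower than the route through "Specific", mirroring the intuition that generic stepping stones signal weaker connections.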

Intuitive as that is, the team wanted to know how well their ideas worked in practice. After transforming their intuitions into a proper algorithm, the researchers tested it using topics including movies, presidential spouses, and U.S. and world capitals. Adding to the challenge, the team removed direct links, such as the one from Barack Obama’s infobox to Wikipedia’s Michelle Obama entry. Despite that, the algorithm gave greater credence to correct statements than to incorrect ones about 95 percent of the time.
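The article does not spell out how that 95 percent figure was computed. One plausible reading, sketched below as an assumption, is a pairwise comparison: take every pairing of a correct statement with an incorrect one and count how often the correct statement receives the higher score. The function name pairwise_accuracy and the scores are invented for illustration.

```python
from itertools import product


def pairwise_accuracy(true_scores, false_scores):
    """Fraction of (true, false) pairs in which the true statement
    scores strictly higher than the false one."""
    pairs = list(product(true_scores, false_scores))
    if not pairs:
        raise ValueError("need at least one true and one false score")
    wins = sum(1 for t, f in pairs if t > f)
    return wins / len(pairs)


# Made-up scores, purely for illustration.
result = pairwise_accuracy([0.9, 0.6], [0.5, 0.7])
print(result)  # 3 of the 4 pairs favor the true statement: 0.75
```

A value of 0.5 would mean the scores are no better than a coin flip at separating correct from incorrect statements, while 1.0 would mean every correct statement outranks every incorrect one.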

Though the method isn’t always so accurate—it doesn’t actually do that well with U.S. capitals, for reasons the team isn’t quite sure of—it could still be a valuable tool, Ciampaglia says. He likens it to the grammar checkers now standard in word processors. “It’s there, but it doesn’t need to tell you” what to conclude, Ciampaglia says. “The journalist is not a passive entity.”


