Beyond Predictive Text

Will your city be under attack in the next six months? What are the odds that your favorite team will make the playoffs? USC researchers use AI to look for answers.

By Magali Gruet
(Illustrations/Israel Vargas)

Kristina Lerman remembers her first successful forecasting project, in 2012.

The principal scientist at USC’s Information Sciences Institute (ISI) was predicting which news articles would go viral on social media: “We focused on Twitter data and wanted to predict how many people would retweet something, and it worked very well.”

Though they’re not Nostradamus, Lerman and her colleagues at the ISI have a strong track record of forecasting. They have been working on predicting events using artificial intelligence (AI) for more than a decade. Their studies range from guessing sports results to surprising topics like anticipating insurgent attacks or predicting who will get married.

Finding ways to make accurate, informed predictions helps industries, businesses and policymakers develop proactive strategies for managing resources and planning for the future. In the past, researchers used good ol’ math—involving probabilities and statistics—to create forecasts.

Today, ISI’s methods are more sophisticated.

Using AI allows forecasters to predict societal events from various sources, including structured data, news feeds, blogs, social media, web search queries and other publicly available information.

Can Machines Learn from Humans?

In the past decade, the U.S. intelligence community has worked with ISI to help identify national security problems before they happen. In 2015, the Intelligence Advanced Research Projects Activity (IARPA), a government agency that invests in high-risk, high-payoff research, funded ISI’s work on a project called Mercury. Researchers used classified data to see if it would improve their performance in predicting political crises, disease outbreaks, terrorist activity and military action — and it did.

In 2017, ISI began working on the SAGE project, also funded by IARPA. SAGE used a hybrid approach: “We were trying to team humans up with machines to get a more accurate forecast,” explains Fred Morstatter, ISI researcher and SAGE principal investigator.

To do so, his team needed to understand human thought processes, decode human values and gather human experiences. The researchers then created mental models to feed an algorithm that can map human knowledge based on their subjects’ location. They called the process “humanization.”

Morstatter notes that this model would have been beneficial for a crisis like the 2014-2016 Ebola outbreak in West Africa.

“A lot of the groups that were affected had the practice of taking the dead home with them before they buried them, and that is exactly how Ebola spreads,” Morstatter says. “If the health authorities had known this custom existed, they could have better predicted the actual extent of the epidemic.”

“Locals know things about their environment just from experiencing it. It is not necessarily written down or explicit; they have customs they follow. For an outsider, understanding what they are doing can be very difficult,” he says.

Is an Attack Imminent?

In 2022, ISI is focused on Mozambique’s Cabo Delgado province through VENICE (VErifyiNg Implicit Cultural modEls), a project funded by the Defense Advanced Research Projects Agency (DARPA). According to the nonprofit Save the Children, 30,000 children fled the region due to insurgent violence this past June alone. The VENICE project captures local knowledge to give military operators an insider perspective for decision-making.

Keith Burghardt, an ISI researcher working on VENICE, uses data from satellite images to map the likelihood of a village attack. “By understanding where ethnic conflict occurs, we have a way to improve government stability,” he says. “With this information, [governments] can better utilize their resources to keep civilians safer and protect important infrastructure.”

ISI researchers in AI, such as Burghardt, have been working with political scientists, psychologists and anthropologists to develop risk assessments for attacks within nations experiencing low-intensity conflicts. They base the assessments on ethnic and religious tensions and locations of strategic importance.

For example, Burghardt says an attack on an oil refinery in Mozambique can result in reduced economic output, harming the local community’s income and increasing tensions elsewhere. He says, “Attacks harm nations indirectly through internal migration and a loss in government trust.”

The Ultimate Goal: Predicting Any Event

“The big idea behind those projects is that we want to be able to answer any questions: Who is going to win the next election, what will be the price of corn in France next year,” Morstatter says.

“We want to know where events are more likely to occur,” adds Jay Pujara, team leader at ISI. “What are the hotspots for protests, attacks or conflict? Where is the next drought going to be, or the next famine?”

For example, in the Dallas-Fort Worth metroplex in Texas, flash flooding kills many annually. ISI’s team manages the data needed by the flash flood forecasting system (including soil moisture, temperature, surface permeability, vegetation, and rainfall) and maps the results, triggering alerts in risk areas. “Our ultimate goal is to create a heatmap to see where we should focus our energy on a particular matter,” says Pujara.

But ISI’s forecast work is not only about predicting the worst.

Dodgers or Mets?

Who are the hot teams this year? Pujara and his team have been trying to answer this question with an AI tool that uses temporal knowledge graphs. He says that predicting exact wins and scores is difficult because so many variables have to be considered.

But Pujara can predict trends such as who will be in the playoffs or the World Series with 70% accuracy. The main challenge is the lack of data available for events that happen only once a year.

Context—who the players are, who is injured, where they’re playing, and everything that surrounds the event—also needs to be considered, says Pujara. “We don’t want to learn just the outcome of the match but also how the team and the individual players might influence the outcome of that match.”

Seeing the Cites

ISI researchers have been working on a topic especially close to their hearts: Why are some scientific papers cited more than others, and where is science going? They created a dataset of 200 million papers and six billion citations, including 8.4 million papers for 2021 alone. “We want to show how scientific research interconnects,” says Pujara. “If we could predict those research relationships, we could support [the research] by sending more funding. Having this knowledge would help accelerate innovation.”

Lerman would like to see forecasting applied to guide people’s behavior. “If we can understand how people think, we can encourage them to make the best decisions for their health, for example.”

Morstatter also foresees potential new projects for ISI: “We could look at how people’s expression of identity evolves over social media,” he says. “Understanding the process by which they change their opinion would be very interesting, and we could reuse our work on mental models for this project.”

But could event prediction ever go too far? “It’s a tricky question. If we were very good at this, what would you do?” Pujara asks. “We are not trying to build this dystopian fantasy. We are trying to understand the world to make it a better, safer place. We could intervene earlier to help research innovation and prevent violent attacks, so people don’t get hurt.”