Professor’s Algorithm Forecasts Italian Outbreak
April 28, 2020
In a rare crisis, such as the new coronavirus (COVID-19) pandemic, global leaders often look to other countries’ experiences to learn how to respond strategically.
With Italy hit by the virus at such a high magnitude in early March, it became the country that policymakers across the globe looked to for guidance in navigating the uncharted territory of the virus within their own countries.
However, Italy’s projection models for coronavirus cases are only as accurate as its data. Currently, the data projections made in Italy are based on non-random, non-representative samples of the population, as the only people being tested are those who show symptoms of the virus.
Since many people with the virus can be asymptomatic, only testing those with symptoms leads to inaccurate data. Knowing how to construct accurate projections from limited data sets is crucial to prevent the spread of misleading information.
In 2009, Hrishikesh “Rick” Vinod, professor of economics at Fordham University, developed a statistical algorithm, the maximum entropy bootstrap (MEB), based on the traditional bootstrap method. His statistical tool is being used by researcher Livio Fenga, Ph.D., of the Italian National Institute of Statistics to predict the trend of the coronavirus in Italy.
The MEB is a computer algorithm that constructs confidence intervals by shuffling short-period datasets while upholding the data’s time dependency, according to Vinod.
The computer algorithm can remember the up-down pattern of the data over time, which the traditional bootstrap method is not capable of doing. This allows Vinod’s algorithm to minimize its reliance on arbitrary guesswork for generating the projection.
Vinod used the analogy of forecasting snowfalls to help explain his method. When a weather forecaster states that he is 90% confident one to two inches of snow will fall in a particular region, then he is providing a 90% confidence interval for that snowfall range.
“Since meteorologists have many years of data for developing snow forecasting models based on long time series of wind velocity, temperature and humidity, they have a summary of 100 scenarios ranging from zero inches (best case) to five inches (worst case),” Vinod explained.
If a meteorologist went to a new area where only 30 observations of snowfall data were available instead of 100, he would be unable to predict snowfall with 90% confidence in the same way. He would have to use the limited data available to him to make his prediction.
With the traditional bootstrap method, the meteorologist’s prediction process would essentially look like this: Write down the 30 snowfall observations on a deck of 30 cards, pick one up and record the number, shuffle the cards, and then repeat 100 times. “The 90% confidence interval is constructed by focusing on the middlemost 90 snowfall numbers,” Vinod said.
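The card-deck procedure described above can be sketched in a few lines of Python. This is only an illustration of the analogy as the article tells it, not Vinod's code: the snowfall numbers are invented, and the function name and its parameters are hypothetical.

```python
import random

# Hypothetical snowfall observations in inches (invented numbers): the 30 "cards".
SNOWFALL = [
    0.0, 0.2, 0.5, 0.5, 0.8, 1.0, 1.0, 1.1, 1.2, 1.3,
    1.4, 1.5, 1.5, 1.6, 1.7, 1.8, 1.8, 2.0, 2.1, 2.2,
    2.4, 2.5, 2.7, 3.0, 3.2, 3.5, 3.8, 4.0, 4.5, 5.0,
]


def bootstrap_interval(observations, draws=100, coverage=0.90, seed=0):
    """Draw cards at random and keep the middlemost share of recorded values."""
    rng = random.Random(seed)
    # Pick a card, record its number, put it back and shuffle; repeat `draws` times.
    recorded = sorted(rng.choice(observations) for _ in range(draws))
    # Trim an equal number of values from each tail (5 per side for 100 draws at 90%).
    trim = int(round(draws * (1 - coverage) / 2))
    middle = recorded[trim:draws - trim]
    return middle[0], middle[-1]


low, high = bootstrap_interval(SNOWFALL)
print(f"90% interval: {low} to {high} inches")
```

Each draw here ignores the order in which the observations arrived, which is the limitation the article turns to next.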
Now, change “snowfall” to COVID-19 deaths or ICU requirements in a small region; the random selection of each card would not be sufficient to predict the trend of the virus or ICU demand in each region.
“The up-down pattern in these 30 numbers contains valuable information,” Vinod said.
With the MEB, however, the algorithm allows each regional data series to have its own unique pattern. In this way, the likelihood of each data point’s selection is based on previous trends, and thus not arbitrarily generated.
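To make the idea of keeping the up-down pattern concrete, here is a heavily simplified sketch in Python. It is not the published MEB algorithm, which handles the tails of the distribution and preserves the mean in ways omitted here. It only illustrates one core idea: new values are generated between the sorted observations and then placed back in the original rank order, so every rise and fall in the series survives. The function name and the daily counts are hypothetical.

```python
import random


def rank_preserving_replicate(series, rng):
    """Return a new series whose values differ but whose rank order matches `series`."""
    n = len(series)
    sorted_vals = sorted(series)
    # order[k] is the position in the original series of the k-th smallest value.
    order = sorted(range(n), key=lambda i: series[i])
    # Draw n uniform random numbers and sort them.
    uniforms = sorted(rng.random() for _ in range(n))
    # Turn each uniform into a value by interpolating along the sorted observations.
    new_sorted = []
    for p in uniforms:
        pos = p * (n - 1)
        k = int(pos)
        frac = pos - k
        upper = sorted_vals[min(k + 1, n - 1)]
        new_sorted.append(sorted_vals[k] + frac * (upper - sorted_vals[k]))
    # Put the k-th smallest new value where the k-th smallest original value sat.
    replicate = [0.0] * n
    for rank, idx in enumerate(order):
        replicate[idx] = new_sorted[rank]
    return replicate


# Hypothetical daily counts for one small region (invented numbers).
daily_counts = [3, 5, 9, 14, 12, 20, 31, 28, 40]
replicate = rank_preserving_replicate(daily_counts, random.Random(0))
print([round(v, 1) for v in replicate])
```

Repeating this many times gives a set of plausible alternative histories for the region, from which an interval could be read off as in the card-deck example, but with the time pattern intact.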
Vinod said that he never imagined his MEB would be applied to a situation like this, referring to the pandemic.
“I did know that it has wide applications in different fields, but something like this was quite a surprise,” he said.
As of April 21, when he last checked, Vinod’s computer algorithm had been downloaded almost 49,000 times, according to Vinod.
“It is applied in so many different spheres,” Vinod said. “There are a lot of problems where the inference is difficult and you have time sensitive data.”
Accurate projections for time-sensitive data are necessary at the regional level in order for public officials to make critical decisions, such as how many snow plows to have on hand. With the current crisis of the pandemic, accurate projections are even more critical.
Having accurate projection models regionally for COVID-19 is necessary so officials can optimize the number of ventilators, hospital beds and ICU beds for the situation.
For this reason, Vinod said that he thinks the “MEB would be helpful to apply to NYC or other US states, since currently the US is struggling with a lack of testing data.”
In a recent interview with Fordham News, he said that it is very important that decisions be based on science and data, rather than on hope. He is very pleased that the work of economists is being applied to the medical sector.