The Importance of Data Understanding in Policy Decisions




Even as data science becomes ubiquitous, we still have a shortage of people who truly understand data. Yaneer Bar-Yam is a Professor and President of the New England Complex Systems Institute. He graduated from MIT and is an expert in complex systems. Throughout this pandemic, he has been meticulously analyzing COVID-19 data using both simple statistical models and complex system models and following government policies from all over the world. He wrote more than a few blog posts on his website to try to validate policy responses and to urge policymakers to consider how they are using data in their decision-making process.

What he sees are some common mistakes that we can avoid when analyzing our data. These are pervasive mistakes that people often make in different industries and in different research areas. As scientists, technologists, and policymakers, it’s possible to vastly improve our day-to-day judgments by side-stepping these mistakes.

Yaneer says, “If you want to use science for policy, you have to check the assumptions. In this case, it’s imperative to check the assumptions. It’s a responsibility.”
There’s a difference between doing science and setting policy. Politicians and business leaders are setting policy. These policies affect lives. The stakes are much higher when lives are involved.

According to the World Health Organization, this pandemic has already sickened over 3 million people and taken more than 200,000 lives globally. Some countries’ policies prevented community spread of the disease. Other countries’ policies were not enough.

Yaneer says, “There are always scientific assumptions made in scientific papers. Policy-makers have to know how to check those assumptions made and learn to evaluate the findings in light of those assumptions.”

At the same time, statistical models are widely used for analysis in epidemiology, social science, and economics. These models are adequate for answering certain questions. But to model our world in a way that accounts for changes that may occur, we need more sophisticated models. Physics offers methods for modeling complex systems, which Professor Bar-Yam has been using in his decades of research.

Yaneer says, “Why do we need the science of complex systems? If there are dependencies in the systems, then statistics don’t work. Standard calculus can’t describe things properly when there are abrupt large scale changes that involve changes in what many individuals are doing.”

No matter what models you use, you are selecting a group of variables that will give you the best picture of the answers that you seek. Depending on the data, with the right variables, you will gain the answers you are looking for. But, with the wrong variables, you can be drastically misled.

Yaneer says, “Often, it’s not the math that is wrong. It’s the variables that are wrong. You need to figure out what the right variables are. When people write down models, there’s a perspective that you have to include all the details. That’s not the case. It’s not possible to include everything. So you may miss something important. At the same time, most of the details are not important.”

There are techniques in physics such as Renormalization Group that will enable you to identify the relevant variables. Then, you can validate your model.

Yaneer says, “If you did your analysis wrong, then your model won’t fit the data. We need to clarify what we need to pay attention to. Then, you can figure out what the interventions can be.”

In the past decade, we’ve built models in epidemiology, social sciences, and economics that are largely based on statistical frameworks. These models are using data from past events to predict future events.

The use of statistical models has a hundred-year history, and scientific fields haven’t caught up to the advances in the mathematical methods of complex systems science. These newer methods are not taught in most universities. Even the new AI methods are mostly based on statistical assumptions. New big data sources help, but only if the modeling assumptions are changed.

There are many modeling techniques, from more classic differential equations to agent-based models. However, only if they use the right variables will they get the right answers.

The world had not seen a global coronavirus outbreak before this one, so we don’t have past data to go on. During this pandemic, the agents within the system (a business, a person, a family, a community) change their behaviors. Through these behavioral changes and interactions with one another inside this complex system, events occur. This points to the need to use complex system models that focus on the most important information in our analysis.

Yaneer says, “It’s not about the sophistication of the math. It’s really about the right variables. You can find a simple model. Sometimes you just need a simple model. Ask: Can that model answer the question that we want to know?”

When you think about the pandemic and its effect on people’s lives, societies, and healthcare systems, you quickly realize that you will need to adopt a complex system framework rather than a statistical framework, especially when it comes to policy.

Each community often has different characteristics. For instance, large cities have higher population density. Rural areas have less contact between people, but they may also have more frequent gatherings of church groups. These characteristics determine the rate of transmission within the community. Once community spread occurs, it is difficult to stop.

Countries such as the U.S., the United Kingdom, and Sweden are dealing with community spread because they acted too late.

Academic scientists often think that behavioral changes are not possible, especially in a short time frame. But when the stakes are high, not only are behavioral changes possible, these individual changes also drastically affect the outcome of the pandemic for society as a whole.
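To make the cost of acting late concrete, here is a minimal sketch in Python. It takes the roughly 10x-per-week growth figure quoted later in this article as the pre-intervention rate. Everything else is an assumption made only for illustration: the function name `cases_after`, the starting count of 100 cases, the six-week horizon, and especially the post-intervention factor of 0.5x per week, which does not come from the article.

```python
def cases_after(total_days, intervention_day, initial_cases=100.0,
                factor_before=10.0, factor_after=0.5):
    """Case count after total_days when the weekly growth factor changes once.

    factor_before: weekly multiplication before behavior changes
                   (about 10x per week, as quoted in the article).
    factor_after:  weekly multiplication afterwards
                   (0.5 is a hypothetical value, not from the article).
    """
    days_before = min(intervention_day, total_days)
    days_after = total_days - days_before
    growth = factor_before ** (days_before / 7.0)
    decline = factor_after ** (days_after / 7.0)
    return initial_cases * growth * decline


# Same six-week horizon, behavior changes after one week versus two weeks.
early = cases_after(total_days=42, intervention_day=7)
late = cases_after(total_days=42, intervention_day=14)

print(f"change after 1 week:  {early:.2f} cases at day 42")
print(f"change after 2 weeks: {late:.2f} cases at day 42")
print(f"cost of one week of delay: {late / early:.0f}x more cases")
```

Under these assumptions, each week of delay multiplies the case count at the end of the period by `factor_before / factor_after`, regardless of the starting count. The exact numbers depend entirely on the assumed factors, but the sketch shows why the timing of behavior change can dominate the outcome.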

Recommendations for Citizens
We are not nearly at the end of the pandemic even though some communities are choosing to open back up. It’s a good time to review the policies that came out of the data we’ve collected since the beginning of the pandemic.

Notably, in some countries with community spread, ordinary citizens are making their own judgments. That is reassuring. Our democratic process allows us the freedom of choice. That choice has to be exercised to protect ourselves and not to put us in further danger.

Yaneer says, “Coronavirus is a terrible disease. As a result, we don’t want it. When you get it, you can get really sick. It spreads fast through the community. No matter what society you live in, the multiplication rate for the infection is about 10x per week. When the rate grows exponentially, we quickly go from outbreak to pandemic. You don’t want community spread. If you take the right actions quickly, then you save your life and save people’s lives in your community by preventing community spread of the virus.”
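The growth figure in the quote can be unpacked with a little arithmetic. The sketch below, in Python, assumes pure exponential growth at exactly 10x per week with no change in behavior. That is a simplification of the "about 10x per week" in the quote, and the function names and the starting count of 100 cases are hypothetical, chosen only for illustration.

```python
import math

WEEKLY_FACTOR = 10.0  # "about 10x per week", taken from the quote above


def daily_factor(weekly_factor=WEEKLY_FACTOR):
    """Equivalent day-over-day multiplication for a given weekly factor."""
    return weekly_factor ** (1.0 / 7.0)


def doubling_time_days(weekly_factor=WEEKLY_FACTOR):
    """Days needed for the case count to double under constant growth."""
    return 7.0 * math.log(2.0) / math.log(weekly_factor)


def projected_cases(initial_cases, days, weekly_factor=WEEKLY_FACTOR):
    """Case count after `days` of unchecked exponential growth."""
    return initial_cases * weekly_factor ** (days / 7.0)


print(f"daily factor:  {daily_factor():.2f}x per day")
print(f"doubling time: {doubling_time_days():.1f} days")
for week in range(1, 4):
    # Hypothetical starting point of 100 cases.
    print(f"after {week} week(s): {projected_cases(100, 7 * week):,.0f} cases")
```

At this rate, cases grow by roughly 39% per day and double about every two days, so a hypothetical 100 cases would become 100,000 in three weeks if nothing changed.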

Debunking a few myths:

Herd Immunity
Yaneer says, “Herd Immunity is scientism and not science. It’s the idea that we should just let the disease kill people.”

Scaling Population Size
Yaneer says, “When it comes to preventing the community spread of the disease, scaling the population does not matter. If you look at countries that have been successful in preventing community spread, it only takes 5 weeks to stop the spread if you take drastic actions such as: implementing contact tracing, travel restrictions, extensive testing, wearing masks and isolating people without infecting their families.”

Excerpted from https://www.forbes.com/sites/cognitiveworld/2020/05/07/this-professor-says-weve-been-looking-at-the-coronavirus-data-wrong/
