Calling it a tidal wave of numbers, graphs, or data won’t be an overstatement in the current COVID-19 scenario. Petabytes of data are running through the information expressway, and almost every corporation is scurrying to assimilate it. For what exactly?
As the entire globe struggles to grasp the sheer scale and scope of the problem, data aggregators and tech giants are trying to step up by introducing newer ways to predict, analyze, and derive comforting insights from numbers. From allocating resources and measuring the effectiveness of pandemic measures to crafting responses, data can be the only tangible thing in the face of uncertainty.
A recent webinar on ‘Using Data Science to Fight COVID-19’ by Munther Dahleh, Director of the MIT Institute for Data, Systems, and Society (IDSS), conducted in collaboration with Great Learning, suggested the same. IDSS has created a volunteer research group, called Isolat, that helps provide pandemic-related data.
“Statistics and probability are primarily tools that can be leveraged to measure uncertainty, and IDSS can use the expertise to impact policymaking,” said Prof. Dahleh. The webinar covered how data can be noisy yet still be made relevant, followed by the approaches and broad areas where data can contribute, including:
- Creation of heterogeneous data sets such as interventions, mobility, and spread of the virus
- Predicting an array of critical time-bound variables
- Making sense of policies and effects on intervention with respect to the virus
Relatively little has been said about how data and advanced technologies can be the best way to assess the COVID-19 situation. Dinner-table conversations about the future have now turned to ‘present’ problems, owing to the unpredictability of COVID-19.
“Is a financial collapse inevitable? The 2008 crisis was a systemic risk because the mortgage market was fragile. And if real-time data collection can be an essential part of the ecosystem today, this can be avoided.” While the financial side of COVID-19’s consequences may look far off now, Prof. Dahleh believes this is one aspect that should be taken into consideration.
More data, more insights, or less data, even better insights?
The debate between quality and quantity can’t and shouldn’t be overlooked. Probably, the answer here lies in the right methods and most importantly, computation. Putting the right computational power and data into the problem could be a driving factor to reduce the impact of a pandemic.
To get the basic data points in order: there are around 31mn cases, 980k deaths, limited PCR testing, varying symptoms, a long incubation period, and questionable antibody testing. The only purpose of tracking these numbers is to manage the number of infections and make good use of resources while doing so.
The three heterogeneous components of one of the most easily understood yet complex datasets are physical engineered systems, social behaviour, and institutions. The properties, propagation, weight, and size of the virus are the biological aspects of COVID-19; social behaviour is what we call the economics of society, which can be easily tracked via datasets if managed properly; and policymakers, stakeholders, economic forums, and health organizations make up the institutions.
Mitigating the problem has its roadblocks because the datasets are unstructured: 90% of people could be going to work in one country, whereas 90% could be working from home in another.
Prof. Dahleh, along with his team, has worked on a solution that involves two brackets for gathering data: observations and decision systems. “How much lockdown is necessary to contain this disease?” From the right approaches to how testing works as a precautionary measure, there could be many more aspects to these datasets than just numbers.
A little analysis of the problem could encompass questions such as this: when there was no lockdown, many people from the vulnerable sections of society were dying. One of the reasons, for instance, was that they lived in high-density areas. But when the lockdown was imposed, they had to step out to feed their families, and they were still dying. The solution?
Prof. Dahleh suggests policymakers can introduce reforms in such communities using precise data on the what, when, and how of those communities and the jobs that require people to step out.
Data, data, data – Noisy, but useful
Throughout the pandemic, a lot of emphasis has been put on sharing critical information across many countries. It made sense then, and it makes sense now. IDSS, during its research, came across two different scenarios to understand the situation better.
Scenario 1 –
This is a systematic way of estimating the growth rate of COVID-19. On the left of the diagram, the team plotted the number of infected people as a function of time in a situation where the pandemic is under control. Initially cases grow, then they peak and come down, with patients presenting different symptoms, from asymptomatic and mild to severe.
On the right, in a situation where the pandemic is not controlled, there is an exponential rise in the number of cases, followed by a further phase of growth that is not strictly exponential.
The insight from this? Random sampling. In March-April, there was no random testing in Iceland. The people getting tested were self-declared sick (‘purple’ in the picture).
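Why random sampling matters can be illustrated with a small simulation. This is only a sketch with made-up parameters, not the IDSS analysis or real Iceland data: the prevalence, the probabilities of seeking a test, and the function name `simulate_positivity` are all assumptions chosen for illustration. It compares the share of positives among people who self-select into testing with the share found in a random sample of the same population.

```python
import random


def simulate_positivity(population=100_000, prevalence=0.02,
                        p_test_if_infected=0.5, p_test_if_healthy=0.01,
                        sample_size=5_000, seed=0):
    """Compare positivity under self-selected testing and random sampling.

    All parameters are hypothetical; they are not taken from the webinar.
    """
    rng = random.Random(seed)

    # True infection status of each person in the synthetic population.
    infected = [rng.random() < prevalence for _ in range(population)]

    # Self-selected testing: infected people are assumed far more likely
    # to come forward for a test than healthy people.
    self_selected = [
        status for status in infected
        if rng.random() < (p_test_if_infected if status else p_test_if_healthy)
    ]

    # Random testing: every person is equally likely to be tested.
    random_sample = rng.sample(infected, sample_size)

    return {
        "true_prevalence": sum(infected) / population,
        "self_selected_positivity": sum(self_selected) / len(self_selected),
        "random_sample_positivity": sum(random_sample) / sample_size,
    }


result = simulate_positivity()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the self-selected positivity sits far above the true prevalence, while the random sample stays close to it, which is why counts from self-declared patients alone say little about how widespread the infection is.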
Scenario 2 –
In the second scenario, the team assumed severe cases are a fixed ratio of total cases, which meant working only with the severe-case data. Under the assumption that people with severe symptoms are a constant fraction of the people who have COVID-19, this data suddenly became valuable. “What matters is the ratio, not the absolute number, it just cares about the growth,” Prof. Dahleh commented.
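The reasoning behind this can be checked with a short sketch. If severe cases are a constant fraction of total cases, then on a log scale the two curves differ only by a constant offset, so a growth rate fitted to either one is the same. The numbers below (a 0.2 daily growth rate and a 5% severe fraction) and the helper `growth_rate` are illustrative assumptions, not figures or code from the IDSS work.

```python
import math


def growth_rate(counts):
    """Estimate an exponential growth rate as the least-squares slope
    of log(counts) against the day index (0, 1, 2, ...)."""
    n = len(counts)
    log_counts = [math.log(c) for c in counts]
    day_mean = (n - 1) / 2
    log_mean = sum(log_counts) / n
    numerator = sum((day - day_mean) * (value - log_mean)
                    for day, value in enumerate(log_counts))
    denominator = sum((day - day_mean) ** 2 for day in range(n))
    return numerator / denominator


# Synthetic total cases growing at an assumed rate of 0.2 per day.
total_cases = [100 * math.exp(0.2 * day) for day in range(14)]

# Assumed constant severe fraction of 5%.
severe_cases = [0.05 * cases for cases in total_cases]

rate_total = growth_rate(total_cases)
rate_severe = growth_rate(severe_cases)
print(f"growth rate from total cases:  {rate_total:.4f}")
print(f"growth rate from severe cases: {rate_severe:.4f}")
```

The fraction itself never has to be known for this to work; it only has to stay roughly constant over the period being fitted.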
Does Intervention even matter?
Every country across the globe wanted to measure the effect of its lockdown, and Google’s data turned out to be helpful: how many businesses shut down, the number of new hospitals, and so on. The IDSS team evaluated datasets from the United States and the United Kingdom. This is the curve of the number of deaths –
Followed by what happened in India after the lockdown was imposed –
“If India didn’t impose the lockdown early, there’d have been 9 times more deaths,” is how Prof. Dahleh put it.
More than any other human event in the recent past, this pandemic has been researched and studied intensely. There will undoubtedly be both good and bad data around it, but if it can be translated into a life-saving asset, we’d win.
This also underscores the importance of data and its growing power in the field of technology. Data science has become a sought-after career in today’s era of technology. If you want to learn data science, check out the Post Graduate Program in Data Science and Business Analytics by McCombs School of Business at The University of Texas at Austin.