Leaders at the Biocomplexity Institute of Virginia Polytechnic Institute and State University (Virginia Tech), in Blacksburg, Virginia, have been busy working on an exciting initiative: they have been developing an analytic modeling platform and simulation environment that make it possible to prepare for and prevent the spread of common infectious diseases such as flu as well as rare diseases like Ebola and Zika. Among the questions being explored: how to address the challenges of tracking constantly changing data to identify patterns and stop the spread of infectious diseases; how to understand the spread of diseases by gaining an understanding of where patients live and work, and their behavioral patterns; and how to determine the advisability of mass inoculations in cases of disease outbreak. One technology partner in this work has been Persistent Systems, whose U.S. headquarters are in Santa Clara, California (with global headquarters in Pune, India).
Recently, Healthcare Informatics Editor-in-Chief Mark Hagland spoke with Chris Barrett, a Virginia Tech professor and the executive director of the university’s Biocomplexity Institute, about his and his colleagues’ initiative in this area. Below are excerpts from that interview with Professor Barrett.
Professor Barrett, could you walk me through some of your work on this initiative, and explain to me what is at the core of what you and your colleagues are doing?
We’re developing a system of tools that allow for very, very granular access to entire populations, in the analysis of infectious disease, as well as other phenomena, such as environmental impacts around air quality, ozone, etc. In terms of infectious disease, the big-data world has allowed us a lot of individual access to psychosocial data: who you are, what you do, your demographics, etc. So we can build synthetic databases that provide for realistic patterns of movement and interactions in time and space. Say we’re talking about an aerosol-borne disease. We know where people are, and who’s close to whom for how long for a variety of reasons, so that we can analyze potential exposures in a very detailed way. So we’ve developed an infectious disease diffusion model, but built from the bottom up, from detailed people and places, so that we can understand what aspects of behavior, what components of activities in a population, can lead to spreading infection; and that can guide decision-making to stop or mitigate that.
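For readers who want a concrete picture of what a "bottom-up" diffusion model can look like, here is a minimal Python sketch. It is not the Biocomplexity Institute's platform: the function name `simulate_outbreak`, every parameter value, and the simple susceptible/infectious/recovered states are illustrative assumptions. It only shows the general idea Professor Barrett describes, where synthetic individuals visit locations, and exposure depends on who shares a location with whom and for how long.

```python
import random
from collections import defaultdict


def simulate_outbreak(num_people=200, num_locations=20, days=30,
                      p_per_hour=0.02, recovery_days=5, seed=0):
    """Toy agent-based epidemic; all names and values are hypothetical."""
    rng = random.Random(seed)
    # Each synthetic person has a fixed "home" location (assumption).
    home = [rng.randrange(num_locations) for _ in range(num_people)]
    state = ["S"] * num_people          # S = susceptible, I = infectious, R = recovered
    days_infected = [0] * num_people
    state[0] = "I"                      # seed the outbreak with one case
    infectious_per_day = []

    for _ in range(days):
        # Who is where today, and for how many hours.
        visits = defaultdict(list)      # location -> [(person, hours)]
        for person in range(num_people):
            loc = home[person] if rng.random() < 0.5 else rng.randrange(num_locations)
            visits[loc].append((person, rng.randint(1, 8)))

        # Exposure: susceptible people sharing a location with infectious ones.
        newly_infected = set()
        for group in visits.values():
            infectious = [(p, h) for p, h in group if state[p] == "I"]
            for person, hours in group:
                if state[person] != "S":
                    continue
                for _, other_hours in infectious:
                    overlap = min(hours, other_hours)
                    # Chance of at least one transmission over the shared hours.
                    if rng.random() < 1 - (1 - p_per_hour) ** overlap:
                        newly_infected.add(person)
                        break

        # Recovery after a fixed number of infectious days (assumption).
        for person in range(num_people):
            if state[person] == "I":
                days_infected[person] += 1
                if days_infected[person] >= recovery_days:
                    state[person] = "R"
        for person in newly_infected:
            state[person] = "I"

        infectious_per_day.append(state.count("I"))

    return infectious_per_day, state


if __name__ == "__main__":
    curve, final_state = simulate_outbreak()
    print("Infectious per day:", curve)
    print("Never infected:", final_state.count("S"))
```

Changing the visit pattern (for example, forcing everyone to stay at their home location) and re-running gives a rough sense of how such a model can be used to compare behaviors, though a realistic system would rely on far richer data than this sketch assumes.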
And you are drilling down on the levels of data you’re able to access, correct?
Yes, we have systems involving “avatars” for all 7.5 billion individuals on the planet and a billion-and-a-half activity locations. The United States can be pretty detailed, in terms of the data we can access; China is less detailed, but even in such places it is more detailed than you might suppose. So, to use the example of the United States, it’s a very, very large country. Most infectious disease epidemics will run for a couple of hundred days, and we can run a simulation of the activities of individuals, moving around, spreading the disease, let’s say, if it’s an aerosol-borne disease, figuring out who’s going where, based on activity patterns. We can run a 200-day epidemic projection on a population the size of the entire United States, in five to nine seconds. These things run on very, very fancy, high-end computers, supercomputers, and yet the analytics are delivered by web services. So we can provide you web access via your cell phone or laptop or tablet; and we’re designing interfaces, so that you can ask a question, and the program will answer it. So if you’re Johnny’s mom and dad, you can ask, how many kids are likely to have flu today? And what’s the likelihood that he’ll get sick if he goes to school? Those kinds of apps are available to everybody. And by having people asking those questions, we can adjust the underlying data representation, to use them like human sensors, and it becomes more of a data library than a traditional human calculation.
So the technology allows for predictive and explanatory analytics that can be delivered to researchers, public health officials, and regular people on the street, and can guide their behavior and decision-making, whether we’re talking about a public health official or pharma manufacturer, or little Johnny’s mom and dad. And we can use this technology for a variety of uses, including infectious disease.
What’s novel about this approach?
What’s different in this from traditional modeling is that we’re using really a lot of personal and social information, as well as enterprise information in businesses. And it’s never really been possible to use data at this level of granularity. We can use synthetic populations and models to give us the capacity to generate the phenomena at the population level. In the past, we’ve had aggregate-level data. But this way, we can calculate new dynamics to generate new models.