The National Database for Autism Research (NDAR) is becoming a one-stop shop of sorts for autism researchers. Most recently, the Bethesda, Md.-based government agency, which is supported by the National Institutes of Health (NIH) and has created a scalable informatics platform, added the Autism Genetic Resource Exchange (AGRE) to its database. The move, according to NDAR, makes the database one of the largest repositories to date of genetic, phenotypic, clinical, and medical imaging data related to research on autism spectrum disorder (ASD).
NDAR has built up its data repository under the federated model, which allows independent repositories to connect and operate on the same platform. AGRE is the latest repository to join NDAR, placing it alongside the Pediatric MRI Data Repository, the Autism Tissue Program, and the Interactive Autism Network. NDAR says it is trying to be the central query tool across all databases that involve autism research.
Recently, Healthcare Informatics Assistant Editor Gabriel Perna spoke with Greg Farber, Ph.D., director of NDAR and of the Office of Technology Development and Coordination at the National Institute of Mental Health (NIMH), about the autism research data repository, including why the group chose a federated model and why autism is the right disorder for this kind of repository.
Why autism? What makes it the right disorder to study under this kind of database model?
Fundamentally, not much is known about autism’s true biological underpinnings. Also, the prevalence of autism seems to be increasing quite dramatically. It really does look and feel like an important problem that would benefit from data aggregation, because the community as a whole doesn’t have a clear consensus on what we should do going forward. What NDAR allows us to do is define all of the sub-categories of autism that exist, and hopefully find treatments that work for those sub-categories.
What do you like about a federated data repository model?
NDAR has really implemented an apples-to-apples comparison with the data dictionaries that we have defined and translated into the data elements at these other repositories. A researcher really can come in and search in one place without having to do that translation. When I say translation, I mean, for example, ‘Is sex referred to as M/F or 1/0?’ Those sorts of things. That sounds trivial, but as it turns out, if you’re a researcher looking at data from multiple sources and that translation hasn’t been done, you have a hard problem. It’s not only convenient, but it puts data together in ways that would have been very hard otherwise. Because not much is known about autism from a biological basis, this is a way to bring smaller collections into a larger collection, which is hopefully more powerful.
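To make the translation problem Farber describes concrete, here is a minimal Python sketch of mapping each repository's coding of sex onto one shared vocabulary. It is not NDAR's implementation: the repository names (`repo_a`, `repo_b`), the `SEX_MAPS` table, and the choice that 1 means male and 0 means female are all assumptions made purely for illustration.

```python
# Hypothetical per-repository value maps. The interview does not say which
# numeric code stands for which sex; 1 -> "M" and 0 -> "F" is an assumption.
SEX_MAPS = {
    "repo_a": {"M": "M", "F": "F"},
    "repo_b": {"1": "M", "0": "F"},
}


def harmonize_sex(source, raw_value):
    """Translate one repository's sex code into the shared M/F vocabulary."""
    mapping = SEX_MAPS[source]
    key = str(raw_value).strip().upper()
    if key not in mapping:
        # Fail loudly instead of guessing at an unknown code.
        raise ValueError(f"unknown sex code {raw_value!r} for {source}")
    return mapping[key]


def harmonize_records(source, records):
    """Return copies of the records with the 'sex' field translated."""
    return [{**r, "sex": harmonize_sex(source, r["sex"])} for r in records]
```

With the translation done once, up front, records from both hypothetical sources can be pooled and searched on `sex` without the researcher needing to know how each repository originally coded the field.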
What kind of research data does AGRE bring to the table?
They have a variety of different types of data. I think that some of the most important parts of the data are the sequencing, the genome data. People will hopefully be able to merge those data sets with others. That’s my view, though; AGRE has a lot of data that the community will hopefully find interesting.
Are there more autism repositories that NDAR would like to federate with in the future?
Yes, there are several with which we’re at various stages. Until those agreements are finalized, I can’t talk about them. We expect AGRE won’t be the last one.
What has pushed this data repository along?
The government is trying to make data more readily available. However, with data involving human subjects, there are a number of issues that one needs to worry about: security, confidentiality, privacy, etc. Because of that, it is not just a database, so we’ve had to build it ourselves from scratch to make sure we’ve dealt with all of the confidentiality issues appropriately.