St. Jude Children's Research Hospital announced this week the launch of St. Jude Cloud, an online data-sharing and collaboration platform that provides researchers access to what St. Jude describes as the world's largest public repository of pediatric cancer genomics data.
Developed in partnership with DNAnexus and Microsoft, St. Jude Cloud provides accelerated data mining, analysis and visualization capabilities in a secure cloud-based environment. The platform offers extensive next-generation sequencing data and unique analysis tools to accelerate research and cures for life-threatening pediatric diseases.
“Sharing research and scientific discoveries is vital to advancing cures and saving lives, especially in rare diseases like pediatric cancer,” James R. Downing, M.D., St. Jude president and chief executive officer, said in a statement. “St. Jude has shared data and resources since its founding, and collaboration with researchers across the world is at the core of our mission. St. Jude Cloud offers researchers access to genomics data and analysis tools that will drive faster progress toward cures for catastrophic diseases of childhood.”
According to St. Jude, the interactive data-sharing platform allows scientists to explore more than 5,000 whole-genome (WGS), 5,000 whole-exome (WES) and 1,200 RNA-Seq datasets from more than 5,000 pediatric cancer patients and survivors. By 2019, St. Jude expects to make 10,000 whole-genome sequences available on St. Jude Cloud.
These data have been generated from three large St. Jude-supported genomics initiatives: the St. Jude-Washington University Pediatric Cancer Genome Project, designed to understand the genetic origins of childhood cancers; the Genomes for Kids clinical trial, focused on moving whole-genome sequencing into the clinic; and the St. Jude Lifetime Cohort study (St. Jude LIFE), which conducts comprehensive clinical evaluations on thousands of pediatric cancer survivors throughout their lives.
Researchers may also upload their own data in a private, password-protected environment to explore using tools available on the St. Jude Cloud platform.
St. Jude Cloud features a collection of bioinformatics tools to help both experts and non-specialists gain novel insights from genomics data. These tools include validated data analysis pipelines and interactive visualization tools that simplify discovery in large datasets. Data and results can be securely shared with collaborators within the platform, St. Jude officials state.
The platform enables researchers to explore St. Jude data or their own results using interactive visualizations powered by ProteinPaint, the genomic visualization engine developed at St. Jude. The ProteinPaint visualizations allow users to rapidly navigate through the genome and identify genetic changes linked to cancer development. St. Jude Cloud tools also produce custom visualizations of the user’s own research data for exploration or comparison with St. Jude-generated data.
In a testament to how the platform enables research discoveries to be verified faster, a St. Jude scientist was able to use St. Jude Cloud to replicate, in just a few days, experimental findings that originally took the research team more than two years to make. The original team discovered mutations connected to UV damage in a B-cell leukemia in work that was recently published in Nature. The intriguing finding led the team to ask whether other leukemia samples not included in the original study might have similar patterns of mutations. They turned to the high-quality datasets available in St. Jude Cloud, where the platform's rapid computing capabilities enabled them to rediscover the same UV-linked mutational signature in pediatric B-cell leukemia patients. Identification of these additional samples will help researchers understand how UV damage could be linked to a blood cancer, and potentially point to new avenues for therapy. More details of this work will be presented at the AACR annual meeting in Chicago on Sunday, April 15 at 3 p.m.
“St. Jude Cloud is a powerful resource to drive global research and discovery forward,” Jinghui Zhang, Ph.D., chair of the St. Jude Department of Computational Biology and co-leader of the St. Jude Cloud project, said in a statement. “Providing genomic sequencing data to the global research community and making complex computational analysis pipelines easily accessible will lead to progress in eradicating childhood cancer. St. Jude has been committed to sequencing and understanding pediatric cancer genomes for nearly a decade, and we will continue to generate and share data with the research community in the future.”
The data available through St. Jude Cloud are stored on Microsoft Azure, which can handle datasets on the massive scale required for large genomics studies such as those developed by St. Jude.