The Microbiome Cloud Project (MCP), developed by NIAID and the National Human Genome Research Institute (NHGRI), addresses one of the greatest challenges facing microbiome scientists: large-scale data analysis. The MCP seeks to facilitate data access and analysis by bringing together NIH-funded Human Microbiome Project (HMP) data and analysis tools in a collaborative online environment. In September 2013, NIAID and NHGRI launched the first phase of the MCP, making a portion of HMP data available as a public dataset on the Amazon Web Services cloud.
Thousands of species of bacteria, fungi, and other microbes naturally colonize the skin and the moist linings of the digestive, respiratory, and urogenital tracts. Researchers are just beginning to assess the impact of these microbial inhabitants on health and disease.
As part of the HMP, scientists are sequencing genomes from known microbes and performing metagenomic analyses to characterize the body’s microbial communities. HMP contributors have generated approximately 14 terabytes of sequence data, enough information to fill more than 3,000 standard DVDs. In total, the collective genome of the body’s microbes is estimated to be over 100 times larger than the human genome.
Mining HMP data promises to help researchers understand the role of the microbiota and identify new targets for drugs and vaccines. However, analyzing these data presents challenges. Data downloads are time-consuming, and many researchers do not have access to the computing infrastructure, analysis tools, or technical expertise required to assemble and analyze complex datasets.
The Microbiome Cloud Project
In response to these challenges, NIAID and NHGRI assembled a team of experts from academia and industry to develop an online solution that brings together data and tools in a collaborative environment. The launch of cloud-based HMP datasets is already helping researchers navigate the data more easily.
The team is currently developing the next phase of the MCP, which will add more datasets, analysis tools, and supporting documentation such as online tutorials. This cloud environment will offer researchers access to vast amounts of data and high-performance computing resources.
The MCP promises to facilitate use of the available HMP data. By bringing together data and tools in the cloud, the MCP will encourage greater scientific collaboration and data sharing, which in turn will help advance science. The project will also inform NIH best practices for using cloud technologies in biomedical research.
The MCP team is preparing to launch the next phase of the project. After a public testing period, the team will review the results and lessons learned and will share these with the broader scientific community.