Using the NIAID Data Ecosystem Discovery Portal to Search Across Data Repositories

Data Science Dispatch

NIAID has developed a platform to help researchers find data related to infectious and immune-mediated disease (IID) across multiple data repositories. The NIAID Data Ecosystem Discovery Portal is a centralized hub cataloging millions of datasets from over 50 sources.

Researchers can use the Discovery Portal to find data, resources, and computational tools from different repositories. This can save them time otherwise spent combing through multiple sources and help them find datasets they weren’t aware of previously.

The Discovery Portal includes resources from IID and generalist repositories. Representative resources include NIAID-sponsored repositories such as AccessClinicalData@NIAID, ImmPort, and VDJServer, as well as repositories funded outside of NIAID but relevant to IID research. Resources in the Discovery Portal include a diverse array of data types spanning multiple domains of IID research, including -omics data, clinical data, epidemiological data, pathogen-host interaction data, flow cytometry, imaging, and other experimental data.

The Discovery Portal supports NIAID objectives of maximizing the impact of scientific data, reducing duplication of effort in research, and promoting data reuse, data transparency, and compliance with data-sharing policies. The portal aligns with many of the principles of findable, accessible, interoperable, and reusable (FAIR) data practices by making data easier to find and access.

Using metadata to drive discovery

The NIAID Data Ecosystem Discovery Portal does not contain data itself. Instead, it contains detailed information about IID datasets and resources drawn from metadata. Users can then access the resources through external links.

The portal uses metadata to support several key features:

  • Search and Discovery: Users can rapidly search millions of datasets across both IID and generalist repositories using the Search or Advanced Search options. Metadata categories such as funding source, repository, and conditions of access help filter search results and identify relevant research data.
  • Metadata Compatibility: Each individual dataset in the Discovery Portal has a “metadata compatibility score,” which displays the specific metadata elements collected for a given resource. Additionally, the Discovery Portal has metadata compatibility visualizations that capture the breadth of metadata at the repository level. This information can help researchers and data contributors quickly understand a repository’s metadata structure, aiding in decisions about where to deposit or retrieve resources.
  • Downloadable Metadata: The portal has buttons that allow users to download metadata to perform meta-analyses.

The Discovery Portal is working to fill in missing or incomplete metadata fields (such as Pathogen Species, Health Condition, and Host Species) by augmenting and standardizing metadata, so that users have more of the information they need.
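For readers who download metadata for a meta-analysis, a minimal sketch of checking how often particular fields are filled in might look like the following. The record layout and the field names (`pathogenSpecies`, `healthCondition`, `hostSpecies`) are assumptions made for illustration and may not match the portal's actual export format; the sample records are made up.

```python
from collections import Counter

# Hypothetical field names, chosen for illustration only;
# the portal's real export may name these fields differently.
FIELDS = ("pathogenSpecies", "healthCondition", "hostSpecies")


def field_completeness(records, fields=FIELDS):
    """Return the fraction of records with a non-empty value for each field."""
    filled = Counter()
    for record in records:
        for field in fields:
            # Treat absent keys, None, empty strings, and empty lists as missing.
            if record.get(field):
                filled[field] += 1
    total = len(records)
    return {f: (filled[f] / total if total else 0.0) for f in fields}


# Made-up sample records standing in for a downloaded metadata file.
sample = [
    {"name": "Dataset A", "pathogenSpecies": "Influenza A virus", "hostSpecies": "Homo sapiens"},
    {"name": "Dataset B", "healthCondition": "asthma", "hostSpecies": ""},
]

print(field_completeness(sample))
```

In practice, the records would come from the downloaded file rather than an inline list, and the field names would need to be adjusted to whatever the export actually contains.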

New Program Collection tool and other features

One of the new features of the NIAID Data Ecosystem Discovery Portal is the “Program Collection” filter. Program Collections are groups of datasets contributed by specialized NIAID research programs and initiatives. The Discovery Portal displays the Program Collection filter on the search page, and current efforts are focused on expanding Program Collection data.

The Program Collection filter allows researchers to discover high-quality, program-specific data relevant to their area of interest and find collections that align with the broader objectives of NIAID’s strategic research efforts. The feature also amplifies the scientific contributions of participating networks and increases the likelihood of researchers using these datasets. 

The Sources page of the Discovery Portal can also help researchers and data providers make informed decisions about which repositories to deposit their data in.

The Discovery Portal is now connected to the National Center for Biotechnology Information (NCBI) databases through NCBI LinkOut. When NCBI database content is linked to data described in the Discovery Portal, a link to the related Discovery Portal entry can be found on the NCBI page.

Learn more by visiting the Discovery Portal, reviewing the Getting Started page, and exploring the Knowledge Center.
