RENCI to lead one of 12 projects to create an NIH Data Commons

The National Institutes of Health (NIH) has tapped the University of North Carolina at Chapel Hill and RENCI, a UNC technology research institute, to lead a project that is part of a nationwide effort to develop an NIH Data Commons, a shared virtual space where biomedical researchers can easily and securely work with data, analytical tools, and applications.

The Data Commons Pilot Phase provides nearly $646,000 to RENCI and UNC-Chapel Hill and another $578,000 to partners at seven institutions. The project launched in September and phase 1 will conclude in 2018. The award is one of 12 Data Commons Pilot Phase awards totaling $9 million announced by the NIH on Nov. 6.

The NIH Data Commons Pilot Phase is expected to span fiscal years 2017-2020. The pilot phase will explore the feasibility of, and best practices for, making digital objects available through collaborative platforms. The work will be done on public clouds, which are virtual spaces where service providers make resources, such as applications and storage, available over the internet. The Data Commons aims to make biomedical research data Findable, Accessible, Interoperable, and Reusable (FAIR) for more biomedical researchers.

“Harvesting the wealth of information in biomedical data will advance our understanding of human health and disease,” said NIH Director Francis S. Collins, M.D., Ph.D. “However, poor data accessibility is a major barrier to translating data into understanding. The NIH Data Commons Pilot Phase is an important effort to remove that barrier.”

The UNC/RENCI Data Commons effort will engage multi-institutional, multidisciplinary teams to address eight key capabilities: guidelines and metrics for making data findable, accessible, interoperable, and reusable; Global Unique Identifiers (GUIDs, which are numbers used to identify information in computer systems); open standard application programming interfaces (APIs); cloud-agnostic architectures; workspaces for computation; research ethics, privacy, and security (including authentication and authorization); indexing and searching; and use cases. RENCI is the overall lead for the UNC Data Commons Pilot Phase project and also leads the efforts on cloud-agnostic architectures, workspaces for computation, and indexing and searching.

The products that the RENCI teams plan to deliver include a tool called PIVOT, an acronym for Policy-driven, deeply integrated, Virtualized abstractiOn and federation. PIVOT, to be developed with partners from RTI International, will be a “cloud within clouds” architecture that integrates analytics tools, workflows, data optimization services, and other services that support FAIR principles into the core functionalities of a cloud.

A RENCI-led team also will develop and deploy cloud-agnostic virtual workspaces for computation that can make use of many different scientific workflows and can dynamically provision technical infrastructure and resources across the Commons. In addition, RENCI researchers will work with partners at Lawrence Berkeley National Laboratory and Johns Hopkins University to deliver indexing and search capabilities that allow biological researchers to easily search free-form text, find digital objects, assign IDs, and search using ontology-based inference.

“We live in a time when digital biomedical data are ubiquitous, but the challenge is extracting value from those data in ways that lead to scientific breakthroughs and innovations in healthcare delivery,” said Stan Ahalt, Ph.D., director of RENCI and lead principal investigator for the project. “The NIH Data Commons addresses all the key questions that need to be answered in order to make biomedical data easy to find, access, analyze, share, and reuse. RENCI has been dealing with these kinds of questions for several years in other projects and we look forward to applying and leveraging what we’ve learned in this major, nationwide effort.”

RENCI’s partners in the Data Commons project include researchers at the UNC School of Medicine and the School of Information and Library Science as well as RTI International, Jackson Laboratory for Genomic Medicine, Johns Hopkins University, Lawrence Berkeley National Laboratory, Maastricht University, University of New Mexico, and Oregon Health and Science University.

“RENCI’s innovations in data science are integral to the research we do here at UNC,” said Terry Magnuson, vice chancellor for research and NIH Council of Councils member. “The collaborative nature of our researchers, across disciplines and with other universities, creates a perfect environment in which medicine, information and data science can converge to advance discovery in biomedical research.”

For more information on the NIH Data Commons Pilot Phase, see https://go.usa.gov/xnbRX.

RENCI’s work as part of the NIH Data Commons Pilot Phase is supported by award number 1OT3OD025464-01.