The Biomedical Data Translator Consortium announces publication of companion pieces

Prototype ‘Translator’ system shows promise and has garnered much enthusiasm roughly one year into feasibility assessment

The newly formed Biomedical Data Translator Consortium today announced the release of two inaugural publications in Clinical and Translational Science. The first paper, “Toward a Universal Biomedical Data Translator,” describes the efforts of the Consortium to develop a ‘Translator’ system designed to integrate a variety of data sources and translate the data into insights that can drive innovation and accelerate translational research. The second paper, “The Biomedical Data Translator Program: Conception, Culture, and Community,” focuses on the scientific community that has coalesced to support the program and drive research and development of the prototype Translator system.

The Consortium’s early success is likely attributable to several factors. The biomedical research community has recently gained easier access to the myriad data sources, computational power, analytic tools, and intellectual expertise required to research and develop a Translator system. Another key factor is the approach that the National Center for Advancing Translational Sciences (NCATS) has taken to conceptualizing, implementing, and managing the program.

The Consortium is aware that they are attempting something unprecedented with this project and its team science approach. “When we committed to this vision in 2016, we were well aware of its ambitious scope,” stated Christopher Austin, PhD, director of NCATS, in an editorial accompanying the papers. “We, therefore, designed the program to be different in virtually every way from how National Institutes of Health research projects are typically competed, supported, and managed, and have taken an explicitly flexible and staged approach to its construction.”

NCATS and the Biomedical Data Translator Consortium anticipate the continued success of the program. “Two hundred years ago, chemists created a comprehensive enumeration of the elements and systematic relationships among them,” stated Austin. “This Periodic Table transformed chemistry by placing it on firm scientific footing. We envision the Translator doing the same for translational science.”

About the NCATS Biomedical Data Translator Program
The NCATS Biomedical Data Translator Program was launched in October 2016, with funding from the National Center for Advancing Translational Sciences, a center within the National Institutes of Health (NIH awards 1OT3TR002019, 1OT3TR002020, 1OT3TR002025, 1OT3TR002026, 1OT3TR002027, 1OT2TR002514, 1OT2TR002515, 1OT2TR002517, 1OT2TR002520, 1OT2TR002584). Any opinions expressed in this press release are those of the Translator community writ large and do not necessarily reflect the views of NCATS, individual Translator team members, or affiliated organizations and institutions.

New Project Will Advance Virtual Laboratory Infrastructure

A new grant from the National Science Foundation (NSF) will fund operation of the Global Environment for Network Innovations (GENI) virtual laboratory for the next two years and support researchers in planning a new infrastructure to replace GENI. The NSF allocated $1.7 million to the effort, called Enabling NeTwork Research and the Evolution of a Next Generation Midscale Research Infrastructure (ENTeR). The project will be jointly led by researchers from the Renaissance Computing Institute (RENCI) of the University of North Carolina at Chapel Hill and their collaborators from the University of Kentucky (UK).

For the past decade, GENI has provided critical resources that researchers and students across the U.S. use to develop and test networks and distributed applications at scale on a connected system that is separate from the Internet. RENCI was one of the institutions that played a critical role in developing and deploying GENI.

Photo: Ilya Baldin, director of Network Research and Infrastructure at RENCI.

“With GENI, users can create isolated topologies of computers, servers and storage systems with network links that are programmable,” said Ilya Baldin, RENCI Principal Investigator for ENTeR. “It also incorporates technology that makes experiments much more reproducible than was possible previously.”
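The isolated, programmable topologies Baldin describes are typically requested from GENI as a resource specification (RSpec). The fragment below is a rough, illustrative sketch of a GENI v3 request for two dedicated nodes joined by a single programmable link; the node, interface, and link identifiers are hypothetical, not taken from the article:

```xml
<!-- Hypothetical GENI v3 request RSpec: two exclusive raw PCs and one link -->
<rspec type="request" xmlns="http://www.geni.net/resources/rspec/3">
  <node client_id="node0" exclusive="true">
    <sliver_type name="raw-pc"/>
    <interface client_id="node0:if0"/>
  </node>
  <node client_id="node1" exclusive="true">
    <sliver_type name="raw-pc"/>
    <interface client_id="node1:if0"/>
  </node>
  <!-- The link connects the two node interfaces declared above -->
  <link client_id="link0">
    <interface_ref client_id="node0:if0"/>
    <interface_ref client_id="node1:if0"/>
  </link>
</rspec>
```

Because the experimenter controls the request document, the same topology can be reproduced exactly in later experiments, which is the reproducibility property Baldin notes.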

Researchers using GENI have published more than 370 scientific papers on a broad range of experiments, including testing new or emerging network protocols, developing ways to transport data such as videos more efficiently and answering scientific questions in fields from biology to astronomy. GENI has also provided a resource for teaching classes about distributed computer systems and allowed students to use and experiment with real networks that they built themselves.

Building the testbed of the future

With the new funding, researchers will explore what parts of GENI could be relevant in the next-generation testbed and how to incorporate future computational models into the infrastructure. They are looking toward building a testbed with the ability to perform computations in both centralized and distributed clouds as well as within the network itself.

“This new testbed infrastructure could be used to build an application that performs processing on the fly, such as would be required by internet-of-things devices that stream data constantly,” said Baldin. “On the other hand, it could also be used for experimenting with centralized cloud computing, which is more appropriate when there are substantial computational requirements.”

Although it is difficult to predict all possible uses of the next-generation testbed, it might help researchers to examine emerging problems such as how to ensure security for internet-connected devices from cars to home appliances. It could also serve the science community by providing an experimental environment for processing large amounts of data quickly, allowing scientists to get to answers faster and reducing time to discovery.

An operational change

The new project also introduces a new operational approach for GENI. Until now, the GENI Project Office, housed at BBN/Raytheon, made budget decisions, directed technical efforts, ensured the system’s security, and provided technical support to the community using GENI. With the new grant, control of GENI will transition to RENCI and the University of Kentucky. Subcontracts were also awarded to the University of Maryland, the University of Utah and Internet2, a non-profit computer networking consortium.

“We plan to use this opportunity to figure out how we, as a community, can run this infrastructure in a more distributed, but still coordinated, fashion,” said Baldin. “We are not only taking a look at what type of technology is needed in the future but also at the people and processes that will be needed to run the day-to-day operations of this type of infrastructure.”


RENCI Named as Collaborating Institution for $3 Million Cyberinfrastructure Center of Excellence Pilot

Project will create a model for advising NSF’s largest scientific facilities

The National Science Foundation today named the Renaissance Computing Institute (RENCI) of the University of North Carolina at Chapel Hill as a collaborating institution on a $3 million pilot project to create a model and strategic plan for a Cyberinfrastructure Center of Excellence (CI CoE). The goal of the effort is to establish a reservoir of expertise on best cyberinfrastructure practices for the nation’s largest research facilities.

NSF supports more than 20 large facilities devoted to advancing research in a range of scientific domains, from the far reaches of the universe to the intricacies of Earth’s ecosystems. These facilities, which include telescopes, research vessels and other large research assets funded under the Major Research Equipment and Facilities Construction portion of the NSF budget, can cost hundreds of millions of dollars, take a decade or more to build and typically operate for many years.

Designed to collect and analyze enormous amounts of data, these facilities are often at the leading edge of scientific and computing infrastructure. The new pilot project aims to create a central body for advising large facilities on cyberinfrastructure needs and tools.  

“In the past, each large facility has built its own cyberinfrastructure backbone,” said RENCI research scientist Anirban Mandal, who will serve as co-Principal Investigator on the pilot. “In doing so, they have acquired a significant amount of expertise, but what has been missed is an opportunity to share their experiences, solutions and innovations with other large facilities. This pilot addresses that problem by forming a strategic plan and model for an exchange platform and a knowledge base for large facility cyberinfrastructure, both for existing facilities and new ones to come.”

The $3 million award is supported by NSF’s Office of Advanced Cyberinfrastructure and will be distributed over two years. The University of Southern California will serve as lead institution with Ewa Deelman, Research Director and Research Professor at USC’s Information Sciences Institute, serving as Principal Investigator. Collaborating institutions include RENCI, the University of Utah, Indiana University and the University of Notre Dame. RENCI will receive $440,000 for its contributions to the project from 2018 to 2020.

The project aims to create a repository for lessons learned and current tools relevant to data management systems, data processing facilities, software tools and other elements of the cyberinfrastructure systems that support large science facilities. It will also provide a forum for discussion among large facility personnel and the broader academic community, as well as address training and workforce development needs to help large facility planners and operators cultivate their in-house expertise.

“RENCI’s role is focused on the cyberinfrastructure side of things,” said Mandal. “We will first gain knowledge about what infrastructures are out there in existing facilities, then look at how we can build templates for future facilities and give consultation and advice on what has, or has not, worked well in the past.”

Collaborators will focus on developing a model for the CI CoE during the first year and on implementing a CI CoE pilot in the second. They will use the National Ecological Observatory Network, which collects data for insights on changes in U.S. ecosystems, as a test case for initial information gathering before broadening the effort to encompass other large facilities in a potential future CI CoE.

“The expertise built using the CI CoE pilot will be applicable to a host of NSF projects that include distributed cyberinfrastructure,” said Mandal. “Its broader impact comes from all the scientists who depend on this cyberinfrastructure; if you make the cyberinfrastructure better for these large facilities, it will help the scientists to do their work more effectively.”


Cloudian Joins the iRODS Consortium

HyperStore Enterprise Object Storage Validated with iRODS Platform

Cloudian has joined the iRODS Consortium, the foundation that leads development and support of the integrated Rule-Oriented Data System (iRODS). Testing for HyperStore enterprise object storage with iRODS is complete, and users may now deploy the combined solution where local workflows require cost-effective, exabyte-scalable storage and ease of integration.

iRODS is free open source software for data discovery, workflow automation, secure collaboration, and data virtualization used by research and business organizations around the globe. Easily deployed in an existing infrastructure, iRODS creates a unified namespace and a metadata catalog of all the data and users within the storage environment. With the iRODS rule engine framework, users can completely automate an organization’s data management policy.
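As one illustrative sketch of such policy automation (not taken from the article), a rule in the iRODS native rule language can fire at a policy enforcement point after each data object is ingested and replicate the object to a second resource; the resource name "archiveResc" here is hypothetical:

```
# Hypothetical iRODS native rule: after each successful put,
# replicate the new data object to an archive resource.
acPostProcForPut {
    msiDataObjRepl($objPath, "destRescName=archiveResc", *status);
}
```

Dropping a rule like this into the configured rule base is what lets an organization enforce its data management policy automatically, without users taking any extra steps.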

“We’re thrilled to be working with Cloudian, a leader in enterprise object storage,” said Jason Coposky, executive director, iRODS Consortium. “Their unique perspective on customer requirements in a super-capacity storage environment will add an important dimension as we build out the next generation of policy-based data management software.”

Cloudian HyperStore solves the capacity storage challenge for global businesses in data-intensive verticals such as media, healthcare, and manufacturing. Cloudian’s global storage fabric, geo-distribution capability, and seamless cloud integration provide simple, efficient data management across the storage landscape. Unlike traditional storage solutions whose architectures were derived from stand-alone systems that operate within a single data center, Cloudian’s architecture was built on cloud technologies that were designed for distributed environments and limitless scale.

Cloudian joins Bayer, Dell EMC, Data Direct Networks, IBM, Intel, MSC, the U.S. National Institute of Environmental Health Sciences, NetApp, Quantum, RENCI, SURF, the Swedish National Infrastructure for Computing, Texas Advanced Computing Center (TACC), University College London, University of Groningen, Utrecht University, the Wellcome Sanger Institute, and Western Digital as iRODS Consortium members.

“Increasing data management efficiency is central to our customers’ objectives as they strive to manage more complex storage environments without increasing management workload,” said Sanjay Jagad, senior director of products and solutions at Cloudian. “By validating Cloudian HyperStore in the iRODS environment, we provide our users with another opportunity to deliver greater value from their storage environment at less cost.”

Texas Advanced Computing Center Joins the iRODS Consortium

The Texas Advanced Computing Center (TACC), a supercomputing center that provides scientists with some of the world’s most powerful computing resources to enable discoveries, is the latest organization to join the iRODS Consortium.

TACC, based at The University of Texas at Austin, designs, deploys, and operates a wide range of high performance computing systems used by thousands of scientists each year to study problems in biology, medicine, environmental sciences, nanomaterials, astrophysics, and much more.  

Deployed in 2009, Corral at TACC is a storage and data management resource designed and optimized to support large-scale collections and a collaborative research environment.

“TACC has run iRODS on our petabyte-scale storage platforms for many years,” said Chris Jordan, TACC’s data management and collections manager. “We provide iRODS services to support collection and sharing of research data across domains, from earth sciences to next-generation sequencing, in applications from preservation of cultural heritage to dissemination of laboratory data. We are excited to engage with the iRODS Consortium and to continue building the next generation of policy-based data management capabilities in partnership with the other member organizations.”

iRODS is free open source software for data discovery, workflow automation, secure collaboration, and data virtualization used by research and business organizations around the globe. The software is an important component of the services TACC provides through its Corral data management system and the Wrangler system, a computing environment for data-intensive applications provided though the National Science Foundation’s XSEDE initiative.

“We are excited to further our relationship with such a prominent institution in the HPC space,” said Jason Coposky, executive director of the iRODS Consortium. “Working with TACC will help us to harden our efforts toward parallel file system integration, storage tiering, and data movement over long distances on Internet2.”

TACC joins Bayer, Dell EMC, Data Direct Networks, IBM, Intel, MSC, the U.S. National Institute of Environmental Health Sciences, NetApp, Quantum, RENCI, SURF, the Swedish National Infrastructure for Computing, University College London, University of Groningen, Utrecht University, the Wellcome Sanger Institute, and Western Digital as iRODS Consortium members.

The iRODS Consortium guides development and support of iRODS, along with providing production-ready iRODS distribution and iRODS professional integration services, training, and support. The consortium is administered by founding member RENCI, a research institute for applications of cyberinfrastructure located at the University of North Carolina at Chapel Hill.

To learn more about iRODS, visit the iRODS website.
To learn more about TACC, visit the TACC website.

RENCI to Lead Two $1 Million Grants to Support Data-Intensive Scientific Research

Projects aim to improve scientific productivity and protect data from inadvertent errors

Two new $1 million awards from the National Science Foundation aim to help researchers take advantage of the latest advances in data science, networking and computation while protecting the integrity of their scientific work. The Renaissance Computing Institute (RENCI) of the University of North Carolina at Chapel Hill will serve as lead institution on both projects.

South Big Data Hub partners on development of new nationwide data storage network under NSF grant

The Open Storage Network will enable researchers to manage data more efficiently than ever before.

The South Big Data Hub is one of four regional big data hub partners awarded a $1.8 million grant from the National Science Foundation (NSF) for the initial development of a data storage network over the next two years. A collaborative team will combine their expertise, facilities, and research challenges to develop the Open Storage Network (OSN). The OSN will enable academic researchers across the nation to work with and share their data more efficiently than ever before, according to the NSF announcement.

What to expect at the 2018 iRODS User Group Meeting

Interested in iRODS? Registration for the meeting is open on the iRODS Consortium website.

DURHAM, NC – The worldwide iRODS user community will gather here June 5 – 7 for the iRODS User Group Meeting (UGM), three days of learning, sharing of use cases, and discussions of new capabilities that have been added to the integrated Rule Oriented Data System (iRODS) in the last year.


Report focuses on rethinking flood analytics

Aerial photo of flooding caused by Hurricane Katrina (2005).

Floods are the most common, most frequent and most costly type of disaster in the United States. A flood-resilient nation uses state-of-the-art analytics and data tools to help reduce or eliminate fatalities, minimize disruptions and reduce economic losses, according to a new report co-authored by the Coastal Resilience Center of Excellence (CRC).


iRODS Consortium continues to grow, signs on to OpenSFS

Quantum, NetApp join consortium that helps sustain the iRODS open source platform

CHAPEL HILL, NC – Two companies involved in data storage and cloud-based data services recently became the 17th and 18th members of the iRODS Consortium, the membership-based foundation that leads development and support of the integrated Rule-Oriented Data System (iRODS).
