Although we now have huge amounts of data on everything from genes to the causes of disease, these data are stored in an enormous variety of ways and in many different locations. This makes it difficult, if not impossible, to find and use the data to think about biomedical questions in a big-picture, holistic way.
The NIH’s National Center for Advancing Translational Sciences (NCATS) Biomedical Data Translator program is working to change this by funding a platform that allows scientists to easily access and interrelate data to inform new research directions. RENCI investigators are part of the leadership for three of the 15 teams that make up the Biomedical Data Translator consortium.
The Translator platform is designed to accelerate the development of new treatments and translational clinical research. For example, it could help uncover potential new therapies and drug targets, further elucidate how environmental exposures impact disease, and reveal new relationships between rare and common diseases.
“Translator offers a way of looking at a large amount of information – the equivalent to reading all the research papers ever published – and returning a reasonable amount of information,” said RENCI’s Chris Bizon, co-PI of the Translator standards and reference implementation team. “It provides a hypothesis that can be investigated and a list of information that will be helpful to this investigation.”
Moving to the next phase
Launched in 2016, the Translator consortium began with a feasibility phase during which multiple teams separately developed tools to explore what was possible and practical. Although this exploration was important, the resulting tools couldn’t communicate with each other. In the past two years, Translator has moved to an implementation phase to combine promising tools into a seamless platform.
“The implementation phase has required almost starting from scratch and working to build a single system from these independently funded projects to actually answer questions,” said Bizon. With the current version of Translator, a user can enter a question, and the system will synthesize data from a variety of sources and present it to the user in a way that gives the best supported, most interesting answers first. For example, a researcher could ask the system to find all the genes involved in a certain disease and all the chemicals that have been shown to affect expression of those genes.
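To make the example question concrete, here is a minimal sketch of that kind of two-step lookup over a toy in-memory knowledge graph. It is not Translator's actual interface or ranking method: the identifiers (`GENE:A`, `CHEM:1`, `PUB:1`), the predicate names, the function names, and the scoring rule (counting supporting sources) are all hypothetical placeholders chosen for illustration.

```python
# Toy knowledge graph: (subject, predicate, object, supporting_sources).
# Every identifier here is a made-up placeholder, not real biomedical data.
EDGES = [
    ("GENE:A", "associated_with", "DISEASE:X", ["PUB:1", "PUB:2"]),
    ("GENE:B", "associated_with", "DISEASE:X", ["PUB:3"]),
    ("GENE:C", "associated_with", "DISEASE:Y", ["PUB:4"]),
    ("CHEM:1", "affects_expression_of", "GENE:A", ["PUB:5", "PUB:6", "PUB:7"]),
    ("CHEM:2", "affects_expression_of", "GENE:A", ["PUB:8"]),
    ("CHEM:2", "affects_expression_of", "GENE:B", ["PUB:9"]),
    ("CHEM:3", "affects_expression_of", "GENE:C", ["PUB:10"]),
]


def genes_for_disease(disease):
    """Step 1: map each gene linked to the disease to its supporting sources."""
    return {
        subj: sources
        for subj, pred, obj, sources in EDGES
        if pred == "associated_with" and obj == disease
    }


def chemicals_for_disease(disease):
    """Step 2: find chemicals affecting those genes, best supported first."""
    genes = genes_for_disease(disease)
    results = []
    for chem, pred, gene, sources in EDGES:
        if pred == "affects_expression_of" and gene in genes:
            # Assumed scoring rule: total number of sources along the path.
            support = len(genes[gene]) + len(sources)
            results.append({"chemical": chem, "gene": gene, "support": support})
    # Sort by descending support; break ties by name for a stable order.
    results.sort(key=lambda r: (-r["support"], r["chemical"], r["gene"]))
    return results


if __name__ == "__main__":
    for row in chemicals_for_disease("DISEASE:X"):
        print(row)
```

In this sketch the answer with the most supporting sources is listed first, which loosely mirrors the idea of presenting the best-supported answers at the top. A real system would draw on many distributed sources and a far more nuanced notion of support.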
Getting answers
Although the Translator concept seems straightforward, creating the system and making it fast and easy to use has been a massive technical challenge. One important aspect was making sure that data conforms to standards so that the information can be accessed by the system through common interfaces. Bizon co-leads the team focused on standards together with researchers from Lawrence Berkeley National Laboratory and the University of Colorado.
There are also teams working to find existing knowledge and make it accessible to the rest of the consortium using common interfaces. For example, RENCI’s Stanley Ahalt and Ashok Krishnamurthy lead the team that is identifying knowledge on environmental exposures that can be accessed by the platform.
Other teams are focused on developing the tools that figure out what data is needed in response to a specific question and then put it together in a way that is useful to the user. RENCI’s Alexander Tropsha leads the ranking agent team, which is working to fine-tune how the information provided in response to a query is ordered.
“A lot of the work has been about defining standards, so that the components that each of the 15 teams are building can talk to each other,” said Bizon. “There’s also been a fair amount of effort this year into adding evidence and provenance to the system so that it’s clear to the user where certain information came from and how well supported it is.”
Evidence and provenance allow users to trace information back to its original source, such as a publication, so that they can check that the information was interpreted correctly. In the coming months, the Translator consortium will continue to add and refine the platform’s features, develop a user interface, and add new types of data.
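The idea of tracing an answer back to its original source can be sketched as following a chain of "derived from" links. The example below is only an illustration under assumed names: the `Record` type, the `derived_from` field, and the placeholder identifiers are hypothetical and do not reflect how Translator actually stores provenance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    record_id: str
    description: str
    # Identifier of the upstream record; None marks an original source.
    derived_from: Optional[str] = None


# Placeholder records: an answer, the knowledge source it came from,
# and the publication behind that source. None of these are real.
RECORDS = {
    "ANSWER:1": Record("ANSWER:1", "CHEM:1 affects expression of GENE:A", "KSOURCE:1"),
    "KSOURCE:1": Record("KSOURCE:1", "entry in a hypothetical curated database", "PUB:5"),
    "PUB:5": Record("PUB:5", "placeholder publication"),
}


def trace_to_origin(record_id, records=RECORDS):
    """Follow derived_from links until reaching a record with no upstream."""
    chain = []
    seen = set()
    current = record_id
    while current is not None:
        if current in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"provenance cycle at {current}")
        seen.add(current)
        record = records[current]
        chain.append(record.record_id)
        current = record.derived_from
    return chain


if __name__ == "__main__":
    print(" <- ".join(trace_to_origin("ANSWER:1")))
```

The last identifier in the returned chain is the original source, which is the record a user would open to check that the information was interpreted correctly.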
Read more about Translator:
Use cases show Translator’s potential to expedite clinical research