The Translator Consortium details new features, functionality, and applications of the Translator system and its underlying data model, the Biolink Model
The Biomedical Data Translator (Translator) Consortium has announced recent progress in two companion publications–“Progress toward a universal biomedical data translator” and “Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science”–in Clinical and Translational Science.
The Translator system is a knowledge graph-based platform built for combining, searching, and ‘reasoning’ over biomedical data to derive knowledge and accelerate clinical discovery. The Translator project, funded by the National Center for Advancing Translational Sciences (NCATS), addresses challenges presented by the exponential growth of diverse, siloed, and non-standardized data sets. In the Translator system, rich data from many heterogeneous sources are brought together in one place in a standardized format, allowing users to pose novel scientific inquiries and accelerating innovative translational research in ways not previously possible. The ability to search across different data types and knowledge sources is a result of the Translator Consortium’s adherence to common ontologies and standards, including the Biolink Model.
In “Progress toward a universal biomedical data translator,” the authors detail the system’s updated architecture and capabilities developed since the Consortium’s 2018 companion publications, “Toward a universal biomedical data translator” and “The Biomedical Data Translator Program: Conception, culture, and community.” The most notable update is the launch of a fully unified and harmonized system, whereas Translator previously consisted of disconnected knowledge graphs and tools. This achievement was accomplished largely through the adoption and implementation of standards and references across teams for the integration of new knowledge sources.
“What really sets Translator apart is the amount of data being integrated,” said Anne Thessen, PhD, semantic engineer at the University of Colorado Anschutz Medical Campus. “Bringing together data that already exists might seem easy, but all the work of deciphering meaning and interpreting the data on top of getting hundreds of collaborators working together is one of the most challenging projects I’ve worked on.”
Karamarie Fecho, RENCI collaborator and biomedical consultant at Copperline Professional Solutions added, “Translator is unique both sociologically and technically. Sociologically, the program has fostered an atmosphere of true collaboration, where Translator team members are eager to engage in discussion and share knowledge and resources, as well as to collectively troubleshoot Translator tools and services, regardless of who ‘owns’ them. In terms of technology, Translator is open source, and Translator team members have developed and implemented novel approaches for openly exposing patient data in a manner that is not only regulatory compliant, but completely devoid of regulatory hurdles from the end user perspective.”
In the companion publication, “Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science,” the authors describe the Biolink Model and its role in the Translator project and other initiatives. Biolink Model provides a universal, open source data model intended to standardize ontologies, naming conventions for nodes/entities in knowledge graphs, and the relationships between entities. Additionally, it maps comparable elements between ontologies, allowing disparate data sets to be compared and searched across.
“It is inspiring to see experts from a wide variety of domains communicate and collaborate on a shared model,” said Sierra Moxon, data architect and software developer at Lawrence Berkeley National Laboratory (LBNL). “Biolink Model establishes a common language to communicate with, and that’s the first step to solving hard problems together.”
“One of the main needs of Translator was a common dialect for organizing, representing, and exchanging knowledge between knowledge providers, subject matter experts, and machines,” said Deepak Unni, a former Software Developer at LBNL. “Biolink Model addresses this need by providing a harmonized data model that tackles challenges with knowledge representation and provides a foundation upon which intelligent applications can be built.”
About the NCATS Biomedical Data Translator Program
The NCATS Biomedical Data Translator Program was launched in October 2016, with funding from the National Center for Advancing Translational Sciences, a center within the National Institutes of Health. This work is supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under grant numbers: Other Transaction Awards OT2TR003434, OT2TR003436, OT2TR003428, OT2TR003448, OT2TR003427, OT2TR003430, OT2TR003433, OT2TR003450, OT2TR003437, OT2TR003443, OT2TR003441, OT2TR003449, OT2TR003445, OT2TR003422, OT2TR003435, OT3TR002026, OT3TR002020, OT3TR002025, OT3TR002019, OT3TR002027, OT2TR002517, OT2TR002514, OT2TR002515, OT2TR002584, OT2TR002520; and contract number 75N95021P00636. Additional funding was provided by the Intramural Research Program at NCATS (ZIA TR000276-05). For a complete list of Translator teams and collaborators, visit https://ncats.nih.gov/translator/projects. Any opinions expressed in this press release are those of the Translator community writ large and do not necessarily reflect the views of NCATS, individual Translator team members, or affiliated organizations and institutions.