New concept poised to accelerate drug discovery through data mining

RENCI scientists together with collaborators from UNC and other institutions have developed and defined a concept called Clinical Outcome Pathways (COPs) that could help scientists harness the vast amounts of clinical and biomedical data available today to accelerate drug discovery and drug repurposing.

“Improving drug discovery requires understanding all the biological processes involved in how drugs work,” said the paper’s first author Daniel Korn from the UNC-Chapel Hill Department of Computer Science. “COPs help broaden the concept of a drug’s mechanism of action so that knowledge graph mining can be used to discover the complete chain of events that enables a specific therapeutic effect for a drug.”

Knowledge graphs express data as a collection of nodes—such as drugs and diseases—with edges that represent the relationships—such as drug A treats disease B—between the nodes. By bringing together heterogeneous information into a single system, knowledge graphs can reveal relationships between previously unconnected information that wouldn’t be obvious otherwise.
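
The node-and-edge idea can be shown with a minimal sketch. The node names and relationship labels below are illustrative placeholders, not data from ROBOKOP or any real knowledge graph, and breadth-first search is just one simple way to find a connecting chain.

```python
from collections import deque

# Toy knowledge graph: each edge is a (subject, predicate, object) triple.
# Every name here is a made-up placeholder, not real biomedical data.
EDGES = [
    ("drug_A", "targets", "protein_P"),
    ("protein_P", "participates_in", "pathway_W"),
    ("pathway_W", "contributes_to", "disease_B"),
    ("drug_C", "targets", "protein_Q"),
]


def build_adjacency(edges):
    """Index edges by subject so neighbors can be looked up quickly."""
    adjacency = {}
    for subject, predicate, obj in edges:
        adjacency.setdefault(subject, []).append((predicate, obj))
    return adjacency


def find_path(edges, start, goal):
    """Breadth-first search: return the shortest chain of triples from
    start to goal, or None if the two nodes are not connected."""
    adjacency = build_adjacency(edges)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None


path_a = find_path(EDGES, "drug_A", "disease_B")  # three-step chain
path_c = find_path(EDGES, "drug_C", "disease_B")  # no connection in this toy graph
```

Here `drug_A` and `disease_B` are never linked directly, yet the search recovers the chain through `protein_P` and `pathway_W`, which is the kind of non-obvious relationship the paragraph above describes.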

“The real power of the COPs concept is that once we understand all the biological pathways connecting drugs and diseases, that information can be used to develop new therapeutic agents—or repurpose existing ones—that modulate the same biological pathway,” explained the paper’s senior author Alexander Tropsha from the UNC Eshelman School of Pharmacy.

As described in a Drug Discovery Today paper, the researchers define COPs as a chain of key events—molecular initiating event, intermediate event(s), and the clinical outcome—that are responsible for the therapeutic actions of a drug. Each element of the chain corresponds to a term defined in commonly used biomedical ontologies, which allows computational methods to be used to elucidate COPs and provides a way for them to be cataloged for future use.
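
One way to picture a COP as a cataloged record is the sketch below. The class names, field names, and the `EX:` identifiers are assumptions made for illustration; they are not taken from the paper or from any real ontology.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class KeyEvent:
    label: str
    ontology_id: str  # hypothetical placeholder, not a real ontology term


@dataclass
class ClinicalOutcomePathway:
    drug: str
    molecular_initiating_event: KeyEvent
    intermediate_events: List[KeyEvent]
    clinical_outcome: KeyEvent

    def chain(self) -> List[KeyEvent]:
        """Key events in order: initiating event, intermediates, outcome."""
        return [
            self.molecular_initiating_event,
            *self.intermediate_events,
            self.clinical_outcome,
        ]

    def describe(self) -> str:
        return " -> ".join([self.drug] + [event.label for event in self.chain()])


def catalog_by_outcome(cops: List[ClinicalOutcomePathway]) -> Dict[str, List[ClinicalOutcomePathway]]:
    """Group pathways by the ontology ID of their clinical outcome."""
    catalog: Dict[str, List[ClinicalOutcomePathway]] = {}
    for cop in cops:
        catalog.setdefault(cop.clinical_outcome.ontology_id, []).append(cop)
    return catalog


# Made-up example pathway; none of these events describe a real drug.
example_cop = ClinicalOutcomePathway(
    drug="drug_A",
    molecular_initiating_event=KeyEvent("inhibits protein_P", "EX:0001"),
    intermediate_events=[KeyEvent("reduces signal_S", "EX:0002")],
    clinical_outcome=KeyEvent("symptom relief", "EX:0003"),
)
example_catalog = catalog_by_outcome([example_cop])
```

Because every event carries an ontology identifier, two pathways that end in the same outcome term land in the same catalog entry, which is what makes the chains searchable later.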

Better drug discovery

Many of today’s new drugs are designed to act on the same point in a biological pathway as existing drugs. “This creates a bunch of ‘me-too’ drugs that don’t actually increase our overall ability to cure disease,” said RENCI’s Chris Bizon, a co-author of the paper. “COPs and knowledge graphs could allow scientists to understand the full set of events involved in a drug’s action. Then they can look further upstream in the pathway to find druggable targets that produce the same therapeutic effect.”

Elucidation of COPs is one of the most pragmatic applications of the biomedical question-answering system ROBOKOP (Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways), which uses a knowledge graph structure to explore links between various biomedical data types. ROBOKOP was developed by Bizon and colleagues as part of the NIH NCATS Data Translator project.

“ROBOKOP is designed to find biological pathways for a particular drug and disease or to start with the disease and find a pathway that ends at a new drug,” said Bizon. “There are growing examples of tools based on mining of knowledge graphs in the biomedical space, many in the private sector, but ROBOKOP is one of a few fully transparent and publicly available tools that enables biomedical knowledge mining for uncovering important pathways such as those encoded by COPs.”

Integrating clinical information

Clinical observations are an important source of the data needed to elucidate COPs. ROBOKOP can be used with the Integrated Clinical and Environmental Exposures Service (ICEES), which provides open, regulatory-compliant access to clinical data, including electronic health record data, that is integrated with environmental exposures data.

“Because a lot of medical treatments are found by serendipity or through trial and error, their mechanism of action may not be known,” said RENCI collaborator Kara Fecho, who led a team that developed tools that make it possible for ROBOKOP to access this clinical data. “ICEES provides a source of clinical observations that capture when a certain drug improved a given symptom or disease, for example. ROBOKOP can then be used to fill in the missing pieces.”

The paper describes case studies in which researchers used ROBOKOP to figure out specific COPs. In one case, researchers investigated the biological mechanisms that might explain why doctors have observed that patients taking the heartburn medicine Pepcid seemed to have much milder cases of COVID-19 compared to patients not taking the medication. In another example, researchers used ROBOKOP to find COPs that explain clinical observations suggesting that the diabetes drug metformin might be able to treat certain cancers.

Once researchers make connections like these, they can design experiments to find out whether certain medications might be useful for other indications. In addition, with access to enough clinical and genetic data, it might one day be possible to use this approach to select the best therapy for an individual patient based on that patient’s own clinical and genetic profile. The researchers are also looking at how concepts similar to COPs might be employed in areas beyond drug discovery, such as identifying the causes of rare diseases or explaining adverse drug outcomes.

RENCI’s Advanced Cyberinfrastructure Support Team introduces updated research resources

The Advanced Cyberinfrastructure Support (ACIS) team at RENCI works to provide efficient, available resources for our researchers. Over the last several months, the team has introduced several new capabilities and tools that support researchers in successfully producing results from their computing research.

New hardware, new features

RENCI has several hardware clusters in use to advance computing development, including the newest addition — the Sterling Kubernetes cluster. The Kubernetes system allows researchers access to multiple containerized applications across several computer servers at the same time.

“Kubernetes provides a great set-up for a quick turn-around in case of emergencies within a cluster,” said Paul Linebaugh, DevOps Engineer at RENCI. “If an application is running a container in a server that goes down, Kubernetes will automatically start up the application in a new container on another server within the cluster.”

Throughout this past year, the ACIS team has developed the Sterling Kubernetes cluster’s features. The list of updates includes automatic DNS and TLS certificates, enhanced security, GPU support, NVMe drives, and more. NVMe drives are the future of fast storage and deliver significantly faster write speeds than the cluster’s original storage configuration. Updates will continue over the lifecycle of the cluster.

“Modern software can be complex, but a platform like Sterling can save a significant amount of time for our researchers by automating common tasks,” said Mac Chaffee, DevOps Engineer at RENCI. “Sterling can subdivide its four enormous GPUs that are larger than any one user needs into smaller sections that can be assigned to individuals. We can meet researchers’ needs at a higher scale now, since more staff at RENCI can benefit from the same set of hardware than ever before.”

The updates to Sterling have allowed the ACIS team to sunset RENCI’s legacy Kubernetes clusters — Blackbalsam and Mitchell. Since 2020, these clusters dutifully served as the workhorses of containerized operations for RENCI, providing resources to projects like HeLx and Translator.

In addition to the new Kubernetes development, the team has completed the first year of a two-year research effort to replace older components of the Hatteras High Performance Compute cluster. It now has newer and more powerful hardware, including eight GPUs and four large memory nodes.

Researchers have already benefited from the improved efficiency and user experience that the Hatteras refresh provides. The changes allowed the ACIS team to reconfigure the cluster’s job scheduler to provide better flexibility on job placement, while reducing the overall processing time. 

Tech support for a virtual world

The onset of the COVID-19 pandemic and the shift to remote work brought about many changes to RENCI’s systems, including the introduction of a new support tool, Bomgar.

“When RENCI staff members went remote at the beginning of the COVID-19 pandemic, there was a need for a resource that would allow users to request technical assistance as quickly as in person,” said Lance Leathers, Systems Specialist at RENCI. “Bomgar allows us to fully support RENCI staff working from home, regardless of whether their device runs Apple or Windows.” 

Bomgar has significantly increased efficiency for researchers requesting technical support at RENCI. During a support session, staff members can easily use Bomgar to share their screen, demonstrate the problem at hand, and grant temporary control to ACIS support members for corrective action.

Moving forward

Over the next few years, the ACIS team will continue to seek new ways to allow RENCI’s innovative solutions to flourish. These continuous updates will be for the advancement of all RENCI researchers.

“It is important for us to give everyone who uses our clusters and applications a consistent experience,” said Nick Harrison, ACIS Manager at RENCI. “ACIS is working to take a lot of the services we’ve provided as stand-alone resources and consolidate them into centrally managed solutions. These advances will allow us to focus our finite resources and make services the best they can be for RENCI researchers.”

Use cases show Translator’s potential to expedite clinical research

RENCI investigators are contributing to the development of a platform called Biomedical Data Translator that will allow researchers to easily access and interrelate large amounts of data relevant to advancing biomedical research. Funded by the NIH’s National Center for Advancing Translational Sciences (NCATS), the new system is poised to accelerate translational clinical research by allowing users to approach biomedical questions from a holistic perspective to inspire important new research directions.

The platform is being developed by a 15-team multi-institutional Biomedical Data Translator consortium. Three of these teams include leadership from RENCI investigators. Although still a work in progress, Translator is being designed as an easy-to-use tool that can quickly respond to queries by identifying and synthesizing relevant data from a wide variety of sources.

Finding potential therapies for drug-induced liver injury

In December 2021, consortium members presented use cases to NCATS to demonstrate the platform’s progress and potential. In one, Paul Watkins, MD, from the UNC School of Medicine worked with RENCI collaborator Karamarie Fecho to use Translator to identify drugs that might be repurposed for treating drug-induced liver injury (DILI). There is a critical need for new therapies to heal liver damage caused by medicines. Although the injury sometimes heals when a patient stops taking the medication, it can take months or years to resolve and can leave patients unable to take medicines they need to treat medical conditions.

“There are lab-based ways to identify drugs for repurposing, or a researcher can spend years going through the literature and attempt to synthesize it,” explained Fecho. “Translator offers an alternative method that’s fast and doesn’t require the user to be an expert.” 

Using gene information to identify drug candidates that might hold promise for treating drug-induced liver injury, Translator quickly identified two antioxidant drugs for consideration. This query relied on clinical data that is part of UNC Health’s Integrated Clinical and Environmental Exposures Service (ICEES), which provides open, regulatory-compliant access to clinical data that is integrated with environmental exposures data. Fecho and colleagues from RENCI and the North Carolina Translational and Clinical Sciences Institute previously developed tools that allow Translator to access this important source of clinical data.

In addition to identifying potential drug candidates, Translator also provided experimental evidence that these drugs had been studied for preventing drug-induced liver injury in rat models and were used in clinical trials to treat other diseases. “Having this information showed that the candidate drugs were safe and effective enough to be used in a clinical trial,” said Fecho. “This can help reduce the risk involved in moving forward with clinical trials, which are time-consuming and expensive.”

The Translator findings are now being compiled into a formal report to present to the NIH-funded U.S. DILI Network leadership to inform planning for future clinical trials.

Revealing new directions for rare diseases

In another use case, researchers from the Hugh Kaul Precision Medicine Institute at the University of Alabama at Birmingham are using Translator to find potential new treatments for rare diseases. Rare diseases are usually caused by gene mutations that aren’t passed on.

“For applications involving rare diseases, a new drug development candidate is not that helpful because it would require too much investment to develop and test a new drug for just a few people,” said RENCI’s Chris Bizon, co-PI of the Translator standards and reference implementation team. “Translator can help by looking for drugs that are already approved for some other purpose and have the potential to be repurposed for off-label use or tested in a clinical trial.”

The researchers were interested in a gene known as RHOBTB2. Children born with overactive variants of this gene sometimes never learn to walk and have severe intellectual disabilities. Researchers used Translator to ask for a list of all the chemicals that down-regulate RHOBTB2. When this didn’t return many leads, they performed another query to look for chemicals that up-regulate a gene that down-regulates RHOBTB2. This process helped reveal intermediate genes that could be targeted to down-regulate RHOBTB2.
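
The two-step search described above can be sketched as a direct query followed by a two-hop query over a small set of triples. Only RHOBTB2 comes from the article; every chemical, every intermediate gene, the edge list, and the function names are hypothetical placeholders, and this is not Translator's actual query interface.

```python
# (subject, predicate, object) triples. Apart from RHOBTB2, all names are
# invented for illustration and do not describe real biology.
EDGES = [
    ("chem_1", "down_regulates", "RHOBTB2"),
    ("gene_X", "down_regulates", "RHOBTB2"),
    ("gene_Y", "down_regulates", "RHOBTB2"),
    ("chem_2", "up_regulates", "gene_X"),
    ("chem_3", "up_regulates", "gene_X"),
    ("chem_4", "up_regulates", "gene_Z"),
]

NODE_TYPE = {
    "chem_1": "chemical", "chem_2": "chemical",
    "chem_3": "chemical", "chem_4": "chemical",
    "gene_X": "gene", "gene_Y": "gene", "gene_Z": "gene",
    "RHOBTB2": "gene",
}


def subjects(edges, predicate, obj, node_type):
    """Sorted subjects of the given type that have `predicate` to `obj`."""
    return sorted(
        s for s, p, o in edges
        if p == predicate and o == obj and NODE_TYPE[s] == node_type
    )


def direct_down_regulators(edges, target):
    """First query: chemicals that down-regulate the target directly."""
    return subjects(edges, "down_regulates", target, "chemical")


def indirect_down_regulators(edges, target):
    """Second query: chemicals that up-regulate a gene which in turn
    down-regulates the target, grouped by the intermediate gene."""
    results = {}
    for gene in subjects(edges, "down_regulates", target, "gene"):
        chemicals = subjects(edges, "up_regulates", gene, "chemical")
        if chemicals:
            results[gene] = chemicals
    return results


direct_hits = direct_down_regulators(EDGES, "RHOBTB2")
indirect_hits = indirect_down_regulators(EDGES, "RHOBTB2")
```

In this toy data the direct query finds a single chemical, while the two-hop query surfaces an intermediate gene with two more candidates, mirroring how the second query widened the list of leads.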

“As a clinician, I don’t even know about all the databases that hold critical pieces of the puzzle I’m trying to put together,” said Anne Thessen, a visiting associate professor at the University of Colorado School of Medicine. “With Translator I can prepare a query, run the query, and have results to review in an hour.”

Read more about Translator:
Biomedical Data Translator Platform moves to the next phase

New streamlined statistical method provides improved pattern detection and risk prediction for disease

The novel regression algorithm, CALF, outperforms the current gold standard, LASSO, in statistical tests

Researchers from the Renaissance Computing Institute (RENCI) at UNC-Chapel Hill, Perspectrix, the UNC School of Medicine, and the WVU Rockefeller Neuroscience Institute have collaborated to develop a new method for finding patterns in data which verifiably surpasses the performance of a generally accepted “gold standard.” 

Attempting to find patterns in data is central to all research, and it is particularly important in medical use of biological samples to predict a patient’s risk for disease formation and progression. Today, researchers can utilize advanced technology to produce an ocean of data about one person from various biological samples such as blood, DNA, and saliva, with the goal of identifying particular markers that can be informative about a person’s current health and future outlook. However, this advanced data collection and processing has outpaced current statistical methods for identifying simple but robust patterns and relationships, and this is particularly true for the field of psychiatry. For instance, researchers have yet to fully understand and predict the progression of schizophrenia. 

This new method, CALF, which stands for “coarse approximation linear function,” is described in the Scientific Reports paper, “A greedy regression algorithm with coarse weights offers novel advantages,” published on March 31, 2022. Application of CALF to five quite different examples from psychiatric and neurological studies consistently outperformed the gold standard, LASSO, or “least absolute shrinkage and selection operator” regression, and other methods. 
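
To give a feel for what "greedy" and "coarse weights" can mean together, here is a minimal sketch. It is not the published CALF implementation: it assumes that each weight is restricted to -1, 0, or +1, that one marker is added per step, and that candidates are scored by AUC, all of which are assumptions made for illustration. The function names and the toy data are invented.

```python
def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def greedy_coarse_fit(rows, labels, max_terms=3):
    """Greedy forward selection with weights limited to -1, 0, or +1.

    Each step tries giving every unused marker a weight of -1 or +1 and
    keeps the single change that most improves AUC; it stops when no
    change improves the score or max_terms markers have been chosen.
    """
    n_features = len(rows[0])
    weights = [0] * n_features
    best = 0.5  # AUC of a constant score, i.e. no markers selected
    for _ in range(max_terms):
        candidate = None
        for j in range(n_features):
            if weights[j] != 0:
                continue
            for w in (-1, 1):
                trial = weights.copy()
                trial[j] = w
                scores = [sum(wi * xi for wi, xi in zip(trial, row)) for row in rows]
                score = auc(scores, labels)
                if score > best:
                    best, candidate = score, (j, w)
        if candidate is None:
            break
        weights[candidate[0]] = candidate[1]
    return weights, best


# Invented toy data: three markers measured on three cases and three controls.
ROWS = [
    [3, 1, 5], [4, 2, 1], [2, 0, 4],  # cases
    [1, 3, 2], [2, 4, 5], [0, 2, 3],  # controls
]
LABELS = [1, 1, 1, 0, 0, 0]

fitted_weights, fitted_auc = greedy_coarse_fit(ROWS, LABELS)
```

On this toy data the procedure keeps the first marker with weight +1, the second with weight -1, and leaves the third at 0, so the resulting rule is simply "marker one minus marker two".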

New data format aids large-scale evolutionary biology research

In addition to revealing the hidden histories of life on Earth, studying the evolutionary relationships between organisms can help scientists track emerging diseases, inform methods to control invasive species, and understand how to best protect at-risk ecosystems.  

DNA sequencing and other genetic analysis approaches are providing vast new data streams to enable this research at unprecedented scales. For example, the Open Tree of Life Project is attempting to create a synthesized view of the evolutionary relationships among every known organism – more than 1.7 million species.

To aid in these endeavors, Gaurav Vaidya, PhD, from RENCI collaborated with a multi-institutional team of researchers to create a new data format that makes the clade definitions used by evolutionary biologists readable and interpretable by computers. Clades, which capture an organism’s ancestor and all its descendants, make up a portion of a phylogeny, a set of evolutionary relationships between different organisms.

Biomedical Data Translator Platform moves to the next phase

Although we now have huge amounts of data on everything from genes to the causes of disease, it is stored in an enormous variety of ways and in many different locations. This makes it difficult, if not impossible, to find and use this data to think about biomedical questions in a big picture, holistic way.

The NIH’s National Center for Advancing Translational Sciences (NCATS) Biomedical Data Translator program is working to change this by funding a platform that allows scientists to easily access and interrelate data to inform new research directions. RENCI investigators are part of the leadership for three of the 15 teams that make up the Biomedical Data Translator consortium.

The Translator platform is designed to accelerate the development of new treatments and translational clinical research. For example, it could help uncover potential new therapies and drug targets, further elucidate how environmental exposures impact disease, and reveal new relationships between rare and common diseases.

“Translator offers a way of looking at a large amount of information – the equivalent to reading all the research papers ever published – and returning a reasonable amount of information,” said RENCI’s Chris Bizon, co-PI of the Translator standards and reference implementation team. “It provides a hypothesis that can be investigated and a list of information that will be helpful to this investigation.”

Drone projects take data processing and communication to new heights

Communicating after a natural disaster is often critical but can be challenging if telecommunications lines are damaged or wireless networks become overwhelmed. Drones, however, can be used to quickly create an on-demand communication infrastructure that is not only useful for emergency situations but can also be used for transportation, surveillance and crop monitoring. 

RENCI researchers are contributing to cutting-edge research projects that aim to make drones even more useful by improving how their data is handled and by providing a testbed that helps researchers optimize drone-based communication. 

RENCI researchers awarded 2021 Best Paper from the Elsevier FGCS Journal

RENCI researchers recently received the 2021 Best Paper Award from the Elsevier Future Generation Computer Systems (FGCS) Journal. The paper, titled “End-to-end online performance data capture and analysis for scientific workflows,” was co-authored by Cong Wang, Anirban Mandal, and collaborators from the DOE Panorama and RAMSES projects.

The FGCS Journal aims to lead the way in advances in distributed systems, collaborative environments, high performance computing (HPC), and big data on such infrastructures as grids, clouds, and the Internet of Things. Each year, the editorial board awards “Best Paper” to one submission featured in the journal.

RENCI Internship Program: Investing in the Next Generation of Leaders

As part of RENCI’s mission to be a leader in data science, our team is dedicated to helping the next generation of thinkers bring their ideas to the table, build valuable skill sets, and pursue professional growth. While we’ve hosted interns in several areas of our work in the past, we have recently launched an Internship Program to provide organization-wide support and resources. We are excited to expand our reach and engage with curious and hard-working young professionals across RENCI’s research groups, collaborations, and operations teams. 

“Working as an intern at RENCI has been a meaningful experience to me,” said Yifei Wang, Atlantic Wave-SDX research assistant and intern. “Colleagues and supervisors were super patient and helpful while helping me to grow from a student to a professional. RENCI is the perfect place if you want to pursue your academic and career goals.” 

NRIG Director Ilya Baldin inducted into the NC State Computer Science Alumni Hall of Fame

On October 10, 2021, Ilya Baldin was inducted into the North Carolina State University Computer Science Alumni Hall of Fame. This honor is granted to those alumni who have exhibited noteworthy contributions to their profession and the communities they serve. 

Throughout the course of his career, Baldin has led many projects in the computer science realm and has addressed major problems in data software development. From developing prototypes to creating technologies for testbeds, he has invested much of his time to make contributions to this field.
