Computational Matchmaking

Brian Kuhlman is always on the lookout for resources that can make it simpler and quicker to study the innumerable shapes and sequences proteins can adopt.

A professor of biochemistry and biophysics at the University of North Carolina at Chapel Hill, he heads a research group in the UNC School of Medicine that studies how proteins form. The group uses that information to create computer models that allow scientists to build new proteins with important applications in medicine, biological research and industrial processes. In particular, he is interested in manipulating signal transduction pathways, the processes by which cells convert biochemical signals into cellular-level reactions and responses.

It’s work that could lead to new treatments, and eventually cures, for serious diseases such as diabetes, Alzheimer’s, HIV/AIDS and many cancers. And it requires computing power well beyond that of a small research lab, a fact that led Kuhlman to collaborate with the Renaissance Computing Institute (RENCI).

Early in 2007, Kuhlman needed compute cycles to virtually create thousands of different protein configurations and determine their abilities to bind with one another. He looked to RENCI, which leads the engagement program of the Open Science Grid (OSG), to help match his research team with the computing power it needed. OSG is a consortium of universities, national laboratories, scientific collaborations and software developers dedicated to meeting the ever-growing computing and data management requirements of scientific researchers. Supported by the U.S. Department of Energy Office of Science and the National Science Foundation, OSG provides access to its members’ independently owned and managed resources through a common grid infrastructure that uses high-performance networks to connect computing systems scattered across the country. As leader of engagement activities for OSG, RENCI works with research teams such as the Kuhlman Lab to introduce them to the OSG and its resources and help them develop the skills needed to use the OSG national cyberinfrastructure.

Using the OSG’s distributed computing facility, Kuhlman’s team was soon running large-scale jobs with Rosetta, molecular modeling software used to study protein design, protein folding, and protein-protein interactions. Rosetta allows scientists to build thousands of high-resolution, three-dimensional models of proteins. It then samples some of the thousands of combinations of protein structures and determines how well each is able to bind with a target protein structure. This process helps the researchers determine which proteins are the best candidates for a more thorough investigation of their binding properties using real molecules in the lab.

Currently, the Kuhlman Lab focuses on designing proteins that interact with target proteins only when they are in their activated state. The researchers use these designed proteins in experiments with living cells to detect when and where the target proteins are activated in the cells. The information helps them understand normal patterns of growth and development in cells and, by extension, the misregulation of cell development associated with cancer and other diseases.

The team used more than 150,000 CPU hours on the OSG in the spring and early summer of 2007, work that would have taken years on computers in the Kuhlman Lab. Through the lab’s partnership with RENCI, the process was seamless: RENCI cyberinfrastructure experts and Kuhlman Lab scientists worked together to adapt the scientists’ application into a format that could easily take advantage of OSG resources. They then used OSG’s Resource Selection Service (ReSS) to select the necessary resources at OSG sites. For every job submitted, ReSS managed the submission, detected job failures, rerouted jobs as needed, and delivered the results of the computations back to the scientists.

“Having RENCI here at Carolina helped us access the OSG. That has been a huge time saver, but even more important, it has made it possible for us to examine questions that would otherwise be unanswerable,” said Kuhlman. “In the 21st century, these are the kinds of resources that will be essential to making groundbreaking discoveries.”

Kuhlman said his research team is now studying several protein sequences that were designed using Rosetta and OSG resources.

“We are excited because we now have experimental evidence that one of our designs binds to a protein, P21-activated kinase (PAK), that has been shown to be misregulated in cancer. Our collaborator, Klaus Hahn in the Pharmacology department at UNC, plans to use our design to visualize when PAK is activated in cells.”

The Kuhlman team continues to run jobs on the OSG, although they no longer need help from RENCI in adapting their codes and managing their submissions.

“We got them started and they are using cycles every day, but they don’t really need our help anymore,” said John McGee, who leads the OSG engagement program at RENCI. “We worked hand in hand with them to get their jobs running on the OSG and to give them the skills needed to use distributed cyberinfrastructure. Now, they are bona fide users of cyberinfrastructure with a new tool at their disposal.”

More information:
Open Science Grid: http://www.opensciencegrid.org/
Rosetta Commons: http://www.rosettacommons.org/
Kuhlman Lab: http://www.unc.edu/kuhlmanpg/