RENCI strengthens storm surge response capabilities

APSViz provides critical, high-resolution coastal hazards information to expedite decision-making and productivity

On September 28, 2022, Hurricane Ian made landfall along the west coast of Florida as a Category 4 hurricane, the strongest to hit the region since Hurricane Charley in 2004, causing substantial damage from strong winds and the resulting storm surge and wind waves. Hurricane Ian then crossed the Florida landmass, emerged into the Atlantic Ocean, strengthened back into a weak hurricane, and made a second landfall on the South Carolina coast. According to the National Oceanic and Atmospheric Administration (NOAA), the damage caused by Hurricane Ian in its two landfalls ranks it as the third-costliest weather disaster in U.S. history. This major event required multiple state and local agencies to prepare for significant storm impacts, assess potential damages, and plan for post-storm recovery activities.

Over the past three years, the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill, has been developing a state-of-the-science, cloud-ready data engine, visualization, and information delivery system called APSViz. As a core project within the Department of Homeland Security’s Coastal Resilience Center at UNC-Chapel Hill, APSViz disseminates real-time coastal hazards information and enhances research productivity by making it much easier to understand computer simulations and predictions of coastal hazards. 

During significant coastal events like Hurricane Ian, APSViz has the potential to serve as a key resource for state and federal stakeholders and response teams. By providing critical, real-time, and high-resolution information to assist officials in making decisions for timely evacuations and other preventative measures, APSViz could add substantive value to response strategies and reduce negative impacts from tropical cyclones.   

Figure 1.  The new APSViz interface displays real-time results from the ADCIRC Prediction System (APS).  This example is for Hurricane Ian after it transited off the eastern Florida coast.

The primary computer model for coastal storm surge and wind wave simulation is ADCIRC, co-developed by researchers at UNC-Chapel Hill and the University of Notre Dame, along with other academic, federal, and industry collaborators. Originally developed for retrospective simulation of past meteorological events, ADCIRC has been increasingly used for forecasting and predicting weather-driven impacts on the coastal environment. ADCIRC is the core computational model in the ADCIRC Prediction System (APS). APSViz is a “window” into these real-time forecasting activities and will provide critical information to federal and local decision-makers to help mitigate damages, injuries, and fatalities caused by major coastal events.

Given the pressing need for proactive hurricane preparedness, APSViz has been developed by RENCI’s Earth Data Science group, leveraging RENCI’s expertise and resources in data management, database and geospatial technologies, and, in particular, the cloud technology, Kubernetes. In the APSViz context, Kubernetes manages containerized processes in a computing resource pool, schedules processes in the appropriate order, identifies needed resources, and disseminates results to both web-mapping frameworks and storage. 
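The scheduling role described above, running containerized steps in the appropriate order, can be illustrated with a minimal sketch. The step names are hypothetical and are not taken from APSViz, and the sketch uses Python's standard graphlib module to show dependency ordering rather than Kubernetes itself.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow steps (not APSViz's actual ones). Each key maps to
# the set of steps that must finish before that step can start.
DEPENDENCIES = {
    "fetch_model_output": set(),
    "convert_to_geotiff": {"fetch_model_output"},
    "publish_map_layer": {"convert_to_geotiff"},
    "archive_to_storage": {"convert_to_geotiff"},
}


def run_order(dependencies):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(DEPENDENCIES)
print(order)
```

In this toy workflow, the two final steps both depend only on the conversion step, which mirrors the idea of sending results to both web-mapping frameworks and storage.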

The entire APSViz infrastructure is deployable to Amazon Web Services’ (AWS) Elastic Kubernetes Service, which will be used during active hurricane events that pose a substantial threat to the U.S. coast. Containers for new products are easily deployed to the Kubernetes workflow, enabling rapid extension and customization of APSViz. Additionally, the APSViz user interface is highly flexible, allowing for user-driven control of the mapping view (e.g., color maps and data ranges). Moreover, it is relatively straightforward to incorporate other computer models into APSViz, allowing for further customization based on user needs.

Figure 2. The layers of the ADCIRC Prediction System are shown on the left. The cylinder represents the APS Data Engine, from which the users on the far right extract data for their own predictions.

APSViz is under continuous development to further increase its utility for storm surge and wind wave forecasting, streamlining the delivery of critical information to the people who need it most. A key component of the infrastructure is the availability of critical data layers (such as coastal water inundation onto land) that external end users can access for input into their own decision support systems. In other words, APSViz is not just about visualization on the web, but also about making a variety of products more generally available through well-known access methods, including AWS S3 buckets, the widely used GeoServer, and THREDDS data servers (Figure 2). Currently, DHS/FEMA, the North Carolina Department of Transportation, and The Water Institute of the Gulf extract APSViz layers from the data engine for use in their own analysis systems.
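As a rough illustration of what access through a map server can look like, the sketch below builds a request URL following the OGC Web Map Service (WMS) 1.1.1 GetMap convention, which GeoServer supports. The host name and layer name are placeholders invented for this example; they are not actual APSViz endpoints or layer identifiers.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=800, height=600):
    """Build a WMS 1.1.1 GetMap URL for a single layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty value requests the layer's default style
        "srs": "EPSG:4326",
        "bbox": ",".join(str(value) for value in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return base_url + "?" + urlencode(params)


# Placeholder server and layer name, for illustration only.
url = wms_getmap_url(
    "https://geoserver.example.org/geoserver/wms",
    "example:max_inundation",
    (-84.0, 24.0, -79.0, 31.0),
)
print(url)
```

An external decision support system could request an image of a layer for its own area of interest this way, without going through the web interface.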

“The APSViz infrastructure is enabling us to develop our own forecasting visualization applications much more rapidly. It is saving a substantial amount of application development time which we can invest in improving other parts of our systems,” said Zach Cobell, senior computational scientist at The Water Institute.

According to NOAA, hurricane activity predictions for the 2023 season indicate a relatively normal number of named storms for the North Atlantic Ocean, which typically sees about 14 named tropical events each year. The 2023 North Atlantic season will be a critical proving ground for the APSViz infrastructure as it becomes more generally available at apsviz.adcircprediction.org.

EduHeLx: A Cloud-based Programming Platform for Data Science Education

The EduHeLx pilot experiment informed future thinking about incorporating cloud-based technologies in UNC-CH courses, including courses in the new UNC-CH School of Data Science & Society (SDSS)

EduHeLx is an education-focused instance of HeLx, a scalable cloud-based computing platform developed by researchers at the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill. HeLx offers a suite of tools, capabilities, and workspaces enabling research communities to deploy custom data science workspaces securely in the cloud. 

EduHeLx was developed to address the needs of courses with programming components and currently supports programming using Python and R. Previously, students were required to download a course’s programming software onto their own computers, and instructors had to work one-on-one with students to troubleshoot issues throughout the semester; this was so time-consuming that it took away from teaching time and derailed course schedules, especially in computer science courses with 250+ students. With EduHeLx, neither instructors nor students need to set up any infrastructure: students can access a course’s programming software in the cloud without downloading it, saving a significant amount of class time.

Emphasizing EduHeLx’s benefits, Ashok Krishnamurthy, Interim Director at RENCI and professor of Computer Science at UNC-Chapel Hill, stated, “We could concentrate on the instructional material for the course rather than spending time debugging installations on students’ laptops or other technology problems that unexpectedly crop up during the semester.” Additionally, EduHeLx allows instructors to distribute all course material through the platform and to enable auto-grading for exams and assignments, another time-saving capability that was not previously possible.

As a pilot experiment, UNC Information & Technology Services (ITS) assisted RENCI in applying EduHeLx as the educational platform in the UNC-Chapel Hill Computer Science course, COMP 116: Introduction to Scientific Programming, in Fall 2021 (Stan Ahalt/Ashok Krishnamurthy) and Spring 2022 (John Majikes). ITS provided technical support to deploy EduHeLx on UNC’s Google Cloud and assisted with adding 250+ student accounts; further, ITS provided financial support for the cloud costs to deploy EduHeLx and helped ensure security of the platform. RENCI and ITS both learned a great deal from this experiment, and it aided in informing ITS’ future engagement with cloud-based learning solutions. 

ITS, which manages the University’s Google Cloud Platform (GCP) environment, set up monitoring and essential guardrails to protect University data and advised RENCI on best practices for efficiently managing the resources, said Chuck Crews, Manager of ITS Cloud Operations Group, and John Godehn, ITS Systems Programmer/Specialist.  

“One of the compelling reasons to deploy in the cloud is that you only pay for what you’re using,” rather than paying for resources that sit idle, Crews said. Working in the cloud allows resources to be deployed and removed as needed.

Given the innovative capabilities EduHeLx enables for data science education, the newly launched UNC-Chapel Hill School of Data Science & Society (SDSS) is considering making extensive use of EduHeLx for a range of courses. Dr. Stan Ahalt, Inaugural Dean of the SDSS, reported that the School hopes to use the platform as a mechanism to provide data and computation to students very early in the program, both in existing courses cross-listed with other departments and in new courses developed by the SDSS. Further elaborating on the novel utility of EduHeLx, Ahalt stated, “The ability to stand up an educational platform and reliably provision the data and computation through a relatively simple process will enable us to engage new students seamlessly, as well as provide a tool that will grow with them as they progress in their coursework and research.” 

One of the main focuses of the SDSS is preparing students for an evolving workforce that increasingly demands data science literacy, which necessitates an interdisciplinary approach to integrate data science programming into a wide range of courses, including courses in the humanities and social sciences. Additionally, the SDSS places significant emphasis on its last “S,” society; by introducing data science and its applications to students with diverse disciplinary interests, the SDSS can better prepare them to effectively apply data science in their career of choice and maximize their impact on society. With its unique, accessible, and adaptable capabilities, EduHeLx has the potential to serve as a key resource to transform the SDSS’ vision into reality.

New concept poised to accelerate drug discovery through data mining

RENCI scientists together with collaborators from UNC and other institutions have developed and defined a concept called Clinical Outcome Pathways (COPs) that could help scientists harness the vast amounts of clinical and biomedical data available today to accelerate drug discovery and drug repurposing.

“Improving drug discovery requires understanding all the biological processes involved in how drugs work,” said the paper’s first author Daniel Korn from the UNC-Chapel Hill Department of Computer Science. “COPs help broaden the concept of a drug’s mechanism of action so that knowledge graph mining can be used to discover the complete chain of events that enables a specific therapeutic effect for a drug.”

Knowledge graphs express data as a collection of nodes—such as drugs and diseases—with edges that represent the relationships—such as drug A treats disease B—between the nodes. By bringing together heterogeneous information into a single system, knowledge graphs can reveal relationships between previously unconnected information that wouldn’t be obvious otherwise.
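The node-and-edge structure described above can be sketched in a few lines. The triples below are invented placeholders, not real biomedical data, and the breadth-first search is only one simple way to surface a chain of relationships between two nodes.

```python
from collections import deque

# Illustrative (subject, relationship, object) triples; not real data.
TRIPLES = [
    ("drug_A", "inhibits", "protein_P"),
    ("protein_P", "regulates", "pathway_W"),
    ("pathway_W", "contributes_to", "disease_B"),
    ("drug_C", "treats", "disease_D"),
]


def build_graph(triples):
    """Index edges by their source node."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the hops from start to goal, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


graph = build_graph(TRIPLES)
path = find_path(graph, "drug_A", "disease_B")
for hop in path:
    print(hop)
```

Here no single triple links drug_A to disease_B, but following the edges reveals an indirect connection through a protein and a pathway.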

“The real power of the COPs concept is that once we understand all the biological pathways connecting drugs and diseases, that information can be used to develop new therapeutic agents—or repurpose existing ones—that modulate the same biological pathway,” explained the paper’s senior author Alexander Tropsha from the UNC Eshelman School of Pharmacy.

As described in a Drug Discovery Today paper, the researchers define COPs as a chain of key events—molecular initiating event, intermediate event(s), and the clinical outcome—that are responsible for the therapeutic actions of a drug. Each element of the chain corresponds to a term defined in commonly used biomedical ontologies, which allows computational methods to be used to elucidate COPs and provides a way for them to be cataloged for future use.
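One way to picture a COP as a cataloged record is the sketch below. The class names, field names, and ontology identifiers are hypothetical choices made for illustration; the paper may represent COPs differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KeyEvent:
    label: str
    ontology_id: str  # placeholder identifier, not a real ontology term


@dataclass
class ClinicalOutcomePathway:
    drug: str
    molecular_initiating_event: KeyEvent
    intermediate_events: List[KeyEvent]
    clinical_outcome: KeyEvent

    def chain(self):
        """Return the key events in order, from initiating event to outcome."""
        return [
            self.molecular_initiating_event,
            *self.intermediate_events,
            self.clinical_outcome,
        ]


# Invented example values, for illustration only.
cop = ClinicalOutcomePathway(
    drug="drug_A",
    molecular_initiating_event=KeyEvent("binds receptor R", "EX:0001"),
    intermediate_events=[KeyEvent("reduces signaling in pathway W", "EX:0002")],
    clinical_outcome=KeyEvent("symptom relief in disease B", "EX:0003"),
)
print(" -> ".join(event.label for event in cop.chain()))
```

Because every event carries an ontology identifier, records like this could be compared and searched by software rather than read one at a time.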

Read more…

RENCI’s Advanced Cyberinfrastructure Support Team introduces updated research resources

The Advanced Cyberinfrastructure Support (ACIS) team at RENCI works to provide efficient, available resources for our researchers. Over the last several months, the team has introduced several new capabilities and tools that support researchers in successfully producing results from their computing research.

Read more…

Use cases show Translator’s potential to expedite clinical research

RENCI investigators are contributing to the development of a platform called Biomedical Data Translator that will allow researchers to easily access and interrelate large amounts of data relevant to advancing biomedical research. Funded by the NIH’s National Center for Advancing Translational Sciences (NCATS), the new system is poised to accelerate translational clinical research by allowing users to approach biomedical questions from a holistic perspective to inspire important new research directions.

The platform is being developed by a 15-team multi-institutional Biomedical Data Translator consortium. Three of these teams include leadership from RENCI investigators. Although still a work in progress, Translator is being designed as an easy-to-use tool that can quickly respond to queries by identifying and synthesizing relevant data from a wide variety of sources.

Read more…

New streamlined statistical method provides improved pattern detection and risk prediction for disease

The novel regression algorithm, CALF, outperforms the current gold standard, LASSO, in statistical tests

Researchers from the Renaissance Computing Institute (RENCI) at UNC-Chapel Hill, Perspectrix, the UNC School of Medicine, and the WVU Rockefeller Neuroscience Institute have collaborated to develop a new method for finding patterns in data which verifiably surpasses the performance of a generally accepted “gold standard.” 

Attempting to find patterns in data is central to all research, and it is particularly important in medical use of biological samples to predict a patient’s risk for disease formation and progression. Today, researchers can utilize advanced technology to produce an ocean of data about one person from various biological samples such as blood, DNA, and saliva, with the goal of identifying particular markers that can be informative about a person’s current health and future outlook. However, this advanced data collection and processing has outpaced current statistical methods for identifying simple but robust patterns and relationships, and this is particularly true for the field of psychiatry. For instance, researchers have yet to fully understand and predict the progression of schizophrenia. 

This new method, CALF, which stands for “coarse approximation linear function,” is described in the Scientific Reports paper, “A greedy regression algorithm with coarse weights offers novel advantages,” published on March 31, 2022. Applied to five quite different examples from psychiatric and neurological studies, CALF consistently outperformed the gold standard, LASSO (“least absolute shrinkage and selection operator”) regression, as well as other methods.
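To give a feel for what a greedy regression with coarse weights might look like, here is a minimal sketch. It is not the published CALF implementation: it assumes that "coarse weights" means each selected marker gets a weight of +1 or -1 (and unselected markers 0), and it scores candidates by Pearson correlation with the outcome, which is a choice made for this example. The marker names and values are invented.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def greedy_coarse_fit(columns, outcome, max_terms=3):
    """Greedily add markers with weight +1 or -1 while the score improves."""
    weights = {}
    total = [0.0] * len(outcome)
    best_score = 0.0
    for _ in range(max_terms):
        best = None
        for name, values in columns.items():
            if name in weights:
                continue
            for w in (1, -1):
                candidate = [t + w * v for t, v in zip(total, values)]
                score = pearson(candidate, outcome)
                # Small tolerance so floating-point noise is not an "improvement".
                if score > best_score + 1e-12:
                    best_score, best = score, (name, w, candidate)
        if best is None:
            break  # no remaining marker improves the score
        name, w, total = best
        weights[name] = w
    return weights, best_score


# Invented toy data: the outcome equals m1 - m2, and "noise" is constant.
columns = {
    "m1": [1, 2, 3, 4, 5],
    "m2": [5, 3, 4, 1, 2],
    "noise": [2, 2, 2, 2, 2],
}
outcome = [-4, -1, -1, 3, 3]
weights, score = greedy_coarse_fit(columns, outcome)
print(weights, score)
```

The appeal of such a scheme is that the resulting model is a simple signed sum of a few markers, which is easy to read and report.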

Read more…

New data format aids large-scale evolutionary biology research

In addition to revealing the hidden histories of life on Earth, studying the evolutionary relationships between organisms can help scientists track emerging diseases, inform methods to control invasive species, and understand how to best protect at-risk ecosystems.  

DNA sequencing and other genetic analysis approaches are providing vast new data streams to enable this research at unprecedented scales. For example, the Open Tree of Life Project is attempting to create a synthesized view of the evolutionary relationships among every known organism – more than 1.7 million species.

To aid in these endeavors, Gaurav Vaidya, PhD, from RENCI collaborated with a multi-institutional team of researchers to create a new data format that makes the clade definitions used by evolutionary biologists readable and interpretable by computers. A clade, which comprises an ancestor and all of its descendants, makes up a portion of a phylogeny, a set of evolutionary relationships between different organisms.
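The idea of a clade as an ancestor plus all of its descendants can be made concrete with a small sketch. The tree below is a hypothetical toy phylogeny, and the functions are illustrative only; they do not reflect the new data format itself.

```python
# Hypothetical toy phylogeny: each internal node maps to its children.
TREE = {
    "root": ["mammals", "birds"],
    "mammals": ["primates", "rodents"],
    "primates": ["human", "chimpanzee"],
    "rodents": ["mouse"],
    "birds": ["sparrow"],
}


def clade_members(tree, ancestor):
    """Return the ancestor plus every node descended from it."""
    members = {ancestor}
    stack = [ancestor]
    while stack:
        node = stack.pop()
        for child in tree.get(node, []):
            members.add(child)
            stack.append(child)
    return members


def path_from_root(tree, root, target):
    """Return the list of nodes from root down to target, or None."""
    if root == target:
        return [root]
    for child in tree.get(root, []):
        sub_path = path_from_root(tree, child, target)
        if sub_path:
            return [root] + sub_path
    return None


def most_recent_common_ancestor(tree, root, a, b):
    """Deepest node shared by the root paths of a and b (both must be in the tree)."""
    shared = None
    for x, y in zip(path_from_root(tree, root, a), path_from_root(tree, root, b)):
        if x != y:
            break
        shared = x
    return shared


ancestor = most_recent_common_ancestor(TREE, "root", "human", "mouse")
print(ancestor, sorted(clade_members(TREE, ancestor)))
```

Defining a clade by naming two organisms and taking everything under their most recent common ancestor is one rule a computer can apply consistently to any tree it is given.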

Read more…

Biomedical Data Translator Platform moves to the next phase

Although we now have huge amounts of data on everything from genes to the causes of disease, it is stored in an enormous variety of ways and in many different locations. This makes it difficult, if not impossible, to find and use this data to think about biomedical questions in a big picture, holistic way.

The NIH’s National Center for Advancing Translational Sciences (NCATS) Biomedical Data Translator program is working to change this by funding a platform that allows scientists to easily access and interrelate data to inform new research directions. RENCI investigators are part of the leadership for three of the 15 teams that make up the Biomedical Data Translator consortium.

The Translator platform is designed to accelerate the development of new treatments and translational clinical research. For example, it could help uncover potential new therapies and drug targets, further elucidate how environmental exposures impact disease, and reveal new relationships between rare and common diseases.

“Translator offers a way of looking at a large amount of information – the equivalent to reading all the research papers ever published – and returning a reasonable amount of information,” said RENCI’s Chris Bizon, co-PI of the Translator standards and reference implementation team. “It provides a hypothesis that can be investigated and a list of information that will be helpful to this investigation.”

Read more…

Drone projects take data processing and communication to new heights

Communicating after a natural disaster is often critical but can be challenging if telecommunications lines are damaged or wireless networks become overwhelmed. Drones, however, can be used to quickly create an on-demand communication infrastructure that is not only useful for emergency situations but can also be used for transportation, surveillance and crop monitoring. 

RENCI researchers are contributing to cutting-edge research projects that aim to make drones even more useful by improving how their data is handled and by providing a testbed that helps researchers optimize drone-based communication. 

Read more…

RENCI researchers awarded 2021 Best Paper from the Elsevier FGCS Journal

RENCI researchers recently received the 2021 Best Paper Award from the Elsevier Future Generation Computer Systems (FGCS) Journal. The paper, titled “End-to-end online performance data capture and analysis for scientific workflows,” was co-authored by Cong Wang, Anirban Mandal, and collaborators from the DOE Panorama and RAMSES projects.

The FGCS Journal aims to lead the way in advances in distributed systems, collaborative environments, high performance computing (HPC), and big data on such infrastructures as grids, clouds, and the Internet of Things. Each year, the editorial board awards “Best Paper” to one submission featured in the journal.

Read more…