RENCI Blog | RENCI

UNC-Chapel Hill’s RENCI and CRC unveil new streamlined APSViz user interface for hurricane impacts and visualization

Published: September 12, 2024

APSViz could be a key resource for state and federal stakeholders and response teams as Atlantic Ocean heats up

The 2024 Atlantic Ocean hurricane season started off with Hurricane Beryl, an intense Category 5 storm that caused extensive damage along its path through the Caribbean region before weakening and making a final landfall along the Texas coast in early July¹. Beryl was followed by Debby, which dropped significant precipitation along its path from Florida to New England. After a lull in tropical activity in the North Atlantic, due in part to airborne Saharan dust over the subtropical ocean, a critical cyclone formation area², storm activity is picking up again with Francine approaching landfall along the Louisiana coast.

Over the past four years, UNC-Chapel Hill’s Renaissance Computing Institute (RENCI) and Coastal Resilience Center (CRC) collaborated to develop a state-of-the-science, cloud-ready data engine, visualization, and information delivery system called the ADCIRC Prediction System (APS). APS computes and disseminates real-time coastal hazards information. A key component of APS is APSViz, a web portal developed by RENCI’s Earth Data Science group, led by Dr. Brian Blanton, that visualizes the system’s complex computer simulations. After the 2023 hurricane season, the APSViz framework was re-engineered to improve the user experience and map display response times, resulting in a website that is faster and more intuitive.

“With every storm, we are very keen to continue improving the accuracy, efficiency, and quality of products we are able to deliver,” said Dr. Rick Luettich, Lead Principal Investigator of the Coastal Resilience Center at UNC-Chapel Hill.

During significant coastal events, APS and APSViz have the potential to serve as key resources for state and federal stakeholders and response teams. By providing critical, real-time, and high-resolution information to assist officials in making decisions for timely evacuations and other preventative measures, they offer substantive value to response strategies that reduce negative impacts from these events.

“It’s an alternative prediction capability that has shown to be very skillful at very high resolution,” said Blanton.

Figure 1. The latest APSViz interface displays real-time results from the ADCIRC Prediction System (APS). This example is for Hurricane Francine as it approaches Louisiana.

The primary computer model for coastal storm surge and wind wave simulation is ADCIRC, co-developed by UNC-Chapel Hill and University of Notre Dame researchers, along with other academic, federal, and industry collaborators. Originally developed for retrospective simulation of past meteorological events, ADCIRC has been increasingly used for forecasting and predicting weather-driven, coastal environment impacts. ADCIRC is the core computational model in the ADCIRC Prediction System, and APSViz is a “window” into these real-time forecasting activities.

According to NOAA, hurricane activity for the 2024 season is expected to be higher than normal for the North Atlantic Ocean³, which typically gets about 14 named tropical events each year. The 2024 North Atlantic season will be a critical proving ground for the APSViz infrastructure as it becomes more generally available at https://apsviz.adcircprediction.org. For more information on the infrastructure behind ADCIRC and APSViz, see our previous blog post here.


RENCI receives $1.4 million NSF award to help develop an expansive public-data infrastructure

Published: August 13, 2024

RENCI will design an interconnecting fabric that links a diverse set of knowledge graphs, helping form a prototype Open Knowledge Network to power the next information revolution

RENCI has received $1.4 million from the U.S. National Science Foundation (NSF) as part of the Building the Prototype Open Knowledge Network (Proto-OKN) program. The program is developing a prototype public-data infrastructure that will transform our ability to unlock actionable insights from data by semantically linking information about related entities.

As a publicly accessible, interconnected set of data repositories and associated knowledge graphs, the Open Knowledge Network will enable data-driven, artificial intelligence-based solutions for a broad set of societal and economic challenges. By representing relationships among real-world entities, knowledge graphs provide a powerful way to organize, represent, integrate, reuse and access data from multiple sources.

“There’s a great deal of data and knowledge out there, but because most of it exists in unstructured government repositories, the [scientific] literature or documents and other places, it’s hard for people to find and query, and, most importantly, it’s difficult to integrate,” said Chris Bizon, Director of Analytics & Data Science, who leads the RENCI team working on this project. “Proto-OKN will establish a prototype version of a system in which many different types of information could be inserted, attached or uploaded in structured ways so that users can get value out of that data, either by directly accessing it or building applications that make use of it.”

Synthesizing data to solve societal problems

The Proto-OKN awardees include 15 Theme 1 projects that will each develop a knowledge graph to provide data-centric solutions to societal challenges related to issues such as equitable water distribution, justice, carbon capture, the environment, health communication, agriculture and homelessness.

“These inaugural Proto-OKN projects will not only advance the state of the art in data and artificial intelligence, but they will also have tremendous positive impacts on the societal and economic opportunities that we see before us,” said Erwin Gianchandani, assistant director for NSF’s Directorate for Technology, Innovation and Partnerships.

RENCI is one of two Theme 2 awardees that will develop and deploy technologies that provide an interconnecting fabric linking the various knowledge graphs developed by the Theme 1 teams. Educational materials and tools for those interested in engaging with the Proto-OKN will be developed by a Theme 3 awardee.

Creating a common understanding

When considering their task, the RENCI team decided that three fabrics were needed to link the knowledge graphs. Thus, their project, called Fabric Integrating Networked Knowledge (FRINK), includes a knowledge fabric that will incorporate the tools necessary for knowledge graph integration and interoperability; a technical fabric with a cloud-based technical infrastructure for deploying and tapping into the knowledge graphs; and a social fabric that will create a common vision, standard semantics and shared protocols for OKN.

RENCI will bring its expertise in several areas to the project, including extensive experience with building and integrating knowledge graphs for the NIH Biomedical Data Translator Program, an effort to integrate multiple types of data, such as information on the signs and symptoms of disease and drug effects. Like Proto-OKN, the Biomedical Data Translator Program includes a group of independently funded project teams that must work together to create a new tool.

“RENCI has a lot of familiarity in working in consortia and across teams to create a common understanding,” said Bizon. “We are also well-versed in the technical aspects of cloud deployments, including building platforms that can run in the cloud and incorporating the ability to provide continuous integration and continuous deployment in those environments.”

Drawing on technical expertise

The FRINK project also includes two subcontracts. One is with Rada Chirkova, Professor in the Computer Science Department at North Carolina State University. She will use her expertise in knowledge graph algorithms to build analytical tools that allow people to work on the knowledge graphs and gain useful information from analyses using the system.

The other subcontract is with Andrew Su, Professor in the Department of Integrative Structural and Computational Biology at Scripps Research. He is an expert on knowledge integration and has experience with using Wikidata as a knowledge backbone to connect knowledge graphs, an approach that will be applied in this project.

“The real power of the system, we hope, is in what happens when you integrate all of these pieces of information,” said Bizon. “Thus, there will be a lot of work in defining new use cases and really discovering the value of this integrated data source.”

Once it is developed, the Proto-OKN will benefit a broad range of people and organizations — including government agencies, businesses, nonprofits, researchers and others — by providing access to integrated information for a variety of uses, such as pursuing societal and economic opportunities, driving evidence-based policies and developing novel AI capabilities.

Learn more about Proto-OKN on the program webpage: https://new.nsf.gov/funding/opportunities/building-prototype-open-knowledge-network-proto.

Read the NSF press release: https://new.nsf.gov/tip/updates/nsf-invests-first-ever-prototype-open-knowledge-network.


NC researchers reconvene for second Clinical and Environmental Health Data workshop

Published: April 18, 2024

On Friday, February 23, 2024, RENCI hosted the second workshop in a series on Clinical and Environmental Health Data, themed “Integrating Exposures Data into Clinical Data Assets: Building a Regional Center of Excellence.” The inaugural workshop, themed “Clinical and Environmental Health Data Workshop Series – Exploration,” was also hosted by RENCI in May 2023.

The workshop series is being jointly led by experts in clinical and environmental health data and cyberinfrastructure at RENCI, US EPA, and NIEHS. The overall goal of the series is to leverage the wealth of expertise, resources, and organizations focused on clinical and environmental health within the RTP region and the broader State of North Carolina and establish a Regional Center of Excellence in Clinical and Environmental Health.

The second workshop built on the success of the first and on the gaps and opportunities identified there: interest in exploring a regional environmental exposures data hub, the need for more timely release of environmental exposures data and models, and the need for tools to integrate environmental exposures data with electronic health record (EHR) data and EHR-like data.

The workshop convened a group of ~30 regional experts in clinical informatics, EHR data, environmental exposures modeling, environmental health, and community health, with broad representation from academic, industry, and federal organizations within the RTP region.

Two working sessions served to focus the workshop discussions and activities. The first working session, titled “Developing exposures models and releasing them in a more timely manner,” was moderated by Kyle Messier, Stadtman Tenure-Track Investigator in the Division of Translational Toxicology at NIEHS. The discussion focused on collaborative development of open-source environmental exposures models and tools and their application to real-world use cases, with attention to how environmental exposures models can better serve clinical and epidemiological studies.

The second working session, titled “Applying exposures models to EHR and EHR-like data,” was jointly moderated by Emily Pfaff, Assistant Professor and co-Director of Informatics and Data Science at NC TraCS, and Cavin Ward-Caviness, Senior Computational Biologist in the Public Health and Integrated Toxicology Division at the US EPA. The discussion focused on challenges and solutions for linking clinical and environmental exposures data and on how common clinical data models such as OMOP might facilitate the linkage. The group also discussed a proposed NC Environmental Exposures Data Hub and related efforts such as UNC’s Enviroscan, Duke’s Seed Health Atlas, and the NC Department of Health and Human Services’ Environmental Health Data Dashboard.

One of the major outcomes of the second workshop was the development of a high-level outline for a workshop publication intended to lay out the vision and structure for the proposed Regional Center of Excellence. Workshop participants are now working toward developing a full publication and a shared vision for the center.

Workshop Planning committee: Ashok Krishnamurthy (Director of RENCI and co-Director of Informatics and Data Science at NC TraCS), Karamarie Fecho (Research Affiliate at RENCI and President of Copperline Professional Solutions), Jessica Natonick (Research Project Coordinator at RENCI), Cavin Ward-Caviness (Senior Computational Biologist at US EPA), and Charles Schmitt (Director of the Office of Data Science at NIEHS).

For those interested in learning more about the Clinical and Environmental Health Workshop Series, please contact Jessica Natonick at jnatonick@renci.org.

ChatGPT used to streamline medical record analysis in EduHeLx

Published: September 22, 2023

The EduHeLx team at the Renaissance Computing Institute demonstrated time- and cost-saving capabilities of ChatGPT in an educational use case for a UNC-Chapel Hill clinical data science course.

In the past few months, ChatGPT has risen from relative obscurity to a newsworthy technology for its revolutionary artificial intelligence (AI) capabilities. The natural language processing chatbot was developed by OpenAI and is built on top of families of large language models. This approach enables ChatGPT to return related search results by reasoning over interconnected knowledge networks across these language models, rendering it the most advanced AI chatbot to date. ChatGPT’s innovative AI capabilities have significant time- and cost-saving implications in many instances, including those in the educational field, which was recently demonstrated by the EduHeLx team at the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill.

EduHeLx was used in the Spring 2023 UNC-Chapel Hill course, CHIP690: Foundations of Clinical Data Science, which introduces students to hands-on Electronic Health Record analysis training. The platform helps students understand how effectively using this data can advance clinical research and improve patient outcomes. The class leveraged realistic, but synthetic, patient data downloaded as CSV files, which must be imported into a database (here, PostgreSQL) before they can be used for analysis. Before the files can be imported, one must create the table definitions (also known as the schema) that will store the data. The step is conceptually simple, but it is time-consuming, tedious, and prone to missing subtle details. Jeff Waller, one of the EduHeLx developers who worked on this issue, stated, “Complicating matters more, there was also a time constraint and a rather large number of table definitions that needed to be created (34). Combined, this would easily account for hours worth of work.”

Given the time constraints and large number of files, the EduHeLx team turned to ChatGPT to automate the process. With just 20 lines of code, ChatGPT generated database schema definitions from the CSV files, as well as the “import statements” needed to import the contents of the CSV files into the database. The entire process took roughly 45 minutes, with the total cost amounting to only 20 cents. The team used the resulting data import statements to construct the database and fill it with data, and the students were then given access to the data via database login. Not only did ChatGPT expedite an otherwise tedious and time-consuming process for this course, but this solution is general enough to be reusable for future courses where it is necessary to create database schema definitions and import statements from CSV files for use in EduHeLx.
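To make the task concrete, here is a minimal Python sketch of the kind of output involved: a table definition and an import statement derived from a CSV file. It is not the EduHeLx team’s code and it does not call ChatGPT; it swaps in a simple type-guessing heuristic purely for illustration. The function names, the sample `patients` data, the file path, and the choice of PostgreSQL’s `COPY` as the import statement are all assumptions.

```python
import csv
import io


def infer_sql_type(values):
    """Guess a PostgreSQL column type that fits every non-empty sample value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"
    # Try the narrowest type first; fall back to TEXT if nothing numeric fits.
    for sql_type, caster in (("BIGINT", int), ("DOUBLE PRECISION", float)):
        try:
            for v in non_empty:
                caster(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def build_statements(table_name, csv_text, csv_path):
    """Return (CREATE TABLE statement, COPY statement) for one CSV file."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    columns = []
    for index, name in enumerate(header):
        samples = [row[index] for row in rows if index < len(row)]
        columns.append(f'    "{name}" {infer_sql_type(samples)}')
    create = f'CREATE TABLE "{table_name}" (\n' + ",\n".join(columns) + "\n);"
    # Assumption: a server-side COPY is used for the import step.
    copy = f"COPY \"{table_name}\" FROM '{csv_path}' WITH (FORMAT csv, HEADER true);"
    return create, copy


if __name__ == "__main__":
    # Hypothetical synthetic data, standing in for one of the course CSV files.
    sample = "id,name,weight\n1,Ann,61.5\n2,Bob,\n"
    create_sql, copy_sql = build_statements("patients", sample, "/tmp/patients.csv")
    print(create_sql)
    print(copy_sql)
```

A heuristic like this misses the subtle details mentioned above (dates, identifiers that look numeric, nullable columns), which is part of why writing 34 table definitions by hand, or reviewing generated ones, takes care.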

This use case demonstrates the utility of both ChatGPT and EduHeLx, as both proved essential to students’ success in their hands-on analysis training. In addition to CHIP690, EduHeLx has been successfully deployed in the UNC-Chapel Hill course, COMP116: Introduction to Scientific Programming, in Fall 2021 and Spring 2022. Given its unique cloud-based programming capabilities, EduHeLx has the potential to serve as an essential resource for many other courses, particularly those developed and cross-listed by the new UNC School of Data Science and Society (SDSS).

Looking ahead, the EduHeLx team plans to continue optimizing the platform. Future plans include incorporating Otter-Grader, a tool developed by the University of California, Berkeley that provides auto-grading capabilities and real-time error and efficiency feedback to students. This will further increase EduHeLx’s utility in programming-based courses and improve the teaching and learning experience for instructors and students.

EduHeLx is looking for pilot instructors interested in using the platform in their data science courses. Reach out to helx@lists.renci.org if interested.

EduHeLx is an education-focused instance of HeLx, a scalable cloud-based computing platform developed by researchers at RENCI. HeLx offers a suite of tools, capabilities, and workspaces, enabling research communities to deploy custom data science workspaces securely in the cloud. EduHeLx was developed to address the needs of courses with programming components and currently supports programming using Python and R. For more information, see an earlier blog post about EduHeLx here.

NC researchers come together to harness the power of clinical and environmental health data

Published: June 9, 2023

In an increasingly interconnected world, the integration of clinical and environmental health data holds immense potential for advancing research, improving patient outcomes, and shaping the future of healthcare. However, to truly make an impact on individuals and communities, institutional and scientific silos that hinder collaboration and resource sharing must be overcome.

Recognizing this challenge, Cavin Ward-Caviness, PhD (US Environmental Protection Agency [US EPA]), Charles Schmitt, PhD (National Institute of Environmental Health Sciences [NIEHS]), and Karamarie Fecho, PhD, Ashok Krishnamurthy, PhD, and Sarah Tyndall (Renaissance Computing Institute [RENCI]) organized the inaugural Clinical and Environmental Health Data Workshop on Friday, May 19 at RENCI in Chapel Hill, NC.

“Pooling resources and expertise has the potential to catalyze groundbreaking research initiatives and identify previously unseen connections between environmental factors and human health outcomes,” according to Ashok Krishnamurthy, PhD, director of RENCI. “We are thrilled to be able to come together with our partners at NIEHS and the US EPA to work collaboratively on these hard – but impactful – problems.”

At the heart of this endeavor lies the ultimate goal of improving patient outcomes. By integrating clinical data, such as medical records and patient histories, with environmental data, researchers can gain deeper insights into the complex interplay between individual health and environmental factors. This holistic approach can lead to targeted interventions and personalized care plans.

The fusion of clinical and environmental health data not only benefits individual patients but also empowers communities. By leveraging integrated data, researchers and public health officials can identify environmental disparities, understand social determinants of health, and design evidence-based interventions tailored to specific communities. This knowledge equips policymakers with the tools needed to implement targeted interventions, allocate resources efficiently, and ensure the equitable distribution of healthcare services.

The half-day workshop brought together over twenty local scientists, healthcare professionals, and environmental experts from the Research Triangle Park (RTP) region to discuss the current state of the art and the work that still needs to be done to make these goals into reality.

The workshop began with several lightning talks where local leaders gave presentations on the tools, data, and methods in their research areas. Topics included:

Clinical Informatics: This presentation focused specifically on standardizing Electronic Health Records (EHRs), which are digital versions of patients’ medical records. Converting EHRs to a standardized model would allow them to be used for research and expand their reach beyond local and state boundaries to national, cross-institutional analysis.
Geospatial modeling: This presentation focused on various methods for modeling environmental exposures and subsequent population outcomes, which sparked a discussion on how additional factors, such as geography, could be included in the models and how to integrate with relevant exposure events.
Social and environmental determinants of health: This presentation focused on how to integrate EHRs with social and environmental data, which would provide a deeper understanding of how environmental exposure connects to health.
Community and public health: This presentation addressed the complexities of public health issues and their solutions. An example was shown of how social determinants of health impact outcomes of environmental health hazards, and emphasis was placed on the need for team science to tackle these complex issues.
Public health surveillance: This presentation described a tool for surveilling public health data, The North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT). NC DETECT contains data from emergency departments, North Carolina Poison Control, and emergency medical services.
Data science and related tools: This presentation highlighted the NIH Strategic Plan for Data Science and NIH priorities around building a biomedical data ecosystem that supports data sharing. The NIEHS Climate, Health, and Outcomes Research Data (CHORD) project, funded by the PCORI Trust Fund, is intended to serve as an exemplar for geospatial-based climate data and tools.
Cyberinfrastructure and software applications: This presentation focused on the cyberinfrastructure that RENCI has been developing to support clinical and environmental health research. The emphasis was on the informed development of cyberinfrastructure designed to bridge gaps between geoscience models and their clinical and public health applications.

After lightning talks, the group divided into breakout sessions focused on two themes: identifying gaps in integrating environmental and social health data, and creating a list of shared resources that can be used to address those gaps.

During the wrap-up session, there was robust discussion on establishing a vision and cadence for future workshops. Ultimately, the group plans to hold regular workshops to establish regional leadership in clinical and environmental health research, ensuring that the needs of local communities and stakeholders remain central to future initiatives. By nurturing the partnerships forged at this and future events, North Carolina can play a vital role in shaping the future of healthcare, driving transformative change, and moving toward a healthier and more sustainable future for all.

RENCI strengthens storm surge response capabilities

Published: June 5, 2023

APSViz provides critical, high-resolution coastal hazards information to expedite decision-making and productivity

On September 28, 2022, Hurricane Ian made landfall along the west coast of Florida as a Category 4 hurricane, the strongest Category 4 hurricane to hit the region since Hurricane Charley in 2004, causing substantial damage from strong winds and the resulting storm surge and wind waves. Hurricane Ian then crossed the Florida landmass, emerged into the Atlantic Ocean, strengthened back into a weak hurricane, and made a second landfall on the South Carolina coast. According to the National Oceanic and Atmospheric Administration (NOAA), the damage caused by Hurricane Ian in its two landfalls ranks it as the third-costliest weather disaster in U.S. history. This major event required multiple state and local agencies to prepare for significant storm impacts, assess potential damages, and plan for post-storm recovery activities.

Over the past three years, the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill, has been developing a state-of-the-science, cloud-ready data engine, visualization, and information delivery system called APSViz. As a core project within the Department of Homeland Security’s Coastal Resilience Center at UNC-Chapel Hill, APSViz disseminates real-time coastal hazards information and enhances research productivity by making it much easier to understand computer simulations and predictions of coastal hazards.

EduHeLx: A Cloud-based Programming Platform for Data Science Education

Published: August 31, 2022

The EduHeLx pilot experiment informed future thinking about incorporating cloud-based technologies in UNC-CH courses, including courses in the new UNC-CH School of Data Science & Society (SDSS)

EduHeLx is an education-focused instance of HeLx, a scalable cloud-based computing platform developed by researchers at the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill. HeLx offers a suite of tools, capabilities, and workspaces enabling research communities to deploy custom data science workspaces securely in the cloud.

EduHeLx was developed to address the needs of courses with programming components and currently supports programming using Python and R. Previously, students were required to download a course’s programming software onto their own computers, and instructors had to work one-on-one with students to troubleshoot issues throughout the semester. This was so time-consuming that it took away from teaching time and derailed course schedules, especially in computer science courses with 250+ students. With EduHeLx, neither instructors nor students need to set up any infrastructure: students access a course’s programming software in the cloud without downloading it, which saves a significant amount of class time.

New concept poised to accelerate drug discovery through data mining

Published: June 24, 2022

RENCI scientists together with collaborators from UNC and other institutions have developed and defined a concept called Clinical Outcome Pathways (COPs) that could help scientists harness the vast amounts of clinical and biomedical data available today to accelerate drug discovery and drug repurposing.

“Improving drug discovery requires understanding all the biological processes involved in how drugs work,” said the paper’s first author Daniel Korn from the UNC-Chapel Hill Department of Computer Science. “COPs help broaden the concept of a drug’s mechanism of action so that knowledge graph mining can be used to discover the complete chain of events that enables a specific therapeutic effect for a drug.”

Knowledge graphs express data as a collection of nodes—such as drugs and diseases—with edges that represent the relationships—such as drug A treats disease B—between the nodes. By bringing together heterogeneous information into a single system, knowledge graphs can reveal relationships between previously unconnected information that wouldn’t be obvious otherwise.
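The node-and-edge idea can be sketched in a few lines of Python. The example below is an illustration only: the triples are made-up placeholders rather than real biomedical assertions, and the helper names `build_graph` and `find_path` are hypothetical. It stores each relationship as a (subject, predicate, object) triple and uses a breadth-first search to surface an indirect connection between two nodes that no single edge states directly.

```python
from collections import defaultdict, deque

# Hypothetical triples for illustration only; not real biomedical assertions.
TRIPLES = [
    ("drug_A", "inhibits", "protein_X"),
    ("protein_X", "regulates", "pathway_Y"),
    ("pathway_Y", "contributes_to", "disease_B"),
    ("drug_C", "treats", "disease_D"),
]


def build_graph(triples):
    """Index triples by subject so outgoing edges can be looked up quickly."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the list of edges from start to goal, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for edge in find_path(graph, "drug_A", "disease_B") or []:
        print(" ".join(edge))
```

Here no triple links `drug_A` to `disease_B` directly, yet the search recovers a three-step chain between them, which is the kind of previously unconnected relationship the paragraph above describes.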

“The real power of the COPs concept is that once we understand all the biological pathways connecting drugs and diseases, that information can be used to develop new therapeutic agents—or repurpose existing ones—that modulate the same biological pathway,” explained the paper’s senior author Alexander Tropsha from the UNC Eshelman School of Pharmacy.

As described in a Drug Discovery Today paper, the researchers define COPs as a chain of key events—molecular initiating event, intermediate event(s), and the clinical outcome—that are responsible for the therapeutic actions of a drug. Each element of the chain corresponds to a term defined in commonly used biomedical ontologies, which allows computational methods to be used to elucidate COPs and provides a way for them to be cataloged for future use.
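One way to picture this definition is as a small data structure. The Python sketch below is an assumption-laden illustration, not the representation used in the paper: the class and field names are hypothetical, the ontology identifiers are placeholders, and the rule that at least one intermediate event is required is our reading of “intermediate event(s).”

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KeyEvent:
    """One link in the chain, tied to an ontology term."""
    label: str
    ontology_id: str  # placeholder identifier, not a real ontology term


@dataclass
class ClinicalOutcomePathway:
    """A drug plus its ordered chain of key events (hypothetical structure)."""
    drug: str
    molecular_initiating_event: KeyEvent
    intermediate_events: List[KeyEvent]
    clinical_outcome: KeyEvent

    def __post_init__(self):
        # Assumption: a pathway needs at least one intermediate event.
        if not self.intermediate_events:
            raise ValueError("a COP needs at least one intermediate event")

    def chain(self):
        """Return the key events in order, from initiating event to outcome."""
        return [
            self.molecular_initiating_event,
            *self.intermediate_events,
            self.clinical_outcome,
        ]


if __name__ == "__main__":
    cop = ClinicalOutcomePathway(
        drug="example_drug",
        molecular_initiating_event=KeyEvent("binds target", "EX:0001"),
        intermediate_events=[KeyEvent("pathway modulated", "EX:0002")],
        clinical_outcome=KeyEvent("symptom reduced", "EX:0003"),
    )
    print(" -> ".join(event.label for event in cop.chain()))
```

Because every event carries an ontology identifier, pathways stored this way could be compared, searched, and cataloged programmatically, which is the benefit the paragraph above points to.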


RENCI’s Advanced Cyberinfrastructure Support Team introduces updated research resources

Published: May 16, 2022

The Advanced Cyberinfrastructure Support (ACIS) team at RENCI works to provide efficient, available resources for our researchers. Over the last several months, the team has introduced several new capabilities and tools that support researchers in successfully producing results from their computing research.


Use cases show Translator’s potential to expedite clinical research

Published: May 4, 2022

RENCI investigators are contributing to the development of a platform called Biomedical Data Translator that will allow researchers to easily access and interrelate large amounts of data relevant to advancing biomedical research. Funded by the NIH’s National Center for Advancing Translational Sciences (NCATS), the new system is poised to accelerate translational clinical research by allowing users to approach biomedical questions from a holistic perspective to inspire important new research directions.

The platform is being developed by a 15-team multi-institutional Biomedical Data Translator consortium. Three of these teams include leadership from RENCI investigators. Although still a work in progress, Translator is being designed as an easy-to-use tool that can quickly respond to queries by identifying and synthesizing relevant data from a wide variety of sources.
