RENCI to showcase latest technological innovations at SC24

Every sector of society is undergoing a historic transformation driven by big data. RENCI is committed to transforming data into discoveries by partnering with leading universities, government, and the private sector to create tools and technologies that facilitate data access, sharing, analysis, management, and archiving.

Each year, the Supercomputing conference provides the leading technical program for professionals and students in the HPC community, as measured by impact, at the highest academic and professional standards. RENCI will host a booth (#3923) at SC24 where team members will share collaborative research projects and cyberinfrastructure efforts aimed at helping people use data to drive discoveries.

A full schedule of sessions at the RENCI booth can be found on our website.


19th Workshop on Workflows in Support of Large-Scale Science

Anirban Mandal, the Director of Network Research & Infrastructure Group at RENCI and co-PI of the DOE-funded Poseidon project, will co-chair the 19th Workshop on Workflows in Support of Large-Scale Science (WORKS), taking place November 18. WORKS 2024 focuses on the many facets of scientific workflow management systems, ranging from actual execution to service management and the coordination and optimization of data, service, and job dependencies.

iRODS 4.3.3, HTTP, OIDC, S3, and the future of iRODS 5.0

The open source iRODS (Integrated Rule-Oriented Data System) data management platform presents a virtual filesystem, metadata catalog, and policy engine designed to give organizations maximum control and flexibility over their data management practices and enforcement. This year saw the release of iRODS 4.3.3 as well as a new PAM Interactive Authentication Plugin, HTTP and OpenID Connect functionality, the S3 API, and three new client libraries.

iRODS will host a free mini-workshop on Monday, November 18 at 9 AM ET to cover the above efforts and give a glimpse of where the team is headed next, with a special focus on the future of iRODS 5.0 and what it may mean for integrations to come. Additionally, iRODS team members will present talks on these topics and be available for further discussion at the RENCI booth on the exhibit floor from November 19-21.

Building an approachable cost-effective data management platform

iRODS Executive Director Terrell Russell will give a talk at the DDN booth (#2431) on November 19 at 3:00 PM ET. Long-term data management is best executed when policies are clear and infrastructure is abstracted and swappable. iRODS aims to be normal and boring for the administrator and approachable and powerful for the user. This talk will cover recent advances and interfaces that allow companies to sustain FAIR data practices, enforce consistency and reproducibility, and realize cost savings through open source software.

FABRIC Network Research Infrastructure: Updates and User Highlights

As part of the 2024 INDIS Workshop, FABRIC will be represented on November 19 at 11 AM ET at the SC Theater on the exhibit floor. PI Paul Ruth will share FABRIC updates from the past year, and two FABRIC users will show demos of their FABRIC experiments.

FABRIC Research Infrastructure: Current Capabilities and Stories from our Users

Those interested in learning more about FABRIC at Supercomputing 2024 will be able to visit RENCI’s in-person booth (#3923) and learn about updates to the project. FABRIC PI Paul Ruth will present talks at the booth on November 20 and 21, accompanied by various FABRIC users. User presentations will vary by day, and include: 

  • Using AI/ML with P4 on FABRIC for intelligent routing | Mariam Kiran, ORNL
  • Using CREASE Tooling to smoothen your Testbed Experiment experience | Nik Sultana, Illinois Institute of Technology
  • FABRIC And Data Intensive Science Prototype Services | Joe Mambretti, Northwestern University and StarLight
  • Multi-Domain Experiments Using ESnet SENSE on the National Research Platform / PacWave / FABRIC | Mohammad Firas Sada, San Diego Supercomputer Center
  • Traffic Steering Without the Switch: Offloading Big Flows to the NIC | Justas Balcas, ESnet

FABRIC Facility Ports at SCinet

Beyond Booth #3923, we’re excited to announce that there will be FABRIC facility ports at SCinet at several booths this year in addition to RENCI, including:

  • StarLight | Booth #2551
  • California Institute of Technology / CACR | Booth #845
  • Ciena | Booth #1940

Ciena has worked in partnership with FABRIC to construct a mini-FABRIC node, known as a “Traveling Fabric Node.” This node is designed to be mobile and will be used to support demonstrations, presentations, and experiments at events and conferences such as Supercomputing, Optical Fiber Conference, and others. The Ciena-built traveling node will be located in the Ciena booth on the SC24 exhibit floor. The deployment will include connections through SCinet and will leverage the extensive wide-area connectivity engineered into SC24 to connect back to the FABRIC terabit core and its associated globally distributed infrastructure. This gives FABRIC users an opportunity to demonstrate innovative research and experiments that span the FABRIC core infrastructure and resources at the SC24 exhibit venue, including other SC24 participants’ booths.

SWARM: Scientific Workflow Applications on Resilient Metasystem

RENCI researchers Anirban Mandal, Komal Thareja, Erik Scott, and Cong Wang are listed as authors on a poster being presented at SC24, titled “SWARM: Scientific Workflow Applications on Resilient Metasystem.” The poster will be available to view in Rooms B302 – B305 from 12 – 5 PM ET on Tuesday, November 19.

Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

RENCI researchers Anirban Mandal and Cong Wang are listed as authors on a paper being presented at SC24, titled “Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning.” The paper will be presented in Room B311 at 2 PM ET on Thursday, November 21, as part of the Computational Efficiency and Learning Techniques session.


About RENCI

RENCI (Renaissance Computing Institute) develops and deploys advanced technologies to enable research discoveries and practical innovations. RENCI partners with researchers, government, and industry to engage and solve the problems that affect North Carolina, our nation, and the world. RENCI is an institute of the University of North Carolina at Chapel Hill.

iRODS on the IT Press Tour: Navigating the Data Deluge

iRODS Executive Director Terrell Russell spoke to the press about next steps for the data management software

iRODS Executive Director Terrell Russell participated in the 58th Edition of the IT Press Tour in Boston, MA in early October. This edition was dedicated to innovations in IT infrastructure, cloud, networking, security, and data and storage management, as well as the use of artificial intelligence (AI) across all of these topics.

The Integrated Rule-Oriented Data System (iRODS) is open source data management software used by research, commercial, and governmental organizations worldwide. The software is versatile, and is used by supercomputing centers and organizations working with data from fields such as physics, life sciences, genomics, health, financial services, and more.

During the event, Russell spoke to the press about the necessity of data management software as we move towards the “zettabyte era.” He discussed the software’s key strengths and features, including data virtualization, policy-based automation, secure collaboration, and data discovery through a metadata-centric approach. 

Russell shared future plans to form partnerships with system integrators who are interested in building iRODS into new vertical solutions for their customers. iRODS integrations provide a layer of seamless data management, without the need to replace existing storage systems. 

“The larger the organization, the more they need software like iRODS,” said Russell. “iRODS provides flexible insurance against the future. You know you will change your policy if you’ve been around long enough. You will buy a new shiny thing and plug it in. You will have to move stuff around, but this way, maybe your users, clients, and students don’t need to learn new tricks.”

The complete list of coverage from the event is below.

To learn more about iRODS and the iRODS Consortium, please visit irods.org.

UNC-Chapel Hill’s RENCI and CRC unveil new streamlined APSViz user interface for hurricane impacts and visualization

APSViz could be a key resource for state and federal stakeholders and response teams as Atlantic Ocean heats up

The 2024 Atlantic Ocean hurricane season started off with Hurricane Beryl, an intense Category 5 storm that caused extensive damage along its path through the Caribbean region before weakening and making a final landfall along the Texas coast in early July [1]. Beryl was followed by Debby, which dropped significant precipitation along its path from Florida to New England. After a lull in tropical activity in the North Atlantic, due in part to airborne Saharan dust over the subtropical ocean, a critical cyclone formation area [2], storm activity is returning to a more active state with Francine approaching landfall along the Louisiana coast.

Over the past four years, UNC-Chapel Hill’s Renaissance Computing Institute (RENCI) and Coastal Resilience Center (CRC) collaborated to develop a state-of-the-science, cloud-ready data engine, visualization, and information delivery system called the ADCIRC Prediction System (APS). APS computes and disseminates real-time coastal hazards information. A key component of APS is APSViz, a web portal developed by RENCI’s Earth Data Science group, led by Dr. Brian Blanton, that visualizes the complex computer simulations. After the 2023 hurricane season, the APSViz framework was re-engineered to improve the user experience and mapping display response times, resulting in a website that is faster and more intuitive.

“With every storm, we are very keen to continue improving the accuracy, efficiency, and quality of products we are able to deliver,” said Dr. Rick Luettich, Lead Principal Investigator of the Coastal Resilience Center at UNC-Chapel Hill. 

During significant coastal events, APS and APSViz have the potential to serve as key resources for state and federal stakeholders and response teams. By providing critical, real-time, and high-resolution information to assist officials in making decisions for timely evacuations and other preventative measures, they offer substantive value to response strategies that reduce negative impacts from these events.   

“It’s an alternative prediction capability that has shown to be very skillful at very high resolution,” said Blanton. 

Figure 1. The latest APSViz interface displays real-time results from the ADCIRC Prediction System (APS). This example is for Hurricane Francine as it approaches Louisiana. 

The primary computer model for coastal storm surge and wind wave simulation is ADCIRC, co-developed by UNC-Chapel Hill and University of Notre Dame researchers, along with other academic, federal, and industry collaborators. Originally developed for retrospective simulation of past meteorological events, ADCIRC has been increasingly used for forecasting and predicting weather-driven, coastal environment impacts. ADCIRC is the core computational model in the ADCIRC Prediction System, and APSViz is a “window” into these real-time forecasting activities. 

According to NOAA, hurricane activity for the 2024 season is expected to be higher than normal for the North Atlantic Ocean [3], which typically gets about 14 named tropical events each year. The 2024 North Atlantic season will be a critical proving ground for the APSViz infrastructure as it becomes more generally available at https://apsviz.adcircprediction.org. For more information on the infrastructure behind ADCIRC and APSViz, see our previous blog post here.

  1. https://www.climate.gov/news-features/event-tracker/category-5-hurricane-beryl-makes-explosive-start-2024-atlantic-season
  2. https://weather.com/storms/hurricane/news/2024-07-15-saharan-dust-atlantic-hurricane-season
  3. https://www.noaa.gov/news-release/noaa-predicts-above-normal-2024-atlantic-hurricane-season

RENCI receives $1.4 million NSF award to help develop an expansive public-data infrastructure

RENCI will design an interconnecting fabric that links a diverse set of knowledge graphs, helping form a prototype Open Knowledge Network to power the next information revolution

RENCI has received $1.4 million from the U.S. National Science Foundation (NSF) as part of the Building the Prototype Open Knowledge Network (Proto-OKN) program. The program is developing a prototype public-data infrastructure that will transform our ability to unlock actionable insights from data by semantically linking information about related entities.

As a publicly accessible, interconnected set of data repositories and associated knowledge graphs, the Open Knowledge Network will enable data-driven, artificial intelligence-based solutions for a broad set of societal and economic challenges. By representing relationships among real-world entities, knowledge graphs provide a powerful way to organize, represent, integrate, reuse and access data from multiple sources.
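To make the idea of linking knowledge graphs concrete, here is a minimal sketch in Python. It is not the Proto-OKN or FRINK implementation. The two tiny graphs, the identifiers (such as `well:17` and `chem:arsenic`), the predicate names, and the helper functions are all invented for illustration. Each graph is stored as a list of (subject, predicate, object) triples, and a question that neither graph can answer alone becomes answerable once the graphs are merged on a shared identifier.

```python
from collections import defaultdict

# Two made-up knowledge graphs, each a list of (subject, predicate, object) triples.
# All identifiers and facts below are illustrative placeholders, not real data.
water_graph = [
    ("well:17", "located_in", "county:Orange_NC"),
    ("well:17", "has_contaminant", "chem:arsenic"),
]
health_graph = [
    ("clinic:3", "located_in", "county:Orange_NC"),
    ("chem:arsenic", "associated_with", "condition:skin_lesions"),
]


def build_index(triples):
    """Map each subject to the list of (predicate, object) pairs it appears with."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def conditions_linked_to_wells(triples):
    """Follow well -> contaminant -> condition paths across the given triples."""
    index = build_index(triples)
    results = []
    for subject, predicate, obj in triples:
        if predicate != "has_contaminant":
            continue
        # The contaminant is the shared entity that connects the two graphs.
        for next_predicate, next_obj in index.get(obj, []):
            if next_predicate == "associated_with":
                results.append((subject, obj, next_obj))
    return results


# Either graph alone cannot answer the question; the merged graph can.
print(conditions_linked_to_wells(water_graph))
print(conditions_linked_to_wells(water_graph + health_graph))
```

The merge works only because both graphs use the same identifier for the same real-world entity, which is why shared semantics and identifiers matter when connecting independently built graphs.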

“There’s a great deal of data and knowledge out there, but because most of it exists in unstructured government repositories, the [scientific] literature or documents and other places, it’s hard for people to find and query, and, most importantly, it’s difficult to integrate,” said Chris Bizon, Director of Analytics & Data Science, who leads the RENCI team working on this project. “Proto-OKN will establish a prototype version of a system in which many different types of information could be inserted, attached or uploaded in structured ways so that users can get value out of that data, either by directly accessing it or building applications that make use of it.” 

Synthesizing data to solve societal problems

The Proto-OKN awardees include 15 Theme 1 projects that will each develop a knowledge graph to provide data-centric solutions to societal challenges related to issues such as equitable water distribution, justice, carbon capture, the environment, health communication, agriculture and homelessness.

“These inaugural Proto-OKN projects will not only advance the state of the art in data and artificial intelligence, but they will also have tremendous positive impacts on the societal and economic opportunities that we see before us,” said Erwin Gianchandani, assistant director for NSF’s Directorate for Technology, Innovation and Partnerships.

RENCI is one of two Theme 2 awardees that will develop and deploy technologies that provide an interconnecting fabric linking the various knowledge graphs developed by the Theme 1 teams. Educational materials and tools for those interested in engaging with the Proto-OKN will be developed by a Theme 3 awardee.

Creating a common understanding

When considering their task, the RENCI team decided that three fabrics were needed to link the knowledge graphs. Thus, their project, called Fabric Integrating Networked Knowledge (FRINK), includes a knowledge fabric that will incorporate the tools necessary for knowledge graph integration and interoperability; a technical fabric with a cloud-based technical infrastructure for deploying and tapping into the knowledge graphs; and a social fabric that will create a common vision, standard semantics and shared protocols for OKN.

RENCI will bring its expertise in several areas to the project, including extensive experience with building and integrating knowledge graphs for the NIH Biomedical Data Translator Program, an effort to integrate multiple types of data, such as information on the signs and symptoms of disease and drug effects. Like Proto-OKN, the Biomedical Data Translator Program includes a group of independently funded project teams that must work together to create a new tool.

“RENCI has a lot of familiarity in working in consortia and across teams to create a common understanding,” said Bizon. “We are also well-versed in the technical aspects of cloud deployments, including building platforms that can run in the cloud and incorporating the ability to provide continuous integration and continuous deployment in those environments.”

Drawing on technical expertise

The FRINK project also includes two subcontracts. One is with Rada Chirkova, Professor in the Computer Science Department at North Carolina State University. She will use her expertise in knowledge graph algorithms to build analytical tools that allow people to work on the knowledge graphs and gain useful information from analyses using the system.

The other subcontract is with Andrew Su, Professor in the Department of Integrative Structural and Computational Biology at Scripps Research. He is an expert on knowledge integration and has experience with using Wikidata as a knowledge backbone to connect knowledge graphs, an approach that will be applied in this project.

“The real power of the system, we hope, is in what happens when you integrate all of these pieces of information,” said Bizon. “Thus, there will be a lot of work in defining new use cases and really discovering the value of this integrated data source.”

Once it is developed, the Proto-OKN will benefit a broad range of people and organizations — including government agencies, businesses, nonprofits, researchers and others — by providing access to integrated information for a variety of uses, such as pursuing societal and economic opportunities, driving evidence-based policies and developing novel AI capabilities.

Learn more about Proto-OKN on the program webpage: https://new.nsf.gov/funding/opportunities/building-prototype-open-knowledge-network-proto.

Read the NSF press release: https://new.nsf.gov/tip/updates/nsf-invests-first-ever-prototype-open-knowledge-network.

NC researchers reconvene for second Clinical and Environmental Health Data workshop

On Friday, February 23, 2024, RENCI hosted the second workshop in a series on Clinical and Environmental Health Data, themed “Integrating Exposures Data into Clinical Data Assets: Building a Regional Center of Excellence.” The inaugural workshop, themed “Clinical and Environmental Health Data Workshop Series – Exploration,” was also hosted by RENCI in May 2023. 

The workshop series is being jointly led by experts in clinical and environmental health data and cyberinfrastructure at RENCI, US EPA, and NIEHS. The overall goal of the series is to leverage the wealth of expertise, resources, and organizations focused on clinical and environmental health within the RTP region and the broader State of North Carolina and establish a Regional Center of Excellence in Clinical and Environmental Health. 

The second workshop built on the success of the first and on the gaps and opportunities identified there, namely interest in exploring a regional environmental exposures data hub, the need for more timely release of environmental exposures data and models, and the need for tools to integrate environmental exposures data with electronic health record (EHR) data and EHR-like data.

The workshop convened a group of ~30 regional experts in clinical informatics, EHR data, environmental exposures modeling, environmental health, and community health, with broad representation from academic, industry, and federal organizations within the RTP region.

Two working sessions served to focus the workshop discussions and activities. The first working session, titled “Developing exposures models and releasing them in a more timely manner,” was moderated by Kyle Messier, Stadtman Tenure-Track Investigator in the Division of Translational Toxicology at NIEHS. The discussion centered on collaborative development of open-source environmental exposures models and tools, their application to real-world use cases, and how such models can better serve clinical and epidemiological studies. The second working session, titled “Applying exposures models to EHR and EHR-like data,” was jointly moderated by Emily Pfaff, Assistant Professor and co-Director of Informatics and Data Science at NC TraCS, and Cavin Ward-Caviness, Senior Computational Biologist in the Public Health and Integrated Toxicology Division at the US EPA. The discussion focused on challenges and solutions for linking clinical and environmental exposures data and on how common clinical data models such as OMOP might facilitate that linkage. The group also discussed a proposed NC Environmental Exposures Data Hub and related efforts such as UNC’s Enviroscan, Duke’s Seed Health Atlas, and the NC Department of Health and Human Services’ Environmental Health Data Dashboard.
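As a rough illustration of what linking clinical and environmental exposures data can involve, the Python sketch below attaches an exposure estimate to each clinical visit by matching on location and date. It is a simplified assumption, not a tool discussed at the workshop. The field names, the choice of ZIP code plus visit date as the join key, the PM2.5 measure, and all values are made up for the example.

```python
from datetime import date

# Hypothetical daily exposure estimates keyed by (ZIP code, date); values are invented.
exposures = {
    ("27514", date(2023, 5, 1)): 8.2,
    ("27514", date(2023, 5, 2)): 12.6,
    ("27701", date(2023, 5, 1)): 10.1,
}

# Hypothetical, synthetic visit records in an EHR-like shape.
visits = [
    {"patient_id": "p1", "zip": "27514", "visit_date": date(2023, 5, 2)},
    {"patient_id": "p2", "zip": "27701", "visit_date": date(2023, 5, 1)},
    {"patient_id": "p3", "zip": "27610", "visit_date": date(2023, 5, 1)},
]


def link_exposures(visit_records, exposure_table):
    """Return a copy of each visit with a 'pm25' field; None when no estimate exists."""
    linked = []
    for visit in visit_records:
        key = (visit["zip"], visit["visit_date"])
        linked.append({**visit, "pm25": exposure_table.get(key)})
    return linked


for row in link_exposures(visits, exposures):
    print(row["patient_id"], row["pm25"])
```

Visits with no matching estimate are kept with a missing value rather than dropped, so gaps in exposure coverage stay visible in the linked data.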

One of the major outcomes of the second workshop was the development of a high-level outline for a workshop publication intended to lay out the vision and structure for the proposed Regional Center of Excellence. Workshop participants are now working toward developing a full publication and a shared vision for the center.

Workshop planning committee: Ashok Krishnamurthy (Director of RENCI and co-Director of Informatics and Data Science at NC TraCS), Karamarie Fecho (Research Affiliate at RENCI and President of Copperline Professional Solutions), Jessica Natonick (Research Project Coordinator at RENCI), Cavin Ward-Caviness (Senior Computational Biologist at US EPA), and Charles Schmitt (Director of the Office of Data Science at NIEHS).

For those interested in learning more about the Clinical and Environmental Health Workshop Series, please contact Jessica Natonick at jnatonick@renci.org.

ChatGPT used to streamline medical record analysis in EduHeLx

The EduHeLx team at the Renaissance Computing Institute demonstrated time- and cost-saving capabilities of ChatGPT in an educational use case for a UNC-Chapel Hill clinical data science course.

In the past few months, ChatGPT has risen from relative obscurity to a newsworthy technology for its artificial intelligence (AI) capabilities. The natural language processing chatbot was developed by OpenAI and is built on top of families of large language models. These models generate responses to natural-language prompts based on patterns learned from large volumes of text, which makes ChatGPT one of the most capable AI chatbots released to date. ChatGPT’s capabilities have significant time- and cost-saving implications in many settings, including education, as recently demonstrated by the EduHeLx team at the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill.

EduHeLx was used in the Spring 2023 UNC-Chapel Hill course, CHIP690: Foundations of Clinical Data Science, which introduces students to hands-on Electronic Health Record analysis training. The platform helps students understand how effectively using this data can advance clinical research and improve patient outcomes. The class used realistic but synthetic patient data downloaded as CSV files, which must be imported into a database (here, PostgreSQL) before they can be used for analysis. Before the import, one must create the table definitions (also known as the schema) that will store the data; once those exist, importing the data is relatively easy. Although creating the definitions is straightforward, it is time-consuming, tedious, and prone to missed details. Jeff Waller, one of the EduHeLx developers who worked on this issue, stated, “Complicating matters more, there was also a time constraint and a rather large number of table definitions that needed to be created (34). Combined, this would easily account for hours worth of work.”

Given the time constraints and large number of files, the EduHeLx team turned to ChatGPT to automate the process. With just 20 lines of code, ChatGPT generated database schema definitions from the CSV files, as well as the “import statements” needed to import the contents of the CSV files into the database. The entire process took roughly 45 minutes, with the total cost amounting to only 20 cents. The team used the resulting data import statements to construct the database and fill it with data, and the students were then given access to the data via database login. Not only did ChatGPT expedite an otherwise tedious and time-consuming process for this course, but this solution is general enough to be reusable for future courses where it is necessary to create database schema definitions and import statements from CSV files for use in EduHeLx. 
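For readers unfamiliar with the task, the Python sketch below shows the kind of output involved: a table definition and an import statement derived from a CSV file. It is not the EduHeLx team’s code and does not use ChatGPT. It substitutes a simple rule-based type guess for the language model, and the function names, sample data, table name, and file name are all hypothetical.

```python
import csv
import io


def infer_pg_type(values):
    """Guess a PostgreSQL column type from string values, ignoring empty cells."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"
    # Try the stricter type first; fall back to TEXT if any value fails to parse.
    for pg_type, caster in (("BIGINT", int), ("DOUBLE PRECISION", float)):
        try:
            for v in non_empty:
                caster(v)
            return pg_type
        except ValueError:
            continue
    return "TEXT"


def csv_to_sql(table, csv_text):
    """Return (CREATE TABLE statement, psql \\copy command) for rectangular CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        col_values = [row[i] for row in data]
        columns.append(f'    "{name}" {infer_pg_type(col_values)}')
    create = f'CREATE TABLE "{table}" (\n' + ",\n".join(columns) + "\n);"
    # Assumes the CSV file is named after the table and sits in the working directory.
    copy = f"\\copy \"{table}\" FROM '{table}.csv' WITH (FORMAT csv, HEADER true)"
    return create, copy


# Made-up sample data standing in for one synthetic patient file.
sample = "patient_id,birth_year,weight_kg,city\n1,1980,70.5,Durham\n2,1975,,Raleigh\n"
create_stmt, copy_cmd = csv_to_sql("patients", sample)
print(create_stmt)
print(copy_cmd)
```

A rule-based guess like this handles only integers, decimals, and text. Dates, keys, and constraints are the subtle details that make hand-writing 34 definitions tedious, and they are where a tool that reads column names and sample values in context can help.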

This use case demonstrates the utility of both ChatGPT and EduHeLx, as both proved essential to students’ success in their hands-on analysis training. In addition to CHIP690, EduHeLx has been successfully deployed in the UNC-Chapel Hill course, COMP116: Introduction to Scientific Programming, in Fall 2021 and Spring 2022. Given its unique cloud-based programming capabilities, EduHeLx has the potential to serve as an essential resource for many other courses, particularly those developed and cross-listed by the new UNC School of Data Science and Society (SDSS). 

Looking ahead, the EduHeLx team plans to continue optimizing the platform. Future plans include incorporating Otter-Grader, a tool developed by the University of California, Berkeley that provides auto-grading capabilities and real-time error and efficiency feedback to students. This will further enhance EduHeLx’s utility in programming-based courses, thus enhancing instructors’ and students’ teaching and learning experiences.

EduHeLx is looking for pilot instructors interested in using the platform in their data science courses. Reach out to helx@lists.renci.org if interested. 

EduHeLx is an education-focused instance of HeLx, a scalable cloud-based computing platform developed by researchers at RENCI. HeLx offers a suite of tools, capabilities, and workspaces, enabling research communities to deploy custom data science workspaces securely in the cloud. EduHeLx was developed to address the needs of courses with programming components and currently supports programming using Python and R. For more information, see an earlier blog post about EduHeLx here.

NC researchers come together to harness the power of clinical and environmental health data

In an increasingly interconnected world, the integration of clinical and environmental health data holds immense potential for advancing research, improving patient outcomes, and shaping the future of healthcare. However, to truly make an impact on individuals and communities, institutional and scientific silos that hinder collaboration and resource sharing must be overcome.

Recognizing this challenge, Cavin Ward-Caviness, PhD (US Environmental Protection Agency [US EPA]); Charles Schmitt, PhD (National Institute of Environmental Health Sciences [NIEHS]); and Karamarie Fecho, PhD, Ashok Krishnamurthy, PhD, and Sarah Tyndall (Renaissance Computing Institute [RENCI]) organized the inaugural Clinical and Environmental Health Data Workshop on Friday, May 19 at RENCI in Chapel Hill, NC.

“Pooling resources and expertise has the potential to catalyze groundbreaking research initiatives and identify previously unseen connections between environmental factors and human health outcomes,” according to Ashok Krishnamurthy, PhD, director of RENCI. “We are thrilled to be able to come together with our partners at NIEHS and the US EPA to work collaboratively on these hard – but impactful – problems.”

At the heart of this endeavor lies the ultimate goal of improving patient outcomes. By integrating clinical data, such as medical records and patient histories, with environmental data, researchers can gain deeper insights into the complex interplay between individual health and environmental factors. This holistic approach can lead to targeted interventions and personalized care plans.

The fusion of clinical and environmental health data not only benefits individual patients but also empowers communities. By leveraging integrated data, researchers and public health officials can identify environmental disparities, understand social determinants of health, and design evidence-based interventions tailored to specific communities. This knowledge equips policymakers with the tools needed to implement targeted interventions, allocate resources efficiently, and ensure the equitable distribution of healthcare services.

The half-day workshop brought together over twenty local scientists, healthcare professionals, and environmental experts from the Research Triangle Park (RTP) region to discuss the current state of the art and the work that still needs to be done to make these goals into reality.

The workshop began with several lightning talks where local leaders gave presentations on the tools, data, and methods in their research areas. Topics included:

  • Clinical informatics: This presentation focused specifically on standardizing Electronic Health Records (EHRs), which are digital versions of patients’ medical records. Converting EHRs to a standardized model would allow them to be used for research and would extend their reach beyond local and state boundaries to national, cross-institutional analysis.
  • Geospatial modeling: This presentation focused on various methods for modeling environmental exposures and subsequent population outcomes, which sparked a discussion on how additional factors, such as geography, could be included in the models and how to integrate with relevant exposure events.
  • Social and environmental determinants of health: This presentation focused on how to integrate EHRs with social and environmental data, which would provide a deeper understanding of how environmental exposure connects to health.
  • Community and public health: This presentation outlined the complexities of public health issues and their solutions. An example showed how social determinants of health affect the outcomes of environmental health hazards, and emphasis was placed on the need for team science to tackle these complex issues.
  • Public health surveillance: This presentation described a tool for surveilling public health data, the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT). NC DETECT contains data from emergency departments, North Carolina Poison Control, and emergency medical services.
  • Data science and related tools: This presentation highlighted the NIH Strategic Plan for Data Science and NIH priorities around building a biomedical data ecosystem that supports data sharing. The NIEHS Climate, Health, and Outcomes Research Data (CHORD) project, funded by the PCORI Trust Fund, is intended to serve as an exemplar for geospatial-based climate data and tools.
  • Cyberinfrastructure and software applications: This presentation focused on the cyberinfrastructure that RENCI has been developing to support clinical and environmental health research. The emphasis was on the informed development of cyberinfrastructure designed to bridge gaps between geoscience models and their clinical and public health applications.

After lightning talks, the group divided into breakout sessions focused on two themes: identifying gaps in integrating environmental and social health data, and creating a list of shared resources that can be used to address those gaps.

During the wrap-up session, there was robust discussion on establishing a vision and cadence for future workshops. Ultimately, the group plans to hold regular workshops to establish regional leadership in clinical and environmental health research, ensuring that the needs of local communities and stakeholders remain central to future initiatives. By nurturing the partnerships forged at this and future events, North Carolina can play a vital role in shaping the future of healthcare, driving transformative change, and moving toward a healthier and more sustainable future for all.

RENCI strengthens storm surge response capabilities

APSViz provides critical, high-resolution coastal hazards information to expedite decision-making and boost productivity

On September 28, 2022, Hurricane Ian made landfall along the west coast of Florida as a Category 4 hurricane, the strongest Category 4 hurricane to hit the region since Hurricane Charley in 2004, causing substantial damage from strong winds and the resulting storm surge and wind waves. Hurricane Ian then crossed the Florida landmass, emerged into the Atlantic Ocean, strengthened back into a weak hurricane, and made a second landfall on the South Carolina coast. According to the National Oceanic and Atmospheric Administration (NOAA), the damage caused by Hurricane Ian in its two landfalls ranks it as the third-costliest weather disaster in U.S. history. This major event required multiple state and local agencies to prepare for significant storm impacts, assess potential damages, and plan for post-storm recovery activities.

Over the past three years, the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill, has been developing a state-of-the-science, cloud-ready data engine, visualization, and information delivery system called APSViz. As a core project within the Department of Homeland Security’s Coastal Resilience Center at UNC-Chapel Hill, APSViz disseminates real-time coastal hazards information and enhances research productivity by making it much easier to understand computer simulations and predictions of coastal hazards. 

Read more…

EduHeLx: A Cloud-based Programming Platform for Data Science Education

The EduHeLx pilot experiment informed future thinking about incorporating cloud-based technologies in UNC-CH courses, including courses in the new UNC-CH School of Data Science & Society (SDSS)

EduHeLx is an education-focused instance of HeLx, a scalable cloud-based computing platform developed by researchers at the Renaissance Computing Institute (RENCI), a data science research institute at UNC-Chapel Hill. HeLx offers a suite of tools, capabilities, and workspaces enabling research communities to deploy custom data science workspaces securely in the cloud. 

EduHeLx was developed to address the needs of courses with programming components and currently supports programming in Python and R. Previously, students had to download a course’s programming software onto their own computers, and instructors had to work one-on-one with students to troubleshoot issues throughout the semester. This was so time-consuming that it took away from teaching time and derailed course schedules, especially in computer science courses with 250+ students. With EduHeLx, neither instructors nor students need to set up infrastructure: students can access a course’s programming software in the cloud without downloading it, saving a significant amount of class time.

Read more…

New concept poised to accelerate drug discovery through data mining

RENCI scientists together with collaborators from UNC and other institutions have developed and defined a concept called Clinical Outcome Pathways (COPs) that could help scientists harness the vast amounts of clinical and biomedical data available today to accelerate drug discovery and drug repurposing.

“Improving drug discovery requires understanding all the biological processes involved in how drugs work,” said the paper’s first author Daniel Korn from the UNC-Chapel Hill Department of Computer Science. “COPs help broaden the concept of a drug’s mechanism of action so that knowledge graph mining can be used to discover the complete chain of events that enables a specific therapeutic effect for a drug.”

Knowledge graphs express data as a collection of nodes—such as drugs and diseases—with edges that represent the relationships—such as drug A treats disease B—between the nodes. By bringing together heterogeneous information into a single system, knowledge graphs can reveal relationships between previously unconnected information that wouldn’t be obvious otherwise.
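
The node-and-edge structure described above can be sketched in a few lines of Python. This is a minimal illustration only, not the researchers' method or tooling: every node and relationship name (drug_A, protein_X, pathway_Y, disease_B, and so on) is a hypothetical placeholder, and the breadth-first search is simply one common way to surface an indirect connection between two nodes.

```python
from collections import deque

# Hypothetical (subject, relation, object) triples; placeholder names only,
# not real biomedical facts.
TRIPLES = [
    ("drug_A", "binds", "protein_X"),
    ("protein_X", "regulates", "pathway_Y"),
    ("pathway_Y", "contributes_to", "disease_B"),
    ("drug_C", "treats", "disease_D"),
]


def build_graph(triples):
    """Map each subject node to its outgoing (relation, object) edges."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a chain of edges from start to goal.

    Returns a list of (subject, relation, object) edges, an empty list
    when start == goal, or None when no chain exists.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


graph = build_graph(TRIPLES)
path = find_path(graph, "drug_A", "disease_B")
for subj, rel, obj in path:
    print(f"{subj} --{rel}--> {obj}")
```

No single triple links drug_A to disease_B, yet the search recovers a three-step chain between them once the separate facts sit in one graph. That is the sense in which combining heterogeneous information can expose relationships that are not obvious from any one source.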

“The real power of the COPs concept is that once we understand all the biological pathways connecting drugs and diseases, that information can be used to develop new therapeutic agents—or repurpose existing ones—that modulate the same biological pathway,” explained the paper’s senior author Alexander Tropsha from the UNC Eshelman School of Pharmacy.

As described in a Drug Discovery Today paper, the researchers define COPs as a chain of key events—molecular initiating event, intermediate event(s), and the clinical outcome—that are responsible for the therapeutic actions of a drug. Each element of the chain corresponds to a term defined in commonly used biomedical ontologies, which allows computational methods to be used to elucidate COPs and provides a way for them to be cataloged for future use.

Read more…