RENCI’s Network Research and Infrastructure Group works to advance the nation’s cyberinfrastructure

For more than ten years, the Network Research and Infrastructure Group (NRIG) at RENCI has been developing specialized cyberinfrastructure critical for advancing computer science and a variety of scientific domains. Their projects are helping scientists use large amounts of data to make new discoveries and have enabled important new advances in distributed computing networks, cloud-based systems, and software-defined networks.

Next-generation testbeds

Thousands of computer scientists around the world use the cyberinfrastructure testbeds developed by NRIG to develop and experiment with new software and networking architectures. These activities aim to improve the Internet and ensure next-generation networks can handle large amounts of data securely.

The Global Environment for Network Innovations (GENI) virtual laboratory project, supported by the National Science Foundation (NSF), was one of NRIG’s first forays into federated research infrastructure. The GENI testbed allows researchers to develop and test networks and distributed applications at scale on a connected system that is separate from the Internet. Research published in more than 370 scientific papers has made use of this virtual laboratory.

NRIG’s director Ilya Baldin is the principal investigator for the portion of GENI known as ExoGENI, which is a distributed edge cloud system testbed. In collaboration with researchers led by Jeff Chase from Duke University, Baldin’s team developed the hardware and software necessary to create the cloud system as well as the middleware that controls access to the testbed.

“ExoGENI gave us experience in running a multi-organizational project involving large distributed systems,” said Baldin. “We had to figure out how to structure software development and how to deal with users for a completely new type of system. This experience and the collaborations we formed have proved critical to the success of many other projects.”

For instance, these experiences helped position NRIG to contribute to the NSF-funded cloud network platform known as Chameleon Cloud. This testbed provides tools for computer scientists to conceptualize, assemble, and try new cloud computing approaches. NRIG’s Paul Ruth is co-principal investigator of Chameleon Cloud, which launched in 2015 and entered its third phase of funding in 2020.

Scientists have used Chameleon Cloud to study power management, operating systems, virtualization, high performance computing, distributed computing, networking, security, machine learning, and more. In the third phase of the project, the RENCI team will develop new options for software-defined networking that will allow Chameleon to be compatible with the new NSF-funded FABRIC testbed.

Reimagining the Internet

FABRIC is a next-generation distributed system that combines a cloud system with high-speed optical links to give scientists a place to study new Internet architectures at scale. It will connect specialized testbeds and high-performance computing facilities around the world to create a rich fabric that can be used for a wide variety of experimental activities.

“If computer scientists were to start over and design the Internet from scratch today, it would likely be done in a very different way,” said Baldin, who is part of the leadership team for the FABRIC project. “Due to the huge cost reduction in computing memory and storage, it now seems feasible to add intelligence to the core of the network, rather than just processing data at the end hosts, as is done now.”

FABRIC is being designed to let computer scientists study completely new approaches to storing and processing data on the fly so that they can figure out what might be viable. Although the project is two years into a planned four-year construction, parts of the testbed are already operational, and teams are actively building hardware and software to enable experimenters to use the system.

In October 2020, the NSF funded work to expand FABRIC beyond the U.S. Placing FABRIC nodes overseas will allow experiments to move large amounts of data across long distances. These additional sites will be at the University of Tokyo; CERN, the European Organization for Nuclear Research in Geneva, Switzerland; the University of Bristol in the UK; and the University of Amsterdam.

Making networks smart

One of the latest trends in network design is the use of data-oriented approaches including artificial intelligence and machine learning (AI/ML) to make networks more intelligent. For example, the multi-institutional Poseidon project, funded by the Department of Energy, will leverage testbeds like FABRIC and Chameleon to provide a modeling environment that uses AI/ML to predict the performance of scientific applications on distributed infrastructures.

NRIG’s Anirban Mandal will lead the part of the Poseidon project focused on performance guidance for optimizing workflows. RENCI will also be developing ways to use AI/ML to detect and classify workflow anomalies and help train and validate ML models on the testbeds.

NRIG is also helping to apply AI/ML methods to upgrade AtlanticWave-SDX, a distributed experimental software-defined exchange (SDX), which uses cutting-edge network technology to facilitate data exchange among research and education networks in the U.S. and abroad. AtlanticWave-SDX is critical for research utilizing the Vera C. Rubin Observatory in Chile, which produces 20 terabytes of data each night that must be quickly, securely, and reliably transmitted to the U.S.

NRIG’s Yufeng Xin is leading a team that will help extend the software to create new network monitoring and analysis capabilities that integrate the latest ML technologies. “This monitoring will eventually be used to create a network that can automatically respond to problems,” said Xin. “Detecting and fixing network problems is critical when transferring extremely large amounts of data over long distances.”
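To put the Rubin Observatory figure quoted above in perspective, the short calculation below estimates the average network rate needed to move 20 terabytes within a given time window. It is only a rough sketch: the function name, the choice of transfer windows, and the use of decimal terabytes (1 TB = 10^12 bytes) are assumptions made for illustration, not details from the AtlanticWave-SDX project.

```python
def required_gbps(terabytes, hours):
    """Average rate in gigabits per second needed to move `terabytes`
    of data in `hours`, assuming decimal units (1 TB = 1e12 bytes)."""
    bits = terabytes * 1e12 * 8       # bytes -> bits
    seconds = hours * 3600
    return bits / seconds / 1e9       # bits/s -> Gbit/s


# Hypothetical transfer windows, chosen only for illustration.
for window_hours in (24, 8, 1):
    rate = required_gbps(20, window_hours)
    print(f"{window_hours:>2} h window: {rate:.2f} Gbit/s sustained")
```

These are averages over the whole window. A real link would also need headroom for retransmissions, competing traffic, and outages, which is part of why monitoring and automated response matter for transfers of this size.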

Strengthening cyberinfrastructure

CI Compass, one of NRIG’s newest projects, aims to improve the cyberinfrastructure used by NSF Major Facilities, which generate large amounts of data for research in astronomy, physics, environmental science, and other key domains. “Advanced data acquisition, storage, management, integration, mining, visualization, and computational processing services are critical for fulfilling the science missions for the NSF Major Facilities,” said Mandal, who is a co-principal investigator for the project. “We’ll be offering our expertise in computer science and network infrastructure to help enhance their cyberinfrastructure.”

The CI Compass team will provide expertise to help enhance and evolve the Major Facilities cyberinfrastructure, capture and disseminate cyberinfrastructure knowledge and best practices, and enable knowledge sharing among Major Facilities and the broader cyberinfrastructure community.

By delivering on prominent projects involving computer science teams across the U.S. and even around the world, NRIG has built a set of partners who now often approach NRIG to work with them on a variety of cutting-edge projects. The group continues to look for new ways to help computer scientists create a better Internet and to enhance scientific productivity by solving critical cyberinfrastructure problems.

RENCI-developed software helps train computers to read 3D microscopy images of the brain

New tool could help scientists understand brain structure changes underlying conditions such as autism

Scientists can now acquire detailed 3D microscopy images of an entire mouse brain in just hours thanks to technology advances such as the high-speed imaging technique known as light sheet microscopy. Although this new imaging data is providing incredible insights into the relationships between brain structure and disease, behavior, and cognition, it also comes with some big analysis challenges.

The images obtained with light sheet microscopy capture subcellular information for the approximately 100 million cells that make up the mouse brain. Making full use of this huge amount of data requires the daunting task of identifying important features such as nuclei in every cell. Although machine learning can help, algorithms must be trained to understand what a nucleus looks like, which requires large numbers of manually labeled nuclei to use as training data.

“Creating the training data is a challenging problem because the images can be noisy, and in some areas of the brain, the nuclei are packed so densely that it is hard to separate them out,” said David Borland, senior visualization researcher at RENCI and co-PI of the Nuclei Ninja project that is developing a high throughput platform for exploring and analyzing whole brain tissue cleared images. “To solve this problem, we developed the Segmentor software to produce high-quality data for training a machine learning algorithm to perform automatic segmentation.”

Acquiring training data

Segmentation of nuclei in brain images requires labeling all the 3D pixels, or voxels, that represent an individual nucleus in an image. The Segmentor software provides a very rough segmentation that a person can then refine. These refinements are fed back to the machine learning algorithm, which can use this input to produce segmentations that give users a better starting point for refining the segmentation next time. Once enough training data has been acquired, the algorithm should be able to produce segmentations that only need minimal corrections.
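The idea of labeling voxels and then refining a rough segmentation can be illustrated with a toy example. The sketch below is not Segmentor's code or data format. It assumes a tiny volume stored as nested lists, with 0 for background and positive integers as nucleus IDs, and a hypothetical `apply_refinement` helper that stands in for a user's manual corrections.

```python
from collections import Counter

# Hypothetical toy volume: 2 slices of 3 x 4 voxels.
# 0 = background, positive integers = nucleus IDs (assumed scheme).
initial = [
    [[0, 1, 1, 0],
     [0, 1, 2, 2],
     [0, 0, 2, 2]],
    [[0, 1, 1, 0],
     [0, 1, 1, 2],
     [0, 0, 2, 2]],
]


def voxel_counts(volume):
    """Count voxels per nucleus label, ignoring background (0)."""
    counts = Counter()
    for z_slice in volume:
        for row in z_slice:
            for label in row:
                if label != 0:
                    counts[label] += 1
    return counts


def apply_refinement(volume, edits):
    """Return a copy of the volume with edits {(z, y, x): new_label} applied."""
    refined = [[row[:] for row in z_slice] for z_slice in volume]
    for (z, y, x), new_label in edits.items():
        refined[z][y][x] = new_label
    return refined


# A user moves one voxel from nucleus 1 to nucleus 2 and erases a stray voxel.
edits = {(1, 1, 2): 2, (0, 0, 2): 0}
refined = apply_refinement(initial, edits)

print("initial:", dict(voxel_counts(initial)))
print("refined:", dict(voxel_counts(refined)))
```

In the workflow described above, corrected volumes like `refined` would serve as training data, so that the next rough segmentation starts closer to what the user wants.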

“This tool will eventually allow us to easily quantify differences in brain structure that are caused by genetic mutations associated with conditions that affect the brain, such as autism spectrum disorder,” said Borland. “And while we designed Segmentor for microscopy images, it can also be used with other imaging modalities such as MRI and CT.”

Light sheet microscopy produces a series of 2D image slices that are put together to create a large 3D image. One of the biggest challenges for creating Segmentor was figuring out ways for the user to intuitively interact with such complicated 3D data. The manual segmentation process must also be efficient because a single volume of data assigned for segmentation can contain thousands of nuclei.

Combining 2D and 3D visualization

“We created 3D visualization and 2D slice-based views that work in concert so that the user can edit in either view,” said Borland. “The 2D view reveals voxel intensities while the 3D view can make it easier to see the cellular geometry, which is especially useful if nuclei are close together.”

In a recent paper, the researchers showed that the Segmentor 2D-3D hybrid approach was two times faster than editing the same set of images with only 2D capabilities, without sacrificing accuracy. This increased efficiency could help the researchers more quickly reach their goal of acquiring around 20,000 high-quality manual 3D nuclei segmentations to train the machine learning algorithm.

A citizen science solution

To further ramp up the rate of acquiring training data, the researchers are working to turn their desktop tool into cloud-based citizen science software.

“We want to make an interface for people that is simpler and helps further distribute the work so that we can get more people contributing,” said Borland. “This is important because there are many areas of the brain, each with different characteristics. Even if we have enough training data for one part of the brain, we still need more training data for other parts of the brain to get good results for a whole brain.”

Other PIs for the Nuclei Ninja project include Guorong Wu and Jason L. Stein from UNC Chapel Hill and Minjeong Kim from UNC Greensboro.

Researchers developed the Segmentor software to produce high-quality data for training a machine learning algorithm to perform automatic segmentation. A volume rendering of the raw image intensities in the 3D view is shown.
The new software provides an initial nuclear segmentation (shown here in the 3D view) that is then refined by the user. These refinements are fed back to the machine learning algorithm, which can use this input to produce segmentations that give users a better starting point for refining the segmentation next time.
This figure shows an overview of the refining workflow for the Segmentor software.

Experts and researchers balance the scales at the NSF Conference on Data Science and Law

Data science and law are both disciplines perceived to have high barriers to entry. With data science, outsiders are overwhelmed by the thought of having to understand hard math and complicated computer code, as the Chief Justice of the Supreme Court demonstrated when he called statistical evidence of political gerrymandering “sociological gobbledygook.” With respect to law, computer and data scientists feel unequipped to interpret the fairness and justice of their work and perhaps do not even see it as relevant. Many data practitioners believe, “I am just writing an algorithm. It’s math and data; I’m not responsible for what happens downstream.”

“As data increasingly affects all aspects of daily life, we cannot continue to let data science exist in a vacuum, without thinking of the legal, ethical, and societal implications that result from that math and data. We are being reminded daily of the inadequacy of legal frameworks and lack of governmental oversight of data protection, privacy, and security,” said Sarah Davis, senior project manager at RENCI. “Similarly, legal practitioners and researchers cannot ignore or willfully misunderstand the opportunities and dangers of a data-centric society. Increasingly, ‘black box’ algorithms will be used to make decisions that may attack privacy rights, violate due process, or discriminate against protected groups.”


RENCI partners with CUAHSI and others to launch Critical Zone Collaborative Network Hub

Five-year cooperative agreement offers opportunity to accelerate research on boundary layers of rock, soil, air, water, and living organisms

The Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI) has been selected to be the Coordinating Hub for the NSF-funded Critical Zone (CZ) Collaborative Network.

Collaborators in this new venture include representatives from RENCI, the US Geological Survey, Pennsylvania State University, Utah State University, and the Lamont-Doherty Earth Observatory of Columbia University. All members of the team have experience with Critical Zone Science and the previous Critical Zone Observatory Network.


RENCI to help guide effort to improve the efficiency of drone applications by leveraging edge, in-network, and cloud computing

The Renaissance Computing Institute (RENCI) at the University of North Carolina at Chapel Hill will collaborate on a $749,998, two-year effort to develop new architectures and tools for the safe, efficient, and economic operation of drones. The funding was awarded by the National Science Foundation (NSF).

Led by the University of Massachusetts Amherst, the FlyNet project brings together scientists from RENCI, the Information Sciences Institute (ISI) at the University of Southern California (USC), and the University of Missouri. The project will use edge, cloud, and in-network computing to generate crucial data that will help the team address a variety of pressing issues presented by drones.


RENCI to develop advanced network software for AtlanticWave-SDX 2.0

Sharing big data requires big networks. Systems like AtlanticWave-SDX, which connects networks in the U.S., Chile, Brazil, and South Africa, provide specialized infrastructure needed to send vast amounts of scientific data across long distances, helping scientists make the most of powerful data collections.

RENCI scientists contributed to the development of AtlanticWave-SDX, a distributed experimental software-defined exchange (SDX) that uses cutting-edge network technology to facilitate the exchange of data between research and education networks in the U.S. and networks on other continents.

Now, RENCI will play a leading role in software development and testing for AtlanticWave-SDX 2.0. The five-year project, supported by a recent $6.5-million award from the U.S. National Science Foundation (NSF), is led by Florida International University and also includes the University of Southern California.


ROBOKOP technology offers faster, easier exploration of emerging COVID-19 research

As scientists around the world urgently work to understand the best ways to diagnose and treat COVID-19, quick and easy access to the latest research findings and rapid exploration of emerging data have become critical. RENCI scientists have developed new tools and approaches that can help researchers make important discoveries and answer key questions about COVID-19 in record time.

“These new approaches allow scientists to blend together novel observations and information from recent papers with previously known information that can be used to inform, contextualize, and test new COVID-19 information,” said Chris Bizon, director of analytics and data science at RENCI.


New digital laboratory helps get COVID-19 analyses up and running quickly

Data analysis and visualization are helping answer a variety of questions about COVID-19, such as who is most at risk, how the disease is spreading, and what approaches might work best for treatments. However, setting up a computer environment to analyze the large amounts of data needed to answer such questions is no easy task. It requires selecting data libraries, software, and hardware and estimating how much memory and computing power will be needed. This process is time-consuming, and few individuals have the complex skill set needed to accomplish it.

RENCI scientists have developed a new digital data science laboratory called Blackbalsam that can help significantly shorten the planning stage for these efforts with a standardized environment housing computational and data sets for COVID-19 analytics.  

“As COVID-19 progressed, I saw that researchers were conducting analyses and visualization on an increasingly varied set of COVID-19 data,” said Blackbalsam co-author Steven Cox, assistant director of software systems architecture at RENCI. “I realized that it would be very helpful to have an environment that overcomes well-known technological and skill barriers by providing an interface that researchers with statistical, analytical, and visualization skills could use.”


Professor learns new lessons while teaching during a pandemic

When UNC students left for spring break on March 9, the COVID-19 public health crisis was just heating up. Soon after, UNC administrators made the decision to move to remote teaching and extended the break by a week to give instructors time to prepare. RENCI Deputy Director Ashok Krishnamurthy was one of many UNC professors who made the quick transition to teaching via video conferencing on Zoom.

What course were you teaching when you received notice that classes would all be moved online?

I was teaching a computer science course called Introduction to Scientific Programming that is designed for non-computer science majors. Most of the students take the class to learn programming skills for their day-to-day work or research. My section of the course had about 160 students enrolled.

How easily were you able to convert this class to a virtual format?

Fortunately, the course was relatively easy to adapt to virtual teaching. The UNC Computer Science department and my colleague John Majikes, who was teaching another section of the same course, had set up this course in such a way that taking it online was quite straightforward.


Beyond data: Supporting community during a pandemic

Families showing off their new face masks, donated by Sarah Davis.

When COVID-19 cases began to appear across the country, many RENCI employees felt a call to action. While several took it upon themselves to develop new data science technologies or to adapt existing ones to process COVID-19 data, others have contributed to communities in need by creating face masks, assisting food banks, connecting researchers to projects, and supporting foster youth.

Creating face masks

Like many across the nation, some RENCI employees have started sewing face masks to donate to medical workers, neighbors, and people in need.
