RENCI to help lead effort to make cancer research data more useful and accessible

The Renaissance Computing Institute (RENCI) at the University of North Carolina at Chapel Hill will collaborate on an $8.8 million, 3.5-year effort to make the volumes of data arising from cancer research more accessible, organized, and powerful. This contract was awarded by the Frederick National Laboratory for Cancer Research on behalf of the National Cancer Institute.

Led by Oregon State University (OSU), scientists from RENCI, Oregon Health & Science University, the University of Chicago, and Johns Hopkins University will team up to create and operate the Center for Cancer Data Harmonization (CCDH).

The CCDH will work with a cloud-based data-sharing portal called the Cancer Research Data Commons. The goal of the Commons is to integrate and structure disparate types of data, generated by everything from basic science studies to clinical trials, in ways that help researchers make advances and clinicians provide the best treatments.

The center’s work will be organized around five key areas: community development, data model harmonization, ontology and terminology ecosystem, tools and data quality, and program management. 

RENCI will contribute expertise in incorporating ontologies into tools for data validation, harmonization, and quality control. As open biomedical datasets continue to grow, helping researchers navigate those datasets and find insight becomes increasingly important, and increasingly difficult, according to RENCI Senior Research Scientist James Balhoff, PhD.

“Ontologies define relationships between concepts in a way that allows computers to do logical reasoning, but you need tools that take advantage of that to help with quality control,” said Balhoff. “Combining input from researchers and the work of the other institutions on the project to create semi-automated tools will empower data providers to prepare and QC their own data and to create a more searchable database within the Cancer Research Data Commons.”
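As a rough illustration of the idea in Balhoff's quote, the sketch below shows how "is a kind of" relationships let a program reason about whether a data value belongs in a field. It is a minimal toy, not the CCDH tooling: the term names, the `IS_A` table, and the `validate` helper are all hypothetical and chosen only for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: each term maps to its direct parents ("is_a" links).
IS_A: Dict[str, Set[str]] = {
    "lung adenocarcinoma": {"lung carcinoma"},
    "lung carcinoma": {"carcinoma", "lung neoplasm"},
    "carcinoma": {"neoplasm"},
    "lung neoplasm": {"neoplasm"},
    "neoplasm": set(),
    "aspirin": {"drug"},
    "drug": set(),
}


def ancestors(term: str) -> Set[str]:
    """Return every term reachable from `term` by following is_a links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in IS_A.get(current, set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_kind_of(term: str, expected: str) -> bool:
    """True if `term` is `expected` or a descendant of it in the toy ontology."""
    return term == expected or expected in ancestors(term)


def validate(records: List[dict], field: str, expected: str) -> List[Tuple[int, str]]:
    """Flag records whose `field` value is unknown or not a kind of `expected`."""
    problems: List[Tuple[int, str]] = []
    for index, record in enumerate(records):
        value = record.get(field)
        if value not in IS_A:
            problems.append((index, f"unknown term: {value!r}"))
        elif not is_kind_of(value, expected):
            problems.append((index, f"{value!r} is not a kind of {expected!r}"))
    return problems


# Hypothetical submitted records: one valid, one wrong category, one misspelled.
RECORDS = [
    {"diagnosis": "lung adenocarcinoma"},
    {"diagnosis": "aspirin"},
    {"diagnosis": "lung adenocarcnoma"},
]

for index, message in validate(RECORDS, "diagnosis", "neoplasm"):
    print(f"record {index}: {message}")
```

The first record passes without "neoplasm" ever being written in the data, because the relationship is inferred from the hierarchy. That inference is the kind of semi-automated check a data provider could run before submission.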

“Our team includes experts across the fields of data modeling, terminologies, enterprise software development, cancer research, and clinical oncology,” said lead principal investigator Melissa Haendel, who directs OSU’s Translational and Integrative Sciences Laboratory. “They have all created world-renowned programs exemplary of the kinds of expertise needed to create a new Cancer Data Ecosystem as outlined by the Cancer Moonshot Blue Ribbon Panel recommendation. We are exceptionally honored at OSU to be able to help lead this vision.”

RENCI researchers spearhead $20 million project to test a reimagined Internet

Collaboration will establish a nationwide network infrastructure

The University of North Carolina at Chapel Hill will lead a $20 million project to create a platform for testing novel Internet architectures that could enable a faster, more secure Internet.

With leadership from researchers at the Renaissance Computing Institute (RENCI), UNC-Chapel Hill and its partners will build a platform, called FABRIC, to provide a nationwide testbed for reimagining how data can be stored, computed and moved through shared infrastructure. FABRIC, funded by the National Science Foundation, will allow scientists to explore what a new Internet could look like at scale, and help determine the Internet architecture of the future.

The Internet is a global network of computers that communicate information back and forth, powering the World Wide Web and all online devices and activities. A series of government-funded programs from the 1960s through the 1980s established the computer networking architectures that formed the basis for today’s Internet. FABRIC will help test new network designs that could overcome current bottlenecks and continue to extend the Internet’s broad benefits for science and society. FABRIC will explore the balance between the amount of information a network maintains, the network’s ability to process information, and its scalability, performance and security.

“The Internet has been a great enabler for many science disciplines and in people’s everyday lives, but it is showing its age and limitations, especially when it comes to processing large amounts of data. If computer scientists were to start over today, knowing what they now know, the Internet might be designed in a different way,” said Ilya Baldin, director of Network Research & Infrastructure at RENCI, who will serve as one of five principal investigators on the project.

[Figure: Anticipated FABRIC topology at the end of construction]

“FABRIC represents large-scale network infrastructure where the Internet can be reimagined, and a variety of ideas can be tried out and compared. If FABRIC allows the research community to come up with ideas on how to reimagine the Internet based on a new set of architectural tradeoffs, then everybody wins – researchers and citizens alike.”

Today’s Internet was not designed for the massive data sets, machine learning tools, advanced sensors and Internet of Things devices that have become central to many research and business endeavors. FABRIC will give computer scientists a place to test networking and cybersecurity solutions that can better capitalize on these tools and potentially extend the Internet’s benefits to people in remote or underserved areas.

As lead, RENCI will oversee the effort while also contributing to software development, supporting hardware deployment and assisting with outreach efforts.

“The Network Research and Infrastructure Group at RENCI is an incredibly influential team of researchers, and this award demonstrates their efforts to ensure that continuing research in fundamental networking principles is available to all,” said Stan Ahalt, director of RENCI.

“Solving complicated problems today requires sophisticated data science, which is more than data management and analytics,” Ahalt said. “Data transport and data security are also vital, and the FABRIC project showcases RENCI’s impact on the fundamental infrastructure of data science by working toward creating new mechanisms to transport data quickly, efficiently, and securely.”

FABRIC will consist of storage, computational and network hardware nodes connected by dedicated high-speed optical links. In addition to the interconnected deeply-programmable core nodes deployed across the country, FABRIC nodes will include major national research facilities such as universities, national labs and supercomputing centers that generate and process enormous scientific data sets. Such flexibility and control over the network functionality will allow experimenters to test new architectures not possible today. All major aspects of the FABRIC infrastructure will be programmable, so researchers can create new configurations or tailor the platform for specific research purposes, such as cybersecurity.

“We don’t know what’s the right balance between smarts, or how self-knowledgeable the Internet needs to be, and scalability and performance,” said Baldin. “What we are offering is an instrument where these questions can be studied and researchers can make real progress toward envisioning the Internet of the future.”

Collaborating organizations include the University of Kentucky, the Department of Energy’s Energy Sciences Network, Clemson University and the Illinois Institute of Technology. Contributors from the University of Kentucky and Energy Sciences Network will be instrumental in designing and deploying the platform’s hardware and developing new software. Clemson and Illinois Institute of Technology researchers will work with a wide variety of user communities—including those focused on security, distributed architectures, scientific applications, and data transfer protocols—to ensure FABRIC can serve their needs. In addition, researchers from many other universities will help test the platform and integrate their computing infrastructure and scientific instruments into FABRIC.

The construction phase of the project is expected to last four years, with the first year dedicated to software development, finalizing technical designs, and prototyping. Subsequent years will focus on rolling out the platform’s hardware to participating sites across the nation and connecting it to major national computing facilities. Ultimately, experimenter communities will be able to attach new instruments or hardware resources to FABRIC’s uniquely extensible design, allowing the infrastructure to grow and adapt to changing research needs over time.

SUSE joins the iRODS Consortium

The iRODS Consortium, the foundation that leads development and support of the integrated Rule-Oriented Data System (iRODS) data management software, welcomes SUSE as its newest Consortium member.  

iRODS is open source storage data management software for data discovery, workflow automation, secure collaboration, and data virtualization. By creating a unified namespace and a metadata catalog of all the data and users within a storage environment, the iRODS rule engine allows users to automate data management. 
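To make the idea of a unified namespace, a metadata catalog, and rule-driven automation more concrete, here is a small conceptual sketch. It is not the iRODS API or its rule language: the `Catalog` and `DataObject` classes, the two rule functions, and the resource names are hypothetical stand-ins meant only to show the pattern.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataObject:
    logical_path: str        # the name users see in the unified namespace
    physical_location: str   # where the bytes live (hypothetical resource name)
    metadata: Dict[str, str] = field(default_factory=dict)


# A rule is any function that inspects or updates an object when it is registered.
Rule = Callable[["Catalog", DataObject], None]


class Catalog:
    """Toy metadata catalog: one logical namespace over many storage locations."""

    def __init__(self) -> None:
        self.objects: Dict[str, DataObject] = {}
        self.on_register: List[Rule] = []

    def register(self, obj: DataObject) -> None:
        """Add an object to the namespace, then run every registered rule on it."""
        self.objects[obj.logical_path] = obj
        for rule in self.on_register:
            rule(self, obj)

    def find(self, key: str, value: str) -> List[str]:
        """Return logical paths whose metadata has key == value."""
        return sorted(
            path for path, obj in self.objects.items()
            if obj.metadata.get(key) == value
        )


def tag_by_extension(catalog: Catalog, obj: DataObject) -> None:
    """Hypothetical rule: record the file format as searchable metadata."""
    if obj.logical_path.endswith(".csv"):
        obj.metadata.setdefault("format", "csv")


def archive_raw_data(catalog: Catalog, obj: DataObject) -> None:
    """Hypothetical rule: send anything marked raw to an archive tier."""
    if obj.metadata.get("stage") == "raw":
        obj.physical_location = "archive-tier"


catalog = Catalog()
catalog.on_register.extend([tag_by_extension, archive_raw_data])
catalog.register(DataObject("/lab/run1/reads.csv", "fast-disk", {"stage": "raw"}))
catalog.register(DataObject("/lab/run1/notes.txt", "fast-disk"))

print(catalog.find("format", "csv"))
print(catalog.objects["/lab/run1/reads.csv"].physical_location)
```

Users in this sketch only ever refer to logical paths and metadata. Where the data physically sits is decided by the rules, which is the separation the paragraph above describes.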

iRODS integrates with SUSE Enterprise Storage, powered by Ceph technology, enabling users to take control of their data, regardless of where and on what device it is stored, by integrating multiple storage tiers into a single storage cluster. As the newest iRODS Consortium member, SUSE will help direct the technology and governance of iRODS and will participate in the development and testing of the software, which is used by research and business organizations around the globe.

“SUSE has a rich history of Linux distribution and open source support, and partnering with them will allow iRODS to make even deeper connections throughout the open source community,” said Jason Coposky, Executive Director, iRODS Consortium. “SUSE Enterprise Storage integrated with iRODS’ data management capabilities creates a compelling and comprehensive solution stack.” 

Alan Clark, SUSE CTO Office lead focused on Industry Initiatives and Emerging Standards and chairman of the OpenStack Foundation board of directors, said, “SUSE is excited to join the iRODS Consortium, lending our open source technical expertise to help advance the iRODS data management software. The integration with SUSE Enterprise Storage helps customers lower total cost of ownership, leveraging commodity hardware to support their iRODS-managed storage environments. As a leading provider of open source software, SUSE helps our customers leverage the latest open source technologies for application delivery and software-defined infrastructure. SUSE tests and hardens our solutions, ensuring they are enterprise ready and backed by our superior support experience.” 

The iRODS Consortium guides development and support of iRODS, along with providing production-ready iRODS distribution and iRODS professional integration services, training, and support. The consortium is administered by founding member RENCI, a research institute for applications of cyberinfrastructure located at the University of North Carolina at Chapel Hill.

In addition to SUSE, current members of the iRODS Consortium include Bayer, Cloudian, CU Boulder Research Computing, DataDirect Networks, Maastricht University, MSC, the U.S. National Institute of Environmental Health Sciences, NetApp, Quantum, RENCI, SURF, the Swedish National Infrastructure for Computing, Texas Advanced Computing Center (TACC), University College London, University of Groningen, Utrecht University, the Wellcome Sanger Institute, and Western Digital.

South Big Data Hub receives second round of NSF funding

$4 million will support continued innovation and problem-solving in the Southern data science community

The National Science Foundation (NSF) recently announced the second phase of funding for the regional Big Data Innovation Hubs (Hubs). Each of the Hubs will receive $4 million over four years for a total investment of $16 million.

Each Hub is located in one of the four U.S. Census regions (South, Northeast, Midwest, and West) and serves as a thought leader and convening force on social and economic challenges that are unique to the region by playing four key roles: (1) Accelerating public-private partnerships that break down barriers between industry, academia, and government, (2) Growing R&D communities that connect data scientists with domain scientists and practitioners, (3) Facilitating data sharing and shared cyberinfrastructure and services, and (4) Building data science capacity for education and workforce development.

University of Colorado Boulder Research Computing joins iRODS Consortium

The iRODS Consortium, the foundation that leads development and support of the integrated Rule-Oriented Data System (iRODS) data management software, welcomes University of Colorado Boulder (CU Boulder) Research Computing as its newest Consortium member.

CU Boulder Research Computing provides computing and data beyond the desktop to CU Boulder researchers and students. This includes large-scale computing resources, storage of research data, high-speed data transfer, data sharing support, and consultations in computational science and data management. 

Training, talks, and a hackathon bring users together for iRODS 2019 User Group Meeting

Seats are filling fast for international gathering of data management experts

Users of the integrated Rule-Oriented Data System (iRODS) will gather at Utrecht University in the Netherlands June 26-27 for an annual opportunity to discuss iRODS-enabled applications and discoveries.

iRODS Consortium welcomes Maastricht University as newest member

Maastricht University, led by the efforts of DataHub Maastricht, which provides data management services to researchers from the university and academic hospital, has joined the iRODS Consortium, the foundation that leads development and support of the integrated Rule-Oriented Data System (iRODS). Maastricht is the fourth organization from the Netherlands to join the consortium, after Utrecht University, the SURF cooperative and the University of Groningen.

The Biomedical Data Translator Consortium announces publication of companion pieces

Prototype ‘Translator’ system shows promise and has garnered much enthusiasm roughly one year into feasibility assessment

The newly formed Biomedical Data Translator Consortium today announced the release of two inaugural publications in Clinical and Translational Science. The first paper, “Toward a Universal Biomedical Data Translator,” describes the efforts of the Consortium to develop a ‘Translator’ system designed to integrate a variety of data sources and translate the data into insights that can drive innovation and accelerate translational research. The second paper, “The Biomedical Data Translator Program: Conception, Culture, and Community,” focuses on the scientific community that has coalesced to support the program and drive research and development of the prototype Translator system.

New Project Will Advance Virtual Laboratory Infrastructure

A new grant from the National Science Foundation (NSF) will fund operation of the Global Environment for Network Innovations (GENI) virtual laboratory for the next two years and support researchers in planning a new infrastructure to replace GENI. The NSF allocated $1.7 million to the effort, called Enabling NeTwork Research and the Evolution of a Next Generation Midscale Research Infrastructure (ENTeR). The project will be jointly led by researchers from the Renaissance Computing Institute (RENCI) of the University of North Carolina at Chapel Hill and their collaborators from the University of Kentucky (UK).

RENCI Named as Collaborating Institution for $3 Million Cyberinfrastructure Center of Excellence Pilot

Project will create a model for advising NSF’s largest scientific facilities

The National Science Foundation today named the Renaissance Computing Institute (RENCI) of the University of North Carolina at Chapel Hill as a collaborating institution on a $3 million pilot project to create a model and strategic plan for a Cyberinfrastructure Center of Excellence (CI CoE). The goal of the effort is to establish a reservoir of expertise on best cyberinfrastructure practices for the nation’s largest research facilities.
