AUSTIN, Texas – Presentations about key RENCI projects, including iRODS and the iRODS Consortium, ExoGENI and advanced networking, and the National Consortium for Data Science (NCDS), will be featured in the RENCI exhibit at SC15, the world’s premier conference for high performance computing, networking, storage, and analysis.
The RENCI booth (#181) will open at 7 p.m. Monday, Nov. 16. RENCI experts will be in the booth to talk about and demonstrate ExoGENI and Networking Infrastructure as a Service (NIaaS), the integrated Rule Oriented Data System (iRODS) and the iRODS Consortium, the Resource Aware Data-centrIc collaboratIon Infrastructure (RADII), the NCDS, and more.
In addition to the Monday night opening gala, exhibit hours for the conference will be 10 a.m. – 6 p.m. Tuesday, Nov. 17 and Wednesday, Nov. 18, and 10 a.m. – 3 p.m. Thursday, Nov. 19.
For more on ExoGENI’s role at SC15, see the story ExoGENI featured in SCinet Network Research Exhibition demos.
Look for updates about our activities at SC15 on RENCI’s social media channels.
Monday, Nov. 16
7 p.m. – 9 p.m.
Welcome SC15 attendees! The RENCI booth tonight will feature information on:
- Advanced networking
- The iRODS data management platform and the iRODS Consortium
- The National Consortium for Data Science (NCDS)
- The new National Science Foundation Big Data Innovation Hubs Program
Tuesday, Nov. 17
Resource Aware Data-centrIc collaboratIon Infrastructure (RADII)
ADCIRC in the Cloud: ExoGENI Using CloudLab
The South Big Data Regional Innovation Hub (South BD Hub)
The National Consortium for Data Science
iRODS Overview
Virtualized Science DMZ-as-a-Service
iRODS Software and Consortium Update
iRODS Demonstrations
Q and A: iRODS Technology Deep Dives and Other Buzzwords
10:30 a.m. – 11:30 a.m.
Resource Aware Data-centrIc collaboratIon Infrastructure (RADII)
Presenter: Claris Castillo, RENCI Senior Computational & Networked Systems Researcher
Description: Data-centric collaborations have become the engines of scientific research. However, these collaborations can be difficult to realize because the appropriate infrastructure, including dedicated network infrastructure needed to transfer large data sets, is often unavailable and few mechanisms exist for controlling data access. Solutions that bridge the gap between infrastructure and data management technologies are needed to make data-centric collaborations feasible.
This demonstration presents a novel cloud-based platform, called Resource Aware Data-centrIc collaboratIon Infrastructure (RADII), that addresses these challenges. RADII integrates the Open Resource Control Architecture (ORCA) and integrated Rule Oriented Data System (iRODS) to allow scientists to create and manage collaborations. The research team will show how scientists can use RADII to create data-centric collaborations using data-flow diagram formalisms. RADII provides a user-friendly graphical interface that scientists can use to determine their infrastructure requirements and data access policies. The policies are then automatically mapped to the infrastructure and data management system by the RADII software. The demonstration will also show how RADII allows scientists to manage their collaborations throughout the lifecycle of a project. The team has deployed RADII on ExoGENI to support collaborations over a worldwide federated environment of resources and infrastructure.
********
11:30 a.m. – 12:30 p.m.
ADCIRC in the Cloud: ExoGENI Using CloudLab
Presenter: Paul Ruth, RENCI Senior Distributed Systems Researcher
Description: The new mid-scale infrastructures deployed by NSFCloud are now available for use by the GENI community and will enable research into the future of cloud computing. They also make possible the deployment of large dynamic ExoGENI racks on CloudLab infrastructure. This demonstration will create a CloudLab slice containing a functional ExoGENI rack (ExoGENI is part of the NSF GENI federation of test beds) and submit a slice request to the ExoGENI rack using its native API. The ExoGENI slice will then run ADCIRC, a storm surge modeling system.
Currently, private ExoGENI racks can be deployed on CloudLab clusters at APT, Clemson University and the University of Wisconsin. Private racks containing as many as 64 nodes (512 cores) have been deployed so far and have been used to create slices containing as many as 512 VMs. These ExoGENI VMs can utilize advanced hardware capabilities, such as the SR-IOV-enabled InfiniBand network fabric at the APT site. Already, an ExoGENI slice of 64 VMs (512 cores) has been used to run ADCIRC at speeds fast enough to contribute to an urgent hurricane simulation.
********
12:30 p.m. – 1:30 p.m.
The South Big Data Regional Innovation Hub (South BD Hub)
Informational slide presentation
Description: On October 29, The University of North Carolina’s RENCI and the Georgia Institute of Technology’s College of Computing were chosen as co-leads of an effort to establish a Big Data Regional Innovation Hub covering 16 southern U.S. states and the District of Columbia. The South BD Hub will be developed through the National Science Foundation’s new Big Data Regional Innovation Hubs (BD Hubs) initiative, designed to build innovative public-private partnerships on the key challenges and opportunities related to big data. This informational slide presentation provides general information about the South BD Hub, including plans to develop Hub “spokes” to address regional priorities in five areas: healthcare and health disparities; coastal hazards; industrial big data; materials and manufacturing; and habitat planning.
********
The National Consortium for Data Science
Informational slide presentation
Description: The National Consortium for Data Science (NCDS) launched in April 2013 as a public-private partnership to address the challenges and opportunities posed by massive data sets being created by digital medicine, environmental sensors, scientific instruments, social networks, and more. Its goals include: encouraging collaboration among industry, academia and government on data science research and problem solving; providing members with access to experts in other fields and domains to help address their data challenges; encouraging data science research that spans academia, industry and government; facilitating improved data science education; and supporting technical, ethical and policy standards for data. Members include research universities in North Carolina, Drexel University, Cisco, Deloitte LLP, GE, IBM, MCNC and RTI International. This presentation provides an overview of the benefits of membership and NCDS activities aimed at advancing data science.
********
1:30 p.m. – 2:30 p.m.
iRODS Overview
Presenters: Jason Coposky, iRODS Chief Technologist; Terrell Russell, iRODS Senior Data Scientist; Dan Bedard, iRODS Consortium Executive Director
Description: This talk will present an overview of iRODS (http://irods.org), including its functions, architecture, use cases, and future directions. iRODS is open source data grid middleware that consolidates the management of heterogeneous data storage technologies. Equipped with configurable automation and metadata cataloging capabilities, iRODS manages over 100 PB of data worldwide. Example use cases include tracking gene sequencing workflows at several of the world’s preeminent research institutes and streaming terabytes of production video footage across the globe.
********
2:30 p.m. – 3:30 p.m.
Virtualized Science DMZ-as-a-Service
Presenters: Ilya Baldin, RENCI Director of Network Research & Infrastructure; Inder Monga, CTO and Area Lead, Energy Sciences Network (ESnet)
Description: Many campuses are installing ScienceDMZs to support efficient large-scale scientific data transfers. There’s a need to create custom configurations of ScienceDMZs for different groups on campus. Network function virtualization (NFV) combined with compute and storage virtualization enables a multi-tenant approach to deploying virtual ScienceDMZs. It makes it possible for campus IT or NREN organizations to quickly deploy well-tuned ScienceDMZ instances targeted at a particular collaboration or project. This presentation shows a prototype implementation of ScienceDMZ-as-a-Service using ExoGENI racks (ExoGENI is part of the NSF GENI federation of test beds) deployed at the StarLight facility in Chicago and at NERSC. The virtual ScienceDMZs deployed on demand in these racks connect to a data source at Argonne National Lab and a compute cluster at NERSC to provide seamless end-to-end high-speed transfers of data acquired from Argonne’s Advanced Photon Source (APS) to be processed at NERSC. The ExoGENI racks dynamically instantiate necessary virtual compute resources for ScienceDMZ functions and connect to each other on demand using ESnet’s OSCARS and Internet2’s AL2S system.
********
3:30 p.m. – 4:30 p.m.
iRODS Software and Consortium Update
Presenters: Jason Coposky, iRODS Chief Technologist; Terrell Russell, iRODS Senior Data Scientist; Dan Bedard, iRODS Consortium Executive Director
Description: This talk will discuss the history, the present, and the future of iRODS development. The iRODS team will also discuss the state of the iRODS Consortium, the organization that supports continued development of iRODS data management software as free open source software.
********
4:30 p.m. – 5:30 p.m.
iRODS Demonstrations
Presenters: Jason Coposky, iRODS Chief Technologist; Terrell Russell, iRODS Senior Data Scientist; Dan Bedard, iRODS Consortium Executive Director
Description: Using material from recently presented workshops, we will demonstrate several key features that have made iRODS a critical technology for research organizations worldwide. We will show how storage resource composition makes it easy to distribute and replicate data across multiple file systems; how the iRODS rule engine can automate searchable metadata annotation; and how federation allows users to access data at remote sites through a common interface.
********
ONGOING (Kiosk 2)
Q and A: iRODS Technology Deep Dives and Other Buzzwords
Curious about how the RENCI iRODS development team is connecting unstructured data with structured metadata? Want to know how iRODS connects file systems and object stores? Interested in pluggable authentication? APIs? Stop by our booth and AUA—that’s Ask Us Anything about distributed data management.
The South Big Data Regional Innovation Hub (South BD Hub)
Informational slide presentation, see description above
The National Consortium for Data Science
Informational slide presentation, see description above
Wednesday, Nov. 18
PANORAMA: Predictive Modeling and Diagnostic Monitoring of Extreme Science Workflows
ADCIRC in the Cloud: ExoGENI Using CloudLab
Resource Aware Data-centrIc collaboratIon Infrastructure (RADII)
Virtualized Science DMZ-as-a-Service
iRODS Software and Consortium Update
iRODS Demonstrations
10:30 a.m. – 12:30 p.m.
PANORAMA: Predictive Modeling and Diagnostic Monitoring of Extreme Science Workflows
Presenters: Ewa Deelman and Gideon Juve, University of Southern California Information Sciences Institute; Anirban Mandal, RENCI; Jeffrey Vetter, Oak Ridge National Laboratory
Description: As we move closer to the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of scientists. Thus, workflow management systems are necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns performance optimization of complex calculation and data analysis tasks. This presentation will showcase the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems test bed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.
********
12:30 p.m. – 1:30 p.m.
ADCIRC in the Cloud: ExoGENI Using CloudLab
Presenter: Paul Ruth, RENCI Senior Distributed Systems Researcher
Description: The new mid-scale infrastructures deployed by NSFCloud are now available for use by the GENI community and will enable research into the future of cloud computing. They also make possible the deployment of large dynamic ExoGENI racks on CloudLab infrastructure. This demonstration will create a CloudLab slice containing a functional ExoGENI rack (ExoGENI is part of the NSF GENI federation of test beds) and submit a slice request to the ExoGENI rack using its native API. The ExoGENI slice will then run ADCIRC, a storm surge modeling system.
Currently, private ExoGENI racks can be deployed on CloudLab clusters at APT, Clemson University and the University of Wisconsin. Private racks containing as many as 64 nodes (512 cores) have been deployed so far and have been used to create slices containing as many as 512 VMs. These ExoGENI VMs can utilize advanced hardware capabilities, such as the SR-IOV-enabled InfiniBand network fabric at the APT site. Already, an ExoGENI slice of 64 VMs (512 cores) has been used to run ADCIRC at speeds fast enough to contribute to an urgent hurricane simulation.
********
1:30 p.m. – 2:30 p.m.
Resource Aware Data-centrIc collaboratIon Infrastructure (RADII)
Presenter: Claris Castillo, RENCI Senior Computational & Networked Systems Researcher
Description: Data-centric collaborations have become the engines of scientific research. However, these collaborations can be difficult to realize because the appropriate infrastructure, including dedicated network infrastructure needed to transfer large data sets, is often unavailable and few mechanisms exist for controlling data access. Solutions that bridge the gap between infrastructure and data management technologies are needed to make data-centric collaborations feasible.
This demonstration presents a novel cloud-based platform, called Resource Aware Data-centrIc collaboratIon Infrastructure (RADII), that addresses these challenges. RADII integrates the Open Resource Control Architecture (ORCA) and integrated Rule Oriented Data System (iRODS) to allow scientists to create and manage collaborations. The research team will show how scientists can use RADII to create data-centric collaborations using data-flow diagram formalisms. RADII provides a user-friendly graphical interface that scientists can use to determine their infrastructure requirements and data access policies. The policies are then automatically mapped to the infrastructure and data management system by the RADII software. The demonstration will also show how RADII allows scientists to manage their collaborations throughout the lifecycle of a project. The team has deployed RADII on ExoGENI to support collaborations over a worldwide federated environment of resources and infrastructure.
********
2:30 p.m. – 3:30 p.m.
Virtualized Science DMZ-as-a-Service
Presenters: Ilya Baldin, RENCI Director of Network Research & Infrastructure; Inder Monga, CTO and Area Lead, Energy Sciences Network (ESnet)
Description: Many campuses are installing ScienceDMZs to support efficient large-scale scientific data transfers. There’s a need to create custom configurations of ScienceDMZs for different groups on campus. Network function virtualization (NFV) combined with compute and storage virtualization enables a multi-tenant approach to deploying virtual ScienceDMZs. It makes it possible for campus IT or NREN organizations to quickly deploy well-tuned ScienceDMZ instances targeted at a particular collaboration or project. This presentation shows a prototype implementation of ScienceDMZ-as-a-Service using ExoGENI racks (ExoGENI is part of the NSF GENI federation of test beds) deployed at the StarLight facility in Chicago and at NERSC. The virtual ScienceDMZs deployed on demand in these racks connect to a data source at Argonne National Lab and a compute cluster at NERSC to provide seamless end-to-end high-speed transfers of data acquired from Argonne’s Advanced Photon Source (APS) to be processed at NERSC. The ExoGENI racks dynamically instantiate necessary virtual compute resources for ScienceDMZ functions and connect to each other on demand using ESnet’s OSCARS and Internet2’s AL2S system.
********
3:30 p.m. – 4:30 p.m.
iRODS Software and Consortium Update
Presenters: Jason Coposky, iRODS Chief Technologist; Terrell Russell, iRODS data management research scientist; Dan Bedard, iRODS Consortium Executive Director
Description: This talk will discuss the history, the present, and the future of iRODS development. The iRODS team will also discuss the state of the iRODS Consortium, the organization that supports continued development of iRODS data management software as free open source software.
********
4:30 p.m. – 5:30 p.m.
iRODS Demonstrations
Presenters: Jason Coposky, iRODS Chief Technologist; Terrell Russell, iRODS data management research scientist; Dan Bedard, iRODS Consortium Executive Director
Description: Using material from recently presented workshops, we will demonstrate several key features that have made iRODS a critical technology for research organizations worldwide. We will show how storage resource composition makes it easy to distribute and replicate data across multiple file systems; how the iRODS rule engine can automate searchable metadata annotation; and how federation allows users to access data at remote sites through a common interface.
********
ONGOING (Kiosk 2)
Q and A: iRODS Technology Deep Dives and Other Buzzwords
Curious about how the RENCI iRODS development team is connecting unstructured data with structured metadata? Want to know how iRODS connects file systems and object stores? Interested in pluggable authentication? APIs? Stop by our booth and AUA—that’s Ask Us Anything about distributed data management.
The South Big Data Regional Innovation Hub (South BD Hub)
Description: On November 2, The University of North Carolina’s RENCI and the Georgia Institute of Technology’s College of Computing were chosen as co-leads of an effort to establish a Big Data Regional Innovation Hub covering 16 southern U.S. states and the District of Columbia. The South BD Hub will be developed through the National Science Foundation’s new Big Data Regional Innovation Hubs (BD Hubs) initiative, designed to build innovative public-private partnerships on the key challenges and opportunities related to big data. This informational slide presentation provides general information about the South BD Hub, including plans to develop Hub “spokes” to address regional priorities in five areas: healthcare and health disparities; coastal hazards; industrial big data; materials and manufacturing; and habitat planning.
The National Consortium for Data Science
Description: The National Consortium for Data Science (NCDS) launched in April 2013 as a public-private partnership to address the challenges and opportunities posed by massive data sets being created by digital medicine, environmental sensors, scientific instruments, social networks, and more. Its goals include: encouraging collaboration among industry, academia and government on data science research and problem solving; providing members with access to experts in other fields and domains to help address their data challenges; encouraging data science research that spans academia, industry and government; facilitating improved data science education; and supporting technical, ethical and policy standards for data. Members include research universities in North Carolina, Drexel University, Cisco, Deloitte LLP, GE, IBM, MCNC and RTI International. This presentation provides an overview of the benefits of membership and NCDS activities aimed at advancing data science.
Thursday, Nov. 19
10:30 a.m. – 12:30 p.m.
PANORAMA: Predictive Modeling and Diagnostic Monitoring of Extreme Science Workflows
Presenters: Ewa Deelman and Gideon Juve, University of Southern California Information Sciences Institute; Anirban Mandal, RENCI; Jeffrey Vetter, Oak Ridge National Laboratory
Description: As we move closer to the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of scientists. Thus, workflow management systems are necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns performance optimization of complex calculation and data analysis tasks. This presentation will showcase the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems test bed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.
********
1:30 p.m. – 2:30 p.m.
Virtualized Science DMZ-as-a-Service
Presenters: Ilya Baldin, RENCI Director of Network Research & Infrastructure; Inder Monga, CTO and Area Lead, Energy Sciences Network (ESnet)
Description: Many campuses are installing ScienceDMZs to support efficient large-scale scientific data transfers. There’s a need to create custom configurations of ScienceDMZs for different groups on campus. Network function virtualization (NFV) combined with compute and storage virtualization enables a multi-tenant approach to deploying virtual ScienceDMZs. It makes it possible for campus IT or NREN organizations to quickly deploy well-tuned ScienceDMZ instances targeted at a particular collaboration or project. This presentation shows a prototype implementation of ScienceDMZ-as-a-Service using ExoGENI racks (ExoGENI is part of the NSF GENI federation of test beds) deployed at the StarLight facility in Chicago and at NERSC. The virtual ScienceDMZs deployed on demand in these racks connect to a data source at Argonne National Lab and a compute cluster at NERSC to provide seamless end-to-end high-speed transfers of data acquired from Argonne’s Advanced Photon Source (APS) to be processed at NERSC. The ExoGENI racks dynamically instantiate necessary virtual compute resources for ScienceDMZ functions and connect to each other on demand using ESnet’s OSCARS and Internet2’s AL2S system.
********
ONGOING (Kiosk 2)
Q and A: iRODS Technology Deep Dives and Other Buzzwords
Stop by our booth and AUA—that’s Ask Us Anything about distributed data management.
The South Big Data Regional Innovation Hub (South BD Hub)
Description: On November 2, The University of North Carolina’s RENCI and the Georgia Institute of Technology’s College of Computing were chosen as co-leads of an effort to establish a Big Data Regional Innovation Hub covering 16 southern U.S. states and the District of Columbia. The South BD Hub will be developed through the National Science Foundation’s new Big Data Regional Innovation Hubs (BD Hubs) initiative, designed to build innovative public-private partnerships on the key challenges and opportunities related to big data. This informational slide presentation provides general information about the South BD Hub, including plans to develop Hub “spokes” to address regional priorities in five areas: healthcare and health disparities; coastal hazards; industrial big data; materials and manufacturing; and habitat planning.
The National Consortium for Data Science
Description: The National Consortium for Data Science (NCDS) launched in April 2013 as a public-private partnership to address the challenges and opportunities posed by massive data sets being created by digital medicine, environmental sensors, scientific instruments, social networks, and more. Its goals include: encouraging collaboration among industry, academia and government on data science research and problem solving; providing members with access to experts in other fields and domains to help address their data challenges; encouraging data science research that spans academia, industry and government; facilitating improved data science education; and supporting technical, ethical and policy standards for data. Members include research universities in North Carolina, Drexel University, Cisco, Deloitte LLP, GE, IBM, MCNC and RTI International. This presentation provides an overview of the benefits of membership and NCDS activities aimed at advancing data science.