What to expect at the iRODS 2022 User Group Meeting

The worldwide iRODS community will gather in Leuven, Belgium, from July 5 – 8 

Members of the iRODS user community will meet at KU Leuven in Belgium for the 14th Annual iRODS User Group Meeting to participate in four days of learning, sharing use cases, and discussing new capabilities that have been added to iRODS in the last year.

The event, sponsored by KU Leuven, RENCI, Vlaams Supercomputer Centrum, and Fujifilm, will provide in-person and virtual options for attendance. An audience of over 100 participants representing dozens of academic, government, and commercial institutions is expected to join.

“We are excited to meet in-person for the first time in three years to learn about the global impact of iRODS in fields such as life sciences, healthcare, cybernetics, and more,” said Terrell Russell, executive director of the iRODS Consortium. “In addition to hearing talks from our user community, the 2022 iRODS User Group Meeting will provide users the chance to network and collaborate throughout the week.”

In June, the iRODS Consortium and RENCI announced the release of iRODS 4.3.0. Along with supporting two additional operating systems, a notable new feature in the release is Delay Server Migration. The iRODS Delay Server can now be safely moved from one iRODS server to another without requiring a restart, which will provide administrators with flexibility when the system is under continuous load.

Another new feature is programmable authentication workflows. In the past, iRODS has supported various authentication methods such as native authentication, GSI, Kerberos, and OpenID, with new authentication methods implemented as shared libraries that needed to be installed on both the client and server side, often requiring patches for existing client libraries. The iRODS Consortium, in collaboration with SURF, has implemented an authentication plugin for iRODS 4.3.0, “pam_interactive,” that enables the flexibility of fully fledged PAM (pluggable authentication module) authentication flows.

During last year’s UGM, users learned about the Python iRODS 1.0.0 client and the S3 Resource plugin. Version 1.1.4 of the Python iRODS client is now available, and includes fixes for the XML protocol, connection reuse, the anonymous user, ticket enhancements, and compatibility with iRODS talking directly to S3. The iRODS S3 Resource Plugin has been extended to honor the Glacier semantics of an S3 storage system, including reacting appropriately to responses that indicate the requested data will be available later.

As always with the annual UGM, in addition to general software updates, users will offer presentations about their organizations’ deployments of iRODS. This year’s meeting will feature over 20 talks from users around the world. Among the use cases and deployments to be featured are:

  • Data Management Environment at the National Cancer Institute. Frederick National Laboratory for Cancer Research. An efficient and cost-effective mechanism is required to store and manage the large heterogeneous datasets generated by high throughput technologies such as Next Generation Sequencing, Cryo-Electron Microscopy, and High Content Imaging. Tier 1 storage is expensive, and Tier 2 devices used standalone do not lend themselves well to discovering and disseminating datasets. The Data Management Environment (DME), a data management platform for storing, sharing, and managing high-value scientific datasets, was developed at the National Cancer Institute to close this gap. DME addresses the long-term data management needs of research labs and cores at NCI per the FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles for data management. It supports S3-compatible object stores as well as file system-based storage. DME uses iRODS as the metadata management layer, enabling virtualization of backend storage, replacement of storage providers with zero impact on users, and transparent migration of data across providers. The granular permissions scheme provided by iRODS, coupled with DME’s authentication and authorization mechanism, enables researchers to share data with collaborators securely. This talk will give an overview of the capabilities and architecture of the Data Management Environment and discuss how DME has leveraged iRODS to deliver enhanced data management and storage management capabilities.
  • iRODS speaks SFTP: More ways to securely transfer your data. CyVerse / University of Arizona. Compliance and data encryption during transfer are strict requirements for many science domains that work with confidential data. Recognizing this unmet need for secure and encrypted transfers among CyVerse users, the CyVerse team decided to implement Secure File Transfer Protocol (SFTP) access to iRODS. This approach complements the secure data transfer and authentication methods currently provided in iRODS via SSL and PAM authentication, which are challenging to integrate into existing services or research workflows for several reasons: they require changes on the iRODS server and to firewall configurations, and users must be trained in complex client-side installations of icommands. In this talk, the team introduces their work on adding iRODS as a backend storage option for SFTPGo utilizing the Go iRODS library developed at CyVerse.
  • From SRB to iRODS: 20 years of data management at the petabyte scale. CC-IN2P3. CC-IN2P3, a data center hosting services such as computing and data storage for international projects mainly in the fields of subatomic physics and astrophysics, has been using SRB and then iRODS in a wide variety of projects and use cases for the last 20 years. Data management has always been a key activity for a data center such as CC-IN2P3, due to the ever-growing size of the projects and their international dimension. This talk will emphasize the evolution of the data management needs, the pitfalls, and the endless migration cycle (both hardware and software) over the years. It will also focus on the ongoing prospects, especially long-term data preservation needs and open science.
  • MrData: An iRODS Based Human Research Data Management System. Max Planck Institute for Biological Cybernetics. MrData is an iRODS-based archival system for research medical imaging data and was built initially to automate collection and archival of data flowing from a Siemens 9.4 Tesla MRI system. Of particular importance to this project was managing metadata related to human subject recruiting in a GDPR-compliant manner. The team chose Castellum, a Max Planck-developed system specifically for managing human subject data securely, and worked with the Castellum team to integrate it with the MrData system. An additional requirement for the team was “mixed use” metadata, information necessary for both subject recruiting and scientific processing. Mixed use metadata, such as handedness, is managed by Castellum but made available by MrData for scientific and archival purposes securely and without manual intervention. The Max Planck team will present an overview of this project, including current production status and future directions.

Bookending this year’s UGM are two in-person events for those who hope to learn more about iRODS. On July 5, the Consortium is offering beginner and advanced training sessions. After the conference, on July 8, users have the chance to register for a troubleshooting session, devoted to providing one-on-one help with an existing or planned iRODS installation or integration.

Registration will remain open until the beginning of the event. Learn more about this year’s UGM at irods.org/ugm2022.

About the iRODS Consortium

The iRODS Consortium is a membership organization that supports the development of the integrated Rule-Oriented Data System (iRODS), free open source software for data virtualization, data discovery, workflow automation, and secure collaboration. The iRODS Consortium provides a production-ready iRODS distribution and iRODS training, professional integration services, and support. The world’s top researchers in life sciences, geosciences, and information management use iRODS to control their data. Learn more at irods.org.
The iRODS Consortium is administered by founding member RENCI, a research institute for applications of cyberinfrastructure at the University of North Carolina at Chapel Hill. For more information about RENCI, visit renci.org.