RENCI awarded NSF grant to develop cyberinfrastructure training program for X-ray scientists

Enhancing the ability of scientists to use the latest computing and data tools will help quicken the pace of scientific discoveries

RENCI scientists and collaborators from Cornell University and the University of Southern California (USC) have been awarded a $1 million, three-year grant from the National Science Foundation (NSF) to develop an innovative training program for scientists who use the Cornell High Energy Synchrotron Source (CHESS) X-ray facility. The program is designed to help these scientists build their computing skills, awareness and literacy, with the ultimate goal of accelerating scientific innovation in synchrotron X-ray science.

A RENCI team headed by Anirban Mandal, assistant director of the Network Research & Infrastructure Group (NRIG), will lead the CyberInfrastructure Training and Education for Synchrotron X-Ray Science (X-CITE) project. The project will bring together experts in cyberinfrastructure, X-ray science and related areas from RENCI, Cornell University and USC to build the training program for researchers using CHESS, an NSF-supported high-intensity X-ray source at Cornell. CHESS is used to conduct research in materials science, physics, chemistry, biology, environmental science and other areas.

“Scientists don’t always have the computing and data expertise necessary to fully harness the instruments, data and computing tools available to transform data into insights and knowledge,” said Mandal. “We want to help reduce barriers so that scientists can effectively utilize computing capabilities and data resources at CHESS as well as cyberinfrastructure resources available through national computing and data services.”

Teaching scientists about computing tools

To get scientists up to speed on computing and data tools, the training program will cover programming essentials, systems fundamentals, distributed computing with the cyberinfrastructure ecosystem, X-ray science software and data curation, including how to apply the FAIR data principles of findability, accessibility, interoperability and reusability.

“As scientific instruments have become more sophisticated, there has been an explosion in the volume and rate of data produced by scientific facilities like CHESS,” said Mandal. “The data generated no longer fits on a laptop, and there are now computational models and AI methods that scientists can use to steer experiments based on the results they are getting. It is very difficult for scientists to keep pace with all these new capabilities.”

Mandal points out that it is important for scientists to get up to date on FAIR principles because federal research funding agencies are planning to roll out new mandates requiring scientists to share the data they generate. This will require designing metadata and figuring out how to push data into repositories in a way that makes it findable and usable by other researchers — tasks that scientists might not be accustomed to doing.

Drawing on RENCI’s expertise

The RENCI team will focus on developing common computer science modules for Python and other programming languages. This work will leverage RENCI’s expertise in this area, including Senior Research Software Developer Erik Scott’s experience as an instructor for the student program within the CI Compass project. The USC team, led by Research Professor of Computer Science Ewa Deelman, will contribute distributed computing training materials. Training materials for the specialized X-ray science software used at CHESS will be the focus of the Cornell team, which is led by Matthew Miller, associate director of CHESS.

The X-CITE training materials and activities will be available in several formats, including self-paced modules, videos, cyberinfrastructure catalogs, in-person instruction sessions, CHESS user workshops and tutorials offered at scientific conferences. The project team will also develop a coordination network to help disseminate the training materials, communicate the cyberinfrastructure needs of the X-ray science community and discuss best practices for training.