Data Commons Pilot Phase teams plan how a rising tide of data and tools can float all research boats
Last November the National Institutes of Health announced $9 million in pilot funding to explore feasibility and best practices for a new approach to advancing biomedical research. The initiative, known as Data Commons, is focused on making digital objects—that is, the data, models, and analytical tools that constitute the engine behind the modern research enterprise—available through collaborative platforms.
Plenty of data and tools exist, of course. But many are locked inside the computer systems of the institution that “owns” them, are too large to move around, are impossible to combine or compare with other data or tools, or are difficult to share due to privacy concerns, among many other challenges.
Data Commons aims to change all that by making biomedical research data “FAIR”—findable, accessible, interoperable, and reusable. It’s not about a specific study or even a bunch of them, but about creating linkages that can advance the way all biomedical research is done. If every researcher has access to more—and more useful—data and tools, the thinking goes, we can dramatically accelerate discovery and innovation.
More than 360 people have been working hard over the past nine months to rapidly develop, test, and iterate on specific solutions and overarching strategies for advancing the Data Commons vision. A late-summer meeting in Chapel Hill, NC, brought together dozens of team representatives to take stock of progress, lessons, and directions. Here are a few takeaways.
This isn’t Thunderdome
We all love a little healthy competition, but Data Commons isn’t the place for it. NIH’s Data Commons Pilot Phase program manager Vivien Bonazzi emphasized that, rather than multiple teams developing their own solutions and then seeing whose idea wins, Data Commons is built on the premise that meaningful collaboration is the key to solving the truly hairy problems posed by data FAIRness. There’s really just one fighter in the ring—one Commons at the end—and it’s going to be collaboratively produced by many.
One illustration of what this means in practice is the initiative known as DataSTAGE, a program of NIH’s National Heart, Lung, and Blood Institute (NHLBI). As NHLBI Chief Information Officer Alastair Thomson explained, DataSTAGE is designed to advance all of the same goals as Data Commons, albeit for the specific areas of biomedical research that are aligned with NHLBI’s mission. Rather than being separate from or a competitor to Data Commons, Thomson sees DataSTAGE as essentially an early instantiation of it: a test bed of sorts where Commons work can find ready testers and whose products will ultimately be absorbed into, and replaced by, Data Commons.
Get feedback, early and often
The meeting kicked off with a series of lightning talks on each team’s milestones and reflections. Participants noted that some of the most illuminating moments of the meeting, and indeed throughout the project, have happened when teams have had the chance to offer feedback and riff on each other’s work.
The trick, attendees reflected, is finding the sweet spot where feedback can be both useful and actionable: not so early that the basic ideas haven’t yet taken shape, but not so late that the product can’t be meaningfully changed. As RENCI director Stan Ahalt noted, deeply reviewing someone else’s work takes time, though the benefits for the program as a whole are well worth it. He suggested building in dedicated time for this review and feedback process as the effort moves forward.
It’s about the product, not the PI
Notwithstanding the considerable brainpower in the room, Data Commons is decidedly not about promoting celebrity geniuses. Bonazzi stressed that for collaboration on this scale to succeed, the actual work products must take precedence. The Data Commons Pilot Phase operates under a unique organizational structure in which experts from across the country group and divide the work along multiple dimensions. Various aspects of the challenge are tackled from different angles by teams who are encouraged to regularly compare notes and harmonize their efforts.
This “teams of teams” structure keeps the focus where it should be—on generating results that will ultimately speed biomedical research and innovation for the benefit of the country and the world.
By Anne Johnson, Lead Science Writer at Creative Science Writing