A data center upgrade for a data-rich world

What is a data center? Ask any number of technologists or scientists and you will probably get a wide range of answers, in part because the concept of a data center changes as the technologies that power data centers change.

In the 1960s and 70s, as scientists began to exploit new technological research tools, data centers began popping up on university campuses and at research labs worldwide. These first-generation data centers functioned primarily as long-term data repositories. In the 1980s and 90s, data centers evolved to include new resources and tools for managing data sets that were growing larger and more complex. These new methods for accessing, managing, and archiving data meant researchers could collaborate and share their data and research results.

However, 21st-century scientists need a new data center model if they plan to realize the potential of the big data explosion. Today’s scientists must also be able to easily access high-performance and high-throughput computing and work with collaborators in other fields in order to address complicated multidisciplinary problems. The data center of the future, or Data Center 3.0, looks considerably different from the centralized, often domain-specific data centers of the past and present.

In the white paper “From Data Center 1.0 to Data Center 3.0: Transforming the storage, access and (re)use of research data for better science,” RENCI researchers describe a plan to deploy this third generation of data centers.

Key features of Data Center 3.0 include:

  • Distributed and interconnected data center “nodes.” In the data-rich world, every physical data center will be able to operate as a node within a larger “data grid,” which users will be able to access from any location.
  • Metadata best practices. Metadata is a powerful tool for making data more discoverable. Continuing development and deployment of common metadata standards, as well as increased automation of metadata generation, will enable truly collaborative, multidisciplinary science.
  • Greater interoperability. Metadata standards, evolving vocabularies that cross disciplines, and linked data will make data collections more interoperable.
  • Robust analytical tools. Data Center 3.0 will make analytical tools available through a data grid, making it easier for scientists to analyze, compute, and create simulations from data.
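
To make the node-and-grid idea above more concrete, here is a minimal sketch in Python of how data sets held at separate nodes might be discovered through shared metadata. It is an illustration only and is not taken from the RENCI white paper: the class names (Dataset, Node, DataGrid), the keyword-based metadata, and the sample locations and data set names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A data set described by keyword metadata (hypothetical schema)."""
    name: str
    keywords: set[str]


@dataclass
class Node:
    """One physical data center acting as a node in the grid."""
    location: str
    datasets: list[Dataset] = field(default_factory=list)


@dataclass
class DataGrid:
    """Interconnected nodes that can be searched as a single resource."""
    nodes: list[Node] = field(default_factory=list)

    def discover(self, keyword: str) -> list[tuple[str, str]]:
        """Return (location, data set name) pairs whose metadata match."""
        wanted = keyword.lower()
        return [
            (node.location, ds.name)
            for node in self.nodes
            for ds in node.datasets
            if wanted in {k.lower() for k in ds.keywords}
        ]


# Hypothetical example data: two nodes holding data from different fields.
grid = DataGrid([
    Node("campus-a", [Dataset("coastal-surge-model", {"hydrology", "climate"})]),
    Node("lab-b", [
        Dataset("genome-variants", {"genomics"}),
        Dataset("rainfall-records", {"climate"}),
    ]),
])

# One query reaches every node, regardless of where the data physically lives.
matches = grid.discover("climate")
```

In this sketch a single query spans both nodes, which is the practical point of common metadata: a researcher does not need to know in advance which center holds the relevant data.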

To learn more about the Data Center 3.0 concept, read the RENCI Data Center 3.0 white paper.