Thousands of scientists are currently creating millions of data sets that describe an increasingly diverse matrix of social and physical phenomena. The sheer volume and diversity of these data present a new set of challenges in locating all of the data relevant to a particular line of research. The National Science Foundation’s Big Data Initiative aims to develop new tools and methods to extract and use knowledge from these large data sets to accelerate progress in science and engineering research and innovation. “DataBridge – A Sociometric System for Long-Tail Science Data Collections” will create a multi-dimensional network representation of relationships among scientific data sets.

RENCI’s Role

RENCI, along with other collaborators, is developing DataBridge as an interface similar to social networking sites such as LinkedIn or Facebook. When DataBridge is completed, it will be able to access a wide range of research databases and link data sets through sociometric network connections, enabling scientists to find data and research that could be of interest to their own work.

Researchers using the interface will receive suggestions of other relevant data, much as Amazon suggests books based on a user’s previous choices. The system will also provide an easy means of publishing data to DataBridge and will give data producers an incentive to do so by enabling collaboration and citation. Researchers will be able to connect through the system, and new cross-disciplinary partnerships and projects could emerge from those connections. The central concept developed in the project, linking data through sociometric network analysis, will have an impact on non-scientific data collections and will improve access to and discovery of information on the web.

Project Team

  • Arcot Rajasekar (Project Leader)
  • Howard Lander
  • Sharlini Sankaran
  • Michael Shoffner
  • Hong Yi



National Science Foundation