DataBridge tackles the problem of ‘dark data’

DataBridge, a National Science Foundation-funded project to make research data more discoverable and usable by a wide community of scientists, has the green light to expand its work into the neuroscience community, thanks to a new NSF EAGER award.

The award itself is relatively small (less than $100,000) and will allow the researchers to consult with neuroscientists, develop a prototype DataBridge for Neuroscience (DBfN), and hold a community workshop. However, the impact could be significant for a hot scientific field that is making breakthrough discoveries about the human brain.

The South Big Data Hub will play a key role in DBfN:

  • The Hub will provide computing and storage facilities for the implementation of the DBfN system. Those systems are located at RENCI at UNC-Chapel Hill, one of the lead institutions for the South Big Data Hub.
  • The South Hub will assist the research team in conducting a community workshop. The workshop is tentatively planned to take place at Georgia Tech, the other lead institution for the South Hub.
  • The South Hub’s network of domain scientists and industry experts will be leveraged to disseminate information about DBfN, including the workshop report, to wider audiences in the South and across all four Hub regions.
  • The researchers will work with the South Hub to develop DBfN into a full BD Hub spoke proposal that will help the national neuroscience community.

Still not sure what DataBridge is? The idea is simple and addresses the challenges that result from this key fact: even in the age of big data, most research data is created by small teams or individual investigators. That means most research data sets are small and usually stored locally, where it is impossible for future researchers to access them.

When considered as a whole, these small data sets add up to big data: an untapped treasure trove of research results often referred to as “dark data.” DataBridge, led by Arcot Rajasekar at UNC-Chapel Hill and RENCI, aims to make dark data discoverable and available for investigation and collaboration.

DataBridge gathers metadata about data sets, including the scientific field of the data, when and where it was created or collected, and the methods used. It then uses relevance detection algorithms to find similarities between a newly ingested data set and other data sets in the system. The system uses socio-metric network algorithms to cluster data sets into “communities” based on their similarities. When researchers use the DataBridge web interface, they can find similar and related data sets, much like an online retailer recommends books based on past purchases or Facebook recommends new friends based on existing connections.
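For readers who want a feel for how this kind of pipeline can work, here is a minimal sketch in Python. It is not DataBridge's actual code: the Jaccard score stands in for the project's relevance detection algorithms, grouping by connected components stands in for its socio-metric network algorithms, and the data set names, metadata tags, and the 0.5 threshold are all made up for illustration.

```python
from itertools import combinations


def jaccard(tags_a, tags_b):
    """Share of metadata tags two data sets have in common (0.0 to 1.0)."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_edges(datasets, threshold=0.5):
    """Link every pair of data sets whose similarity meets the threshold."""
    edges = []
    for (name_a, tags_a), (name_b, tags_b) in combinations(sorted(datasets.items()), 2):
        score = jaccard(tags_a, tags_b)
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges


def communities(datasets, edges):
    """Group linked data sets into communities (connected components)."""
    parent = {name: name for name in datasets}

    def find(name):
        # Follow parent links to the representative of this group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for name_a, name_b, _score in edges:
        parent[find(name_a)] = find(name_b)

    groups = {}
    for name in datasets:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical metadata: each data set is described by a handful of tags.
datasets = {
    "survey_2010": {"social science", "survey", "north carolina"},
    "survey_2012": {"social science", "survey", "georgia"},
    "eeg_study": {"neuroscience", "eeg", "north carolina"},
}

edges = similarity_edges(datasets)
print(communities(datasets, edges))
```

In this toy example the two surveys share half of their combined tags, so they land in the same community, while the EEG study stays on its own. A real system would draw on much richer metadata and more sophisticated similarity and clustering methods.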

In its first iteration, also funded by the NSF, DataBridge focused on data sets in the social sciences. As the project expands into new communities, we wish them continued success in making data from the “long tail of science” more accessible and usable.

Learn more:

DataBridge white paper

DataBridge website

-Karen Green

Leading the charge in biomedical visualization

Biomedical informatics is one of the hottest data science research fields, with scientists publishing hundreds of research papers every year that could impact how patients and doctors access and interact with medical information and the effectiveness of medical treatments.

Read more…

Why Data Commons? Because scientists want to focus on science, not infrastructure


ESIP meeting participants discuss the challenges of a Data Commons at their recent summer meeting in Durham, NC.

After more than 25 years as a science communicator, I’ve come to recognize the things that all scientists, regardless of their disciplines, yearn for. It’s not an endless stream of funding or appreciation from the public for their work (although both would be nice). Read more…

Introducing the Women of RENCI

As Women’s History Month draws to a close, RENCI acknowledges the daily hard work of each of its female employees. The research strides occurring at RENCI would not be possible without our female researchers, project coordinators, administrators, and communicators.

From left to right: Asia Mieczkowska, Jennifer Resnick, Claris Castillo, Hong Yi, Lea Shanley, Caryn Best, Lisa Stillwell, Margaret Wesley, Kristi Andrews, Laura Capps Hill, Rebekah Sturgess, Karen Green, Dawn Carsey, Annie Goessling, and Stephanie Suber


Recently, the RENCI communications team rounded up as many “Women of RENCI” as possible for a group photo and to learn more about how they contribute to the organization. The list below (and the photo) summarize the information gathered on that day. Read more…

RENCI CTO speaks to high school students on the future of computer science

The next generation of potential computer scientists is making its way to K-12 classrooms each day, but are these young minds being exposed to the fundamentals of computer science? By one estimate, only one in four American high schools offers computer science courses, and few of those schools allow the course to count toward graduation.

To counteract these statistics, some computer scientists are working harder to share their knowledge and experiences from the field. RENCI’s Director of Informatics and Chief Technology Officer Charles Schmitt, PhD, joined the cause recently when he visited the North Carolina School of Science and Math (NCSSM) to speak to a group of students about computer science.   Read more…

Research Triangle Analysts at RENCI: Topological Data Analysis

Research Triangle Analysts met at RENCI for their first monthly meeting of the new year on January 19. Research Triangle Analysts meet at RENCI every third month and elsewhere around the Triangle during other months. The group, a 501(c)(3) non-profit and all-volunteer organization, promotes the advancement of data science throughout the Triangle’s collaborative communities of analysts, mathematicians, statisticians, and scientists.

Research Triangle Analysts participants learn about topological data analysis at RENCI.


Hamza Ghadyali, a PhD candidate in mathematics at Duke University, was the featured speaker for the meeting. Ghadyali develops new topological data analysis (TDA) tools, particularly for the analysis of electroencephalogram (EEG) data. Topology is the mathematical study of shape. TDA tools analyze large, noisy, complex datasets from disciplines such as, but not limited to, oncology, astronomy, meteorology, and neuroscience. Analysis of the shapes and changes in shape represented by data yields information about the data. Read more…

Crossing the pond in the name of better data management

iRODS Chief Technologist Jason Coposky offers guidance to iRODS users at the University of Utrecht.


The iRODS data management platform and the iRODS Consortium that works to sustain it are making waves well beyond their home base in Chapel Hill, NC.

This week, three of the smart, savvy people behind iRODS and the Consortium (iRODS originator Reagan Moore, Consortium Executive Director Dan Bedard, and Chief Technologist Jason Coposky) traveled to France, the United Kingdom, and the Netherlands to talk about the benefits of iRODS as a data management solution for large distributed research projects, to provide training for those interested in becoming iRODS power users, and generally to evangelize about software that is now being used far and wide in Europe, the U.S., Asia, South America, Australia, and South Africa.  Read more…

Coffee and Viz series brings teaching in a Social Computing Room to life

Professors at NC State University and UNC-Chapel Hill have access to a tool that can bring both excitement and exploration into their curriculum – the Social Computing Room (SCR). While the resource is available on both campuses, educators can be unsure about how it effectively fits into their course plans.

NC State’s Coffee and Viz series hopes to provide ideas for instructors of all disciplines by highlighting those already using SCRs and other visualization spaces and by providing speakers with novel ideas for the use of visualization in education and research.

Read more…

DataNet presentations lead to invigorating discussion at ESA annual meeting

DataNet Tools and Services was the topic of a session at the recent Ecological Society of America Annual Meeting, held last month in Baltimore.

Chris Lenhardt and Mike Conway presented in the session representing the UNC Chapel Hill-based DataNet Federation Consortium (DFC). Chris is lead of the DFC Facilities and Operations team and is active in RENCI’s environmental sciences group; Mike is a senior developer with DFC.

Organized by Amber Budden of the DataONE DataNet project, the session used the IGNITE format: a series of 5-minute, 20-slide talks followed by Q & A. The fast-paced IGNITE talks present forward-looking, unconventional, and/or controversial ideas to spur the audience into questioning their usual assumptions and thinking creatively about the topic. Both of the DFC IGNITE talks challenged the audience to consider how a data management system can provide tools and services for scientists that go beyond simply storing, indexing, discovering, and accessing data files. Read more…

Three keys to work-life balance

Last week, I was asked to speak to young professionals about work-life balance, so I have been pondering this topic a lot. How do you juggle both a full-time, demanding and exacting career and the often-contradictory demands of raising little human beings to become productive members of society? To be honest, I think the “secret” is that all of us are just winging it, really, and we are creating and maintaining balance as we go – even if it doesn’t appear that way to others from the outside. Parenting and careers are all about change. Just when you think you have achieved the perfect balance, something changes – your child starts potty training, enters puberty, adjusts to a new school, or gets chosen for a school team. You earn a promotion and gain new responsibilities, move offices (which affects your commute), or start a new job. Your spouse has to travel more or has a change in health condition. Older family members need care and help in a way they haven’t before.

Read more…
