Educators offer tips on making sense of the data revolution

“We are creating every 10 minutes what we were creating every 2,000 years, and that’s the problem.”

This statement by panelist Arcot Rajasekar succinctly sums up one of the many challenges of the modern big data environment discussed at “A Citizen’s Guide to Big Data.”

Held Thursday, Sept. 28, at the Friday Center for Continuing Education in Chapel Hill, the panel featured four speakers, each an expert in a different data science specialty. Each panelist offered a brief presentation on their area of expertise and the way their knowledge is applied in the rapidly expanding field of data science.

Making up the panel were Michele Hayslett, a data librarian who works with the UNC-Chapel Hill Davis Library Research Hub and as adjunct faculty in the School of Information and Library Science (SILS); Paul Jones, clinical professor in SILS; Arcot Rajasekar, professor in SILS, chief data scientist at RENCI, and co-director of the Data Intensive Cyber Environments (DICE) Center at UNC-Chapel Hill; and Ryan Shaw, associate professor in SILS. Moderating the panel was Neal Thomas, assistant professor in the UNC-Chapel Hill department of communication.

The discussion hit both the highs and lows of big data, from the numerous possibilities created by applying big data to problem solving to the threat of privacy breaches. Opening the panel, Rajasekar offered multiple definitions and explanations of the term “big data” and what it truly entails. He then described the current paradigm shift from compute-intensive to data-intensive work. “The way we do science, research, and business is changing,” he said.

Following Rajasekar was Shaw, who, while not a data scientist, brought an interesting perspective to the discussion as a former Yahoo employee. He advised closer cooperation between the people who make strategic decisions and the scientists trying to understand increasingly large amounts of data.

Third to speak was Jones, who ran through three examples of data aggregators and the way they use large amounts of data, most of it personal. He warned of the consequences of this data falling into the wrong hands and explained how citizens can protect themselves by supporting third-party oversight.

Closing the panel was Hayslett, who described incidents of data research conducted through unethical means. In those cases, questionable ethics on the part of the researcher led to bad research and, ultimately, bad data. Hayslett noted that ethics remain key, especially as data proliferates on a larger scale.

After the panel, audience members were able to engage the speakers in a question-and-answer session.

Carolina Chao, RENCI Communications Intern