Separating the wheat from the chaff in an age of bots and trolls

In the age of ubiquitous connectivity and social media, information is at our fingertips. Unfortunately, so is misinformation, and it is often hard to tell one from the other.

A recent roundtable discussion sponsored by the South Big Data Hub examined the rapidly changing landscape for building online communities, sharing information, and creating what often appears to be a groundswell of support for particular points of view.

The roundtable panelists have studied social media data and worked to understand its impacts for years. They have a wide range of experience in computer science, data science, and the behavioral and social sciences (for panelist bios, click here).

According to one of those panelists, Kathleen M. Carley, PhD, of Carnegie Mellon University, social media, easy access to the internet, and the proliferation of bots (web “robot” software that runs automated tasks over the internet) and trolls (people who seek arguments on the internet) have turned discourse about big ideas into “a wild west.” The cast of characters on this new frontier includes Anonymous (a loosely associated group of international network hackers and activists who generally oppose internet censorship) and social media platforms like Twitter and YouTube, which police their sites and evolve daily.

Increasingly, information from social media, posted by ordinary citizens rather than trained journalists, shapes the news, said Carley. Marketing campaigns are designed to go viral, spreading through networks and communities using social media. Misinformation can multiply, deep information and context are often hard to find, and bots and trolls busily create new online communities and shape public conversations.

How do bots work? They are often embedded deeply in online communities and used to link different communities so that community members receive the same information. They share posts between social media platforms and post stories that appear to be from legitimate news sites but are often from sites created to appeal to a constituency or give the appearance of wide support.

“What this does is create a groundswell of what appears to be strong support for common ideas even though it is just fiction,” said Carley. “You get truth and fiction being spread to people so fast through a series of soundbites that there’s not an ability for them to check it.”

Panelist Nitin Agarwal, PhD, of the University of Arkansas at Little Rock referred to a recent Wired magazine cover story that profiled teenage bloggers in Macedonia, where the chance to make comparatively large sums of money motivates tech-savvy young people to create blogs that lift misinformation from sites with right-wing or alt-right viewpoints. Since recent research shows 34 percent of Americans trust the information they receive from social media and 14 percent consider social media their most important information source, these blogs can have real impact.

“This is a huge percentage when you consider the electorate; 1 percent can make a difference there,” Agarwal said.

Agarwal’s research team tracks blogs and bloggers, how fringe ideas make their way into the mainstream, and how these ideas are further shared through mainstream social media, taking full advantage of a mass communication system that supports one-to-one, one-to-many, and most importantly many-to-many communications.

Panelist Huan Liu, PhD, of Arizona State University has been a big data researcher for more than 20 years and is an expert on gleaning knowledge from data. But traditional methods of data analysis don’t work as well on social media data, he said, partly because of the misinformation on social media channels. Furthermore, artificial intelligence, which is integrated into social media platforms and into bots, makes bots more “intelligent” and harder to detect, he said.

Data science related to social media must account for misinformation by detecting bots, thereby reducing their impact, according to Liu. However, traditional big data research tactics are not enough in the era of artificial intelligence.

“Big data alone is not enough,” said Liu. “We need to enable user-controlled information filters and checkers. We need to promote and build diverse collective intelligence.”

Manipulating the narrative

According to panelist Rand Waltzman, PhD, of Rand Corporation, fake news is an old idea that’s been given new life through social media and the internet. It’s an example of “active measures,” a term coined by the KGB, he said.

“Active measures are techniques used to manipulate groups of people—anywhere from 10 to a billion—using information as a weapon, doing whatever the perpetrator wishes, while making those people think it was their idea,” said Waltzman. “It doesn’t matter whether the information is true, half true or false as long as it gets the job done.”

Active measure techniques, said Waltzman, are used in everything from criminal activity to mass marketing to political campaigns. “At some point, and I think we are almost there, active measures completely dominate the information environment and objective truth and reality become almost meaningless or irrelevant concepts.”

How can researchers and ordinary humans fight back? The panelists agreed that researchers need to look at the big picture, tracking multiple platforms and multiple sources of messages, rather than one social media platform or one post. With Twitter, for example, they need to study not individual tweets, but whole conversations in order to understand context. As social media and the internet become more image based, images must be tracked and analyzed to determine if they have been altered to support a particular narrative.

In addition, counterarguments and attempts to discredit information sources won’t work, the panelists agreed.

“It’s not so much about countering, but replacing one narrative with another,” said Waltzman. “All of these things exist because they fulfill a need of some sort. You can’t discredit or fight against it without providing an alternative. The question becomes how do you inject the alternative (narrative) and get it to replace the one you don’t like.”

To listen to the entire roundtable discussion, click here.

To learn about upcoming South Big Data Hub roundtables and other events, please see the South Hub calendar of events.

-Karen Green

First Southern Data Science Conference comes to Atlanta April 7

Register now at

The data science community and members of the South Big Data Hub should mark their calendars for the very first Southern Data Science Conference, to be held on April 7 at the Hyatt Regency Atlanta Perimeter at Villa Christina. The conference is expected to attract data science thought leaders from around the southeast and the nation and will feature speakers from innovative companies and research laboratories, such as Google, Microsoft, AT&T, NASA, Glassdoor and Groupon. Read more…

IBM exec offers tips for thriving in the digital data storm

Cognitive thinking is the key to surviving and thriving in the perfect storm of modern technology, according to IBM’s Mac Devine, who presented a National Consortium for Data Science (NCDS) DataBytes Webinar in December.

Devine, vice president and CTO of emerging technology and advanced innovation, IBM Cloud Division, said that our interconnected world, composed of big data, the Internet of Things, and the cloud, has created a tidal wave of data that is too large to handle using traditional methods of managing information. Cognitive thinking, or using high-level technology to comb through large sets of data with a human mindset, is one strategy for coping with what he termed a “perfect digital storm.”

Read more…

Webinar to discuss smart and connected cities

The explosion of digital data means changes in how we work, play, and interact with each other and with the technologies and devices we depend on. Nowhere is that change more apparent than in the movement to create smart and interconnected cities.

What started as an effort to integrate multiple information and communication technologies with sensors that collect data about transportation systems, power plant usage, water supply networks, and more has evolved into a transformation of urban environments using a data infrastructure that can monitor events, troubleshoot problems, and enable a better quality of life.

Read more…

DataBridge tackles the problem of ‘dark data’

DataBridge, a National Science Foundation-funded project to make research data more discoverable and usable by a wide community of scientists, has the green light to expand its work into the neuroscience community, thanks to a new NSF EAGER award.

Read more…

Leading the charge in biomedical visualization

Biomedical informatics is one of the hottest data science research fields, with scientists publishing hundreds of research papers every year that could impact how patients and doctors access and interact with medical information and the effectiveness of medical treatments.

Read more…

Why Data Commons? Because scientists want to focus on science, not infrastructure


ESIP meeting participants discuss the challenges of a Data Commons at their recent summer meeting in Durham, NC.

After more than 25 years as a science communicator, I’ve come to recognize the things that all scientists, regardless of their disciplines, yearn for. It’s not an endless stream of funding or appreciation from the public for their work (although both would be nice). Read more…

Introducing the Women of RENCI

As Women’s History Month draws to a close, RENCI acknowledges the daily hard work of each of its female employees. The research strides occurring at RENCI would not be possible without our female researchers, project coordinators, administrators, and communicators.

From left to right: Asia Mieczkowska, Jennifer Resnick, Claris Castillo, Hong Yi, Lea Shanley, Caryn Best, Lisa Stillwell, Margaret Wesley, Kristi Andrews, Laura Capps Hill, Rebekah Sturgess, Karen Green, Dawn Carsey, Annie Goessling, and Stephanie Suber

Recently, the RENCI communications team rounded up as many “Women of RENCI” as possible for a group photo and to learn more about how they contribute to the organization. The list below (and the photo) summarize the information gathered on that day. Read more…

RENCI CTO speaks to high school students on the future of computer science

The next generation of potential computer scientists is making its way to K-12 classrooms each day, but are these young minds being exposed to the fundamentals of computer science? According to, only one in four American high schools offer computer science courses, and few of those schools allow the course to count toward graduation.

To counteract these statistics, some computer scientists are working harder to share their knowledge and experiences from the field. RENCI’s Director of Informatics and Chief Technology Officer Charles Schmitt, PhD, joined the cause recently when he visited the North Carolina School of Science and Math (NCSSM) to speak to a group of students about computer science. Read more…

Research Triangle Analysts at RENCI: Topological Data Analysis

Research Triangle Analysts met at RENCI for their first monthly meeting of the new year on January 19. Research Triangle Analysts meet at RENCI every third month and elsewhere around the Triangle during other months. The group, a 501(c)(3) non-profit and all-volunteer organization, promotes the advancement of data science throughout the Triangle’s collaborative communities of analysts, mathematicians, statisticians, and scientists.

Research Triangle Analysts participants learn about topological data analysis at RENCI.

Hamza Ghadyali, a PhD candidate in mathematics at Duke University, was the featured speaker for the meeting. Ghadyali develops new topological data analysis (TDA) tools, particularly for the analysis of electroencephalogram (EEG) data. Topology is the mathematical study of shape. TDA tools analyze large, noisy, complex datasets from disciplines such as, but not limited to, oncology, astronomy, meteorology, and neuroscience. Analysis of the shapes and changes in shape represented by data yields information about the data. Read more…
