South Big Data Hub links talented students with tech startups through new program

DataStart provides real-world experiences for students while helping entrepreneurial companies

Graduate students from six universities in the southern U.S. will spend the summer working on data challenges important to the success of new and growing technology companies, thanks to a program called the Southern Startup Internship Program in Data Science (DataStart).

DataStart interns will address a wide range of data-related business problems, including spotting trends in the diversity of people included in clinical trials, developing methods for using sensor data to detect loads on wind turbines, and cleansing and standardizing data to extract more knowledge from it.

The program was launched this year by the South Big Data Regional Innovation Hub (South BD Hub), one of four Big Data Hubs funded by the National Science Foundation. It is supported by a grant from the Computing Community Consortium, which enables innovative, high-impact research in the national computing community.

DataStart provides each host company with up to $15,000 to pay for a full-time intern working onsite from approximately June 1 to August 31. The program is open to graduate students in good standing from accredited universities in the South BD Hub region, which includes Alabama, Arkansas, Delaware, the District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, and West Virginia.

“Connecting data-focused young businesses with talented students who will be the data scientists of the future is essential to the mission of the NSF Big Data Hubs,” said Stan Ahalt, PhD, principal investigator for the South BD Hub and director of the Renaissance Computing Institute (RENCI) at the University of North Carolina at Chapel Hill. “DataStart builds this bridge between academia and the business world. The students get the chance to learn in a real-world business environment and the startups will benefit from their ideas, energy, and talents at a critical time in their business development.”

The DataStart interns and their host companies are:

Student: Samia Ansari, University of Georgia
Host company: Sartography, Staunton, VA
Ansari, a student in the professional science master’s program in biomanufacturing and bioprocessing, will characterize the representation of women and racial minorities in cancer research conducted between 2002 and 2012. She will also identify trends in women’s and minorities’ participation in cancer trials during that time. The work complements efforts by many research agencies to ensure that clinical trials reflect the social, racial/ethnic, geographic, and economic diversity of the U.S. population.


Student: Lucy D’Agostino McGowan, Vanderbilt University
Host company:, Nashville, TN
D’Agostino McGowan, a PhD student in biostatistics, will incorporate raw data streams from Google Analytics, Slack, and other sources to create a foundation for predictive modeling for the host company, which assembles remotely managed freelance software development teams for companies worldwide. She will also examine financial data to better understand the relationship between the company’s pricing and client demand and the dynamics of different refund and return policies.


Student: Aziz Eram, University of Arkansas at Little Rock
Host company: Black Oak Analytics, Little Rock, AR
Eram, a student in the master’s program in information quality, will develop and test a general approach to the problem of cleansing and standardizing information obtained from free text fields that reference the same product or service—for example, information about store inventory that is entered into a system manually by an employee. She will use a tool developed by Black Oak Analytics to design and test new comparators and rule configurations that address the free-text reference problem.


Student: Zhengqian Jiang, Florida State University
Host company: NPGroup Inc., Tallahassee, FL
Jiang, a student in the department of industrial and manufacturing engineering, will assist Nanotechnology Patronas Group Inc. (NPGroup Inc.) in developing and commercializing a sensor system for wind turbines that can accurately detect loads that go undetected in the models typically used by inflow sensors to estimate load. The new sensor system seeks to increase wind power production and protect expensive wind turbine components.


Student: Jonathan Ortiz, University of Texas at Austin
Host company:, inc., Austin, TX
Ortiz, a student in the professional data analytics program, will work with the host company, a new Austin-based technology company headed by Brett Hurt, a serial entrepreneur who has led several successful big data startups, including Bazaarvoice and Coremetrics (now IBM Customer Analytics).


Student: Ashok Vardhan, George Mason University
Host company: MetiStream, McLean, VA
Vardhan, who is pursuing a master’s in data analytics engineering, will help develop a healthcare data conversion solution called Ember, which bridges the gap between the existing Health Level Seven International (HL7) version 2.x (HL7 V2) healthcare standards and the emerging next-generation international specification called Fast Healthcare Interoperability Resources (FHIR). The new standards are vital to using healthcare data to improve patient costs, but organizations have been slow to adopt them because of their complex existing technical environments or because of a reluctance to commit to change.

About the South BD Hub
The South Big Data Hub is a National Science Foundation Big Data Regional Innovation Hub that serves 16 states in the southern U.S. and the District of Columbia. Led jointly by the University of North Carolina at Chapel Hill and the Georgia Institute of Technology, the South BD Hub aims to build innovative public-private partnerships that address regional challenges through big data analysis. For more information, visit