First Workshop on Data-Centric Infrastructure for Big Data Science (DIBS)

Today’s infrastructure for Big Data science applications follows a traditional control-centric approach, in which behaviour, rather than data and data operations, is the primary organizing construct of its design. This limits the potential data-handling capabilities of such infrastructures. For example, it hinders explicit handling of data at various system layers (e.g., network and file system) to satisfy multi-domain requirements, such as security, performance and resource-aware data management for reducing operational costs.

Central to these challenges is the fact that software tools and infrastructure to support collaboration have evolved in a piecemeal fashion. Data management technologies such as data grids enable sophisticated operations to integrate data from multiple administrative domains into a single abstraction. Cloud models such as Infrastructure as a Service (IaaS) facilitate the rapid deployment of networked virtual infrastructure (i.e., Clouds) and fast data transfers. On the one hand, these technologies cannot address the Big Data challenge while working independently. On the other hand, they present complex, opaque APIs and use different resource abstractions that hinder their integration, thus preventing data from playing a central role in automated decision making.

This workshop brings together system researchers, practitioners and domain scientists with expertise and interest in Big Data Science to explore novel data-driven approaches in developing and deploying software designs and infrastructure. We will focus on capturing research that seeks to take a holistic and integrated approach to data, infrastructure and resource management; domain science applications that can benefit from these novel data-centric software infrastructures; and experiences that help us navigate the problem space.  

Topics of interest for the workshop include (but are not limited to) the following subject categories:

  • Novel design approaches and software infrastructures to support Big Data Science
  • Networking support for Big Data Science (beyond speeding up data transfers)
  • Security including preservation of data privacy in distributed computation settings
  • Big Data Science use cases with stringent performance requirements
  • Storage infrastructure for Big Data Science
  • Resource-aware data management
  • Big Data infrastructure designs to support Big Data Science

Venue and Date 

The workshop will be co-located with IEEE BigData 2015, the 2015 IEEE International Conference on BigData (http://cci.drexel.edu/bigdata/bigdata2015/) in Santa Clara, California on October 29th, 2015. It will be a half-day or full-day event depending on the number of submissions.

Registration

Registration information for the workshop can be found through the IEEE BigData website.

Organization

Workshop Co-Chairs

  • Claris Castillo (RENCI) 
  • Ivan Rodero (Rutgers)

Steering Committee

  • Ilya Baldin (RENCI)
  • Ewa Deelman (ISI)
  • Geoffrey Fox (Indiana University)
  • Inder Monga (ESnet)
  • Manish Parashar (Rutgers)
  • Arcot Rajasekar (UNC, Chapel Hill)
  • Robert Ricci (University of Utah)
  • Almadena Chtchelkanova (NSF)
  • Joseph (Bryan) Lyles (Oak Ridge National Laboratory)

Program Committee

  • Yanpei Chen (Cloudera)
  • Toni Cortes (Barcelona Supercomputing Center)
  • Liana Fong (IBM Research)
  • Cees de Laat (University of Amsterdam)
  • Amit Majumdar (San Diego Supercomputer Center)
  • Joe Mambretti (Northwestern University Information Technology)
  • Jay Park (Louisiana State University)
  • Lavanya Ramakrishnan (Lawrence Berkeley Laboratory)
  • Omer Rana (Cardiff University)
  • Charles Schmitt (RENCI)
  • Mai Zheng (New Mexico State University)
  • Linh Ngo (Clemson)

Important Dates

  • September 6, 2015 – Submissions due
  • September 20, 2015 – Reviews due
  • September 24, 2015 – Notifications out
  • October 5, 2015 – Camera-ready papers due

Paper submissions

Authors are invited to submit papers electronically in PDF format. Submitted manuscripts should be structured as technical papers and may not exceed 8 letter-size (8.5 x 11) pages including figures, tables and references using the IEEE Computer Society format for conference proceedings. 

LaTeX package and Word template are available from here.

All papers will be included in the Workshop Proceedings published by the IEEE Computer Society Press.

Submission website: https://easychair.org/conferences/?conf=dibs2015 

Call for papers in PDF

Click here to download the CFP. (*.pdf) (to be updated)

First Workshop on Data-Centric Infrastructure for Big Data Science & 3rd Workshop on Distributed Storage Systems and Coding for Big Data

Joint Program Schedule

Date: October 29, 2015

Venue: Ballroom H, Hyatt Regency Santa Clara, CA 95054, USA

Time Workshop Schedule

13:30-13:40 Plenary

13:40-14:20 Keynote Speech: Data Federation and Data Management for the LHC Experiments ATLAS and CMS

Frank Wuerthwein (University of California San Diego/San Diego Supercomputer Center)

14:20-14:45 Network-Aware Resource Management for Scalable Data Analytics Frameworks

Thomas Renner, Lauritz Thamsen, Odej Kao (Technische Universität Berlin, Germany)

14:45-15:10 Lambda Architecture for Cost-effective Batch and Speed Big Data Processing

Mariam Kiran (University of Bradford), Peter Murphy (ESnet), Inder Monga (ESnet), Jon Dugan (ESnet), Sartaj Singh Baveja (Netaji Subhas Institute of Technology)

15:10-15:35 On a New Approach to the Index Selection Problem using Mining Algorithms

Parinaz Ameri, Jorg Meyer and Achim Streit (Karlsruhe Institute of Technology)

15:35-15:50 Coffee Break

15:50-16:15 RADII: Resource Aware Datacentric CollaboratIon Infrastructure

Claris Castillo, Fan Jiang, Charles Schmitt, Ilya Baldin, Arcot Rajasekar (RENCI, UNC-Chapel Hill)

16:15-16:40 A Comprehensive Evaluation of NoSQL Datastores in the Context of Historians and Sensor Data Analysis

Arun Kumar Kalakanti, Vinay Sudhakaran, Varsha Raveendran, and Nisha Menon (Siemens Technology and Services Pvt. Ltd., Bangalore, India)

16:40-17:05 On the Implementation of Zigzag Codes for Distributed Storage System

Lijia Lu, Hui Li, Jun Chen, Bing Zhu (Shenzhen Graduate School, Peking University, China), and Weijuan Yin (Shenzhen Huadong Feitian Network Development Co., Ltd., Shenzhen, China)

17:05-17:30 Challenges and Opportunities on Network Resource Management in DCN with SDN

Guan Xu, Jun Yang, and Bin Dai (Huazhong University of Science and Technology, China)