With increasing focus on interoperability across cloud offerings to leverage their disparate capabilities, it has become increasingly important to provide a flexible framework for sharing heterogeneous resources in the cloud infrastructure. At the same time, it is imperative to understand the performance implications of hosting application workloads on different resources in order to guarantee Service Level Agreements (SLAs) to the applications. This paper focuses on the experimental characterization of how different heterogeneous resources perform when hosting big-data analytics workloads, one of the most critical classes of applications today. To build the knowledge on which our recommendations are based, we benchmark the performance of big-data analytics applications on a Hadoop cluster setup. Specifically, we study parameters of interest such as turnaround time and throughput, which are the most likely to influence the choice of infrastructure for a particular application. Our experiments are conducted on varied platforms, both internal to Xerox and hosted by external cloud providers. Based on these experiments, we present a model that facilitates the characterization of heterogeneous applications, thus enabling the cloud middleware to select an appropriate infrastructure and metrics in order to attain the desired SLA.
Naghmeh Dezhabad, Sudhakar Ganti, Gholamali C. Shoja
Zhuoyao Wang, Majeed M. Hayat, Nasir Ghani, Khaled Shaban
Tanvir Ahammad, Uzzal Kumar Acharjee, Md. Mahmudul Hasan