With demand for computation soaring over the past decade, tech companies have been amassing large numbers of commodity machines to serve requests from massive numbers of users. Such large-scale multi-tenant clusters, with optimized resource scheduling, have the potential to be highly efficient. However, achieving both high performance and low cost in practice is challenging. Given heterogeneous hardware and diverse workloads, many schedulers either leave resources underutilized, which increases cost, or cause heavy workload contention, which degrades performance. In this dissertation, starting with a characterization study of a production cluster, we present the challenges posed to resource scheduling; for example, low resource utilization, the presence of hard-to-schedule tasks demanding hig...
Hossein Shafieirad, Amir Shani, Manaf Bin-Yahya, Seyed Hossein Mortazavi, Geng Li, Xinle Du, T.H. Su, Wei Wang, Jingbin Zhou, Majid Ghaderi
Robert Grandl, Mosharaf Chowdhury, Aditya Akella, Ganesh Ananthanarayanan
Xiaoping Li, Dongyuan Pan, Yadi Wang, Rubén Ruíz