Today, cloud providers offer simplistic scaling policies that rely on thresholds, forcing tenants to have a priori knowledge of their workloads. Setting such thresholds is especially hard for memory-bound applications, where even a small decrease in available memory can have a dramatic, almost unbounded impact on performance; hence, sizing a machine's physical memory correctly is critical to both application performance and operating cost. We develop a new method for scaling memory-intensive workloads that needs no thresholds, making it worry-free for tenants and able to adapt even as workloads evolve. To determine a natural operating point for memory-intensive applications, our approach automatically analyzes an application's miss ratio curve (MRC) and models it as a hyperbola. Intuitively, a memory scaling policy should operate at the point where the curve flattens: that is, at its intersection with its latus rectum (LR). Our system uses a new approach to constructing and analyzing MRCs at run time, capturing memory references from a slice of any scalable application as it executes on standard virtual machines from any major cloud provider. We demonstrate this with multiple applications running on Amazon Web Services (AWS) and Microsoft Azure. Our implementation and evaluation show that, although the LR approach requires no tenant-set thresholds, it is effective at scaling memory-intensive workloads to save on operating costs while avoiding queuing, thrashing, or collapse: it increases throughput by 1.5× and reduces queuing delay by 2× in our evaluation.
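The LR operating point has a simple closed form once the MRC is modeled as a hyperbola. As a minimal sketch (not the paper's implementation), assume the simplest rectangular-hyperbola model m(s) = k/s with the coordinate axes as asymptotes, where s is memory size and m the miss ratio; the paper's model may include offset terms. For the curve x·y = k, the latus-rectum chord through the first-quadrant focus meets the curve at x = (1 + √2)·√k, which this sketch takes as the target memory size:

```python
import math

def fit_hyperbola(sizes, miss_ratios):
    """Least-squares fit of k in m(s) = k / s.
    Minimizing sum((m_i - k/s_i)^2) over k gives
    k = sum(m_i/s_i) / sum(1/s_i^2)."""
    num = sum(m / s for s, m in zip(sizes, miss_ratios))
    den = sum(1.0 / (s * s) for s in sizes)
    return num / den

def latus_rectum_memory(k):
    """For x*y = k, the latus rectum intersects the curve at
    x = (1 + sqrt(2)) * sqrt(k); return that x as the memory
    size where the MRC has flattened."""
    return (1.0 + math.sqrt(2.0)) * math.sqrt(k)

# Synthetic MRC samples drawn from m(s) = 0.5 / s (a toy workload).
sizes = [1.0, 2.0, 4.0, 8.0]
mrc = [0.5, 0.25, 0.125, 0.0625]
k = fit_hyperbola(sizes, mrc)            # recovers k = 0.5
target = latus_rectum_memory(k)          # ~1.71 memory units
```

Here `fit_hyperbola` and `latus_rectum_memory` are hypothetical names for illustration; the derivation of the intersection point (focus at (√(2k), √(2k)), semi-latus rectum of length √(2k)) follows from standard conic geometry.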
Wagdy Anis Aziz, Amir A. Ammar, John Soliman
María Goicoechea de Jorge, Antônio Carlos