Microservices provide modularity and scalability but introduce latency sensitivity, especially when deployed in large multi-tenant cloud clusters. This paper proposes LAT-Place, a microservice placement policy that incorporates inter-service latency profiles, network topology, and real-time traffic into container orchestration decisions. LAT-Place constructs a latency graph for each application describing dependencies and acceptable latency budgets. A placement engine matches microservices to nodes using a multi-objective optimization function that balances latency, utilization, and interference avoidance. We implement LAT-Place as a Kubernetes scheduler plugin and evaluate it with microservice workloads modeled after e-commerce, streaming, and analytics systems. Results show improvements of 20–45% in end-to-end latency and more stable 95th percentile response times. LAT-Place also reduces cross-tenant resource contention through predictive throttling and priority-aware routing. The paper concludes with a cost analysis and recommendations for production deployment in hybrid and edge-cloud environments.
Marco Zambianco, Silvio Cretti, Domenico Siracusa
Ravi Chandra Thota, Independent Researcher
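The abstract describes a latency graph with per-dependency budgets and a placement engine that balances latency, utilization, and interference. As a rough illustration of how such a trade-off could be expressed, the sketch below uses a greedy weighted-sum cost over candidate nodes. It is not the LAT-Place algorithm: the `Node` and `Service` types, the zone-level latency table, the weights, and the greedy service-by-service strategy are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    zone: str
    cpu_capacity: float
    cpu_used: float = 0.0
    tenants: set = field(default_factory=set)


@dataclass
class Service:
    name: str
    cpu: float
    tenant: str


def link_latency(a, b, zone_latency):
    """Estimated latency in ms between two nodes (0 on the same node)."""
    if a.name == b.name:
        return 0.0
    return zone_latency[frozenset((a.zone, b.zone))]


def cost(svc, node, placed, edges, zone_latency, w):
    """Hypothetical weighted-sum cost of putting `svc` on `node` (lower is better)."""
    # Latency term: latency to each already-placed neighbor, relative to the edge budget.
    lat = 0.0
    for u, v, budget_ms in edges:
        if svc.name not in (u, v):
            continue
        other = v if u == svc.name else u
        if other in placed:
            lat += link_latency(node, placed[other], zone_latency) / budget_ms
    # Utilization term: CPU fraction used on the node after placement.
    util = (node.cpu_used + svc.cpu) / node.cpu_capacity
    # Interference term: number of other tenants already on the node.
    intf = len(node.tenants - {svc.tenant})
    return w["latency"] * lat + w["utilization"] * util + w["interference"] * intf


def place(services, edges, nodes, zone_latency, w):
    """Greedily assign each service to its cheapest feasible node."""
    placed = {}
    for svc in services:
        feasible = [n for n in nodes if n.cpu_used + svc.cpu <= n.cpu_capacity]
        if not feasible:
            raise RuntimeError(f"no node can host {svc.name}")
        best = min(feasible, key=lambda n: cost(svc, n, placed, edges, zone_latency, w))
        best.cpu_used += svc.cpu
        best.tenants.add(svc.tenant)
        placed[svc.name] = best
    return {name: node.name for name, node in placed.items()}


def demo():
    # Illustrative inputs only; none of these values come from the paper.
    nodes = [Node("n1", "a", 4.0), Node("n2", "a", 4.0), Node("n3", "b", 4.0)]
    zone_latency = {
        frozenset(("a",)): 0.5,       # two different nodes in zone a
        frozenset(("b",)): 0.5,       # two different nodes in zone b
        frozenset(("a", "b")): 10.0,  # cross-zone hop
    }
    services = [
        Service("frontend", 2.0, "t1"),
        Service("cart", 2.0, "t1"),
        Service("db", 2.0, "t1"),
    ]
    # Latency graph edges: (caller, callee, latency budget in ms).
    edges = [("frontend", "cart", 5.0), ("cart", "db", 2.0)]
    weights = {"latency": 1.0, "utilization": 0.5, "interference": 0.5}
    return place(services, edges, nodes, zone_latency, weights)


if __name__ == "__main__":
    print(demo())
```

With these toy inputs, the tightly budgeted `cart` to `db` edge pulls both services onto the same node, while the utilization term keeps `frontend` on a separate node in the same zone. A scheduler plugin would have to compute a comparable per-node score from live cluster state rather than from a static table.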