Sachinkumar Anandpal Goswami; Kashyap C. Patel; Dhara Ashish Darji; Shital Patel; Sonal Patel
Automation and orchestration of workloads in the cloud are critical for managing the growing complexity and scale of artificial intelligence applications. This chapter analyses the tools and techniques used to automate and optimize AI workflows on cloud infrastructure. The focus is on the role that cloud orchestration platforms, such as Kubernetes, play in managing distributed workloads, and on the benefits of automation for scaling, resource management, and fault tolerance. The chapter also discusses how ML pipelines integrate with cloud services to streamline model training, deployment, and monitoring. The discussion emphasises that workload automation raises productivity, reduces downtime, and helps maintain high availability in the cloud. Case studies illustrate best practices and the most common problems encountered in AI workload automation and orchestration.
Bhavye Sharma; Ravi, Akash; Avigyan Mukherjee