Non-convex optimization is central to modern machine learning, signal processing, and many scientific and engineering applications, particularly those involving large datasets and complex models. While convex optimization offers strong theoretical guarantees, non-convex landscapes are riddled with local minima, saddle points, and plateaus, making global convergence difficult. This paper presents a comprehensive geometric theory for large-scale non-convex optimization, examining how the intrinsic geometric structure of objective functions and search spaces can be exploited to improve algorithmic efficiency and robustness. We integrate concepts from Riemannian geometry, information geometry, and optimization landscape analysis into a unifying framework for understanding iterative methods. By analyzing the curvature, geodesics, and topological properties of the underlying manifolds, we show how to design methods that navigate complex spaces more effectively, escape undesirable stationary points, and identify regions of the landscape that admit faster convergence. This geometric perspective offers novel insights into the empirical success of deep learning optimizers and paves the way for next-generation optimization strategies for high-dimensional, large-scale non-convex problems.
Qi Zhang, Yi Zhou, Ashley Prater-Bennette, Lixin Shen, Shaofeng Zou