Stochastic gradient methods (SGMs) have become the workhorse of machine learning (ML) due to their incremental nature and computationally cheap updates. In this talk, I will discuss our generalization analysis of SGMs, which simultaneously considers the generalization and optimization errors in the framework of statistical learning theory (SLT), and its applications. The core concept in our study is stability, a notion in SLT that characterizes how the output of an ML algorithm changes upon a small perturbation of the training data. Our theoretical results significantly improve the existing ones in the convex case and lead to new insights into the generalization of deep neural networks trained by SGD in the non-convex case. I will also discuss applications of our new stability analysis to differential privacy, and show how to derive lower bounds for the convergence of existing methods for maximizing the AUC score, which in turn inspires a new direction for designing efficient AUC optimization algorithms.
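The notion of stability mentioned above can be probed empirically. The following is a minimal illustrative sketch (not the speaker's actual analysis): it runs SGD on least-squares regression over two "neighboring" datasets that differ in a single training example, then measures how far apart the two learned parameter vectors end up. All function names and hyperparameters here are hypothetical choices for illustration.

```python
# Illustrative sketch of algorithmic stability: how much does the output
# of SGD change when one training example is replaced?
import numpy as np

def sgd(X, y, lr=0.01, epochs=50, seed=0):
    """Plain SGD on the squared loss; returns the final weight vector."""
    rng = np.random.default_rng(seed)  # fixed seed: same sample order in both runs
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5*(x_i^T w - y_i)^2
            w -= lr * grad
    return w

rng = np.random.default_rng(42)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Neighboring dataset: identical except for one replaced training example.
X2, y2 = X.copy(), y.copy()
X2[0] = rng.normal(size=d)
y2[0] = rng.normal()

w1 = sgd(X, y)
w2 = sgd(X2, y2)
print("parameter change after a one-point swap:", np.linalg.norm(w1 - w2))
```

Because the objective here is convex and the datasets differ in only one of 200 points, the printed distance is small; stability arguments in SLT turn bounds of this kind into generalization bounds.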
Yiming is a Professor in the Department of Mathematics and Statistics at SUNY Albany. Before joining UAlbany in 2015, he was an Assistant Professor in the Department of Computer Science at the University of Exeter, England. His research interests include statistical learning theory, machine learning, and optimization. He currently serves as an associate editor of Transactions on Machine Learning Research, Neurocomputing, and Mathematics of Computation and Data Science, and as the managing editor of Mathematical Foundations of Computing. He also serves as a Senior Program Committee Member/Area Chair for major machine learning conferences such as NeurIPS and AISTATS.