STOR 712: Optimization for Machine Learning and Data Science

Overview

STOR 712 provides a detailed and rigorous treatment of commonly used methods in continuous optimization, with applications in machine learning, statistics, data science, and operations research, among other areas. The main focus of the course is on continuous optimization algorithms, and it also covers core optimization theory as a foundation for the development of these algorithms. The discussion of each algorithm is accompanied by representative applications. STOR 712 is an advanced course that complements STOR 612 and STOR 614.

Topics

  1. Introduction: mathematical optimization; representative models and applications in machine learning, data science, operations research, and signal/image processing.
  2. Basic theory: convex analysis and optimization theory, duality, and optimality conditions.
  3. First-order algorithms (basic): gradient descent methods (for both large-scale convex and nonconvex problems) and accelerated variants; a minimal gradient descent sketch appears after this list.
  4. First-order algorithms (advanced): proximal operators, proximal gradient methods, and primal-dual algorithms; see the proximal gradient sketch after this list.
  5. Stochastic optimization algorithms: stochastic gradient, variance reduction, and adaptive stochastic gradient methods (for large-scale convex and nonconvex optimization); see the stochastic gradient sketch after this list.
  6. Second-order algorithms: Newton and quasi-Newton methods, interior-point methods, and randomized variants; see the Newton sketch after this list.
  7. Minimax theory and applications: robust optimization, distributionally robust optimization, two-person games, and equilibrium problems.
  8. Applications: neural networks/deep learning, generative adversarial networks (GANs), online learning, Bayesian optimization, and federated learning. (Some applications in this topic may be merged into Topics 3 through 7.)
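
To give a flavor of Topic 3, here is a minimal Python/NumPy sketch of basic gradient descent applied to the least-squares problem min_x (1/2)||Ax - b||^2. The fixed step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the gradient, and the problem data and iteration count are illustrative assumptions, not course material.

    import numpy as np

    def gradient_descent(A, b, num_iters=500):
        """Gradient descent for min_x 0.5 * ||A x - b||^2."""
        x = np.zeros(A.shape[1])
        L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
        for _ in range(num_iters):
            grad = A.T @ (A @ x - b)         # gradient at the current iterate
            x = x - grad / L                 # fixed step size 1/L
        return x

    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((50, 10)), rng.standard_normal(50)
    x = gradient_descent(A, b)
    print(np.linalg.norm(A.T @ (A @ x - b)))  # gradient norm, near 0 at a solution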
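
For Topic 4, the following sketch illustrates a proximal gradient method (ISTA) for the lasso problem min_x (1/2)||Ax - b||^2 + lam*||x||_1, using the fact that the proximal operator of the l1 norm is soft-thresholding. The function names, regularization weight lam, and step size 1/L are illustrative choices.

    import numpy as np

    def soft_threshold(v, t):
        """Proximal operator of t * ||.||_1 (soft-thresholding)."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def ista(A, b, lam=0.1, num_iters=500):
        """Proximal gradient (ISTA) for min_x 0.5*||A x - b||^2 + lam*||x||_1."""
        x = np.zeros(A.shape[1])
        L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth part
        for _ in range(num_iters):
            grad = A.T @ (A @ x - b)         # gradient of the smooth part only
            x = soft_threshold(x - grad / L, lam / L)  # gradient step, then prox
        return x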
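
For Topic 5, here is a sketch of plain stochastic gradient descent on the finite-sum least-squares objective (1/(2n)) * sum_i (a_i^T x - b_i)^2, sampling one term per iteration. The diminishing step-size rule step0/sqrt(k) is one standard illustrative choice among many covered in the course.

    import numpy as np

    def sgd(A, b, num_iters=5000, step0=0.1, seed=0):
        """Stochastic gradient descent for (1/(2n)) * sum_i (a_i^T x - b_i)^2."""
        n, d = A.shape
        x = np.zeros(d)
        rng = np.random.default_rng(seed)
        for k in range(1, num_iters + 1):
            i = rng.integers(n)                    # sample one data point uniformly
            grad_i = (A[i] @ x - b[i]) * A[i]      # stochastic gradient of term i
            x = x - (step0 / np.sqrt(k)) * grad_i  # diminishing step size
        return x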
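
For Topic 6, the following sketch applies Newton's method to l2-regularized logistic regression, min_x (1/n) * sum_i log(1 + exp(-b_i a_i^T x)) + (mu/2)||x||^2 with labels b_i in {-1, +1}. The regularization parameter mu and the use of full (undamped) Newton steps are illustrative assumptions.

    import numpy as np

    def newton_logreg(A, b, mu=1e-2, num_iters=20):
        """Newton's method for (1/n)*sum_i log(1+exp(-b_i a_i^T x)) + (mu/2)*||x||^2."""
        n, d = A.shape
        x = np.zeros(d)
        for _ in range(num_iters):
            p = 1.0 / (1.0 + np.exp(b * (A @ x)))   # p_i = sigmoid(-b_i a_i^T x)
            grad = -(A * (b * p)[:, None]).sum(axis=0) / n + mu * x
            H = (A * (p * (1 - p))[:, None]).T @ A / n + mu * np.eye(d)
            x = x - np.linalg.solve(H, grad)        # full Newton step
        return x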