publications

Research contributions towards a principled understanding of optimization and learning dynamics in machine learning.

Stochastic optimization in deep learning

We study the long-run behavior of stochastic gradient descent (SGD) on non-convex objectives, providing the first characterization of SGD’s invariant measures and global convergence times.

  • What is the long-run distribution of stochastic gradient descent? (ICML 2024, poster)
  • The global convergence time of stochastic gradient descent in non-convex landscapes (ICML 2025, poster)

Talks: Thoth seminar (slides), LPSM Paris (slides), Université Côte d’Azur (slides), Morgan Stanley ML Research (slides), Inria Argo team (slides)
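
A minimal sketch of the setting, in standard notation (the step size γ, objective f, and noise samples ξ are generic placeholders, not the papers' exact assumptions): SGD iterates

    $$x_{t+1} = x_t - \gamma\, \nabla f(x_t;\, \xi_{t+1}),$$

and its long-run distribution is the limiting law of $x_t$ as $t \to \infty$, an invariant measure of this Markov chain; the two papers above quantify this limit and the time to reach global minimizers through large deviations estimates.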

Internal mechanisms of large language models

Probing the robustness of uncertainty quantification methods and the in-context learning abilities of large language models through targeted experiments.

Wasserstein distributionally robust optimization

Regularization schemes and generalization guarantees for Wasserstein DRO models.

  • Regularization for Wasserstein distributionally robust optimization (ESAIM: COCV, 2023)
  • Exact generalization guarantees for (regularized) Wasserstein distributionally robust models (NeurIPS 2023, slides)

Talks: Erice 2022 (slides), FOCM 2023 (poster), NeurIPS@Paris 2023 (slides)
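
For context, a schematic of the Wasserstein DRO problem studied here (standard formulation; the radius ρ, loss ℓ, and empirical distribution are generic placeholders):

    $$\min_{\theta} \; \sup_{Q:\, W(Q, \widehat{P}_n) \le \rho} \; \mathbb{E}_{\xi \sim Q}\big[\ell(\theta; \xi)\big],$$

where $W$ is a Wasserstein distance and $\widehat{P}_n$ the empirical distribution of the training data; the papers above study regularized versions of this problem and their exact generalization guarantees.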

Last-iterate convergence of mirror methods

Determining how Bregman geometry impacts last-iterate guarantees in variational inequalities.

Talks: COLT 2021, ICCOPT 2022 (slides), SMAI MODE 2024 (slides)
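
A minimal reminder of the objects involved (standard definitions, not specific to these works): for a distance-generating function h, the Bregman divergence is

    $$D_h(x, y) = h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle,$$

and mirror methods for a variational inequality with operator V take steps of the form $x_{t+1} = \arg\min_x \{ \gamma \langle V(x_t), x\rangle + D_h(x, x_t) \}$; the question is how the local geometry induced by h shapes the convergence rate of the last iterate $x_t$ itself, rather than of averaged iterates.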

Graph neural networks

Characterizing the expressive power of invariant and equivariant graph neural networks.

  • Expressive power of invariant and equivariant graph neural networks (ICLR 2021)

Smooth game optimization for machine learning

Unified analyses and accelerated methods for differentiable games.

  • A tight and unified analysis of gradient-based methods for a whole spectrum of differentiable games (AISTATS 2020, slides)
  • Accelerating smooth games by manipulating spectral shapes (AISTATS 2020)
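
As a rough sketch of the setting (standard notation; the player losses ℓ_i are placeholders): a differentiable game has n players, each minimizing a smooth loss $\ell_i(x_1, \ldots, x_n)$ over their own variable $x_i$, and gradient-based methods iterate on the concatenated vector field

    $$v(x) = \big(\nabla_{x_1} \ell_1(x), \, \ldots, \, \nabla_{x_n} \ell_n(x)\big),$$

e.g. simultaneous gradient descent $x_{t+1} = x_t - \gamma\, v(x_t)$; the eigenvalues of the Jacobian of v are what these analyses track, and roughly what the "spectral shapes" of the second paper refer to.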

Full bibliography

2025

  1. How does the pretraining distribution shape in-context learning? Task selection, generalization, and robustness
    Waïss Azizian and Ali Hasan
    arXiv: 2510.01163, 2025
  2. The geometries of truth are orthogonal across tasks
    Waïss Azizian, Michael Kirchhof, Eugene Ndiaye, and 4 more authors
    In ICML 2025 Workshop on Reliable and Responsible Foundation Models, 2025
  3. The global convergence time of stochastic gradient descent in non-convex landscapes: sharp estimates via large deviations
    Waïss Azizian, Franck Iutzeler, Jérôme Malick, and 1 more author
    In ICML, 2025
  4. Almost sure convergence of stochastic gradient methods under gradient domination
    Simon Weissmann, Sara Klein, Waïss Azizian, and 1 more author
    Transactions on Machine Learning Research, 2025

2024

  1. The rate of convergence of Bregman proximal methods: local geometry versus regularity versus sharpness
    Waïss Azizian, Franck Iutzeler, Jérôme Malick, and 1 more author
    SIAM Journal on Optimization, 2024
  2. What is the long-run distribution of stochastic gradient descent? A large deviations analysis
    Waïss Azizian, Franck Iutzeler, Jérôme Malick, and 1 more author
    In ICML, 2024
  3. skwdro: a library for Wasserstein distributionally robust machine learning
    Florian Vincent, Waïss Azizian, Franck Iutzeler, and 1 more author
    arXiv: 2410.21231, 2024

2023

  1. Regularization for Wasserstein distributionally robust optimization
    Waïss Azizian, Franck Iutzeler, and Jérôme Malick
    ESAIM: Control, Optimisation and Calculus of Variations, 2023
  2. Exact generalization guarantees for (regularized) Wasserstein distributionally robust models
    Waïss Azizian, Franck Iutzeler, and Jérôme Malick
    In NeurIPS, 2023
  3. Automatic Rao-Blackwellization for sequential Monte Carlo with belief propagation
    Waïss Azizian, Guillaume Baudart, and Marc Lelarge
    In ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling, 2023

2021

  1. Expressive power of invariant and equivariant graph neural networks
    Waïss Azizian and Marc Lelarge
    In ICLR, 2021
  2. The last-iterate convergence rate of optimistic mirror descent in stochastic variational inequalities
    Waïss Azizian, Franck Iutzeler, Jérôme Malick, and 1 more author
    In COLT, 2021

2020

  1. Accelerating smooth games by manipulating spectral shapes
    Waïss Azizian, Damien Scieur, Ioannis Mitliagkas, and 2 more authors
    In AISTATS, 2020
  2. A tight and unified analysis of gradient-based methods for a whole spectrum of differentiable games
    Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, and 1 more author
    In AISTATS, 2020
  3. Linear lower bounds and conditioning of differentiable games
    Adam Ibrahim, Waïss Azizian, Gauthier Gidel, and 1 more author
    In ICML, 2020