Math & Data (MaD) Group

About

The Math and Data (MaD) group at CDS, in collaboration with the Courant Institute of Mathematical Sciences, focuses on building the mathematical and statistical foundations of data science. With expertise spanning signal processing, machine learning, deep learning, and high-dimensional statistics, the group tackles some of the field’s most critical challenges, from understanding neural networks to improving climate modeling.

Launched in 2017 by CDS Professor Joan Bruna, CDS Associate Professor Carlos Fernandez-Granda, and former colleague Afonso Bandeira, the group is known for its influential MaD Seminar, which serves as a hub for in-depth discussion on the theoretical foundations of machine learning and data science. The group’s research aims to make AI systems more interpretable and reliable by uncovering the mathematical principles underlying complex algorithms.

Whether developing mathematical frameworks to expedite climate simulations or exploring the optimal transport problem’s modern applications, MaD researchers balance theoretical rigor with practical impact. Their work, driven by a deep commitment to both pure math and real-world applications, positions them at the forefront of data science research.

Seminar Series

MaD and MIC

The Math and Data (MaD) Seminar Series at CDS serves as a forum to explore the mathematical foundations of data science, bringing together researchers from various disciplines to discuss topics ranging from classical statistics to modern machine learning. Founded in 2016 by CDS Associate Professor Carlos Fernandez-Granda, Professor Joan Bruna, and former NYU Assistant Professor Afonso Bandeira (now at ETH Zurich), the seminar reflects the diverse research interests of the MaD group’s faculty and has grown to become one of the longest-running and most impactful series at CDS.

The MaD Seminar has become a cornerstone of the CDS community, fostering a space where faculty, postdocs, and students can engage with groundbreaking research and cultivate new ideas. Regular speakers include both seasoned experts and promising new voices, making the series a launching pad for rising stars in the field. By bringing together these diverse perspectives, the MaD Seminar is not just a venue for presenting research but a catalyst for collaboration and innovation in data science.

The Mathematics, Information and Computation (MIC) Seminar runs at irregular intervals and covers specific aspects at the interface of applied maths, information theory and theory of computation.

Fall 2024 Seminars

MaD Seminar with Yury Polyanskiy (MIT): Optimal Quantization for Matrix Multiplication

Thursday, Dec 5, 10:30am EST, 7th Floor Open Space, Center for Data Science, NYU, 60 5th Ave

Abstract: The main building block of large language models is matrix multiplication, which is often bottlenecked by the speed of loading these matrices from memory. A possible solution is to trade accuracy for speed by storing the matrices in low precision (“quantizing” them). In recent years, a number of quantization algorithms with increasingly better performance have been proposed (e.g., SmoothQuant, Brain compression, GPTQ, QuIP, QuIP#, QuaRot, SpinQuant). In this work, we prove an information-theoretic lower bound on the achievable accuracy of computing a matrix product as a function of the compression rate (number of bits per matrix entry). We also construct a quantizer (based on nested lattices) achieving this lower bound.

Based on joint work with Or Ordentlich (HUJI), arXiv:2410.13780.
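The accuracy-versus-rate trade-off described in the abstract can be illustrated with a minimal sketch. It uses plain uniform rounding of each entry, which is an assumption made here for illustration and is not the nested-lattice quantizer from the talk. The helper names (quantize, matmul, rel_error) and the 16 x 16 Gaussian test matrices are hypothetical.

```python
import math
import random


def quantize(M, bits):
    """Round every entry of M to one of 2**bits evenly spaced levels in [-amax, amax]."""
    amax = max(abs(x) for row in M for x in row) or 1.0
    step = 2 * amax / (2 ** bits - 1)
    return [[round((x + amax) / step) * step - amax for x in row] for row in M]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def rel_error(A, B, bits):
    """Relative Frobenius error of the product when both factors are quantized."""
    exact = matmul(A, B)
    approx = matmul(quantize(A, bits), quantize(B, bits))
    num = sum((e - a) ** 2 for re, ra in zip(exact, approx) for e, a in zip(re, ra))
    den = sum(e ** 2 for re in exact for e in re)
    return math.sqrt(num / den)


# Illustrative data: two random 16 x 16 Gaussian matrices with a fixed seed.
rng = random.Random(0)
A = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(16)]
B = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(16)]

errors = {bits: rel_error(A, B, bits) for bits in (2, 4, 8)}
for bits, err in errors.items():
    print(f"{bits} bits per entry: relative error {err:.4f}")
```

In this sketch the error of the product shrinks as more bits are spent per entry. The talk asks how small that error can be made at a given rate by any quantizer.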

Bio: Yury Polyanskiy is a Professor of Electrical Engineering and Computer Science and a member of IDSS and LIDS at MIT. Yury received an M.S. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology, Moscow, Russia in 2005 and a Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ in 2010. His research interests are in information theory, statistical learning and error-correcting codes.

Dr. Polyanskiy was elected an IEEE Fellow (2024) and won the 2020 IEEE Information Theory Society James Massey Award, the 2013 NSF CAREER Award, and the 2011 IEEE Information Theory Society Paper Award.

MaD Seminar with Renyuan Xu (NYU): Generative diffusion models: optimization, generalization and fine-tuning

November 7, 2:00pm EST, Auditorium Hall 150, Center for Data Science, NYU, 60 5th Ave

Abstract: Recently, generative diffusion models have outperformed previous architectures, such as GANs, in generating high-quality synthetic data, setting a new standard for generative AI. A key component of these models is learning the associated Stein’s score function. Though diffusion models have demonstrated practical success, their theoretical foundations are far from mature, especially regarding whether gradient-based algorithms can provably learn the score function. In this talk, I will present a suite of non-asymptotic theory aimed at understanding the data generation process in diffusion models and the accuracy of score estimation. Our analysis addresses both the optimization and generalization aspects of the learning process, establishing a novel connection to supervised learning and neural tangent kernels.

Building on these theoretical insights, another key challenge arises when fine-tuning pre-trained diffusion models for specific tasks or datasets to improve performance. Fine-tuning requires refining the generated outputs based on particular conditions or human preferences while leveraging prior knowledge from the pre-trained model. In the second part of the talk, we formulate this fine-tuning as a stochastic control problem, establishing its well-definedness through the Dynamic Programming Principle and proving convergence for an iterative Bellman scheme.

This talk is based on joint works with Yinbin Han (NYU) and Meisam Razaviyayn (USC).

Bio: Renyuan Xu is an assistant professor in the Department of Finance and Risk Engineering at New York University. Previously, she was an assistant professor at the University of Southern California and a Hooke Research Fellow at the University of Oxford. She completed her Ph.D. at UC Berkeley in 2019. Her research interests include stochastic analysis, stochastic control and games, machine learning theory, and mathematical finance. She received an NSF CAREER Award in 2024, the SIAM Financial Mathematics and Engineering Early Career Award in 2023, and a JP Morgan AI Faculty Research Award in 2022.

MaD Seminar with Nikita Zhivotovskiy (UC Berkeley): Mean and covariance estimation of anisotropic distributions in the presence of adversarial outliers

October 10, 2:00pm EST, Auditorium Hall 150, Center for Data Science, NYU, 60 5th Ave

Abstract: Suppose we observe a sample of independent random vectors with an unknown general covariance structure, knowing that the original distribution was contaminated, so that a fraction of the observations came from a different distribution. How can we estimate the mean and the covariance matrix of the original distribution in this case? In this talk, we discuss some recent estimators that achieve the optimal non-asymptotic, dimension-free rate of convergence under the model where the adversary can corrupt a fraction of the samples arbitrarily. The discussion will cover a wide range of distributions including heavy-tailed, sub-Gaussian, and specifically Gaussian distributions.

Bio: Nikita Zhivotovskiy is an Assistant Professor in the Department of Statistics at the University of California, Berkeley. He previously held postdoctoral positions at ETH Zürich in the Department of Mathematics, hosted by Afonso Bandeira, and at Google Research, Zürich, hosted by Olivier Bousquet. He also spent time at the Technion I.I.T. mathematics department, hosted by Shahar Mendelson. Nikita completed his thesis at the Moscow Institute of Physics and Technology under the guidance of Vladimir Spokoiny and Konstantin Vorontsov.

MIC Seminar with Anya Katsevich (MIT): Laplace asymptotics in high-dimensional Bayesian inference

September 30, 12:00pm EST, Room 650, Center for Data Science, NYU, 60 5th Ave

Abstract: Computing integrals against a high-dimensional posterior distribution is the major computational bottleneck in Bayesian inference. A popular technique to make this computation cheaper is to use the Laplace approximation (LA), a Gaussian distribution, in place of the true posterior. Yet the accuracy of this approximation is not fully understood in high dimensions. We derive a new, leading order asymptotic decomposition of the LA error in high dimensions. This leads to lower bounds which resolve the question of the dimension dependence of the LA. It also leads to a simple modification to the LA which yields a higher-order accurate posterior approximation. Finally, we derive the high-dimensional analogue of the classical asymptotic expansion of Laplace-type integrals. This opens the door to approximating the partition function (aka the posterior normalizing constant), of use in high-dimensional model selection and many other applications beyond statistics.
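As a one-dimensional illustration of the classical Laplace-type expansion mentioned in the abstract (not the high-dimensional results of the talk), the sketch below approximates the normalizing constant n! = ∫ x^n e^(-x) dx by a Gaussian integral around the mode, which recovers Stirling's formula. The function name laplace_log_integral and the choice n = 10 are illustrative assumptions.

```python
import math


def laplace_log_integral(h, h2, x_star):
    """Laplace estimate of log of the integral of exp(-h(x)) over x.

    x_star is the minimizer of h and h2 is the second derivative of h.
    """
    return -h(x_star) + 0.5 * math.log(2 * math.pi / h2(x_star))


n = 10


def h(x):
    # exp(-h(x)) = x**n * exp(-x), whose integral over x > 0 equals n!
    return x - n * math.log(x)


def h2(x):
    # Second derivative of h.
    return n / x ** 2


approx = laplace_log_integral(h, h2, x_star=n)  # h is minimized at x = n
exact = math.lgamma(n + 1)                      # log(n!)
ratio = math.exp(exact - approx)
print(f"exact / Laplace estimate = {ratio:.6f}")
```

The ratio is close to 1 + 1/(12n), the first correction term of the classical expansion. The talk concerns how such corrections behave when the dimension grows.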

MaD Seminar with Cyril Letrouit (Orsay): Stability of optimal transport: old and new

September 19, 2:00pm EST, Auditorium Hall 150, Center for Data Science, NYU, 60 5th Ave

Abstract: Optimal transport consists in sending a given source probability measure to a given target probability measure, in a way which is optimal with respect to some cost. On bounded subsets of R^d, if the cost is given by the squared Euclidean distance and the source measure is absolutely continuous, a unique optimal transport map exists.

The question we will discuss is the following: how does this optimal transport map change if we perturb the target measure? For instance, if instead of the target measure we only have access to samples of it, how much does the optimal transport map change? This question, motivated by numerical aspects of optimal transport, has started to receive partial answers only recently, under quite restrictive assumptions on the source measure. We will review these answers and show how to handle much more general cases.

This is joint work with Quentin Mérigot.

MaD Seminar with Joel A. Tropp (Caltech): Randomly pivoted Cholesky

September 12, 2:00pm EST, Auditorium Hall 150, Center for Data Science, NYU, 60 5th Ave

Abstract: André-Louis Cholesky entered École Polytechnique as a student in 1895. Before 1910, during his work as a surveyor for the French army, Cholesky invented a technique for solving positive-definite systems of linear equations. Cholesky’s method can also be used to approximate a positive-semidefinite (psd) matrix using a small number of columns, called “pivots”. A longstanding question is how to choose the pivot columns to achieve the best possible approximation.

This talk describes a simple but powerful randomized procedure for adaptively picking the pivot columns. This algorithm, randomly pivoted Cholesky (RPC), provably achieves near-optimal approximation guarantees. Moreover, in experiments, RPC matches or improves on the performance of alternative algorithms for low-rank psd approximation.

Cholesky died in 1918 from wounds suffered in battle. In 1924, Cholesky’s colleague, Commandant Benoit, published his manuscript. One century later, a modern adaptation of Cholesky’s method still yields state-of-the-art performance for problems in scientific machine learning.

Joint work (“Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations”) with Yifan Chen, Ethan Epperly, and Rob Webber.
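A minimal sketch of the idea in the abstract, assuming the usual formulation in which each pivot is sampled with probability proportional to the diagonal of the current residual matrix. It is written in plain Python for small dense matrices and is not the authors' implementation. The function name rp_cholesky and the 4 x 4 test matrix are illustrative assumptions.

```python
import math
import random


def rp_cholesky(A, k, rng):
    """Return (F, pivots) with A approximately equal to F times F transposed.

    A is a psd matrix given as a list of rows, k is the number of pivots.
    """
    n = len(A)
    F = [[0.0] * k for _ in range(n)]
    d = [A[i][i] for i in range(n)]  # diagonal of the current residual
    pivots = []
    for t in range(k):
        if sum(d) <= 0.0:
            break  # residual is already zero
        # Sample a pivot with probability proportional to the residual diagonal.
        s = rng.choices(range(n), weights=d)[0]
        # Column s of the residual A - F F^T.
        g = [A[i][s] - sum(F[i][j] * F[s][j] for j in range(t)) for i in range(n)]
        if g[s] <= 0.0:
            break  # guard against rounding in a numerically zero pivot
        pivots.append(s)
        scale = math.sqrt(g[s])
        for i in range(n):
            F[i][t] = g[i] / scale
            d[i] = max(d[i] - F[i][t] ** 2, 0.0)  # clip rounding noise at zero
    return F, pivots


# Illustrative rank-2 psd matrix A = B B^T.
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
A = [[sum(x * y for x, y in zip(r, c)) for c in B] for r in B]

F, pivots = rp_cholesky(A, 2, random.Random(0))
err = max(
    abs(A[i][j] - sum(F[i][t] * F[j][t] for t in range(2)))
    for i in range(4)
    for j in range(4)
)
print("pivots:", pivots, "max entrywise error:", err)
```

Each step reads only the diagonal of A and one column per pivot, which is why few entry evaluations are needed. Because the test matrix has rank 2, two pivots reproduce it up to rounding.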

People

Core Faculty

Jonathan Niles-Weed

Deputy Director & Associate Professor of Mathematics and Data Science

Research Areas: High-dimensional statistics, optimal transport, information theory

Qi Lei

Assistant Professor of Mathematics and Data Science

Julia Kempe

Silver Professor of Computer Science, Mathematics, and Data Science

Research Areas: Machine Learning, AI for science, Mathematics of Data Science, Foundations of Machin...

Yanjun Han

Assistant Professor of Mathematics and Data Science

Research Areas: Statistics; Online learning and bandits, information theory, machine learning

Carlos Fernandez-Granda

CDS Interim Director & Associate Professor of Mathematics and Data Science

Joan Bruna

Professor of Computer Science and Data Science

Affiliated Faculty

Gérard Ben-Arous

Professor of Mathematics, Courant Institute

Xi Chen

Assistant Professor of Information, Operations and Management Sciences, Stern

Sinan Gunturk

Professor of Mathematics, Courant Institute

Eyal Lubetzky

Professor of Mathematics, Courant Institute

Eero Simoncelli

Professor of Neural Science, Mathematics, Data Science, and Psychology

Esteban Tabak

Professor of Mathematics, Courant Institute

Eric Vanden-Eijnden

Professor of Mathematics, Courant Institute

PhD / MSc Students

Postdocs and Fellows

Sponsors

We are thankful for the generous financial support of the following institutions:

  • National Science Foundation
  • National Institutes of Health
  • Alfred P. Sloan Foundation
  • ARO
  • VESRI Schmidt Futures
  • Samsung Research
  • Samsung Electronics
  • Capital One