Lecturer
Francis Bach
INRIA - SIERRA project-team and École Normale Supérieure, Paris
https://www.di.ens.fr/~fbach
Schedule and Place
When: This 15-hour course will take place over 3 days, April 1–3, 2025, with morning and afternoon sessions of 2.5 hours each.
Where: KU Leuven
Auditorium Arenberg (Room 01.07), Kasteelpark Arenberg 1, 3001 Heverlee
Parking is available at the following address: Kapeldreef 74, 3001 Heverlee
The parking code for all three days is: 16244#
This code must be used both to enter and to exit the parking lot
Planned Schedule
- 9:20 - 9:45: Welcome coffee
- 9:45 - 12:35: Session of 2.5 hours (lecture, 20-minute coffee break, and exercise session or additional lecture)
- 13:50 - 16:40: Session of 2.5 hours (lecture, 20-minute coffee break, and exercise session or additional lecture)
Abstract
Data have become ubiquitous in science, engineering, industry, and personal life, creating a need for automated processing. Machine learning, which is concerned with making predictions from training examples, is used in all of these areas, on problems both small and large, with a variety of learning models ranging from simple linear models to deep neural networks. It has now become an important part of the algorithmic toolbox.
How can we make sense of these practical successes? Can we extract a few principles to understand current learning methods and guide the design of new techniques for new applications or to adapt to new computational environments? This is precisely the goal of learning theory and this series of lectures, with a particular eye toward adaptivity to specific structures that make learning faster (such as smoothness of the prediction functions or dependence on low-dimensional subspaces).
Description
The course will be given over 3 consecutive days, with 6 sessions of 2.5 hours each.
- Lecture 1: Learning with Infinite Data (Population Setting)
- Decision theory (loss, risk, optimal predictors)
- Decomposition of excess risk into approximation and estimation errors
- No free lunch theorems
- Basic notions of concentration inequalities (McDiarmid, Hoeffding, Bernstein)
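To give a feel for the concentration tools of Lecture 1, here is a minimal Python sketch (the sample size, threshold, and choice of distribution are illustrative assumptions, not course material) that checks Hoeffding's inequality by Monte Carlo:

```python
import numpy as np

# Hoeffding: for i.i.d. X_i in [0, 1] with mean mu,
# P(|mean - mu| >= t) <= 2 * exp(-2 * n * t**2).
rng = np.random.default_rng(0)
n, t, n_trials = 200, 0.1, 10_000
mu = 0.5  # mean of Uniform[0, 1]

# Deviations of n-sample means from mu, over many independent trials.
deviations = np.abs(rng.uniform(0, 1, size=(n_trials, n)).mean(axis=1) - mu)
empirical = (deviations >= t).mean()
hoeffding = 2 * np.exp(-2 * n * t**2)
print(f"empirical tail: {empirical:.4f}  <=  Hoeffding bound: {hoeffding:.4f}")
```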
- Lecture 2: Linear Least-Squares Regression
- Guarantees in the fixed design setting (simple closed-form analysis)
- Ridge regression: dimension-independent bounds
- Guarantees in the random design setting
- Lower bounds on performance
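To make the closed-form estimator of Lecture 2 concrete, here is a minimal numpy sketch of ridge regression; the 1/n scaling of the objective, the regularization level, and the synthetic data are illustrative assumptions rather than the course's exact conventions:

```python
import numpy as np

# Ridge regression in closed form: with the objective
# (1/n) * ||y - X w||^2 + lam * ||w||^2, the minimizer is
# w = (X^T X + n * lam * I)^{-1} X^T y.
rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

lam = 0.1
w_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
print("parameter error ||w_hat - w_star||:", np.linalg.norm(w_hat - w_star))
```

Using np.linalg.solve rather than forming the matrix inverse explicitly is the standard numerically stable choice.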
- Lecture 3: Empirical Risk Minimization
- Convexification of the risk
- Estimation error: finite number of hypotheses and covering numbers
- Rademacher complexity
- Penalized problems
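As a small companion to Lecture 3, the following sketch estimates the empirical Rademacher complexity of a finite hypothesis class by Monte Carlo and compares it with the sqrt(2 log |F| / n) bound from Massart's finite-class lemma; the hypothesis class used here (random sign predictions) is an arbitrary illustrative choice:

```python
import numpy as np

# Empirical Rademacher complexity of a finite class F:
# R_hat = E_sigma[ max_{f in F} (1/n) * sum_i sigma_i * f(x_i) ].
rng = np.random.default_rng(0)
n, n_hypotheses, n_draws = 50, 20, 5_000

# Row k holds the predictions f_k(x_1), ..., f_k(x_n) of hypothesis k.
preds = rng.choice([-1.0, 1.0], size=(n_hypotheses, n))
sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))  # Rademacher signs

rad = (sigma @ preds.T / n).max(axis=1).mean()
print(f"estimated Rademacher complexity: {rad:.3f}")
print(f"finite-class bound sqrt(2 log M / n): {np.sqrt(2 * np.log(n_hypotheses) / n):.3f}")
```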
- Lecture 4: Optimization for Machine Learning
- Gradient descent
- Stochastic gradient descent
- Generalization bounds through stochastic gradient descent
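As a preview of Lecture 4, this sketch contrasts full-batch gradient descent with single-sample stochastic gradient descent on a least-squares objective; the step sizes, iteration counts, and data are illustrative assumptions:

```python
import numpy as np

# Gradient descent vs. stochastic gradient descent on
# F(w) = (1/2n) * ||X w - y||^2.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

# Full-batch gradient descent with constant step size.
w_gd = np.zeros(d)
for _ in range(200):
    w_gd -= 0.1 * X.T @ (X @ w_gd - y) / n

# SGD: one sample per step, decaying step size proportional to 1/sqrt(t).
w_sgd = np.zeros(d)
for t in range(1, 5_001):
    i = rng.integers(n)
    w_sgd -= (0.1 / np.sqrt(t)) * (X[i] @ w_sgd - y[i]) * X[i]

print(f"GD loss: {loss(w_gd):.4f}   SGD loss: {loss(w_sgd):.4f}")
```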
- Lecture 5: Kernel Methods
- Kernels and representer theorems
- Algorithms
- Universal consistency
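To illustrate the algorithms of Lecture 5, the following sketch implements kernel ridge regression via the representer theorem; the Gaussian kernel, its bandwidth, and the regularization parameter are illustrative choices, not prescriptions from the course:

```python
import numpy as np

# Kernel ridge regression: by the representer theorem, the minimizer of
# (1/n) * sum_i (y_i - f(x_i))^2 + lam * ||f||_H^2 is f = sum_i alpha_i k(., x_i)
# with alpha = (K + n * lam * I)^{-1} y.
rng = np.random.default_rng(0)
n, sigma, lam = 100, 0.5, 1e-3
x = rng.uniform(-1, 1, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

def gaussian_kernel(a, b):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma**2))

K = gaussian_kernel(x, x)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)

# Predictions at a few test points, next to the noiseless target.
x_test = np.linspace(-1, 1, 5)
f_test = gaussian_kernel(x_test, x) @ alpha
print(np.column_stack([x_test, f_test, np.sin(2 * np.pi * x_test)]))
```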
- Lecture 6: Neural Networks
- Single hidden layer neural networks
- Estimation error
- Approximation properties and universality
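Finally, as a toy companion to Lecture 6, this sketch trains a single-hidden-layer ReLU network by full-batch gradient descent on a one-dimensional regression problem; the width, step size, iteration count, and target function are all illustrative assumptions:

```python
import numpy as np

# Single hidden layer: f(x) = sum_j b_j * relu(w_j * x + c_j),
# all parameters trained by gradient descent on the squared loss.
rng = np.random.default_rng(0)
n, m = 200, 64  # samples, hidden units
x = rng.uniform(-1, 1, n)
y = np.abs(x)  # a simple nonsmooth target

w, c = rng.standard_normal(m), rng.standard_normal(m)
b = rng.standard_normal(m) / m

for _ in range(5_000):
    pre = x[:, None] * w + c        # (n, m) pre-activations
    h = np.maximum(pre, 0.0)        # ReLU features
    r = h @ b - y                   # residuals
    mask = (pre > 0).astype(float)  # ReLU derivative
    grad_b = h.T @ r / n
    grad_w = ((r[:, None] * mask * b) * x[:, None]).sum(axis=0) / n
    grad_c = (r[:, None] * mask * b).sum(axis=0) / n
    b -= 0.05 * grad_b
    w -= 0.05 * grad_w
    c -= 0.05 * grad_c

print("training MSE:", np.mean((np.maximum(x[:, None] * w + c, 0) @ b - y) ** 2))
```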
Course Material
The course will be based on the recently published book: Francis Bach, Learning Theory from First Principles, MIT Press, 2024
Available online at https://www.di.ens.fr/~fbach/ltfp_book.pdf
Evaluation
TBD. Students who require an evaluation are invited to approach the lecturer during the first lecture.
Registration
Mandatory. See "Registration to courses" in the left column.