Laboratoire de Probabilités, Statistique et Modélisation (LPSM, UMR 8001)

The LPSM is a joint research unit (UMR 8001) of the CNRS, Sorbonne Université and Université Paris Cité. The laboratory has about 200 members (including roughly 90 permanent staff) across two sites (the Pierre et Marie Curie campus of Sorbonne Université and the Paris Rive Gauche campus of Université Paris Cité).

Research at the LPSM covers a broad spectrum of probability and statistics, from the most fundamental aspects (including stochastic analysis, random geometry, numerical probability and dynamical systems) to applications in modelling across various disciplines (physics, biology, data science, finance, actuarial science, etc.), some of which involve partnerships outside academia.

The LPSM is a relatively recent laboratory. Its components, however, are long-established and grew out of the development of the "mathematics of chance" in central Paris from the first quarter of the 20th century onward (see here for more details).

NB: This site is largely inspired by that of IRIF (thanks to them for making their template available).

Piet Lammers

Piet Lammers has been awarded the 2023-24 Claude-Antoine Peccot prize and lectureship of the Collège de France. Congratulations, Piet!

Dominique Picard

Dominique Picard has been elected an international member of the US National Academy of Sciences. Congratulations, Dominique!


The project submitted to ADEME by the company Califrais, in which the LPSM is a partner, has received five years of funding under the Logistique 4.0 call for proposals of PIA 4, "Stratégie d'accélération, Digitalisation et décarbonation des mobilités".

Francis Comets

Conference "Mathematics of disordered systems: a tribute to Francis Comets", organized by Thierry Bodineau, Bernard Derrida, Giambattista Giacomin and Dasha Loukianova, Paris, 5-7 June 2023.

Institut Universitaire de France

Quentin Berger, Claire Boyer and Max Fathi were appointed to the Institut Universitaire de France in the 2023 campaign. Congratulations to all three!

(These news items are displayed in an order that mixes priority and randomness.)

Statistics seminar
Thursday 28 September 2023, 9:30 am, Jussieu, room 15-25.102
Ruth Heller (Tel-Aviv University) Simultaneous Directional Inference

We consider the problem of inference on the signs of n > 1 parameters. We aim to provide 1 − α post-hoc confidence bounds on the number of positive and negative (or non-positive) parameters. The guarantee is simultaneous, for all subsets of parameters. Our suggestion is as follows: start by using the data to select the direction of the hypothesis test for each parameter; then, adjust the p-values of the one-sided hypotheses for the selection, and use the adjusted p-values for simultaneous inference on the selected n one-sided hypotheses. The adjustment is straightforward assuming that the p-values of one-sided hypotheses have densities with monotone likelihood ratio, and are mutually independent. We show that the bounds we provide are tighter (often by a great margin) than existing alternatives, and that they can be computed in at most polynomial time. We demonstrate the usefulness of our simultaneous post-hoc bounds in the evaluation of treatment effects across studies or subgroups. Specifically, we provide a tight lower bound on the number of studies which are beneficial, as well as on the number of studies which are harmful (or non-beneficial), and in addition conclude on the effect direction of individual studies, while guaranteeing that the probability of at least one wrong inference is at most 0.05.

The relevant paper is arXiv:2301.01653. Joint work with Aldo Solari.

PhD defenses
Thursday 28 September 2023, 3 pm, Salle Paul Lévy, 16-26 209
Yazid Janati (LPSM) Monte Carlo methods for Machine Learning: practical and theoretical contributions for Importance Sampling and sequential methods

PhD defenses
Friday 29 September 2023, 2 pm, room 15-25 102
Ariane Marandon (LPSM) Contributions to reliable machine learning via false discovery rate control

Abstract: The reliability of machine learning (ML) methods is critical in contexts that involve high-stakes decisions. However, while ML methods have achieved impressive results in a wide range of applications, none of them are able to provide a small error guarantee in any situation. Since models cannot be perfect, they should at least "know that they do not know". While there have been many efforts in the literature to address this issue, whether in the field of probability calibration or prediction sets, these solutions are not satisfactory when an actual decision is required. By contrast, a key to keeping the error rate (or risk) below a certain threshold is to make use of a type of abstention option, which amounts to abstaining from making a decision when there is too much uncertainty. The goal of this thesis is to propose new methods for risk control, i.e. for keeping the risk below a certain user-specified threshold α, in several learning tasks: novelty detection, clustering and link prediction. Our general idea is to enhance the best existing ML methods by developing an additional layer on top of them that provides an interpretable guarantee on the error rate. This is achieved by formalizing risk control in a certain task as a type of false discovery rate (FDR) control problem, and by using tools from the multiple testing literature on FDR control. Our methods can be seen as wrappers that take as input an off-the-shelf ML technique, designed for a certain learning task, and return a set of decisions such that the FDR is controlled.

Probability seminar
Tuesday 3 October 2023, 2 pm, Jussieu, Salle Paul Lévy, 16-26 209
Guillaume Baverez (Humboldt University of Berlin) Singular modules and null-vector equations in Liouville conformal field theory

In conformal field theory (CFT), the null-vector (or BPZ) equations are a set of PDEs satisfied by correlation functions (and conformal blocks) involving so-called “degenerate primary fields” (i.e. fields associated to degenerate representations of the Virasoro algebra). These equations are parametrised by a pair of positive integers (r,s) labelling the representation (“Kac table”), and by the topological type of the surface on which the CFT lives. The BPZ operator is a partial differential operator of order rs (the “level”) on the Teichmüller space of the surface, and it annihilates the conformal blocks if the corresponding representation is irreducible. In the probabilistic formulation of Liouville CFT, the BPZ equations have been shown to hold in some special cases (for correlation functions at level 2, on the sphere and to some extent in genus one), and they are key inputs in the proofs of certain exact formulae (e.g. the “DOZZ formula”). In this work, we generalise these results to all values of the parameters, and we make an explicit connection with the algebraic structure of the theory. Namely, we construct the degenerate modules for all (r,s) and show that they are irreducible. Then, we use a geometric characterisation of conformal blocks to translate this local information into a PDE on Teichmüller space. The talk will focus on the probabilistic aspects of this work: I will explain how to construct the modules using elementary properties of the Gaussian free field and Gaussian multiplicative chaos. An interesting feature of this construction is a probabilistic interpretation of the Kac table. Ongoing work with Baojun Wu.

Stochastic Modelling working group
Wednesday 4 October 2023, 2:15 pm, Sophie Germain 1013
Tony Lelièvre (Ecole des Ponts ParisTech) Finding saddle points of energy landscapes: why and how?

The motivation of this presentation comes from the analysis of metastable stochastic processes in statistical physics. One way to bridge the scale between full atomistic models and more coarse-grained descriptions is to use Markov State models parameterized by the Eyring-Kramers formulas. These formulas give the hopping rates between local minima of the potential energy function. They require identifying the local minima and saddle points of the potential energy function. This approach is for example used in materials science (kinetic Monte Carlo models).

In this talk, I will first present a recent result obtained in collaboration with D. Le Peutrec (Université d'Orléans) and B. Nectoux (Université Clermont Auvergne) about the mathematical foundations of this approach, by deriving these Eyring-Kramers exit rates starting from the overdamped Langevin dynamics [1]. I will then introduce a recent algorithm we proposed together with P. Parpas (Imperial College London) in order to locate saddle points [2]. I will explain why these two works both rely on concentration properties of the eigenvectors of Witten Laplacians, in the small temperature regime.

References:
[1] T. Lelièvre, D. Le Peutrec and B. Nectoux, Eyring-Kramers exit rates for the overdamped Langevin dynamics: the case with saddle points on the boundary.
[2] T. Lelièvre and P. Parpas, Using Witten Laplacians to locate index-1 saddle points.

Probability seminar
Tuesday 10 October 2023, 2 pm, Jussieu, Salle Paul Lévy, 16-26 209
Quentin Berger (LPSM, Sorbonne Université) Scaling limits of disordered systems

I will present some recent results on scaling limits of disordered systems and the consequences that can be drawn from them. I will focus mainly on the Poland-Scheraga model, also known as the pinning model, which is used to describe the phenomenon of DNA denaturation: the question is whether (and how) disorder perturbs the denaturation transition. In particular, I will describe results obtained for a generalised (supposedly more realistic) version of the model, in collaboration with Alexandre Legrand.

Statistics seminar
Tuesday 10 October 2023, 9:30 am, Jussieu, room 15-16.201
Paul Escande On the Concentration of the Minimizers of Empirical Risks

Obtaining guarantees on the convergence of the minimizers of empirical risks to the ones of the true risk is a fundamental matter in statistical learning.

Instead of deriving guarantees on the usual estimation error, we will explore concentration inequalities on the distance between the sets of minimizers of the risks. We will argue that for a broad spectrum of estimation problems, there exists a regime where optimal concentration rates can be proven. The bounds will be showcased on a selection of estimation problems such as barycenters on metric spaces with positive or negative curvature, subspaces of covariance matrices, regression problems and entropic-Wasserstein barycenters.

Les probas du vendredi
Friday 13 October 2023, 11 am, Jussieu, Salle Paul Lévy, 16-26 209
Justin Salez (Paris Dauphine) To be announced

Probability seminar
Tuesday 17 October 2023, 2 pm, Jussieu, Salle Paul Lévy, 16-26 209
Roberto Imbuzeiro Oliveira (IMPA) Contact process over dynamical graphs

Les probas du vendredi
Friday 20 October 2023, 11 am, Jussieu, Salle Paul Lévy, 16-26 209
Antoine Mouzard (ENS) To be announced