Seminar


Day, time and place

Tuesdays at 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201


Contact(s)

To add the talks calendar to your agenda, subscribe to it using this link.



Year 2023

Statistics seminar
Tuesday May 30, 2023, 9:30AM, Jussieu, room 15-16.201
Michael Arbel (INRIA) Non-Convex Bilevel Games with Critical Point Selection Maps

Bilevel optimization problems involve two nested objectives, where an upper-level objective depends on a solution to a lower-level problem. When the latter is non-convex, multiple critical points may be present, leading to an ambiguous definition of the problem. In this paper, we introduce a key ingredient for resolving this ambiguity through the concept of a selection map, which allows one to choose a particular solution to the lower-level problem. Using such maps, we define a class of hierarchical games between two agents that resolve the ambiguity in bilevel problems. This new class of games requires introducing new analytical tools in Morse theory to characterize their evolution. In particular, we study the differentiability of the selection, an essential property when analyzing gradient-based algorithms for solving these games. We show that many existing algorithms for bilevel optimization, such as unrolled optimization, solve these games up to approximation errors due to finite computational power. Our analysis allows us to introduce a simple correction to these algorithms that removes these errors.
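
A minimal numerical sketch of the unrolled-optimization idea mentioned above, on a made-up one-dimensional bilevel problem (the objectives, step sizes and horizon are all invented for illustration; the paper's selection-map machinery is not reproduced here):

```python
import numpy as np

# Toy bilevel problem (illustration only, not the paper's setup):
#   lower level: y*(x) = argmin_y g(x, y),  g(x, y) = 0.5 * (y - x)**2
#   upper level: min_x f(x, y*(x)),         f(x, y) = 0.5 * (y - 1)**2 + 0.1 * x**2
# Unrolled optimization runs K gradient steps on g and differentiates through them.

def unrolled_hypergradient(x, K=50, eta=0.5):
    y, dy_dx = 0.0, 0.0            # lower-level iterate and its derivative w.r.t. x
    for _ in range(K):
        # y_{k+1} = y_k - eta * (y_k - x)  =>  dy/dx follows the same linear recursion
        y = y - eta * (y - x)
        dy_dx = (1 - eta) * dy_dx + eta
    # chain rule for the upper objective f(x, y_K(x))
    df_dy = y - 1.0
    df_dx = 0.2 * x
    return df_dx + df_dy * dy_dx, y

x = 3.0
for _ in range(200):               # gradient descent on the unrolled upper objective
    g, y = unrolled_hypergradient(x)
    x -= 0.1 * g
print(f"x = {x:.3f}, y_K = {y:.3f}")   # both approach 1/1.2 = 0.833, the true minimizer
```

Here the derivative of the unrolled iterate with respect to the upper variable is tracked by hand through the recursion, which is exactly what automatic differentiation does in unrolled optimization.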

Statistics seminar
Thursday May 25, 2023, 9:30AM, Jussieu, room 15-16.201
Jeffrey Näf (INRIA Montpellier) Distributional Random Forest: Heterogeneity Adjustment and Multivariate Distributional Regression

Random Forest is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data, which can also be used for targets other than the original mean estimation. We propose a novel forest construction for multivariate responses based on their joint conditional distribution, called the Distributional Random Forest (DRF). It uses a new splitting criterion based on the MMD distributional metric, which is suitable for detecting heterogeneity in multivariate distributions. The induced weights define an estimate of the full conditional distribution, which in turn can be used for arbitrary and potentially complicated targets of interest.
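
A small sketch of the forest-weighting idea, using a standard regression forest from scikit-learn as a stand-in for DRF's MMD-based splitting (the data, hyperparameters and the univariate response are invented; only the weighting logic mirrors the abstract):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 3))
Y = np.sin(3 * X[:, 0]) + 0.3 * rng.standard_normal(2000)   # univariate here for brevity

forest = RandomForestRegressor(n_estimators=200, min_samples_leaf=20).fit(X, Y)

def forest_weights(x):
    """w_i(x): average over trees of 1{i in leaf(x)} / |leaf(x)|."""
    train_leaves = forest.apply(X)            # (n, n_trees) leaf ids of training points
    query_leaves = forest.apply(x[None, :])   # (1, n_trees) leaf ids of the query point
    same = train_leaves == query_leaves       # broadcast across trees
    return (same / same.sum(axis=0)).mean(axis=1)   # nonnegative, sums to 1

w = forest_weights(np.array([0.5, 0.0, 0.0]))
# The weights define an estimate of the full conditional law: any target is a
# weighted functional, e.g. a conditional quantile instead of the conditional mean.
order = np.argsort(Y)
cdf = np.cumsum(w[order])
q10 = Y[order][np.searchsorted(cdf, 0.10)]    # weighted 10% conditional quantile
print(q10)
```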

Statistics seminar
Tuesday May 23, 2023, 9:30AM, Jussieu, room 15-16.201
Evguenii Chzhen (Orsay) Demographic parity constraint for algorithmic fairness: a statistical perspective

In this talk I will give a brief introduction to the recently emerged field of algorithmic fairness and advocate for a statistical study of the problem. To support my claims, I will focus on the Demographic Parity fairness constraint, describing various connections to classical statistical theory, optimal transport, and conformal prediction literature. In particular, I will present the form of an optimal prediction function under this constraint in both regression and classification. Then, I will describe a learning procedure which is supported by statistical guarantees under no or mild assumptions on the underlying data distribution.

This talk is based on a sequence of joint works with Ch. Denis, S. Gaucher, M. Hebiri, L. Oneto, M. Pontil, and N. Schreuder.
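
For the regression case, the optimal prediction function described in these works takes a quantile-averaging (Wasserstein-barycenter) form. A hedged sketch with invented group score distributions and proportions:

```python
import numpy as np

# Sketch of the barycenter form of the optimal prediction under Demographic
# Parity in regression: push each group's score through its own CDF, then through
# the barycenter quantile function, i.e. the probability-weighted average of the
# group quantile functions (toy data; not the full learning procedure of the talk).
def fair_transform(score, s, scores_by_group, weights):
    u = (scores_by_group[s] <= score).mean()   # empirical CDF within group s
    return sum(w * np.quantile(sc, u) for sc, w in zip(scores_by_group, weights))

rng = np.random.default_rng(1)
scores_by_group = [rng.normal(0.0, 1.0, 5000), rng.normal(1.0, 2.0, 5000)]  # unfair base scores
weights = [0.5, 0.5]                                                        # group proportions
print(fair_transform(1.0, s=0, scores_by_group=scores_by_group, weights=weights))
print(fair_transform(1.0, s=1, scores_by_group=scores_by_group, weights=weights))
```

By construction the transformed score has the same distribution in both groups, which is exactly the Demographic Parity requirement.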

Statistics seminar
Tuesday May 9, 2023, 9:30AM, Jussieu, room 15-16.201
Charlotte Dion-Blanc (Sorbonne Université) Multi-class classification for trajectories arising from diffusion processes

In this talk I will present the multi-class classification problem when the data are assumed to come from a stochastic differential equation model, different for each class, and observed over a short time horizon. I will focus in particular on the case where the classes are discriminated by the drift coefficient of the equation. We will see convergence rates for a plug-in type classifier based on nonparametric estimators of the unknown coefficients.
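
A rough sketch of a plug-in classifier of this type, under simplifying assumptions made here for illustration (unit diffusion coefficient, Nadaraya-Watson drift estimators, and a Girsanov-type discriminant score):

```python
import numpy as np

def nw_drift(x, paths, dt, h=0.2):
    """Nadaraya-Watson regression of normalized increments on positions."""
    X = paths[:, :-1].ravel()                     # positions
    dX = np.diff(paths, axis=1).ravel() / dt      # increments/dt ~ drift + noise
    K = np.exp(-0.5 * ((x[:, None] - X[None, :]) / h) ** 2)
    return (K * dX).sum(axis=1) / np.maximum(K.sum(axis=1), 1e-12)

def classify(path, training_paths_by_class, dt):
    # plug-in discriminant: sum_t b_hat(X_t) dX_t - 0.5 * sum_t b_hat(X_t)^2 dt
    scores = []
    for paths in training_paths_by_class:
        b = nw_drift(path[:-1], paths, dt)
        scores.append(np.sum(b * np.diff(path)) - 0.5 * np.sum(b ** 2) * dt)
    return int(np.argmax(scores))

def simulate(b, m, L, dt, rng):                   # Euler scheme, unit diffusion
    x = np.zeros((m, L))
    for t in range(L - 1):
        x[:, t + 1] = x[:, t] + b(x[:, t]) * dt + np.sqrt(dt) * rng.standard_normal(m)
    return x

rng = np.random.default_rng(0)
dt = 0.01
train = [simulate(lambda x: -2 * x, 50, 100, dt, rng),     # class 0: mean-reverting drift
         simulate(lambda x: 1 + 0 * x, 50, 100, dt, rng)]  # class 1: constant upward drift
test = simulate(lambda x: 1 + 0 * x, 1, 100, dt, rng)[0]
print(classify(test, train, dt))                           # expected: 1
```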

Statistics seminar
Tuesday April 11, 2023, 9:30AM, Sophie Germain, room 1013
Tabea Rebafka (Sorbonne Université) Model-based graph clustering with an application to ecological networks

We consider the problem of clustering multiple networks into groups of networks with similar topology. We propose a statistical approach based on a finite mixture of stochastic block models. A clustering is obtained by maximizing the integrated classification likelihood criterion. This is done by a hierarchical agglomerative algorithm that starts from singleton clusters and successively merges clusters of networks. As such, a sequence of nested clusterings is computed, which can be represented by a dendrogram providing valuable insights on the data. We present results of our method obtained for a collection of food webs in ecology. We illustrate that the method provides relevant clusterings and that the estimated model parameters are highly interpretable and useful in practice.
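
A skeleton of the agglomerative search, with a crude stand-in criterion (within-cluster variance of edge densities) in place of the actual integrated classification likelihood, which is model-specific:

```python
import numpy as np

def crit(clustering, networks):
    # stand-in for the ICL criterion: minus within-cluster variance of edge densities
    dens = np.array([net.mean() for net in networks])
    return -sum(len(c) * np.var(dens[c]) for c in clustering)

def agglomerate(networks):
    clustering = [[i] for i in range(len(networks))]   # start from singleton clusters
    history = [clustering]
    while len(clustering) > 1:
        candidates = [
            [c for k, c in enumerate(clustering) if k not in (a, b)]
            + [clustering[a] + clustering[b]]
            for a in range(len(clustering)) for b in range(a + 1, len(clustering))
        ]
        clustering = max(candidates, key=lambda cl: crit(cl, networks))  # best merge
        history.append(clustering)          # nested clusterings -> a dendrogram
    return history

rng = np.random.default_rng(0)
networks = [rng.random((20, 20)) < p for p in [0.10, 0.12, 0.50, 0.55]]  # two topology groups
print(agglomerate(networks)[2])             # 2-cluster level: recovers {0,1} and {2,3}
```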

Statistics seminar
Tuesday March 28, 2023, 9:30AM, Jussieu, room 15-16.201
David Rossell (Universitat Pompeu Fabra) Statistical inference with external information: high-dimensional data integration

Statistical inference when there are many parameters is a well-studied problem. For example, there are fundamental limits in what types of signals one may learn from data, e.g. given by minimal sample sizes, signal strengths or sparsity conditions. There are many applied problems, however, where, besides the data directly being analyzed, one has access to external data that one intuitively thinks may help improve inference. Examples include data integration and high-dimensional causal inference methods, where formally incorporating external information is the default and has shown significant practical benefits. We will discuss some of these situations, showcasing the use of graphical models related to COVID-19 evolution and causal inference methods for gender salary gaps, and provide a theoretical analysis in a simplified Gaussian sequence model setting. The latter shows that, by integrating external information, one may push the theoretical limits of what's possible to learn from data, providing a theoretical justification for this popular applied practice. We will also discuss some practical modelling and computational considerations in formulating Bayesian data analysis methods that provide informative and accurate inference, yet remain computationally tractable.

Statistics seminar
Tuesday March 21, 2023, 9:30AM, Sophie Germain, room 1013
Cécile Durot (Université Paris Nanterre) To be announced.

Statistics seminar
Thursday March 9, 2023, 9:30AM, Jussieu, room 16-26.209
Pierre Wolinski (INRIA) Gaussian Pre-Activations in Neural Networks: Myth or Reality?

The study of feature propagation at initialization in neural networks lies at the root of numerous initialization designs. An assumption very commonly made in the field states that the pre-activations are Gaussian. Although this convenient Gaussian hypothesis can be justified when the number of neurons per layer tends to infinity, it is challenged by both theoretical and experimental works for finite-width neural networks. Our major contribution is to construct a family of pairs of activation functions and initialization distributions that ensure that the pre-activations remain Gaussian throughout the network's depth, even in narrow neural networks. In the process, we discover a set of constraints that a neural network should fulfill to ensure Gaussian pre-activations. Additionally, we provide a critical review of the claims of the Edge of Chaos line of works and build an exact Edge of Chaos analysis. We also propose a unified view on pre-activations propagation, encompassing the framework of several well-known initialization procedures. Finally, our work provides a principled framework for answering the much-debated question: is it desirable to initialize the training of a neural network whose pre-activations are ensured to be Gaussian?
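
A quick empirical experiment in the spirit of the abstract: propagate data through a random narrow tanh network and measure how far one neuron's pre-activations are from Gaussian at each layer (width, depth and scaling are invented; the paper constructs activation/initialization pairs that keep this distance at zero):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
width, depth, n = 8, 20, 5000            # narrow network, where the Gaussian hypothesis is challenged
x = rng.standard_normal((n, width))
for layer in range(depth):
    W = rng.standard_normal((width, width)) / np.sqrt(width)   # standard 1/sqrt(width) scaling
    pre = x @ W                          # pre-activations of this layer
    x = np.tanh(pre)
    # Kolmogorov-Smirnov distance of one neuron's standardized pre-activations to N(0,1)
    z = (pre[:, 0] - pre[:, 0].mean()) / pre[:, 0].std()
    print(layer, round(stats.kstest(z, "norm").statistic, 3))
```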

Statistics seminar
Thursday February 9, 2023, 9:30AM, Sophie Germain, room 1016
Vincent Divol (CEREMADE) Estimation of optimal transport maps in general function spaces

We consider the problem of estimating an optimal transport map between a (fixed) source distribution P and an unknown target distribution Q, based on a sample drawn from Q. Such a problem has recently gained popularity with new applications in machine learning, such as generative modeling. Until now, estimation rates were known only in a small number of cases (for example, when P and Q have densities bounded from above and below and the transport map belongs to a Hölder space), which are rarely satisfied in practice. We present a methodology for deriving estimation rates for the optimal transport map under general assumptions, based on optimizing the dual formulation of the empirical transport problem. As an example, we give convergence rates in the case where P is Gaussian and the transport map is given by a two-layer neural network with an arbitrarily large number of neurons. Joint work with Aram-Alexandre Pooladian and Jonathan Niles-Weed.

Statistics seminar
Tuesday January 24, 2023, 9:30AM, Jussieu, room 15-16.201
Laure Sansonnet (INRAE MIA Paris-Saclay) Variable selection in multivariate (generalized) linear models with dependence

In this talk, we are interested in the variable selection problem in two modeling frameworks: (i) a multivariate linear model accounting for the dependence that may exist between the responses, and (ii) a multivariate GLARMA model. In the first part, we will present a variable selection procedure for multivariate linear models that takes into account the dependence that may exist between the responses. It consists in first estimating the covariance matrix of the responses, which must satisfy certain assumptions, and then using this estimator in a Lasso criterion to obtain a sparse estimator of the coefficient matrix. The good performance of this method, called MultiVarSel, will be illustrated theoretically and numerically. In the second part, after introducing multivariate GLARMA (Generalized Linear Autoregressive Moving Average) models, which can model discrete-valued time series, we will propose a new and efficient variable selection approach for these models. It consists in iteratively combining two steps: estimating the ARMA coefficients, and selecting variables among the coefficients of the GLM part with regularized methods. The good performance of this approach, called MultiGlarmaVarSel, will be illustrated on synthetic data and on RNA-Seq data about seed germination (in collaboration with Christophe Bailly and Loïc Rajjou).

The first part is joint work with Julien Chiquet, Céline Lévy-Leduc and Marie Perrot-Dockès, and the second part is joint work with Marina Gomtsyan, Céline Lévy-Leduc and Sarah Ouadah.
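
A rough numerical sketch of the whitening-then-Lasso idea behind MultiVarSel as described above (dimensions, the covariance estimator and the penalty level are invented for the example):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, q = 100, 10, 5
X = rng.standard_normal((n, p))
B_true = np.zeros((p, q))
B_true[0, 0] = B_true[2, 3] = 2.0
Sigma = 0.5 * np.eye(q) + 0.5                        # dependent responses
Y = X @ B_true + rng.multivariate_normal(np.zeros(q), Sigma, size=n)

resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
Sigma_hat = resid.T @ resid / n                      # plug-in covariance of the responses
L = np.linalg.cholesky(np.linalg.inv(Sigma_hat))     # L L^T = Sigma_hat^{-1}: Y L has whitened noise
# vec(X B L) = (L^T kron X) vec(B), so run a Lasso on the vectorized whitened model
design = np.kron(L.T, X)
target = (Y @ L).flatten(order="F")
B_hat = Lasso(alpha=0.1).fit(design, target).coef_.reshape((p, q), order="F")
print(np.nonzero(np.abs(B_hat) > 0.5))               # recovers the active entries (0,0) and (2,3)
```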


Year 2022

Statistics seminar
Tuesday December 6, 2022, 9:30AM, Jussieu, room 15-16.201
Vianney Perchet (ENSAE) An algorithmic solution to the Blotto game using multi-marginal couplings

We describe an efficient algorithm to compute solutions for the general two-player Blotto game on n battlefields with heterogeneous values. While explicit constructions for such solutions have been limited to specific, largely symmetric or homogeneous, setups, this algorithmic resolution covers the most general situation to date: value-asymmetric game with asymmetric budget. The proposed algorithm rests on recent theoretical advances regarding Sinkhorn iterations for matrix and tensor scaling. An important case which had been out of reach of previous attempts is that of heterogeneous but symmetric battlefield values with asymmetric budget. In this case, the Blotto game is constant-sum so optimal solutions exist, and our algorithm samples from an $\varepsilon$-optimal solution in time $\tilde{O}(n^2 + \varepsilon^{-4})$, independently of budgets and battlefield values. In the case of asymmetric values where optimal solutions need not exist but Nash equilibria do, our algorithm samples from an $\varepsilon$-Nash equilibrium with similar complexity but where implicit constants depend on various parameters of the game such as battlefield values.
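
The basic primitive behind the tensor-scaling iterations is Sinkhorn's matrix scaling; a minimal sketch (cost matrix and marginals invented; this is not the full multi-marginal Blotto construction):

```python
import numpy as np

def sinkhorn(C, r, c, eps=0.05, iters=500):
    """Find P = diag(u) K diag(v), K = exp(-C/eps), with row sums r and column sums c."""
    K = np.exp(-C / eps)
    u = np.ones_like(r)
    for _ in range(iters):
        v = c / (K.T @ u)        # match column marginals
        u = r / (K @ v)          # match row marginals
    return u[:, None] * K * v[None, :]

C = np.abs(np.subtract.outer(np.linspace(0, 1, 5), np.linspace(0, 1, 5)))
P = sinkhorn(C, r=np.full(5, 0.2), c=np.full(5, 0.2))
print(P.sum(axis=0), P.sum(axis=1))   # both approximately equal the prescribed marginals
```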

Statistics seminar
Tuesday November 22, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Morgane Austern (Harvard University) To split or not to split, that is the question: from cross-validation to debiased machine learning.

Data splitting is a ubiquitous method in statistics, with examples ranging from cross-validation to cross-fitting. However, despite its prevalence, theoretical guidance regarding its use is still lacking. In this talk we will explore two examples and establish an asymptotic theory for it. In the first part of this talk, we study the cross-validation method, a ubiquitous method for risk estimation, and establish its asymptotic properties for a large class of models and with an arbitrary number of folds. Under stability conditions, we establish a central limit theorem and Berry-Esseen bounds for the cross-validated risk, which enable us to compute asymptotically accurate confidence intervals. Using our results, we study the statistical speed-up offered by cross-validation compared to a train-test split procedure. We reveal some surprising behavior of the cross-validated risk and establish the statistically optimal choice for the number of folds. In the second part of this talk, we study the role of cross-fitting in the generalized method of moments with moments that also depend on some auxiliary functions. Recent lines of work show how one can use generic machine learning estimators for these auxiliary problems, while maintaining asymptotic normality and root-n consistency of the target parameter of interest. The literature typically requires that these auxiliary problems are fitted on a separate sample or in a cross-fitting manner. We show that when these auxiliary estimation algorithms satisfy natural leave-one-out stability properties, then sample splitting is not required. This allows for sample re-use, which can be beneficial in moderately sized sample regimes.
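
A toy illustration of the first part: pooling the held-out losses of a K-fold cross-validation and forming a normal-approximation confidence interval for the risk (the talk's results make precise when such approximations are valid and how the number of folds matters; the model and data here are invented):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
y = X[:, 0] + rng.standard_normal(500)

losses = []
for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = Ridge().fit(X[train], y[train])
    losses.extend((y[test] - model.predict(X[test])) ** 2)   # per-point held-out loss

losses = np.asarray(losses)
risk, se = losses.mean(), losses.std(ddof=1) / np.sqrt(len(losses))
print(f"CV risk = {risk:.3f} +/- {1.96 * se:.3f}")           # CLT-style 95% interval
```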

Statistics seminar
Tuesday November 8, 2022, 9:30AM, Jussieu, room 15-16.201
Arshak Minasyan (CREST-ENSAE) All-In-One Robust Estimator of sub-Gaussian Mean

We propose a robust-to-outliers estimator of the mean of a multivariate Gaussian distribution that enjoys the following properties: polynomial computational complexity, high breakdown point, orthogonal and geometric invariance, minimax rate optimality (up to a logarithmic factor) and asymptotic efficiency. The non-asymptotic risk bound for the expected error of the proposed estimator is dimension-free and involves only the effective rank of the covariance matrix. Moreover, we show that the obtained results also hold with high probability and can be extended to the cases of unknown rate of contamination or unknown covariance matrix. In the end, I will also discuss the topic of sparse robust mean estimation in the same framework of adversarial contamination.
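
For context, a classical robust baseline, the geometric median computed with Weiszfeld iterations; this is not the estimator of the talk (which additionally achieves the rate-optimality and efficiency properties above), just a demonstration of robustness to contamination on invented data:

```python
import numpy as np

def geometric_median(X, iters=100, tol=1e-9):
    """Weiszfeld iterations: iteratively reweighted mean with weights 1/distance."""
    mu = X.mean(axis=0)
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(X - mu, axis=1), tol)
        w = 1.0 / d
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:20] += 50.0                                # 10% gross outliers
print(np.linalg.norm(X.mean(axis=0)))         # empirical mean is ruined by contamination
print(np.linalg.norm(geometric_median(X)))    # geometric median stays near the true mean 0
```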

Statistics seminar
Thursday October 20, 2022, 11AM, Jussieu, room 15-16.201
Misha Belkin (University of California) Neural networks, wide and deep, singular kernels and Bayes optimality

Wide and deep neural networks are used in many important practical settings. In this talk I will discuss some aspects of width and depth related to optimization and generalization. I will first discuss what happens when neural networks become infinitely wide, giving a general result for the transition to linearity (i.e., showing that neural networks become linear functions of parameters) for a broad class of wide neural networks corresponding to directed graphs. I will then proceed to the question of depth, showing equivalence between infinitely wide and deep fully connected networks trained with gradient descent and Nadaraya-Watson predictors based on certain singular kernels. Using this connection we show that for certain activation functions these wide and deep networks are (asymptotically) optimal for classification but, interestingly, never for regression. Based on joint work with Chaoyue Liu, Adit Radhakrishnan, Caroline Uhler and Libin Zhu.
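
A sketch of a Nadaraya-Watson predictor with a singular kernel $K(u) = \|u\|^{-a}$, the type of predictor these wide-and-deep networks are shown to be equivalent to (data and exponent invented; note that the singularity makes the predictor interpolate the training data):

```python
import numpy as np

def nw_singular(x, X, y, a=2.0):
    d = np.linalg.norm(X - x, axis=1)
    if np.any(d == 0):                   # the singularity forces exact interpolation
        return y[d == 0][0]
    w = d ** (-a)                        # singular kernel weights
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
y = np.sign(X[:, 0])                     # a simple binary target
print(nw_singular(np.array([0.3, 0.0]), X, y))   # close to +1 despite interpolating
```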

Statistics seminar
Tuesday October 11, 2022, 9:30AM, Jussieu, room 15-16.201, and streaming
Yifan Cui (Zhejiang University) Instrumental Variable Approaches To Individualized Treatment Regimes Under A Counterfactual World

There is fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a robust classification-based instrumental variable approach to learning optimal treatment regimes under endogeneity. Specifically, we establish the identification of both value functions for a given regime and optimal regimes with the aid of a binary instrumental variable when the assumption of no unmeasured confounding fails to hold. We also construct novel multiply robust classification-based estimators. In addition, we propose to identify and estimate optimal treatment regimes among those who would comply with the assigned treatment under a monotonicity assumption. Furthermore, we consider the problem of individualized treatment regimes under sign and partial identification. In the former case, i) we provide a necessary and sufficient identification condition for optimal treatment regimes with an instrumental variable; ii) we establish the somewhat surprising result that complier optimal regimes can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. In the latter case, we establish a formal link between individualized decision making under partial identification and classical decision theory under uncertainty through a unified lower bound perspective.

Statistics seminar
Tuesday September 27, 2022, 9:30AM, Jussieu, room 15-16.201
Emilie Kaufmann (CNRS) Nonparametric exploration in bandit models

In a bandit model, an agent sequentially selects "arms", which are probability distributions initially unknown to the agent, with the aim of maximizing the sum of the samples obtained, which are viewed as rewards. The most popular bandit algorithms are based on building confidence intervals or on sampling from a posterior distribution, but they can only reach optimal performance when given prior knowledge of the family of distributions of the arms. In this talk we will present alternative approaches based on resampling the history of each arm. Such algorithms can prove more robust in two senses: we will see that they can be optimal for several classes of distributions, and that they can easily be adapted to situations where the performance criterion is not tied to the agent's average reward but takes a risk measure into account.
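
A toy resampling-based bandit in this spirit: each arm's index is the mean of a bootstrap resample of its own history, replacing confidence intervals or posterior sampling (a simplified cousin of sub-sampling algorithms such as RB-SDA; the arms and horizon are invented):

```python
import numpy as np

def bootstrap_bandit(arms, horizon, rng):
    history = [[arm()] for arm in arms]          # pull each arm once
    for _ in range(horizon - len(arms)):
        # index of each arm = mean of a bootstrap resample of its own history
        idx = [rng.choice(h, size=len(h), replace=True).mean() for h in history]
        a = int(np.argmax(idx))                  # resampling drives the exploration
        history[a].append(arms[a]())
    return [len(h) for h in history]

rng = np.random.default_rng(0)
arms = [lambda: rng.normal(0.4), lambda: rng.normal(0.5)]   # unknown reward distributions
print(bootstrap_bandit(arms, horizon=2000, rng=rng))        # most pulls go to the better arm
```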

Statistics seminar
Tuesday May 31, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Elsa Cazelles (IRIT) A novel notion of barycenter for probability distributions based on optimal weak mass transport

We introduce weak barycenters of a family of probability distributions, based on the recently developed notion of optimal weak transport of mass. We provide a theoretical analysis of this object and discuss its interpretation in the light of convex ordering between probability measures. In particular, we show that, rather than averaging the input distributions in a geometric way (as the Wasserstein barycenter based on classic optimal transport does), weak barycenters extract common geometric information shared by all the input distributions, encoded as a latent random variable that underlies all of them. We also provide an iterative algorithm to compute a weak barycenter for a finite family of input distributions, and a stochastic algorithm that computes them for arbitrary populations of laws. The latter approach is particularly well suited for the streaming setting, i.e., when distributions are observed sequentially. The notion of weak barycenter is illustrated on several examples.

Statistics seminar
Tuesday May 10, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Guillaume Lecué (CREST) A geometrical viewpoint on the benign overfitting property of the minimum $\ell_2$-norm interpolant estimator.

Practitioners have observed that some deep learning models generalize well even with a perfect fit to noisy training data [1,2]. Since then many theoretical works have revealed some facets of this phenomenon [3,4,5], known as benign overfitting. In particular, in the linear regression model, the minimum $\ell_2$-norm interpolant estimator $\hat\beta$ has received a lot of attention [3,4,6] since it was proved to be consistent even though it perfectly fits noisy data, under some condition on the covariance matrix $\Sigma$ of the input vector. Motivated by this phenomenon, we study the generalization property of this estimator from a geometrical viewpoint. Our main results extend and improve the convergence rates as well as the deviation probability from [6]. Our proof differs from the classical bias/variance analysis and is based on the self-induced regularization property introduced in [4]: $\hat\beta$ can be written as the sum of a ridge estimator $\hat\beta_{1:k}$ and an overfitting component $\hat\beta_{k+1:p}$, following a decomposition of the feature space $\mathbb{R}^p = V_{1:k} \oplus^\perp V_{k+1:p}$ into the space $V_{1:k}$ spanned by the top $k$ eigenvectors of $\Sigma$ and the space $V_{k+1:p}$ spanned by the last $p-k$ ones. We also prove a matching lower bound for the expected prediction risk. The two geometrical properties of random Gaussian matrices at the heart of our analysis are the Dvoretzky-Milman theorem and isomorphic and restricted isomorphic properties. In particular, the Dvoretzky dimension, which appears naturally in our geometrical viewpoint, coincides with the effective rank from [3,6] and is the key tool to handle the behavior of the design matrix restricted to the subspace $V_{k+1:p}$ where overfitting happens. (Joint work with Zong Shang.) A small numerical illustration of this decomposition is given after the references below.

[1] Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal. Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc. Natl. Acad. Sci. USA, 116(32):15849–15854, 2019.

[2] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning (still) requires rethinking generalization. Commun. ACM, 64(3):107–115, 2021.

[3] Peter L. Bartlett, Philip M. Long, Gabor Lugosi, and Alexander Tsigler. Benign overfitting in linear regression. Proc. Natl. Acad. Sci. USA, 117(48):30063–30070, 2020.

[4] Peter L. Bartlett, Andreas Montanari, and Alexander Rakhlin. Deep learning: a statistical viewpoint. To appear in Acta Numerica, 2021.

[5] Mikhail Belkin. Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation. To appear in Acta Numerica, 2021.

[6] Alexander Tsigler and Peter L. Bartlett. Benign overfitting in ridge regression. 2021.
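
As announced above, a small numerical illustration of the decomposition, with the population eigendirections of $\Sigma$ replaced by the empirical right-singular directions of the design (all dimensions and the spiked covariance are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 50, 500, 10
# spiked covariance: k strong directions, many weak ones (benign-overfitting regime)
X = rng.standard_normal((n, p)) * np.concatenate([np.full(k, 3.0), np.full(p - k, 0.3)])
beta_star = np.zeros(p)
beta_star[:k] = 1.0
y = X @ beta_star + 0.5 * rng.standard_normal(n)

beta_hat = np.linalg.pinv(X) @ y                 # minimum l2-norm interpolant
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P_top = Vt[:k].T @ Vt[:k]                        # projector on the top-k directions
beta_top, beta_tail = P_top @ beta_hat, beta_hat - P_top @ beta_hat
print(np.linalg.norm(X @ beta_hat - y))          # ~0: perfect fit to noisy data
print(np.linalg.norm(beta_top - beta_star), np.linalg.norm(beta_tail))  # ridge-like signal vs. overfit
```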

Statistics seminar
Tuesday April 19, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Clément Marteau (Université Lyon 1) Supermix: sparse regularization for mixture models

This talk is concerned with the estimation of a discrete probability measure $\mu_0$ involved in a mixture model. Using recent results on $\ell_1$ regularization over the space of measures, we will consider a convex optimization problem for the estimation of $\mu_0$ that does not require the use of a grid. Handling this optimization problem requires the introduction of a dual certificate. We will then discuss the statistical properties of the resulting estimator, focusing in particular on the Gaussian case.

Statistics seminar
Tuesday April 5, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Fabrice Grela (Université de Nantes) Minimax detection and localisation of an abrupt change in a Poisson process

Considering a Poisson process observed on a bounded, fixed interval, we are interested in the problem of detecting an abrupt change in its distribution, characterized by a jump in its intensity. Formulated as an off-line change-point problem, this raises two questions: detecting a change-point and estimating its location. This study aims at proposing a non-asymptotic minimax testing set-up, first to construct a minimax and adaptive detection procedure, and then to give a minimax study of a multiple testing procedure designed to simultaneously detect and localise a change-point.
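
The estimation side of the problem in a few lines: a profile log-likelihood scan for a single jump in the intensity (the intensities and grid are invented; the talk's contribution is the non-asymptotic minimax testing theory around this):

```python
import numpy as np

def scan_changepoint(events, T, grid):
    """Maximize the profile log-likelihood of a piecewise-constant intensity."""
    best_tau, best_ll = None, -np.inf
    for tau in grid:
        n1, n2 = np.sum(events <= tau), np.sum(events > tau)
        if n1 == 0 or n2 == 0:
            continue
        # MLEs lambda1 = n1/tau, lambda2 = n2/(T - tau); constants dropped
        ll = n1 * np.log(n1 / tau) + n2 * np.log(n2 / (T - tau))
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau

rng = np.random.default_rng(0)
T, tau_true = 1.0, 0.6           # intensity jumps from 50 to 150 at 0.6
ev = np.concatenate([rng.uniform(0, tau_true, rng.poisson(50 * tau_true)),
                     rng.uniform(tau_true, T, rng.poisson(150 * (T - tau_true)))])
print(scan_changepoint(ev, T, grid=np.linspace(0.05, 0.95, 181)))  # close to 0.6
```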

Statistics seminar
Tuesday March 22, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Aymeric Dieuleveut (Polytechnique) Federated Learning and optimization: from a gentle introduction to recent results

In this presentation, I will present some results on optimization in the context of federated learning. I will summarise the main challenges and the type of results people have been interested in, and dive into some more recent results on tradeoffs between (bidirectional) compression, communication, privacy and user-heterogeneity. The presentation will be based on recent work with Constantin Philippenko, Maxence Noble, Aurélien Bellet.

References. Mainly: Differentially Private Federated Learning on Heterogeneous Data, M. Noble, A. Bellet, A. Dieuleveut, AISTATS 2022; Preserved central model for faster bidirectional compression in distributed settings, C. Philippenko, A. Dieuleveut, NeurIPS 2021. If time allows it (unlikely): Federated Expectation Maximization with heterogeneity mitigation and variance reduction, A. Dieuleveut, G. Fort, E. Moulines, G. Robin, NeurIPS 2021.

Statistics seminar
Tuesday March 8, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Lihua Lei (Stanford University) Testing for outliers with conformal p-values

We study the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework that yields p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense. We then introduce a new method to compute p-values that are both valid conditionally on the training data and independent of each other for different test points; this paves the way to stronger type-I error guarantees. Our results depart from classical conformal inference as we leverage concentration inequalities rather than combinatorial arguments to establish our finite-sample guarantees. Furthermore, our techniques also yield a uniform confidence bound for the false positive rate of any outlier detection algorithm, as a function of the threshold applied to its raw statistics. Finally, the relevance of our results is demonstrated by numerical experiments on real and simulated data.
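
A minimal sketch of split-conformal p-values for outlier detection followed by Benjamini-Hochberg, with an invented distance-to-the-mean score standing in for a trained one-class score:

```python
import numpy as np

rng = np.random.default_rng(0)
train, calib = rng.standard_normal((500, 2)), rng.standard_normal((300, 2))
test = np.vstack([rng.standard_normal((95, 2)),          # inliers
                  rng.standard_normal((5, 2)) + 4.0])    # planted outliers

center = train.mean(axis=0)                              # "trained" score: distance to the mean
s_calib = np.linalg.norm(calib - center, axis=1)
s_test = np.linalg.norm(test - center, axis=1)

# conformal p-value: (1 + #{calibration scores >= test score}) / (n_calib + 1)
pvals = (1 + (s_calib[None, :] >= s_test[:, None]).sum(axis=1)) / (len(s_calib) + 1)

# Benjamini-Hochberg at level 0.1
m, q = len(pvals), 0.1
order = np.argsort(pvals)
passed = np.nonzero(pvals[order] <= q * np.arange(1, m + 1) / m)[0]
rejected = order[: passed.max() + 1] if passed.size else np.array([], dtype=int)
print(sorted(rejected))                                  # mostly indices 95..99, the outliers
```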

Statistics seminar
Tuesday February 8, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Élisabeth Gassiat (Université Paris-Saclay) Deconvolution with unknown noise distribution

I consider the deconvolution problem in the case where no information is known about the noise distribution. More precisely, no assumption is made on the noise distribution and no samples are available to estimate it: the deconvolution problem is solved based only on observations of the corrupted signal. I will prove the identifiability of the model up to translation when the signal has a Laplace transform with an exponential growth $\rho$ smaller than 2 and when it can be decomposed into two dependent components, so that the identifiability theorem can be used for sequences of dependent data or for sequences of iid multidimensional data. In the case of iid multidimensional data, I will propose an adaptive estimator of the density of the signal and provide rates of convergence. This rate of convergence is known to be minimax when $\rho = 1$.

Statistics seminar
Tuesday January 25, 2022, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Nicolas Verzelen (Université de Montpellier) Optimal ranking in crowd-sourcing problems

Consider a crowd-sourcing problem where we have n experts and d tasks. The average ability of each expert for each task is stored in an unknown matrix M, from which we have incomplete and noisy observations. We make no (semi-)parametric assumptions, but assume that both experts and tasks can be perfectly ordered: if an expert A is better than an expert B, the ability of A is higher than that of B for all tasks, and the same holds for the tasks. This implies that the matrix M is, up to permutations of its rows and columns, bi-isotonic. We focus on the problem of recovering the optimal ranking of the experts in $\ell_2$ norm when the ordering of the tasks is known to the statistician. In other words, we aim at estimating the suitable permutation of the rows of M while the permutation of the columns is known. We provide a minimax-optimal and computationally feasible method for this problem, based on hierarchical clustering, PCA, change-point detection, and exchange of information among the clusters. We prove in particular, in the case where d > n, that the problem of estimating the expert ranking is significantly easier than the problem of estimating the matrix M.

This talk is based on a joint ongoing work with Alexandra Carpentier and Emmanuel Pilliat.


Year 2021

Statistics seminar
Tuesday December 14, 2021, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Julie Delon (Université de Paris) Some perspectives on stochastic models for Bayesian image restoration

Random image models are central for solving inverse problems in imaging. In a Bayesian formalism, these models can be used as priors or regularisers and combined with an explicit likelihood function to define posterior distributions. Most of the time, these posterior distributions are used to derive Maximum A Posteriori (MAP) estimators, leading to optimization problems that may be convex or not, but are well studied and understood. Sampling schemes can also be used to explore these posterior distributions, to derive Minimum Mean Square Error (MMSE) estimators, quantify uncertainty or perform other advanced inferences. While research on inverse problems has focused for many years on explicit image models (either directly in the image space, or in a transformed space), an important trend nowadays is to use implicit image models encoded by neural networks. This opens the way to restoration algorithms that exploit more powerful and accurate prior models for natural images but raises novel challenges and questions on the corresponding posterior distributions and their resulting estimators. The goal of this presentation is to provide some perspectives and present recent developments on these questions.
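
A toy illustration of exploring a posterior by sampling rather than optimizing: unadjusted Langevin iterations for a one-dimensional denoising posterior with a Gaussian smoothness prior, chosen so that the MAP is available in closed form and the chain can be sanity-checked (signal, noise level and prior weight are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x_true = np.sin(np.linspace(0, 4 * np.pi, n))
sigma, lam = 0.3, 5.0
y = x_true + sigma * rng.standard_normal(n)

D = np.diff(np.eye(n), axis=0)                  # finite differences: smoothness prior
def grad_log_post(x):                           # log p(x|y) = -||y-x||^2/(2 s^2) - lam ||Dx||^2/2
    return (y - x) / sigma**2 - lam * D.T @ (D @ x)

x, gamma, samples = y.copy(), 1e-3, []
for t in range(20000):                          # unadjusted Langevin algorithm (ULA)
    x = x + gamma * grad_log_post(x) + np.sqrt(2 * gamma) * rng.standard_normal(n)
    if t > 5000:
        samples.append(x.copy())

mmse = np.mean(samples, axis=0)                 # posterior-mean (MMSE) estimate from the chain
map_ = np.linalg.solve(np.eye(n) / sigma**2 + lam * D.T @ D, y / sigma**2)  # closed-form MAP
print(np.linalg.norm(mmse - map_))              # small: the posterior is Gaussian, so MAP = mean
```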

Statistics seminar
Tuesday November 30, 2021, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Frédéric Chazal (INRIA) A framework to differentiate persistent homology with applications in Machine Learning and Statistics

Understanding the differentiable structure of persistent homology and solving optimization tasks based on functions and losses with a topological flavor is a very active, growing field of research in data science and Topological Data Analysis, with applications in non-convex optimization, statistics and machine learning.

However, the approaches proposed in the literature are usually anchored to a specific application and/or topological construction, and do not come with theoretical guarantees.

In this talk, we will study the differentiability of a general map associated with the most common topological construction, that is, the persistence map. Building on real analytic geometry arguments, we propose a general framework that allows to define and compute gradients for persistence-based functions in a very simple way. As an application, we also provide a simple, explicit and sufficient condition for convergence of stochastic subgradient methods for such functions. If time permits, as another application, we will also show how this framework combined with standard geometric measure theory arguments leads to results on the statistical behavior of persistence diagrams of filtrations built on top of random point clouds.

Statistics seminar
Tuesday November 23, 2021, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Yannick Baraud (Université de Luxembourg) How to build robust posterior distributions from tests?

Classical Bayesian estimators, like those built from the likelihood, have good estimation properties when the statistical model is exact and perfectly accounts for the distribution of the data. Otherwise, when the model is only approximate, these estimators can become terribly poor, and a single data point that is an outlier with respect to the model is sometimes enough for this to happen. We will show how to remedy this instability problem by proposing, within the Bayesian framework, a new posterior distribution built from suitable robust tests. We will see how this approach yields estimators that are both optimal when the model is exact and stable under a slight misspecification of the model.

Statistics seminar
Tuesday November 9, 2021, 9:30AM, Sophie Germain, room 1013 / Jussieu, room 15-16.201
Alessandro Rudi (INRIA) PSD models for Non-convex optimization and beyond

In this talk we present a rather flexible and expressive model for non-negative functions. We will show direct applications in probability representation and non-convex optimization. In particular, the model allows us to derive an algorithm for non-convex optimization that is adaptive to the degree of differentiability of the objective function and achieves optimal rates of convergence. Finally, we show how to apply the same technique to other interesting problems in applied mathematics that can be easily expressed in terms of inequalities.
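
The basic PSD model for non-negative functions has the form $f(x) = \phi(x)^\top A \phi(x)$ with $A$ positive semi-definite, which makes $f \ge 0$ by construction. A sketch with Gaussian kernel features on invented anchor points (fitting $A$ to data is the subject of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
anchors = rng.uniform(-2, 2, size=(15, 1))          # anchor points for the features

def phi(x, h=0.5):
    """Gaussian kernel features centered at the anchor points."""
    return np.exp(-0.5 * ((x - anchors.ravel()) / h) ** 2)

B = rng.standard_normal((15, 15))
A = B @ B.T                                         # any A = B B^T is PSD

f = lambda x: phi(x) @ A @ phi(x)                   # f(x) = ||B^T phi(x)||^2 >= 0 for every x
xs = np.linspace(-3, 3, 7)
print([round(f(x), 3) for x in xs])                 # all values are non-negative
```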

Statistics seminar
Tuesday October 19, 2021, 9:30AM, Sophie Germain, room 1013
Antoine Marchina (Université de Paris) Concentration inequalities for suprema of unbounded empirical processes

In this talk, we will provide new concentration inequalities for suprema of (possibly) non-centered and unbounded empirical processes associated with independent and identically distributed random variables. In particular, we establish Fuk-Nagaev type inequalities with the optimal constant in the moderate deviation bandwidth. We will also explain the use of these results in statistical applications (ongoing research).

Statistics seminar
Tuesday October 5, 2021, 9:30AM, Jussieu, room 15-16.201
Judith Rousseau (Oxford) Semiparametric and nonparametric Bayesian inference in hidden Markov models

In this work we are interested in inference in hidden Markov models with finite state space and nonparametric emission distributions. Since the seminal paper of Gassiat et al. (2016), it is known that in such models the transition matrix $Q$ and the emission distributions $F_1, \ldots, F_K$ are identifiable, up to label switching. We propose an (almost) Bayesian method to simultaneously estimate $Q$ at the rate $\sqrt{n}$ and the emission distributions at the usual nonparametric rates. To do so, we first consider a prior $\pi_1$ on $Q$ and $F_1, \ldots, F_K$ which leads to a marginal posterior distribution on $Q$ that verifies the Bernstein-von Mises property, and thus to an estimator of $Q$ which is efficient. We then combine the marginal posterior on $Q$ with another posterior distribution on the emission distributions, following the cut-posterior approach, to obtain a posterior which also concentrates around the emission distributions at the minimax rates. In addition, an important intermediate result of our work is an inversion inequality which allows us to upper bound the $L_1$ norms between the emission densities by the $L_1$ norms between marginal densities of 3 consecutive observations.

Joint work with D. Moss (Oxford).