Stochastic Models for the Inference of Life Evolution


SMILE is an interdisciplinary research group gathering probabilists, statisticians, bio-informaticians and biologists.
SMILE is affiliated with the Stochastics and Biology group of LPSM (Lab of Probability, Statistics and Modeling) at Sorbonne Université (formerly Université Pierre et Marie Curie Paris 06).
SMILE is hosted within the CIRB (Center for Interdisciplinary Research in Biology) at Collège de France.
SMILE is supported by Collège de France and CNRS.
See also our homepage at CIRB.

Recent contributions of the SMILE group related to SARS-CoV-2 and COVID-19.


SMILE is hosted at Collège de France in the Latin Quarter of Paris. To reach us, go to 11 place Marcelin Berthelot (stations Luxembourg or Saint-Michel on RER B).
Our working spaces are rooms 107, 121 and 122 on the first floor of building B1 (ask us for the code). Building B1 faces you as you exit the hall you walk through behind Champollion's statue.


You can reach us by email (amaury.lambert - at - or (smile - at -

Light on



Predicted success of prophylactic antiviral therapy to block or delay SARS-CoV-2 infection depends on the targeted mechanism

Repurposed drugs that are immediately available and have a good safety profile constitute a first line of defense against new viral infections. Despite a limited antiviral activity against SARS-CoV-2, several drugs serve as candidates for application, not only in infected individuals but also as prophylaxis to prevent infection establishment. Here we use a stochastic model to describe the early phase of a viral infection. We find that the critical efficacy needed to block viral establishment is typically above 80\%. This value can be improved by combination therapy. Below the critical efficacy, establishment can still sometimes be prevented; for that purpose, drugs blocking viral entry into target cells (or equivalently enhancing viral clearance) are more effective than drugs reducing viral production or enhancing infected cell death. When a viral infection cannot be prevented because of high exposure or low drug efficacy, antivirals can still delay the time to reach detectable viral loads from 4 days when untreated to up to 30 days. This delay flattens the within-host epidemic curve, and possibly reduces transmission and symptom severity. These results suggest that antiviral prophylaxis, even with reduced efficacy, could be efficiently used to prevent or alleviate infection in people at high risk. It could thus be an important component of the strategy to combat the SARS-CoV-2 pandemic in the months or years to come.
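
As a rough illustration of what a critical efficacy means, the sketch below uses a toy linear birth-death branching process. This is an assumption made here, not the model of the study: each infected lineage has a mean offspring number `r0` (a hypothetical value), a drug of efficacy `efficacy` multiplies it by `1 - efficacy` whatever the targeted mechanism, and a lineage with reduced number R > 1 escapes extinction with probability 1 - 1/R.

```python
def critical_efficacy(r0):
    """Smallest efficacy eps such that (1 - eps) * r0 <= 1 (toy model)."""
    if r0 <= 1.0:
        return 0.0  # the infection already dies out without any drug
    return 1.0 - 1.0 / r0


def establishment_probability(r0, efficacy, inoculum=1):
    """Probability that at least one of `inoculum` independent lineages
    escapes extinction in a linear birth-death process (toy model)."""
    r_eff = (1.0 - efficacy) * r0  # reproduction number under treatment
    if r_eff <= 1.0:
        return 0.0  # at or above the critical efficacy: certain extinction
    # Each lineage goes extinct with probability 1 / r_eff, independently.
    return 1.0 - (1.0 / r_eff) ** inoculum


if __name__ == "__main__":
    r0 = 5.0  # hypothetical value, chosen only for illustration
    print("critical efficacy:", critical_efficacy(r0))
    for eps in (0.0, 0.5, 0.7, 0.9):
        print(eps, establishment_probability(r0, eps, inoculum=3))
```

In this toy setting a drug below the critical efficacy still lowers the establishment probability, and a larger inoculum raises it. The sketch does not distinguish drugs blocking viral entry from drugs reducing viral production, which is the comparison made in the abstract above.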



Testing for Independence between Evolutionary Processes

Evolutionary events co-occurring along phylogenetic trees usually point to complex adaptive phenomena, sometimes implicating epistasis. While a number of methods have been developed to account for co-occurrence of events on the same internal or external branch of an evolutionary tree, there is a need to account for the larger diversity of possible relative positions of events in a tree. Here we propose a method to quantify to what extent two or more evolutionary events are associated on a phylogenetic tree. The method is applicable to any discrete character, such as substitutions within a coding sequence or gains/losses of a biological function. Our method uses a general approach to statistically test for significant associations between events along the tree, which encompasses both events inseparable on the same branch, and events genealogically ordered on different branches. It assumes that the phylogeny and the mapping of branches are known without errors. We address this problem from the statistical viewpoint by a linear algebra representation of the localization of the evolutionary events on the tree. We compute the full probability distribution of the number of paired events occurring on the same branch or on different branches of the tree, under a null model of independence where each type of event occurs at a constant rate uniformly in the phylogenetic tree. The strengths and weaknesses of the method are assessed via simulations; we then apply the method to explore the loss of cell motility in intracellular pathogens.



A mathematical assessment of the efficiency of quarantining and contact tracing in curbing the COVID-19 epidemic

In our model of the COVID-19 epidemic, infected individuals can be of four types, according to whether they are asymptomatic (\$$A\$$) or symptomatic (\$$I\$$), and use a contact tracing mobile phone app (\$$Y\$$) or not (\$$N\$$). We denote by \$$f\$$ the fraction of \$$A\$$'s, by \$$y\$$ the fraction of \$$Y\$$'s and by \$$R_0\$$ the average number of secondary infections from a random infected individual. We investigate the effect of non-digital interventions (voluntary isolation upon symptom onset, quarantining private contacts) and of digital interventions (contact tracing thanks to the app), depending on the willingness to quarantine, parameterized by four cooperating probabilities. For a given `effective' \$$R_0\$$ obtained with non-digital interventions, we use non-negative matrix theory and stopping line techniques to characterize mathematically the minimal fraction \$$y_0\$$ of app users needed to curb the epidemic. We show that under a wide range of scenarios, the threshold \$$y_0\$$ as a function of \$$R_0\$$ rises steeply from 0 at \$$R_0=1\$$ to prohibitively large values (on the order of 60-70\% and above) whenever the effective \$$R_0\$$ is above 1.3. Our results show that moderate rates of adoption of a contact tracing app can reduce \$$R_0\$$ but are by no means sufficient to reduce it below 1 unless it is already very close to 1 thanks to non-digital interventions.
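
To make the role of non-negative matrix theory concrete, here is a minimal sketch on a deliberately simplified two-type model. It is an assumption made for illustration and not the four-type model of the paper: infected individuals are either app users (Y, fraction `y`) or not (N), each produces `r0` secondary infections on average, and a transmission between two app users is blocked with a hypothetical probability `q`. The epidemic is curbed when the largest eigenvalue (Perron root) of the mean offspring matrix is at most 1, and the threshold fraction is found by bisection.

```python
import math


def perron_root_2x2(a, b, c, d):
    """Largest eigenvalue of the non-negative matrix [[a, b], [c, d]]."""
    return 0.5 * ((a + d) + math.sqrt((a - d) ** 2 + 4.0 * b * c))


def effective_r(r0, y, q):
    """Perron root of the toy mean offspring matrix.

    Rows index the infector type (Y, N), columns the infectee type (Y, N).
    Only Y-to-Y transmissions are reduced, by the hypothetical factor 1 - q.
    """
    return perron_root_2x2(r0 * y * (1.0 - q), r0 * (1.0 - y),
                           r0 * y,             r0 * (1.0 - y))


def min_app_fraction(r0, q, tol=1e-9):
    """Smallest y with effective_r <= 1, or None if even y = 1 fails."""
    if effective_r(r0, 0.0, q) <= 1.0:
        return 0.0
    if effective_r(r0, 1.0, q) > 1.0:
        return None
    lo, hi = 0.0, 1.0  # effective_r decreases in y in this toy model
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if effective_r(r0, mid, q) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for r0 in (1.0, 1.1, 1.3, 2.0):
        print(r0, min_app_fraction(r0, q=1.0))
```

Even with perfect blocking (`q = 1`), the required fraction of app users in this toy model climbs quickly once `r0` exceeds 1, because only transmissions between two app users are affected. The numbers it prints are those of the toy model and should not be read as the paper's results.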



From individual-based epidemic models to McKendrick-von Foerster PDEs: A guide to modeling and inferring COVID-19 dynamics

We present a unifying, tractable approach for studying the spread of viruses causing complex diseases, which need to be modeled with a large number of types (infective stage, clinical state, risk factor class...). We show that, by recording for each infected individual her infection age, i.e., the time elapsed since she was infected:
1. The age distribution \$$n(t,a)\$$ of the population at time \$$t\$$ is simply described by means of a first-order, one-dimensional partial differential equation (PDE) known as the McKendrick--von Foerster equation;
2. The frequency of type \$$i\$$ at time \$$t\$$ is simply obtained by integrating the probability \$$p(a,i)\$$ of being in state \$$i\$$ at age \$$a\$$ against the age distribution \$$n(t,a)\$$.
The advantage of this approach is threefold. First, regardless of the number of types, macroscopic observables (e.g., incidence or prevalence of each type) only rely on a one-dimensional PDE ``decorated'' with types. This representation induces a simple methodology based on the McKendrick--von Foerster PDE with Poisson sampling to infer and forecast the epidemic. This technique is illustrated with French data on the COVID-19 epidemic.
Second, our approach generalizes and simplifies standard compartmental models using high-dimensional systems of ODEs to account for disease complexity. We show that such models can always be rewritten in our framework, thus providing a low-dimensional yet equivalent representation of these complex models.
Third, beyond the simplicity of the approach and its computational advantages, we show that our population model naturally appears as a universal scaling limit of a large class of fully stochastic individual-based epidemic models, where the initial condition of the PDE emerges as the limiting age structure of an exponentially growing population starting from a single individual.
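
The two numbered points of this abstract can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the age density is discretized on a grid of step `da`, time advances by the same step so that the transport equation reduces to shifting ages, new infections enter at age 0 through a hypothetical infection rate `beta`, and individuals older than the last grid point are dropped. The names `step`, `type_frequency`, `beta` and `p_i` are all introduced here for illustration.

```python
def step(n, beta, da):
    """Advance the age density n(t, .) by one time step dt = da.

    n[k] approximates the density of infected individuals of infection
    age k * da. Along the characteristics of the transport equation,
    n(t + dt, a + dt) = n(t, a), so ageing is a shift by one grid cell.
    New infections enter at age 0 with density sum(beta * n) * da.
    """
    new_infections = sum(b * x for b, x in zip(beta, n)) * da
    return [new_infections] + n[:-1]  # the oldest age class is dropped


def type_frequency(n, p_i, da):
    """Integrate p(a, i) against n(t, a) to count individuals of type i."""
    return sum(p * x for p, x in zip(p_i, n)) * da


if __name__ == "__main__":
    da = 1.0
    n = [1.0, 0.0, 0.0]        # one unit of newly infected individuals
    beta = [0.0, 2.0, 0.0]     # hypothetical: infectious at age 1 only
    p_sympt = [0.0, 0.5, 1.0]  # hypothetical probability of a given state
    for t in range(4):
        print(t, n, type_frequency(n, p_sympt, da))
        n = step(n, beta, da)
```

Adding more types only adds more arrays like `p_sympt`; the age density itself is still propagated by the same one-dimensional update.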

Upcoming seminars


Collège de France room schedule.
Collège de France intranet.