Stochastic Models for the Inference of Life Evolution


SMILE is an interdisciplinary research group gathering probabilists, statisticians, bio-informaticians and biologists.
SMILE is affiliated with the Stochastics and Biology group of LPSM (Lab of Probability, Statistics and Modeling) at Sorbonne Université (formerly Université Pierre et Marie Curie, Paris 06).
SMILE is hosted within the CIRB (Center for Interdisciplinary Research in Biology) at Collège de France.
SMILE is supported by Collège de France and CNRS.
See also our homepage at CIRB.

Recent contributions of the SMILE group related to SARS-CoV-2 and COVID-19.


SMILE is hosted at Collège de France in the Latin Quarter of Paris. To reach us, go to 11 place Marcelin Berthelot (stations Luxembourg or Saint-Michel on RER B).
Our working spaces are rooms 107, 121 and 122 on the first floor of building B1 (ask us for the code). Building B1 faces you as you exit the pass-through hall behind Champollion's statue.


You can reach us by email (amaury.lambert - at - or (smile - at -

Light on



The genomic view of diversification

Evolutionary relationships between species are traditionally represented in the form of a tree, called the species tree. The reconstruction of the species tree from molecular data is hindered by frequent conflicts between gene genealogies. A standard way of dealing with this issue is to postulate the existence of a unique species tree where disagreements between gene trees are explained by incomplete lineage sorting (ILS) due to random coalescences of gene lineages inside the edges of the species tree. This paradigm, known as the multi-species coalescent (MSC), is constantly violated by the ubiquitous presence of gene flow revealed by empirical studies, leading to topological incongruences of gene trees that cannot be explained by ILS alone. Here we argue that this paradigm should be revised in favor of a vision acknowledging the importance of gene flow and where gene histories shape the species tree rather than the opposite. We propose a new, plastic framework for modeling the joint evolution of gene and species lineages relaxing the hierarchy between the species tree and gene trees. As an illustration, we implement this framework in a mathematical model called the genomic diversification (GD) model based on coalescent theory, with four parameters tuning replication, genetic differentiation, gene flow and reproductive isolation. We use it to evaluate the amount of gene flow in two empirical data-sets. We find that in these data-sets, gene tree distributions are better explained by the best fitting GD model than by the best fitting MSC model. This work should pave the way for approaches of diversification using the richer signal contained in genomic evolutionary histories rather than in the mere species tree.



A mathematical assessment of the efficiency of quarantining and contact tracing in curbing the COVID-19 epidemic

In our model of the COVID-19 epidemic, infected individuals can be of four types, according to whether they are asymptomatic ($A$) or symptomatic ($I$), and whether they use a contact tracing mobile phone app ($Y$) or not ($N$). We denote by $f$ the fraction of $A$'s, by $y$ the fraction of $Y$'s and by $R_0$ the average number of secondary infections from a random infected individual. We investigate the effect of non-digital interventions (voluntary isolation upon symptom onset, quarantining private contacts) and of digital interventions (contact tracing thanks to the app), depending on the willingness to quarantine, parameterized by four cooperating probabilities. For a given 'effective' $R_0$ obtained with non-digital interventions, we use non-negative matrix theory and stopping line techniques to characterize mathematically the minimal fraction $y_0$ of app users needed to curb the epidemic. We show that under a wide range of scenarios, the threshold $y_0$ as a function of $R_0$ rises steeply from 0 at $R_0=1$ to prohibitively large values (of the order of 60-70% or more) whenever the effective $R_0$ is above 1.3. Our results show that moderate rates of adoption of a contact tracing app can reduce $R_0$ but are by no means sufficient to reduce it below 1 unless it is already very close to 1 thanks to non-digital interventions.
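To give a feel for how non-negative matrix theory yields a threshold on the fraction of app users, here is a minimal sketch of a toy two-type model. It is not the four-type model of the paper: the matrix, the tracing probability `c`, and the function names are all assumptions made for illustration. In this toy model every infected individual causes `r0` infections on average, a fraction `y` of them among app users, and an infection passed between two app users is neutralised by tracing with probability `c`. The epidemic is curbed when the largest eigenvalue of the resulting next-generation matrix is at most 1.

```python
import math


def spectral_radius(r0, y, c):
    """Largest eigenvalue of a hypothetical 2x2 next-generation matrix.

    Rows are the infector's type and columns the infectee's type, in the
    order (app user, non-user). Only infections between two app users are
    reduced by tracing, by a factor (1 - c). This is an illustrative
    assumption, not the model of the paper.
    """
    m_yy = r0 * y * (1.0 - c)   # app user -> app user (traced w.p. c)
    m_yn = r0 * (1.0 - y)       # app user -> non-user
    m_ny = r0 * y               # non-user -> app user
    m_nn = r0 * (1.0 - y)       # non-user -> non-user
    trace = m_yy + m_nn
    det = m_yy * m_nn - m_yn * m_ny
    # det <= 0 in this toy model, so the discriminant is non-negative;
    # the clamp only guards against rounding error.
    disc = max(trace * trace - 4.0 * det, 0.0)
    return 0.5 * (trace + math.sqrt(disc))


def min_app_fraction(r0, c, steps=1000):
    """Smallest y on a regular grid with spectral radius <= 1, else None."""
    for k in range(steps + 1):
        y = k / steps
        if spectral_radius(r0, y, c) <= 1.0:
            return y
    return None  # even full adoption does not curb the toy epidemic


if __name__ == "__main__":
    for r0 in (1.0, 1.1, 1.3, 1.5, 2.0):
        print(r0, min_app_fraction(r0, c=1.0))
```

The grid scan avoids assuming that the spectral radius is monotone in `y`; a root finder would be faster but relies on that assumption.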



Predicted success of prophylactic antiviral therapy to block or delay SARS-CoV-2 infection depends on the targeted mechanism

Repurposed drugs that are immediately available and have a good safety profile constitute a first line of defense against new viral infections. Despite a limited antiviral activity against SARS-CoV-2, several drugs serve as candidates for application, not only in infected individuals but also as prophylaxis to prevent infection establishment. Here we use a stochastic model to describe the early phase of a viral infection. We find that the critical efficacy needed to block viral establishment is typically above 80%. This value can be improved by combination therapy. Below the critical efficacy, establishment can still sometimes be prevented; for that purpose, drugs blocking viral entry into target cells (or equivalently enhancing viral clearance) are more effective than drugs reducing viral production or enhancing infected cell death. When a viral infection cannot be prevented because of high exposure or low drug efficacy, antivirals can still delay the time to reach detectable viral loads from 4 days when untreated to up to 30 days. This delay flattens the within-host epidemic curve, and possibly reduces transmission and symptom severity. These results suggest that antiviral prophylaxis, even with reduced efficacy, could be efficiently used to prevent or alleviate infection in people at high risk. It could thus be an important component of the strategy to combat the SARS-CoV-2 pandemic in the months or years to come.
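The difference between targeted mechanisms can be illustrated with a simple branching process. The sketch below is a toy model, not the stochastic model of the paper, and all names and parameter values are assumptions: a virion infects a target cell with probability `p_infect` (otherwise it is cleared), and an infected cell releases a Poisson number of virions with mean `burst_mean`. A drug of efficacy `eps` multiplies either `p_infect` (entry blocker) or `burst_mean` (production blocker) by `1 - eps`. Both give the same mean number of offspring, hence the same critical efficacy, but not the same establishment probability below it.

```python
import math


def establishment_probability(p_infect, burst_mean, tol=1e-12, max_iter=100_000):
    """Probability that the lineage of a single virion does not die out.

    Toy branching process (an assumption for illustration): the virion
    infects a cell with probability p_infect, and an infected cell releases
    Poisson(burst_mean) new virions.
    """
    if p_infect * burst_mean <= 1.0:
        return 0.0  # critical or subcritical: extinction is certain
    q = 0.0  # extinction probability, iterated up to the smallest fixed point
    for _ in range(max_iter):
        q_next = 1.0 - p_infect + p_infect * math.exp(-burst_mean * (1.0 - q))
        if abs(q_next - q) < tol:
            break
        q = q_next
    return 1.0 - q_next


def critical_efficacy(p_infect, burst_mean):
    """Efficacy above which establishment is impossible in the toy model."""
    return max(0.0, 1.0 - 1.0 / (p_infect * burst_mean))


if __name__ == "__main__":
    P, N, EPS = 0.5, 10.0, 0.5  # arbitrary illustrative values
    print("critical efficacy:", critical_efficacy(P, N))
    print("entry blocker:", establishment_probability(P * (1 - EPS), N))
    print("production blocker:", establishment_probability(P, N * (1 - EPS)))
```

In this sketch the entry blocker leaves a lower establishment probability than the production blocker at equal efficacy, because it increases the chance that a virion leaves no offspring at all.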

Upcoming seminars


Collège de France room schedule.
Collège de France intranet.