Stochastic Models for the Inference of Life Evolution


SMILE is an interdisciplinary research group gathering probabilists, statisticians, bio-informaticians and biologists.
SMILE is affiliated with the Stochastics and Biology group of LPSM (Lab of Probability, Statistics and Modeling) at Sorbonne Université (formerly Université Pierre et Marie Curie Paris 06).
SMILE is hosted within the CIRB (Center for Interdisciplinary Research in Biology) at Collège de France.
SMILE is supported by Collège de France and CNRS.
Also visit our homepage at CIRB.

Recent contributions of the SMILE group related to SARS-Cov2 and COVID-19.


SMILE is hosted at Collège de France in the Latin Quarter of Paris. To reach us, go to 11 place Marcelin Berthelot (stations Luxembourg or Saint-Michel on RER B).
Our working spaces are rooms 107, 121 and 122 on the first floor of building B1 (ask us for the code). Building B1 faces you as you exit the traversing hall behind Champollion's statue.


You can reach us by email (amaury.lambert - at - or (smile - at -

Light on



How Ecology and Landscape Dynamics Shape Phylogenetic Trees

Whether biotic or abiotic factors are the dominant drivers of clade diversification is a long-standing question in evolutionary biology. The ubiquitous patterns of phylogenetic imbalance and branching slowdown have been taken as supporting the role of ecological niche filling and spatial heterogeneity in ecological features, and thus of biotic processes, in diversification. However, a proper theoretical assessment of the relative roles of biotic and abiotic factors in macroevolution requires models that integrate both types of factors, and such models have been lacking. In this study, we use an individual-based model to investigate the temporal patterns of diversification driven by ecological speciation in a stochastically fluctuating geographic landscape. The model generates phylogenies whose shape evolves as the clade ages. Stabilization of tree shape often occurs after ecological saturation, revealing species turnover caused by competition and demographic stochasticity. In the initial phase of diversification (allopatric radiation into an empty landscape), trees tend to be unbalanced and branching slows down. As diversification proceeds further due to landscape dynamics, balance and branching tempo may increase and become positive. Three main conclusions follow. First, the phylogenies of ecologically saturated clades do not always exhibit branching slowdown. Branching slowdown requires that competition be wide or heterogeneous across the landscape, or that the characteristics of landscape dynamics vary geographically. Conversely, branching acceleration is predicted under narrow competition or frequent local catastrophes. Second, ecological heterogeneity does not necessarily cause phylogenies to be unbalanced: short time in geographical isolation or frequent local catastrophes may lead to balanced trees despite spatial heterogeneity. Conversely, unbalanced trees can emerge without spatial heterogeneity, notably if competition is wide. Third, short isolation time causes a radically different and quite robust pattern of phylogenies that are balanced and yet exhibit branching slowdown. In conclusion, biotic factors have a strong and diverse influence on the shape of phylogenies of ecologically saturating clades and create the evolutionary template in which branching slowdown and tree imbalance may occur. However, the contingency of landscape dynamics and resource distribution can cause wide variation in branching tempo and tree balance. Finally, considerable variation in tree shape among simulation replicates calls for caution when interpreting variation in the shape of real phylogenies.
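
To make the notion of tree balance concrete, here is a minimal sketch of one common imbalance summary, the Colless index (the sum over internal nodes of the absolute difference in tip counts between the two daughter clades). The abstract does not say which balance statistic the study uses, so the choice of index, the nested-tuple tree encoding and the two toy trees below are assumptions made purely for illustration.

```python
from typing import Tuple, Union

# Hypothetical encoding: a tip is a string, an internal node is a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def colless(tree: Tree) -> int:
    """Colless imbalance index of a rooted binary tree (0 = perfectly balanced)."""

    def walk(node: Tree) -> Tuple[int, int]:
        # Returns (number of tips, Colless sum) for the subtree rooted at node.
        if isinstance(node, str):
            return 1, 0
        left, right = node
        n_left, c_left = walk(left)
        n_right, c_right = walk(right)
        return n_left + n_right, c_left + c_right + abs(n_left - n_right)

    return walk(tree)[1]


# Two toy four-tip trees (illustrative only, not taken from the study).
caterpillar: Tree = ((("a", "b"), "c"), "d")  # maximally unbalanced
balanced: Tree = (("a", "b"), ("c", "d"))     # perfectly balanced

if __name__ == "__main__":
    print("caterpillar:", colless(caterpillar))
    print("balanced:", colless(balanced))
```

Larger values indicate more unbalanced trees. Comparing values across trees of different sizes would require a normalization that this sketch does not attempt.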



A mathematical assessment of the efficiency of quarantining and contact tracing in curbing the COVID-19 epidemic

In our model of the COVID-19 epidemic, infected individuals can be of four types, according to whether they are asymptomatic ($A$) or symptomatic ($I$), and whether they use a contact tracing mobile phone app ($Y$) or not ($N$). We denote by $f$ the fraction of $A$'s, by $y$ the fraction of $Y$'s and by $R_0$ the average number of secondary infections from a random infected individual. We investigate the effect of non-digital interventions (voluntary isolation upon symptom onset, quarantining private contacts) and of digital interventions (contact tracing thanks to the app), depending on the willingness to quarantine, parameterized by four cooperating probabilities. For a given 'effective' $R_0$ obtained with non-digital interventions, we use non-negative matrix theory and stopping line techniques to characterize mathematically the minimal fraction $y_0$ of app users needed to curb the epidemic. We show that under a wide range of scenarios, the threshold $y_0$ as a function of $R_0$ rises steeply from 0 at $R_0=1$ to prohibitively large values (of the order of 60-70% and up) whenever the effective $R_0$ is above 1.3. Our results show that moderate rates of adoption of a contact tracing app can reduce $R_0$ but are by no means sufficient to reduce it below 1 unless it is already very close to 1 thanks to non-digital interventions.
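
The role of non-negative matrix theory can be illustrated with a deliberately simplified toy, which is not the four-type model of the paper. A multi-type branching process dies out when the largest eigenvalue (spectral radius) of its mean offspring matrix is at most 1. The sketch below assumes only two types (app users and non-users), assumes that each infected individual causes on average `r0` infections of which a fraction `y` are app users, and assumes that a transmission between two app users is neutralized by tracing with a hypothetical probability `c`. All of these names and modelling choices are illustrative assumptions.

```python
import math
from typing import Optional


def spectral_radius(r0: float, y: float, c: float) -> float:
    """Largest eigenvalue of the toy 2x2 mean offspring matrix.

    Rows index the infector's type, columns the infectee's type, in the
    order (app user, non-user):
        [[r0 * y * (1 - c), r0 * (1 - y)],
         [r0 * y,           r0 * (1 - y)]]
    """
    m_yy = r0 * y * (1.0 - c)  # user -> user, reduced by tracing (assumption)
    m_yn = r0 * (1.0 - y)      # user -> non-user
    m_ny = r0 * y              # non-user -> user
    m_nn = r0 * (1.0 - y)      # non-user -> non-user
    trace = m_yy + m_nn
    det = m_yy * m_nn - m_yn * m_ny
    # det <= 0 in this toy, so the discriminant is never negative.
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))


def minimal_app_fraction(r0: float, c: float, tol: float = 1e-9) -> Optional[float]:
    """Smallest y with spectral radius <= 1, or None if even y = 1 is not enough.

    Uses bisection, which relies on the spectral radius of this toy matrix
    decreasing as y increases.
    """
    if spectral_radius(r0, 0.0, c) <= 1.0:
        return 0.0
    if spectral_radius(r0, 1.0, c) > 1.0:
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spectral_radius(r0, mid, c) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for r0 in (1.05, 1.1, 1.3, 1.5):
        print(r0, minimal_app_fraction(r0, c=1.0))
```

The numbers produced by this toy should not be read as the paper's results. It only shows how a threshold on app adoption can be obtained from a spectral radius criterion.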



Ranked Tree Shapes, Nonrandom Extinctions, and the Loss of Phylogenetic Diversity

Phylogenetic diversity (PD) is a measure of the evolutionary legacy of a group of species, which can be used to define conservation priorities. It has been shown that a large loss of species diversity can sometimes lead to a much smaller loss of PD, depending on the topology of the species tree and on the distribution of its branch lengths. However, the rate of decrease of PD strongly depends on the relative depths of the nodes in the tree and on the order in which species become extinct. We introduce a new, sampling-consistent, three-parameter model generating random trees with covarying topology, clade relative depths and clade relative extinction risks. This model can be seen as an extension of Aldous' one-parameter splitting model ($\beta$, which controls tree balance) with two additional parameters: a new parameter $\alpha$ quantifying the correlation between the richness of a clade and its relative depth, and a parameter $\eta$ quantifying the correlation between the richness of a clade and its frequency (relative abundance or range), taken herein as a proxy for its overall extinction risk. We show on simulated phylogenies that loss of PD depends on the combined effect of all three parameters, $\beta$, $\alpha$ and $\eta$. In particular, PD may decrease as fast as species diversity when high extinction risks are clustered within small, old clades, corresponding to a parameter range that we term the 'thin ice zone' ($\beta<-1$ or $\alpha<0$; $\eta>1$). Besides, when high extinction risks are clustered within large clades, the loss of PD can be higher in trees that are more balanced ($\beta>0$), in contrast to the predictions of earlier studies based on simpler models. We propose a Monte-Carlo algorithm, tested on simulated data, to infer all three parameters. Applying it to a real dataset comprising 120 bird clades (class Aves) with known range sizes, we show that parameter estimates fall close to a 'thin ice zone': the combination of their ranked tree shape and non-random extinction risks makes them prone to a sudden collapse of PD.
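
As a small illustration of why the loss of PD depends on which species go extinct, the sketch below computes PD on a toy rooted tree as the total length of the branches connecting the surviving tips to the root. The rooted definition, the parent-map encoding and the three-species tree are assumptions chosen for illustration and are not taken from the paper.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical encoding: node -> (parent, length of the branch to the parent).
# The root has parent None and branch length 0.
Tree = Dict[str, Tuple[Optional[str], float]]

TOY_TREE: Tree = {
    "root": (None, 0.0),
    "x": ("root", 1.0),   # ancestor of the young clade {A, B}
    "A": ("x", 1.0),
    "B": ("x", 1.0),
    "C": ("root", 2.0),   # old lineage with a single species
}


def phylogenetic_diversity(tree: Tree, survivors: Iterable[str]) -> float:
    """Total branch length on the paths from the surviving tips to the root."""
    kept = set()
    for tip in survivors:
        node: Optional[str] = tip
        # Walk up until the root or an already counted ancestor is reached.
        while node is not None and node not in kept:
            kept.add(node)
            node = tree[node][0]
    return sum(tree[node][1] for node in kept)


if __name__ == "__main__":
    full = phylogenetic_diversity(TOY_TREE, ["A", "B", "C"])
    for lost in ("A", "C"):
        remaining = [tip for tip in ("A", "B", "C") if tip != lost]
        pd_left = phylogenetic_diversity(TOY_TREE, remaining)
        print(f"losing {lost}: PD {full} -> {pd_left}")
```

In this toy tree, losing the species on the long, old branch removes twice as much PD as losing one of the two species in the young clade, even though one species out of three disappears in both cases.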

Upcoming seminars


Room schedule of the Collège de France.
Collège de France intranet.