Université Paris 6 Pierre et Marie Curie | Université Paris 7 Denis Diderot

CNRS U.M.R. 7599

"Probabilités et Modèles Aléatoires"

**Author(s):**

**MSC Classification Code(s):**

- 62G05 Estimation
- 62E25 Monte Carlo studies

**Abstract:** Given an $n$-sample from some unknown density $f$ on $[0,1]$, it is easy to construct a
histogram of the data based on some given partition of $[0,1]$, but little is known
about an optimal choice of the partition, especially when the data set is not large, even if
one restricts attention to partitions into intervals of equal length. Existing methods are either rules
of thumb or based on asymptotic considerations, and they often involve some smoothness
properties of $f$. Our purpose in this paper is to give a fully automatic and simple method for
choosing the number of bins of the partition from the data. It is based on a nonasymptotic
evaluation, due to Castellan, of the performance of penalized maximum likelihood estimators in some
exponential families, and on extensive simulations which allowed us to optimize
the form of the penalty function. These simulations show that the method works quite well
for sample sizes as small as 25.
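
To make the selection principle concrete, here is a minimal Python sketch of choosing the number of bins of a regular histogram on $[0,1]$ by penalized maximum likelihood. The abstract does not state the penalty or the range of candidate bin counts, so the penalty `D - 1 + (log D)^2.5` and the cap `n / log n` used below are assumptions made for illustration only, and the function names are hypothetical.

```python
import math


def penalized_log_likelihood(data, n_bins, penalty):
    """Log-likelihood of the regular histogram with n_bins bins on [0, 1],
    evaluated at the data, minus penalty(n_bins)."""
    n = len(data)
    counts = [0] * n_bins
    for x in data:
        # Map x in [0, 1] to a bin index; x == 1.0 falls in the last bin.
        j = min(int(x * n_bins), n_bins - 1)
        counts[j] += 1
    # The histogram density on bin j is n_bins * counts[j] / n.
    # Empty bins contain no observations, so they contribute nothing.
    loglik = sum(c * math.log(n_bins * c / n) for c in counts if c > 0)
    return loglik - penalty(n_bins)


def assumed_penalty(d):
    # ASSUMPTION: illustrative penalty shape, not taken from the abstract.
    return d - 1 + math.log(d) ** 2.5


def choose_n_bins(data, penalty=assumed_penalty, max_bins=None):
    """Return the bin count that maximizes the penalized log-likelihood."""
    n = len(data)
    if max_bins is None:
        # ASSUMPTION: cap the candidates at n / log(n), at least one bin.
        max_bins = max(1, int(n / math.log(n))) if n > 1 else 1
    return max(
        range(1, max_bins + 1),
        key=lambda d: penalized_log_likelihood(data, d, penalty),
    )


if __name__ == "__main__":
    # Evenly spread points look uniform, so a single bin is selected.
    evenly_spread = [(i + 0.5) / 100 for i in range(100)]
    print(choose_n_bins(evenly_spread))
```

Passing a different `penalty` callable lets one compare penalty shapes on the same sample, which is the kind of experiment the simulations described above would require.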

**Keywords:** *Regular histogram; density estimation; penalized maximum likelihood; model selection*

**Date:** 2002-04-12

**Preprint number:** *PMA-721*

**PDF file:** PMA-721.pdf