Estimating menarcheal age distribution from partially recalled data

Article Type

Research Article

Publication Title

Biostatistics (Oxford, England)


In a cross-sectional study, adolescent and young adult females were asked to recall the time of menarche, if experienced. Some respondents recalled the date exactly, some recalled only the month or the year of the event, and some were unable to recall anything. We consider estimation of the menarcheal age distribution from these interval-censored data. A complicated interplay between age-at-event and calendar time, together with the evident fact that memory fades with time, makes the censoring informative. We propose a model in which the probabilities of the various types of recall depend on the time since menarche. For parametric estimation, we model these probabilities using a multinomial regression function. Establishing consistency and asymptotic normality of the parametric maximum likelihood estimator requires some modification of the standard asymptotic theory, as the data format varies from case to case. We also provide a non-parametric maximum likelihood estimator, propose a computationally simpler approximation, and establish the consistency of both of these estimators under mild conditions. We study the small-sample performance of the parametric and non-parametric estimators through Monte Carlo simulations. Moreover, we provide a graphical check of the assumption of the multinomial model for the recall probabilities, which appears to hold for the menarcheal data set. Our analysis shows that using the partially recalled part of the data indeed leads to narrower confidence intervals for the survival function.
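As a rough illustration of the recall model described above, the sketch below computes recall-type probabilities that vary with time since menarche. It assumes a multinomial-logit form with a linear predictor in time; the abstract says only "multinomial regression function", so the link, the category names, the function name `recall_probabilities`, and all coefficient values are hypothetical and are not taken from the article.

```python
import math

# Hypothetical recall categories; the labels are illustrative only.
CATEGORIES = ("exact_date", "month_only", "year_only", "no_recall")


def recall_probabilities(time_since_event, intercepts, slopes):
    """Multinomial-logit recall probabilities given time since menarche (years).

    Assumption: the last category ("no_recall") is the reference category,
    with linear predictor fixed at 0. `intercepts` and `slopes` hold one
    entry for each of the three non-reference categories.
    """
    linear = [a + b * time_since_event for a, b in zip(intercepts, slopes)]
    linear.append(0.0)  # reference category
    shift = max(linear)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in linear]
    total = sum(weights)
    return dict(zip(CATEGORIES, (w / total for w in weights)))


# Made-up coefficients: negative slopes mean detailed recall fades with time.
intercepts = [2.0, 1.0, 0.5]
slopes = [-0.8, -0.3, -0.1]

recent = recall_probabilities(0.5, intercepts, slopes)
distant = recall_probabilities(8.0, intercepts, slopes)
```

With these made-up coefficients, the probability of exact recall is lower for the respondent whose menarche occurred eight years ago than for the one whose menarche occurred six months ago, which is the memory-fading pattern that makes the censoring informative.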

Open Access, Green