Use of EM algorithm for data reduction under sparsity assumption

Article Type

Research Article

Publication Title

Computational Statistics


Recent scientific applications produce data that are too large to store or to process for further statistical analysis. This motivates the construction of an optimal mechanism for choosing only a subset of the available information and for drawing inferences about the parent population from the stored subset alone. This paper addresses the problem of estimating a parameter from such filtered data. Instead of all the observations, we observe only a few chosen linear combinations of them and treat the remaining information as missing. From the observed linear combinations we estimate the parameter using an EM-based technique, under the assumption that the parameter is sparse. We propose two related methods, called ASREM and ESREM. The methods developed here are also used for hypothesis testing and for the construction of confidence intervals. A similar data-filtering approach already exists in the signal-sampling paradigm, for example Compressive Sampling, introduced by Candès et al. (Commun Pure Appl Math 59(8):1207–1223, 2006) and Donoho (IEEE Trans Inf Theory 52:1289–1306, 2006). We do not claim that the proposed methods outperform all available signal-recovery techniques; rather, we suggest them as an alternative way of compressing data using the EM algorithm. We do, however, compare our methods with one standard algorithm, namely robust signal recovery from noisy data using min-ℓ1 with quadratic constraints. Finally, we apply one of our methods to a real-life dataset.
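To make the data-reduction setting concrete, the following is a minimal sketch, not the paper's ASREM or ESREM method. It assumes a model in which the full data x equal a sparse parameter theta plus Gaussian noise, and in which only m linear combinations y = A x are stored. The Gaussian matrix A, the sizes n, m, k, the penalty lam, and the helper names are all hypothetical choices made for illustration. Recovery uses plain iterative soft-thresholding (ISTA) on an ℓ1-penalised least-squares objective, a standard relative of the min-ℓ1 approach mentioned above, swapped in here only because the EM-based estimators are not specified in this abstract.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: proximal operator of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, y, theta, lam):
    # 0.5 * ||A theta - y||^2 + lam * ||theta||_1
    r = A @ theta - y
    return 0.5 * (r @ r) + lam * np.abs(theta).sum()

def ista(A, y, lam, n_iter=500):
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the quadratic term (squared spectral norm of A).
    L = np.linalg.norm(A, 2) ** 2
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ theta - y)
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

# Hypothetical problem sizes: n observations, m stored combinations, k nonzeros.
rng = np.random.default_rng(0)
n, m, k = 200, 60, 5

# Assumed model: sparse parameter plus noise gives the full data x.
theta_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
theta_true[support] = rng.normal(0.0, 5.0, size=k)
x = theta_true + 0.1 * rng.normal(size=n)

# Data reduction: keep only m linear combinations of the n observations.
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x

# Sparse estimate of theta from the stored combinations alone.
theta_hat = ista(A, y, lam=0.1)
```

Only `A` and `y` (60 numbers instead of 200 in this toy setup) need to be retained; the quality of `theta_hat` depends on the assumed sparsity level, the noise scale, and the choice of `lam`.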
