Conference Articles

Exploring the Gap Between Tolerant and Non-Tolerant Distribution Testing

Sourav Chakraborty, Indian Statistical Institute, Kolkata
Eldar Fischer, Technion - Israel Institute of Technology
Arijit Ghosh, Indian Statistical Institute, Kolkata
Gopinath Mishra, University of Warwick
Sayantan Sen, Indian Statistical Institute, Kolkata

Document Type

Conference Article

Publication Title

Leibniz International Proceedings in Informatics, LIPIcs

Abstract

The framework of distribution testing is currently ubiquitous in the field of property testing. In this model, the input is a probability distribution accessible via independently drawn samples from an oracle. The testing task is to distinguish a distribution that satisfies some property from a distribution that is far in some distance measure from satisfying it. The task of tolerant testing imposes a further requirement: distributions close to satisfying the property must also be accepted. This work focuses on the connection between the sample complexities of non-tolerant testing of distributions and their tolerant testing counterparts. When limiting our scope to label-invariant (symmetric) properties of distributions, we prove that the gap is at most quadratic, ignoring poly-logarithmic factors. Conversely, the property of being the uniform distribution is indeed known to have an almost-quadratic gap. When moving to general, not necessarily label-invariant properties, the situation is more complicated, and we show some partial results. We show that if a property requires the distributions to be non-concentrated, that is, the probability mass of the distribution is sufficiently spread out, then it cannot be non-tolerantly tested with o(√n) many samples, where n denotes the universe size. Clearly, this implies at most a quadratic gap, because a distribution can be learned (and hence tolerantly tested against any property) using O(n) many samples. Being non-concentrated is a strong requirement on properties, as we also prove a close to linear lower bound against their tolerant tests. Apart from the case where the distribution is non-concentrated, we also show that if an input distribution is very concentrated, in the sense that it is mostly supported on a subset of size s of the universe, then it can be learned using only O(s) many samples. The learning procedure adapts to the input, and works without knowing s in advance.
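As a rough illustration of the sample-access model described in the abstract, the sketch below shows two standard building blocks: a plug-in (empirical) estimate of a distribution over a universe {0, ..., n-1}, which is the kind of learning step that needs on the order of n samples, and a collision statistic, whose expectation is the sum of squared probabilities and equals 1/n exactly when the distribution is uniform. This is not taken from the paper; the function names are hypothetical, and the code makes no claim about the paper's actual algorithms or constants.

```python
from collections import Counter


def empirical_distribution(samples, n):
    """Plug-in estimate: fraction of samples landing on each element of {0, ..., n-1}."""
    counts = Counter(samples)
    m = len(samples)
    return [counts[i] / m for i in range(n)]


def total_variation(p, q):
    """Total variation distance between two distributions given as equal-length lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def collision_statistic(samples):
    """Fraction of unordered sample pairs that are equal.

    For i.i.d. samples from p, its expectation is sum_i p_i**2,
    which is 1/n for the uniform distribution and larger otherwise.
    """
    counts = Counter(samples)
    m = len(samples)
    pairs = m * (m - 1) // 2
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    return collisions / pairs
```

A tolerant tester can be obtained from the first two functions by learning the distribution and comparing distances, at the cost of roughly linear sample complexity. A non-tolerant uniformity tester can instead threshold the collision statistic, which is the usual route to sample complexity near √n for a constant distance parameter.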

DOI

10.4230/LIPIcs.APPROX/RANDOM.2022.27

Publication Date

9-1-2022

Recommended Citation

Chakraborty, Sourav; Fischer, Eldar; Ghosh, Arijit; Mishra, Gopinath; and Sen, Sayantan, "Exploring the Gap Between Tolerant and Non-Tolerant Distribution Testing" (2022). Conference Articles. 380.
https://digitalcommons.isical.ac.in/conf-articles/380

