Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations
Document Type
Conference Article
Publication Title
Leibniz International Proceedings in Informatics (LIPIcs)
Abstract
Estimating the size of the union of a stream of sets $S_1, S_2, \ldots, S_M$, where each set is a subset of a known universe $\Omega$, is a fundamental problem in data streaming. This problem naturally generalizes the well-studied $F_0$ estimation problem in the streaming literature, where each set contains a single element from the universe. We consider the general case when the sets $S_i$ can be succinctly represented and allow efficient membership, cardinality, and sampling queries (called a Delphic family of sets). A notable example in this framework is Klee's Measure Problem (KMP), where every set $S_i$ is an axis-parallel rectangle in $d$-dimensional space ($\Omega = [\Delta]^d$, where $[\Delta] := \{1, \ldots, \Delta\}$ and $\Delta \in \mathbb{N}$). Recently, Meel, Chakraborty, and Vinodchandran (PODS-21, PODS-22) designed a streaming algorithm for $(\varepsilon, \delta)$-estimation of the size of the union of set streams over a Delphic family with space and update time complexity $O\left(\frac{\log^3 |\Omega|}{\varepsilon^2} \cdot \log \frac{1}{\delta}\right)$ and $\widetilde{O}\left(\frac{\log^4 |\Omega|}{\varepsilon^2} \cdot \log \frac{1}{\delta}\right)$, respectively. This work presents a new, sampling-based algorithm for estimating the size of the union of Delphic sets that has space and update time complexity $\widetilde{O}\left(\frac{\log^2 |\Omega|}{\varepsilon^2} \cdot \log \frac{1}{\delta}\right)$. This improves the space complexity bound by a $\log |\Omega|$ factor and the update time complexity bound by a $\log^2 |\Omega|$ factor. A critical question is whether the quadratic dependence of the space and update time complexities on $\log |\Omega|$ is necessary. Specifically, can we design a streaming algorithm for estimating the size of the union of sets over a Delphic family with space complexity linear in $\log |\Omega|$ and update time $\mathrm{poly}(\log |\Omega|)$? While this appears technically challenging, we show that establishing a space lower bound of $\omega(\log |\Omega|)$ with $\mathrm{poly}(\log |\Omega|)$ update time is beyond the reach of current techniques. Specifically, we show that under a certain hard-to-prove computational complexity hypothesis, there is a streaming algorithm for the problem with optimal space complexity $O(\log |\Omega|)$ and update time $\mathrm{poly}(\log |\Omega|)$. Thus, establishing a space lower bound of $\omega(\log |\Omega|)$ would lead to breakthrough complexity class separation results.
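To make the Delphic-family setting concrete, here is a minimal Python sketch of the three queries the abstract names (membership, cardinality, sampling) for the KMP case of axis-parallel rectangles. It is an illustration only and not the paper's streaming algorithm: the names `Rectangle` and `exact_union_size` are hypothetical, and the union size is computed by brute force over a tiny universe purely as a reference value.

```python
import random
from dataclasses import dataclass
from itertools import product
from math import prod


@dataclass(frozen=True)
class Rectangle:
    """Axis-parallel rectangle in [Delta]^d, stored as inclusive corner points."""
    lo: tuple
    hi: tuple

    def contains(self, point):
        # Membership query: check each coordinate against its interval.
        return all(l <= x <= h for l, x, h in zip(self.lo, point, self.hi))

    def cardinality(self):
        # Cardinality query: product of the side lengths.
        return prod(h - l + 1 for l, h in zip(self.lo, self.hi))

    def sample(self, rng):
        # Sampling query: pick each coordinate uniformly from its interval.
        return tuple(rng.randint(l, h) for l, h in zip(self.lo, self.hi))


def exact_union_size(rects, delta, d):
    """Brute-force reference that enumerates all of [delta]^d (tiny universes only)."""
    return sum(
        any(r.contains(p) for r in rects)
        for p in product(range(1, delta + 1), repeat=d)
    )


# Hypothetical stream of two overlapping rectangles in [5]^2.
stream = [Rectangle((1, 1), (3, 3)), Rectangle((2, 2), (4, 5))]
rng = random.Random(0)
point = stream[0].sample(rng)
union_size = exact_union_size(stream, delta=5, d=2)
```

Each query runs in time proportional to $d$, which is the sense in which rectangles are succinctly represented. The brute-force count takes time $|\Omega|$ and serves only as ground truth against which a streaming estimate could be compared.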
DOI
10.4230/LIPIcs.APPROX/RANDOM.2024.26
Publication Date
9-1-2024
Recommended Citation
Nandi, Mridul; Vinodchandran, N. V.; Ghosh, Arijit; Meel, Kuldeep S.; Pal, Soumit; and Chakraborty, Sourav, "Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations" (2024). Conference Articles. 871.
https://digitalcommons.isical.ac.in/conf-articles/871