# On Some Estimation Problems through the Sub-linear Lens

## Date of Submission

December 2021

## Date of Award

12-1-2022

## Institute Name (Publisher)

Indian Statistical Institute

## Document Type

Doctoral Thesis

## Degree Name

Doctor of Philosophy

## Subject Name

Computer Science

## Department

Advance Computing and Microelectronics Unit (ACMU-Kolkata)

## Supervisor

Bishnu, Arijit (ACMU-Kolkata; ISI)

## Abstract (Summary of the Work)

The way the data is processed or some computation done over it has changed drastically over time with the advent of the paradigm of big data. The study of model-centric computation gives the theoretical underpinning to the machine models that abstract such data processing or computation. Of the many such models, this thesis focuses on the following models: (i) sub-linear time and query complexity, (ii) sub-linear space or streaming. The main focus of this thesis is on estimating some parameters of graphs using the above mentioned models. In the query complexity framework, this thesis starts with a query oracle called BIPARTITE INDEPENDENT SET (BIS) on graphs introduced by Beame et al.[ITCS’20] to estimate the number of edges in a graph with polylogarithmic BIS queries. In BIS, the input to the oracle is two disjoint subsets of vertices of the graph, and the oracle reports if there exists an edge with vertices in both of the sets. In this thesis, we first generalized BIS to TRIPARTITE INDEPENDENT SET oracle (TIS) to estimate the number of triangles using polylogarithmic TIS queries. In TIS, the input to the oracle is three pairwise disjoint subsets of vertices of the graph, and the oracle reports if there exists a triangle with vertices in each of the three sets. BIS oracle is further generalized to GENERALIZED D-PARTITE INDEPENDENT SET oracle (GPIS) for a d-uniform hypergraph, where we give d disjoint vertex sets of the hypergraph as input to the oracle, and the oracle reports if there exists a hyperedge having one vertex in each of the d sets. We show that the number of hyperedges can be estimated using polylogarithmically many queries to GPIS oracle. Note that BIS is a special case of GPIS when d = 2. Also, triangle estimation using TIS queries is essentially the same as that of hyperedge estimation in a 3-uniform hypergraph using GPIS queries when d = 3, where the vertex set of the hypergraph is same as that of the graph and three vertices form a hyperedge in the hypergraph if they form a triangle in the original graph. This thesis also considers a problem with a parameterization flavor in the query setting – the d-HITTING SET problem. Here, a parameter k 2 N is given as input along with GPIS query oracle access to a d-uniform hypergraph; k is a parameter related to the hitting set. We show that d-HITTING SET can be solved by using O minfkd log n; k2d2g log k GPIS queries, and (kd) GPIS queries are necessary to solve the problem. In the streaming framework, we consider two problems in this thesis. The first one is about parameterized graph deletion problems in all standard graph streaming models like DYNAMIC EDGE ARRIVAL, EDGE ARRIVAL, VERTEX ARRIVAL and ADJACENCY LIST. The main upper bound result is that SUBGRAPH DELETION and MINOR DELETION problem can be solved by using f(K) space in ADJACENCY LIST model, where the parameter K 2 N is the vertex cover size of the input graph and f is a polynomial function. We complement the upper bound results by a set of lower bound results when we take solution size as our parameter for most of the streaming models. The second problem, in the streaming framework, is a graph coloring problem that is otherwise easy in the RAM model but becomes quite non-trivial in the one-pass streaming model. In contrast to previous graph coloring problems in streaming that try to find an assignment of colors to vertices, our main work is on estimating the number of conflicting or monochromatic edges given a coloring function that is streaming along with the graph; we call the problem CONFLICT-EST. The coloring function on a vertex can be read or accessed only when the vertex is revealed in the stream. We provide algorithms for a graph that is streaming in different variants of the one-pass vertex arrival streaming model, viz. the VERTEX ARRIVAL (VA), Vertex Arrival With Degree Oracle (VADEG), VERTEX ARRIVAL IN RANDOM ORDER (VARAND) models, with special focus on the random order model and the model with degree oracle. We also provide matching lower bounds for most of the cases

## Control Number

ISILib-TH527

## Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

## DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/2146

## Recommended Citation

Mishra, Gopinath Dr., "On Some Estimation Problems through the Sub-linear Lens" (2022). *Doctoral Theses*. 549.

https://digitalcommons.isical.ac.in/doctoral-theses/549

## Comments

ProQuest Collection ID: https://www.proquest.com/pqdtlocal1010185/dissertations/fromDatabasesLayer?accountid=27563