#### Date of Submission

2-28-2010

#### Date of Award

2-28-2011

#### Institute Name (Publisher)

Indian Statistical Institute

#### Document Type

Doctoral Thesis

#### Degree Name

Doctor of Philosophy

#### Subject Name

Quantitative Economics

#### Department

Applied Statistics Unit (ASU-Kolkata)

#### Supervisor

Bose, Mausumi (ASU-Kolkata; ISI)

#### Abstract (Summary of the Work)

In this thesis we study some problems in survey sampling and propose their solutions. The thesis is divided into seven chapters, the first being an introductory one. Chapters 2-4 relate to surveys with direct responses, while Chapters 5-7 deal with randomized responses. We begin by introducing the motivations behind the principal problems studied in this thesis.

The first problem arose while developing a survey where the objective was to estimate the total number of workers in different industries in the rural unorganized sector in a district of the state of West Bengal, India. For this, the technique of adaptive sampling was found useful, but it escalated the survey costs substantially as the final adaptive sample size was prohibitively large. So we studied this aspect and proposed an easily implementable modification to the traditional adaptive sampling method which effectively controls the final sample size. Corresponding estimators and variance estimators were derived too.

In the above study we adopted the Rao-Hartley-Cochran (1962) (RHC) scheme of sampling and used the traditional estimator (RHCE) and its variance estimator given by Rao, Hartley and Cochran (1962). We were then curious to see whether the celebrated estimator due to Horvitz and Thompson (HTE) could also be employed here when the sample was an RHC sample. On studying this problem and deducing the required inclusion probabilities, we found that this can indeed be done; the HTE based on an RHC sample turns out to be quite competitive with the traditionally used RHCE.

The second problem came from a survey which was carried out to estimate the prevalence rate of a disease in a certain geographical area, namely the Kolkata Municipal Area in West Bengal. For this estimation, we proposed a modified estimator similar to the Hartley-Ross estimator and showed that suitable modeling, combined with our proposed design-based estimator, can lead to improved estimation.

Next, we focused on the randomized response (RR) techniques, which are of immense use in practice when the variable under study is a stigmatizing or sensitive one. Here it is well known that in Warner's model, based on a simple random sample with replacement (SRSWR), a respondent is asked to generate an RR every time he/she is selected, the traditional estimator being the sample mean of these RRs. In the context of direct surveys with SRSWR, classical results due to Basu (1958), Pathak (1962) and others showed that estimators using the direct responses from the distinct sampled units perform better than the sample mean. We were curious to examine if a parallel result is also true in the RR scenario. We carried out this study for sampling with a fixed sample size and also for inverse sampling. This in turn led to some more related questions in the area of RR-based estimation.

We now introduce a number of terms and notation which we will use throughout. In the next section, we briefly discuss some sampling schemes and estimators, together with a few relevant references. More details and references are cited in the appropriate chapters that follow. In Section 1.3, we present a chapter-wise summary.

A finite collection of a known number of identifiable units will be called a survey population. This population will be denoted by U and we suppose that it consists of N units, labeled as 1, ..., i, ..., N. Let y be a variable of interest, the unknown but fixed values of y for the units in U being Y1, Y2, ..., YN. We assume that for a sampled unit i, Yi can be ascertained without error if y is not a sensitive variable. This is the usual case of direct surveys where a direct response can be obtained for Yi. Otherwise, that is, if y is a sensitive variable, then Yi cannot be ascertained by direct response and so one has to have recourse to randomized responses obtained on using a suitable randomization device.

#### Control Number

ISILib-TH300

#### Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

#### DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/2146

#### Recommended Citation

Dihidar, Kajal Dr., "Sampling 'Survey Populations': Some Problems and Their Solutions." (2011). *Doctoral Theses*. 82.

https://digitalcommons.isical.ac.in/doctoral-theses/82

## Comments

ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28842858