Date of Submission


Date of Award


Institute Name (Publisher)

Indian Statistical Institute

Document Type

Doctoral Thesis

Degree Name

Doctor of Philosophy

Subject Name

Computer Science


Department

Theoretical Statistics and Mathematics Unit (TSMU-Delhi)


Guide(s)

Rao, B. L. S. Prakasa (TSMU-Delhi; ISI)

Abstract (Summary of the Work)

In Statistics, a classical problem is that of estimating the regression function, defined as $m(t) := E(Y \mid X = t)$, $t \in \mathbb{R}$, for two random variables $X$ and $Y$ such that $E|Y| < \infty$. The estimators are constructed based on a sample $\{(X_i, Y_i)\}$, $1 \le i \le n$, $n \ge 1$, from the distribution of $(X, Y)$. Throughout this thesis, we assume $X$ and $Y$ to be real-valued for the sake of convenience.

The classical approach to this problem is to assume a parametrized, polynomial form for $m(\cdot)$, i.e., $m(t) := \beta_0 + \sum_{j=1}^{p} \beta_j t^j$, $p \ge 1$, and obtain estimates of the unknown parameters $\beta_0, \beta_j$, $1 \le j \le p$. Later, with the development of techniques for non-parametric density estimation, it was sought to extend these techniques to regression estimation. Heuristically, the two problems can be seen to be related as follows: let $f_1(\cdot)$ be the marginal density of $X$ and note that

$$E\,1(X \le x) = \int_{-\infty}^{x} f_1(t)\,dt, \quad x \in \mathbb{R}, \qquad (1.0.1)$$

whereas

$$E\,Y\,1(X \le x) = \int_{-\infty}^{x} m(t) f_1(t)\,dt, \quad x \in \mathbb{R}. \qquad (1.0.2)$$

In other words, (1.0.1) can be looked upon as a special case of (1.0.2), with $Y \equiv 1$. (This similarity, as we shall see later on, has been the underlying theme in Chapters 2 and 4 of the present work.)

The following non-parametric regression estimator was proposed independently by Nadaraya (1964) and Watson (1964):

$$m_n^{NW}(t) := m_n(Y, t)/m_n(1, t), \quad t \in \mathbb{R}, \qquad (1.0.3)$$

where

$$m_n(Y, t) = (n a_n)^{-1} \sum_{i=1}^{n} Y_i K((t - X_i)/a_n), \qquad m_n(1, t) = (n a_n)^{-1} \sum_{i=1}^{n} K((t - X_i)/a_n). \qquad (1.0.4)$$

Here $K(\cdot)$, the so-called kernel function, is chosen to satisfy various analytical conditions (typically, $K(\cdot)$ is taken to be a density function), and $a_n \downarrow 0$ are the bandwidths, which go to zero sufficiently slowly (e.g., $n a_n \to \infty$ as $n \to \infty$) in order to ensure consistency of the estimator $m_n^{NW}(\cdot)$. The intuition behind such an estimator is that $m_n(Y, t)$ is an estimator of $m(t) f_1(t)$ while $m_n(1, t)$ estimates the density $f_1(\cdot)$. See Prakasa Rao (1983), Chapters 1-4, for an introduction to non-parametric density and regression estimation. Now, $m(t)$ is a functional of the conditional distribution of $Y$, given $X = t$.
A natural generalisation of the regression estimation problem seems to be the estimation of the following functionals:

$$m_h(t_1, \ldots, t_k) := E\{h(Y_1, \ldots, Y_k) \mid X_1 = t_1, \ldots, X_k = t_k\}, \quad (t_1, \ldots, t_k) \in \mathbb{R}^k,\ k \ge 1, \qquad (1.0.5)$$

where $h: \mathbb{R}^k \to \mathbb{R}$ is such that $E|h(Y_1, \ldots, Y_k)| < \infty$. A similar generalisation led Hoeffding (1948) from the sample mean to the theory of so-called U-statistics, in the unconditional set-up. The estimation of (1.0.5) was considered, for the first time in published form, in Stute (1991), where conditional U-statistics of the form

$$u_n(t_1, \ldots, t_k) := \frac{\sum h(Y_{i_1}, \ldots, Y_{i_k}) \prod_{j=1}^{k} K((t_j - X_{i_j})/a_n)}{\sum \prod_{j=1}^{k} K((t_j - X_{i_j})/a_n)}$$

(the sums extending over all $k$-tuples of pairwise distinct indices) were proposed as estimators. Here $F_n(\cdot) := n^{-1} \sum_{i=1}^{n} 1(X_i \le \cdot)$ denotes the empirical distribution function (e.d.f.). Bochynek (1987) discussed the asymptotic normality of conditional U- and V-statistics and performed simulation studies on them. Stute (1991) established weak and strong pointwise consistency and asymptotic normality of $u_n(t)$. Liero (1991) studied uniform strong consistency of conditional U-statistics and established asymptotic normality of the integrated squared error (ISE) statistic

$$\int_A \left(u_n(t) - m_h(t)\right)^2 w(t)\,dt$$

for suitable $A \subset \mathbb{R}^k$ and weight function $w(\cdot)$. We quote the following examples to illustrate the possible use of conditional U-statistics; see Stute (1991) and Bochynek (1987) for other examples. Throughout this thesis, our set-up will be as follows: $\{(X_n, Y_n)\}_{n \ge 1}$ is a bi-variate i.i.d. sequence, with $(X_1, Y_1)$ having joint density $f(\cdot, \cdot)$ and $X_1$ having marginal density $f_1(\cdot)$. Consequently,


ProQuest Collection ID

Control Number


Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.


Included in

Mathematics Commons