Estimation of Bergsma’s covariance

Article Type

Research Article

Publication Title

Journal of the Korean Statistical Society

Abstract

Bergsma (A new correlation coefficient, its orthogonal decomposition and associated tests of independence, arXiv preprint arXiv:math/0604627, 2006) proposed a covariance κ(X, Y) between random variables X and Y, and gave two estimates for it, based on n i.i.d. samples. He derived the asymptotic distributions of these estimates under the assumption of independence between X and Y. Our main focus is on the dependent case. This measure turns out to be the same as the distance covariance (dCov) measure for multivariate X and Y, when we specialize to real-valued X and Y. We first derive several alternate expressions for κ, which are useful for better understanding the properties of κ and its estimates. One of these alternate expressions leads to a very intuitive third estimator of κ that is a nice function of four U-statistics. We establish the exact finite-sample algebraic relation between the three estimates. This yields the relation between the biases of these estimators. In the dependent case, the central limit theorem for U-statistics readily shows that our estimate is asymptotically normal. The relation between the three estimates is then used to show that Bergsma’s two estimates have the same limit distribution in the dependent case. When X and Y are independent, the above limit is degenerate. With a higher scaling, the non-degenerate limit distribution of all three estimators is obtained using the theory of degenerate U-statistics and the above algebraic relations. In particular, the known asymptotic distribution results for Bergsma’s two estimates in the independent case follow. For specific parametric bivariate distributions, the value of κ can be derived in terms of the natural dependence parameters of these distributions. In particular, we derive the formula for κ when (X, Y) follows Gumbel’s bivariate exponential distribution. We bring out various aspects of these estimators through extensive simulations from several prominent bivariate distributions. In particular, we investigate the empirical relationship between κ and the dependence parameters, the distributional properties of the estimators, and the accuracy of these estimators. We also investigate the finite-sample powers of these measures for testing independence, and compare them with one another and with other well-known measures of this kind. Based on these exercises, the proposed estimator appears to be as good as or better than its competitors in terms of both power and computing efficiency.
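
For readers unfamiliar with the distance covariance that the abstract identifies with κ for real-valued X and Y, the following is a minimal sketch of the classical squared sample distance covariance in its V-statistic form (double-centered pairwise distance matrices). It is an illustration only and is not any of the three estimators studied in the article. The function name `sample_dcov_sq` is hypothetical, and any normalizing constant relating this quantity to κ is not stated in the abstract and is not assumed here.

```python
import numpy as np


def sample_dcov_sq(x, y):
    """Squared sample distance covariance (V-statistic form) for 1-D samples.

    Hypothetical helper for illustration; not an estimator from the article.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")

    # Pairwise absolute differences |x_i - x_j| and |y_i - y_j|.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    # Double centering: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    # Average of the elementwise products over all n^2 index pairs.
    return float((A * B).mean())
```

A call such as `sample_dcov_sq(x, y)` on two equal-length samples returns a value that is zero when either sample is constant and is symmetric in its arguments. The computation uses O(n²) time and memory, so it is suited to small illustrative samples only.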

First Page

1025

Last Page

1054

DOI

https://doi.org/10.1007/s42952-023-00236-1

Publication Date

12-1-2023

