Abstract
In this thesis, two different nonparametric methods are developed in the statistical field of multivariate association and dimension reduction. While the underlying goal in both methods is to detect both linear and nonlinear relationships between multiple sets and groups of multivariate random vectors, different uses in statistical applications motivate the methods. The primary goal of the information theory based method of Chapter 2 is to provide an overall measure of association between sets of random vectors. In Chapter 3, a method focusing on dimension reduction is developed to extend Canonical Correlation Analysis (CCA), pioneered by Hotelling [6], to identify nonlinear relationships.

Motivated by a problem in morphological integration studies, a field in biological science, a new general index based on Kullback-Leibler (KL) information is proposed to measure the relationships between multiple sets of random vectors. The relationships are detected using a measure of the dependence between multiple sets by calculating the difference between the joint and marginal densities of affine matrix transformations of the random vectors. From this index, we define an overall measure of dependence between multiple sets, initially motivated by a problem in morphometrics. In addition, we develop two methods for dimension reduction for m-sets of random vectors and then extend these to multiple groups of multiple sets.

The second index recovers relationships between sets using a composite L2 distance measure between linear combinations of one vector and an unknown single index model regression function of the other, interchanging the roles of each respectively. Estimates of the regression functions are calculated using the nonparametric Nadaraya and Watson [19] [27] smoother, thus enabling our index to detect both linear and nonlinear relationships.
This method is then extended to identify associations between multiple sets and multiple groups of random vectors.

In addition to detecting the nature of the relationships, a bootstrap procedure inspired by Ye and Weiss [32] is developed to determine the number of significant associations. Moreover, this procedure is independent of the measure used to detect the relationships.

Canonical Correlation Analysis is a common measure of the pair-wise linear association between two sets of random vectors and is often used as a benchmark for comparison. In contrast to CCA, both of our methods are shown to determine the existence of both linear and nonlinear relationships, thereby making them useful in many statistical applications.
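
The contrast between a linear measure and a Nadaraya-Watson based one can be illustrated with a minimal sketch. It is not the index developed in the thesis: the function name `nadaraya_watson`, the Gaussian kernel, the bandwidth of 0.3, and the simulated univariate data are all assumptions chosen for illustration. The sketch simulates a purely quadratic relationship, computes the Pearson correlation (the single-variable analogue of what CCA measures), and compares it with the share of variance explained by a Nadaraya-Watson fit.

```python
import numpy as np


def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Gaussian-kernel Nadaraya-Watson estimate of E[y | x] at x_eval.

    Hypothetical helper for illustration only; kernel and bandwidth
    are assumptions, not choices taken from the thesis.
    """
    # Pairwise scaled differences, shape (n_eval, n_train).
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)
    # Locally weighted average of the observed responses.
    return (weights @ y_train) / weights.sum(axis=1)


# Simulated data (an assumption): y depends on x only through x**2.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=400)
y = x**2 + rng.normal(scale=0.2, size=x.size)

# Linear association: the Pearson correlation is expected to be small
# because the relationship is symmetric about zero.
linear_r = np.corrcoef(x, y)[0, 1]

# Nonparametric fit: in-sample fraction of the variance of y that the
# smoother accounts for.
fitted = nadaraya_watson(x, y, x, bandwidth=0.3)
explained = 1.0 - np.mean((y - fitted) ** 2) / np.var(y)

print(f"Pearson correlation: {linear_r:.3f}")
print(f"Variance explained by smoother: {explained:.3f}")
```

In this setting the linear measure should report little association while the smoother recovers most of the structure, which is the behaviour that motivates replacing the linear fit with a nonparametric regression estimate. The bandwidth is fixed by hand here; in practice it would need to be selected from the data.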