Abstract
Classical statistical methodologies cannot be readily applied to symbolic data, whose observations take multiple values, such as intervals and histograms, rather than the single value of classical data. This research focuses on multivariate histogram-valued data. Although some existing methodologies handle such data, a widely used and theoretically founded method is still lacking. In this work, we adapt copula theory to symbolic data to estimate the distribution of the data, and then study the covariance structure of histogram-valued symbolic data. We develop theoretical results and illustrate them with numerical studies and a real data example. As two extensions, we develop principal component analysis and a mixture clustering model for histogram-valued data.