Divisive hierarchical clustering for interval-valued data

Zhu, Jiankun

2019

Abstract

Real-world data come in many types to which conventional statistical theory and methods may not be directly applicable; symbolic data are one such type. The observations in a symbolic data set are described by categorical variables, intervals, histograms, distributions and so forth, rather than by single values, so symbolic data require new methods of analysis. In this dissertation, we develop divisive hierarchical clustering methodologies for interval-valued data, the most commonly used type of symbolic data. We first propose three monothetic divisive clustering algorithms for interval-valued data, together with a weighted symbolic principal component analysis (PCA) method for interval-valued data. The first algorithm is based on the symbolic covariance PCA method proposed by Le-Rademacher (2008) and Le-Rademacher and Billard (2012). The second is based on our proposed weighted symbolic PCA method. The third is based on the endpoints of the intervals. We then propose two mixed-strategy algorithms that combine these three algorithms. A series of simulations compares all of these algorithms with an existing monothetic divisive algorithm proposed by Chavent (1998, 2000). The two mixed-strategy algorithms outperform all the other monothetic algorithms in these simulations, and they are also applied to real-world data to further validate their effectiveness. Furthermore, we propose a polythetic divisive clustering algorithm for interval-valued data based on minimum spanning trees. The effectiveness of this algorithm is likewise verified through several simulations and real-data applications.
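
To make the idea of a polythetic divisive split on a minimum spanning tree concrete, the following is a minimal illustrative sketch and not the algorithm developed in the dissertation. It assumes each observation is a list of (lower, upper) intervals, uses a Euclidean distance over the interval endpoints as a stand-in dissimilarity, and bisects the data by removing the longest edge of the minimum spanning tree. The function names and the toy data are hypothetical.

```python
import math

def interval_distance(a, b):
    """Euclidean distance over lower and upper endpoints (an assumed dissimilarity)."""
    return math.sqrt(sum((al - bl) ** 2 + (au - bu) ** 2
                         for (al, au), (bl, bu) in zip(a, b)))

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost for each vertex
    parent = [-1] * n       # tree vertex that provides that cheapest connection
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = interval_distance(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def split_by_longest_edge(points):
    """Bisect the observations by cutting the longest MST edge (needs >= 2 points)."""
    edges = minimum_spanning_tree(points)
    longest = max(edges)
    adjacency = {i: [] for i in range(len(points))}
    for edge in edges:
        if edge is not longest:
            _, i, j = edge
            adjacency[i].append(j)
            adjacency[j].append(i)
    # Collect the component that contains observation 0; the rest is the other cluster.
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return sorted(seen), sorted(set(range(len(points))) - seen)

# Toy data: five observations, each described by two interval-valued variables.
observations = [
    [(1.0, 2.0), (0.5, 1.5)],
    [(1.2, 2.1), (0.4, 1.6)],
    [(0.9, 1.8), (0.6, 1.4)],
    [(8.0, 9.5), (7.0, 8.0)],
    [(8.2, 9.0), (7.5, 8.5)],
]
left, right = split_by_longest_edge(observations)
print(left, right)  # [0, 1, 2] [3, 4]
```

Applying the same split recursively to each resulting cluster would yield a divisive hierarchy. The split is polythetic in the sense that every variable contributes to the dissimilarity at once, whereas a monothetic algorithm divides a cluster on one variable at a time.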

Details

Record ID

7181

Record Created

2024-12-05

Title

Divisive hierarchical clustering for interval-valued data

Author

Zhu, Jiankun

Contributor

Billard, Lynne (Advisor)
Bai, Shuyang (Committee Member)
Park, Cheolwoo (Committee Member)
Sriram, T. N. (Committee Member)

College or School

Franklin College of Arts and Sciences

Department

Statistics

Date

2019

Publisher

University of Georgia

Content Type

Dissertation

Language

English

Dissertation/Thesis Note

Doctoral

Degree Type

Doctor of Philosophy (PhD)

Name of Granting Institution

University of Georgia, Summer 2019

Year Degree Granted

2019

Keywords

Symbolic data; Interval-valued data; Divisive hierarchical clustering; Monothetic algorithm; Polythetic algorithm; Symbolic principal component analysis; Minimum spanning tree

System Control Number

9949334969502959
