New Regularization Methods for Supervised Learning with High-Dimensional Data

Ma, Ziyang

Abstract

Supervised learning problems in high-dimensional settings have a wide range of applications across disciplines, such as prediction using high-throughput data from molecular biology. High dimensionality poses many challenges to traditional supervised learning and has attracted great attention in the statistics and machine learning communities. One solution is to use regularization methods. This dissertation considers new regularization approaches for high-dimensional data in the context of two supervised learning topics.

The first topic concerns ordinal classification problems, which lie between standard classification and regression. We propose two novel methods built on a new regularization idea: weighting the features by their rank correlations with the class labels. In the first method, we incorporate the feature weights into the framework of linear discriminant analysis and add the group Lasso penalty to achieve sparse solutions. In the second method, we add the weights into sparse optimal scoring with an adaptive Lasso penalty. Both proposed methods can project the original data onto a lower-dimensional subspace that reveals the underlying ordinal structure. This distinguishes our methods from existing work, which assumes a strict underlying linear ordinality within the data. We also demonstrate the difference between linear and nonlinear ordinality and show that our methods are capable of detecting nonlinear ordinality and are applicable to high-dimensional data. Simulation studies and real data examples show that the proposed methods have superior performance for ordinal classification with respect to various evaluation metrics.

The second topic revisits the trace ratio optimization problems involved in dimension reduction. Solving the trace ratio optimization is not straightforward, and it is conventionally replaced by a sub-optimal alternative, the ratio trace problem.
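
For orientation, the trace ratio and ratio trace problems are commonly written as below. The symbols are assumptions made for illustration and may differ from the dissertation's notation: A and B are symmetric p x p matrices with B positive definite (for example, between-class and within-class scatter), and V is a p x d projection matrix.

```latex
% Trace ratio problem: one ratio of two traces, V constrained to be orthonormal
\max_{V^\top V = I_d} \;
  \frac{\operatorname{tr}\left(V^\top A V\right)}{\operatorname{tr}\left(V^\top B V\right)}

% Ratio trace problem: trace of a matrix "ratio"
\max_{V} \;
  \operatorname{tr}\left[\left(V^\top B V\right)^{-1} V^\top A V\right]
```

The ratio trace form is typically solved through the generalized eigenvalue problem $A v = \lambda B v$, whereas the trace ratio form has no closed-form solution in general and is usually handled iteratively.
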
We consider a trace regularization method and modify it for high-dimensional canonical correlation analysis (CCA). Results from numerical studies demonstrate the efficiency of the modified trace regularization method compared with other well-known high-dimensional CCA approaches.
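
The feature-weighting idea described for the first topic can be illustrated with a minimal sketch. The abstract does not say which rank correlation is used or how the weights enter the penalties, so this example assumes Spearman's rank correlation and simply returns its absolute value per feature; the function names `average_ranks`, `spearman`, and `rank_correlation_weights` are hypothetical and not taken from the dissertation.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant variable carries no rank information
    return cov / sqrt(vx * vy)


def rank_correlation_weights(X, labels):
    """Assumed weighting: |Spearman correlation| of each feature with the labels.

    X is a list of samples (rows); labels are ordinal class codes.
    """
    n_features = len(X[0])
    return [abs(spearman([row[j] for row in X], labels)) for j in range(n_features)]


# Toy data: feature 0 increases with the class label, feature 1 does not.
labels = [1, 1, 2, 2, 3, 3]
X = [[0.1, 0.5], [0.2, 0.1], [0.3, 0.4], [0.4, 0.6], [0.5, 0.2], [0.6, 0.3]]
weights = rank_correlation_weights(X, labels)
```

On this toy data the first feature receives a much larger weight than the second. In a penalized method, such weights could be used to penalize weakly ordinal features more heavily, though the exact construction in the dissertation is not given in this record.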

Details

Record ID

4876

Record Created

2024-12-05

Title

New Regularization Methods for Supervised Learning with High-Dimensional Data

Author

Ma, Ziyang

Contributor

Ahn, Jeongyoun (Advisor)
Ke, Yuan (Committee Member)
Liu, Liang (Committee Member)
Strait, Justin (Committee Member)

College or School

Franklin College of Arts and Sciences

Department

Statistics

Subjects

Statistics

Content Type

Dissertation

Pagination

124

File Format

pdf

Language

English

Degree Type

Doctor of Philosophy (PHD)

Name of Granting Institution

University of Georgia

Year Degree Granted

2021-05

Keywords

Regularization; Canonical correlation analysis; Dimension reduction; Feature weighting; High-dimensional data; Ordinal classification

System Control Number

9949375049202959