Abstract
In the era of big data, where 2.5 quintillion bytes of data are produced daily, effective reduction of predictor variables that retains the important information about the response variable is required. Sufficient dimension folding (SDF) in particular provides a powerful framework for reducing the dimensions of a matrix- or array-valued predictor while preserving its inner structure. This dissertation comprises three studies based on SDF. In the first study, we introduce sufficient dimension folding with categorical variables, which takes both matrix predictors and categorical variables into account during the reduction. To improve the interpretability of SDF methods, in the second study we propose model-free variable selection techniques by reformulating SDF methods as least squares estimation problems and adapting regularized regression methods into regularized SDF methods. In the third study, three hypothesis testing methods, the marginal dimension test and the conditional and marginal coordinate tests, are presented as future work to help evaluate the matrix predictor's contributions.