Abstract
Sample matching is a statistical technique that can be applied to observational data to achieve covariate balance and thus aid in estimating causal effects in studies lacking randomization. This thesis (a) describes three sample matching methods, namely Propensity Score Matching (PSM), Coarsened Exact Matching (CEM), and Genetic Matching (GM), and (b) demonstrates and compares their application using empirical data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K) and simulated data under seven scenarios that differ in the non-linear and/or non-additive associations between exposure and covariates. The study shows that CEM produces higher multivariate balance and consistently less biased effect estimates than the other two methods, although for data containing many categorical covariates the curse of dimensionality is a noticeable concern in CEM. PSM and GM can yield larger matched samples but involve more extrapolation and greater model dependence in the effect estimates.