Abstract
Transmission of Mycobacterium tuberculosis (M. tuberculosis) relies on prolonged contact with people infected with M. tuberculosis. The Community Health Study of Social Networks and Tuberculosis (COHSONET), an ongoing study initiated by Whalen, aims to evaluate the effects of social contacts on the risk of M. tuberculosis conversion through Ecological Momentary Assessment (EMA). This dissertation offers the linear probability model as an alternative to logistic regression for describing the risk of M. tuberculosis conversion as a function of the proportions of time participants spend in different location contexts, a surrogate variable for social contacts. To restrict the predicted values from the linear probability model to the meaningful interval [0, 1], we propose two constrained optimization approaches: constrained ordinary least squares (OLS) and the constrained adaptive LASSO. Within the constrained parameter space, both the constrained OLS and the constrained adaptive LASSO estimators are consistent and asymptotically normal, provided that all parameter estimates lie within the boundary of the parameter space. In addition, the constrained adaptive LASSO is an oracle procedure and thus achieves consistent model selection. Extensive simulations support the consistency of both estimators: their empirical mean biases approach zero as the sample size increases. Moreover, the constrained OLS estimators perform as well as the maximum likelihood estimators (MLEs) and the bias-reduced penalized maximum likelihood estimators (PMLEs) in logistic regression when all parameters are in the interior of the parameter space. In particular, constrained OLS in the linear probability model outperforms both the MLE and the PMLE in the logistic regression model when some parameters are close to the boundary of the parameter space.
The constrained adaptive LASSO appears to outperform constrained OLS in the linear probability model when some parameters lie on the boundary of the parameter space.