Abstract
Clustered data with excess zeros are increasingly common in many research areas, including health sciences, business, manufacturing, and natural resource management. There is a need to improve existing methods and to create software routines that can handle such data. Marginal zero-inflated (ZI) regression models have been studied. They are two-component mixtures of a degenerate component with a point mass of one at zero and a non-degenerate component suitable for the scale of the response. The EM algorithm is convenient for fitting mixture models with a closed-form likelihood; however, marginal ZI regression models are partially specified models for dependent data and generally have an intractable likelihood function. EM-type algorithms that account for dependence in the data and extend beyond likelihood-based formulations are therefore desired. The expectation-solution (ES) algorithm is an EM-type algorithm that can fit marginal ZI regression models while accounting for dependence in the data using generalized estimating equations (GEEs). Nonetheless, the ES algorithm makes a simplifying assumption of independence in the data to obtain a closed-form solution for the E-step, and this is inefficient for clustered data. Marginal ZI regression models with multinomial mixing are proposed. This ensures we account for dependence in the mixing component and fit a fully parametric model instead of a GEE for the S-step of the mixing data. In addition, we propose the regression-solution (RS) algorithm to explicitly account for dependence in the data when updating the "missing data" in an EM-like formulation. The RS algorithm replaces the E-step of the ES algorithm by using an appropriate regression model to estimate the conditional expectation of the missing data given the observed data and current parameter estimates.
In simulation studies, we demonstrate that models with a multinomial mixing structure fitted with the RS algorithm produce more efficient regression parameter estimators than those fitted with the ES algorithm. We illustrate the methods with an example data set and introduce an R package, margZIfit, for implementing the model-fitting algorithms.
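
As a rough illustration of the closed-form E-step that the abstract attributes to the independence assumption, the sketch below computes, for a zero-inflated Poisson response, the conditional probability that each observation came from the degenerate zero component. The Poisson choice, the function name `zip_e_step`, and its arguments are assumptions made for this example only; they are not taken from the thesis or from the margZIfit package, and the RS algorithm described above would replace this step with a fitted regression model.

```python
import math


def zip_e_step(y, p, lam):
    """Closed-form E-step weights for a zero-inflated Poisson model.

    Hypothetical helper for illustration; assumes observations are
    independent, which is the simplification the abstract describes.

    y   : observed counts
    p   : current mixing probabilities of the degenerate (zero) component
    lam : current Poisson means of the non-degenerate component
    """
    weights = []
    for y_i, p_i, lam_i in zip(y, p, lam):
        if y_i > 0:
            # A positive count cannot come from the point mass at zero.
            weights.append(0.0)
        else:
            # A zero can come from either component; apply Bayes' rule.
            zero_from_poisson = (1.0 - p_i) * math.exp(-lam_i)
            weights.append(p_i / (p_i + zero_from_poisson))
    return weights
```

Each weight depends only on that observation's own response and current parameter values, so within-cluster dependence is ignored at this step. That is the inefficiency the proposed RS algorithm is meant to address.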