Abstract

In this thesis, we compare several methods for handling correlated data related to genome frequency copies. We first analyzed the data using standard Poisson regression. The results revealed several problems related to over-dispersion and under-dispersion. Over-dispersion is easy to handle using the scale-adjustment method. However, the dependence caused by correlated Poisson data is not so easily remedied. We created a statistic to test the null hypothesis that the data are independent Poisson realizations against the alternative that they are positively associated. From this, we found that a separation of 225 base pairs is the minimum cut-off distance needed to achieve approximate independence. We also used the results of this analysis to devise a formula that yields the approximate correlation coefficient (r) between counts separated by b base pairs. Finally, we used our method to weight observations and found a significant improvement compared with other methods.
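For readers unfamiliar with scale adjustment, the following is a minimal sketch of the general idea, not the thesis's own code or data. It assumes the simplest possible case, an intercept-only Poisson model with a log link, and estimates the dispersion as the Pearson chi-square statistic divided by its degrees of freedom. The function names `pearson_dispersion` and `scaled_se_log_mean` are hypothetical and chosen only for illustration.

```python
import math


def pearson_dispersion(counts):
    """Estimate the dispersion of count data under an intercept-only Poisson model.

    Assumption: every observation shares one mean, so the fitted value is
    the sample mean. Values well above 1 suggest over-dispersion, values
    well below 1 suggest under-dispersion.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mu = sum(counts) / n
    if mu <= 0:
        raise ValueError("mean count must be positive")
    # Pearson chi-square: squared residuals scaled by the Poisson variance (mu).
    chi2 = sum((y - mu) ** 2 / mu for y in counts)
    # One parameter (the intercept) is estimated, leaving n - 1 degrees of freedom.
    return chi2 / (n - 1)


def scaled_se_log_mean(counts):
    """Scale-adjusted standard error of the estimated log mean.

    Under the Poisson model the naive variance of the log-mean estimate is
    1 / sum(counts); scale adjustment multiplies it by the dispersion estimate.
    """
    phi = pearson_dispersion(counts)
    naive_se = math.sqrt(1.0 / sum(counts))
    return naive_se * math.sqrt(phi)
```

This adjustment only rescales standard errors. It does not address dependence between neighboring counts, which is the harder problem the abstract describes.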
