Uncertainty of the dependent variable in genome-wide association studies

Smith, Shannon Nicole

2013

Abstract

High-throughput technology has the potential to provide insight into the biological mechanisms underlying several important complex traits. Unfortunately, this information comes at a cost, sometimes in the form of low accuracy and poor data quality. One application of genomics in humans is the diagnosis of diseases such as bipolar disorder, Alzheimer's disease, and cancer. However, to predict disease status correctly, methods should be trained on datasets containing no errors. For most conditions this is seldom the case, as misdiagnosis is prevalent due to overlapping symptoms and a lack of precise diagnostic tools. Therefore, a new approach for dealing with misclassification was developed and applied to simulated data sets in which case and control observations were randomly switched and the odds ratios of influential SNPs were varied, to examine the effects of potential misclassification on diagnostic accuracy. When misclassification was ignored, the predictive power of the model was limited. When the misclassification algorithm was applied, predictive power increased across all scenarios, demonstrating the effectiveness of the algorithm.

Additionally, in livestock applications, genomic technology is used to detect genetic variants associated with economically important traits and to estimate genomic-enhanced breeding values for use in animal selection. For genome-wide association studies (GWAS) in animal applications, the dependent variable is often a pseudo-phenotype (estimated breeding values, de-regressed breeding values, etc.). Being estimates, these pseudo-phenotypes carry a certain level of inaccuracy or uncertainty. In some situations, this uncertainty is large and is not constant across observations. Consequently, using these estimates directly as dependent variables in the GWAS can be problematic, because the residual variance of the model is composed of two components (the sampling variance and the error variance) that current methods are unable to accommodate. Thus, we developed and implemented a new procedure that correctly accounts for both components of the residual variance, leading to an increase in the accuracy of the estimated genomic breeding values. The proposed method was evaluated with real and simulated data.
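
To make the two-component residual variance concrete, the sketch below fits a single-SNP regression by weighted least squares, giving each observation a residual variance equal to a common error variance plus its own sampling variance. It is only an illustration of the general idea and is not the procedure developed in the dissertation, which the abstract does not specify. The function name `weighted_snp_effect`, the simulated values, and the assumption that both variance components are known in advance are all hypothetical; in practice the error variance would have to be estimated.

```python
import numpy as np


def weighted_snp_effect(genotypes, pseudo_phenotypes, sampling_var, error_var):
    """Weighted least squares estimate of one SNP effect (illustrative only).

    Observation i is assumed to have residual variance
    error_var + sampling_var[i], so its weight is the inverse of that sum.
    Returns the estimated SNP effect and its standard error.
    """
    x = np.asarray(genotypes, dtype=float)
    y = np.asarray(pseudo_phenotypes, dtype=float)
    total_var = error_var + np.asarray(sampling_var, dtype=float)
    w = 1.0 / total_var

    # Design matrix with an intercept and the SNP genotype code (0, 1, 2).
    X = np.column_stack([np.ones_like(x), x])
    XtWX = X.T @ (w[:, None] * X)
    XtWy = X.T @ (w * y)

    beta = np.linalg.solve(XtWX, XtWy)
    # With known variances, the covariance of beta is the inverse of X'WX.
    cov = np.linalg.inv(XtWX)
    return beta[1], np.sqrt(cov[1, 1])


# Hypothetical simulated data: the sampling variance differs per animal.
rng = np.random.default_rng(0)
n = 500
genotypes = rng.integers(0, 3, size=n)
sampling_var = rng.uniform(0.1, 2.0, size=n)
error_var = 0.5        # assumed known here
true_effect = 0.3      # assumed value for the simulation
noise = rng.normal(0.0, np.sqrt(error_var + sampling_var))
pseudo_phenotypes = 1.0 + true_effect * genotypes + noise

effect, se = weighted_snp_effect(genotypes, pseudo_phenotypes, sampling_var, error_var)
print(f"estimated SNP effect: {effect:.3f} (SE {se:.3f})")
```

Observations with a large sampling variance receive a small weight, so poorly estimated pseudo-phenotypes influence the SNP effect less than well-estimated ones. When all sampling variances are equal, the estimate reduces to ordinary least squares.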

Details

Record ID

15695

Record Created

2024-12-05

Title

Uncertainty of the dependent variable in genome-wide association studies

Author

Smith, Shannon Nicole

Contributor

Rekaya, Romdhane (Advisor)
Aggrey, Samuel E. (Committee Member)
Bertrand, J. Keith (Committee Member)
Misztal, Ignacy (Committee Member)

College or School

Animal and Dairy Science

Date

2013

Publisher

University of Georgia

Content Type

Dissertation

Language

English

Dissertation/Thesis Note

Doctoral

Degree Type

Doctor of Philosophy (PHD)

Name of Granting Institution

University of Georgia, Summer 2013

Year Degree Granted

2013

Keywords

Genomics; misclassification algorithm; discrete responses; estimated breeding values; accuracy

System Control Number

9949334057402959
