WHOLE-GENOME SEQUENCE DATA IN FARM ANIMALS: FROM SNP SELECTION TO GENOMIC PREDICTIONS

Jang, Sungbong



Abstract

Using whole-genome sequence (WGS) data to identify causative variants and improve genomic prediction is of current research interest. However, single nucleotide polymorphism (SNP) chips are still the primary source of genotypes for genomic predictions. Regular SNP chips include only a small number of SNP; therefore, more accurate genomic predictions would be expected with WGS data. The objective of the first study was to investigate the impact of using preselected variants from WGS for large-scale single-step GBLUP (ssGBLUP) genomic predictions in maternal and terminal pig lines separately. Genomic predictions with regular SNP chip data were compared with those based on preselected SNP sets. Preselection of SNP relied on genome-wide association studies (GWAS) and linkage disequilibrium (LD) pruning. The second study aimed to explore the use of selected WGS variants in a multi-line ssGBLUP genomic evaluation (MLE), which comprised over 200,000 sequenced/imputed animals. A multi-line GWAS was conducted to preselect WGS variants, and unknown parent groups (UPGs) or metafounders (MFs) accounted for genetic differences among lines in a joint evaluation. These first two studies reported small to no gain in accuracy of genomic prediction with WGS data. To explore possible reasons for this limited gain, the third study was a simulation with different effective population sizes (Ne). We investigated different discovery set sizes in GWAS, relating them to the limited dimensionality of genomic information. The variants selected under different GWAS sample sizes were then added to simulated SNP panels that mimicked regular chips used commercially.
Populations with smaller effective sizes (Ne = 20) require more data to capture causative variants, whereas for large populations (Ne = 200), a number of genotyped animals equal to the number of largest eigenvalues explaining 98% of the variance of the genomic relationship matrix suffices. However, only a small proportion of the causative variants can be discovered if those genotyped animals do not have many progeny records. Even when several causative variants are preselected, their impact on ssGBLUP genomic predictions is minimal because medium-density commercial SNP chips already account for most of the information added.
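
The eigenvalue criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's code: it assumes a genomic relationship matrix built from centered allele counts (a VanRaden-type construction, chosen here only for illustration), and the function names `vanraden_g` and `n_eigenvalues_for_variance` are hypothetical. The second function counts how many of the largest eigenvalues are needed to explain a given fraction (here 98%) of the variance of the matrix.

```python
import numpy as np


def vanraden_g(genotypes):
    """Build a genomic relationship matrix from an (animals x SNP) array
    of 0/1/2 allele counts. Assumed construction, for illustration only."""
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0            # observed allele frequencies
    keep = (p > 0.0) & (p < 1.0)        # drop monomorphic SNP
    Z = M[:, keep] - 2.0 * p[keep]      # center each SNP column
    scale = 2.0 * np.sum(p[keep] * (1.0 - p[keep]))
    return Z @ Z.T / scale


def n_eigenvalues_for_variance(G, fraction=0.98):
    """Number of largest eigenvalues of a symmetric matrix G whose sum
    reaches `fraction` of the total eigenvalue sum."""
    eigvals = np.linalg.eigvalsh(G)[::-1]      # sorted largest first
    eigvals = np.clip(eigvals, 0.0, None)      # remove tiny negative noise
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # First position where the cumulative share reaches the target.
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(eigvals))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_genotypes = rng.integers(0, 3, size=(50, 500))  # simulated toy data
    G = vanraden_g(toy_genotypes)
    print(n_eigenvalues_for_variance(G, 0.98))
```

The toy genotypes here are random and unstructured, so the resulting count says nothing about real populations; in practice the matrix would come from the genotyped animals in the evaluation.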

Details

Record ID

3816

Record Created

2024-12-05

Title

WHOLE-GENOME SEQUENCE DATA IN FARM ANIMALS: FROM SNP SELECTION TO GENOMIC PREDICTIONS

Author

Jang, Sungbong

Contributor

Lourenco, Daniela (Advisor)
Misztal, Ignacy (Committee Member)
Rekaya, Romdhane (Committee Member)
Chen, Ching-Yi (Committee Member)

College or School

College of Agricultural and Environmental Sciences

Department

Animal and Dairy Science

Subjects

Animal sciences

Content Type

Dissertation

Pagination

190

File Format

pdf

Language

English

Degree Type

Doctor of Philosophy (PHD)

Name of Granting Institution

University of Georgia

Year Degree Granted

2022-08

Keywords

genome-wide association study; limited dimensionality of genomic information; multi-line genomic evaluation; variants selection; whole-genome sequence


System Control Number

9949467525302959
