Abstract

Transposable elements (TEs) are mobile repetitive DNA sequences that are ubiquitous in eukaryotic organisms. Saccharomyces cerevisiae is a leading model organism for studying TEs at the molecular level. However, the species-wide polymorphism and evolution of TEs in Saccharomyces yeasts have not been fully characterized, partly because of the technical challenges of identifying repetitive DNA in high-throughput sequencing data. In this dissertation, I focused on developing new bioinformatics approaches and investigating the evolutionary history of TEs among Saccharomyces species.

First, I updated a meta-pipeline called McClintock, which consists of 12 detectors of polymorphic TEs in short-read whole genome sequencing (WGS) data, and developed a reproducible simulation framework to comprehensively evaluate the performance of the component TE detectors. Simulation analysis using Saccharomyces yeast as a paradigm identified four “best-in-class” methods. These were further confirmed using empirical S. cerevisiae WGS datasets, which were also used to provide novel insights into the TE biology of S. cerevisiae. This work presented a user-friendly pipeline and a reproducible simulation evaluation framework for TE detection with short-read WGS data, which should facilitate future studies on TE polymorphism in different organisms.

Next, I sequenced and generated high-quality genome assemblies of two Saccharomyces strains that are useful for studies on TEs, the S. paradoxus strain DG1768 and the S. uvarum strain CBS 7001, by combining PacBio and/or Illumina sequencing data. I documented genetic alterations in DG1768, a key lab strain in molecular biology research on Ty1 mobility. This study also provided an important reference genome resource for the S. uvarum type strain CBS 7001, which should benefit future genomic studies on this species.

Finally, I investigated the evolutionary history of the Ty4 family among Saccharomyces species using both short-read and long-read WGS data. First, I implemented an integrated pipeline that compiled WGS datasets, reconstructed the host species phylogeny, and estimated TE abundance. The results revealed species-wide ancestral states of Ty4/Tsu4 subfamilies across multiple Saccharomyces species. Next, I conducted genome-wide TE annotation of public Saccharomyces genome assemblies with a RepeatMasker-based pipeline, which cross-validated observations from short-read data and provided a rich dataset of full-length TE sequences for phylogenetic analysis. My results revealed several new horizontal TE transfer (HTT) events in the Ty4 family of Saccharomyces species and further supported the significant role of HTT in shaping Ty content in Saccharomyces yeasts.
