Abstract
Flowering plants (angiosperms) are the most diverse group of land plants, with >260,000 current species classified into 64 orders and 416 families. Angiosperm families Brassicaceae and Rosaceae have been extensively studied and include numerous crops and the model plant Arabidopsis. Genome sizes differ by >2000-fold in angiosperms, while gene numbers and genic colinearity are more highly conserved. Transposable elements (TEs) make up 15% to 60% of the nuclear DNA in most sequenced angiosperm genomes but can comprise >85% of the DNA in diploid genomes that are >2.5 Gb in size. TE content is dynamic because TEs are amplified and are also removed by mechanisms such as unequal homologous recombination and illegitimate recombination. Many TEs, including Helitrons and LTR-retrotransposons, can acquire gene fragments, and sometimes entire genes, which adds to the plasticity for gene creation. Horizontal transfers of TEs are rare but can occur even between distantly related species.
Comparative inter-genome analysis of TEs will shed light on TE dynamics during evolution, including their influence on genome size. Unfortunately, most current TE research is limited to intra-species analyses, and the rare inter-species comparisons suffer from heterogeneous methods and standards for TE annotation in each published genome. My study quantifies TE abundance to the superfamily level using raw DNA sequencing reads in 17 Brassicaceae species and 12 Rosaceae species. The results indicate that raw read analysis provides a more accurate estimation of TE content. Moreover, the patterns of TE accumulation indicate that many different TE superfamilies can become predominant in specific genomes, and that the patterns of their amplification show little or no phylogenetic signal. Hence, massive TE amplifications that influence genome size appear to be rare but random, at least at this level of analysis.