Abstract
Flowering plants (angiosperms), comprising ~250,000 species, vary tremendously in chromosome number and nuclear genome size. At present, there is no agreed-upon format for quantifying the degree of conservation of genic content and colinearity across species. The nature, origins and biases of the numerous small genic rearrangements that differentiate genomes have not been comprehensively investigated, especially in consideration of possible lineage-specificities in the quality or quantity of rearrangements. The research goals of this project are to investigate the retention or loss of gene pair linkage in various plant species, to quantify the frequencies of various types of genome rearrangements, to understand the lineage-specificity of the genomic instability, and to gain insights into the mechanisms responsible for genic rearrangements.

The great complexity of most angiosperm genomes leads to significant challenges in precise genome comparison, so I will pursue a process of local and global genome comparison through investigating pairs of adjacent genes, and a sampling approach to manually inspect the retention and rearrangement of gene pair linkage. Use of this approach indicates that relative gene pair orientation is random for most plant genes in most flowering plant genomes, with the dramatic exception of genes that are very tightly linked, where convergent genes are highly over-represented. Careful manual inspection suggests that ~59% of adjacent gene pairs are conserved in rice compared to sorghum. Less than 3% of gene pairs in this comparison are disrupted by gene loss or gene creation. Gene deletions and insertions are observed to be the most common disruptors of gene pairs, relative to other genome rearrangement types, such as inversion and translocation, but most genome rearrangements appear to be the result of multiple events.
The gene pair comparison approach has also been extended to a number of plant genomes, including foxtail millet, Brachypodium, Arabidopsis, and Medicago, and suggests that more than 50% of adjacent gene pairs are conserved in every grass pair investigated. The recently sequenced banana and date palm genomes are the first two sequenced monocot genomes outside the grass family, and they serve as outgroups to determine the lineage-specificity of the genomic rearrangements that were observed.

Mutation is one of the most important genetic processes, generating genetic variation between individuals within a species. However, it is not fully clear in plants whether different rates or types of mutation are found in different parts of the genome. By comprehensively investigating mutations that differentiate pairs of LTRs on rice chromosomes 3 and 4, we found that point mutations on chromosome 3 are more abundant near the centromeres, while the transition to transversion ratio (averaging 2.9) does not exhibit any genome location bias. The overall number of these small mutations is significantly correlated with LTR retrotransposon age, but there is no correlation between the transition to transversion ratio and the age of LTR retrotransposons.

This work is the first to quantify genomic instability during the evolution of flowering plants by combining high-throughput characterization with manual inspection. It advances our understanding of the mechanistic basis of genomic instability in flowering plants. The investigation of rates and natures of genome rearrangement across lineages allows us to identify the evolutionary origins of changes in genome instability, and may provide insights into the mechanisms of adaptation to various environments in certain species.