Abstract
Over the last decade, the advent of high-throughput technologies has produced a massive and continuing flood of biological data. These technologies generate diverse biological datasets, including whole-genome sequences, transcriptome sequences (RNA-Seq, EST sequences), epigenetic data (ChIP-chip, ChIP-Seq), and other -omics data. These datasets offer unprecedented opportunities to increase our understanding of the functions and dynamics of the genome and the cell. This dissertation, entitled "Evolution & Detection of ncRNA and Transcriptome Analyses of Two Non-Model Systems", combines three components: an evolutionary study of non-coding RNAs (ncRNAs), their identification in genomic data using patterns of chromatin modification, and the analysis of transcriptomes of non-model species chosen for their evolutionary and ecological interest. The evolutionary study of non-coding RNA involves analyzing the patterns of mutations that cause variability in RNA secondary structure. From this analysis, I found that secondary structures evolve both by insertion or deletion of whole stems and by mutations that create or disrupt stem base pairing. I analyzed the evolution of stem lengths and constructed substitution matrices describing the changes responsible for the variation in RNA stem length. I believe that the data generated from this study will provide new insights into the evolution of RNA secondary structures and will facilitate the design of improved mutational models for RNA structure evolution. I also developed a novel machine-learning approach that uses patterns of chromatin modification to discriminate among different genomic features, such as protein-coding genes, RNA genes, pseudogenes, and transposable element genes. I applied this approach to the model plant species Arabidopsis and detected 33 novel genes.
I believe this approach will help improve the annotation of newly sequenced species. From the transcriptome analysis of two non-model systems (pitcher plants and songbirds), I was able to identify the polymorphic loci that are fixed and shared between subspecies. I also performed functional annotation of all the genes and identified fast-evolving genes by determining their substitution rates. I believe that the genomic resources developed during these studies will contribute greatly to future research on these genera and their distinctive ecological adaptations.