Abstract
The abundance and multiple functions of long non-coding RNAs (lncRNAs) in mammalian systems have been among the most important discoveries in molecular biology in recent years. However, the identification and characterization of lncRNAs in plants, especially cereals, is still in its early stages. We conducted a reference-guided transcriptome assembly with RNA-Seq data from four economically important cereals, and screened for RNAs that were at least 200 bases in length, contained open reading frames of at most 70 amino acids, and lacked homology to entries in the UniProt database. We identified 7,196 lncRNA candidates in Zea mays, 1,974 in Sorghum bicolor, 4,236 in Setaria italica and 2,542 in Oryza sativa, and conducted sequence composition analysis, transposable element detection and miRNA precursor screening. Further, a cross-species comparison, including sequence- and structure-based lncRNA homology search, synteny analysis, and lncRNA secondary structure prediction, uncovered limited sequence similarity and identified sub-regions with putatively conserved secondary structures.
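The length and open-reading-frame criteria stated above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, an ORF is assumed here to run from an ATG to the first in-frame stop codon on either strand, and the UniProt homology filter is omitted because it would require an external alignment tool.

```python
# Illustrative sketch only; thresholds come from the abstract, everything else is assumed.
MIN_LENGTH = 200   # minimum transcript length in bases
MAX_ORF_AA = 70    # maximum allowed ORF length in amino acids

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_aa(seq):
    """Length (in amino acids) of the longest ATG-to-stop ORF in all six frames.

    Assumption: ORFs that are not closed by a stop codon are not counted.
    """
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            current = None  # codons read since the last ATG; None means "not in an ORF"
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if current is None:
                    if codon == "ATG":
                        current = 1
                elif codon in STOP_CODONS:
                    best = max(best, current)
                    current = None
                else:
                    current += 1
    return best


def is_lncrna_candidate(seq):
    """Apply the length and ORF-size filters (homology filter not included)."""
    return len(seq) >= MIN_LENGTH and longest_orf_aa(seq) <= MAX_ORF_AA
```

In practice, transcripts passing such a filter would still need to be compared against a protein database before being called lncRNA candidates, as the abstract describes.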