Abstract
Genetic diversity is the raw material upon which crop domestication and improvement are based; however, little is known about the types, levels, and patterns of genetic change that have permitted tremendous advances in crop productivity. Here, we investigate the types, levels, and patterns of evolutionary change in sorghum (Sorghum bicolor L.), a cereal that has abstained from genome duplication for 96 million years or more, in comparison to a recently duplicated sister lineage (maize) and an outgroup lineage that has likewise abstained from duplication (rice), applying phylogenetic approaches to 69 resequenced genotypes. In Oryza and Sorghum, tandem duplicates tend to accumulate more SNPs than genes retained from genome duplications. In the maize genome, which experienced an additional genome duplication, more SNPs have instead accumulated in the duplicates resulting from genome duplication. Moreover, in sorghum, the proportion of small-scale duplicates showing recent positive selection, as suggested by Tajima's D, is larger than that of syntenic genes. In maize, by contrast, a significantly larger proportion of duplicates from whole-genome duplication show positive selection than of tandem genes. A large proportion of recent duplications in rice are species-specific, whereas the majority of recent duplications in sorghum are derived from ancestral gene families. A new retrotransposon family was identified in sorghum, and expression analysis suggests that it is related to drought resistance. Both SNP frequency and haplotype information were used to infer selection pressure for all annotated genes in the sorghum genome, and genes showing reduced nucleotide divergence and extended linkage disequilibrium (LD) decay were identified. SNPs that are fixed between domesticated and wild sorghum accessions were identified, and several interesting genes containing these SNPs were characterized, as were SNPs with large effects on gene function.
Using a parsimony algorithm to trace the occurrence of large-effect SNPs on the phylogenetic tree, we show that several ancient branches are enriched for these SNPs. Combining the high-density SNP data from this study with a genome-wide association study (GWAS), we identify candidate genes for a number of domestication traits.
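As an illustration of how a parsimony algorithm can place SNP changes on a tree, the sketch below applies the bottom-up pass of Fitch's small-parsimony method to a single SNP. The abstract does not state which parsimony algorithm was used, so the choice of Fitch's method is an assumption, and the tree shape, accession names, and alleles are hypothetical.

```python
# Minimal sketch of Fitch's small-parsimony scoring for one SNP.
# Assumption: the study's actual algorithm is not named in the abstract;
# Fitch's method is used here only as a familiar example.
# A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).

def fitch(tree, leaf_states):
    """Return (candidate ancestral alleles, minimum number of changes)."""
    if isinstance(tree, str):
        # A leaf: its allele is observed, and no change is needed below it.
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one allele: no extra change here.
        return shared, left_cost + right_cost
    # The children disagree: one change must occur on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical accessions: two wild and two domesticated.
tree = (("wild_1", "wild_2"), ("dom_1", "dom_2"))

# A SNP fixed between the wild and domesticated groups.
fixed_snp = {"wild_1": "A", "wild_2": "A", "dom_1": "G", "dom_2": "G"}
# A SNP whose alleles are scattered across both groups.
scattered_snp = {"wild_1": "A", "wild_2": "G", "dom_1": "A", "dom_2": "G"}

fixed_alleles, fixed_changes = fitch(tree, fixed_snp)
scattered_alleles, scattered_changes = fitch(tree, scattered_snp)
print(fixed_changes, scattered_changes)
```

The fixed SNP is explained by a single change on the branch separating the two groups, while the scattered SNP requires two changes. Assigning each change to a specific branch would need an additional top-down pass, which is omitted here.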