Abstract
Centromeres are a critical element of eukaryotic genome architecture: they serve as the site of kinetochore assembly and are responsible for faithful chromosome segregation during cell division. While specialized histones define centromeres epigenetically, tandemly repeating DNA is frequently associated with them at the sequence level. These repeats vary widely in sequence across species but tend to share key features, including head-to-tail orientation, regular higher-order repeat (HOR) patterns, and high conservation of sequence within a species. Maize has four such repeats. CentC, which underlies canonical centromeres, is commonly associated with CENH3, the epigenetic marker of the active centromere, and with the kinetochore structure. Centromere 4 carries an additional pericentromeric repeat, Cent4. Together, these two repeats occupy less than 1% of the maize genome. In contrast, two classes of neocentromeric satellites, TR1 and knob180 repeats, occur at dozens of loci along chromosome arms and occupy upwards of 8% of the genome. These neocentromeres are meiotically active only in the presence of abnormal chromosome 10 (AB10), which encodes microtubule-based motor proteins that allow the neocentromeres to selfishly distort chromosome segregation by outpacing canonical centromeres and altering chromatid movement. Through comparisons of the genomic composition, positions, and patterns of these four centromere and neocentromere classes, maize provides a valuable natural experiment for considering how large-scale tandem repeat patterns are related to centromeric activity and evolution. In this study, we generated 13 repeat-sensitive assemblies of maize and teosinte, its recent wild ancestor. We then created a novel HOR identification pipeline that is sensitive to the small, overlapping, local HORs that are abundant in the maize genome but difficult to identify using standard HOR annotation pipelines. We found that CentC HOR patterns are plentiful but not directly related to the active centromere.
In knobs, we found a more active HOR landscape, where arrays can reach sizes greater than 40 Mb and may be driven by rare unequal crossing-over events that can expand array sizes rapidly. We also identified highly conserved HOR patterns, shared among several non-homologous knobs, that we believe are functional. These deeply conserved knob sequences provide evidence of shared evolutionary history among independent knobs and may be key sequences involved in drive.