Abstract
Multiple sequence alignment plays a crucial role in extracting structural, functional, and evolutionary information from the exponentially growing volume of sequence data produced by ongoing genome sequencing projects. Using retrotransposon sequence alignment as a case study, this thesis compares three alignment programs (DIALIGN, CLUSTALW, and PRRN) and proposes strategies for improving alignment quality, such as hand editing and realigning selected sequences or sequence ranges with different programs or parameters. Entropy is used as an indicator of alignment quality. The thesis also presents the design and development of AlignAgain, an alignment tool built to help biologists improve alignment quality. AlignAgain is written in Java and allows users to display, edit, and realign whole or partial sequences with CLUSTALW or PRRN, and to append sequences using profile alignment.
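To illustrate how entropy can serve as an alignment quality indicator, the following is a minimal sketch that assumes per-column Shannon entropy with gap characters ignored. The abstract does not state the exact definition used in the thesis, so the formula, the gap handling, and the function names `column_entropy` and `alignment_entropy` are illustrative assumptions rather than the thesis's method. Under this assumption, a fully conserved column scores 0 and more variable columns score higher.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: gap characters are ignored; the thesis may treat them differently.
    """
    symbols = [c for c in column if c != gap]
    if not symbols:
        return 0.0  # a column of only gaps carries no residue information
    total = len(symbols)
    counts = Counter(symbols)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(rows):
    """Per-column entropies for an alignment given as equal-length strings."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*rows) transposes the alignment so each tuple is one column
    return [column_entropy(col) for col in zip(*rows)]


if __name__ == "__main__":
    # Toy alignment, made up for illustration only
    toy = ["ACGT", "ACGA", "AC-T"]
    for i, h in enumerate(alignment_entropy(toy), start=1):
        print(f"column {i}: {h:.3f} bits")
```

Summing or averaging the per-column values gives a single score for comparing two alignments of the same sequences, and computing the values over a sequence range can help flag regions worth realigning.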