ORTHOREFINE: IMPROVING IDENTIFICATION OF ORTHOLOGOUS GENES

Ludwig, John

Abstract

Identifying orthologous genes is an early and essential step in genome analysis, yet it remains a challenging problem. Synteny (conservation of gene order) has previously been used, both on its own and in combination with other methods, to identify orthologs, but its application to ortholog identification has yet to be automated in a user-friendly manner. This need for automation and ease of use led me to develop OrthoRefine, a standalone program that uses synteny to improve ortholog identification. OrthoRefine implements a look-around window approach to detect synteny, which it uses to distinguish orthologs from paralogs in cases where other methods cannot separate them reliably. OrthoRefine is applied as a postprocessing step to results obtained with other methods; I tested it in tandem with OrthoFinder, one of the most widely used programs for ortholog identification in recent years, and OMA, an online database of orthologous genes. I evaluated the improvements provided by OrthoRefine on several datasets composed of bacterial, eukaryotic, and archaeal genomes. OrthoRefine efficiently eliminates paralogs from orthologous groups detected by OrthoFinder and from those obtained from OMA. Using synteny increased specificity and improved functional ortholog identification; analyses of BLAST e-values, phylogenetics, and operon occurrence further supported using synteny for ortholog identification. A comparison of several window sizes suggested that smaller windows (eight genes) were generally the most suitable for identifying orthologs via synteny, whereas larger windows (30 genes) performed better in datasets containing less closely related genomes. A typical run of OrthoRefine on ~10 bacterial genomes completes in a few minutes on a regular desktop PC.
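
The look-around window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not OrthoRefine's implementation: the function names (`neighbors`, `synteny_score`), the scoring formula (shared neighboring groups divided by window size), the linear-chromosome assumption, and the toy gene identifiers are all hypothetical, chosen only to show how neighboring genes might help separate an ortholog from a paralog.

```python
from typing import Dict, List


def neighbors(genome: List[str], index: int, window: int) -> List[str]:
    """Return up to `window` genes around `index` (half on each side).

    Assumes a linear chromosome; the window is truncated at the ends.
    """
    half = window // 2
    start = max(0, index - half)
    end = min(len(genome), index + half + 1)
    return [g for i, g in enumerate(genome[start:end], start) if i != index]


def synteny_score(
    genome_a: List[str],
    idx_a: int,
    genome_b: List[str],
    idx_b: int,
    group_of: Dict[str, str],
    window: int = 8,
) -> float:
    """Hypothetical score: fraction of the window whose homology groups
    occur in the neighborhoods of both candidate genes."""
    groups_a = {group_of[g] for g in neighbors(genome_a, idx_a, window) if g in group_of}
    groups_b = {group_of[g] for g in neighbors(genome_b, idx_b, window) if g in group_of}
    return len(groups_a & groups_b) / window


# Toy data (invented for illustration): gene order in two genomes.
genome_a = ["a1", "a2", "a3", "a4", "a5"]
genome_b = ["b1", "b2", "b3", "b4", "b5", "x1", "x2", "b3p", "x3", "x4"]

# Homology group of each gene; b3 and b3p are both homologs of a3.
group_of = {
    "a1": "g1", "b1": "g1",
    "a2": "g2", "b2": "g2",
    "a3": "g3", "b3": "g3", "b3p": "g3",
    "a4": "g4", "b4": "g4",
    "a5": "g5", "b5": "g5",
    "x1": "gx1", "x2": "gx2", "x3": "gx3", "x4": "gx4",
}

# Compare a3 against both candidates using a window of four genes.
score_b3 = synteny_score(genome_a, 2, genome_b, 2, group_of, window=4)
score_b3p = synteny_score(genome_a, 2, genome_b, 7, group_of, window=4)
print(score_b3, score_b3p)
```

In this toy case the candidate whose neighbors match those of `a3` scores higher than the candidate sitting in an unrelated neighborhood, which is the intuition behind using conserved gene order to discard paralogs. The choice of window size and any cutoff on the score are left open here; the abstract reports that small windows (eight genes) generally worked best and larger ones (30 genes) suited more distantly related genomes.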

Details

Record ID

2182

Record Created

2024-12-05

Title

ORTHOREFINE: IMPROVING IDENTIFICATION OF ORTHOLOGOUS GENES

Author

Ludwig, John

Contributor

Mrázek, Jan (Advisor)
Neidle, Ellen (Committee Member)
Karls, Anna (Committee Member)
Liu, Liang (Committee Member)

College or School

Franklin College of Arts and Sciences

Department

Genetics

Content Type

Dissertation

Pagination

89

File Format

pdf

Language

English

Degree Type

Doctor of Philosophy (PHD)

Name of Granting Institution

University of Georgia

Year Degree Granted

2024-05

Keywords

bioinformatics; homolog; orthogroup; ortholog; paralog; synteny

System Control Number

9949644826102959
