Abstract

Horizontal gene transfer (HGT) is an important contributor to evolution. Following Walter M. Fitch, two genes are xenologs if at least one HGT event separates them. More formally, the directed Fitch graph has a set of genes as its vertices, and a directed edge (x, y) for every pair of genes x and y for which y has been horizontally transferred at least once since it diverged from the last common ancestor of x and y. Subgraphs of Fitch graphs can be inferred by comparative sequence analysis. In many cases, however, only partial knowledge about the “full” Fitch graph can be obtained. Here, we characterize Fitch-satisfiable graphs, that is, graphs that can be extended to a biologically feasible “full” Fitch graph, and derive a simple polynomial-time recognition algorithm. We then show that several versions of finding the Fitch graph with maximum total (confidence) edge weight are NP-hard. In addition, we provide a greedy heuristic for “optimally” recovering Fitch graphs from partial ones. Somewhat surprisingly, even if ~80% of the information in the underlying input Fitch graph G is lost (i.e., the partial Fitch graph contains only ~20% of the edges of G), it is possible to recover ~90% of the original edges of G on average.

Competing Interest Statement

The authors have declared no competing interest.
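The definition of the directed Fitch graph given in the abstract can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the authors' implementation: it assumes the gene tree is supplied as a child-to-parent map and that the transfer events are known as a set of labeled tree edges. The function name `fitch_graph`, its parameters, and the toy tree are all hypothetical.

```python
from itertools import permutations


def fitch_graph(parent, transfer_edges, leaves):
    """Build the directed Fitch graph of a rooted gene tree.

    parent         -- dict mapping each node to its parent (root maps to None)
    transfer_edges -- set of (parent, child) tree edges labeled as HGT events
    leaves         -- iterable of leaf names (the genes)

    Returns the set of directed edges (x, y) such that the path from the
    last common ancestor of x and y down to y contains a transfer edge.
    """
    def ancestors(v):
        # All nodes on the path from v up to the root, including v itself.
        result = {v}
        while parent[v] is not None:
            v = parent[v]
            result.add(v)
        return result

    edges = set()
    for x, y in permutations(leaves, 2):
        above_x = ancestors(x)
        node, transferred = y, False
        # Walk up from y; the first node that is also above x is lca(x, y).
        while node not in above_x:
            if (parent[node], node) in transfer_edges:
                transferred = True
            node = parent[node]
        if transferred:
            edges.add((x, y))
    return edges


# Toy example (assumed data): root r has children u and c; u has children
# a and b; the edge from u to b is a horizontal transfer.
example_parent = {"r": None, "u": "r", "c": "r", "a": "u", "b": "u"}
example_transfers = {("u", "b")}
example_graph = fitch_graph(example_parent, example_transfers, ["a", "b", "c"])
```

In this toy tree only gene b lies below a transfer edge, so the resulting graph contains exactly the edges (a, b) and (c, b). Note that the relation is not symmetric: (b, a) is absent because a has not been transferred since it diverged from b.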
