Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees.

Research paper by Kevin Liu, Sindhu Raghavan, Serita Nelesen, C. Randal Linder, Tandy Warnow

Indexed on: 23 Jun '09
Published on: 23 Jun '09
Published in: Science


Inferring an accurate evolutionary tree of life requires high-quality alignments of molecular sequence data sets from large numbers of species. However, this task is often difficult, slow, and idiosyncratic, especially when the sequences are highly diverged or include high rates of insertions and deletions (collectively known as indels). We present SATé (simultaneous alignment and tree estimation), an automated method to quickly and accurately estimate both DNA alignments and trees with the maximum likelihood criterion. In our study, it improved tree and alignment accuracy compared to the best two-phase methods currently available for data sets of up to 1000 sequences, showing that coestimation can be both rapid and accurate in phylogenetic studies.