The size of the character state space affects the occurrence and detection of homoplasy: modelling the probability of incompatibility for unordered phylogenetic characters.

Research paper by Jennifer J Hoyal Cuthill

Indexed on: 03 Dec '14
Published on: 03 Dec '14
Published in: Journal of Theoretical Biology


This study models the probability of incompatibility versus compatibility for binary or unordered multistate phylogenetic characters, by treating the allocation of taxa to character states as a classical occupancy problem in probability. It is shown that, under this model, the number of character states has a non-linear effect on the probability of character incompatibility, which is also affected by the number of taxa. Effects on homoplasy from the number of character states are further explored using evolutionary computer simulations. The results indicate that the character state space affects both the known levels of homoplasy (recorded during simulated evolution) and those inferred from parsimony analysis of the resulting character data, with particular relevance for morphological phylogenetic analyses which generally use the parsimony method. When the evolvable state space is large (more potential states per character) there is a reduction in the known occurrence of homoplasy (as reported previously). However, this is not always reflected in the levels of homoplasy detected in a parsimony analysis, because higher numbers of states per character can lead to an increase in the probability of character incompatibility (as well as the maximum homoplasy measurable with some indices). As a result, inferred trends in homoplasy can differ markedly from the underlying trend (that recorded during evolutionary simulation). In such cases, inferred homoplasy can be entirely misleading with regard to tree quality (with higher levels of homoplasy inferred for better quality trees). When rates of evolution are low, commonly used indices such as the number of extra steps (H) and the consistency index (CI) provide relatively good measures of homoplasy. However, at higher rates, estimates may be improved by using the retention index (RI), and particularly by accounting for homoplasy measured among randomised character data using the homoplasy excess ratio (HER).
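
The occupancy framing and the homoplasy indices named in the abstract can be illustrated with a small sketch. It is not the paper's model. The first part covers only the simplest case: two binary characters are incompatible exactly when all four state combinations occur among the taxa, so under the assumption (made here for illustration only) that each taxon falls independently and uniformly into one of the four combinations, the probability of incompatibility is the classical occupancy probability that no cell is left empty. The second part uses the standard textbook definitions of H, CI, RI and HER; the function and parameter names are hypothetical, and the step counts would normally come from a parsimony program.

```python
from math import comb


def prob_all_cells_occupied(n_taxa: int, n_cells: int) -> float:
    """Classical occupancy: probability that no cell is empty when n_taxa
    items are placed independently and uniformly at random into n_cells
    (inclusion-exclusion over the set of empty cells)."""
    return sum(
        (-1) ** j * comb(n_cells, j) * ((n_cells - j) / n_cells) ** n_taxa
        for j in range(n_cells + 1)
    )


def prob_binary_pair_incompatible(n_taxa: int) -> float:
    """ASSUMPTION (illustrative, not the paper's model): uniform, independent
    allocation of taxa to the 4 state combinations of two binary characters.
    The pair is incompatible iff all 4 combinations are occupied."""
    return prob_all_cells_occupied(n_taxa, 4)


def extra_steps(s: int, m: int) -> int:
    """H: observed steps on the tree (s) minus the minimum possible steps (m)."""
    return s - m


def consistency_index(s: int, m: int) -> float:
    """CI = m / s; equals 1 when there is no homoplasy."""
    return m / s


def retention_index(s: int, m: int, g: int) -> float:
    """RI = (g - s) / (g - m), where g is the maximum possible number of
    steps. Undefined when g == m (e.g. uninformative characters)."""
    return (g - s) / (g - m)


def homoplasy_excess_ratio(s: int, m: int, mean_random_steps: float) -> float:
    """HER = (mean_random_steps - s) / (mean_random_steps - m), where
    mean_random_steps is the mean tree length over randomised data sets."""
    return (mean_random_steps - s) / (mean_random_steps - m)


if __name__ == "__main__":
    # Incompatibility probability rises with the number of taxa.
    for n in (4, 8, 16, 32):
        print(n, prob_binary_pair_incompatible(n))
```

With four taxa the binary-pair probability is 24/256, and it approaches 1 as taxa are added, which shows the dependence on the number of taxa in this toy setting. Extending the sketch to unordered multistate characters would require a different compatibility test and is not attempted here.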