Molecular-based approaches to phylogenetic analysis, driven by technological advances in gene and whole-genome sequencing and by bioinformatic computational analyses, have revolutionised our view of evolution in the microbial world. More recently, there has been a focus on studying the functional traits of microorganisms, with an emphasis on predicting protein function. The type, rate and function of protein mutations are important to our mechanistic understanding of evolutionary processes.
When a mutation converts one protein into another, the molecular mass of the protein changes. This mass difference can be localised to individual peptide segments following digestion of the protein. For the most part, these mass differences correspond to unique values, which enables the nature of the amino acid substitution to be identified.
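To make the mass-to-substitution step concrete, the following is a minimal sketch, not the authors' implementation. It assumes standard monoisotopic residue masses and a hypothetical function name, `candidate_substitutions`, with an illustrative tolerance of 0.005 Da; it lists every single amino acid substitution whose mass shift matches an observed peptide mass difference.

```python
from itertools import permutations

# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def candidate_substitutions(delta, tol=0.005):
    """Return (original, replacement) residue pairs whose mass shift
    (replacement minus original) lies within `tol` Da of `delta`.

    The function name and the default tolerance are illustrative
    assumptions, not values taken from the source.
    """
    hits = []
    for original, replacement in permutations(RESIDUE_MASS, 2):
        shift = RESIDUE_MASS[replacement] - RESIDUE_MASS[original]
        if abs(shift - delta) <= tol:
            hits.append((original, replacement))
    return hits

# Example: a peptide observed about 28.0313 Da heavier than its counterpart.
matches = candidate_substitutions(28.0313)
```

Under these assumptions, a shift of 28.0313 Da is consistent with both alanine to valine and cysteine to methionine, which illustrates why the mass differences are unique only for the most part. Leucine and isoleucine are isobaric and cannot be distinguished by mass alone.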
We have advanced a MassTree phylogenetic approach, which avoids the need for gene or protein sequences, to identify and chart protein mutations associated with the evolution of the organisms in which they are expressed. The modified MassTree algorithm identifies and displays all such mutations and calculates the frequency of each mutation across a tree. The significance of each mutation is scored according to its position(s) on the tree, with mutations that occur toward the root weighted more favourably. A comparison with data generated from conventional sequence-based trees has demonstrated the reliability of mutational analyses employing the MassTree approach.
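The frequency and position-based scoring described above can be sketched as follows. The source does not give the scoring formula, so the weight 1 / (1 + depth), where depth is the number of branches between the root and the point at which the mutation occurs, is purely an assumption chosen so that mutations nearer the root score higher. The function name `score_mutations` and the event format are likewise hypothetical.

```python
from collections import defaultdict

def score_mutations(events):
    """Summarise mutation events charted on a tree.

    `events` is an iterable of (mutation_label, depth) pairs, where
    depth counts the branches from the root (0 = at the root).
    Returns {label: (frequency, score)}.

    The weight 1 / (1 + depth) is an illustrative assumption; the
    source states only that mutations toward the root are weighted
    more favourably.
    """
    frequency = defaultdict(int)
    score = defaultdict(float)
    for label, depth in events:
        frequency[label] += 1
        score[label] += 1.0 / (1 + depth)
    return {label: (frequency[label], score[label]) for label in frequency}

# Hypothetical events: the same substitution seen at the root and on a
# deep branch, plus a second substitution seen once on a deep branch.
summary = score_mutations([("A>V", 0), ("A>V", 3), ("S>P", 3)])
```

With this weighting, a mutation that recurs near the root accumulates a higher score than one seen only on terminal branches, while the frequency is reported separately so the two quantities are not conflated.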
Although illustrated in this work with the evolution of influenza hemagglutinin, the approach has far broader applicability and can be applied to investigate the evolution of any organism. In the case of simple microorganisms, this can be achieved even without separating the component proteins. Given the central role that mass map, or fingerprint, data play in protein identification in proteomics, this work demonstrates that such data can be successfully employed in a phylogenetics strategy to better understand and predict future evolutionary trends from the perspective of the functional proteins expressed by the organism.