The Leibniz Institute for the Analysis of Biodiversity Change is a research museum of the Leibniz Association





Project title:
Selection of an optimal subset of concatenated supermatrices - MARE software


In phylogenomics, character matrices with extensive missing data are frequently used. These missing data can have detrimental effects on the accuracy and robustness of tree inference.

Therefore, many investigators select taxa and genes with high data coverage. The drawback of such selections is that they rely exclusively on data coverage and do not consider the actual signal in the data. Simply selecting taxa and genes with high data coverage might thus not deliver data matrices with optimal signal. As an alternative, we have developed a heuristic that

(1) assesses the information content of genes in supermatrices using a measure of tree-likeness combined with data coverage and

(2) reduces supermatrices with a simple hill-climbing procedure to matrices with high total information content.
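The two steps above can be illustrated with a minimal sketch. It is not the MARE implementation: it assumes that step (1) has already produced a taxa-by-genes matrix of information scores in [0, 1], where 0 means missing or uninformative, and it uses a hypothetical objective (mean information content damped by the fraction of the matrix that is retained, with a hypothetical weight `alpha`). The function names `objective` and `reduce_matrix` are likewise illustrative only.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # rows = taxa, columns = genes, scores in [0, 1]


def objective(m: Matrix, taxa: List[int], genes: List[int], alpha: float) -> float:
    """Hypothetical objective: mean information content of the selected
    submatrix, damped by the fraction of the original matrix that is kept."""
    if not taxa or not genes:
        return 0.0
    total = sum(m[t][g] for t in taxa for g in genes)
    mean_info = total / (len(taxa) * len(genes))
    kept = (len(taxa) * len(genes)) / (len(m) * len(m[0]))
    return mean_info * kept ** alpha


def reduce_matrix(
    m: Matrix, alpha: float = 0.5, min_taxa: int = 2, min_genes: int = 1
) -> Tuple[List[int], List[int], float]:
    """Simple hill climbing: at each step drop the single taxon or gene whose
    removal raises the objective most; stop when no removal improves it."""
    taxa = list(range(len(m)))
    genes = list(range(len(m[0])))
    best = objective(m, taxa, genes, alpha)
    while True:
        candidates = []
        if len(taxa) > min_taxa:
            for t in taxa:
                rest = [x for x in taxa if x != t]
                candidates.append((objective(m, rest, genes, alpha), rest, genes))
        if len(genes) > min_genes:
            for g in genes:
                rest = [x for x in genes if x != g]
                candidates.append((objective(m, taxa, rest, alpha), taxa, rest))
        if not candidates:
            break
        score, new_taxa, new_genes = max(candidates, key=lambda c: c[0])
        if score <= best:  # local optimum reached
            break
        best, taxa, genes = score, new_taxa, new_genes
    return taxa, genes, best


if __name__ == "__main__":
    # Toy matrix: taxon 3 and gene 2 carry almost no information.
    scores = [
        [0.9, 0.8, 0.0],
        [0.8, 0.9, 0.1],
        [0.9, 0.7, 0.0],
        [0.0, 0.1, 0.0],
    ]
    kept_taxa, kept_genes, value = reduce_matrix(scores)
    print(kept_taxa, kept_genes, round(value, 3))
```

On this toy matrix the procedure first drops the nearly empty gene and then the nearly empty taxon, and stops once any further removal would lower the objective. The size term matters: without it, a greedy reduction would keep shrinking the matrix toward its single best cell.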

Selecting a data subset with the proposed approach increased the chance of recovering correct partial trees more than 10-fold.

Our simulations and analyses of empirical data demonstrate that formal approaches select better data subsets than simply choosing taxa and genes with high data coverage. We are further developing this approach into a hypothesis-driven selection of an optimal concatenated supermatrix.


Contact person

Director General of the Leibniz Institute for the Analysis of Biodiversity Change (Museum Koenig Bonn and Museum der Natur Hamburg)
Holder of the Chair of "Spezielle Zoologie" (Special Zoology) at the University of Bonn
+49 228 9122-200
+49 228 9122-212
b.misof [at] leibniz-lib.de