Abstract
We describe the conditions under which a set of continuous variables or characters can be described as an X-tree or a split network. A distance matrix corresponds exactly to a split network or a valued X-tree if, after ordering of the taxa, the variables values can be embedded into a function with at most a local maximum and a local minimum, and crossing any horizontal line at most twice. In real applications, the order of the taxa best satisfying the above conditions can be obtained using the Minimum Contradiction method. This approach is applied to 2 sets of continuous characters. The first set corresponds to craniofacial landmarks in Hominids. The contradiction matrix is used to identify possible tree structures and some alternatives when they exist. We explain how to discover the main structuring characters in a tree. The second set consists of a sample of 100 galaxies. In that second example one shows how to discretize the continuous variables describing physical properties of the galaxies without disrupting the underlying tree structure.
Introduction
Maximum parsimony and distance-based approaches are the most popular methods to produce phylogenetic trees. Whereas most studies use discrete characters, there is a growing need for applying phylogenetic methods to continuous characters. Examples of continuous data include gene expressions, 1 gene frequencies,2,3 phenotypic characters 4 or some morphologic characters.5,6
The simplest method to deal with continuous characters using maximal parsimony consists of discretizing the characters into a number of states small enough to be processed by the software. Recent software programs such as TNT (Tree analysis using New Technology) 7 or CoMET (Continuous-character Model Evaluation and Testing Model) 8 use developments of the contrast method to deal with continuous characters. These methods assume that the characters evolve at comparable rates according to a Brownian motion, an assumption that is often difficult to verify.4,9 Distance-based methods are applied to both discrete and continuous input data. Compared to character-based approaches, distance-based approaches are quite fast and furnish in many instances quite reasonable results. As pointed out by Felsenstein, 9 the amount of information that is lost when using a distance-based algorithm compared to a character-based approach is often surprisingly small. The use of continuous characters in distance-based methods may at first glance be less problematic than in character-based methods, since algorithms like the Neighbour-Joining work identically on discrete or continuous characters. However, here too it is often not easy to determine if the data can be described by a tree. When does a set of continuous characters describe a split network or an X-tree? The article furnishes some new insights on that question. It explains when a set of continuous characters can be described exactly by a split network or a valued X-tree. In real applications, the distance matrix corresponds only approximately to a split network or a tree topology. An adequate method is necessary to quantify to what extent the distance matrix corresponds to a split network or a tree. The Minimum Contradiction method can be used for that purpose.10–12
The paper is organized as follows. Section 2 succinctly presents the Minimum Contradiction method. It explains why some inequalities, called Kalmanson inequalities, are central to phylogenies. Section 3 extends the Minimum Contradiction method to a set of continuous characters. Section 4 furnishes the conditions under which a set of continuous characters can be described by a tree or a phylogenetic network. Section 5 presents an application of the algorithms in morphometrics using a set of faciocranial characters of hominids. Section 6 presents preliminary results on the evolution of a number of physical characters in galaxies. It illustrates how the Minimum Contradiction approach can be applied to discover structuring characters.
Ordering the Taxa on a Tree or a Split Network
A valued X-tree T is a graph with X the set of leaves and a unique path between any two distinct vertices x and y, with internal vertices of at most degree 3. A circular order on an X-tree corresponds to an indexing of the n leaves according to a circular (clockwise or anti-clockwise) scanning of the leaves in T.
13
Figure 1 shows a tree and an indexing of the taxa that corresponds to a circular order. For taxa indexed according to a circular order the distance matrix

The distance
In real applications, the distance matrix
The best order of a distance matrix is, by definition, the order minimizing the contradiction. The ordered matrix
If one leaves aside Euclidian geometry, other metrics fulfil Kalmanson inequalities. Kalmanson inequalities are also satisfied by taxa on an X-tree or a split network. If the taxa are circularly ordered, then the Kalmanson inequalities are fulfilled. As developed in a number of publications,17–19 perfect order corresponds in X-trees and split networks to a solution of the travelling salesman problem (TSP) for both the distance matrices d
i, j
and
In the next section we show that for trees and split networks as well, the Kalmanson inequalities are related to convexity. This result furnishes a new perspective on when trees and phylogenetic networks can be used to describe a set of continuous characters.
As of today, it is still not really clear when the use of continuous characters in distance-based phylogenetic studies is a valid approach. To clarify that problem, we will first consider a single character.
Let us now discuss the conditions for which a set of taxa characterized by a single continuous character f1 can be perfectly ordered. Let us define the distance d
i, j
between two taxa as d
i, j
= abs(f(i) – f(j)). The taxa {1, …, n} are perfectly ordered when the order is such that the distance matrix
Proposition 1
A distance matrix
the function f(x) has at most one local maximum and one local minimum
the function f(x) crosses the reference line L(x) = f1(n) = const at most once.
Proof
A central distinction can be made between the taxa depending on whether the character value is smaller or larger than the value of a reference taxon n. The set of taxa can be divided into two disjoint sets, the set S of taxa with values smaller or equal to the reference value f1(n) and the set of taxa L with values larger than the reference value (See Fig. 5 for an illustration). Let us show that a distance matrix fulfilling the conditions i) and ii) is perfectly ordered for any 3 ordered taxa i ≤ j ≤ k. We will consider all possible cases.
a) All 3 taxa are in the same set (S or L). The distance
b) The taxon i is in one set of taxa and the taxa j, k in another set. In that case one has
c) Condition ii) prevents the second taxon to be in another set than the taxa i and k.
d) If the third taxa is in another set than the taxa i, j one has
Let us show that if the conditions of the proposition are not fulfilled then Kalmanson inequalities are violated. If the function f(x) has two maxima (or 2 minima) corresponding to the taxa i and k, then there exists a taxa j with
Figure 3 illustrates Prop. 1 with a simple example. The matrix
The results on a single character can be easily generalized to several characters as the sum of perfectly ordered matrices
We are now ready to discuss the connection between Kalmanson inequalities and convexity in phylogenies. The tree metrics case is different from the Euclidean metrics described in Figure 2. In an Euclidean metrics, Kalmanson inequalities are fulfilled if the points (cities) are on a convex hull, while for split networks and trees the hull must be orthogonally convex. In an Euclidean metrics, a set Z ⊂ ℜ n is defined to be orthogonally convex if, for every line that is parallel to one of the axes of the Cartesian coordinate system, the intersection of Z with the line is empty, a point, or a single interval.

The travelling salesman problem (TSP) can be easily solved if the points are on a convex hull in the Euclidean plane. Points on a convex hull fulfil the Kalmanson inequalities.

Top: The taxa are ordered so that the characters f1(i) on the taxa {1, …, i, …, n} can be embedded in a function f(x) fulfilling proposition 1. Bottom: Distance matrix

The values of two characters that are perfectly ordered are on an orthogonal convex hull. Two examples of an orthogonal convex hulls.
Corollary 2
If the taxa {1, …, n} are ordered so that the distance matrices
Proof
Proposition 1 for a single character is equivalent to the following proposition: if the distance matrix
Corollary 2 can be extended to higher dimensions. The geometry, associated to trees and split networks built on a set of perfectly ordered characters, corresponds to an orthogonally convex hull.
In the previous section we have explained when a set of characters on a set of taxa fulfils Kalmanson inequalities and can be described by a tree or a split network. In this section, we explicitly show how the branches of the trees evolve when several characters are combined. For a single character, the taxa can be ordered so as to fulfil the conditions of Prop. 1. The resulting tree is a line tree. In a line tree, all taxa are on a single path and one has
Figure 5 shows an example of a line tree with perfectly ordered taxa.

The tree associated to a single character is a line tree. In a line tree, all taxa are on the same path.
At least two independent characters are necessary to generate a tree that is not a line tree. An independent character can be defined as follows.
Definition 1
Two characters f1 and f2 are independent if there exists at least 2 taxa i and j (i < j < n) so that
Proposition 3
If two characters f1 and f2 are independent, then the distance matrix
Proof
A line tree is so that either
Figure 6a shows 3 examples of independent characters. If two characters are independent and the taxa are perfectly ordered on both f1 and f2, then the distance matrix corresponds to a split network or an X-tree different from a line tree. Let us discuss the first example in Figure 6. Without restriction, let us assume that for the reference taxon n, f1(n) = f2(n) = 0. The distance matrix elements are given by

A) Examples of independent characters, B) X-tree corresponding to the first two examples, C) The characters f1 and f2 are not independent.
The expression reduces to
Figure 7 is another illustration of Proposition 3 for two characters on perfectly ordered taxa. The ordered matrix

The distance matrix
The Minimum Contradiction on continuous characters was tested on a set of independently analyzed data representing craniofacial properties of hominid fossils. The results obtained with the Minimum Contradiction Method are compared to those obtained with TNT in a recent article in Nature. González-José et al 6 have analysed sets of craniofacial landmarks representing the flexure of the cranial base, facial retraction, neurocranial globularity, and masticatory apparatus. Phylogenetic relationships among Homo species and hominid taxa were obtained with the maximum parsimony module for continuous characters in TNT. The reader is referred to González-José et al 6 for the details on the extraction of the data.
Similarly to González-José et al, we have preprocessed the 4 sets of landmarks with the Generalized Procrustes Analysis in Morphologika. 20 The Generalized Procrustes analysis is a superimposition method that rotates, scales and translates the landmarks to adjust for isometric effects of size and orientation. The distance between two taxa is computed as the sum of the absolute difference between each Procrustes coordinate. The best circular order was subsequently obtained by minimizing the contradiction C in Eq. (1). 11 Figure 8 shows the minimum contradiction matrix using Gorilla gorilla as reference taxon. Gorilla gorilla is taken as the reference taxon in order to be able to compare the results with González-José et al.

Minimum contradiction matrix
The matrix
Table 1 shows the best order obtained with the minimum contradiction approach and the order of the taxa on the maximum parsimony tree. (The best order is a circular order and Gorilla gorilla is adjacent to both P. aethiopicus and Pan troglodytes) Except for H. sapiens the specimens are very similarly ordered. The 2 main branches of the maximum parsimony tree are indicated by a colour in the Table 1.
Circular order obtained with the Minimum Contradiction and the Maximum Parsimony approach on a set of craniofacial landmarks of hominids (Maximum Parsimony order adapted from González-José et al). 6
Let us illustrate with an example the possibilities offered by the Minimum Contradiction Method to analyze phylogenetic data. In Figure 8, the largest values of
Contrarily to González-José et al our analysis is done without using a Principal Components Analysis (PCA). This simplifies considerably the interpretation of the results. Landmarks satisfying to a good approximation Prop. 1 can be identified quite simply. Once those characters are identified, one can discover which splits are supported by each character. Figure 9 shows a character that supports the second interpretation of Figure 8. The landmark 9 (Facial retraction) supports a split between Homo without H. habilis and H. rudolfensis and the other taxa. In that example, both interpretations are equally valid. 22

Examples showing how characters supporting well a split can be identified using Prop. 1 in this article. The order is the same as in Table I.A) The character “Facial retraction: landmark 9” supports the split between Homo without H. habilis and H. rudolfensis and the other taxa. B) Split for the character “Facial retraction: landmark 9”.
The level of contradiction can be used as an objective criterion to choose the reference node. As discussed in details in Thuillard,11,12 the reference node is an important choice in the presence of contradictions. In our example, the normalized level of contradiction is about 30% lower with Pan troglodytes as reference taxon. This suggests that Pan troglodytes is a better choice than Gorilla gorilla as a reference taxon. Figure 10 shows quite interestingly that the ambiguity concerning H. habilis is removed with Pan troglodytes as reference taxon. H. habilis belongs clearly to Homo. In summary, with the data analyzed here, H. habilis shares some characters with non Homo, but has a majority of characters shared with other Homo specimen, predominantly H. erectus/H. ergaster.

Minimum contradiction matrix
A deeper analysis of the above results would go much beyond the goal of this section. In this section we wanted to illustrate how information can be extracted from a minimum contradiction analysis on continuous variables.
The second example, illustrating the continuous minimum contradiction approach, shows how a character-based phylogenetic tree can be inferred from a distance matrix. A standard approach to constructing phylogenetic trees from continuous variables consists of discretizing the variables and to run a maximum parsimony software treating the discretized variables as characters. The difficulty with that approach is that the discretization may easily disrupt an underlying tree structure. This problem is particularly acute when 2-states characters are used. The Minimum Contradiction Method can be applied to remedy that problem. For illustration, we have taken from Ogando et al 23 a sample of 100 galaxies described by some observables and derived quantities. In this section, our goal is to illustrate how the Minimum Contradiction approach can be used in practice, in particular to discover structuring characters. The astrophysical implications are out of the scope of the present work. It will be presented in subsequent papers together with more in-depth analysis. In practice, identifying a priori characters that behave like on Figure 7a is difficult. For complex objects in evolution, this would require some good knowledge of the evolution of the characters together with some ideas about the correct phylogeny or at least a rough evolutionary classification. In astrophysics, the study of galaxy evolution has not yet reached this point.24–27 However, we want to show here how the approach presented in this paper can be extremely valuable even in cases with very little a priori hints.
In this example, three variables are selected: Brie, B-R, and OIII. Brie measures the surface brightness of the galaxy, on a negative logarithm scale. B-R is the difference between the B- and R-magnitudes: a high B-R indicates a red object (old stars and/or high metallicity), while a low B-R indicates a blue object (young stars and/or low metallicity). There is no a priori direct physical connections between the three variables. High OIII (star formation) could be expected to correspond to low B-R (young stars). As shown in Figure 11, that is not always true, due in large part to the dependence of B-R on the metallicity of the stars.

Analysis of 3 selected characters Brie, OIII and B-R on an ensemble of 100 galaxies ordered with the Minimum Contradiction method. A) Distance matrix
After ordering, a number of clusters are clearly recognized. The galaxies associated to the discrete character “High Brie” are far from being perfectly ordered. The data cannot be described well with either a split network or a tree. This problem can be solved by discretizing the variables. In Figure 11b, the 3 ordered variables are represented together with a discretization of the input variable using threshold values (dashed lines). Discretization removes most contradictions on the order (In order to see it, let us consider the character Brie. Let us code Brie High as 1 and Brie low as 0. The discretized function fulfils Prop. 1 as it has only a minimum and any horizontal line crosses the discretized function at most twice). The distance matrix corresponds well to a split network. The split network can be represented, in first approximation, by an X-tree. To do so let us move the boundary (dashed line) separating “low” from “high Brie” slightly to the right. The main split in the tree corresponds to the “High Brie” and “Low Brie” branches. Each branch is split into two other branches defined by the character states, “low OIII”, “High OIII” for “Low Brie” and “low B-R”, “High-B-R” for “High-Brie”. The resulting tree is shown in Figure 11c.
The main splitting character is Brie for which our discretization separates our sample in two roughly equal bins. That is not the case for OIII and B-R for which low OIII and high B-R are two small and distinct groups. All high Brie galaxies are in the high OIII bin. Indeed, a low OIII corresponds to an absorption feature, while a high OIII indicates an emission line due to star formation. As a consequence, in this limited sample, low surface brightness galaxies (main left branch) do have star formation, and some high surface brightness objects show only an OIII absorption feature (rightmost branch). All high B-R galaxies have high Brie and high OIII. This means that in this sample, the red objects have a low surface brightness, but they have some star formation. They are thus not simply ageing galaxies, but probably form stars with high metallicity. Conversely, all low OIII galaxies of our sample have a low B-R, so that blue objects do not necessarily form a lot of stars.
A better understanding of the groupings and their physical implications would require the investigation of other properties of the objects. The relative complexity of the correlations between our three characters implies that a correct classification cannot be made by dichotomizing the variables beforehand. A more objective and multivariate point of view is necessary to precise the separating value between for instance “high” and “low” as in our present study. Indeed, the discretization is here used only to depict more easily the multivariate and continuous ordering of the objects in the sample. Figure 11c is a synthetic classification shown by the distance matrix 11b and obtained from the Minimum Contradiction method using fully continuous information.
The Minimum Contradiction approach furnishes an objective justification to using continuous variables or characters in phylogenetic studies. Provided the taxa can be ordered so that each character fulfils the Kalmanson inequalities then there exists a split network or a tree representing exactly the distance matrix. We have shown that the Kalmanson inequalities are fulfilled if the values of each character can be embedded into a function with at most a local maximum and a local minimum, and crossing any horizontal line at most twice. In practical applications the level of contradiction of the minimum contradiction matrix furnishes an objective measure of the deviations to a tree or split network. This approach was applied to a set of continuous characters, representing faciocranial landmarks of hominids, already analyzed with a maximum parsimony approach. 6 While the results are found to be very similar to the maximum parsimony approach, the Minimum Contradiction method furnishes supplementary information: i) Problematic relationships between taxa are visualized. ii) Characters supporting quite well a split can be discovered as they correspond to single characters fulfilling very well the Kalmanson inequalities. iii) Our approach can also select the best outgroup (reference taxon). The best outgroup leads to the order with the smallest level of contradiction.
Discovering the structuring characters among a set of continuous characters is a notoriously difficult task. The search for structuring characters can be greatly facilitated by looking for subsets of characters that satisfy best the Kalmanson inequalities. This approach was applied to a set of 40 characters on 100 galaxies to extract the structuring characters. Quite interestingly, while discretization of continuous characters is often problematic, discretization with the Minimum Contradiction method can help removing contradictions from a split network or tree structure.
Disclosure
The authors report no conflicts of interest.
Footnotes
Acknowledgements
We thank Emmanuel Davoust for the compilation of the data from the Ogando et al
23
paper and from the Hyperleda database (
). Our thanks go also to Dr. R. González-José for his helpful comments.
