Abstract
I examined hypotheses about lateral transfer of type II antifreeze protein (AFP) genes among “distantly” related teleost fish. The effects of episodic directional selection on amino acid evolution were also investigated. The strict consensus results showed that the type II AFP and type II antifreeze-like protein genes were transferred from
Introduction
Many prokaryotes and eukaryotes that are found in cold environments survive freezing because their cells produce antifreeze proteins (AFPs).1–9 These molecules function either by preventing endogenous fluids from freezing or by enabling survival despite the freezing of these fluids. Based on structural differences, five different types of AFPs have been described for teleost fish.5,10,11 Type II AFP 5,11 shows a high degree of structural similarity among “distantly” related taxa, such as between

A rooted species phylogeny showing 3 lateral gene transfers (LGTs).
Herring, rainbow and Japanese smelts produce type II AFPs that require calcium ions for their antifreeze activity (ie, Ca2+-dependent), whereas type II AFPs produced by the sea raven and longsnout poacher do not (ie, Ca2+-independent).5,11 For this reason, it has been speculated that the calcium-dependent and calcium-independent type II AFPs have different mechanisms for preventing ice formation.5 It is also noteworthy that this group of freeze-preventing proteins, which are thought to have evolved from the sugar-binding domain of C-type (Ca2+-dependent) lectins, has 10 cysteine residues that give rise to 5 disulfide bridges in identical positions.5,11,13,14 Five disulfide bridges are unique to the type II AFP, whereas C-type lectins typically form 2 to 3 disulfide bridges (see Fig. 1 by Graham and colleagues 11).
Evolutionary analyses have been conducted to further understand the remarkable structural similarity of the type II AFP genes in “distantly” related teleost fish species. One investigation, which was based on several independent lines of evidence, proposed lateral gene transfer (LGT) as a possible explanation.11 However, the common origin and independent evolution (ie, convergent evolution) hypotheses can potentially also account for the taxonomic distribution/conservation of type II AFP genes. The common origin hypothesis holds that the type II AFP gene evolved once in the most recent common ancestor of all the type II AFP producing species, after which it remained highly static and was lost independently at least 3 different times.5 The convergent evolution hypothesis explains the taxonomic distribution of the similar type II AFP genes as being the result of at least 3 independent evolutionary events. In the current study, the notion of LGT was further tested using methods implemented in the program Prunier 15 and further discussed in light of the convergent evolution and common origin hypotheses. In addition, the effects of episodic directional selection on codon evolution, especially at those sites coding for the 10 cysteines involved in disulfide bridge formation, were assessed using a mixed effects model of evolution (MEME) 16 available on the Datamonkey website.17 The results of the various analyses indicated that at least 3 lateral transfer events involving genes coding for the type II AFP/antifreeze-like protein (AFLP) have occurred, and that episodic directional selection has affected the evolution of the two cysteine-containing sites that are unique to the type II AFP.
Material and Methods
Sequences
The nucleotide sequences of all the taxa shown in Figure 2 in the paper by Liu and colleagues 5 were downloaded from GenBank, except for

An unrooted maximum likelihood tree of the type II AFP/AFLPs and lectins.
Taxa and GenBank accession numbers/source references of the type II AFPs/AFLPs and lectins used in the study.
Data analyses
The software package DAMBE (Version 5.2.57) 18 was used to manage the data and to match the codons against the aligned amino acid sequences. The amino acid alignments (Table 2), which are available upon request, were obtained using the MAFFT (Version 6) 19 web server (available at http://mafft.cbrc.jp/alignment/server/). The alignment mode G-INS-i, 20 with an offset value of 0.2 and the MAFFT homolog function “turned on” (all other parameters were left at their default values), was used for the amino acid sequences.
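The step of matching codons against an aligned amino acid sequence can be illustrated with a minimal Python sketch. It is not the DAMBE implementation; the function name `thread_codons` and the toy sequences are hypothetical, and the sketch assumes that the coding sequence is in frame and has no stop codon.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto an aligned protein sequence.

    Each residue consumes the next codon of the coding sequence; each gap
    character in the protein alignment becomes a three-character codon gap.
    """
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * residues:
        raise ValueError("coding sequence is shorter than the protein requires")

    codons = []
    pos = 0  # current offset into the unaligned coding sequence
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Toy example (hypothetical sequences, not taken from the study data).
aligned = "MC-KT"
cds = "ATGTGTAAAACT"
codon_alignment = thread_codons(aligned, cds)
print(codon_alignment)
```

Because gaps are inserted only in multiples of three, the resulting nucleotide alignment keeps every codon intact, which is what codon-based selection analyses require.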
Amino acid alignment of type II AFPs/AFLPs and lectins.
A species phylogeny was created using the iTOL web server (available at http://itol.embl.de/other_trees.shtml), based on the NCBI taxonomic classification and the information reported by Liu and colleagues. 5 In the resulting tree, the phylogenetic relationships of
Prunier (version 2.0), 15 a program that works in conjunction with Treefinder, 25 was used to detect potential LGT events. Both the “slow” and the “fast” methods were applied in this investigation. The “slow” method uses the Kishino-Hasegawa, 26 Shimodaira-Hasegawa, 27 expected likelihood weights 28 and approximately unbiased 29 tests to identify statistically significant topological differences between a gene tree and a species tree. LGT events are inferred in the part of the tree where statistically significant topological conflicts between the species and gene trees exist (see details by Abby and colleagues 15 ). The gene tree was inferred, in both the “fast” and “slow” Prunier analyses, based on the nucleotide alignment and the model GTR + G8 + I. The “fast” method works by finding a maximum statistical agreement forest (MSAF) between the gene and species trees (see details by Abby and colleagues 15 ). MSAF is defined as the minimum number of branches that are required to be eliminated in order to obtain statistical agreement between the two topologies. 15 LGT events were inferred in situations in which branches with statistically significant support (ie, support of 95% or higher) must be cut in order to achieve MSAF. The “fast” Prunier method calculates Expected-Likelihood Weights applied to Local Rearrangements (LR-ELW) edge support values 28 (see Fig. 2) for the gene tree. The aligned codon sequences and the four different species phylogenies (see previous paragraph) were provided as input for all Prunier analyses. For the “slow” method, the default parameter values were used. In the “fast” analyses, the following parameters were assigned values that differed from the default settings: boot.thresh.conflict = 95 (ie, cut-off support value for topological conflict); fwd.depth = 1 (ie, maximal depth at which Prunier looks forward to find a significant LGT event when the current LGT is not significant).
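The general idea behind expected likelihood weights can be sketched in a few lines of Python. This is an illustration of the published ELW concept applied to whole candidate topologies, not the LR-ELW edge support computed by Treefinder or Prunier. The function names and the log-likelihood values are hypothetical and chosen only for demonstration.

```python
import math


def expected_likelihood_weights(boot_lnl):
    """Average, over bootstrap replicates, the normalized likelihood of each tree.

    boot_lnl[b][i] is the log-likelihood of candidate tree i on replicate b.
    """
    n_trees = len(boot_lnl[0])
    totals = [0.0] * n_trees
    for row in boot_lnl:
        best = max(row)  # subtract the maximum to avoid numerical underflow
        rel = [math.exp(lnl - best) for lnl in row]
        norm = sum(rel)
        for i, value in enumerate(rel):
            totals[i] += value / norm
    return [total / len(boot_lnl) for total in totals]


def confidence_set(weights, level=0.95):
    """Return tree indices, by decreasing weight, until `level` is reached."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += weights[i]
        if cumulative >= level:
            break
    return chosen


# Hypothetical log-likelihoods: 2 bootstrap replicates x 3 candidate trees.
boot_lnl = [
    [-1000.0, -1010.0, -1020.0],
    [-1001.0, -1012.0, -1019.0],
]
weights = expected_likelihood_weights(boot_lnl)
print(weights, confidence_set(weights))
```

A topology that falls outside the confidence set at the chosen level would be regarded as significantly worse than the retained ones, which is the kind of evidence on which a significant gene tree/species tree conflict rests.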
MEME 16 was used to detect possible episodic directional selection on codon sites that has occurred in a small number of branches in the type II AFP gene tree. MEME models variable dN/dS across lineages at a given codon site, such that a certain fraction of branches is allowed to evolve neutrally or under negative selection, whereas the remaining proportion is permitted to evolve under episodic directional selection. To test for evidence of this type of selection, a likelihood ratio test is performed between the aforementioned model and the nested null model, which constrains the parameter values for episodic directional selection to lie between 0 and 1. Prior to running the MEME analysis, the codon alignments of type II AFP were screened for recombination events using the program GARD 30 with the following settings: site-to-site variation = general discrete and rate classes = 4. Recombination has the potential to confound inferences of codon selection and, consequently, needs to be accounted for before running the analyses. No significant breakpoints were detected in the current alignment by the GARD algorithm. The model selection tool available in Datamonkey 17 chose the following optimal model for the MEME analysis: 001102, with an AIC of 15118.9.
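The six-character model code can be decoded with a short Python sketch. It assumes the HyPhy/Datamonkey convention in which the characters give the rate class of the substitution pairs AC, AG, AT, CG, CT and GT, in that order (so that 010010 corresponds to HKY85 and 012345 to the general time-reversible model). The function name `decode_model` is hypothetical.

```python
# Assumed order of the six reversible substitution pairs in the model string.
PAIRS = ("AC", "AG", "AT", "CG", "CT", "GT")


def decode_model(code: str) -> dict:
    """Group the six substitution pairs by the rate class given in `code`."""
    if len(code) != len(PAIRS) or not code.isdigit():
        raise ValueError("expected a six-digit model string such as '010010'")
    classes = {}
    for pair, digit in zip(PAIRS, code):
        classes.setdefault(int(digit), []).append(pair)
    return classes


selected = decode_model("001102")
for rate_class, pairs in sorted(selected.items()):
    print(rate_class, pairs)
```

Under this assumed convention, the selected model 001102 has three rate classes: AC, AG and CT share one rate, AT and CG share a second, and GT has its own.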
Results
Species and gene trees
The rooted species phylogeny, generated based on the NCBI taxonomic classification and published systematic studies of teleost fish, is shown in Figure 1. As noted in the material and methods section, the phylogenetic positions of
Figure 2 shows the unrooted gene tree that was derived from the aligned nucleotide data in the 4 Prunier runs. A number of major conflicts, indicated by the arrows in Figure 2, are apparent between the gene and species trees. In the former, all of the type II AFP and AFLP producing taxa form a monophyletic group, which is not the case in the latter, in which
Lateral gene transfer
The results of the Prunier runs differed somewhat depending on the species phylogeny and method used. The solid arrows, which indicate the direction of LGTs (Fig. 1), constitute the strict consensus results obtained from all 8 analyses; that is, when implementing 4 different species trees for each method (ie, either the “fast” or “slow” method). These events included the transfer of type II AFP gene from
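The strict consensus across the 8 analyses amounts to keeping only those transfer events that were inferred in every run. A minimal Python sketch follows. The lineage labels and the per-run event sets are placeholders invented for illustration, not the events reported here, and `strict_consensus` is a hypothetical helper.

```python
from itertools import product


def strict_consensus(runs):
    """Return the LGT events (donor, recipient) inferred in every run."""
    event_sets = list(runs.values())
    return set.intersection(*event_sets) if event_sets else set()


methods = ("fast", "slow")
species_trees = ("tree1", "tree2", "tree3", "tree4")

# Placeholder events shared by all runs (hypothetical lineage labels).
core = {("lineage_A", "lineage_B"), ("lineage_A", "lineage_C"), ("lineage_D", "lineage_E")}

# One event set per (method, species tree) combination: 2 x 4 = 8 analyses.
runs = {(m, t): set(core) for m, t in product(methods, species_trees)}

# An extra event found in only some runs is excluded from the strict consensus.
runs[("fast", "tree2")].add(("lineage_F", "lineage_G"))
runs[("slow", "tree4")].add(("lineage_F", "lineage_G"))

consensus = strict_consensus(runs)
print(sorted(consensus))
```

Events supported by only some species trees or only one method are dropped, so the strict consensus is the most conservative summary of the runs.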
Episodic directional selection
The MEME 16 analysis showed evidence for 16 codon/amino acid sites in the alignment having been influenced by episodic directional selection (significance level = 0.05) (see Tables 2 and 3). Two of them are cysteine-containing sites (ie, sites 71 and 110; see Table 2) that are involved in forming the unique 5th disulfide bridge of the type II AFP.5,11
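How a per-site likelihood ratio statistic is converted into a decision at the 0.05 level can be sketched in Python. The sketch assumes that the null distribution of the statistic is a mixture of chi-squared distributions with 0, 1 and 2 degrees of freedom. The mixture weights used below (0.33, 0.30, 0.37) are an assumption here and should be checked against the MEME publication. 16 The site numbers and statistics are invented for illustration and are not results of this study.

```python
import math


def mixture_pvalue(lrt, weights=(0.33, 0.30, 0.37)):
    """P-value of an LRT statistic under a chi-squared mixture with 0, 1, 2 df.

    The chi-squared(0) component is a point mass at zero, so it contributes
    nothing to the tail probability of a positive statistic.
    """
    if lrt <= 0:
        return 1.0
    _, w1, w2 = weights
    tail_df1 = math.erfc(math.sqrt(lrt / 2.0))  # survival function, 1 df
    tail_df2 = math.exp(-lrt / 2.0)             # survival function, 2 df
    return w1 * tail_df1 + w2 * tail_df2


def significant_sites(site_lrt, alpha=0.05):
    """Return the sites whose mixture p-value does not exceed alpha."""
    return sorted(s for s, lrt in site_lrt.items() if mixture_pvalue(lrt) <= alpha)


# Hypothetical per-site LRT statistics (site number -> statistic).
site_lrt = {1: 0.4, 2: 10.0, 3: 1.0}
print(significant_sites(site_lrt))
```

With these invented values only site 2 passes the threshold; sites with small statistics are retained under the null model of neutral or negative selection.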
Codon (amino acid) sites found to be under episodic diversifying selection.
Discussion
Lateral gene transfer and the evolution of the type II AFP
The evolutionary history of the type II AFP gene appears to be rather complex, involving at least 3 events of LGT.11 When it comes to improving our understanding of the evolution of this freeze-preventing protein, a key question involves the point in its phylogeny at which the 5th disulfide bridge evolved. The number of disulfide bridges in the C-type lectins and/or type II AFLPs varies considerably but is always fewer than 5.11 Thus, the 5th disulfide bond, in addition to the other 4 disulfide bridges, makes the type II AFPs unique relative to the closely related C-type lectins and/or type II AFLPs.5,11 The evolution of the 5th disulfide bridge requires the appearance of a cysteine in amino acid position 110 (Table 2 and Fig. 1), provided that one already existed at site 71. In fact,
Based on the currently available data, it is not possible to determine, under the assumption of 3 LGTs, whether the calcium-dependent type II AFP gene evolved prior to the calcium-independent one or vice versa. Thus, two equally parsimonious scenarios, each requiring a minimum of two changes when optimized on the species tree (Fig. 1), are possible. In the first scenario, the calcium-dependent type II AFP gene evolved first and was subsequently transferred from the ancestral lineage of the
In addition to the LGT scenario, two other hypotheses can be invoked to explain the distribution/conservation of type II AFP genes in the taxa included in the current study. Those are the common origin and the independent evolution (ie, convergent evolution) hypotheses. The common origin hypothesis, which is advocated by Liu and colleagues, 5 holds that the type II AFP gene evolved once in the most recent common ancestor of all the type II AFP producing species, after which it was lost independently at least 3 different times. 5 To explain the remarkable taxonomic conservation/distribution and the evolution of the calcium-independent type II AFP gene, the common origin hypothesis assumes two independent losses of calcium-binding sites 5 and that the gene has remained highly static since it evolved about 280 million years ago (ie, the estimated divergence time between Ostarioclupeomorpha and Euteleostei 31 ). This is especially true with regard to the
The convergent evolution hypothesis explains the taxonomic distribution/conservation of the type II AFP genes analyzed here as being the result of a number of independent origination events. Independent evolution of a gene in “distantly” related lineages can sometimes be difficult to discern from LGT because both processes can give rise to sequences with similar nucleotide compositions. However, transferred nucleotide sequences (ie, as opposed to convergent evolution) are more likely to show distinctive evolutionary patterns relative to the other genes (regions) in the genome of the putative recipient species, while at the same time exhibiting substitution patterns more similar to those in the assumed donor genome. In fact, Graham and colleagues 11 investigated molecular evolutionary patterns, such as intron substitution rates and codon usage bias in exons, in the
Episodic directional selection
The MEME analysis showed that the cysteine-containing sites 71 and 110 (Table 2), which are responsible for the formation of the 5th disulfide bridge, have been affected by episodic directional selection (Table 3). This result suggests that this mechanism was instrumental in driving the evolution of the type II AFP gene from a C-type lectin precursor. Such an evolutionary event is highly likely to have been adaptive in nature, since directional positive selection is sometimes thought to be associated with adaptive (functional) changes in protein-coding genes;32,33 that is, by favoring fitness-enhancing mutations. An earlier study suggested that episodic positive selection was also important in driving the differentiation of duplicated antifreeze protein genes in two Fragilariopsis species after the ancestral lineage acquired the sequence from the basidiomycetes. 33 It is also noteworthy that none of the ice-binding sites (here Thr104 and Thr106) and Ca2+-coordinating residues (here Asp102 and Glu109) identified in the
Funding Sources
Edinboro University of Pennsylvania provided financial support for the research project.
Competing Interests
Author(s) disclose no potential conflicts of interest.
Author Contributions
Conceived and designed the experiments: US. Analyzed the data: US. Wrote the first draft of the manuscript: US. Contributed to the writing of the manuscript: US. Agree with manuscript results and conclusions: US. Jointly developed the structure and arguments for the paper: US. Made critical revisions and approved final version: US.
Footnotes
Acknowledgements
I thank Laurie Graham, Peter L. Davies and the authors (ie, Yang Liu, Zhengjun Li, Qingsong Lin, Jan Kosinski, J. Seetharaman, Janusz M. Bujnicki, J. Sivaraman, Choy-Leong Hew) of the paper entitled “Structure and Evolutionary Origin of Ca2+-Dependent Herring Type II Antifreeze Protein” for sending me the data used in their studies of the type II antifreeze protein. I am also grateful for the constructive comments provided by two anonymous reviewers.
