Abstract
With the discovery of an increasing number of functional noncoding RNAs (ncRNAs), it becomes essential to consider them more seriously as important players in species evolution. Although tests for negative selection of ncRNAs have existed since the beginning of this century, the
To understand how species evolve and adapt to their environment, tests for natural selection have been developed. The common assumption is that parts of the genome that are responsible for adaptive phenotypic changes evolve faster than other parts. Most proteins and nucleic acids exert their biological function by means of well-defined interactions. The specificity of functional interactions as well as the need to avoid undesired binding activities translates into selection pressures on both the sequence and the 3-dimensional structure of proteins and nucleic acids. A relatively simple test for estimating selection pressures on protein-coding genes was developed in the 1980s 1,2 and relates the rate of nucleotide changes that cause an amino acid change (non-synonymous changes) to the rate of silent nucleotide changes (synonymous changes), referred to as the Ka/Ks or dN/dS ratio. Ratios much smaller than 1 indicate negative selection, i.e., conservation of the protein sequence. Higher ratios, up to about 1, are usually interpreted as relaxed constraint. If the ratio exceeds 1, the excess of amino acid-changing mutations is compatible with accelerated evolution or is a sign of positive selection. Despite the increasing acknowledgment that noncoding RNAs (ncRNAs) are functional, a comparable test for ncRNA genes did not exist until recently. 3
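The interpretation of the Ka/Ks ratio described above can be written down as a minimal sketch. The function name `interpret_ka_ks` and the `tolerance` band around 1 are illustrative assumptions, not part of any published test; real analyses estimate Ka and Ks from codon alignments and assess significance statistically.

```python
def interpret_ka_ks(ka: float, ks: float, tolerance: float = 0.2) -> str:
    """Map a Ka/Ks ratio to the qualitative categories used in the text.

    `tolerance` is a hypothetical band around 1, chosen only for
    illustration; it is not a statistically justified cutoff.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega < 1 - tolerance:
        # Far fewer amino acid-changing substitutions than silent ones.
        return "negative selection"
    if omega > 1 + tolerance:
        # Excess of amino acid-changing substitutions.
        return "positive selection or accelerated evolution"
    # Non-synonymous and synonymous rates are roughly equal.
    return "relaxed constraint"


if __name__ == "__main__":
    for ka, ks in [(0.01, 0.1), (0.1, 0.1), (0.3, 0.1)]:
        print(ka, ks, interpret_ka_ks(ka, ks))
```

The sketch only separates the three regimes; it does not distinguish accelerated evolution from positive selection, which requires additional information.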
Importantly, in the case of RNAs, structure formation is dominated both thermodynamically and kinetically by the secondary structure, i.e., the pattern of base pairs and unpaired bases. The simplicity of RNA secondary structures and their discrete combinatorial nature make it possible to describe selection pressures acting on the structure in terms of comparably simple rules that pertain to the preservation and turnover of base pairs. Sequence variations that locally maintain base pairing patterns are indicative of negative selection, in particular compensatory substitutions, such as the replacement of a GC pair by a CG, AU, or UA pair. On the other hand, substitutions that disrupt base pairs hint at relaxed constraints or positive selection. Conceptually, this is not different from synonymous and non-synonymous substitutions in the open reading frames (ORFs) of protein-coding genes. There is, however, an important practical difference between ORFs and RNA secondary structures: although codons are local in sequence, secondary structures are inherently nonlocal, usually involving pairs that are long-range with respect to the sequence. As a consequence, the assessment of selection pressures on secondary structure requires completely different computational tools.
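The base-pair rules in the preceding paragraph can be illustrated with a small sketch that labels each pair of a given secondary structure as unchanged, compensatory, consistent, or disruptive when two aligned sequences are compared. Several points are assumptions made for illustration: the structure is supplied in dot-bracket notation without pseudoknots, the two sequences are gap-free and of equal length, GU wobble pairs are counted as valid pairs, and the function names are hypothetical. This is not the procedure of any specific published tool.

```python
# Pairs treated as valid; including GU/UG wobble pairs is an assumption.
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def paired_positions(dot_bracket: str) -> list[tuple[int, int]]:
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced dot-bracket string")
    return sorted(pairs)


def classify_pairs(ancestor: str, descendant: str, dot_bracket: str) -> dict:
    """Label every base pair by the kind of substitution it has undergone."""
    if not (len(ancestor) == len(descendant) == len(dot_bracket)):
        raise ValueError("sequences and structure must have equal length")
    labels = {}
    for i, j in paired_positions(dot_bracket):
        old = (ancestor[i], ancestor[j])
        new = (descendant[i], descendant[j])
        if new == old:
            labels[(i, j)] = "unchanged"
        elif new not in CANONICAL:
            labels[(i, j)] = "disruptive"    # pairing is lost
        elif new[0] != old[0] and new[1] != old[1]:
            labels[(i, j)] = "compensatory"  # both bases changed, pair kept
        else:
            labels[(i, j)] = "consistent"    # one base changed, pair kept
    return labels


if __name__ == "__main__":
    structure = "((...))"
    print(classify_pairs("GCAAAGC", "GAAAAUC", structure))
```

Because the pairs are looked up by index, positions that are far apart in the sequence are handled in the same way as neighboring ones, which reflects the nonlocal nature of secondary structure noted above.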
It is important to realize that molecules are typically subject to multiple, superimposed selection pressures. In protein-coding genes, for example, functional elements such as SElenoCysteine Insertion Sequences (SECIS) or Internal Ribosomal Entry Sites (IRES) require tightly constrained RNA secondary structures within protein-coding sequences. This specific type of superimposed selective pressure yields substitution patterns that are recognizable by specialized computational tools. 4 Similar situations are observed in ncRNAs. For tRNAs, for example, the clover-leaf secondary structure and the 3-dimensional L-shape are required for loading into the ribosome and for recognition for charging, essentially independent of the sequence. On the other hand, tRNAs have an internal pol-III promoter, whose sequence must be maintained to ensure expression. Selection may also act on the expression level. For instance, the choice of rare codons as well as highly stable mRNA secondary structure may hamper translation. Carlini et al. 5 proposed that the balance between codon bias and mRNA secondary structure is mediated through the third codon position: here, natural selection might favor high GC or AT content to increase base pairing for weakly expressed genes and the opposite for highly expressed genes. It is a nontrivial and largely unsolved task to disentangle such superimposed selective forces. Presently available tools model only a single effect or at most a pair of specific selection pressures.
Selection pressures that independently act to maintain superimposed sequence and secondary structure features can lead to incongruent conservation of sequence and structure: in this case, sequence patterns and structural elements are shifted relative to each other. As a consequence, analogous base pairs no longer correspond to homologous sequence positions. This type of incongruent evolution violates the basic assumptions of all tools that measure secondary structure conservation: the secondary structure will not appear conserved in a sequence-based alignment, whereas in structure-based alignments nonhomologous nucleotides are aligned, leading to an exaggerated estimate of compensatory base pairs. Tools to identify such cases are in an exploratory stage of development at best. 6
Over the last two decades, several methods have become available to evaluate negative/stabilizing selection of secondary structures, mostly aimed at classical structured RNAs such as tRNAs, rRNA, or snRNAs. A common assumption of all these methods is that selection acts to preserve individual base pairs. The difference between the strict consensus model of
Probabilistic models can be used to determine the type of selection acting at a given locus. For negative selection, the expectation is that the rate of change is (very) low. Higher rates of change can indicate either accelerated evolution or positive selection. Accelerated evolution is characterized by a higher accumulation of changes within a short amount of time. 11 To identify accelerated regions, one should first identify negative selection for the orthologous locus in other species and then test for accumulation of species-specific changes. 12 Analyses of human accelerated regions (HARs) suggest that more than one evolutionary force shapes them, 11 including positive selection. 11,13
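The second step of this procedure, testing for an accumulation of species-specific changes in an otherwise conserved locus, can be sketched with a simple binomial tail probability. This is a deliberately simplified stand-in and not the method of the cited studies, which rely on phylogenetic models; the function name, the per-site background probability, and the example numbers are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of observing at least k changes among n sites.

    Assumes every site changes independently with background probability p
    on the lineage of interest, which is a strong simplification.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical example: 18 lineage-specific changes in a 118-nt locus
    # that is conserved in other species, with an assumed background
    # probability of 0.01 changes per site on this lineage.
    print(binomial_upper_tail(18, 118, 0.01))
```

A small tail probability indicates more changes than the background rate would suggest. On its own it does not show that the changes were advantageous, which is why the distinction from positive selection requires a phenotypic level of analysis.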
In contrast to accelerated evolution, positive selection occurs when the changed locus confers an advantage on the organism and is therefore actively selected for over a longer evolutionary time frame.
Although accelerated evolution is detectable at the primary sequence level alone, it is necessary to consider a phenotypic level for the detection of positive selection, to identify an advantage over the ancestral state. For ncRNAs and proteins, one should account for changes in the structure. This poses challenges for ncRNAs. Although it suffices for proteins to distinguish synonymous from non-synonymous substitutions, such a binary classification does not appear to work well for RNA secondary structures. 14
As a remedy, the
Researchers who wish to investigate selective pressures on ncRNAs should be mindful of the biological question and choose the most suitable approach and software (Table 1), keeping in mind the different selection pressures (Figure 1).
Types of selective pressures on noncoding RNAs and how to detect them.

Types of selection pressures in ncRNAs: (1) positive selection, acting on the structure, in which one species acquires a structural change in the orthologous ncRNA with an advantage over the ancestral structure; (2) accelerated evolution, acting on the primary sequence, in which the sequence of an ncRNA accumulates a relatively high number of changes compared with its orthologs over a short time span; and (3) negative selection, acting on the structure, in which the ncRNA structure is maintained across orthologs over relatively long evolutionary time.
There are several advantages to the approach taken by the
The
Although the
In some cases, it is possible not only to identify a locus under positive selection but also to reconstruct the evolutionary history itself with some accuracy. This amounts to determining the order of substitution events and can be achieved under the assumption that the structural differences between extant and ancestral structure represent the direction of the selective force. 13
Taken together, the time has come to learn more about the evolutionary history of various ncRNA genes and their role in species evolution. The
Footnotes
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by CNPq Brasil/scholarship of Science without Borders (246039/2012-4) (MBWC), the Volkswagen Foundation within the initiative “Evolutionary Biology” (KN), the Deutsche Forschungsgemeinschaft as part of the SPP 1738 (MBWC, KN, and PFS), and in part by the German Academic Exchange Service (DAAD), proj.no. 57390771.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Contributions
KN, MBWC, CHzS and PFS conceived and wrote the manuscript.
