Abstract
Analysis of gene sets can implicate activity in signaling pathways that is responsible for cancer initiation and progression, but is not discernible from the analysis of individual genes. Multiple methods and software packages have been developed to infer pathway activity from expression measurements for set of genes targeted by that pathway. Broadly, three major methodologies have been proposed: over-representation, enrichment, and differential variability. Both over-representation and enrichment analyses are effective techniques to infer differentially regulated pathways from gene sets with relatively consistent differentially expressed (DE) genes. Specifically, these algorithms aggregate statistics from each gene in the pathway. However, they overlook multivariate patterns related to gene interactions and variations in expression. Therefore, the analysis of differential variability of multigene expression patterns can be essential to pathway inference in cancers. The corresponding methodologies and software packages for such multivariate variability analysis of pathways are reviewed here. We also introduce a new, computationally efficient algorithm, expression variation analysis (EVA), which has been implemented along with a previously proposed algorithm, Differential Rank Conservation (DIRAC), in an open source R package, gene set regulation (GSReg). EVA inferred similar pathways as DIRAC at reduced computational costs. Moreover, EVA also inferred different dysregulated pathways than those identified by enrichment analysis.
Introduction
Cellular signaling generates a chain of protein–protein interactions, often terminating in the activation of transcription factors. Such signaling in molecular pathways induces and advances many human cancers. In principle, targeting the specific signaling pathways responsible for individual malignancies would yield an effective treatment. However, identifying the key signaling pathways relies on first inferring the signaling activity in that tumor. Ideally, coordinated changes in the phosphorylation state in network proteins could be measured to directly implicate specific signaling pathways in a malignancy, and the technology to measure such protein states is rapidly advancing. In the meantime, however, many algorithms use the existing transcriptional data to infer differentially regulated pathways. The accuracy of such inference relies in large part on the sets of genes annotated to each pathway (reviewed in Ref. 1–6). In analyses of gene expression data, it is essential to select sets of genes whose expression is altered because of pathway activation. For example, the TRANScription FACtor (TRANSFAC) database 2 assembles experimentally validated sets of genes resulting from transcription factor activation. Using these data with set statistics to infer coordinated changes in targets of transcription factors downstream of cell signaling pathways has been an effective substitute for directly inferring differential pathway signaling (eg, Ochs et al. 3 , and Fertig et al. 4 ). Regardless of the measurement technology, inference of signaling pathways thus requires statistical techniques to be able to account for changes in multiple molecular species.
Historically, analysis of differential pathway regulation from transcriptional data has been divided into two major classes of methodologies (reviewed in Irizzary et al. 5 ): over-representation methods and enrichment methods. Over-representation methods compare sets of genes annotated to pathways to a list of those genes that are significantly differentially expressed (DE) between two phenotypes. Enrichment methods employ a “soft” version of over-representation based on a summary statistic to characterize the level of differential expression of genes in the pathway relative to a null distribution. These methods have been extended to infer pathway members or networks from transcriptional data (eg, Tarca et al. 6 ).
Both over-representation and enrichment methods for detecting differential pathway regulation are robust at inferring consistent up- or down-regulation of pathway genes. However, alterations in cell signaling pathways may be associated with complex changes in gene expression because of pathway interactions. 7 Moreover, expression in individual genes is highly variable in human tumors8,9 in part because of the distinct evolution of individual tumors from the same cancer subtype. Thus, individual genes may contribute differently to alterations in the same pathway. As a result, pathways that are dysregulated in human tumors may exhibit complex, multivariate changes in variability that are not captured by the aggregation of statistics of individual genes in over-representation or enrichment analyses. Here, we review more recent methods for detecting differential regulation based directly on multivariate measures of pathway variability. Specifically, we focus on Differential Rank Conservation (DIRAC), 10 and a more computationally efficient alternative algorithm, expression variation analysis (EVA). We also introduce a new R package, gene set regulation (GSReg), 11 that implements these algorithms to facilitate inference of pathway dysregulation.
Pathway Analysis Methodologies
In this section, we briefly review algorithms for pathway analysis from transcriptional data. Currently, all such algorithms identify significantly perturbed pathways by applying gene set statistics to compare gene expression of pathway targets in one phenotype to gene expression of pathway targets in another phenotype. As a result, they rely critically on the numerous curated databases that annotate genes to pathways. 1 Regardless of the pathway targets, algorithms for pathway analysis can be divided into three major classes: over-representation, enrichment, and differential variability analyses. We list software that implements each technique in Table 1 and refer the reader to Irizarry et al. 5 , Khatri et al. 12 , and Maciejewski. 13 for more reviews and comparisons of over-representation and enrichment analyses. An overview is provided in Figure 1.

Pathway analysis methodologies from gene expression: (A) Over-representation analysis first performs a statistical test for each gene by comparing expression values in phenotypes to identify a set of significantly DE genes, obtaining a gene count
Examples of software available for gene set analysis, divided into three major families of algorithms: over-representation, enrichment, and differential variability analyses.
The first methodology, over-representation analysis, assesses similarity between the set of all DE genes and the set of genes annotated to a pathway (Fig. 1A), and was introduced in Khatri et al. 14 First, significantly DE genes between specified phenotypes are identified. For example, a set of DE genes may be defined by computing a Wilcoxon test to compare expression in two phenotypes for each gene measured and selecting significant genes as those having a false discovery rate below a threshold value of 5%. Then a gene set statistic is calculated for each pathway by applying a statistical test (eg, Fisher's test) that compares each set of pathway genes to the set of DE genes. Pathways whose members are significantly enriched for DE genes are called significant. The methods listed in top of the Table 1 are examples from this family.
Whereas over-representation analysis compares discrete sets of genes, the second methodology - enrichment analysis - formulates set statistics that summarize the overall level of differential expression for the pathway genes between the phenotypes. The first method of this class was gene set enrichment analysis (GSEA).
15
Generally, enrichment analysis calculates the differential activity of genes across phenotypes using a differential expression statistic (eg, a
Although they use different statistics, both over-representation and enrichment methods infer coordinated, average expression changes between phenotypes in sets of genes annotated to a pathway. Because they do not rely on a hard threshold, enrichment methods are more sensitive than over-representation methods at inferring coordinated expression changes in sets of genes. However, they may yield many more false positives. Regardless of their relative advantages, the false-positive rate of both tests may be dependent upon the number of genes in the set. 41 Moreover, both methods perform the best when changes in genes annotated to a pathway are consistent and relatively homogeneous in each phenotype; for example, sharply different expression values for a given gene are seen in most samples. However, tumor pathway dysregulation based on interactions among multiple genes may cause differential variability in gene expression between phenotypes. Therefore, the third family of the methods, differential variability analysis, is a multivariate approach that assesses variability within a pathway for a given phenotype and then compares these measures across phenotypes. This emerging methodology, pioneered in Eddy et al. 10 , and Zhang et al. 38 , has been extended to a broad set of algorithms summarized in Table 1, and is the focus of the remainder of this paper.
Differential Variability Analysis
Differential variability analysis first measures the level of variability in gene expression in a pathway for a given phenotype and then compares these levels for different phenotypes to determine differential pathway regulation. For example, different pathway genes may have expression outliers in distinct tumor samples relative to normal controls, captured with methods such as Open Grid Services Architecture (OGSA). 42 Such distinct alterations in individual tumors may also increase variability of expression in individual genes, motivating approaches that apply over-representation or enrichment analysis to variability statistics for individual genes. 43 Nonetheless, in general, alterations in expression may depend strongly on interactions among the genes in the pathway. Consequently, new algorithms employing multivariate statistics are emerging in order to model such complex shifts in variability from one phenotype to another.
Zhang et al.
38
developed one of the first methods of this type. Their algorithm calculates correlations between all pairs of genes within a pathway given a phenotype as a measure of pairwise interactions, and then a
Regrettably, none of the algorithms described above provide a robust software package to facilitate application to new data. They also rely on continuous, normalized gene expression measurements. We have previously shown that rank-based techniques (ie, methods that depend only on the relative ordering of expression values) (i) are more robust to the preprocessing and normalization of data
44
than techniques relying on normalized gene expression, (ii) are competitive with the best classification methods in discriminating among phenotypes (eg, Geman et al.
45
), and (iii) can be far simpler to explain and interpret in biological terms.46,47 Therefore, Eddy et al proposed DIRAC
10
as an ordering-based method for differential variability analysis. Given a pathway and a phenotype, DIRAC generates a binary template (one component for each pair of genes) for the ordering of the expression values for the genes in the pathway, and then calculates the average “distance” between training samples and the template as the measure of the pathway variability of the phenotype. The “distance” used in DIRAC involves the Hamming distance over the pairwise comparisons. Permutation tests are used to estimate
Although DIRAC is effective in inferring dysregulated pathways, the permutation test on which it is based is computationally inefficient, and becomes infeasible when applied to large numbers of pathways and samples. Therefore, we propose an alternative approach called EVA.
36
Given a phenotype and pathway, EVA measures the average distance between two randomly chosen expression profiles for the phenotype. More specifically, the EVA variability statistic is the expected Kendall-τ distance
48
between the rank vectors corresponding to two independent copies of expression profiles over the pathway. Kendall-τ is a distance that quantifies the difference between the orderings of two vectors. In this case, the permutation distance is defined for the rank vectors of gene expression profiles for pathway genes. The Kendall-τ distance between the two gene expression profiles is essentially the number of disagreeing comparisons between all pairs of genes in the pathway, analogous to the change of rank in DIRAC. To estimate a variability statistic from samples in each phenotype, the EVA algorithm then averages the Kendall-τ distance between each pair of samples from that phenotype. These variability statistics are then compared between two phenotypes for each pathway to estimate pathway dysregulation between phenotypes. The
Gsreg Package
We develop GSReg R package to perform differential variability analysis using DIRAC and EVA, available through Bioconductor. Here, we demonstrate our software by reproducing the results from the DIRAC paper and replicating these results with EVA. Since the original data of DIRAC paper was in Matlab format, we provide the data in a complementary R package, GSBenchMark, 50 also available through Bioconductor.
Figure 2 shows the results of variability pathway analysis comparing head and neck squamous cell carcinoma samples to matched normal controls.
51
Figure 2 compares variability statistics of pathways in tumors (

Comparison of dysregulated pathways identified by (A) DIRAC and (B) EVA in comparing head and neck squamous cell carcinoma samples (
DIRAC and EVA have been shown mathematically similar.
36
The main advantages of the EVA are efficiency in calculation and a more straightforward interpretation that does not involve a “template” but rather is simply the average distance between two samples. To illustrate the computational advantage, for the head and neck cancer data, using a Lenovo ThinkPad with Core™ i7-3720QM Intel CPU at 2.6 GHz and only 1000 permutations of phenotype labels, the DIRAC analysis required 207 seconds while the EVA analytical computation only took 0.3 seconds. Figure 3 compares the corresponding


Comparison of dysregulated pathways identified by EVA and LIMMA analysis. Each point represents a pathway, and we compare samples from head and neck squamous cells (
To illustrate the difference between the outcomes of EVA and enrichment analysis, we chose awell-known enrichment method, the Wilcoxon gene set test implemented in Linear Models for Microarray Data (LIMMA).
31
For these analyses, we apply the Benjamini–Hotchberg procedure
52
to account for multiple hypothesis testing, which was not feasible in the previous comparison with the DIRAC analysis because of the relatively coarse resolution of
Conclusion
Cancer is known to be the result of the perturbations in signaling pathways. Many algorithms have been proposed to identify and analyze these perturbations from transcriptional data. We reviewed three major families of pathway analysis methods, each having different criteria for calling a pathway perturbed: over-representation of DE genes, enrichment of large DE statistics in pathway genes, and significant difference in variability of gene expression. This last class of methods is particularly adept at inferring dysregulated pathways with differential variability in multivariate gene expression patterns. Here, we implemented one such variability analysis algorithm, DIRAC, 10 and a novel, more efficient alternative EVA in an R package GSReg.
For future work, methods that incorporate more information about biological mechanism may enhance interpretation and reproducibility of learned dysregulated pathways. Also, methods that can assess variability across more than two phenotypes are needed to infer dysregulated pathways in distinct tumor subtypes. Moreover, existing methods for gene set analysis either detect the differential expression or differential variability to identify differential regulation across phenotypes. A more versatile methodology might be a combination of both types of pathway analyses. These combinations may be implemented by using the Kendall-τ distance to compare two independent samples, but from two different phenotypes. Thus, extending the sample comparisons in EVA would provide an algorithm to compare pathway variability within phenotypes with pathway variability between phenotypes.
Author Contributions
Conceived and designed the experiments: BA, DG, EJF Analyzed the data: BA. Wrote the first draft of the manuscript: BA. Contributed to the writing of the manuscript: BA, DG, EJF. Agree with manuscript results and conclusions: BA, DG, EJF. Jointly developed the structure and arguments for the paper: BA, DG, EJF. Made critical revisions and approved final version: DG, EJF. All authors reviewed and approved of the final manuscript.
