Abstract
Acute cardiac allograft rejection is a serious complication of heart transplantation. Investigating molecular processes in whole blood via microarrays is a promising avenue of research in transplantation, particularly due to the non-invasive nature of blood sampling. However, whole blood is a complex tissue and the consequent heterogeneity in composition amongst samples is ignored in traditional microarray analysis. This complicates the biological interpretation of microarray data. Here we have applied a statistical deconvolution approach, cell-specific significance analysis of microarrays (csSAM), to whole blood samples from subjects either undergoing acute heart allograft rejection (AR) or not (NR). We identified eight differentially expressed probe-sets significantly correlated to monocytes (mapping to 6 genes, all down-regulated in ARs versus NRs) at a false discovery rate (FDR) ≤ 15%. None of the genes identified are present in a biomarker panel of acute heart rejection previously published by our group and discovered in the same data.
Introduction
Cardiac transplantation is the final treatment option for heart failure. Acute cardiac allograft rejection is a serious complication of heart transplantation that, left untreated, will result in organ failure. Furthering our understanding of the biological processes at work during rejection is thus very valuable. The use of peripheral blood as a readily available biological sample to investigate pathobiological mechanisms of acute allograft rejection is an active area of research.1–6 Drawing useful biological inferences from such studies is not trivial, however. While there is no doubt that the various cell populations in whole blood are undergoing important and complex regulatory changes during an acute rejection event, any signal we detect in whole blood cannot be directly attributed to differential expression in any of these individual cell populations. It is a consequence of both differences in the mean expression of all cell types between groups, as well as the differences in relative cell type abundances between these same groups. Traditional microarray analysis does not account for this. 7 This problem is often acknowledged,8–11 but seldom addressed despite several strategies having been proposed.7,12,13 Perhaps the most straightforward would be to separate the individual cell types experimentally (e.g., fluorescence-activated cell sorting or enrichment). Physical separation can be prohibitively expensive, however, and may itself affect transcript abundance levels.14,15 In addition, certain cell-types may not be readily isolated by traditional approaches. Sub-populations may not be uniquely identified by their surface signature alone or suitable antibodies may not be available.14–16 More recently, statistical methods for deconvolving cell-specific signal from whole blood microarray experiments have been proposed.7,12,13,17 In this report we have applied one of these statistical deconvolution approaches, cell-specific significance analysis of microarrays (csSAM), 7 to whole blood samples from subjects undergoing acute cardiac allograft rejection. CsSAM combines information from complete blood count (CBC), including white blood cell differentials, and whole blood microarray data to deconvolve cell-specific expression measurements that can then be compared across groups. The algorithm's ability to produce biologically relevant inference was originally assessed in the context of pediatric kidney transplantation and though it was able to identify ~200 monocyte-specific differentially expressed genes; the specific genes identified were not reported. We hypothesized that csSAM could be used to enrich whole blood microarray data in the context of acute cardiac allograft rejection. To our knowledge, this is the first report that has used a deconvolution approach such as csSAM in this context and reported individual genes to be differentially expressed.
Results
Cohort Characteristics
Patient demographics are listed under Table 1. Briefly, a single treatable acute rejection (International Society for Heart & Lung Transplantation, ISHLT grade ≥2R) sample from each of 10 subjects and one non rejection sample (ISHLT grade = 0R) from 16 subjects who did not have treatable acute rejection at any time within the first six months post-transplant were selected for analysis. Groups had similar mean age (48 ± 14 and 50 ± 16 in ARs and NRs, respectively) and were mainly composed of Caucasian subjects. NRs were more predominantly male (14/16 subjects) compared to ARs (6/10 subjects).
Demographics of cardiac transplant subject cohorts.
CBC Differentials
No statistically significant differences (P < = 0.05) were found between AR and NR subjects for any cell type or total leukocyte absolute counts. Similarly, no statistically significant differences in relative cell-type abundances were observed, though relative lymphocyte counts were generally lower in AR compared to NR. This was not statistically significant, however (Fig. 1, cell counts can be found in Supplementary Table 1).

Complete blood count leukocyte differentials reveal no statistically significant differences between groups in any cell sub-population.
Differential Gene Expression in Whole Blood
5063 probe-sets (5059 down-regulated in ARs and 4 up-regulated in ARs) were identified as differentially expressed in whole blood (Significance Analysis of Microarrays [SAM], false discovery rate [FDR] ≤ 5%—Fig. 2).

Whole blood differential expression signal is reduced once we account for sample heterogeneity.
We compared these probe-sets to our previously published panel and found that all panel members published in Lin et al. were present and highly ranked in the 5063 probe-set list. 1
CsSAM: Adjusted Whole Blood Expression
After adjusting for sample cell-type composition, 1474 probe-sets are called as differentially expressed (all down-regulated in ARs [FDR ≤ 20%]—Fig. 2). When we compared the probe-sets in this adjusted list against those identified in the traditional analysis (via SAM, above), 1105 probe-sets were common to both lists (not shown). None of the probe-sets unique to the adjusted list were present in the Lin et al. panel. 1
CsSAM: Cell-specific Differential Gene Expression
No cell-specific signal was detected at a reasonable FDR cutoff in neutrophils, lymphocytes, eosinophils or basophils. However, deconvolution to cell-specific expression yielded 8 probe-sets identified as differentially expressed (down-regulated in ARs) in monocytes (FDR ≤ 20%—Fig. 3). These probe-sets were mapped to 6 genes (using Affymetrix HG-U133 plus 2.0 annotation release 31). One probe-set lacked annotation (238320_at), while 2 others (202855_s_at and 202856_s_at) mapped to the same gene, SLC16A3. These are summarized in Table 2.

csSAM identifies cell type-specific differential expression in monocytes during acute cardiac allograft rejection.
Summary of differentially expressed probe-sets.
Cell-specific Functional Enrichment
The test statistic produced by the deconvolution algorithm was used to create a cell-specific ranked list of probe-sets for submission to Gene Set Enrichment Analysis (GSEA). 18 Enrichment analysis of the deconvolved monocyte expression using KEGG canonical pathways as input yielded 93 gene sets down-regulated in ARs relative to NRs in monocytes and 4 gene set up-regulated using an FDR cutoff of ≤25% (default value, as recommended by the Broad Institute). We found no gene sets identified as enriched in either ARs or NRs in the unadjusted whole blood analysis (FDR ≤ 25%). KEGG canonical pathways enriched in monocytes in ARs were largely associated with cancer (10 pathways), but also immune cell activity and infection (7 and 4 pathways, respectively) and cell migration/recruitment (3 pathways). The most significant gene sets (FDR ≤ 1%, Familywise error rate [FWER] P ≤ 0.05) are summarized in Supplementary Table 2.
Discussion
We have identified 6 genes whose transcript abundance is modeled to be significantly lower in the circulating monocytes of subjects diagnosed with acute cardiac allograft rejection. While this number may seem low, we nevertheless regard it as a biologically significant finding, particularly in light of the limitations of deconvolution using CBCs and white blood cell differentials as input (incomplete or inappropriate sample composition information may significantly affect our ability to detect cell specific differential expression; discussed below). The candidates are plausible for monocytes in this disease context. Furthermore, the functional enrichment results are also consistent for monocytes. The following genes were identified: SLC16A3, SLC6A6, ADAM8, CFLAR, ALPL and PRKDC. All genes were down regulated in AR.
SLC16A3 (MCT4—monocarboxylate transporter 4) transports lactic acid and pyruvate across the cell membrane. It is expressed in monocytic lineage cells, and is relatively more abundant in these cells when compared to other tissues (BioGPS).19,20 MCT4 is expressed at the cell membrane and is tightly bound to MCT1, 21 a related monocarboxylate transporter and known target of current immunosuppressive therapies. 22 Inhibition of MCT1 has been shown to reduce acute and chronic allograft rejection rates in rats.23,24 What function MCT4 may play in circulating, naïve monocytes in the context of acute allograft rejection is unclear, though it may relate to regulation of intracellular lactate homeostasis. 25 Intracellular lactate has been shown to enhance TLR4 signaling in monocytes and this was dependent on monocarboxylate transporter activity. 26 Interestingly, TLR4 signaling has been implicated in cardiac allograft rejection following transplantation, 27 and the canonical TL4 signaling pathway was enriched in our functional enrichment analysis (GSEA).
SLC6A6 (TAUT—taurine transporter) is a widely expressed metabolite transporter (BioGPS).19,20 Taurine plays an important cytoprotective role in phagocytic cells such as macrophages, by readily reacting with the reactive oxygen species formed by the myeloperoxidase system. 28 The resulting taurine chloramine is a potent anti-inflammatory.29–32 Reduced intra-cellular availability of taurine leading to increased pro-inflammatory signaling would be consistent with the down-regulation of SLC6A6 that we observe in monocytes in AR subjects.
ADAM8 (ADAM metallopeptidase domain 8) is thought to be involved in cell adhesion and cell-matrix interaction and has been implicated in inflammatory response and leukocyte migration following inflammatory response. 33 It is highly expressed in monocytic cell lineages, 34 and has been reported as a key regulator of inflammation in many contexts. 33 More recently however, ADAM8 was shown to limit inflammation in mice by increasing macrophage apoptosis, 35 which would be consistent with it being down-regulated in monocytes in AR subjects.
CFLAR (CASP8 and FADD-like apoptosis regulator) is reported to inhibit death-receptor induced gene induction, including pro-inflammatory signaling via NF-KB. 36 Less abundant CFLAR transcripts in AR subjects could therefore lead to both increased apoptosis and pro-inflammatory signaling.
The remaining two genes are broadly expressed. The exact physiological function of tissue nonspecific alkaline phosphatase (ALPL) is not known. DNA-dependent protein kinase, catalytic polypeptide (PRKDC) is involved in double stranded DNA break repair and recombination, but how this may relate to monocytic lineage cells specifically in this disease context is unclear.
Given the very disruptive nature of acute allograft rejection, it is not surprising that many differentially expressed probe-sets (between AR and NR subjects) were found in whole blood, as discussed previously. 12 More pertinent to our discussion here is the fact that, once we correct for the compositional variability that exists across our samples, and in the absence of any pre-filtering of probe-sets prior to statistical analysis, no differentially expressed probe-sets are detected unless we increase our FDR threshold to ≤20%. We might interpret this as meaning that any signal we detected in whole blood was less robust with respect to cell count differences in our samples. This can be a limitation when performing differential gene expression analysis in heterogeneous tissues (such as whole blood). It is often acknowledged,8–11 but seldom addressed. The fact that GSEA failed to detect any enriched gene sets in our whole blood analysis (FDR ≤ 25%) similarly suggests an absence of a biologically interpretable functional signal in the whole blood data. Conversely, GSEA carried out on the monocyte specific ranked-list of probe-sets yielded an abundance of down-regulated gene-sets in AR subjects (Supplementary Table 2).
An important limitation in our analysis is the possibility that we may have deconvolved to inappropriate cell sub-populations; adding to the noise in our data and reducing our ability to detect differentially expressed probe-sets. Since we assume that there is a unique expression profile that can identify each of the cell sub-populations that we are deconvolving to, heterogeneous sub-populations violate our underlying model. Deconvolution, then, should be to biologically relevant, functionally uniform sub-populations. It follows that deconvolving to non-homogeneous cell sub-populations, while possibly providing an incremental improvement over traditional whole blood analysis, is not optimal. Neutrophils, lymphocytes or monocytes are anything but functionally uniform and we should expect that sub-populations of these cells be engaged in functionally distinct activities during an acute allograft rejection event, even in the circulating state. We would argue that monocytes represent the most functionally uniform of the more abundant groups we deconvolved to (excluding eosinophils and basophils) and perhaps this is why we failed to discover any signal in any other cell type.
CBCs were selected because they were readily available in the clinic, but more granular alternatives exist in a research setting. Flow cytometry might offer a higher quality measure of the relative abundance of cell sub-populations and allow higher granularity. We could potentially deconvolve to dozens of different, and functionally distinct cell sub-populations and break down the monolithic neutrophil, lymphocyte and monocyte groups into more biologically relevant units. Alternatively, another family of deconvolution strategies may provide an even more sensitive and granular measure of sample composition using mixed and pure cell sub-population whole transcriptome profiles.16,37,38 Most importantly, such approaches are not limited by the existence of unique cell surface markers to identify a particular cell sub-population of interest.
Sample sizes were also problematic, particularly because of the use, by the csSAM algorithm, of a least-square regression approach that is inherently sensitive to outliers (though robust alternatives exist). Larger group sizes would both reduce any impact outliers may have on the csSAM's multiple linear regression stage and allow us to apply more stringent pre-filtering criteria. In addition, some subjects could be set aside for validation of the model in order to assess whether overfitting is a concern. Finally, it is important to bear in mind that the purpose of such an exploratory analysis is hypothesis generation rather than hypothesis testing. As such, any differentially expressed genes discovered in this fashion should be validated (e.g., RT-qPCR) in the cell sub-population in question. This may be challenging: not all subpopulations may be experimentally separated (e.g., because of a lack of unique membrane markers or a lack of antibodies for existing markers) and separation may itself affect abundance of the RNA transcript of interest (either directly or by eliminating necessary stimulus from surrounding cells). Nevertheless, such confirmation is necessary and planned as part of future work.
This exploratory analysis demonstrates the kind of inference that deconvolution of whole blood microarray data allows. CsSAM yielded 6 genes whose transcript abundance is modeled to be significantly lower in the circulating monocytes of AR vs. NR subjects. These genes were ranked very poorly in the whole blood analysis, ranked much higher in the adjusted whole blood analysis (but at an FDR ≤ 20%) and would probably not have been deemed interesting in either context. They are either broadly expressed or highly expressed in monocytic lineage cells and their biological function is plausible in the context of circulating monocytes in cardiac allograft rejection, though their specific involvement requires validation in future studies. In summary, statistical deconvolution to cell-specific expression can enrich whole blood microarray data in the context of allograft rejection and allow us to draw additional meaningful biological inference.
Materials and Methods
Study Design and Patient Selection
This work builds on previously published work by our group in which we discovered and validated biomarker panels of acute cardiac allograft rejection. 12 The study was approved by the Providence Health Care Research Ethics Board and further details regarding the study design and cohort characteristic can be found in Hollander et al. 2 Briefly, 26 subjects, enrolled within the Biomarkers in Transplantation (BiT) initiative were selected for this study based on the availability of CBC, including leukocyte differentials. Ten of the subjects had at least one treatable acute rejection (ISHLT grade ≥2R; AR) within the first six months post-transplant while 16 did not (NR). RNA was extracted from one sample taken at an ISHLT grade ≥2R episode from each AR subject and from one sample taken during an ISHLT grade = 0R rejection from all NR subjects that were time-matched to the CBC. CBCs are routinely ordered as part of standard patient monitoring and were available for most subjects. All biopsies were over-read in a blinded manner by an experienced transplant cardiac pathologist using the revised ISHLT grading scale. 39 Patient demographics were comparable between groups and presented in Table 1.
RNA Extraction and Microarray Analysis
Blood samples were collected in PAXgene tubes (PreAnalytiX, Hombrechtikon, Switzerland) and stored at –80°C until analysis. Whole blood RNA was extracted for selected samples. These samples were thawed, RNA isolated using PAXgene Blood RNA Kits (QIAGEN Inc, Germantown, MD, USA), and quality checked using an Agilent BioAnalyzer (Agilent Technologies Inc, Santa Clara, CA, USA). Affymetrix Human Genome U133 Plus 2.0 (Affymetrix, Inc, Santa Clara, CA, USA) microarrays were utilized at the Microarray Core Laboratory at Children's Hospital, Los Angeles. The microarrays were checked for quality using the “affy” (version 1.16.0) and “affyPLM” (version 1.14.0) BioConductor packages,40–42 as well as an internally developed method (Mahalanobis Distance Quality Control [MDQC], 43 available in the “mdqc” Bionconductor package).
Statistical Methods
Statistical analysis of the resulting data was conducted on the R platform (version 2.13.0), using the “samr” (version 2.0) and “affy” BioConductor (version 2.8) packages. 40 The deconvolution code was adapted from the paper by Shen-Orr et al. 7
Microarrays were first background-corrected, normalized and summarized to probe-set level data using the robust multichip average (RMA) method. 44
All 54613 probe-sets were first submitted to a supervised, 2-class, univariate analysis (Significance Analysis of Microarrays [SAM] 45 ) to identify potential differentially expressed probe-sets in whole blood. A similar analysis was performed concurrently on the csSAM (sample composition) normalized whole blood expression data. We interpreted differentially expressed probe-sets identified in the second analysis as being differentially regulated between the two groups of subjects. We also used the csSAM algorithm to deconvolve and compare cell-specific expression between the AR and NR subjects.
FDR cutoffs were used in order to assess the significance of findings. The values used varied based on the nature of the analysis and recommended best practices. For the traditional 2 class, univariate analysis (SAM), an FDR cutoff of ≤5% was used (typically, 1% or 5% are used). This corresponds to an expectation that, within the set of genes below the threshold, 5% are false positive results. When considering the csSAM normalized whole blood results and csSAM deconvolved cell-specific results, a threshold of ≤20% was used. In the original publication, 7 Shen-Orr et al. suggested using a value of ≤25%, arguing that this was acceptable in the context of exploratory analysis. We elected to use ≤20% based on Figure 2c and 3e, where 20% approximately corresponds to the inflection point of the “FDR” against “#called” plot and, thus, maximized the number of probe-sets discovered while limiting the total number of false positives.
Biological Validation
Functional enrichment of the deconvolved cell-specific signals obtained from the csSAM algorithm was examined using GSEA 18 ; performed using the desktop Java application. The test statistic generated by the csSAM algorithm was used to generate a cell-specific ranked list of the probe-sets, which was then submitted to GSEA for Ranked List Analysis. KEGG canonical pathways were used as gene set inputs and the Broad Institute recommended value of ≤25%. Finally, significant probe-sets were collapsed to well-annotated genes, based on annotation provided by Affymetrix, and then subjected to literature analysis through PubMed.
Author Contributions
Conceived and designed the experiments: CPS, ZH, JWM, RB, RTN, RM, BMM, PAK, SJT. Analysed the data: CPS. Wrote the first draft of the manuscript: CPS. Contributed to the writing of the manuscript: CPS, SJT. Agree with manuscript results and conclusions: CPS, ZH, JWM, RB, RTN, RM, BMM, PAK, SJT. Jointly developed the structure and arguments for the paper: CPS, SJT. Made critical revisions and approved final version: CPS, SJT. All authors reviewed and approved of the final manuscript.
Disclosures and Ethics
As a requirement of publication author(s) have provided to the publisher signed confirmation of compliance with legal and ethical obligations including but not limited to the following: authorship and contributorship, conflicts of interest, privacy and confidentiality and (where applicable) protection of human and animal research subjects. The authors have read and confirmed their agreement with the ICMJE authorship and conflict of interest criteria. The authors have also confirmed that this article is unique and not under consideration or published in any other publication, and that they have permission from rights holders to reproduce any copyrighted material. Any disclosures are made in this section. The external blind peer reviewers report no conflicts of interest.
Footnotes
Acknowledgments
The authors thank the patients without whose tissue donations none of this work would be possible. Grateful acknowledgment also for funding from Genome Canada, Novartis Pharma AG, IBM, Genome British Columbia and the NCE CECR PROOF Centre of Excellence. We would like to thank the NCE CECR PROOF Centre of Excellence Team.
Supplementary Tables
Summary of gene set enrichment analysis in monocytes.
| KEGG canonical pathway | Set size | ES | NES | p-val | FDR q-val | FWER p-val |
|---|---|---|---|---|---|---|
| KEGG VIBRIO CHOLERAE INFECTION | 50 | 0.68 | 2.18 | 0.00000 | 0.00000 | 0.00000 |
| KEGG ACUTE MYELOID LEUKEMIA | 56 | 0.66 | 2.18 | 0.00000 | 0.00000 | 0.00000 |
| KEGG EPITHELIAL CELL SIGNALING IN | 62 | 0.64 | 2.10 | 0.00000 | 0.00000 | 0.00000 |
| HELICOBACTER PYLORI INFECTION | ||||||
| KEGG FC GAMMA R MEDIATED PHAGOCYTOSIS | 87 | 0.59 | 2.00 | 0.00000 | 0.00017 | 0.00100 |
| KEGG ERBB SIGNALING PATHWAY | 85 | 0.58 | 1.94 | 0.00000 | 0.00019 | 0.00200 |
| KEGG PATHOGENIC ESCHERICHIA COLI INFECTION | 45 | 0.64 | 2.01 | 0.00000 | 0.00020 | 0.00100 |
| KEGG NON SMALL CELL LUNG CANCER | 52 | 0.61 | 1.96 | 0.00000 | 0.00020 | 0.00200 |
| KEGG THYROID CANCER | 29 | 0.67 | 1.96 | 0.00000 | 0.00023 | 0.00200 |
| KEGG ADHERENS JUNCTION | 73 | 0.61 | 2.02 | 0.00000 | 0.00025 | 0.00100 |
| KEGG RENAL CELL CARCINOMA | 69 | 0.58 | 1.93 | 0.00000 | 0.00025 | 0.00300 |
| KEGG LEUKOCYTE TRANSENDOTHELIAL MIGRATION | 108 | 0.58 | 1.98 | 0.00000 | 0.00026 | 0.00200 |
| KEGG ENDOMETRIAL CANCER | 51 | 0.63 | 1.98 | 0.00000 | 0.00029 | 0.00200 |
| KEGG REGULATION OF ACTIN CYTOSKELETON | 198 | 0.53 | 1.88 | 0.00000 | 0.00036 | 0.00600 |
| KEGG PANCREATIC CANCER | 69 | 0.58 | 1.91 | 0.00000 | 0.00036 | 0.00500 |
| KEGG INSULIN SIGNALING PATHWAY | 131 | 0.54 | 1.88 | 0.00000 | 0.00038 | 0.00600 |
| KEGG CHRONIC MYELOID LEUKEMIA | 72 | 0.58 | 1.93 | 0.00000 | 0.00039 | 0.00500 |
| KEGG NEUROTROPHIN SIGNALING PATHWAY | 122 | 0.55 | 1.89 | 0.00000 | 0.00041 | 0.00600 |
| KEGG VASOPRESSIN REGULATED WATER | 44 | 0.60 | 1.87 | 0.00106 | 0.00051 | 0.00900 |
| REABSORPTION | ||||||
| KEGG FOCAL ADHESION | 189 | 0.53 | 1.86 | 0.00000 | 0.00053 | 0.01100 |
| KEGG PROSTATE CANCER | 87 | 0.55 | 1.86 | 0.00000 | 0.00056 | 0.01100 |
| KEGG T CELL RECEPTOR SIGNALING PATHWAY | 107 | 0.54 | 1.85 | 0.00000 | 0.00057 | 0.01300 |
| KEGG B CELL RECEPTOR SIGNALING PATHWAY | 71 | 0.56 | 1.86 | 0.00000 | 0.00059 | 0.01100 |
| KEGG ARRHYTHMOGENIC RIGHT VENTRICULAR | 73 | 0.56 | 1.85 | 0.00000 | 0.00060 | 0.01300 |
| CARDIOMYOPATHY ARVC | ||||||
| KEGG COLORECTAL CANCER | 61 | 0.56 | 1.84 | 0.00000 | 0.00063 | 0.01500 |
| KEGG TOLL LIKE RECEPTOR SIGNALING PATHWAY | 98 | 0.54 | 1.82 | 0.00000 | 0.00077 | 0.01900 |
| KEGG LEISHMANIA INFECTION | 62 | 0.55 | 1.80 | 0.00000 | 0.00101 | 0.02600 |
| KEGG GLIOMA | 63 | 0.55 | 1.81 | 0.00000 | 0.00101 | 0.02500 |
| KEGG ENDOCYTOSIS | 158 | 0.50 | 1.76 | 0.00000 | 0.00178 | 0.05000 |
| KEGG APOPTOSIS | 82 | 0.53 | 1.77 | 0.00000 | 0.00181 | 0.04900 |
