Whole exome sequencing identifies compound heterozygous variants of CR2 gene in monozygotic twin patients with common variable immunodeficiency

Abstract

Objectives:

A pair of female Malay monozygotic twins who presented with recurrent upper respiratory tract infections, hepatosplenomegaly, bronchiectasis and bicytopenia were recruited in this study. Both patients were suspected with primary immunodeficiency diseases. However, the definite diagnosis was not clear due to complex disease phenotypes. The objective of this study was to identify the causative gene mutation in these patients.

Methods:

Lymphocyte subset enumeration test and whole exome sequencing were performed.

Results:

We identified a compound heterozygous CR2 mutation (c.1916G>A and c.2012G>A) in both patients. These variants were then confirmed using Sanger sequencing.

Conclusion:

Whole exome sequencing analysis of the monozygotic twins revealed compound heterozygous missense mutations in CR2.

Keywords

Respiratory medicine allergy/immunology primary immunodeficiencies paediatric whole exome sequencing

Introduction

Primary immunodeficiency diseases (PID) are rare genetic diseases that result in defective immune function.¹ Common variable immunodeficiency (CVID) is a type of PID with estimated incidence of 3.8 per 100,000 live births in Denmark, 3.1 in Iceland, 2.6 in Norway, 2.17 in Taiwan, 0.7 in France and 0.6 in Spain.^2,3 CVID comprises heterogeneous immune disorders such as antibody deficiency, increased susceptibility to autoimmunity and lymphoproliferation.^4,5 The diagnosis and management of the disease are difficult due to the polygenic nature of CVID and similar clinical symptoms shared with the other PID.^6,7 Moreover, CVID cases have been reported to inherit in both autosomal dominant and autosomal recessive patterns.⁸ Owing to the wide clinical spectrum and genetic heterogeneity, genotype–phenotype relationship of CVID remains unclear.⁹

Advancement of next-generation sequencing (NGS) provides a better insight into the genetic diagnosis of complex diseases.^10,11 Whole exome sequencing (WES), an NGS method that targets the protein-coding regions in the genome, had been widely used in the identification of genetic mutation in rare disorders.^12–14 The use of WES has enabled the identification of a single hemizygous, missense XIAP variant in X-linked lymphoproliferative disease 2 which may not be easily diagnosed using traditional genetic techniques due to unusual clinical manifestation.^15,16

A pair of monozygotic (MZ) twins (P1 and P2) of non-consanguineous Malay parents (Figure 1) were recruited in this study. In the first few years of life, they had several episodes of upper respiratory tract infections, which were resolved without requiring hospital admission. The patients were first hospitalised for respiratory tract infection at the age of 7, when they presented with tachypnoea and were noted to have digital clubbing, hepatosplenomegaly, failure to thrive, bronchiectasis and bicytopenia. However, bone marrow aspiration did not suggest any malignancy infiltration. The parents and three other siblings appeared healthy (Figure 1). These patients were suspected with PID. Therefore, WES was performed in both patients to determine the causative gene responsible for the disease phenotype.

Figure 1.

Family pedigree of the patients. Roman numerals i and ii indicate generations, squares indicate males, circles indicate females, filled circles indicate the affected MZ twins and crossed-out circles indicate the deceased. The unaffected family members were not studied. The values under each symbol represent the age in years.

Materials and methods

Subjects and ethics statement

The study protocol was approved by the Medical Research and Ethics Committee (MREC), Ministry of Health Malaysia (KKM/NIHSEC/P16-837). MZ twins suspected with PID were recruited in this study.

Lymphocyte subset enumeration

Whole blood (10 mL) from each patient was obtained in ethylenediaminetetraacetic acid (EDTA) blood tube for lymphocyte subset enumeration and DNA extraction. The patient’s whole blood (50 µL) was stained with two cocktails containing monoclonal antibodies in a BD Trucount tube, respectively: BD Multitest™ CD3 FITC (clone SK7)/CD8 PE (clone SK1)/CD45 PerCP (clone 2D1(HLe-1))/CD4 APC (clone SK3) and BD Multitest CD3 FITC (clone SK7)/CD16+ CD56 PE (clone B73.1 and clone NCA16.2)/CD45 PerCP(clone2D1(HLe-1))/CD19 APC (clone SJ25C1). The tubes were incubated for 15 min in dark. Then, the mixture was lysed with BD FACS™ Lysing solution and incubated for 10 minutes. The percentage and absolute count of the lymphocyte subsets were analysed using BD FACSDiva™ software on a BD FACS Canto™ II flow cytometer (Becton Dickinson, USA).

DNA extraction

Peripheral blood mononuclear cells (PBMCs) were isolated from whole blood using the standard Ficoll-Paque centrifugation procedure. Genomic DNA was extracted using QIAamp^® DNA Mini Kit (Qiagen, Germany) following the manufacturer’s instructions. The quality of DNA samples was tested using 1% agarose gel electrophoresis. Qubit 2.0 Fluorometer (Life Technologies, USA) was used for DNA quantification.

Exome capture and sequencing

Both patients’ genomic DNA was subjected to WES, with average sequencing depth of 100× coverage, using Agilent SureSelect Human All Exon V5 (Agilent, USA) that targets 50-Mb exonic sequence.

Quality control check and processing of sequencing reads

Quality of raw sequence data from all FASTQ files was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Trimmomatic was used to trim raw reads with poor base quality and sequence adapters.¹⁷ Remaining clean reads were aligned to the human reference genome GRCh38 (https://www.ncbi.nlm.nih.gov/assembly/GCA_000001405.27) using Burrows-Wheeler Aligner (BWA).¹⁸ SAMtools was used for sorting reads and collecting statistical metrics.¹⁹ Next, Picard tools (http://picard.sourceforge.net/) were utilised to mark the duplicate reads and collect sequencing artefact metrics.

Variant calling and annotation

Variant calling for single nucleotide polymorphisms (SNPs) and insertions/deletions (Indels) was performed using Genome Analysis Toolkit (GATK).²⁰ These variants were subsequently annotated using web-based ANNOVAR (wANNOVAR).²¹

Genotype concordance and discordance analysis

To determine the concordant and discordant variants in both twins, all exonic SNPs from their variant call format (VCF) files were compared. The concordant and discordant variants were extracted and annotated using BCFtools (http://samtools.github.io/bcftools) and wANNOVAR, respectively. The variants were considered concordant if the genotypes are identical in both VCF files. Whereas variants with genotypes that are unique to only one twin were categorised as discordant variants.

Variant filtering and prioritisation

Annotated variants were filtered and prioritised based on three criteria. First criterion was variant classification such as synonymous, non-synonymous, frameshift and non-frameshift mutations. Second criterion was allele frequency reported by Genome Aggregation Database (gnomAD). Third criterion was in silico protein functional prediction for variants by Sorting Intolerant From Tolerant (SIFT),²² Polymorphism Phenotyping v2 (PolyPhen-2)²³ and MutationTaster.²⁴

Polymerase chain reaction and Sanger sequencing

The CR2 variant identified by WES was confirmed by Sanger sequencing. The results were viewed using CLC Genomics Workbench v7.5.1 software.

Results

Immunological analysis

The initial diagnostic assessment revealed reduced T, B and natural killer (NK) cell counts in both patients (Figure 2 and Table 1). We found that the total T-cell values were low, 671 × 10⁶/L and 545 × 10⁶/L for P1 and P2, respectively. In addition, the total B-cell values were also low, 26 × 10⁶/L and 16 × 10⁶/L for both patients.

Figure 2.

Flow cytometry data on lymphocyte subset enumeration. Dot plots show the distribution of lymphocyte sub-populations in the MZ twins and a normal control in (a) CD3+ T cells versus SSC; (b) CD3+ CD8+ T cells versus CD3+ CD4+ T cells; and (c) CD16+ CD56+ NK cells versus CD19+ B cells. Each dot represents an individual cell analysed by the flow cytometer.

Table 1.

Clinical and laboratory findings of the patients.

Features	P1	P2
Age at presentation	7 years	7 years
Sex	Female	Female
Geographic location	Malaysia	Malaysia
Race	Malay	Malay
Infection	URTI	URTI
Lymphoproliferation	Hepatosplenomegaly	Hepatosplenomegaly
Lung disease	Bronchiectasis	Bronchiectasis
Bone marrow aspiration	Not suggestive for any malignancy infiltration
Others	Anaemia, digital clubbing, persistent tachypnoea, failure to thrive
Treatment	IVIG	IVIG
Status	Passed away at 10 years	Passed away at 11 years
Immunoglobulins	Low IgG and IgA	Low IgG and IgA
IgG (g/L)	4.22 (N = 5.2–15.6)	4.16 (N = 5.2–15.6)
IgA (g/L)	<0.02 (N = 0.54–3.6)	<0.02 (N = 0.54–3.6)
IgM (g/L)	1.27 (N = 0.13–2.4)	9.97 (N = 0.13–2.4)
Lymphocyte populations	Lymphopenia	Lymphopenia
T cells (×10⁶/L)	671 (N = 1400–2000)	545 (N = 1400–2000)
B cells (×10⁶/L)	26 (N = 300–500)	16 (N = 300–500)
NK cells (×10⁶/L)	79 (N = 200–600)	95 (N = 200–600)

IVIG: intravenous immunoglobulin therapy; URTI: Upper respiratory tract infection; N: age-matched reference range; NK: natural killer.

WES analysis

WES generated 78,835,626 and 77,270,546 paired-end reads in P1 and P2, respectively. The reads were mapped to the human reference genome GRCh38. Both patients had achieved high percentage of mapping, which were 99.26% for P1 and 99.39% for P2. We investigated variations such as SNPs and Indels in both patients as these are the two largest types of human genetic variations.²⁵

Evaluation of genotype concordance and discordance

We identified approximately 22,000 exonic SNPs in both twins, 93.4% and 90.5% of which were concordant between the twins, respectively (Figure 3(a)). Besides, we also investigated the number of private and shared SNPs between each twin and an unrelated control. Conversely, when compared with an unrelated control, only 59%–64% of exonic SNPs were found concordant (Figure 3(b) and (c)).

Figure 3.

Genotype concordance and discordance of two samples. Venn diagrams depict the number of shared and private SNPs identified between (a) P1 and P2, (b) P1 and unrelated control and (c) P2 and unrelated control.

PID gene screening

We screened the WES datasets for 240 PID-related genes as reported by the International Union of Immunological Societies for Primary Immunodeficiency to identify the genetic variations in PID-related genes.²⁶ We interpreted the candidate variants according to the standards and guidelines as recommended by American College of Medical Genetics and Genomics and the Association for Molecular Pathology.²⁷ Since the patients’ clinical symptoms resembled CVID, we narrowed the analysis to the 31 genes known to be associated with CVID. We identified 17 missense SNPs (Table 2) and 15 silent SNPs in CVID-associated genes from the WES datasets of both patients. The deleterious effect of missense SNPs on resulting proteins was evaluated using SIFT, PolyPhen-2 and MutationTaster. Homozygous SNPs were identified in CD19, CD27, CD86, ADAM28 and STXBP2. However, the damaging variant prediction tools predicted them as non-damaging variants (Table 2). Three missense SNPs (two SNPs in CR2 and one SNP in FANCA) were predicted to be damaging variants by at least one of the variant prediction softwares (Table 2). A heterozygous SNP in FANCA (rs1800282) identified in both patients was not considered as disease-associated variant because FANCA has been associated with autosomal recessive inheritance.²⁸ Two SNPs in CR2, c.1916G>A (rs17615) and c.2012G>A (rs17616), were identified in both patients (Table 2). Subsequent Sanger sequencing revealed that the patients’ father carried c.2012G>A (rs17616) SNP, while the mother carried c.1916G>A (rs17615) SNP (Figure 4). This indicated that both patients inherited the alleles from each of their parents leading to compound heterozygosity status.

Table 2.

Summary of missense variants found in the CVID-related genes.

Gene (dbSNP)	RefSeq	Chr	Exon	Nucleotide change	AA change (change in polarity)	Zygosity	SIFT	PolyPhen-2	MutationTaster	MAF in gnomAD
CD19 (rs2904880)	NM_001178098	16	3	c.520C>G	p. Leu174Val (both are NP)	Hom	T	B	P	0.7183
CD27 (rs2532502)	NM_001242	12	6	c.698A>G	p. His233Arg (both are Pr)	Hom	T	B	P	0.9928
CD86 (rs2681417)	NM_001206924	3	3	c.217G>A	p. Val73Ile (both are NP)	Hom	T	B	P	0.9211
CR2 (rs17615)	NM_001006658	1	10	c.1916G>A	p. Ser639Asn (both are Pr)	Het	T	D	P	0.2609
CR2 (rs17616)	NM_001006658	1	11	c.2012G>A	p. Arg671His (both are Pr)	Het	T	D	P	0.2597
LRBA (rs1782360)	NM_001199282	4	23	c.3269C>G	p. Ala1090Gly (both are NP)	Het	T	B	P	0.1245
ADAM28 (rs7814768)	NM_014265	8	22	c.2293G>A	p. Val765Met (both are NP)	Hom	T	B	P	0.9875
SDK1 (rs671694)	NM_001079653	7	7	c.383A>G	p. His128Arg (both are Pr)	Het	T	B	P	0.7519
SDK1 (rs138116831)	NM_152744	7	15	c.2161G>A	p. Val721Ile (both are NP)	Het	T	B	P	0.0008
CTLA4 (rs231775)	NM_001037631	2	1	c.49A>G	p. Thr17Ala (Pr>NP)	Het	T	B	P	0.415
STXBP2 (rs6791)	NM_001127396	19	18	c.1567A>G	p. Ile523Val (both are NP)	Hom	T	B	P	0.6342
FANCA (rs1800282)	NM_000135	16	1	c.17T>A	p. Val6Asp (NP>Pr)	Het	D	B	N	0.0769
FANCA (rs11646374)	NM_000135	16	14	c.1235C>T	p. Ala412Val (both are NP)	Het	T	B	P	0.0657
FANCA (rs2239359)	NM_000135	16	16	c.1501G>A	p. Gly501Ser (NP>Pr)	Het	T	B	P	0.5015
FANCA (rs17232910)	NM_000135	16	22	c.1927C>G	p. Pro643Ala (both are NP)	Het	T	B	P	0.0669
FANCA (rs7195066)	NM_000135	16	26	c.2426G>A	p. Gly809Asp (NP>Pr)	Het	T	B	P	0.471
FANCA (rs9282681)	NM_000135	16	40	c.3982A>G	p. Thr1328Ala (Pr>NP)	Het	T	B	P	0.0653

Chr: chromosome; AA: amino acid; NP: non-polar; Pr: polar; Hom: homozygous; Het: heterozygous; T: tolerated; D: damaging; P: polymorphism; N: neutral; MAF: minor allele frequency; gnomAD: Genome Aggregation Database; B: benign.

Figure 4.

Confirmation of compound heterozygous CR2 variants identified from WES using Sanger sequencing. DNA sequence chromatograms illustrate the variants in CR2 at (a) c.1916G>A and (b) c.2012G>A for normal control, P1, P2, mother and father of the MZ twins. The patients’ mother has a heterozygous c.1916G>A variant whereas the father has a heterozygous c.2012G>A variant, leading to the compound heterozygosity in patients.

Discussion

Here, we performed WES analysis on a pair of MZ twins suspected with PID to identify the causative genetic variation responsible for their condition. WES results demonstrated that CR2 gene could be the possible disease-associated gene in both patients. CR2 deficiency is known to be a causative factor for CVID phenotype.²⁹ CVID is a primary immunodeficiency characterised by a decrease in serum IgG level, a decrease in either IgA or IgM, and a poor response to vaccines in children of at least 2 years of age, after excluding other causes of hypogammaglobulinemia.³⁰

Complement receptor 2 (CR2), an essential receptor for Epstein–Barr virus, that expressed primarily on B cells and follicular dendritic cells was also reported on T cells.³¹ This 145-kDa membrane protein is encoded by CR2 gene with 19 exons at chromosomal location of 1q32.³² CR2 protein functions as a coordinator for humoral response and complement system.³³ The CR2 protein is composed of 15 or 16 extracellular short consensus repeats (SCRs), one transmembrane domain and an intracytoplasmic region.³⁴ Various types of mutations (nonsense, frameshift, splice site and missense) have been reported across all domains of CR2 protein.³² The mutation sites (c.1916G>A and c.2012G>A) found in our patients occurred in the exon 10 and exon 11 of CR2 gene. These mutations were predicted to cause amino acid changes in the protein-coding sequence (p. Ser639Asn and p. Arg671His), respectively, in which the resulting amino acid residues are located in the conserved sushi domain or SCRs of CR2 protein. However, protein expression analysis was unable to be conducted because both patients deceased in the middle of the study. Similarly, functional assays (in vitro or in vivo) which are crucial to evaluate the pathogenicity of causative gene especially in compound heterozygous case were unable to be performed.

CR2 deficiency has been reported as autosomal recessive disorder.²⁶ Besides, a compound heterozygous mutation in CR2 gene was recently described to be associated with CR2 deficiency. Hypogammaglobulinemia and reduced memory B cells were noted in the patient with CR2 deficiency.³² In our study, Sanger sequencing showed that each of our patients inherited a mutated allele from each parent, suggesting compound heterozygous status in both patients. Therefore, these variants (c.1916G>A and c.2012G>A) are most likely to be associated with the disease phenotypes seen in our patients. In addition, copy number variation (CNV) analysis in patients with complex phenotypes could be used to further investigate the concordant and discordant CNVs between MZ twins.

Conclusion

In summary, this report describes an analysis based on WES dataset in a pair of Malay-descendant MZ twins. Based on evidence from Sanger sequencing and bioinformatics analysis, the compound heterozygous missense mutations (exon 10: c.1916G>A and exon 11: c.2012G>A) in CR2 gene resulted in a change to unrelated amino acids (p. Ser639Asn and p. Arg671His) that may cause CR2 deficiency in both patients.

Footnotes

Acknowledgements

The authors would like to thank the Director General of Health Malaysia for his permission to publish this article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval

Ethical approval for this study was obtained from the Medical Research and Ethics Committee (MREC), Malaysia (Ethic approval no: KMM/NIHSEC/P16-837).

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research work was financially supported by the Ministry of Health, Malaysia research grant (NMRR-16-892-31023).

Informed consent

Written informed consent was obtained from all subjects before the study

ORCID iD

Hamidah Ghani

References

Sheikhbahaei

Sherkat

Roos

, et al. Gene mutations responsible for primary immunodeficiency disorders: a report from the first primary immunodeficiency biobank in Iran. Allergy Asthma Clin Immunol 2016; 12: 62.

Tseng

Lai

Chen

, et al. The incidence and prevalence of common variable immunodeficiency disease in Taiwan: a population-based study. PLoS ONE 2015; 10(10): e0140473.

Selenius

Martelius

Pikkarainen

, et al. Unexpectedly high prevalence of common variable immunodeficiency in Finland. Front Immunol 2017; 8: 1190.

Chapel

Lucas

Lee

, et al. Common variable immunodeficiency disorders: division into distinct clinical phenotypes. Blood 2008; 112(2): 277–286.

Cunningham-Rundles

The many faces of common variable immunodeficiency. Hematol Am Soc Hematol Educ 2012; 2012: 301–305.

Grimbacher

Hutloff

Schlesier

, et al. Homozygous loss of ICOS is associated with adult-onset common variable immunodeficiency. Nat Immunol 2003; 4(3): 261–268.

Gulez

Aksu

Berdeli

, et al. X-linked lymphoproliferative syndrome and common variable immunodeficiency may not be differentiated by SH2D1A and XIAP/BIRC4 genes sequence analysis. Case Rep Med 2011; 2011: 121258.

Kopecky

Lukesova

Genetic defects in common variable immunodeficiency. Int J Immunogenet 2007; 34(4): 225–229.

Salzer

Warnatz

Peter

HH.

Common variable immunodeficiency: an update. Arthritis Res Ther 2012; 14(5): 223.

10.

Nigro

Piluso

Next generation sequencing (NGS) strategies for the genetic testing of myopathies. Acta Myol 2012; 31(3): 196–200.

11.

Mason

CC.

Four study design principles for genetic investigations using next generation sequencing. BMJ 2017; 359: j4069.

12.

Buckingham

Lee

, et al. Exome sequencing identifies the cause of a Mendelian disorder. Nat Genet 2010; 42(1): 30–35.

13.

Mirabello

Macari

Jessop

, et al. Whole-exome sequencing and functional studies identify RPS29 as a novel gene mutated in multicase Diamond-Blackfan anemia families. Blood 2014; 124(1): 24–32.

14.

Kelsen

Dawany

Moran

, et al. Exome sequencing analysis reveals variants in primary immunodeficiency genes in patients with very early onset inflammatory bowel disease. Gastroenterology 2015; 149(6): 1415–1424.

15.

Worthey

Mayer

Syverson

, et al. Making a definitive diagnosis: successful clinical application of whole exome sequencing in a child with intractable inflammatory bowel disease. Genet Med 2011; 13(3): 255–262.

16.

Warr

Robert

Hume

, et al. Exome sequencing: current and future perspectives. G3 (Bethesda) 2015; 5(8): 1543–1550.

17.

Bolger

Lohse

Usadel

Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 2014; 30(15): 2114–2120.

18.

Durbin

Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009; 25(14): 1754–1760.

19.

Handsaker

Wysoker

, et al. The sequence alignment/map format and SAMtools. Bioinformatics 2009; 25(16): 2078–2079.

20.

McKenna

Hanna

Banks

, et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010; 20(9): 1297–1303.

21.

Chang

Wang

wANNOVAR: annotating genetic variants for personal genomes via the web. J Med Genet 2012; 49(7): 433–436.

22.

Kumar

Henikoff

PC.

Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 2009; 4(7): 1073–1081.

23.

Adzhubei

Jordan

Sunyaev

SR.

Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet 2013; 7(7): 20.

24.

Schwarz

Rodelsperger

Schuelke

, et al. MutationTaster evaluates disease-causing potential of sequence alterations. Nat Methods 2010; 7(8): 575–576.

25.

Haraksingh

Snyder

MP.

Impacts of variation in the human genome on gene regulation. J Mol Biol 2013; 425(21): 3970–3977.

26.

Picard

Al-Herz

Bousfiha

, et al. Primary immunodeficiency diseases: an update on the classification from the international union of immunological societies expert committee for primary immunodeficiency 2015. J Clin Immunol 2015; 35(8): 696–726.

27.

Richards

Aziz

Bale

, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 2015; 17(5): 405–424.

28.

Rodriguez

DEA

Lima

CSP

Lourenço

, et al. Molecular analysis of the most prevalent mutations of the FANCA and FANCC genes in Brazilian patients with Fanconi anaemia. Genetic Mol Biol 2005; 28: 205–209.

29.

Ameratunga

Woon

Gillis

, et al. New diagnostic criteria for common variable immune deficiency (CVID), which may assist with decisions to treat with intravenous or subcutaneous immunoglobulin. Clin Exp Immunol 2013; 174(2): 203–211.

30.

Alkan

Keles

Reisli

Evaluation of clinical and immunological characteristics of children with common variable immunodeficiency. Int J Pediatr 2018; 2018: 3527480.

31.

Braun

Melchers

Peter

, et al. Human B and T lymphocytes have similar amounts of CD21 mRNA, but differ in surface expression of the CD21 glycoprotein. Int Immunol 1998; 10(8): 1197–1202.

32.

Wentink

Lambeck

van Zelm

, et al. CD21 and CD19 deficiency: two defects in the same complex leading to different disease modalities. Clin Immunol 2015; 161(2): 120–127.

33.

Frank

MM.

CD21 deficiency, complement, and the development of common variable immunodeficiency. J Allergy Clin Immunol 2012; 129(3): 811–813.

34.

Jabs

Paulsen

Wagner

, et al. Analysis of Epstein-Barr virus (EBV) receptor CD21 on peripheral B lymphocytes of long-term EBV- adults. Clin Exp Immunol 1999; 116(3): 468–473.