Toward a Molecular Approach to Chronotype Assessment

Abstract

The aim of the present study was to develop a Polygenic Score–based model for molecular chronotype assessment. Questionnaire-based phenotypical chronotype assessment was used as a reference. In total, 52 extremely morning/morning (MM/M; 35 females, 39.7 ± 3.8 years) and 44 extremely evening/evening (EE/E; 20 females, 27.3 ± 7.7 years) individuals donated a buccal DNA sample for genotyping by sequencing of the entire genetic variability of 19 target genes known to be involved in circadian rhythmicity and/or sleep duration. Targeted genotyping was performed using the single primer enrichment technology and a specifically designed panel of 5526 primers. Among 2868 high-quality polymorphisms, a cross-validation approach led to the identification of 83 chronotype predictive variants, including previously known and also novel chronotype-associated polymorphisms. A large (35 single-nucleotide polymorphisms [SNPs]) and also a small (13 SNPs) panel were obtained, both with an estimated predictive validity of approximately 80%. Potential mechanistic hypotheses for the role of some of the newly identified variants in modulating chronotype are formulated. Once validated in independent populations encompassing the whole range of chronotypes, the identified panels might become useful within the setting of both circadian public health initiatives and precision medicine.

Keywords

chronotype, single-nucleotide polymorphisms, polygenic score, single primer enrichment technology, morningness, eveningness

Chronotype is the most obvious behavioral expression of human endogenous circadian rhythmicity. It is the complex outcome of a number of molecular and physiological processes under circadian clock control. Nevertheless, from a behavioral standpoint, it can be easily described as the natural inclination to place one’s activity/sleep in different intervals of the 24-h day. Chronotype shows considerable variability in the population, ranging from early types, who are active in the morning hours and prefer to go to bed early, to late types, who find it more difficult to get up in the morning and easier to be active in the evening/early night hours. A significant proportion of the population has a less pronounced inclination (Horne and Ostberg, 1976; Roenneberg et al., 2003, 2007). Chronotype is often assessed on a phenotypic basis by use of self-administered questionnaires (Horne and Ostberg, 1976; Roenneberg et al., 2003; Ghotbi et al., 2020). The Horne-Östberg questionnaire, which explores a subject’s preference on when to perform specific activities over the course of the 24-h day (vide infra), is a good marker of chronotype when defined as diurnal preference (Vetter, 2018). The variable midsleep (i.e., the midpoint, expressed as clock time, between sleep onset and sleep offset) derived from the Munich Chronotype Questionnaire (Ghotbi et al., 2020; Roenneberg et al., 2003) is a good marker of chronotype defined as a proxy for the phase angle of entrainment (Vetter, 2018).

Twin and family studies suggest that approximately 50% of morning/evening preference can be explained by genetic factors (Klei et al., 2005; Koskenvuo et al., 2007; Barclay et al., 2010; Hsu et al., 2015). Nevertheless, the genetic bases of chronotype and its variability remain largely unknown. Only a few genetic variations associated with chronotype have been investigated in depth, mainly because of their role in extreme sleep phenotypes such as advanced sleep-wake phase and delayed sleep-wake phase (Curtis et al., 2019; Ashbrook et al., 2020). For example, a mutation in the CASEIN KINASE 1E (CSNK1E) binding site of the PER2 gene (rs121908635; S662G) increases the stability of the protein resulting in an approximately 4 h phase advance (Toh et al., 2001) in individuals with familial advanced sleep-wake phase disorder. Interestingly, a similar phenotype results from a mutation (rs104894561; T44A) affecting the kinase activity of CASEIN KINASE 1D (CSNK1D; Xu et al., 2005). By contrast, a mutation in a splicing site in the CRY1 gene (rs184039278) results in the skipping of exon 11, which leads to a more persistent inhibitory activity of CRY1Δ within the negative feedback loop at the core of the circadian clock, and therefore in a phase delay (Patke et al., 2017). However, less dramatic, non-pathological phenotypes have been associated with other genetic variations, the contribution of which is minor, less consistent, and often questioned. For example, a variable number tandem repeat (VNTR) polymorphism in exon 18 of the PER3 gene (rs57875989), with 2 alleles encoding four or five 18 aa repeats, has also been associated with chronotype. The longer allele (PER3⁵) has been associated with morningness, and the shorter (PER3⁴) with eveningness (Archer et al., 2003). Since each repeat contains several predicted CSNK1D phosphorylation sites, it has been proposed that the number of repeats could affect the phosphorylation state of the protein, with potential effects on its stability (Archer et al., 2003). Interestingly, this VNTR polymorphism has also been implicated in light sensitivity, with the 2 variants differently affecting melatonin suppression (Chellappa et al., 2012). More recently, we have simultaneously assessed the PER3 VNTR polymorphism and a missense single-nucleotide polymorphism (SNP) (rs228697) in the coding region of the same gene (Turco et al., 2017). This single-nucleotide variation results in an amino acid substitution (P864A) altering the secondary structure of a conserved domain relevant to the binding with NCK, an adaptor protein involved in the interaction between PER3 and CSNK1 (Akashi et al., 2002; Lussier and Larose, 1997; Turco et al., 2017). The PER3^G variant showed a significant association with morningness, both as a stand-alone and in combination with the PER3⁴ allele (Turco et al., 2017). All these studies suggest that chronotype is a complex trait (which is also supported by its being normally distributed; Ashbrook et al., 2020), the definition of which is likely to benefit from a genome-wide approach.

The first 2 genome-wide association studies (GWAS; Gottlieb et al., 2007; Hu et al., 2016) have shown that most chronotype-associated loci are in close proximity to genes involved in circadian and sleep regulation. A recent meta-analysis of nearly 700,000 individuals (from the 23andMe Inc. data set, https://research.23andme.com/; and the UK Biobank, https://www.ukbiobank.ac.uk/) has identified 351 genomic loci associated with chronotype (Jones et al., 2016, 2019) including SNPs on some of the main circadian clock genes (PER1-3, CRY1, FBXL3, and ARNTL).

In all these GWAS, chronotype was self-reported and genotyping was performed by custom high-throughput platforms, which cannot detect rare SNPs, length polymorphisms, and short insertions or deletions (Höglund et al., 2019). Nevertheless, these could represent a valuable source of information to identify additional chronotype contributors, as was the case for the PER3 VNTR (Archer et al., 2003; Turco et al., 2017).

Our aim was to develop a Polygenic Score–based model for chronotype assessment based on genotyping by sequencing of the entire genetic variability of 19 target genes known to be involved in circadian rhythmicity and sleep duration, only in extremely morning/morning (MM/M) and extremely evening/evening (EE/E) individuals. This strategy has already been proven to increase the probability of finding novel genetic variants affecting complex continuous phenotypes by sequencing relatively small samples (Amanat et al., 2020).

Materials and Methods

A schematic overview of the experimental workflow is shown in Supplementary Figure S1.

Study Population

Between 2012 and 2019, a cohort of 679 healthy individuals was recruited during University of Padova popular science events (Turco et al., 2017) and educational workshops. Participants were asked to donate a buccal DNA sample and to complete a comprehensive sleep-wake assessment (Turco et al., 2017).

Chronotype Assessment

Chronotype was assessed by the self-administered Morningness-Eveningness Questionnaire (Horne and Ostberg, 1976; Tonetti and Natale, 2019), based on a set of questions asking the subject to indicate when, in specific periods of the 24-h day, they would prefer to place specific activities, if they were completely free to plan their day. Total scores range from 16 to 86: scores ≤ 41 define E types, with EE ≤ 30; scores ≥ 59 define M types, with MM ≥ 70; and scores between 42 and 58 define intermediate types. In our study population, 153 participants showed MM/M (22.5%) and 152 EE/E (22%). Analysis of variance (ANOVA; post hoc Tukey test) was used to compare age among groups.
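The score cut-offs described above can be written as a short classification routine. This is a minimal illustrative sketch in Python, not part of the study's analysis pipeline; the function name `classify_meq` and the class labels it returns are assumptions introduced here.

```python
def classify_meq(score: int) -> str:
    """Map a total MEQ score (16-86) to a chronotype class.

    Cut-offs follow the text: EE <= 30, E <= 41, intermediate 42-58,
    M >= 59, MM >= 70. The labels are illustrative.
    """
    if not 16 <= score <= 86:
        raise ValueError("MEQ total score must lie between 16 and 86")
    if score <= 30:
        return "EE"            # extremely evening
    if score <= 41:
        return "E"             # evening
    if score <= 58:
        return "intermediate"
    if score <= 69:
        return "M"             # morning
    return "MM"                # extremely morning
```

For example, `classify_meq(62)` returns "M" and `classify_meq(26)` returns "EE"; in the present design only the MM/M and EE/E classes were retained for genotyping.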

Sample Collection, Processing, and Selection

Buccal DNA samples were self-collected by brushing the inside of the cheeks with a swab for about 15 sec. Samples were stored at room temperature, and DNA extracted within 12 h by the BuccalAmp QuickExtract kit (Epicentre, illumina.com) according to the manufacturer’s instructions. DNA was further purified by phenol/chloroform extraction (one volume of phenol:chloroform:isoamyl alcohol [25:24:1] and two volumes of EtOH 100%), and resuspended in water. DNA amount and integrity were assessed by NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific; thermofisher.com) and agarose gel electrophoresis, respectively. Then, selected DNA samples were further quantified with the QuantiFluor dsDNA System (Promega; promega.com) to avoid sample overestimation due to RNA contamination. Among the 305 participants who provided a DNA sample and showed MM/M or EE/E, 96 DNA samples were selected for genotyping on the basis of DNA integrity and concentration. These included 10 MM, 42 M, 30 E, and 14 EE types (Table 1). Expected differences in age and sex were observed, with MM/M types being older (significant difference) and more frequently women (non-significant difference) than EE/E types (Table 1; Adan and Natale, 2002; Robilliard et al., 2002; Turco et al., 2017).

Table 1.

Characteristics of the study population (only the DNA samples selected for genotyping).

	All	Extremely Morning	Morning	Evening	Extremely Evening
	n = 96	n = 10 (10%)	n = 42 (44%)	n = 30 (31%)	n = 14 (15%)
MEQ score (mean ± SD)	50.5 ± 16.8	74.2 ± 4.8	62.7 ± 2.8	37.0 ± 3.0	26.0 ± 3.9
Age (mean ± SD)	33.0 ± 15.3	44.2 ± 20.5	35.3 ± 16.2	29.6 ± 13.1°	25.1 ± 2.4*
Female (%)	55 (57%)	7 (70%)	28 (67%)	13 (43%)	7 (50%)

*p < 0.001, significance of the difference between extremely evening and extremely morning individuals.

°p < 0.05, significance of the difference between evening and extremely morning individuals.

Abbreviations: MEQ = Morningness-Eveningness Questionnaire.

Target Selection

The entire genetic variability of 19 target genes, involved in circadian rhythmicity and sleep duration, was assessed. Among the main core components of the endogenous clock, 14 circadian genes, which have previously been associated with chronotype in candidate gene studies (Mishima et al., 2005; Carpen et al., 2006; Matsuo et al., 2007; He et al., 2009; Lee et al., 2011; Etain et al., 2014; Hida et al., 2014; Parsons et al., 2014; Dmitrzak-Węglarz et al., 2016; Hirano et al., 2016; Jankowski and Dmitrzak-Weglarz, 2017; Patke et al., 2017; Turco et al., 2017; Kurien et al., 2019) and genome-wide analyses (Gottlieb et al., 2007; Hu et al., 2016; Jones et al., 2016, 2019; Lane et al., 2016; Maukonen et al., 2020), were selected: PER1, PER2, PER3, CRY1, CRY2, ARNTL, CLOCK, TIMELESS, NPAS2, NCK, RORA, NR1D1, NR1D2, and CSNK1E. A list of the most relevant SNPs on clock genes which have previously been associated with chronotype is provided in Supplementary Table S1. In addition, 5 genes, known to play a significant role in sleep duration (Zhang and Fu, 2020), were also included: ABCC9, BHLHE41, DRD2, ADA, and FABP7.

S.P.E.T. Sequencing Strategy and Primer Design

Targeted genotyping by sequencing was performed using the Single Primer Enrichment Technology (S.P.E.T.; Barchi et al., 2019) based on the Illumina sequencing platform (illumina.com). A single primer is designed for the genotype calling of one selected SNP by sequencing a small region of about 150 bp around the variant site. Thus, a panel of thousands of primers allows genotype calling of thousands of SNPs at the same time. Furthermore, the technology allows the targeted sequencing of a gene by simply increasing the number of primers up to complete coverage. A specific panel of primers was designed by Tecan Genomics (lifesciences.tecan.com) on the sequence of the 19 target genes including exons, regulatory sequences up to 1000 bp upstream of the transcription start site, and intronic regions extending 200 bp beyond the splice site, to ensure coverage of exon peripheral regions and to include the canonical splice site consensus sequences such as branchpoints and B-boxes (Mercer et al., 2015). Primers were also designed to specifically assess 18 SNPs previously associated with chronotype but located far from the selected target regions (a complete list is available in Supplementary Table S2). The resulting 5526 primers (a complete list is available in Supplementary Table S3) were aligned to the reference genome (Homo sapiens, hg38, UCSC iGenomes) to verify the complete coverage of all the genomic regions of interest using the custom track function on the UCSC Genome Browser (genome.ucsc.edu).

Genotyping and Quality Control

Targeted genotyping by sequencing and quality controls (QCs) were performed by IGA Technology Services (igatechnology.com; Udine, Italy). Libraries were prepared using 20 ng/µL of DNA as input according to the Allegro Targeted Genotyping protocol (NuGEN Technologies; nugen.com). Libraries were quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific; thermofisher.com) and their size distribution checked by the Bioanalyzer High Sensitivity DNA assay (Agilent technologies; agilent.com). Diluted libraries were quantified through quantitative PCR (qPCR) using the CFX96 Touch Real-Time PCR Detection System (Bio-Rad; bio-rad.com) and run on the Illumina NovaSeq 6000 (Illumina; illumina.com) with 2 × 150 paired-end reads. Base calling and demultiplexing were performed using bcl2fastq v2.20 (Illumina; illumina.com). Raw read quality assessment and adapter trimming were performed using Cutadapt (Martin, 2011) with default parameters. In further detail, not only the adapters but also the first 40 bp of each paired-end read (R1) were trimmed to exclude the primer sequence from the SNP calling. Read alignment to the reference genome (Homo sapiens, hg38, UCSC iGenomes) and the selection of uniquely aligned reads (mapping quality > 10) were performed using ERNE (Del Fabbro et al., 2013) v1.4.6 and BWA-MEM (Li and Durbin, 2009) v0.7.17 with default parameters. SNP calling was performed using gatk-4.0 following the software best practice for germline short-variant discovery proposed by DePristo et al. (2011): (1) per-sample variant calling on target regions using HaplotypeCaller with default parameters; (2) multiple-sample consolidation using GenomicsDBImport with default parameters; (3) joint genotyping using GenotypeGVCFs with default parameters; (4) selection of SNPs using SelectVariants and quality filtering of SNPs using VariantFiltration (filter expression used: Quality by Depth [QD] < 2.0, Mapping Quality [MQ] < 40.0, and MQRankSum < -12.5). SNPs were annotated according to the Single Nucleotide Polymorphism Database (dbSNP) b155 v2 (https://www.ncbi.nlm.nih.gov/snp/).
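The quality filter in step (4) can be illustrated with a small Python sketch. It is not the GATK implementation; it only restates the three thresholds given above as a predicate. Two assumptions are made here and are not stated in the text: a SNP is excluded as soon as any one of the three conditions is met, and a missing annotation (for example, MQRankSum is not computed for every site) does not by itself exclude the SNP. The function name `passes_hard_filter` is hypothetical.

```python
from typing import Mapping, Optional

# Minimum acceptable values, taken from the filter expression in the text.
HARD_FILTER_MINIMA = {"QD": 2.0, "MQ": 40.0, "MQRankSum": -12.5}


def passes_hard_filter(info: Mapping[str, Optional[float]]) -> bool:
    """Return True when a SNP survives the QD / MQ / MQRankSum thresholds.

    Assumption: failing any single threshold excludes the SNP, and
    annotations that are absent or None are treated as not failing.
    """
    for key, minimum in HARD_FILTER_MINIMA.items():
        value = info.get(key)
        if value is not None and value < minimum:
            return False
    return True
```

A record annotated with QD = 25.0, MQ = 60.0, and MQRankSum = 0.1 would be retained, whereas one with QD = 1.5 would be excluded regardless of its other annotations.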

Genotyping Data Analysis

Quality control and association analyses were conducted with PLINK, versions 1.9 and 2.0 (Chang et al., 2015). The QC pipeline proposed by Marees et al. (2018) was applied on genotype data prior to conducting association analysis. In particular, results were considered acceptable if missingness of SNPs was < 10%, missingness of SNPs per individual was < 10%, minor allele frequency (MAF) was ≥1%, deviation from Hardy-Weinberg equilibrium had a significance of p>1e-10, and heterozygosity was within 3 SDs from mean heterozygosity. There were no related individuals among the 96 donors of the genotyped DNA samples. Finally, a multidimensional scaling analysis (MDS) was performed on the total number of high-quality SNPs to check for population stratification.
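Two of the per-SNP criteria above (missingness < 10% and MAF ≥ 1%) can be sketched in a few lines of Python. The analyses themselves were run in PLINK; the code below is only an illustration, and the genotype encoding (0, 1, or 2 copies of the alternative allele, with None for a missing call) and the function names are assumptions. The Hardy-Weinberg and heterozygosity checks are not covered.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 ALT alleles; None marks a missing call


def missingness(calls: Sequence[Genotype]) -> float:
    """Fraction of individuals with no genotype call at this SNP."""
    return sum(call is None for call in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among the called genotypes."""
    observed = [call for call in calls if call is not None]
    if not observed:
        return 0.0
    alt_frequency = sum(observed) / (2 * len(observed))
    return min(alt_frequency, 1.0 - alt_frequency)


def snp_passes_qc(calls: Sequence[Genotype],
                  max_missing: float = 0.10,
                  min_maf: float = 0.01) -> bool:
    """Apply the missingness (< 10%) and MAF (>= 1%) criteria to one SNP."""
    return (missingness(calls) < max_missing
            and minor_allele_frequency(calls) >= min_maf)
```

The same missingness function can be applied across SNPs for a single individual to reproduce the per-individual criterion.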

Polygenic Score Calculation

Association analysis was conducted by binary logistic regression (MM/M versus EE/E types) and adjusted (additive model) for sex, age, and the first 2 principal components of the MDS analysis. False discovery rate was calculated by Benjamini-Hochberg multiple testing correction (Benjamini and Hochberg, 1995). The individual genetic predisposition for morningness/eveningness was evaluated by Polygenic Score (PS) analysis using PRSice-2 (Euesden et al., 2015). Significantly associated SNPs were linkage disequilibrium (LD)-clumped with an LD threshold (R²) of 0.5 and a range of 50 kb to reveal independent chronotype-associated SNPs (lead SNPs). To prevent overfitting, the proportion of variance explained by our data set was estimated as the average partial pseudo R² (Nagelkerke’s R² for binary chronotype; Nagelkerke, 1991) of the best models generated in a 10 × 5-fold cross-validation with a maximum p-value threshold of 0.05. A cross-validation approach (Manor and Segal, 2013) was used to obtain a variant ranking based on their predictive performance. In particular, lead SNPs were ranked based on the number of best models in which they were included. If 2 or more SNPs shared the same rank position, they were ordered based on their average association p value in the 10 × 5-fold cross-validation. The proportion of variance explained by different SNP sets was estimated as the average partial pseudo R² of the models generated in a new 10 × 5-fold cross-validation using the PRSset function in PRSice. The proportion of variance explained by the PS-based model is intended as an estimate of the predictive value of the model.
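As an illustration of the multiple testing correction mentioned above, the Benjamini-Hochberg adjustment can be sketched in plain Python as follows. This is a generic textbook formulation, not the code used in the study, and the function name is an assumption.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Each raw p value is scaled by m / rank (rank 1 = smallest p) and the
    result is made monotone by taking a running minimum from the largest
    rank downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

A variant would then be declared significant when its adjusted value falls below the chosen false discovery rate.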

Results

Primer design resulted in 5526 primers (Supplementary Figure S2 shows how these were distributed within the PER3 gene, as a representative example). Targeted sequencing generated about 28 million paired-end reads with an average of 290,649 reads per sample and a mapping rate of 98% (reference genome: homo_sapiens_hg38-iGenomes).

Variant calling identified 25,030 variants including not only SNPs but also indels, polymorphic repetitive elements, and false positives. Thus, variants needed to be quality filtered. By applying Quality by Depth and Mapping Quality stringent criteria, 8968 high-quality SNPs were identified with a 95.4% total genotyping rate. QC resulted in a final sample of 93 individuals (50 MM/M, 43 EE/E) and 2868 variants, with an average call rate of 99%. MDS analysis revealed no significant population stratification with a maximum proportion of variance explained by a principal component of 4.97%. However, the first 2 components of the MDS analysis were included in the association model as covariates, in addition to sex and age, to account for a possible minor source of distortion. Chronotype (MM/M versus EE/E) association p value and chromosomal position of each variant were used in the LD-clumping procedure to remove 1365 correlated variants and retain the 1503 independent association signals (lead variants; Suppl. Table S4). Only 2 significant associations (rs3213578 and rs228699, 2 PER3 intron variants; Table 2) were detected when p values were corrected for multiple testing. However, higher p values did not impinge on the development of a predictive model based on clumping and thresholding methods, where the association p values are used to establish a ranking of the variants but their significance level is not relevant.
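The LD-clumping step can be sketched as a greedy procedure: variants are visited in order of increasing association p value, and a variant is retained as a lead only if it is not in LD with an already retained lead nearby. The Python sketch below is a simplified illustration rather than the PLINK or PRSice-2 implementation. The `Variant` container, the use of squared Pearson correlation between allele dosages as the LD measure, and the interpretation of the 50 kb range as a maximum distance between two variants are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Variant:
    name: str
    chrom: str
    pos: int                 # position in bp
    p_value: float           # association p value
    dosages: Sequence[int]   # ALT allele count per individual (0, 1, 2)


def r_squared(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov * cov / (var_x * var_y)


def clump(variants: Sequence[Variant],
          r2_threshold: float = 0.5,
          window_bp: int = 50_000) -> List[Variant]:
    """Greedy clumping: keep the most associated variant of each LD group."""
    leads: List[Variant] = []
    for candidate in sorted(variants, key=lambda v: v.p_value):
        linked = any(
            lead.chrom == candidate.chrom
            and abs(lead.pos - candidate.pos) <= window_bp
            and r_squared(lead.dosages, candidate.dosages) >= r2_threshold
            for lead in leads
        )
        if not linked:
            leads.append(candidate)
    return leads
```

Because only the ordering of the p values is used, the procedure works whether or not individual associations survive multiple testing correction, which matches the rationale given in the text.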

Table 2.

List of the 35 predictive variants (complete panel).

SNP ID	Chrom.	Position (bp)	Gene	SO Term	REF	ALT	MAF	Morning (ALT/REF)	Evening (ALT/REF)	p	OR	Reference
rs228730	1	7,784,409	PER3	2KB upstream variant	G	A	0.13	22/74	2/80	1.6E-03	13.48	Archer et al. (2010)
rs3213578	1	7,801,098	PER3	intron variant	T	G	0.16	26/74	4/82	9.5E-05	12.84	Jones et al. (2019)
rs228667	1	7,809,320	PER3	intron variant	A	G	0.13	21/79	4/82	5.4E-04	9.19	Jones et al. (2019)
rs228690	1	7,819,262	PER3	intron variant	C	T	0.09	16/84	1/85	3.0E-03	25.65	Jones et al. (2019)
rs228692	1	7,820,652	PER3	intron variant	G	A	0.06	12/88	0/86	2.7E-02	30.34	Jones et al. (2019)
rs140974114	1	7,827,202	PER3	missense variant	G	A	0.04	0/100	8/78	3.4E-02	0.03
rs228697	1	7,827,519	PER3	missense variant	C	G	0.15	27/73	1/85	4.2E-04	50.10	Hida et al. (2014)
rs228699	1	7,828,109	PER3	intron variant	A	C	0.16	28/72	1/85	5.5E-05	95.14	Jones et al. (2019)
rs57875989	1	7,829,975	PER3	variable number tandem repeats	4 repeats	5 repeats	0.35	26/74	39/47	3.8E-03	0.32	Archer et al. (2003)
rs228654	1	7,837,168	PER3	intron variant	G	A	0.13	21/79	3/83	3.2E-04	13.08	Jones et al. (2019)
rs3811559	2	100,818,949	NPAS2	2KB upstream variant	G	T	0.05	7/91	3/83	4.4E-02	5.81
rs41280595	2	100,964,113	NPAS2	synonymous variant (intron)	G	A	0.18	21/79	12/74	3.9E-02	2.70
rs2304672	2	238,277,948	PER2	5 prime UTR variant	G	C	0.09	13/87	3/83	1.3E-02	6.09	Carpen et al. (2006)
rs113732211	2	238,278,317	PER2	intron variant	C	T	0.08	12/88	3/83	3.3E-02	4.62
rs75804782	2	238,407,402		intergenic	T	C	0.08	6/94	9/73	2.8E-02	0.24	Jones et al. (2016)
rs28510558	3	23,944,975	NR1D2	2KB upstream variant	A	G	0.09	16/82	0/80	1.4E-02	37.78
rs71630059	3	136,862,769	NCK	intron variant	G	A	0.24	29/71	15/71	3.9E-02	2.41
rs1965107	3	136,929,801	NCK	intron variant	A	T	0.26	33/67	16/70	1.6E-02	2.77
—	3	136,949,580	NCK	3 prime UTR variant	T	C	0.03	0/98	6/76	2.7E-02	0.02
—	4	55,438,864	CLOCK	intron variant	A	T	0.08	15/85	0/84	1.7E-02	35.73
rs11022743	11	13,276,253	ARNTL	2KB upstream variant	G	A	0.36	48/50	18/66	2.5E-03	3.17
rs74795115	11	13,364,814	ARNTL	intron variant	C	T	0.06	12/88	0/86	3.9E-02	21.11
rs11022779	11	13,375,263	ARNTL	intron variant	G	A	0.18	11/89	22/64	5.0E-03	0.26
rs4756037	11	45,856,669	CRY2	intron variant	G	A	0.18	26/74	8/78	5.8E-03	4.01
rs10838527	11	45,881,643	CRY2	3 prime UTR variant	A	G	0.04	0/100	8/78	3.5E-02	0.04
rs6277	11	113,412,737	DRD2	synonymous variant	A	G	0.42	36/64	42/44	4.0E-02	0.48
rs1800499	11	113,416,972	DRD2	synonymous variant	C	T	0.06	2/98	9/77	3.3E-02	0.17
rs774031	12	56,430,408	TIMELESS	intron variant	C	T	0.50	44/56	49/37	2.6E-02	0.42
—	12	56,450,529	TIMELESS	2KB upstream variant	A	C	0.05	10/90	0/86	4.2E-02	22.93
rs7294953	12	107,051,359	CRY1	intron variant	C	T	0.48	42/56	44/38	8.19E-02	0.55
rs62002729	15	60,705,932	RORA	intron variant	T	A	0.06	3/97	9/77	5.1E-02	0.22
rs3027160	17	8,154,254	PER1	intron variant	T	C	0.26	29/71	19/67	6.5E-02	2.33
rs9961653	18	59,100,439		intergenic	T	C	0.49	42/58	49/37	4.7E-02	0.50	Jones et al. (2016)
rs1041528435	20	44,624,736	ADA	intron variant	A	G	0.01	0/94	2/74	3.1E-02	0.01
rs5750581	22	38,299,401	CSNK1E	intron variant	T	C	0.15	10/90	18/68	3.0E-02	0.35

Abbreviations: SNP = single-nucleotide polymorphism; REF = reference allele; ALT = alternative allele; MAF = minor allele frequency; allelic distribution: total number of alternative and reference alleles observed in the 50 morning and 43 evening participants; p: association p value; OR = odds ratio; UTR = untranslated regions; SO = Sequence Ontology. Variants included in the small panel are highlighted in bold. Variants associated with the evening chronotype are highlighted in gray.
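The MAF column of Table 2 can be recomputed from the allelic distribution columns, which is a convenient consistency check when reading the table. The Python sketch below is illustrative only; the function name and the (ALT, REF) tuple layout are assumptions.

```python
from typing import Tuple

AlleleCounts = Tuple[int, int]  # (ALT, REF) allele counts in one group


def maf_from_counts(morning: AlleleCounts, evening: AlleleCounts) -> float:
    """Pooled minor allele frequency from ALT/REF counts in both groups."""
    alt_total = morning[0] + evening[0]
    allele_total = sum(morning) + sum(evening)
    alt_frequency = alt_total / allele_total
    return min(alt_frequency, 1.0 - alt_frequency)


# rs228697 in Table 2: 27/73 (morning) and 1/85 (evening)
maf_rs228697 = round(maf_from_counts((27, 73), (1, 85)), 2)
```

For rs228697 this gives 28 alternative alleles out of 186, that is, 0.15 after rounding, in agreement with the tabulated value.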

To obtain a ranking of the most predictive variants, the data set was repeatedly resampled in a 10 × 5-fold cross-validation, a PS-based model was generated for each resampling, and then the final ranking was determined based on the number of models in which each variant was included. This way, priority was given to the most informative variants, showing both high association and high MAF. For the PS modeling step, a p-value threshold of 0.05 was used to avoid inflation. A total of 83 lead significant variants were ranked (Suppl. Table S4). The predictive performance of 83 sets, encompassing the first N (1 ≤ N ≤ 83) variants, was assessed in a new 10 × 5-fold cross-validation. The first 13 variants (from here onward “small panel”) explained more than 80% of the variance of the population (Figure 1a), a result which was comparable with that of the predictive model with no variants selection (78.9%; Figure 1a). A second peak in explained variance (81.0%; Figure 1a) was reached with the first 35 variants (from here onward “large panel”). A further increase in the number of included variants did not improve the performance of the predictive model to any significant extent. PS based on both the large and the small panels was then calculated for all 96 genotyped subjects. Both PS distributions (Figure 1b) successfully separated EE/E from MM/M types (p < 10⁻¹⁴).
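The ranking rule described above (number of best models containing a variant, with ties resolved by the average association p value) can be sketched as follows. This is an illustrative Python reconstruction, not the study's code; the input structures (one set of SNP identifiers per best model and one list of p values per SNP across resamplings) and the function name are assumptions.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Sequence, Set


def rank_variants(best_models: Sequence[Set[str]],
                  p_values: Dict[str, List[float]]) -> List[str]:
    """Rank SNPs by how often they enter the best cross-validation models.

    best_models holds one set of SNP ids per resampling (50 sets for a
    10 x 5-fold design); p_values maps each SNP id to its association
    p values across the resamplings. Ties in inclusion count are broken
    by the lower average p value.
    """
    inclusion = Counter(snp for model in best_models for snp in model)
    return sorted(
        inclusion,
        key=lambda snp: (-inclusion[snp], mean(p_values[snp])),
    )
```

Taking the first N entries of the returned list yields the nested variant sets whose explained variance is compared in Figure 1a.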

Figure 1.

Polygenic score analysis. (a) Estimated variance explained by the model as the number of included variants increases. The proportion of explained variance is reported as the average partial pseudo Nagelkerke’s R² in a 10 × 5-fold cross-validation. Red dots represent the variance explained by the first 13 (small panel) and 35 variants (large panel), respectively. The orange dotted line represents the variance explained by the predictive model without variants selection. (b) Differences between polygenic score distributions in the 44 evening (EE/E; blue) and 52 morning (MM/M; yellow) participants (median; upper and lower quartile; minimum and maximum) for the large (left) and the small (right) panels; p values refer to Mann-Whitney U tests. Asterisks mark potential outliers, that is, observations with a standardized value >3 or <−3. Abbreviations: EE/E extremely evening/evening; MM/M extremely morning/morning. Color version of the figure is available online.
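For readers unfamiliar with polygenic scores, the per-individual quantity summarized in Figure 1b can be thought of as a weighted sum of allele dosages. The Python sketch below is a generic illustration and does not reproduce the exact PRSice-2 computation: the use of the natural logarithm of the odds ratio as the per-variant weight, the averaging over the variants with a genotype call, and the function name are all assumptions. The two odds ratios in the example are taken from Table 2.

```python
import math
from typing import Dict, Mapping, Optional


def polygenic_score(dosages: Mapping[str, Optional[int]],
                    odds_ratios: Mapping[str, float]) -> float:
    """Average of ALT allele dosages weighted by log odds ratios.

    Variants without a genotype call (absent or None) are skipped.
    Positive scores lean toward morningness (OR > 1), negative scores
    toward eveningness (OR < 1).
    """
    total = 0.0
    counted = 0
    for snp, odds_ratio in odds_ratios.items():
        dosage = dosages.get(snp)
        if dosage is None:
            continue
        total += dosage * math.log(odds_ratio)
        counted += 1
    return total / counted if counted else 0.0


# Odds ratios from Table 2 for two PER3 variants of the small panel.
EXAMPLE_WEIGHTS: Dict[str, float] = {"rs228697": 50.10, "rs57875989": 0.32}
```

With these weights, a carrier of one alternative allele at rs228697 and none at rs57875989 receives a positive score, whereas an individual homozygous for the alternative allele at rs57875989 only receives a negative one.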

The 35 highly predictive variants of the large panel were distributed on 15 out of the 19 target genes (Table 2). No predictive SNPs were found on the circadian gene NR1D1 or on 3 out of the 5 genes involved in sleep duration (ABCC9, BHLHE41, and FABP7). Of the 35 variants, 12 have been previously associated with chronotype (Table 2; Suppl. Table S1), including 2 SNPs (rs75804782 and rs9961653) on intergenic regions belonging to the group of 18 variants that we included even though they were located far from the target genes (Suppl. Table S2).

Ten of the highly predictive variants mapped on PER3, 3 on ARNTL, and 3 on NCK. The alternative alleles (i.e., alternative to the “wild-type” allele more commonly observed in the reference population) of 21 SNPs were associated with the MM/M chronotype (Table 2; OR > 1), while the alternative alleles of the remaining 14 variants were associated with the evening chronotype (Table 2; OR < 1). The 13 highly predictive variants of the small panel (Table 2) mapped on 8 clock genes (ARNTL, CLOCK, CRY2, CSNK1E, NCK, NR1D2, PER2, and PER3), with 9 and 4 being variants associated with MM/M and EE/E chronotype, respectively.

Discussion

In this study, we defined a panel of 35 predictive genetic polymorphisms for potential use in chronotype assessment. In further detail, we identified 23 novel variants which, in combination with 12 SNPs previously associated with chronotype as stand-alone, led to the development of a PS-based model with an estimated predictive value of about 80% in our population. Our findings suggest that the development of a predictive model for a complex trait, such as chronotype, can benefit from complete assessment of the genetic variability of a number of selected target genes. Our experimental design confirmed findings from previous GWAS and candidate gene studies, and also led to the identification of a non-negligible number of novel SNPs.

Our results confirmed the key role of PER3 in chronotype modulation, with the highest number of our identified highly predictive variants (10 out of 35) mapping on its gene. Unlike PER1 and PER2, which are essential for circadian rhythms’ maintenance and light responses in the master clock, PER3 has long been considered to exert its timekeeping function mostly in specific peripheral clocks (Pendergast et al., 2010, 2012). More recently, a role for PER3 in stabilizing PER1/PER2 has been proposed (Zhang et al., 2016). This hypothesis is supported by the high number of association studies linking human PER3 polymorphisms and chronotype (Hida et al., 2014; Parsons et al., 2014; Turco et al., 2017; Jones et al., 2019), sleep (Hasan et al., 2014), and mood (Zhang et al., 2016). rs228697 and rs140974114 are the only missense variants included in the model and are both located on the PER3 gene. rs228697 has been already associated with chronotype in previous studies (Hida et al., 2014; Turco et al., 2017) and it is thought to modify the secondary structure of the protein and its phosphorylation dynamics, affecting the interaction with the scaffold protein NCK, and PER3 stability.

Further analysis, either molecular or in silico, of the effects of the novel variants included in the predictive model could provide valuable insights into the molecular mechanisms underlying chronotype. For example, rs140974114 has never been associated with chronotype before. The alternative allele is associated with eveningness and leads to the substitution of a serine with an asparagine (S750N) in the PER3 protein (Figure 2). The polymorphism lies within the CSNK1E binding domain (555-760 aa; UniProt.org), close to a nuclear localization signal (NLS) motif (729-745 aa; UniProt.org), and the amino acid substitution could affect a putative phosphorylation site (phosphonet.ca). The phosphorylation status of PER proteins is known to influence stability and nuclear import, and more specifically, phosphorylation sites are required for optimal CSNK1E-mediated nuclear import of PER3 in mice (Akashi et al., 2002). In humans, the role of CSNK1E is less understood. However, the obliteration of the putative phosphorylation site in the S750N variant might alter the masking/unmasking of the NLS, resulting in different kinetics of the nuclear translocation of PER3.

Figure 2.

Schematic representation of the hypothetical effect of the S750N mutation on the CSNK1E-binding domain of PER3. The blue bar represents the CSNK1E-binding domain on the PER3 protein. The NLS is represented as a green box. Putative phosphorylation sites are highlighted in red. The red arrow indicates the mutation site and the resulting amino acid substitution. Abbreviation: NLS = nuclear localization signal. Color version of the figure is available online.

Among the most relevant genetic variations, we also found the PER3 VNTR, which, despite a low significance association with chronotype, was included in the small panel. The previously reported synergistic effect on chronotype of PER3 VNTR and the P864A substitution (rs228697; Turco et al., 2017) was confirmed also in this series.

Seven variants, which could alter gene expression, were also identified. Three were synonymous variants, two on the DRD2 gene, and another one in NPAS2, possibly affecting transcription, splicing, mRNA stability, and/or co-translational folding (Zeng and Bromberg, 2019). Four were variants located in the putative promoter regions of NR1D2, ARNTL, TIMELESS, NCK, and PER3, possibly altering gene expression by impairing a transcription factor binding site. For example, the rs228730 SNP on the PER3 promoter region falls in a CpG site between 2 Sp1 (specificity protein 1) transcription factor binding sites and its influence on PER3 expression has been experimentally confirmed (Archer et al., 2010). Finally, 4 additional variants fell on the untranslated regions (UTR) of NCK, CRY2, and PER2. Mutations on these regions could affect mRNA stability, localization, and translation through effects on the specific binding sites of regulatory proteins and non-coding RNAs (Steri et al., 2018). For example, an SNP on the 3′UTR of the NCK gene (Ch3:136949580T>C), together with 2 other correlated SNPs (Ch3:136949584G>A and Ch3:136949587G>T), disrupts the hsa-miR-6875-3p recognition element site (predicted by miRDB, Target score 98; mirdb.org). Apparently, none of the intron variants fell on a canonical splice site (branchpoints or B-boxes), but their role in affecting alternative splicing through splicing regulatory elements (intronic splicing enhancer or silencer) cannot be excluded.

The estimated proportion of variance explained by our genetic model (81%) exceeds the previously estimated heritable component of chronotype (up to 52%; Klei et al., 2005; Koskenvuo et al., 2007; Barclay et al., 2010; Hsu et al., 2015). This most likely reflects the fact that intermediate types were not included in the present sample. Our model (both the large and the small panels) will need validation in independent samples encompassing the whole range of chronotypes. Larger samples will also allow a linear regression approach based on the 5 chronotype classes (MM, M, intermediate, E, EE) or the continuous Morningness-Eveningness Questionnaire score.
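To make the notion of "variance explained" by a polygenic score more concrete, the following is a minimal Python sketch, not the pipeline used in this study. It assumes a score computed as a weighted sum of effect-allele dosages, a one-predictor logistic model separating two chronotype groups, and Nagelkerke's pseudo-R² (Nagelkerke, 1991) as the summary statistic. The weights, genotypes, phenotype labels, and function names are all invented for illustration.

```python
import math

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) for one individual."""
    return sum(w * d for w, d in zip(weights, dosages))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the 2x2 information matrix X'WX
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def log_likelihood(y, p):
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def nagelkerke_r2(y, p_model):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    p_null = sum(y) / n                      # intercept-only model
    ll_null = log_likelihood(y, [p_null] * n)
    ll_model = log_likelihood(y, p_model)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_r2

# Hypothetical data: 3 SNPs, 8 individuals (1 = morning type, 0 = evening type)
weights = [0.5, -0.3, 0.8]                   # invented per-allele effect sizes
genotypes = [[2, 0, 2], [1, 1, 2], [2, 1, 1], [0, 0, 1],
             [1, 0, 1], [1, 2, 1], [0, 2, 0], [0, 1, 0]]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]

scores = [polygenic_score(g, weights) for g in genotypes]
b0, b1 = fit_logistic(scores, phenotype)
fitted = [sigmoid(b0 + b1 * s) for s in scores]
print("Nagelkerke R2:", round(nagelkerke_r2(phenotype, fitted), 3))
```

In this sketch the pseudo-R² is computed on the same individuals used to fit the model, so it is optimistic. An estimate from held-out data, as in a cross-validation design, would be the more appropriate figure to report, and a sample restricted to the two extremes of the phenotype would be expected to inflate it further.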

Our study has a number of limitations, including the inherent confounding associated with phenotyping for sleep-wake features in a real-life environment, the relatively small sample size, and the exclusion of intermediate types.

We plan to use this model to develop a multiplex genotyping assay to define chronotype by a molecular approach, which may have advantages in settings where questionnaire-based chronotype assessment is confounded, for example, by disease or hospitalization (Bano et al., 2014). Furthermore, the relatively small number of predictive variants included in the model will allow development of a faster and more accessible PCR-based genotyping assay to replace the high-throughput approach. Such an assay could find use in both public health initiatives and precision medicine. Possibly even more importantly, further studies could both validate the model and progressively refine it to encompass newly identified loci. This process is likely to lead to future, hypothesis-driven research.

In conclusion, two useful panels for the molecular assessment of chronotype were developed based on an innovative genotyping-by-sequencing approach of 19 clock and sleep-related genes in a sample of selected individuals with opposite chronotypes.

Supplemental Material

Supplemental material for Toward a Molecular Approach to Chronotype Assessment by Alberto Biscontin, Lisa Zarantonello, Antonella Russo, Rodolfo Costa and Sara Montagnese in Journal of Biological Rhythms:

sj-tif-1-jbr-10.1177_07487304221099365

sj-tif-2-jbr-10.1177_07487304221099365

sj-xlsx-3-jbr-10.1177_07487304221099365

sj-xlsx-4-jbr-10.1177_07487304221099365

sj-xlsx-5-jbr-10.1177_07487304221099365

sj-xlsx-6-jbr-10.1177_07487304221099365

Footnotes

Acknowledgements

The study and authors AB and LZ were supported by a Supporting TAlent in ReSearch@University of Padova (STARS@UNIPD) 2019 “Consolidator Grants (STARS-CoG)” award to author SM. The study was also supported by the Comparative Insect Chronobiology (CINCHRON), EU Horizon 2020, Marie Skłodowska-Curie Initial Training Network (grant agreement N° 765937) to author RC.

Conflict of Interest Statement

The author(s) have no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.


References

Adan, Natale (2002) Gender differences in morningness-eveningness preference. Chronobiol Int 19:709-720.

Akashi, Tsuchiya, Yoshino, Nishida (2002) Control of intracellular dynamics of mammalian period proteins by casein kinase I epsilon (CKIepsilon) and CKIdelta in cultured cells. Mol Cell Biol 22:1693-1703.

Amanat, Requena, Lopez-Escamez (2020) A systematic review of extreme phenotype strategies to search for rare variants in genetic studies of complex disorders. Genes 11:987.

Archer, Carpen, Gibson, Lim, Johnston, Skene, von Schantz (2010) Polymorphism in the PER3 promoter associates with diurnal preference and delayed sleep phase disorder. Sleep 33:695-701.

Archer, Robilliard, Skene, Smits, Williams, Arendt, von Schantz (2003) A length polymorphism in the circadian clock gene Per3 is linked to delayed sleep phase syndrome and extreme diurnal preference. Sleep 26:413-415.

Ashbrook, Krystal, Fu Y-H, Ptáček (2020) Genetics of the human circadian clock and sleep homeostat. Neuropsychopharmacology 45:45-54.

Bano, Chiaromanni, Corrias, Turco, De Rui, Amodio, Merkel, Gatta, Mazzotta, Costa, et al. (2014) The influence of environmental factors on sleep quality in hospitalized medical patients. Front Neurol 5:267.

Barchi, Acquadro, Alonso, Aprea, Bassolino, Demurtas, Ferrante, Gramazio, Mini, Portis, et al. (2019) Single primer enrichment technology (SPET) for high-throughput genotyping in tomato and eggplant germplasm. Front Plant Sci 10:1005.

Barclay, Eley, Buysse, Archer, Gregory (2010) Diurnal preference and sleep quality: same genes? A study of young adult twins. Chronobiol Int 27:278-296.

Benjamini, Hochberg (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol 57:289-300.

Carpen, von Schantz, Smits, Skene, Archer (2006) A silent polymorphism in the PER1 gene associates with extreme diurnal preference in humans. J Human Genet 51:1122-1125.

Chang, Chow, Tellier, Vattikuti, Purcell, Lee (2015) Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4:7.

Chellappa, Viola, Schmidt, Bachmann, Gabel, Maire, Reichert, Valomon, Götz, Landolt, et al. (2012) Human melatonin and alerting response to blue-enriched light depend on a polymorphism in the clock gene PER3. J Clin Endocrinol Metabol 97:E433-E437.

Curtis, Ashbrook, Young, Finn, Ptáček, Jones (2019) Extreme morning chronotypes are often familial and not exceedingly rare: the estimated prevalence of advanced sleep phase, familial advanced sleep phase, and advanced sleep-wake phase disorder in a sleep clinic population. Sleep 42:zsz148.

Del Fabbro, Scalabrin, Morgante, Giorgi (2013) An extensive evaluation of read trimming effects on Illumina NGS data analysis. PLoS ONE 8:e85024.

DePristo, Banks, Poplin, Garimella, Maguire, Hartl, Philippakis, del Angel, Rivas, Hanna, et al. (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Gen 43:491-498.

Dmitrzak-Węglarz, Pawlak, Wiłkość, Miechowicz, Maciukiewicz, Ciarkowska, Zaremba, Hauser (2016) Chronotype and sleep quality as a subphenotype in association studies of clock genes in mood disorders. Acta Neurobiol Exp 76:32-42.

Etain, Jamain, Milhiet, Lajnef, Boudebesse, Dumaine, Mathieu, Gombert, Ledudal, Gard, et al. (2014) Association between circadian genes, bipolar disorders and chronotypes. Chronobiol Int 31:807-814.

Euesden, Lewis, O’Reilly (2015) PRSice: polygenic risk score software. Bioinformatics (Oxford, England) 31:1466-1468.

Ghotbi, Pilz, Winnebeck, Vetter, Zerbini, Lenssen, Frighetto, Salamanca, Costa, Montagnese, et al. (2020) The µMCTQ: an ultra-short version of the Munich ChronoType Questionnaire. J Biol Rhythms 35:98-110.

Gottlieb, O’Connor, Wilk (2007) Genome-wide association of sleep and circadian phenotypes. BMC Med Gen 8:S9.

Hasan, van der Veen, Winsky-Sommerer, Hogben, Laing, Koentgen, Dijk, Archer (2014) A human sleep homeostasis phenotype in mice expressing a primate-specific PER3 variable-number tandem-repeat coding-region polymorphism. FASEB J 28:2441-2454.

He, Jones, Fujiki, Guo, Holder Jr, Rossner, Nishino (2009) The transcriptional repressor DEC2 regulates sleep length in mammals. Science (New York, N.Y.) 325:866-870.

Hida, Kitamura, Katayose, Kato, Ono, Kadotani, Uchiyama, Ebisawa, Inoue, Kamei, et al. (2014) Screening of clock gene polymorphisms demonstrates association of a PER3 polymorphism with morningness-eveningness preference and circadian rhythm sleep disorder. Sci Rep 4:6309.

Hirano, Shi, Jones, Lipzen, Pennacchio, Hallows, McMahon, Yamazaki, Ptáček, et al. (2016) A Cryptochrome 2 mutation yields advanced sleep phase in humans. eLife 5:e16695.

Höglund, Rafati, Rask-Andersen, Enroth, Karlsson, Johansson (2019) Improved power and precision with whole genome sequencing data in genome-wide association studies of inflammatory biomarkers. Sci Rep 9:16844.

Horne, Ostberg (1976) A self-assessment questionnaire to determine morningness-eveningness in human circadian rhythms. Int J Chronobiol 4:97-110.

Hsu P-K, Ptáček, Fu Y-H (2015) Genetics of human sleep behavioral phenotypes. Methods Enzymol 552:309-324.

Hu, Shmygelska, Tran, Eriksson, Tung, Hinds (2016) GWAS of 89,283 individuals identifies genetic variants associated with self-reporting of being a morning person. Nat Commun 7:10448.

Jankowski, Dmitrzak-Węglarz (2017) ARNTL, CLOCK and PER3 polymorphisms – links with chronotype and affective dimensions. Chronobiol Int 34:1105-1113.

Jones, Lane, Wood, van Hees, Tyrrell, Beaumont, Jeffries, Dashti, Hillsdon, Ruth, et al. (2019) Genome-wide association analyses of chronotype in 697,828 individuals provides insights into circadian rhythms. Nat Commun 10:343.

Jones, Tyrrell, Wood, Beaumont, Ruth, Tuke, Yaghootkar, Teder-Laving, Hayward, et al. (2016) Genome-wide association analyses in 128,266 individuals identifies new morningness and sleep duration loci. PLoS Gen 12:e1006125.

Klei, Reitz, Miller, Wood, Maendel, Gross, Waldner, Eaton, Monk, Nimgaonkar (2005) Heritability of morningness-eveningness and self-report sleep measures in a family-based sample of 521 Hutterites. Chronobiol Int 22:1041-1054.

Koskenvuo, Hublin, Partinen, Heikkilä, Kaprio (2007) Heritability of diurnal type: a nationwide study of 8753 adult twin pairs. J Sleep Res 16:156-162.

Kurien, Hsu P-K, Leon, McMahon, Shi, Lipzen, Pennacchio, Jones, et al. (2019) TIMELESS mutation alters phase responsiveness and causes advanced sleep phase. Proc Natl Acad Sci U S A 116:12045-12053.

Lane, Vlasac, Anderson, Kyle, Dixon, Bechtold, Gill, Little, Luik, Loudon, et al. (2016) Genome-wide association analysis identifies novel loci for chronotype in 100,420 individuals from the UK Biobank. Nat Commun 7:10889.

Lee H-J, Kim, Kang S-G, Yoon, Choi, Park, Kim, Kripke (2011) PER2 variation is associated with diurnal preference in a Korean young population. Behav Gen 41:273-277.

Li, Durbin (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 25:1754-1760.

Lussier, Larose (1997) A casein kinase I activity is constitutively associated with Nck. J Biol Chem 272:2688-2694.

Manor, Segal (2013) Predicting disease risk using bootstrap ranking and classification algorithms. PLoS Comp Biol 9:e1003200.

Marees, de Kluiver, Stringer, Vorspan, Curis, Marie-Claire, Derks (2018) A tutorial on conducting genome-wide association studies: quality control and statistical analysis. Int J Methods Psychiatr Res 27:e1608.

Martin (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17:110-112.

Matsuo, Shiino, Yamada, Ozeki, Okawa (2007) A novel SNP in hPer2 associates with diurnal preference in a healthy population. Sleep Biol Rhythms 5:141-145.

Maukonen, Havulinna, Männistö, Kanerva, Salomaa, Partonen (2020) Genetic associations of chronotype in the Finnish general population. J Biol Rhythms 35:501-511.

Mercer, Clark, Andersen, Brunck, Haerty, Crawford, Taft, Nielsen, Dinger, Mattick (2015) Genome-wide discovery of human splicing branchpoints. Gen Res 25:290-303.

Mishima, Tozawa, Satoh, Saitoh, Mishima (2005) The 3111T/C polymorphism of hClock is associated with evening preference and delayed sleep timing in a Japanese population sample. Am J Med Genet B Neuropsychiatr Genet 133B:101-104.

Nagelkerke NJD (1991) A note on a general definition of the coefficient of determination. Biometrika 78:691-692.

Parsons, Lester, Barclay, Archer, Nolan, Eley, Gregory (2014) Polymorphisms in the circadian expressed genes PER3 and ARNTL2 are associated with diurnal preference and GNβ3 with sleep measures. J Sleep Res 23:595-604.

Patke, Murphy, Onat, Krieger, Özçelik, Campbell, Young (2017) Mutation of the human circadian clock gene CRY1 in familial delayed sleep phase disorder. Cell 169:203-215.e13.

Pendergast, Friday, Yamazaki (2010) Distinct functions of Period2 and Period3 in the mouse circadian system revealed by in vitro analysis. PLoS ONE 5:e8552.

Pendergast, Niswender, Yamazaki (2012) Tissue-specific function of Period3 in circadian rhythmicity. PLoS ONE 7:e30254.

Robilliard, Archer, Arendt, Lockley, Hack, English, Leger, Smits, Williams, Skene, et al. (2002) The 3111 Clock gene polymorphism is not associated with sleep and circadian rhythmicity in phenotypically characterized human subjects. J Sleep Res 11:305-312.

Roenneberg, Kuehnle, Juda, Kantermann, Allebrandt, Gordijn, Merrow (2007) Epidemiology of the human circadian clock. Sleep Med Rev 11:429-438.

Roenneberg, Wirz-Justice, Merrow (2003) Life between clocks: daily temporal patterns of human chronotypes. J Biol Rhythms 18:80-90.

Steri, Idda, Whalen, Orrù (2018) Genetic variants in mRNA untranslated regions. Wiley Interdiscip Rev RNA 9:e1474.

Toh, Jones, Eide, Hinz, Virshup, Ptáček (2001) An hPer2 phosphorylation site mutation in familial advanced sleep phase syndrome. Science 291:1040-1043.

Tonetti, Natale (2019) Discrimination between extreme chronotypes using the full and reduced version of the Morningness-Eveningness Questionnaire. Chronobiol Int 36:181-187.

Turco, Biscontin, Corrias, Caccin, Bano, Chiaromanni, Salamanca, Mattei, Salvoro, Mazzotta, et al. (2017) Diurnal preference, mood and the response to morning light in relation to polymorphisms in the human clock gene PER3. Sci Rep 7:6967.

Vetter (2020) Circadian disruption: what do we actually mean? Eur J Neurosci 51:531-550.

Xu, Padiath, Shapiro, Jones, Saigoh, Ptáček (2005) Functional consequences of a CKIδ mutation causing familial advanced sleep phase syndrome. Nature 434:640-644.

Zeng, Bromberg (2019) Predicting functional effects of synonymous variants: a systematic review and perspectives. Front Gen 10:914.

Zhang, Fu Y-H (2020) The molecular genetics of human sleep. Eur J Neurosci 51:422-428.

Zhang, Hirano, Hsu P-K, Jones, Sakai, Okuro, McMahon, Yamazaki, Saigoh, et al. (2016) A PERIOD3 variant causes a circadian phenotype and is associated with a seasonal mood trait. Proc Natl Acad Sci U S A 113:E1536-E1544.
