Abstract
Genetic factors may contribute to susceptibility to recurrent aphthous ulcers (RAU). This study evaluated the heritability of RAU in the TwinsUK registry. Data from 890 twin pairs (319 monozygotic [MZ] and 571 dizygotic [DZ]) were used to estimate the prevalence of RAU in the previous year and 8 subphenotypes. A classical twin design was used to partition the variance in RAU presentation into components attributable to additive genetic (A), common/shared environmental (C), dominance genetic (D), and unique/nonshared environmental (E) effects. A multilevel ACE/ADE model with random effects at both the individual and twin-pair levels was fitted to the prevalence of RAU and each subphenotype separately. RAU prevalence in the previous year was 9.3%, with MZ and DZ twin correlations of 0.59 (95% confidence interval [CI]: 0.37, 0.82) and 0.30 (0.09, 0.82), respectively. The best-fitting AE model estimated the heritability of RAU prevalence in the previous year at 55.69% (34.43, 76.95). Using stricter RAU criteria yielded similar heritability estimates (58.82% [36.51, 81.14]), reinforcing the robustness of the findings. Among subphenotypes, frequency of episodes (53.61% [33.09, 74.13]) and time between occurrences (53.08% [30.96, 75.19]) showed the largest genetic contribution, while ulcer size had the lowest genetic contribution (40.55% [14.76, 66.33]). Heritability estimates for the number of RAU (50.32% [25.99, 74.65]), healing time (49.24% [27.46, 71.03]), and location in soft tissues (42.85% [12.91, 72.78]) and hard tissues (40.70% [12.43, 68.97]) fell between those values. The findings indicate a genetic contribution to RAU susceptibility, with heritability estimates differing across various phenotypic presentations of the condition.
Introduction
Recurrent aphthous ulcers (RAU), also known as recurrent aphthous stomatitis, are a common chronic inflammatory condition of the oral mucosa. They are characterized by the repeated appearance of 1 or more painful ulcers, typically lasting 1 to 3 wk and resolving spontaneously before recurring after variable intervals (Wang et al 2022; Stoopler et al 2024). Despite their high prevalence and impact on quality of life, there is currently no definitive cure for RAU, likely due to its poorly understood etiology (Bilodeau and Lalla 2019; Conejero Del Mazo et al 2023). Genetic factors are believed to play a significant role in susceptibility to RAU (Akintoye and Greenberg 2014; Cui et al 2016). A recent systematic review on genetics in human oral health reported strong genetic contribution for 25 oral conditions, moderate contribution for 2, and weak evidence for 14 others. However, RAU was notably absent from this review, likely due to the limited availability of robust genetic studies, despite its status as one of the most prevalent oral conditions (Joy-Thomas et al 2025).
Heritability is commonly estimated using family-based methods, including family aggregation studies and twin designs (Robette et al 2022; Barry et al 2023). Family aggregation studies assess clustering of traits or diseases within families (Barry et al 2023). A positive family history is frequently observed among individuals with RAU, and the likelihood of developing the condition increases with the number of affected first-degree relatives (Scully and Porter 2008; Bilodeau and Lalla 2019). Moreover, familial cases tend to exhibit a more severe phenotype, with earlier onset, more frequent recurrences, and longer ulcer duration (Slebioda et al 2014; Wang et al 2022). Twin designs compare monozygotic (MZ) twins, who share nearly 100% of their genes, with dizygotic (DZ) twins, who share approximately 50%, to better distinguish genetic influences from shared environmental factors. When MZ twins exhibit greater phenotypic similarity than DZ twins, it suggests that genetic variation contributes to the trait (Barry et al 2023). To date, only 2 twin studies have explored RAU heritability. The earliest study was conducted in Philadelphia, United States, on 19 twin pairs (12 MZ and 7 DZ, mean age: 13.1 y). The results showed higher concordance rates among MZ twins than DZ twins (92% vs 57%), suggesting a genetic influence (Miller et al 1977). A subsequent study in Brisbane, Australia, examined a larger, nonclinical sample of 290 twin pairs (127 MZ and 163 DZ), aged 10 to 12 y, and their biological parents (n = 1,160). The authors reported a substantial genetic contribution to RAU incidence, estimating heritability at 64% (Lake et al 1997). While both twin studies supported a genetic basis for RAU, only Lake et al (1997) reported a heritability estimate.
No twin study has been conducted in adults, representing a significant gap in knowledge because RAU often begins in young adulthood (Akintoye and Greenberg 2014), and the frequency and severity of recurrences decrease with age (Conejero Del Mazo et al 2023). Furthermore, a comprehensive characterization of the genetic contributions to distinct RAU subphenotypes is crucial for advancing understanding of the underlying biology. Specific clinical features, such as lesion size and healing time, likely reflect different biological processes including nociception, inflammation, and mucosal repair. By linking these subphenotypes to plausible biological pathways, the genetic architecture and pathophysiology of RAU can be elucidated more clearly. Therefore, this study evaluated the heritability of RAU using a large adult twin cohort from the TwinsUK Registry. Based on the findings of Lake et al (1997), it was hypothesized that the genetic contribution to RAU would exceed 50%, primarily reflecting additive genetic effects rather than nonadditive effects.
Methods
This report adheres to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement (von Elm et al 2007).
Data Source
This study used data from TwinsUK, a nationwide registry of community-dwelling adult twins in the United Kingdom. The registry started in 1992, with the initial intention to investigate osteoporosis and osteoarthritis. As a result, several hundred middle-aged women formed the core of the original cohort. Between 1992 and 2004 (baseline), more than 7,000 twins responded to annual questionnaires, and approximately 5,500 twins attended a full comprehensive clinical visit, which included several investigations. The success of these early studies led to rapid expansion of TwinsUK, and today the cohort includes more than 14,000 male and female twins aged 18 y and older. However, the cohort remains predominantly female (82%) and middle-aged (mean age = 59 y) (Verdi et al 2019).
Phenotypic data and biological samples are collected through annual questionnaires and approximately quadrennial clinic assessments. The TwinsUK registry has received ethical approval from the London-Westminster NHS Research Ethics Committee (reference: EC04/015), and all participants provided written informed consent (Moayyeri et al 2013). This study used data from a sample of 890 twin pairs (319 MZ and 571 DZ) who completed a health questionnaire between 1998 and 2000.
Measures
Between 1998 and 2000, the health questionnaire included 9 items assessing multiple aspects of RAU, which were developed based on previous research aimed at identifying criteria for classification of disease severity (Tappuni et al 2013). Participants were first asked if they get oral ulcers 3 or more times per year (Rogers 1997; Barrons 2001). Those who responded affirmatively were subsequently asked a series of follow-up questions regarding specific features of RAU: frequency (how often do you get ulcers?), age of onset (how old were you when ulcers started appearing?), healing time (when ulcers appear in the mouth, how long before they heal?), time between occurrences (what is the average time between ulcer attacks (ulcer-free period)?), quantity (how many ulcers do you get at a time?), shape (what is the shape of the ulcer?), size (what is the maximum size your oral ulcer would be?), and location (tongue, under the tongue, roof of the mouth, gums, throat, cheeks, lips, elsewhere in the body). The primary phenotype was RAU in the previous year (RAU prevalence). Responses to the additional questionnaire items were used to derive secondary phenotypic traits (subphenotypes), representing various dimensions of RAU presentation. Participants without RAU were included when defining subphenotypes; they served as the reference group.
Twin zygosity was initially determined through self-reported responses at the time of study enrollment, using the Peas-in-the-Pod questionnaire (PPQ), a validated and cost-effective tool for distinguishing MZ from DZ twins (Jarrar et al 2018). Zygosity was later confirmed through genome-wide genotyping (Verdi et al 2019). Between 1999 and 2006, 3,859 twins (46.5% of those with PPQ-based zygosity) had zygosity confirmed by both the PPQ and genotyping. A single PPQ assessment correctly classified twins as MZ or DZ with 86.9% accuracy, rising to 97.9% when multiple questionnaires showed consistent results (Jarrar et al 2018).
Demographic variables were considered as potential covariates. Demographic data were obtained from self-reported lifestyle questionnaires collected annually and during clinical visits (Verdi et al 2019). Environmental covariates were not included because many putative environmental exposures relevant to RAU (eg, educational attainment, nutritional deficiencies, stress, and smoking) are themselves substantially heritable and genetically correlated with the outcome. Adjusting for such variables would therefore condition on genetically influenced traits, potentially removing genetically mediated pathways and biasing estimates of total genetic variance downward, rather than controlling for exogenous confounding (Plomin and Bergeman 1991; McAdams et al 2013).
Data Analysis
Descriptive comparisons between MZ and DZ twins were conducted. Age distributions were compared using the Mann–Whitney test, while differences in RAU prevalence and subphenotypes were assessed using the chi-squared test. The numbers of MZ and DZ twin pairs were also reported, along with the distribution of concordant and discordant pairs for RAU prevalence and subphenotypes.
To estimate the relative contributions of genetic and environmental factors to RAU prevalence, we applied a classical twin design (Boomsma et al 2002). We decomposed the total phenotypic variance into 4 latent components: additive (A) genetic variance, common/shared (C) environment variance, dominant (D) genetic variance, and unique/nonshared environment (E) variance. However, studies limited to MZ and DZ twins cannot simultaneously estimate both C and D components due to model identification constraints (Neale and Cardon 1992; Verweij et al 2012). The ACE and AE models were first fitted to RAU prevalence and compared using the Akaike and Bayesian information criteria (AIC and BIC). The model with the lower AIC/BIC values was preferred. If C was excluded and the ratio of the MZ to DZ twin correlations exceeded 2, we then tested whether including D improved model fit by comparing ADE and AE models, again using AIC/BIC. Therefore, the final model for RAU prevalence was one of the following: ACE, AE, or ADE (Tamimy et al 2021; Hagenbeek et al 2023). Variance decomposition was conducted using a multilevel parameterization of the ACE model, with random effects specified at both the individual and twin-pair levels (Rabe-Hesketh et al 2008). The Guo and Wang (2002) algorithm was preferred because of its computational efficiency and lack of model convergence issues. RAU prevalence was modeled as a binary variable (using tetrachoric correlations) and adjusted for age in 10-y bands. The analytical strategy was repeated with each RAU subphenotype, which were modeled as either binary or ordinal variables, using tetrachoric and polychoric correlations, respectively. Listwise deletion was used to handle missing data. Analyses were performed in Stata (StataCorp.) with the acelong package (Lang 2019).
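For reference, the model-selection logic above follows from the standard expectations of the classical twin design. The block below is a textbook sketch, not the multilevel parameterization fitted in this study. The standardized symbols a², c², d², and e² (the shares of total variance due to A, C, D, and E) are notation introduced here for illustration.

```latex
\begin{aligned}
\text{ACE:}\quad & a^2 + c^2 + e^2 = 1, &\quad r_{MZ} &= a^2 + c^2, &\quad r_{DZ} &= \tfrac{1}{2}a^2 + c^2, \\
\text{ADE:}\quad & a^2 + d^2 + e^2 = 1, &\quad r_{MZ} &= a^2 + d^2, &\quad r_{DZ} &= \tfrac{1}{2}a^2 + \tfrac{1}{4}d^2 .
\end{aligned}
```

Under these expectations, r_MZ − 2r_DZ equals −c² in the ACE model and d²/2 in the ADE model, which is why an MZ-to-DZ correlation ratio above 2 points toward dominance rather than shared environment. Because MZ and DZ pairs supply only two correlations, C and D cannot both be estimated alongside A.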
Statistical power for the ACE models was evaluated across a range of A and C effects (Visscher 2004). The sample had at least 80% power to detect A values of 0.32 or greater, regardless of the magnitude of C. For C, the sample achieved at least 80% power to detect values of 0.18 or greater when A ranged from 0.40 to 0.50. We conducted a sensitivity analysis using a more stringent RAU definition. Under this definition, participants must have experienced at least 3 episodes in the past year, with ulcers that were round or oval, measured ≤10 mm, and resolved within 2 wk.
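The stricter case definition can be read as a conjunction of four conditions. The following is a minimal sketch of that rule only. The `UlcerReport` class and its field names are hypothetical, and the questionnaire may have recorded these features as categories rather than the numeric values assumed here.

```python
from dataclasses import dataclass


@dataclass
class UlcerReport:
    """Hypothetical per-participant record; field names are illustrative."""
    episodes_past_year: int
    shape: str              # e.g. "round", "oval", "irregular"
    max_size_mm: float
    healing_time_weeks: float


def meets_strict_definition(report: UlcerReport) -> bool:
    """Apply the four criteria of the stricter RAU definition described in the text."""
    return (
        report.episodes_past_year >= 3             # at least 3 episodes in the past year
        and report.shape in {"round", "oval"}      # round or oval ulcers
        and report.max_size_mm <= 10               # no larger than 10 mm
        and report.healing_time_weeks <= 2         # resolved within 2 wk
    )


print(meets_strict_definition(UlcerReport(4, "oval", 5.0, 1.0)))
print(meets_strict_definition(UlcerReport(4, "irregular", 5.0, 1.0)))
```

A participant failing any single criterion is treated as not meeting the stricter definition.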
Results
A total of 1,852 twins completed the RAU questionnaire. Of these, 72 were excluded for missing (n = 64) or unknown zygosity (n = 8). The sample consisted of 1,780 females (638 MZ and 1,142 DZ twins), with a mean age of 50.34 (SD: 12.58) y (Table 1). RAU prevalence in the previous year was 9.3%. Ulcers were reported at similar rates in soft and hard tissues (6.4% and 6.8%, respectively). There were no differences in the prevalence or site of presentation of RAU between the MZ and DZ twins. Among twins who reported RAU in the previous year, the most common responses were that the ulcers first appeared before the age of 25 y (56.5%), occurred 3 to 4 times per year (40.6%), took less than a week to heal (58.7%), and took more than 8 wk to recur (50%). Most twins reported having fewer than 3 ulcers per episode (79.5%), which were typically round (69.1%) and most often 3 to 5 mm in diameter (44.5%). There were no differences between MZ and DZ twins across these phenotypic characteristics, except for ulcer size, where MZ twins reported significantly larger ulcers than DZ twins.
Table 1. Demographic and Clinical Characteristics of Participants, according to Zygosity.
DZ, dizygotic; MZ, monozygotic; RAU, recurrent aphthous ulcers.
Total for these questions is the number of participants who had RAU in the previous year (n = 165).
Chi-square and Mann–Whitney tests were used to compare categorical and numerical variables, respectively.
Concordance and discordance rates for RAU prevalence and subphenotypes are shown in Table 2. For RAU prevalence, most twin pairs were concordant for not having RAU, and concordance for the presence of RAU was lower than discordance. Also, 10.3% (n = 33) of MZ twin pairs and 15.8% (n = 90) of DZ twin pairs were discordant. Across subphenotypes, discordance rates were consistently lower among MZ than DZ twin pairs, ranging from 7.8% to 11.9% for MZ twin pairs and from 12.4% to 16.5% for DZ twin pairs.
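The discordance percentages for RAU prevalence can be reproduced from the pair counts quoted above. This is a minimal arithmetic sketch using only the figures in the text; the function name is illustrative.

```python
# Discordant-pair counts and pair totals quoted in the text for RAU prevalence.
PAIRS = {
    "MZ": {"discordant": 33, "total": 319},
    "DZ": {"discordant": 90, "total": 571},
}


def discordance_rate(discordant: int, total: int) -> float:
    """Percentage of twin pairs in which exactly one twin reports the trait."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * discordant / total


rates = {zyg: round(discordance_rate(**counts), 1) for zyg, counts in PAIRS.items()}
print(rates)
```

The rounded results match the 10.3% (MZ) and 15.8% (DZ) reported in the text.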
Table 2. Number and Percentage of Twin Pairs Concordant and Discordant for Different RAU Phenotypes.
DZ, dizygotic; MZ, monozygotic; RAU, recurrent aphthous ulcers.
The correlation for RAU prevalence was 0.59 for MZ twin pairs and 0.30 for DZ twin pairs (Table 3). For subphenotypes, the MZ twin correlations ranged between 0.44 and 0.59, whereas the DZ twin correlations ranged between 0.11 and 0.29. Of the 8 subphenotypes assessed, RAU in soft tissues, healing time, quantity, size, and shape showed MZ-to-DZ correlation ratios greater than 2. For these 5 subphenotypes, the ADE model was tested in addition to the ACE and AE models.
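As a rough plausibility check on these correlations, Falconer's formulas give back-of-envelope variance shares directly from the MZ and DZ values. This is not the multilevel model fitted in the study and ignores the age adjustment. The function name and dictionary keys are illustrative.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict[str, float]:
    """Back-of-envelope ACE shares from twin correlations (Falconer's formulas)."""
    return {
        "A": 2 * (r_mz - r_dz),  # additive genetic share
        "C": 2 * r_dz - r_mz,    # shared-environment share
        "E": 1 - r_mz,           # unique-environment share (includes measurement error)
        "ratio": r_mz / r_dz,    # values above 2 hint at dominance (D)
    }


# Twin correlations for RAU prevalence as reported in the text.
est = falconer_estimates(r_mz=0.59, r_dz=0.30)
print({name: round(value, 2) for name, value in est.items()})
```

For RAU prevalence, this gives an additive share of about 0.58, a shared-environment share near zero, and a correlation ratio just under 2. These values are in line with the modeled results, where neither the C nor the D component was retained for prevalence.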
Table 3. Twin Correlations for Different RAU Phenotypes.
RAU, recurrent aphthous ulcers; rMZ, correlation among monozygotic twins; rDZ, correlation among dizygotic twins.
Tetrachoric and polychoric correlations were used with binary and ordinal phenotypes, respectively.
Table 4 shows that the heritability for RAU prevalence was 55.69% (95% confidence interval [CI]: 34.43, 76.95). The AE model was preferred over the ACE model, as there was no evidence of the effect of the common environment (C component). Similarly, there was no indication of the influence of the common environment for any subphenotype. For the 5 subphenotypes with MZ-to-DZ correlation ratios greater than 2, the ADE model did not provide a better fit to the data than the AE model did. The Figure shows that the frequency of episodes (53.61%, 95% CI: 33.09, 74.13) and time between occurrences (53.08%, 95% CI: 30.96, 75.19) had the highest genetic contribution, whereas size had the lowest heritability (40.55%, 95% CI: 14.76, 66.33). Moreover, moderate heritability was observed for shape (50.50%, 95% CI: 26.09, 74.91), quantity (50.32%, 95% CI: 25.99, 74.65), healing time (49.24%, 95% CI: 27.46, 71.03), and location of ulcers in soft (42.85%, 95% CI: 12.91, 72.78) and hard tissues (40.70%, 95% CI: 12.43, 68.97).
Table 4. Variance Decomposition for Different RAU Phenotypes.
A, additive genetic variance; AIC, Akaike information criterion; BIC, Bayesian information criterion; C, common environmental variance; D, dominant genetic variance; E, unique environmental variance; RAU, recurrent aphthous ulcers.
All models were adjusted for age in 10-y bands. Bold values indicate the best-fitting model (lowest AIC and BIC) for each RAU phenotype.

Figure. Heritability estimates for different recurrent aphthous ulcer (RAU) phenotypes. Heritability estimates correspond to the additive genetic variance (A) component from the chosen ACE, AE, or ADE model in Table 4.
In the sensitivity analysis, the MZ and DZ twin correlations for a more stringent definition of RAU were 0.60 and 0.44, respectively. The AE model (AIC = 813.25, BIC = 844.46) was preferred over the ACE model (AIC = 815.24, BIC = 850.90), with a heritability estimate of 58.82% (95% CI: 36.51, 81.14).
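The model-selection rule (prefer the model with the lower AIC and BIC) can be illustrated with the fit statistics reported for the sensitivity analysis. This is a minimal sketch. The `Fit` type and function name are hypothetical, and returning `None` when the two criteria disagree is an assumption made here, since the text does not say how disagreements were resolved.

```python
from typing import NamedTuple, Optional


class Fit(NamedTuple):
    name: str
    aic: float
    bic: float


def preferred_model(fits: list[Fit]) -> Optional[str]:
    """Return the model with the lowest AIC and BIC, or None if the criteria disagree."""
    best_by_aic = min(fits, key=lambda fit: fit.aic)
    best_by_bic = min(fits, key=lambda fit: fit.bic)
    if best_by_aic.name != best_by_bic.name:
        return None  # assumption: flag disagreement rather than pick a winner
    return best_by_aic.name


# Fit statistics reported in the text for the stricter RAU definition.
sensitivity_fits = [
    Fit("AE", aic=813.25, bic=844.46),
    Fit("ACE", aic=815.24, bic=850.90),
]
print(preferred_model(sensitivity_fits))
```

Both criteria favor the AE model here, in agreement with the choice reported above.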
Discussion
This study demonstrates that genetic factors play a substantial role in RAU presentation among UK twins. Our heritability estimate for RAU prevalence was reassuringly consistent across analyses, with closely comparable values in both the main (56%) and sensitivity (59%) analyses, supporting the reliability of our findings. These estimates also align with those reported in a previous twin study of Australian children (64%), further reinforcing the evidence for a substantial genetic contribution to RAU (Lake et al 1997).
The findings also indicate that specific RAU subphenotypes differ in their degree of heritability, with episode frequency and ulcer-free intervals showing the strongest genetic contribution (53% to 54%) and ulcer location, number, healing time, and size showing moderate heritability (41% to 50%). These estimates, although accompanied by wide and partly overlapping confidence intervals, provide valuable insights into the biological underpinnings of RAU. Higher heritability estimates for ulcer frequency and ulcer-free intervals are consistent with models in which genetically driven immune regulation and inflammatory thresholds shape RAU susceptibility. This aligns with evidence implicating dysregulated cytokine signaling and T-cell–mediated responses in RAU pathogenesis (Slebioda et al 2013). Genetic association studies support polymorphisms in genes encoding the serotonin transporter, several interleukins (IL-1β, IL-6, and IL-10), tumor necrosis factor–α, interferon-γ, and specific human leucocyte antigens, all of which may influence individual inflammatory response profiles (Slebioda et al 2014; Wu et al 2018; Wang et al 2022; Yousefi et al 2022). By contrast, traits such as ulcer size and location are likely more sensitive to local environmental influences, including mechanical trauma and chronic mucosal irritation, which can precipitate or exacerbate ulceration in genetically predisposed individuals (Slebioda et al 2014). Healing time probably represents a composite phenotype involving both genetically influenced tissue repair and wound-healing pathways and modifiable factors such as nutritional status, psychosocial stress, and other systemic or local exposures.
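The extent of overlap among the subphenotype confidence intervals can be checked from the estimates reported in the Results. The sketch below finds the range shared by all eight 95% CIs. It is a descriptive check only, not a formal test of differences between heritability estimates, and the function name is illustrative.

```python
# 95% confidence intervals (in %) for subphenotype heritability, as reported in the Results.
CI = {
    "frequency": (33.09, 74.13),
    "time between occurrences": (30.96, 75.19),
    "shape": (26.09, 74.91),
    "quantity": (25.99, 74.65),
    "healing time": (27.46, 71.03),
    "soft tissues": (12.91, 72.78),
    "hard tissues": (12.43, 68.97),
    "size": (14.76, 66.33),
}


def common_range(intervals):
    """Return the range shared by every interval, or None if no such range exists."""
    intervals = list(intervals)
    lower = max(low for low, _ in intervals)   # highest lower bound
    upper = min(high for _, high in intervals)  # lowest upper bound
    return (lower, upper) if lower <= upper else None


shared = common_range(CI.values())
print(shared)
```

All eight intervals share the range from 33.09% to 66.33%, so the ranking of subphenotypes by point estimate should be read with caution.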
Furthermore, the less-than-perfect concordance observed in MZ twins highlights the important role of environmental influences in RAU expression. More than 55% of twins with RAU reported onset before 25 y of age, underscoring young adulthood as a particularly vulnerable period. This life period is characterized by heightened psychosocial stress, increased engagement in risk-taking behaviors, and dietary experimentation, all of which have been implicated as potential triggers or exacerbating factors in RAU (Conejero Del Mazo et al 2023). In addition to psychosocial influences, local environmental factors such as repeated mechanical trauma and irritation of the oral mucosa remain important precipitants of RAU episodes (Conejero Del Mazo et al 2023; Stoopler et al 2024). Emerging mechanistic studies further suggest that complex inflammatory pathways, including PANoptosis-related forms of programmed cell death, may mediate the link between bacterial stimuli, epithelial damage, and ulcer formation. Such processes provide a biologically plausible interface by which environmental exposure interacts with host genetic susceptibility (Riveros-Gomez et al 2024).
These findings have several important implications. First, heritability estimates define the upper limit of predictive power that genetic information can provide, thereby informing the development of future predictive genetic panels. Our findings support a model in which polygenic susceptibility creates a vulnerable mucosal and immune milieu that can be unmasked or amplified by psychological stress, lifestyle factors, local trauma, and microbial stimuli during critical developmental windows, particularly early adulthood. Second, by identifying which phenotypic features of RAU’s natural history are most strongly influenced by genetic makeup, our findings can guide the design of more targeted therapies and personalized management strategies for the prevention and treatment of RAU (Mayhew and Meyre 2017; Robette et al 2022). In this context, the higher heritability of RAU frequency and ulcer-free intervals suggests that these traits may require different therapeutic targets or intervention strategies compared with size-related features, which appear more environmentally determined.
Certain limitations must be acknowledged. First, the TwinsUK cohort is composed predominantly of healthy, middle-aged, White British women, and RAU heritability may differ among demographic groups. Although sex-specific differences in heritability could arise through interactions among genetic factors, sex hormones, and environmental factors, current large-scale analyses indicate that most complex traits show broadly comparable heritability in males and females, with only modest deviations (Sharma et al 2025). RAU typically begins in young adulthood, which is a group underrepresented in our sample, and evidence on ethnic variation in RAU heritability is extremely limited. Consequently, our estimates should be interpreted as applying primarily to this demographic profile, and their generalizability to other ages, sexes, and ethnic backgrounds remains uncertain.
Second, RAU status was based on self-report, which is susceptible to measurement error and reporting bias. Clinical confirmation at each episode is impractical in large population-based studies, as RAU is unpredictable in onset and typically resolves within 1 to 2 wk. Self-reporting is therefore a necessary and pragmatic approach. To minimize misclassification, we used a structured questionnaire designed to differentiate RAU from other oral lesions (Tappuni et al 2013), and previous research has demonstrated strong concordance between validated questionnaires and clinical measurements for RAU (Baccaglini et al 2013). Although it is theoretically possible that MZ twins may report more concordantly than DZ twins due to shared reporting tendencies rather than shared genetic liability, thus potentially inflating heritability estimates, evidence indicates that self-reported traits typically yield lower heritability estimates than clinically measured traits (Macgregor et al 2006). Furthermore, our 1-y prevalence estimate (9%) aligns with population-based estimates (5% to 20%), supporting the overall validity of our RAU assessment (Conejero Del Mazo et al 2023; Stoopler et al 2024). Finally, the validity of heritability estimates in the classical twin design relies on assumptions of equal environments for MZ and DZ twins, random mating, no gene–environment interaction, and that twins represent the general population (Boomsma et al 2002; Barry et al 2023; Hagenbeek et al 2023). While violations of these assumptions can inflate heritability estimates, empirical evidence suggests that such biases are generally modest across behavioral, psychological, and health traits (Conley et al 2013; Felson 2014).
Conclusion
This study establishes that genetic factors make a major contribution to variation in RAU among female twins in the United Kingdom. The differing heritability estimates across RAU subphenotypes indicate distinct biological pathways driving susceptibility and clinical expression. Overall, the findings affirm a multifactorial model in which genetic predisposition interacts with environmental exposures to produce the clinical heterogeneity characteristic of RAU.
Author Contributions
A.R. Tappuni and E. Bernabe contributed to conception and design, data acquisition, analysis, and interpretation, and drafted and critically revised the manuscript. Both authors gave their final approval and agreed to be accountable for all aspects of the work.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: TwinsUK is funded by the Wellcome Trust, Medical Research Council, Arthritis UK, European Union Horizon, Chronic Disease Research Foundation (CDRF), Wellcome Leap Dynamic Resilience Programme (co-funded by Temasek Trust), ZOE Ltd, the National Institute for Health and Care Research (NIHR) Research Delivery Network (RDN), and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London.
