Abstract
Objectives
To further demonstrate the validity of the Affordable, Rapid, Olfactory Measurement Array (AROMA), an essential oil−based smell test, and to compare it with the Sniffin’ Sticks 12 Test (SST12).
Study Design
Prospective cross-sectional study.
Setting
Academic medical center.
Methods
Fifty healthy individuals without sinonasal disease were recruited. AROMA has been previously validated against the University of Pennsylvania Smell Identification Test. The current study tests 2 additional higher concentrations to increase the ability to detect olfactory reserve. Participants completed AROMA, SST12, the Sino-Nasal Outcome Test (SNOT-22), and the Questionnaire of Olfactory Disorders (QoD). Spearman correlations were used to evaluate associations among AROMA, SST12, SNOT-22, and QoD scores.
Results
AROMA demonstrated strong test-retest reliability (
Conclusion
AROMA has a moderate correlation with SST12. AROMA is more strongly correlated than SST12 with age and SNOT-22. AROMA’s stronger correlation with subjective olfactory status, low cost, and adaptability may help remove barriers to routine olfactory testing in the clinic.
A wide range of electrophysiological and psychophysical tests have been developed to assess olfaction and olfactory dysfunction (OD). In the past few decades, the University of Pennsylvania Smell Identification Test (UPSIT)1,2 and the Sniffin’ Sticks Test (SST)3,4 have become the 2 most commonly used olfactory tests in research. The UPSIT involves a multistep scratch-and-sniff smell identification procedure that relies on intact executive functioning. The extended SST (112 sticks), which includes smell threshold, discrimination, and identification testing, takes an hour to complete in neurocognitively intact individuals and costs approximately $1000. The Screening SST-12 test (12 sticks) (SST12) is shorter and costs approximately $400. These olfaction assessment methods have been studied extensively in research trials and show promise as tools for diagnosing anosmia,5-10 monitoring neurocognitive symptoms,11-15 and predicting endoscopic sinus surgery outcomes,16-18 and they have applications in many other conditions. Despite over 2 decades of research, 2 issues remain: olfactory testing is not routinely performed in clinics, and there is no consensus on the best olfactory test.
Barriers to widespread adoption of current olfactory assessment methods may include expense, complexity, and limited utility of static odor concentrations. First, studies have acknowledged the overly complicated evaluation of SST’s olfaction threshold component 19 and the time-consuming procedure of the full odor discrimination, threshold, and identification test.20,21 Second, UPSIT’s multistep scratch-and-sniff smell identification procedure relies heavily on intact cognitive function. Studies have shown that identification is dependent on semantic memory and higher executive functioning, whereas olfactory detection (determining the presence of a scent) does not require intact higher cognitive processing.22,23 Olfactory detection is also a prerequisite for intact olfactory identification. 24 Olfactory identification results are likely confounded by reduced executive functioning and loss of higher cognitive brain functioning, which limits utility in neurocognitive diseases. 22 Third, UPSIT and SST12 both use odors at suprathreshold intensities, which precludes utility in assessing minor olfactory losses. Other studies have attempted to capture patients with minor losses by expanding the 16-item SST to additional scents. 25 No point-of-care olfactory test suitable for routine clinical practice assesses scents at different concentrations.
We previously developed an essential oil−based Affordable, Rapid, Olfactory Measurement Array (AROMA) for point-of-care olfactory testing. AROMA uses multiple concentrations of odorants, tests both scent detection and identification, and allows its scents to be modified. This combination of features was chosen intentionally to circumvent known issues with other available tests of olfaction. Our previous study 26 demonstrated favorable correlation (
Methods
This project was reviewed and approved by the University of Kansas Institutional Review Board prior to commencement of study activities. All participants signed an approved informed consent document.
Materials
AROMA comprises 14 scents, each prepared at several concentrations. The olfactory testing methodology is fully described in our prior study. 26 Two additional higher concentrations were added to increase the ability to detect olfactory reserve in individuals. These concentrations allowed us to capitalize on our ability to titrate odorant concentration to address the known OD that accompanies both aging and neurocognitive disease. The essential oils were diluted at 4 concentrations (1×, 2×, 4×, 8×), and the selected dilutions were applied in uniform amounts to aromatherapy inhalant sticks. The 14 scents at 4 concentrations comprise a full battery of 56 inhalant sticks. However, not every individual is presented with every stick. Each individual begins at the 2× concentration; all scents at a particular concentration are completed before moving to the next round of testing at a different concentration. The order of odors is randomized prior to presentation. A correct response requires both correctly stating that an odor is present (scent detection is measured as “percent detected”) and correctly selecting the present odor among 4 multiple choices (scent identification is measured as “percent correct”). If the individual responds incorrectly, the next higher concentration of that scent is added to the remaining lot of inhalant sticks. A correct response at the 2× concentration results in the individual being presented with the 1× concentration, and correct responses at the 4× and 8× concentrations are assumed (Figure 1). As such, the maximum number of inhalant sticks presented to an individual is 42 (at most 3 sticks for each of the 14 scents). Figure 2 shows the scoring methodology.
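The presentation rules described above can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the authors' protocol software: the function name `administer` and the `respond` callback are hypothetical, the round-by-round ordering and randomization are omitted, and two rules that the text does not state are assumed and flagged in the comments (a miss at 2× is credited as a miss at 1×, and a correct response at 4× is credited as correct at 8×).

```python
from typing import Callable, Dict, List, Tuple

N_SCENTS = 14  # number of scents in the battery, per the text

Stick = Tuple[int, int]  # (scent index, relative concentration 1, 2, 4, or 8)


def administer(respond: Callable[[int, int], bool],
               n_scents: int = N_SCENTS) -> Tuple[List[Stick], Dict[Stick, bool]]:
    """Simulate which sticks one participant receives.

    respond(scent, conc) returns True for a correct response, meaning the
    odor was both detected and identified. Returns the list of presented
    sticks and the credited result for every (scent, concentration) pair.
    """
    presented: List[Stick] = []
    credited: Dict[Stick, bool] = {}

    def present(scent: int, conc: int) -> bool:
        presented.append((scent, conc))
        credited[(scent, conc)] = respond(scent, conc)
        return credited[(scent, conc)]

    for scent in range(n_scents):
        if present(scent, 2):
            # Stated in the text: correct at 2x means 4x and 8x are
            # assumed correct, and the 1x stick is presented.
            credited[(scent, 4)] = True
            credited[(scent, 8)] = True
            present(scent, 1)
        else:
            # ASSUMPTION (not in the text): a miss at 2x counts as a miss at 1x.
            credited[(scent, 1)] = False
            if present(scent, 4):
                # ASSUMPTION (by analogy with the 2x rule): 8x credited as correct.
                credited[(scent, 8)] = True
            else:
                present(scent, 8)
    return presented, credited


# A participant who misses everything receives the stated maximum of 42 sticks.
sticks_all_wrong, _ = administer(lambda scent, conc: False)
# A participant who is always correct receives the 2x and 1x stick of each scent.
sticks_all_right, _ = administer(lambda scent, conc: True)
```

Under these assumptions a participant who answers every stick incorrectly is shown 3 sticks per scent (2×, 4×, 8×), which reproduces the maximum of 42 sticks given in the text.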

Figure 1. Affordable, Rapid, Olfactory Measurement Array test design and administration.

Figure 2. Affordable, Rapid, Olfactory Measurement Array scoring methodology. Round 1: odors correctly identified at 2× are assumed to be correctly identified at 4× and 8×. Rounds 2 to 4: points given to correctly identified odors.
Participants completed the SNOT-22, the negative statements portion of the QoD (QoD-NS), SST12, and AROMA.
The SNOT-22 is the most commonly used instrument for measuring sinus symptomatology. It consists of 22 questions, and each question is scored 0 to 5 points. The total range of scores is 0 to 110. Higher scores represent worse sinus symptoms.
The QoD measures the impact of OD on daily life. The negative statements (NS) portion of the QoD consists of 17 items. Each question is scored 0 to 3 points, and the total range of points is 0 to 51. The QoD was coded such that higher scores represent a worse impact of olfactory impairment on quality of life.
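The score ranges given for the two questionnaires follow directly from the item counts and per-item maxima (22 × 5 = 110 and 17 × 3 = 51). The sketch below makes that arithmetic explicit. The function `total_score` and the two configuration dictionaries are hypothetical names introduced for illustration; the recoding step that makes higher QoD scores mean worse impairment is not shown, because the text does not describe it.

```python
from typing import Sequence

# Item counts and per-item maxima as stated in the text.
SNOT22 = {"n_items": 22, "max_per_item": 5}   # total range 0 to 110
QOD_NS = {"n_items": 17, "max_per_item": 3}   # total range 0 to 51


def total_score(responses: Sequence[int], n_items: int, max_per_item: int) -> int:
    """Sum item responses after checking the count and the per-item range."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(not 0 <= r <= max_per_item for r in responses):
        raise ValueError(f"each response must be between 0 and {max_per_item}")
    return sum(responses)


# Worst possible questionnaires reproduce the stated upper bounds.
snot22_max = total_score([5] * 22, **SNOT22)
qod_ns_max = total_score([3] * 17, **QOD_NS)
```

Validating the item count and range before summing guards against a partially completed form being read as a low (that is, apparently healthy) total.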
SST12 consists of 12 felt-tipped pens with 12 different, suprathreshold odorants. Each stick is presented to the participant for a short time under each nostril; then the participants are asked to select the scent among 4 multiple choices. 19 For the purpose of comparing AROMA to SST12, a question asking participants whether or not they detected a scent (yes/no) prior to the forced multiple choice was added to the normal SST12 protocol.
All data were captured and stored in REDCap. 27
Participant Population
Healthy individuals aged 18 to 90 years without sinonasal disease were prospectively recruited to complete the study. A subset of these participants volunteered to return for a follow-up visit 48 hours to 2 weeks after the initial visit as part of a test-retest protocol. All volunteers provided informed consent prior to the study. All tests were administered in a proctored setting. Before consent was obtained, participants were asked whether they had any subjective OD or current upper respiratory infection symptoms, and those who did were excluded from the study. In addition, individuals with documented anosmia secondary to known surgical removal or agenesis of the olfactory apparatus, a history of never being able to detect smell, suspected malingering, neurocognitive or psychiatric disorders, or a history of sinonasal inflammatory disease (eg, chronic sinusitis) were excluded from the study.
Statistical Analysis
Study data were collected and managed using REDCap 27 electronic data capture tools hosted at the University of Kansas Medical Center. Data were analyzed with SPSS version 24 (SPSS, Inc). Demographic statistics were reported with median and interquartile range. Spearman ρ was used to determine the degree of correlation between AROMA and SST12 odor detection and identification and age, QoD, and SNOT-22. The test-retest reliability coefficient for AROMA was assessed in the healthy cohort using the Pearson correlation coefficient between the initial and follow-up AROMA scores. AROMA and SST12 odor detection and identification scores were reported using median and interquartile range. In addition, AROMA scores at the 4 concentrations were reported with median and interquartile range.
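For readers who want to see what these statistics compute, the following is a minimal standard-library sketch of the Pearson coefficient, Spearman ρ (Pearson applied to tie-averaged ranks), and the median with interquartile range. It is not the authors' analysis, which was run in SPSS; all function names are hypothetical, and `statistics.quantiles` with its default method may place quartiles slightly differently from SPSS.

```python
from math import sqrt
from statistics import mean, median, quantiles
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho: the Pearson coefficient of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))


def median_iqr(values: Sequence[float]) -> Tuple[float, Tuple[float, float]]:
    """Median and (Q1, Q3), using the default 'exclusive' quantile method."""
    q1, _, q3 = quantiles(values, n=4)
    return median(values), (q1, q3)
```

Because Spearman ρ depends only on rank order, it is a natural choice for bounded percentage scores that cluster near the ceiling, whereas the Pearson coefficient used for test-retest reliability compares the raw score values from the two visits.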
Results
Population Completing AROMA and SST12
Fifty participants completed both AROMA and SST (descriptive demographics in Table 1). The sex distribution was predominately male: 22% were female and 78% were male. However, distribution of AROMA scores was not significantly different between the sexes (
Table 1. Participant Characteristics and Demographics.
Abbreviations: IQR, interquartile range; QoD, Questionnaire of Olfactory Disorders; SNOT-22, Sino-Nasal Outcome Test; SST, Sniffin’ Sticks Test.
Test-Retest of AROMA
A subset of 20 participants volunteered to complete the AROMA test-retest protocol. Average age of this cohort was 42 years (95% CI, 35-50 years). The AROMA score (percentage of correct identification) remained relatively stable between the 2 visits: first visit (87.2%) vs second visit (90.9%). Pearson test-retest reliability coefficient for AROMA was strong (
AROMA vs SST12 Comparison
Spearman ρ correlation showed a moderate correlation between AROMA and SST12 correct identification (
Correlation of Objective Olfactory Tests to Subjective Olfactory Tests.
Abbreviations: AROMA, Affordable, Rapid, Olfactory Measurement Array; QoD, Questionnaire of Olfactory Disorders; SNOT-22, Sino-Nasal Outcome Test; SST, Sniffin’ Sticks Test.
Spearman correlation.
Scent Detection vs Identification for AROMA and SST12
The median AROMA identification score was 82%, whereas the median SST identification score was 92% (Figure 3). As expected, median values for scent detection were about 9% higher than those for scent identification for both AROMA and SST. The SST detection range was very narrow (92%-100%), whereas the AROMA detection range was broader (55%-100%).

Figure 3. Box plots of Affordable, Rapid, Olfactory Measurement Array (AROMA) and Sniffin’ Sticks Test (SST) scents detected and identified in 50 participants. SST detection values were 100% at the maximum and at all quartiles.
AROMA Concentrations
As expected, median performance on AROMA increased with increasing concentration (Figure 4). There was no difference in median performance between the 4× and 8× concentrations. The largest difference between 2 adjacent concentrations was between 2× and 4×; median performance increased by almost 25% from 2× to 4×.

Figure 4. Box plots of Affordable, Rapid, Olfactory Measurement Array scent identification rates by concentration in 50 participants.
Discussion
This was a prospective study comparing AROMA to SST12 in healthy individuals. AROMA was designed to address some of the limitations of current olfactory tests and uses 4 concentrations of odorant and tests both olfactory detection and identification. Test-retest of AROMA was
AROMA uses multiple concentrations, allowing adaptability for specific disease states and the anticipated magnitude of OD. This study showed that AROMA reflects subjective smell loss better than SST12, as indicated by its stronger correlation with SNOT-22 scores. Prior studies show that older adults report OD 28 and are particularly vulnerable to increased frailty and malnutrition; in fact, OD is associated with frailty and reduced survival. 29 Surprisingly, SST12 was only weakly correlated with age and poorly reflected the OD that is expected with increasing age. AROMA showed a moderate inverse correlation between age and olfactory performance. AROMA may be a more appropriate test for OD given its stronger correlation with subjective olfactory status, which improves its utility in daily practice.
AROMA’s testing of both scent detection and identification may broaden the applicability of olfactory testing to diseases with neurocognitive deficits. Olfactory impairment is a well-observed phenomenon that precedes neurocognitive decline in Alzheimer’s disease (AD) by many years. Because olfactory identification (identifying the name of a scent) requires both intact olfactory detection (determining the presence of a scent) and higher cognitive functioning, it is important to have an olfactory test that measures both detection and identification.22-24 Both UPSIT and SST12 measure only olfactory identification. Although reduced performance on olfactory identification with UPSIT and SST is certainly associated with AD, 30 results are likely confounded by reduced executive functioning and loss of higher cognitive brain functioning. 29 AROMA’s ability to test both olfactory detection and identification gives clinicians a better ability to distinguish deficits in semantic memory from impairments in higher executive functioning. In this study, AROMA scent detection showed a spread of scores similar to that of SST12 scent identification. As expected, median detection scores were roughly 10% higher than identification scores. This additional information may yield meaningful data in future studies of the neurocognitive population. Moreover, the range of scent concentrations tested in AROMA addresses the inability of SST12 and UPSIT to detect minor olfactory loss. Other studies have had to expand SST12 to other scents in order to evaluate minor losses. 25 This feature of AROMA makes it possible for future studies to assess correlations between a wider range of olfactory losses and types of neurocognitive impairment.
The implementation of AROMA in clinics may aid in overcoming the previously mentioned barriers to the regular use of olfactory tests in the evaluation of neurocognitive disorders and has the potential to improve conventional approaches. The goal of AROMA is to provide another valuable tool to aid clinicians in the early detection of neurocognitive disorders.
This study is not without limitations. The study population was predominantly male; however, the distribution of AROMA scores was not significantly different between the sexes. In addition, only odorants available in essential oil format were used. Identification of some scents might depend on cultural familiarity with the odorant. This study did not include participants with any known disease etiologies affecting olfaction. Follow-up studies evaluating AROMA in specific disease states are ongoing. As this was a pilot study of a novel olfactory testing methodology, we are unable to establish score ranges that would classify individuals as normosmic, hyposmic, or anosmic. As we continue to enroll individuals in subsequent studies, we will gather a larger population that will enable normalization of AROMA results.
Conclusion
Currently available olfactory tests are complex, reliant on static concentrations, and expensive. These barriers may limit usage in funded research, clinical utility, and self-assessment. AROMA offers a possible solution while maintaining clinical significance. AROMA yielded comparable results to the SST12 test (
Acknowledgements
We thank Bryan Humphrey, clinical research coordinator; Joseph Penn, otolaryngology research fellow; and medical students Madeleine St. Peter, Patrick Kim, Cody Uhlich, Luke Bontrager, and Chelsea Moore for their aid with participant recruitment and testing.
