
Abstract

Low-cost indoor air quality (IAQ) sensors offer new opportunities for real-time monitoring in the built environment by occupants and researchers. However, their performance can vary substantially depending on the environmental conditions. This study presents a comprehensive evaluation of carbon dioxide (CO₂) and fine particulate matter (PM_2.5) measurements from two consumer-grade low-cost sensors (the Airthings View Plus for CO₂ only and Air Gradient Pro for CO₂ and PM_2.5) through co-location tests with two reference instruments, Graywolf DSII-8 for CO₂ and Lighthouse Handheld 3016 for PM_2.5. Using time-series analysis, linear regression, Pearson correlation, Root-Mean Squared Error (RMSE), Bland-Altman analysis, and paired t-tests, we assess the precision and accuracy of these sensors. For CO₂ at a 5-minute sampling interval, the Air Gradient sensor had a higher coefficient of determination (R²), stronger Pearson correlation, and narrower range of limits of agreement (LoAs), but higher bias (i.e. the mean difference) and RMSE, suggesting higher precision but lower accuracy when compared to Airthings. As a result, it can perform well for tracking the relative changes in CO₂, though it is less suited to measuring absolute concentrations without calibration. For PM_2.5, the Air Gradient also had a relatively high R² (0.79), a moderately strong Pearson correlation (ρ = 0.69, p < 0.05), a narrow range of LoAs (30.1 μg/m³), and a low RMSE (5.8 μg/m³). Averaging the 5-minute measurements over 30-minute intervals generally improved the accuracy and precision of both sensors. However, statistically significant differences from the reference instruments remained for both sensors. Overall, this study offers a multi-metric assessment of consumer-grade sensors and highlights the need for in-situ calibration prior to long-term deployment.

Keywords

indoor air quality, low-cost sensors, LCS, particulate matter

Introduction

Canadians spend approximately 90% of their time indoors (Government of Canada, 2021), making indoor environment quality critical for human health and well-being. Real-time assessment of indoor environmental conditions and pollutant concentrations is therefore essential for understanding occupants’ exposure and informing strategies for indoor environmental quality improvement.

Low-cost sensors are effective tools to monitor indoor pollutant concentrations, with growing applications in both consumer use and experimental research. They are widely accessible to consumers and are more affordable than research-grade instruments, making them suitable for real-time monitoring of indoor air quality (IAQ) or large-scale deployment. In addition, they are portable and often easy to deploy, allowing occupants to keep track of their indoor air quality and take steps to reduce their exposure to indoor pollutants. In this paper, the performance of two low-cost sensors, Airthings View Plus (“Airthings”) and Air Gradient Pro (“Air Gradient”), is assessed through co-location testing with higher-accuracy research-grade instruments.

Background

The US EPA defines a low-cost sensor as one costing between $100 and $2500 USD (Williams et al., 2014). However, the higher end of this range is likely inaccessible to many consumers. To address this, previous research has proposed a more practical upper limit of $500 USD (Zhang and Srinivasan, 2020). There is a wealth of literature on the performance of low-cost sensors. Figure 1 summarizes the indoor air and environmental quality parameters commonly assessed, based on a review of 40 studies published between January 2018 and December 2023. These studies were identified through Google Scholar, Scopus, and the Toronto Metropolitan University library database using various combinations of keywords including “IAQ,” “indoor air quality,” “low-cost sensor,” “performance,” “laboratory,” “field,” and “evaluating.” The two most commonly evaluated parameters in the literature were fine particulate matter (PM_2.5) and carbon dioxide (CO₂) concentrations. Among the 40 studies reviewed:

34 included PM testing (Afroz et al., 2023; Baldelli, 2021; Baptista et al., 2022; Chen et al., 2020; Collingwood et al., 2019; Coulby et al., 2021; Curto et al., 2018; Demanega et al., 2021; Gillooly et al., 2019; Jayaratne et al., 2020; Kaliszewski et al., 2020; Kim et al., 2023; Konstantinou et al., 2022; Li et al., 2018; Liu et al., 2020; Manibusan and Mainelis, 2020; Moreno-Rangel et al., 2018; Palmisani et al., 2021; Pei et al., 2023; Reis et al., 2023; Schalm et al., 2022; Shen et al., 2021; Singer and Delp, 2018; Taştan, 2022; Tiele et al., 2018; Tryner et al., 2021; Wang et al., 2019, 2020; Zamora et al., 2020; Zhang and Srinivasan, 2020; Zheng et al., 2022; Zou et al., 2020) and

16 included CO₂ testing (Afroz et al., 2023; Baldelli, 2021; Baptista et al., 2022; Coulby et al., 2021; Demanega et al., 2021; Gillooly et al., 2019; Konstantinou et al., 2022; Marinov et al., 2021; Moreno-Rangel et al., 2018; Palmisani et al., 2021; Reis et al., 2023; Taştan, 2022; Thomas et al., 2019; Tiele et al., 2018; Tryner et al., 2021; Zheng et al., 2022).

Figure 1.

Most commonly evaluated parameters among 40 studies published between Jan 2018 and Dec 2023.

Temperature and relative humidity (RH) performance were also commonly assessed (Afroz et al., 2023; Baptista et al., 2022; Coulby et al., 2021; Demanega et al., 2021; Kim et al., 2023; Konstantinou et al., 2022; Moreno-Rangel et al., 2018; Reis et al., 2023; Zheng et al., 2022).

Understandably, accurate measurements of PM_2.5 and CO₂ are important due to their implications for health and IAQ assessment. Exposure to elevated levels of PM_2.5 has been linked to a range of adverse health effects including increased respiratory symptoms, decreased lung function, and premature death in individuals with heart or lung conditions (United States Environmental Protection Agency, 2024). Meanwhile, CO₂ serves as a useful proxy for indoor ventilation levels, and exposure to high concentrations may increase the risk of decreased cognitive function and neurophysiological symptoms (Government of Canada, 2021).

While many studies have evaluated the measurement performance of low-cost sensors for CO₂ and PM_2.5, there remains a lack of standardized methodologies for sensor performance evaluation (Karagulian et al., 2019). In this study, we apply six analysis methods to assess the performance of two low-cost sensors and compare the results across these methods, to support the development of a more comprehensive evaluation procedure. This is accomplished by analyzing the accuracy and precision of sensors using data collected through co-location testing with reference instruments in an occupied house.

Data collection

Description of sensors

Co-location tests were conducted to collect CO₂ and PM_2.5 measurements using the Airthings and Air Gradient with two reference instruments, the Graywolf DSII-8 (Graywolf) for CO₂ and the Lighthouse Handheld 3016 (Lighthouse) for PM_2.5. Both instruments were calibrated by their manufacturers and have been used in previous peer-reviewed research as reference instruments (Baptista et al., 2022; Demanega et al., 2021; Moreno-Rangel et al., 2018; Zhang et al., 2022). In addition to their demonstrated accuracy, both reference instruments were selected for their portability and lower cost compared to some higher-grade alternatives.

The Graywolf DirectSense II CO₂ sensor (sensor model SEN-SMTX-CO2) is a dual-wavelength non-dispersive infrared (NDIR) sensor with a measurement range of 0–10,000 ppm. According to the manufacturer, it is factory-calibrated over five reference points to optimize performance in the 350–2000 ppm range, achieving an accuracy of ±35 ppm within that range. Between 2000 and 7000 ppm, the reported accuracy is ±3% of reading. While a formal calibration certificate was not available, this sensor was used as received and is commonly employed in indoor air quality and occupational exposure studies. Its accuracy and reliability have also been supported in prior peer-reviewed research (e.g. Baptista et al., 2022; Demanega et al., 2021; Moreno-Rangel et al., 2018).

The Lighthouse instrument reports particle number concentrations across five size bins (0.3–0.5, 0.5–1.0, 1.0–2.5, 2.5–5.0, and 5.0–10.0 µm) and collects raw data at one-second intervals (Lighthouse Worldwide Solutions, n.d.). The instrument was calibrated on May 23, 2022, in Medford, Oregon using certified monodisperse particles, in accordance with ISO 21501-4:2018 (Calibration Certificate 44704220544022). The device demonstrated a counting efficiency of 48.0% for 0.303 µm particles and 102.1% for 0.45 µm particles.

The Airthings is a wireless and battery-operated low-cost sensor. It was selected as it is widely used in the North American market (Airthings, 2024). It uses a non-dispersive infrared (NDIR) sensor for CO₂ measurements from 400 to 5000 ppm at 5-minute intervals, with an accuracy of ±50 ppm + 3% of reading (Airthings, n.d.). For PM_2.5 measurements, it uses a laser scattering optical particle sensor (Cubic PM2105L) and converts the optical signal directly to PM_2.5 mass concentrations in μg/m³ using internal algorithms calibrated with a Grimm reference instrument using cigarette smoke (Evotech Air Quality, 2023). The Airthings sensor can measure PM_2.5 from 0 to 500 μg/m³ at 10-minute intervals, with an accuracy of ±5 μg/m³ + 15% of reading when PM_2.5 is below 150 μg/m³, and ±5 μg/m³ + 20% of reading when PM_2.5 exceeds 150 μg/m³. Unfortunately, during our experiment, the Airthings sensor was unable to log PM_2.5 data due to a malfunction. As a result, only the CO₂ measurement performance of the Airthings sensor was evaluated in this study.

The Air Gradient sensor uses the Plantower PMS5003 sensor for PM_2.5 measurements (Air Gradient, n.d.). The sensor measures concentrations from 0 to 500 μg/m³ with an accuracy of ±10 μg/m³ below 100 μg/m³, and ±10% of reading between 100 and 500 μg/m³. For CO₂ measurement, the Air Gradient sensor is equipped with a SenseAir S8 NDIR sensor (Air Gradient, n.d.). While the manufacturer’s specifications indicate that the sensor may be either the SenseAir S8 or S88 model, the unit tested in this study was confirmed to use the S8. It measures CO₂ concentrations from 400 to 2,000 ppm with an accuracy of ±40 ppm + 3% of reading. The Air Gradient sensor records both PM_2.5 and CO₂ data every 30 seconds. This sensor was selected primarily for its open-access software and hardware, which provides greater flexibility for research applications.

To compare to the PM_2.5 mass concentrations measured by the two low-cost sensors, particle number concentrations measured by the Lighthouse were converted to mass concentrations assuming spherical particles and a constant particle density of 1.7 g/cm³. This density is commonly used by research-grade instruments such as the Grimm MiniWRAS (Durag Group, n.d.) and has been applied in previous studies (Li et al., 2016; Li and Siegel, 2020). The number concentrations in each bin were first converted to volume concentrations based on equations (1.1) and (1.2), then to mass concentrations using equations (1.3) and (1.4). PM_2.5 concentrations were then estimated by summing the converted mass concentrations from the first three bins (0.3–0.5, 0.5–1, and 1–2.5 μm). Similarly, PM₅ and PM₁₀ mass concentrations were estimated by summing the first four and all five bins, respectively, for further comparisons in this study.

r = \sqrt{\text{upper limit} \times \text{lower limit}}

(1.1)

V = \frac{4}{3} π r^{3} \times N

(1.2)

M = V \times ρ

(1.3)

C = M / V

(1.4)

where r is the geometric mean of the particle diameter based on the upper and lower limits of each bin, V is the total particle volume, N is the number concentration in each bin, ρ is the assumed particle density, M is the total mass of particle for each bin, and C is the calculated mass concentration for each bin.
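The conversion in equations (1.1) to (1.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bin limits are taken to be particle diameters and are halved to radii before the geometric mean is formed, N is taken to be a number concentration in particles per m³ of air (so the product in (1.3) is already a mass per unit air volume), and the example counts are hypothetical rather than measured values. The function and constant names are illustrative and do not come from the study.

```python
import math

# Lighthouse bin limits in µm, as listed in the text (assumed to be diameters).
BIN_LIMITS_UM = [(0.3, 0.5), (0.5, 1.0), (1.0, 2.5), (2.5, 5.0), (5.0, 10.0)]
DENSITY_G_PER_CM3 = 1.7          # constant particle density assumed in the study
UG_PER_UM3_PER_G_CM3 = 1e-6      # unit conversion: 1 g/cm³ equals 1e-6 µg/µm³


def bin_mass_concentration(lower_um, upper_um, number_per_m3,
                           density=DENSITY_G_PER_CM3):
    """Mass concentration (µg/m³) of one size bin, following (1.1)-(1.3)."""
    # Assumption: the limits are diameters, so halve them to obtain radii.
    r = math.sqrt((upper_um / 2.0) * (lower_um / 2.0))        # (1.1), µm
    volume = (4.0 / 3.0) * math.pi * r ** 3 * number_per_m3   # (1.2), µm³/m³
    return volume * density * UG_PER_UM3_PER_G_CM3            # (1.3), µg/m³


def pm_from_counts(counts_per_m3, n_bins):
    """Sum the first n_bins bins: 3 gives PM2.5, 4 gives PM5, 5 gives PM10."""
    return sum(
        bin_mass_concentration(lo, hi, n)
        for (lo, hi), n in zip(BIN_LIMITS_UM[:n_bins], counts_per_m3)
    )


# Hypothetical number concentrations (particles per m³), one per bin.
example_counts = [1e8, 1e7, 1e6, 1e5, 1e4]
pm25 = pm_from_counts(example_counts, 3)
pm5 = pm_from_counts(example_counts, 4)
pm10 = pm_from_counts(example_counts, 5)
```

Because each additional bin can only add mass, the three estimates are ordered PM_2.5 ≤ PM₅ ≤ PM₁₀, which is a quick sanity check on any implementation of the conversion.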

Measurement setup

To evaluate the performance of the Airthings and Air Gradient sensors, co-location tests were conducted with the two reference instruments on a desk in a home office of a residence in downtown Toronto, Canada. Figure 2 shows the setup of the sensors during the testing periods. The daily activities of the occupants, including the use of the home office, were not disrupted by the deployment of these devices, and the device locations remained unchanged throughout each test. CO₂ data were collected from May 23 to May 25, 2023, for both low-cost sensors and the Graywolf, and PM_2.5 data were collected from July 17 to July 19, 2023, using only the Air Gradient and Lighthouse. During the test periods, the only CO₂ source was occupants’ exhalation in the house. PM_2.5 was primarily generated from typical indoor sources such as resuspension, cleaning, and cooking, with possible contributions from the infiltration of outdoor PM_2.5. All measurements recorded by the Air Gradient, Graywolf, and Lighthouse were converted to 5-minute averages so that they could be compared directly with the Airthings. A total of 396 measurements and 516 measurements at 5-minute intervals were collected for CO₂ and PM_2.5, respectively.
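The conversion of higher-frequency readings to 5-minute averages can be illustrated with a simple block average. The sketch below assumes evenly spaced readings with no gaps and blocks aligned to the first reading; the study does not describe how timestamps were aligned, so this is one plausible reading rather than the authors' procedure. The function name and the sample values are hypothetical.

```python
from statistics import fmean


def block_average(values, block_size):
    """Average consecutive, non-overlapping blocks of `block_size` readings.

    A trailing partial block is dropped so every mean covers a full interval.
    """
    n_full = len(values) // block_size
    return [
        fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(n_full)
    ]


# Hypothetical 30-second CO2 readings (ppm): ten readings span five minutes.
readings_30s = [500.0 + i for i in range(20)]
five_min = block_average(readings_30s, block_size=10)
```

The same helper with `block_size=6` would turn 5-minute values into 30-minute averages.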

Figure 2.

Co-location test set-up showing: (1) the Lighthouse Handheld 3016, (2) the Graywolf DSII-8, (3) the Air Gradient Pro, and (4) the Airthings View Plus sensors.

Analysis methods

A total of six analysis methods were selected primarily based on the findings from the aforementioned literature review. We identified that some of the most commonly used analysis methods included:

time-series visual analysis (Konstantinou et al., 2022; Palmisani et al., 2021; Reis et al., 2023; Schalm et al., 2022; Singer and Delp, 2018; Taştan, 2022; Thomas et al., 2019),

linear regression with the coefficient of determination (R²; Afroz et al., 2023; Baldelli, 2021; Chen et al., 2020; Collingwood et al., 2019; Curto et al., 2018; Demanega et al., 2021; Gillooly et al., 2019; Kim et al., 2023; Konstantinou et al., 2022; Liu et al., 2020; Manibusan and Mainelis, 2020; Marinov et al., 2021; Moreno-Rangel et al., 2018; Shen et al., 2021; Singer and Delp, 2018; Taştan, 2022; Tryner et al., 2021; Wang et al., 2019, 2020; Zamora et al., 2020; Zhang and Srinivasan, 2020; Zheng et al., 2022; Zou et al., 2020), and

Pearson correlation tests (Coulby et al., 2021; Demanega et al., 2021; He et al., 2020; Kaliszewski et al., 2020; Liu et al., 2020; Manibusan and Mainelis, 2020; Marinov et al., 2021; Reis et al., 2023; Singer and Delp, 2018; Taştan, 2022; Tryner et al., 2021).

Root-Mean Squared Error (RMSE) has become more commonly used in recent studies (Baptista et al., 2022; Gillooly et al., 2019; Li et al., 2018; Reis et al., 2023; Tryner et al., 2021; Zamora et al., 2020; Zheng et al., 2022). Although Bland-Altman plots are less commonly used for low-cost sensor assessment (Baldelli, 2021; Coulby et al., 2021; Curto et al., 2018; Moreno-Rangel et al., 2018; Reis et al., 2023), they are an effective way to quantify the agreement between two quantitative measures and were therefore included. Similarly, paired t-tests were included to examine if the mean differences between the low-cost sensors and the reference instruments were significant.

Using the six analysis methods, the performance of the Airthings and Air Gradient sensors was evaluated by comparing their 5-minute concentration measurements with the reference instruments. In addition, the measurements for both CO₂ and PM_2.5 were averaged over 30-minute intervals to assess the impact of sampling frequency on sensor performance. Details of these analysis methods are provided below.

Linear regression

Linear regression was conducted to identify the relationship between measurements from the low-cost sensors (dependent variable) and the reference instruments (independent variable). The coefficient of determination (R²), calculated using equation (2), quantifies the proportion of the variance in the dependent variable that is predictable from the independent variable.

R^{2} = 1 - \frac{\sum {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum {(y_{i} - \bar{y})}^{2}}

(2)

where $y_{i}$ is the measured value from the low-cost sensor, $\bar{y}$ is the mean of the low-cost sensor measurements, and ${\hat{y}}_{i}$ is the value predicted by the regression model from the paired reference instrument measurement.
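A minimal sketch of equation (2) follows, assuming an ordinary least-squares fit of the low-cost readings (y) on the reference readings (x) and taking the fitted values as ${\hat{y}}_{i}$ in the residual sum. The function names are illustrative only.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """Equation (2): 1 - SS_res / SS_tot, with y-hat taken from the fitted line."""
    intercept, slope = fit_line(x, y)
    y_bar = fmean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

A sensor that reads a constant offset below the reference still gives R² = 1 under this definition, which is why R² speaks to how well the sensor tracks changes rather than to absolute accuracy.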

Pearson correlation

Pearson correlation analysis was conducted to assess the strength and direction of the correlation between the measurements from the low-cost sensors and reference instruments. The Pearson correlation coefficient (ρ), calculated using equation (3), quantifies the degree of association between the two datasets.

ρ = \frac{\sum (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum {(x_{i} - \bar{x})}^{2} \sum {(y_{i} - \bar{y})}^{2}}}

(3)

where $x_{i}$ and $y_{i}$ are the paired measurements from the low-cost sensor and the reference instrument, respectively, and $\bar{x}$ and $\bar{y}$ are their corresponding means.
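Equation (3) translates directly into code. The sketch below is a plain-Python illustration with a hypothetical function name; it returns only the coefficient, not the p-value.

```python
import math
from statistics import fmean


def pearson(x, y):
    """Equation (3): sample Pearson correlation of paired measurements."""
    x_bar, y_bar = fmean(x), fmean(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = math.sqrt(
        sum((xi - x_bar) ** 2 for xi in x) * sum((yi - y_bar) ** 2 for yi in y)
    )
    return num / den
```

For a least-squares line with a single predictor and an intercept, the square of this coefficient equals the coefficient of determination, so the two metrics carry the same information about linear association.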

RMSE

RMSE was calculated to quantify the typical magnitude of the difference between the measurements from the low-cost sensors and the reference instrument, using equation (4).

RMSE = \sqrt{\sum_{i = 1}^{n} \frac{{({\hat{y}}_{i} - y_{i})}^{2}}{n}}

(4)

where ${\hat{y}}_{i}$ is the measurement from the reference instrument, $y_{i}$ is the corresponding low-cost sensor measurement, and n is the number of paired data points.
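As a small illustration of equation (4), the following sketch computes the RMSE for paired lists of readings. The function name and the sample values in the usage note are hypothetical.

```python
import math


def rmse(reference, sensor):
    """Equation (4): root-mean-squared difference of paired measurements."""
    n = len(reference)
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(reference, sensor)) / n)
```

For example, a sensor that reads exactly 80 ppm below the reference at every point, as in `rmse([400, 500], [320, 420])`, has an RMSE of 80 ppm even though it tracks every change perfectly.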

Bland-Altman

Bland-Altman analysis was conducted to illustrate the bias between the low-cost and reference instrument measurements. First, the mean of the low-cost and reference measurements, the difference between their measurements, and the standard deviation ($σ$) of their differences were calculated. The bias and the upper and lower limits of agreement (LoAs) were then calculated using equations (5.1), (5.2), and (5.3) and plotted against the mean values for visual inspection.

Bias = \frac{\sum (y_{i} - x_{i})}{n}

(5.1)

Upper LoA = Bias + 1.96 \times σ

(5.2)

Lower LoA = Bias - 1.96 \times σ

(5.3)

where $y_{i}$ is the measurement from the low-cost sensor, $x_{i}$ is the paired measurement from the reference instrument, n is the number of paired data points, and $σ$ is the standard deviation of the differences between the low-cost and reference measurements.
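Equations (5.1) to (5.3) can be sketched as below. The text does not state whether $σ$ is the sample (n − 1) or population (n) standard deviation; this sketch assumes the sample form, and for several hundred paired points the two differ only negligibly. The function name is illustrative.

```python
from statistics import fmean, stdev


def bland_altman(sensor, reference):
    """Return (bias, lower LoA, upper LoA) per equations (5.1)-(5.3)."""
    diffs = [s - r for s, r in zip(sensor, reference)]
    bias = fmean(diffs)
    sigma = stdev(diffs)  # sample SD (n - 1); an assumption, see the lead-in
    return bias, bias - 1.96 * sigma, bias + 1.96 * sigma
```

The width of the interval, 2 × 1.96 × σ, depends only on the spread of the differences, so a sensor can have a large bias and still have narrow limits of agreement.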

Paired t-test

A paired t-test was conducted to determine whether the mean values measured by the low-cost sensors and reference instruments were significantly different. To validate the assumptions of the t-test, a Jarque-Bera normality test was conducted to determine whether the differences between the low-cost sensors and the reference instruments followed a normal distribution.
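The two tests can be sketched with the standard library alone. This is an illustration under assumptions, not the authors' code: the paired t statistic is returned with its degrees of freedom but without a p-value (the Student's t distribution is not in the Python standard library), and the Jarque-Bera p-value uses the asymptotic chi-square distribution with two degrees of freedom. Function names are hypothetical.

```python
import math
from statistics import fmean, stdev


def paired_t_statistic(sensor, reference):
    """t statistic and degrees of freedom for the paired differences."""
    diffs = [s - r for s, r in zip(sensor, reference)]
    n = len(diffs)
    t = fmean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def jarque_bera(values):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 df)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)  # chi-square survival function for 2 df
```

In practice a statistics package would usually be used instead; SciPy, for instance, provides `scipy.stats.ttest_rel` and `scipy.stats.jarque_bera`, which return p-values directly.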

Two extreme CO₂ outliers measured by the Graywolf were identified through visual inspection of the time-series comparison plots, as they were the only points exceeding 850 ppm, while the remainder of the measurements were lower than 700 ppm. They were likely caused by accidental contact by the occupants with the Graywolf, and were hence excluded from the analysis.

Results

The analysis results for the CO₂ and PM_2.5 measurements are presented in this section, starting with analyses for CO₂ at 5- and 30-minute intervals, followed by the analyses for PM_2.5. The results from the time-series analysis, linear regression, and Bland-Altman analysis are presented in figures, and the results from Pearson correlation, RMSE, and paired t-tests are summarized in tables.

CO₂ data analysis

Figures 3(a) and (b) show the time-series comparison of CO₂ concentrations measured by the Graywolf, Airthings, and Air Gradient at 5- and 30-minute intervals, respectively. Both figures show that the Airthings closely follows the Graywolf measurements, with a moderate mean underestimation of 11.6 ppm (std. dev. = 15.5 ppm), from the start of the test until before noon on May 24th. Around the peak concentration of 603.1 ppm recorded by the Graywolf on the afternoon of May 24th, the Airthings reported larger overestimations with a maximum of 105.4 ppm (mean = 60.5 ± 22.5 ppm) for approximately 80 minutes (n = 16). Interestingly, from the end of the peak until the end of the test, the Airthings underestimated CO₂ concentrations with greater fluctuations (mean underestimation of 34.0 ± 31.8 ppm). In contrast, the Air Gradient sensor consistently underreported the CO₂ concentrations by a mean of 78.3 ppm throughout the test, though with fewer fluctuations (std. dev. = 16.2 ppm). Despite these discrepancies, both sensors captured the same overall trend observed by the Graywolf and were able to detect all elevated CO₂ concentrations caused by occupant activities in the vicinity. The aggregated 30-minute concentrations (Figure 3(b)) show smoother trends of CO₂ concentrations from both sensors. However, they further highlight the fluctuations in the Airthings measurements toward the end of the testing period.

Figure 3.

Time-series comparison of CO₂ concentrations at: (a) 5-minute and (b) 30-minute intervals.

Linear regression plots comparing the Airthings and Air Gradient sensor data to the Graywolf measurements at 5- and 30-minute intervals are shown in Figures 4(a) and (b). The regression models reported an R² value of 0.68 for Airthings at 5-minute intervals, indicating that 68% of the variance in its measurements could be explained by the reference instrument. In comparison, the Air Gradient sensor had a higher R² value of 0.87, which is also visually evident from the better alignment of its data points with the line of best fit shown in Figure 4(a). At 30-minute intervals (Figure 4(b)), both sensors had higher R² values. This is expected as aggregation smooths out short-term fluctuations in measurements and reduces discrepancies that may be caused by sensor response time. However, at both measurement intervals, most data points from both sensors fall below the 1:1 relationship line, indicating a consistent underestimation of CO₂ concentrations.

Figure 4.

Linear regression for Airthings (blue) and Air Gradient (yellow) CO₂ measurements against the Graywolf at: (a) 5-minute intervals and (b) 30-minute intervals.

Figure 5 shows the Bland-Altman analysis comparing the CO₂ concentrations measured by Airthings and Air Gradient against the Graywolf at both 5- and 30-minute intervals. Figure 5(a) shows greater disagreement when the mean CO₂ concentration is below 550 ppm for the Airthings sensor. As CO₂ concentrations increase, the differences between Airthings and Graywolf become smaller, with data points clustering closer to the bias line (−19.4 ppm). However, a notable number of points fall outside the lower LoA (−81.7 ppm) at lower concentrations and outside of the upper LoA (42.8 ppm) at higher concentrations, indicating the Airthings sensor underestimates CO₂ at lower concentrations and overestimates it at higher concentrations. Although the bias line shows that the average measurement of the Airthings was relatively accurate, the wide range of LoAs (124.5 ppm) shows it was not precise.

Figure 5.

Bland-Altman plots for CO₂ concentration by (a) Airthings and (b) Air Gradient compared to the Graywolf at 5-minute intervals and by (c) Airthings and (d) Air Gradient at 30-minute intervals.

In contrast, Figure 5(b) shows the Air Gradient sensor has a larger bias (−78.3 ppm), indicating a greater level of underestimation of CO₂ concentrations aligned with the observations from the time-series comparison. However, with a standard deviation of the differences of 16.2 ppm, the range of its LoAs (63.6 ppm) is narrower than that of the Airthings, suggesting the Air Gradient provides more precise measurements. Similar to the Airthings, more points fall outside the upper LoA at higher concentrations (>550 ppm). Interestingly, because the Air Gradient consistently underestimated concentrations, this deviation from the bias line made the measurements less precise, but ultimately more accurate at higher concentrations. At 30-minute intervals (Figure 5(c) and (d)), the standard deviations of difference for both Airthings (from 31.8 to 24.5 ppm) and Air Gradient (from 16.2 to 12.5 ppm) decreased moderately, with fewer data points falling outside the LoAs. This suggests that data aggregation improves the precision of both sensors, though the general trends remain unchanged.

Table 1 summarizes the Pearson correlation coefficient, RMSE, and paired t-test results for the Airthings and Air Gradient sensors at 5- and 30-minute intervals. The Pearson correlation tests show that both low-cost sensors have positive and statistically significant correlations with the Graywolf. The Air Gradient has a stronger correlation of 0.93, confirming the findings from the linear regressions. Aggregating the measurements over 30 minutes further improves the strength of the correlation. In contrast, the Airthings has 54% and 58% lower RMSE when compared to the Air Gradient at 5- and 30-minute intervals, respectively, indicating it provided more accurate measurements during the test period. However, the paired t-test results show that the measurements from both sensors, regardless of the sampling intervals, are statistically significantly different from the Graywolf CO₂ measurements (p < 0.05).

Table 1.

Summary of Pearson correlation, RMSE, and paired t-test results for both sensors at 5- and 30-minute intervals.

Low-cost sensor	Sampling interval (min)	Pearson coefficient (ρ)	Pearson p-value	RMSE (ppm)	Paired t-test p-value
Airthings	5	0.82	< 0.001	37	< 0.001
Airthings	30	0.89	< 0.001	31	< 0.001
Air Gradient	5	0.93	< 0.001	80	< 0.001
Air Gradient	30	0.96	< 0.001	78	< 0.001

PM_2.5 data analysis

Figures 6(a) and (b) present the time-series comparison of PM_2.5 concentration between Air Gradient and Lighthouse at 5- and 30-minute intervals. Throughout the sampling period, the Air Gradient sensor consistently overreported the PM_2.5 concentration levels (mean difference = 3.0 ± 5.0 μg/m³). Particularly, it captured periods of elevated concentrations that were not detected by the Lighthouse. To further quantify this discrepancy, we defined an elevated concentration period as any period with PM_2.5 concentrations remaining above 10 μg/m³ for at least 15 minutes. The Lighthouse only identified 2.9% (75 minutes) of the test periods as elevated, while the Air Gradient identified 18% (465 minutes). Even during elevated periods identified by both sensors, the Air Gradient overreported PM_2.5 with a mean overestimation of 11.3 μg/m³ and a maximum overestimation of 48.8 μg/m³. This is unexpected, as the higher-grade reference instruments are generally better at detecting elevated concentrations when compared to low-cost sensors (Singer and Delp, 2018). The 30-minute time-series comparison in Figure 6(b) shows that after data aggregation, the Air Gradient’s response aligns more closely to the Lighthouse. The maximum difference also decreased from 48.8 to 15.6 μg/m³. It shows that the alignment of the Air Gradient with the Lighthouse can be improved at the expense of data resolution. However, this alignment improvement does not necessarily lead to performance improvement of the sensor. This point is further discussed in the linear regression results.
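The elevated-period definition above (PM_2.5 remaining above 10 μg/m³ for at least 15 minutes) can be expressed as a run-length check on the 5-minute series. The sketch below assumes a strict "greater than" comparison and that a run of three consecutive 5-minute values counts as 15 minutes; the text does not specify either detail, so both are assumptions, and the function name is hypothetical.

```python
def elevated_minutes(series, threshold=10.0, min_minutes=15, step_minutes=5):
    """Total minutes in runs that stay above `threshold` for >= min_minutes."""
    min_samples = -(-min_minutes // step_minutes)  # ceiling division
    total = 0
    run = 0
    # A sentinel below any threshold closes a run that reaches the series end.
    for value in list(series) + [float("-inf")]:
        if value > threshold:
            run += 1
        else:
            if run >= min_samples:
                total += run * step_minutes
            run = 0
    return total


# Hypothetical 5-minute PM2.5 values (µg/m³): runs of 3, 2, and 4 samples.
example = [5, 12, 13, 14, 6, 11, 12, 4, 20, 20, 20, 20]
minutes = elevated_minutes(example)
```

Dividing the returned minutes by the total test duration gives the percentage of time classified as elevated; with 516 five-minute values (2580 minutes), 75 minutes corresponds to 2.9% and 465 minutes to 18%.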

Figure 6.

Time-series comparison of PM_2.5 concentrations at: (a) 5-minute and (b) 30-minute intervals.

A likely cause for the discrepancy shown in Figure 6 is that the counting algorithms of the two devices respond differently to varying particle sizes. Although they are both optical, the Lighthouse sorts particles into five bins with different particle sizes and reports particle number concentrations in each bin, while the Air Gradient converts the light scattering signals directly to mass concentrations based on internal calibration algorithms.

A closer examination of the Lighthouse concentration measurements shows that on average, particles in the 0.3–0.5, 0.5–1.0, and 1.0–2.5 μm contributed to 28.2%, 24.4%, and 47.4% of the PM_2.5 mass concentrations, respectively. During the elevated concentration periods, these contributions shifted to 21.8%, 25.6%, and 52.7% of the mass concentrations, respectively. This suggests that changes in particle distribution may have led to the Air Gradient sensor overestimating the size of particles greater than 0.5 μm based on the light scattering signals, resulting in inflated mass concentration estimates. Furthermore, the Air Gradient might count particles greater than 2.5 μm for PM_2.5 mass concentrations. As shown in Figure 7, the Air Gradient measurements were better aligned with the Lighthouse PM₅ or even PM₁₀ measurements during some elevated concentration periods.

Figure 7.

Time-series comparison of Air Gradient PM_2.5 concentrations with Lighthouse PM_2.5, PM₅, and PM₁₀ concentrations converted based on number concentrations.

The Lighthouse’s response to some emission sources could also be a potential contributing factor. For instance, Singer and Delp (2018) found that two types of research-grade instruments (Thermo pDR-1500 and MetOne BT-645) had weaker responses to large particle sources such as Arizona Test Dust 2 (0–3 μm) and dust from a workshop dust mop when compared to some low-cost sensors. Similarly, the Lighthouse may have underperformed when sizing some larger particles from activities such as sweeping, mopping, vacuuming, and resuspension of particles caused by occupant movements into the corresponding bins, resulting in an underestimation of PM_2.5. The Lighthouse’s relatively low percentage of elevated PM_2.5 concentration periods, compared to previous studies on the impact of indoor emissions in residential settings, also suggests that some emission events may not have been captured (Chan et al., 2018; Zhang et al., 2020). However, since we did not record occupants’ activities, this hypothesis could not be verified.

Figure 8(a) and (b) show the linear regression of the Air Gradient PM_2.5 measurements against the Lighthouse measurements at 5- and 30-minute intervals, respectively. Most data points are clustered near the origin and above the 1:1 relationship line, further confirming that the Air Gradient overreported PM_2.5 concentrations. Interestingly, while the R² values for both the 5- and 30-minute data were relatively high, aggregation did not improve them. This suggests that while aggregation reduced the maximum difference between the devices and improved alignment as discussed earlier, the variance in the 30-minute Air Gradient data was not better explained by the Lighthouse measurements.

Figure 8.

Linear regression for Air Gradient PM_2.5 measurements against the Lighthouse at (a) 5-minute intervals and (b) 30-minute intervals. Note that Figure 8(b) has a smaller scale because of the aggregated data.

The Bland-Altman comparisons between the Air Gradient and Lighthouse in Figures 9(a) and (b) show that the differences between the two devices were not consistent. Instead, the differences increased with the mean PM_2.5 concentrations in a near-linear relationship. Most data points are also located above the bias lines, once again indicating a consistent overestimation by the Air Gradient. Aggregating the 5-minute measurements to 30-minute intervals slightly increased the bias from 3.0 to 4.8 μg/m³, while slightly narrowing the lower and upper LoAs from (−12.1, 18.0) to (−7.3, 16.8) μg/m³.

Figure 9.

Bland-Altman plots for PM_2.5 concentration by Air Gradient compared to the Lighthouse at: (a) 5-minute intervals and (b) at 30-minute intervals. For better visual clarity, the graphs are cropped at a mean PM_2.5 concentration of 40 μg/m³, resulting in 2 data points excluded from (a) and 1 data point excluded from (b).

Lastly, Table 2 summarizes the Pearson correlation, RMSE, and paired t-test results at 5- and 30-minute intervals. The Pearson correlation tests show that the Air Gradient has a positive and statistically significant correlation with the Lighthouse, with the strength of the correlation improving through data aggregation. Similarly, the RMSE of the Air Gradient decreased with aggregation. However, the paired t-test indicates that the mean difference between the devices remains statistically significant (p < 0.05) regardless of the aggregation.

Table 2.

Summary of Pearson correlation, RMSE, and paired t-test results for Air Gradient at 5- and 30-minute intervals.

Low-cost sensor	Sampling interval (min)	Pearson coefficient (ρ)	Pearson p-value	RMSE (μg/m³)	Paired t-test p-value
Air Gradient	5	0.69	0.000	5.8	0.000
Air Gradient	30	0.87	0.000	5.2	0.000
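The metrics in Table 2 can be computed from paired sensor and reference readings along the lines of the sketch below. It uses only the Python standard library, so it returns the paired t statistic and its degrees of freedom rather than a p-value; obtaining the p-value requires the t distribution, for example through `scipy.stats.ttest_rel`. The function names and sample readings are hypothetical and are not data from this study.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(sensor, reference):
    """Root-mean-squared error of the sensor against the reference."""
    squared = [(s - r) ** 2 for s, r in zip(sensor, reference)]
    return math.sqrt(statistics.mean(squared))


def paired_t(sensor, reference):
    """Paired t statistic and degrees of freedom for the mean difference."""
    diffs = [s - r for s, r in zip(sensor, reference)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical paired PM2.5 readings in ug/m3 (not data from the study).
sensor = [4.0, 6.5, 9.0, 15.0, 30.0]
reference = [3.0, 5.0, 6.0, 10.0, 20.0]

r = pearson_r(sensor, reference)
error = rmse(sensor, reference)
t_stat, dof = paired_t(sensor, reference)
print(f"r={r:.3f}, RMSE={error:.2f}, t={t_stat:.2f} with {dof} degrees of freedom")
```

The sample data illustrate why the metrics are reported together: the two series are almost perfectly correlated, yet the sensor reads consistently high, so the RMSE and the t statistic are both far from zero.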

Discussion

In this study, co-location tests were conducted to assess the performance of two low-cost sensors: the Airthings and Air Gradient against the Graywolf for CO₂ measurements, and the Air Gradient against the Lighthouse for PM_2.5 measurements. Employing multiple analysis methods gave a more comprehensive evaluation of sensor accuracy and precision than can be easily derived from a single evaluation approach. The results of these assessments highlight the strengths and limitations of these sensors and help identify the most suitable scenarios for their deployment.

The CO₂ results show that the Airthings provided more accurate CO₂ measurements, with better alignment in the time-series comparison and data points clustering closer to the 1:1 line in linear regression. At 5-minute intervals, it also had a smaller bias (−19.4 ppm) from the Bland-Altman analysis and a smaller RMSE of 37 ppm, both within the manufacturer’s specified accuracy of 50 ppm + 5% of readings. In contrast, the Air Gradient was less accurate, with a bias of −78 ppm and an RMSE of 80 ppm at 5-minute intervals, exceeding the manufacturer’s specification (40 ppm + 3% of readings). However, the Air Gradient was more precise, with a narrower range of LoAs from the Bland-Altman analysis. It also had a greater R² value of 0.87 compared to the Airthings (R² = 0.68).
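The comparison against the manufacturers’ specifications can be made concrete with a short calculation. The sketch below evaluates each accuracy band, of the form fixed offset plus a fraction of the reading, at the two ends of the roughly 450 to 650 ppm CO₂ range observed in this test. Evaluating the band only at those two readings is an assumption made for illustration, and the function and variable names are hypothetical.

```python
def spec_tolerance(reading_ppm, fixed_ppm, fraction):
    """Accuracy band of the form +/-(fixed + fraction * reading), in ppm."""
    return fixed_ppm + fraction * reading_ppm


# Manufacturer specifications as quoted in the text.
AIRTHINGS = (50.0, 0.05)     # 50 ppm + 5% of reading
AIR_GRADIENT = (40.0, 0.03)  # 40 ppm + 3% of reading

# Bias and RMSE at 5-minute intervals as reported in the text (ppm).
reported = {
    "Airthings": {"bias": -19.4, "rmse": 37.0, "spec": AIRTHINGS},
    "Air Gradient": {"bias": -78.0, "rmse": 80.0, "spec": AIR_GRADIENT},
}

# Assumption: check the band at the ends of the observed CO2 range.
for name, result in reported.items():
    for reading in (450.0, 650.0):
        tolerance = spec_tolerance(reading, *result["spec"])
        within = abs(result["bias"]) <= tolerance and result["rmse"] <= tolerance
        print(f"{name} at {reading:.0f} ppm: tolerance {tolerance:.1f} ppm, within spec: {within}")
```

Even at the upper end of the range the Air Gradient band is narrower than its reported bias and RMSE, while the Airthings values fall inside its band at both ends.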

These findings show that while the Airthings could provide a better representation of the actual indoor CO₂ concentrations, the Air Gradient is more reliable for tracking relative changes than absolute values. This makes the Air Gradient more suitable for applications such as estimating indoor and outdoor air exchange rates based on increases and decreases in CO₂ concentrations (Di Gilio et al., 2021; Du et al., 2024), where fluctuations in CO₂ levels are more important than absolute values. In addition, since the bias of the Air Gradient remained generally consistent across different CO₂ levels, calibration equations can be easily applied to improve its accuracy. Averaging the 5-minute concentrations to 30-minute intervals improved the accuracy and precision of both sensors at the cost of data resolution. This loss of resolution limits their applicability in transient-state analyses, such as air exchange rate estimation, which requires high data resolution (Du et al., 2024). For long-term CO₂ exposure assessment and studies on the risk of airborne infectious disease transmission (e.g. Rudnick and Milton, 2003), the trade-off may be acceptable.
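The aggregation step can be sketched as a simple block average, in which six consecutive 5-minute readings form one 30-minute value. This is a minimal illustration that assumes an evenly spaced series without gaps; it is not necessarily the procedure used in this study, and the function name and sample readings are hypothetical.

```python
import statistics


def block_average(values, block_size=6):
    """Average consecutive, non-overlapping blocks of readings.

    A trailing block with fewer than block_size readings is dropped.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    n_full = len(values) // block_size
    return [
        statistics.mean(values[i * block_size:(i + 1) * block_size])
        for i in range(n_full)
    ]


# Hypothetical 5-minute CO2 readings in ppm covering one hour.
co2_5min = [500.0, 510.0, 530.0, 560.0, 540.0, 520.0,
            515.0, 505.0, 500.0, 495.0, 490.0, 495.0]

co2_30min = block_average(co2_5min)
print(co2_30min)
```

The short peak of 560 ppm in the first half hour disappears into the 30-minute mean, which illustrates the loss of resolution that matters for transient-state analyses. A series with missing samples would need to be binned by timestamp instead.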

Because the Airthings malfunctioned, the PM_2.5 performance assessment was conducted only for the Air Gradient sensor. Overall, the Air Gradient consistently overestimated the PM_2.5 concentrations by an average of 3.0 μg/m³ with an RMSE of 5.8 μg/m³. While both fall within the manufacturer’s specifications (±10 μg/m³ below 100 μg/m³, and ±10% of reading between 100 and 500 μg/m³), the overestimation was more pronounced in the presence of emission sources, with a maximum difference of 48.8 μg/m³, likely because particles larger than 2.5 μm were counted towards the PM_2.5 mass concentration. This was evident from comparing the Air Gradient PM_2.5 data with the PM₅ and PM₁₀ measurements by the Lighthouse. Despite a relatively high coefficient of determination (R² = 0.77) and Pearson correlation (ρ = 0.69) compared to previous studies on other types of low-cost sensors (e.g. Coulby et al., 2021; Singer and Delp, 2018), the increasing discrepancy between the Air Gradient and the Lighthouse at higher PM_2.5 concentrations indicates that the Air Gradient may not be suitable for short-term tracking of changes in particle concentrations or long-term exposure monitoring without careful calibration.
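One simple form such a calibration could take is a linear correction fitted by least squares to co-located readings, which addresses a discrepancy that grows with concentration. The sketch below is an illustration under that assumption only: no calibration was fitted in this study, the function names are hypothetical, and the sample readings are not data from the study.

```python
import statistics


def fit_linear_calibration(sensor, reference):
    """Least-squares fit of reference = slope * sensor + intercept."""
    ms, mr = statistics.mean(sensor), statistics.mean(reference)
    sxx = sum((s - ms) ** 2 for s in sensor)
    if sxx == 0:
        raise ValueError("sensor readings must not all be identical")
    sxy = sum((s - ms) * (r - mr) for s, r in zip(sensor, reference))
    slope = sxy / sxx
    return slope, mr - slope * ms


def apply_calibration(sensor, slope, intercept):
    """Map raw sensor readings onto the reference scale."""
    return [slope * s + intercept for s in sensor]


# Hypothetical co-located PM2.5 readings in ug/m3 (not data from the study).
sensor = [4.0, 6.5, 9.0, 15.0, 30.0]
reference = [3.0, 5.0, 6.0, 10.0, 20.0]

slope, intercept = fit_linear_calibration(sensor, reference)
corrected = apply_calibration(sensor, slope, intercept)
residual_bias = statistics.mean([c - r for c, r in zip(corrected, reference)])
print(f"slope={slope:.3f}, intercept={intercept:.3f}, residual bias={residual_bias:.3f}")
```

A slope below one scales down the overestimated readings, and the mean difference on the fitting data becomes zero by construction. Because sensor response varies with particle type, a correction fitted in one environment may not transfer to another.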

Overall, the findings from the CO₂ and PM_2.5 co-location tests show that while low-cost sensors can provide useful information on pollutant concentrations, their accuracy and precision should be carefully evaluated in the specific environments in which they will be deployed. Notably, all paired t-test results confirmed that both low-cost sensors measured mean concentrations that were statistically significantly different from the reference instruments, regardless of the sampling intervals. For low-cost sensors measuring PM_2.5 concentrations, it is especially important to challenge the sensors with the specific types of particles they would be exposed to, as their responses vary by particle type. This has also been frequently noted in previous studies (Jayaratne et al., 2020; Qin et al., 2024).

This study has several limitations that should be acknowledged. First, the testing durations for both CO₂ and PM_2.5 measurements were relatively short. A longer measurement duration would enhance the analysis and provide a more comprehensive assessment of both sensors. It would also allow for comparison of the mean values over longer periods (e.g. the daily averages) that are commonly used for occupant exposure assessment. A related limitation is the generally narrow range of the CO₂ concentrations (from 450 to 650 ppm) and the generally low PM_2.5 concentrations (the Lighthouse measurements exceeded 5 μg/m³ only 20% of the time), which prevented testing of these sensors under more extreme conditions. Changing the measurement location to a space with higher occupancy density and more occupant activities (e.g. the living room or kitchen) would likely address both issues. Furthermore, we did not track other factors that might influence the performance of the sensors. For instance, environmental parameters such as temperature and RH could affect the accuracy of NDIR sensors for CO₂ measurements (Vafaei and Amini, 2021). Similarly, a record of the occupant activities could provide more information on the sources of particles, and therefore on their size distribution and composition, allowing a better understanding of their influence on PM_2.5 measurements. Lastly, the study design can be further improved by deploying multiple units of the same sensors during the tests. This could greatly improve the reliability of the data collection process and mitigate data loss due to malfunctions. It would also enable us to examine the variation between units of the same sensor and better identify their most suitable deployment scenarios. Based on our experience with the Airthings sensor, we also recommend regular checks during tests to minimize data loss from connectivity issues.

Conclusion

Continuous real-time monitoring of indoor air pollutants, such as CO₂ and PM_2.5, is crucial for assessing occupants’ exposure and for informing HVAC systems, building managers, and occupants about strategies to improve air quality. In this study, we evaluated the performance of the Airthings and Air Gradient for CO₂ measurements, and the Air Gradient for PM_2.5 measurements, against the reference instruments. The comparison results show that the Airthings provides more accurate CO₂ measurements, making it more suitable for exposure assessment or threshold-based control strategies, such as Demand Controlled Ventilation (ASHRAE, 2022). In contrast, the Air Gradient was more precise but less accurate, making it more suitable for tracking relative changes in CO₂ concentrations for air exchange rate estimations. For PM_2.5 measurements, although the Air Gradient sensor had a small mean difference (i.e. bias), its overestimation increased with PM_2.5 concentrations, making it less than ideal for concentration monitoring without careful calibration. The paired t-tests from all analyses confirmed significant differences between the mean measurements by the low-cost sensors and the reference instruments, further highlighting the importance of co-location tests and calibration.

The six analysis methods employed in this study together provided a well-rounded assessment of sensor accuracy and precision. In particular, the Bland-Altman analysis, which was less commonly used in previous studies on low-cost sensors, proved valuable in determining the trend of bias across pollutant concentrations and quantifying precision through the range of LoAs. We recommend that future studies, low-cost sensor evaluation agencies or guidelines (e.g. Health Canada, AQ-SPEC, and US EPA), and potentially low-cost sensor manufacturers adopt this set of testing metrics for sensor evaluation (South Coast AQMD, n.d.; United States Environmental Protection Agency, 2025).

We also recommend that future studies conduct longer co-location tests with a wider range of pollutant concentrations, account for parameters beyond the pollutant concentrations such as temperature, RH, and occupant activities, and deploy multiple units of the same sensor type to better evaluate their performance. In addition, usability metrics such as reliability, cloud connectivity, integration with building systems, and visual display clarity should be evaluated, as they play an important role in the successful adoption of these sensors at a larger scale.

Footnotes

ORCID iD

Helen Stopps

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant RGPIN 2022-03821).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Afroz, Guo, Cheng, et al. (2023) Investigation of indoor air quality in university residences using low-cost sensors. Environmental Science: Atmospheres 3(2): 347–362.

Air Gradient (n.d.) Indoor Air Quality Monitor. Available at: https://www.airgradient.com/indoor/ (accessed 22 November 2024).

Airthings (2024) Airthings delivered strong top-line growth in Q2 2024; Updates strategy. Available at: https://www.airthings.com/newsroom/q2_2024_press_release (accessed 22 November 2024).

Airthings (n.d.) Air Quality Monitor: View Plus. Available at: https://www.airthings.com/en-ca/view-plus (accessed 22 November 2024).

ASHRAE (2022) ASHRAE 62.1 Ventilation and Acceptable Indoor Air Quality: Addendum ab. Peachtree Corners, GA: ASHRAE.

Baldelli (2021) Evaluation of a low-cost multi-channel monitor for indoor air quality through a novel, low-cost, and reproducible platform. Measurement: Sensors 17: 100059.

Baptista, Almeida-Silva, Silva, et al. (2022) Indoor air quality assessment in grocery stores. Applied Sciences 12(24): 12940.

Chan, Logue, et al. (2018) Quantifying fine particle emission events from time-resolved measurements: Method description and application to 18 California low-income apartments. Indoor Air 28(1): 89–101.

Chen LWA, Olawepo, Bonanno, et al. (2020) Schoolchildren’s exposure to PM2.5: a student club–based air quality monitoring campaign using low-cost sensors. Air Quality, Atmosphere and Health 13(5): 543–551.

Collingwood, Zmoos, Pahler, et al. (2019) Investigating measurement variation of modified low-cost particle sensors. Journal of Aerosol Science 135: 21–32.

Coulby, Clear, Jones, et al. (2021) Low-cost, multimodal environmental monitoring based on the Internet of Things. Building and Environment 203: 108014.

Curto, Donaire-Gonzalez, Barrera-Gómez, et al. (2018) Performance of low-cost monitors to assess household air pollution. Environmental Research 163: 53–63.

Demanega, Mujan, Singer, et al. (2021) Performance assessment of low-cost environmental monitors and single sensors under variable indoor air quality and thermal conditions. Building and Environment 187: 107415.

Di Gilio, Palmisani, Pulimeno, et al. (2021) CO2 concentration monitoring inside educational buildings as a strategic tool to reduce the risk of Sars-CoV-2 airborne transmission. Environmental Research 202: 111560.

Du, Reda, Licina, et al. (2024) Estimating air change rate in mechanically ventilated classrooms using a single CO2 sensor and automated data segmentation. Environmental Science and Technology 58: 18788–18799.

Durag Group (n.d.) MiniWRAS 1371 Portable wide-range aerosol spectrometer. Available at: https://www.durag.com/en/product-filter-837.htm?productID=MiniWRAS1371 (accessed 22 November 2024).

Evotech Air Quality (2023) Smart indoor air quality monitoring solutions. Available at: https://www.evotechairquality.co.uk/assets/files/EvotechAirthingsBrochureOct23.pdf?static=1 (accessed 22 November 2024).

Gillooly, Zhou, Vallarino, et al. (2019) Development of an in-home, real-time air pollutant sensor platform and implications for community use. Environmental Pollution 244: 440–450.

Government of Canada (2021) Residential indoor air quality guidelines: Carbon dioxide. Available at: https://www.canada.ca/en/health-canada/services/publications/healthy-living/residential-indoor-air-quality-guidelines-carbon-dioxide.html (accessed 22 November 2024).

Han, Bachman, et al. (2020) Evaluation of two low-cost PM monitors under different laboratory and indoor conditions. Aerosol Science and Technology 55(3): 316–331.

Jayaratne, Liu, Ahn, et al. (2020) Low-cost PM2.5 sensors: An assessment of their suitability for various applications. Aerosol and Air Quality Research 20(3): 520–532.

Kaliszewski, Włodarski, Młyńczak, et al. (2020) Comparison of low-cost particulate matter sensors for indoor air monitoring during covid-19 lockdown. Sensors (Switzerland) 20(24): 1–17.

Karagulian, Barbiere, Kotsev, et al. (2019) Review of the performance of low-cost sensors for air quality monitoring. Atmosphere 10(9): 506.

Kim, Shin, Hwang (2023) Calibration of low-cost sensors for measurement of indoor particulate matter concentrations via laboratory/field evaluation. Aerosol and Air Quality Research 23(8): 230097.

Konstantinou, Constantinou, Kleovoulou, et al. (2022) Assessment of indoor and outdoor air quality in primary schools of Cyprus during the COVID–19 pandemic measures in May–July 2021. Heliyon 8(5): e09354.

Chen, et al. (2016) Physiochemical properties of carbonaceous aerosol from agricultural residue burning: Density, volatility, and hygroscopicity. Atmospheric Environment (1994) 140: 94–105.

et al. (2018) Spatiotemporal distribution of indoor particulate matter concentration with a low-cost sensor network. Building and Environment 127: 138–147.

Siegel (2020) In situ efficiency of filters in residential central HVAC systems. Indoor Air 30(2): 315–325.

Lighthouse Worldwide Solutions (n.d.) Lighthouse Handheld Technical Data Sheet. White City, Oregon: Lighthouse Worldwide Solutions.

Liu, Jayaratne, Thai, et al. (2020) Low-cost sensors as an alternative for long-term air quality monitoring. Environmental Research 185: 109438.

Manibusan, Mainelis (2020) Performance of four consumer-grade air pollution measurement devices in different residences. Aerosol and Air Quality Research 20(2): 217–230.

Marinov, Ganev, Nikolov (2021) Indoor air quality assessment using low-cost commercial off-the-shelf sensors. In: Proceedings of the 2021 6th international symposium on environment-friendly energies and applications, EFEA 2021, Sofia, Bulgaria, 24–26 March 2021. New York, NY: IEEE.

Moreno-Rangel, Sharpe, Musau, et al. (2018) Field evaluation of a low-cost indoor air quality monitor to quantify exposure to pollutants in residential environments. Journal of Sensors and Sensor Systems 7(1): 373–388.

Palmisani, Di Gilio, Viana, et al. (2021) Indoor air quality evaluation in oncology units at two European hospitals: Low-cost sensors for TVOCs, PM2.5 and CO2 real-time monitoring. Building and Environment 205: 108237.

Pei, Freihaut, Rim (2023) Long-term application of low-cost sensors for monitoring indoor air quality and particle dynamics in a commercial building. Journal of Building Engineering 79: 107774.

Qin, Wei, Ning, et al. (2024) Dissecting PM sensor capabilities: A combined experimental and theoretical study on particle sizing and physicochemical properties. Environmental Pollution 356: 124354.

Reis, Lopes, Graça, et al. (2023) Using low-cost sensors to assess real-time comfort and air quality patterns in indoor households. Environmental Science and Pollution Research 30(3): 7736–7751.

Rudnick, Milton (2003) Risk of indoor airborne infection transmission estimated from carbon dioxide concentration. Indoor Air 13(3): 237–245.

Schalm, Carro, Lazarov, et al. (2022) Reliability of lower-cost sensors in the analysis of indoor air quality on board ships. Atmosphere 13(10): 1579.

Shen, Hou, Zhu, et al. (2021) Temporal and spatial variation of PM2.5 in indoor air monitored by low-cost sensors. Science of the Total Environment 770: 145304.

Singer, Delp (2018) Response of consumer and research grade indoor air quality monitors to residential sources of fine particles. Indoor Air 28(4): 624–639.

South Coast AQMD (n.d.) AQ-SPEC: Air quality sensor performance evaluation center. Available at: https://www.aqmd.gov/aq-spec (accessed 22 November 2024).

Taştan (2022) A low-cost air quality monitoring system based on Internet of Things for smart homes. Journal of Ambient Intelligence and Smart Environments 14(5): 351–374.

Thomas, Mistry, Snow, et al. (2019) Indoor air quality monitoring (IAQ): A low-cost alternative to CO2 monitoring in comparison to an industry standard device. In: Advances in Intelligent Systems and Computing. Cham: Springer Verlag, pp. 1010–1027.

Tiele, Esfahani, Covington (2018) Design and development of a low-cost, portable monitoring device for indoor environment quality. Journal of Sensors 2018: 1–14.

Tryner, Phillips, Quinn, et al. (2021) Design and testing of a low-cost sensor and sampling platform for indoor air quality. Building and Environment 206: 108398.

United States Environmental Protection Agency (2024) Health and Environmental Effects of Particulate Matter (PM). Available at: https://www.epa.gov/pm-pollution/health-and-environmental-effects-particulate-matter-pm (accessed 22 November 2024).

United States Environmental Protection Agency (2025) Air sensor performance targets and testing protocols. Available at: https://www.epa.gov/air-sensor-toolbox/air-sensor-performance-targets-and-testing-protocols (accessed 7 July 2025).

Vafaei, Amini (2021) Chamberless NDIR CO2 sensor robust against environmental fluctuations. ACS Sensors 6(4): 1536–1542.

Wang, Chen, F er, et al. (2019) Evaluating the feasibility of a personal particle exposure monitor in outdoor and indoor microenvironments in Shanghai, China. International Journal of Environmental Health Research 29(2): 209–220.

Wang, Delp, Singer (2020) Performance of low-cost indoor air quality monitors for PM2.5 and PM10 from residential sources. Building and Environment 171: 106654.

Zamora, Rice, Koehler (2020) One year evaluation of three low-cost PM2.5 monitors. Atmospheric Environment 235: 117615.

Zhang, Srinivasan (2020) A systematic review of air quality sensors, guidelines, and measurement studies for indoor air quality management. Sustainability (Switzerland) 12(21): 1–40.

Zhang, Tan, Liang, et al. (2022) Dynamic processes of dust emission from gobi: A portable wind tunnel study atop the Mogao Grottoes, Dunhuang, China. Aeolian Research 55: 100784.

Zhang, Siegel (2020) Investigating the impact of filters on long-term particle concentration measurements in residences (RP-1649). Science and Technology for the Built Environment 26(8): 1037–1047.

Zheng, Krishnan, Walker, et al. (2022) Laboratory evaluation of low-cost air quality monitors and single sensors for monitoring typical indoor emission events in Dutch daycare centers. Environment International 166: 107372.

Zou, Young, Chen, et al. (2020) Examining the functional range of commercially available low-cost airborne particle sensors and consequences for monitoring of indoor air quality in residences. Indoor Air 30(2): 213–234.

Validating the performance of low-cost IAQ sensors through co-location
