Abstract
Air quality is a critical concern because of its impact on public health and well-being. Although the consequences of poor air quality are more severe in developing countries, they are also significant in developed countries. Every year, healthcare costs due to air pollution reach $150 billion in the USA, and particulate matter causes 412,000 premature deaths in Europe. According to the Environmental Protection Agency (EPA), indoor air pollutant levels can be up to 100 times higher than outdoor levels. Indoor air quality (IAQ) is among the top five environmental risks to global health and well-being. In recent years, the research community has explored the scope of artificial intelligence (AI) to deal with this problem. IAQ prediction systems contribute to smart environments in which advanced sensing technologies can create healthy living conditions for building occupants. This paper reviews the applications and potential of AI for the prediction of IAQ to enhance the building environment and public health. The results show that most of the studies analyzed incorporate neural network-based models and the preferred evaluation metrics are RMSE,
Introduction
Air quality not only determines human exposure to pollutants but is also particularly critical for specific groups such as older adults and people with disabilities [121]. Numerous studies report adverse health effects associated with poor air quality, such as premature death and respiratory and cardiovascular disease, along with a notable increase in asthma attacks, dementia, and cancer [112,128,130]. Poor air quality is responsible for 3.2 million deaths worldwide [128,130]. The consequences of poor air quality are most severe in developing countries where there is no regulation to control pollutant emissions. However, air quality is also a problem in developed countries. Every year in the USA, approximately 60,000 premature deaths are reported and linked to reduced air quality. Moreover, the healthcare costs related to air quality diseases reach $150 billion [87]. According to the European Environment Agency, air pollution was responsible for 400,000 premature deaths in the European Union (EU) in 2016. Particulate matter caused 412,000 premature deaths in 41 European countries, of which 374,000 occurred in the EU [36]. Moreover, the cost related to the air pollutant emissions caused by industrial facilities in the EU was estimated at around €59 to €189 billion in 2012 [51]. The Environmental Protection Agency (EPA) stated that indoor pollutant levels could be up to 100 times higher than outdoor levels. Therefore, indoor air quality (IAQ) is ranked as one of the top five environmental risks to global health and well-being [107]. IAQ is a matter of potential concern for building occupants [26]. As people spend most of their time indoors, poor air quality has a significant impact on overall public health [8,14,17]. In particular, older adults and people with disabilities, who are the most vulnerable groups, commonly spend all of their time inside buildings [82]. 
Living environments include numerous types of spaces and locations, such as workplaces, clinics, public service centers, faculties, leisure spaces, vehicles, cabins, and outdoor locations [29]. Notably, a significant percentage of indoor environments have a high number of occupants [79]. Even in locations with good air quality, short-term exposure to pollutants can cause health symptoms in sensitive groups such as the elderly and children, especially those suffering from asthma and cardiovascular problems [62,126].
The World Health Organization (WHO) has published numerous reports on IAQ [19,88,127,130]. These reports state that almost three billion of the world's poorest people rely on solid fuels (crop wastes, charcoal, animal dung, wood and coal) for their everyday cooking and heating needs [129]. These solid fuels produce a considerable level of harmful gases and increase particulate matter concentration levels [72]. Repeated exposure to these pollutants can impair an individual's health. The impact of indoor air pollution (IAP) is equally high in urban buildings due to excessive use of chemical-rich cleaning agents, oil-based paints, fragrant decorations, and other toxic consumer products and building elements [47,72]. Household air pollution caused more than 4.3 million premature deaths in 2012, mostly in middle and low-income countries [18,44,54,58,65,74]. The statistics attribute 6% of these deaths to lung cancer, 12% to pneumonia, 22% to chronic obstructive pulmonary disease (COPD), 26% to ischemic heart disease and 34% to stroke annually [129].
Ventilation arrangements considerably influence the quality of indoor air [59,67,76,93]. Numerous countries have set up new regulations for achieving adequate ventilation and IAQ in buildings [7,35–37]. However, the starting point should be source control and the reduction of pollutants in the indoor air [1,7,10,15,52,80,94]. Several studies in the literature reveal a considerable shift from open fireplaces in residential areas to sealed modern fireplaces [28,34,117]. New buildings are equipped with wireless communication technologies and sensors. Therefore, it becomes easier to monitor environmental factors in real time [71].
Governments and environmental agencies have also designed new public policies to reduce the pollutant exposure of building occupants [27,41,45,118]. Although this is a reasonable response towards IAQ management, monitoring pollutant levels in the building environment in real time can be a significant step towards efficient source control and management. Furthermore, the latest technologies, such as artificial intelligence (AI) and machine learning (ML), can utilize Big Data related to pollutant levels for forecasting future conditions in the living environment [3,6,73,105]. Several researchers are also exploring the potential of the Internet of Things (IoT) for developing smart environments that could address major challenges related to IAQ, building energy efficiency and occupant comfort [39,43]. The concepts of smart homes, smart factories, smart cities and smart health systems are gaining immense popularity around the world. Moreover, smart environments are mainly influenced by the combination of AI and IoT [43,110,123]. On the one hand, traditional threshold-triggered solutions can provide instant updates about critical IAQ levels. On the other hand, AI-based prediction systems can deliver prior information about upcoming critical changes in IAQ levels. Hence, building occupants can take preventive measures to avoid serious health impacts [5,125]. In recent years, research communities have been exploring the potential of AI to design intelligent environments where building occupants get automatic, real-time updates about changing environmental conditions [24,95,135]. This vision has led them to the concept of ambient intelligence, which further contributes to the development of smart environments for healthy living [30,110,111]. Several researchers have proposed efficient prediction systems for IAQ to improve public health and well-being [70,109,115,139,140]. 
These studies can enhance the daily activity level while providing better scope for ventilation arrangements. Also, these applications can assist in the development of favorable ambient assisted living systems and improved productivity levels at office premises.
In sum, IAQ has a considerable impact on public health and well-being. Therefore, it is a critical matter of concern for both developed and developing countries [20]. This paper reviews the application of AI for the prediction of IAQ to enhance the building environment and public health. The main objective of this work is to study the potential of AI methods for developing smart IAQ systems to enhance building environments. To achieve this, an in-depth analysis is performed on IAQ prediction systems, considering the features used to evaluate pollutant levels in the indoor environment, the accuracy of the existing systems, and the prediction interval for which the system predicts IAQ conditions. The scope of this systematic review consists of an analysis of the AI-based forecasting approaches proposed by researchers from different countries with unique demographic conditions such as domestic environment, IAP variables and socioeconomic status [40,99,103].
This review will help to find answers for potential research questions while highlighting the new problem domain in which future researchers need to put effort. Moreover, this systematic review provides a detailed comparison of existing AI-based IAQ prediction systems for smart environments. It highlights the potential of the specific techniques, along with the impact of several feature extraction methods. In addition, this paper aims to summarize the findings achieved by previous studies regarding the accuracy and methods used.
The rest of this review article is structured as follows: Section 2 provides the methodology with research questions, inclusion and exclusion criteria, search strategy, study selection and risk of bias. Section 3 includes the results and discussions along with answers to the RQs. Finally, Section 4 presents the conclusion.
Materials and methods
This systematic review is conducted using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) methodology. This is an evidence-based reporting technique with a minimum set of items for meta-analyses and systematic reviews. In order to address the challenges associated with IAQ prediction, the process was divided into several steps. In the first step, the relevant research questions were identified and a search strategy was developed following specific search keywords and strings. After this, the inclusion and exclusion criteria were defined to ease the selection of the most relevant papers from the existing databases. Next, data extraction was carried out based on the pre-defined research questions. Finally, the answers to these questions were given while highlighting the challenges, opportunities and limitations in the field. These steps are described in the subsequent sections.
Research questions
The rising number of health problems due to poor building environments is a matter of concern for government agencies and policymakers as well. It is essential to address the challenges by utilizing the latest technologies, and AI shows potential in this direction. However, future research needs to examine the critical aspects to design a more reliable solution for IAQ management. The authors in this systematic review identified essential research questions and tried to find relevant answers through this detailed study. Therefore, the research questions for this systematic review are:
RQ1: What are the system architectures used for IAQ data collection and how are the data collected?
RQ2: What are the features or input parameters used to process the IAQ data for prediction system design?
RQ3: What are the widely used AI methods for IAQ prediction?
RQ4: What are the accuracies and prediction times of these methods?
RQ5: What application domains are addressed by existing publications?
RQ6: How can these systems be integrated into smart building systems?
RQ7: How are the results of IAQ systems presented to the end-users?
These questions have been established by the authors to achieve the main contribution of this paper, that is to present a systematic review of AI methods used for IAQ prediction. Moreover, this paper aims to provide a comprehensive review of the main features used for the prediction, the accuracies achieved, the data collection techniques, the period of prediction, and state the future research challenges and opportunities.
Search strategy
To address the research questions, the authors used three different databases: PubMed, IEEE and ACM. The search for relevant publications was initiated on 27th March 2020, and a filter to select studies published after 2008 was applied. The initial search query used the following combination of keywords: “indoor air quality AND (prediction OR forecasting)”.
In total, 235 documents were identified, out of which 159 were obtained from PubMed, 41 from IEEE and 35 from ACM database. These studies were further processed as per the inclusion and exclusion criteria.
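The boolean structure of the search string can be illustrated with a small filter. This is only a sketch: the function name `matches_query` and the sample titles are hypothetical, and each database applies its own indexing and field matching, which plain substring tests do not reproduce. The record counts per database are those reported above.

```python
def matches_query(text: str) -> bool:
    """Check: indoor air quality AND (prediction OR forecasting)."""
    lowered = text.lower()
    has_topic = "indoor air quality" in lowered
    has_task = "prediction" in lowered or "forecasting" in lowered
    return has_topic and has_task


# Hypothetical titles, used only to exercise the filter.
sample_titles = [
    "Forecasting indoor air quality with recurrent networks",
    "Indoor air quality monitoring with low-cost sensors",
    "Prediction of outdoor ozone levels",
]
selected = [title for title in sample_titles if matches_query(title)]

# Records identified per database, as reported in this review.
records = {"PubMed": 159, "IEEE": 41, "ACM": 35}
total_records = sum(records.values())
```

Only the first sample title satisfies both parts of the query, and the three database counts add up to the 235 documents identified.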
Inclusion and exclusion criteria
All the authors independently evaluated all papers, which were selected for analysis by the cumulative agreement of all parties. The documents were analyzed to address the different methods related to the implementation of AI methods for IAQ prediction. The selection of the papers for inclusion in this review was made if the research satisfied the following eligibility criteria.
Inclusion criteria: (1) Research studies that include IAQ prediction based on methods related to AI sub-domains; (2) The information about the data used and their origin must be present in the document; (3) The paper must concern an analysis of indoor living environments; (4) The study must present at least one prediction metric; (5) The information of the indoor parameters monitored, or the instruments used must be presented in the document; (6) The research paper must be written in English and published after 2008.
Exclusion criteria: (1) Duplicate papers; (2) Publications that are secondary studies, such as reviews, study paper or demo papers; (3) Papers that do not provide clear insights about the prediction system and performance parameters; (4) Papers that are relevant to outdoor environments only.
Study selection
All publications obtained after applying the initial search query were analyzed as per the PRISMA guidelines. First, the documents were checked for duplicates, and two papers were rejected at this stage. The remaining 233 papers were transferred for a second-level screening. The relevance of the papers was then assessed by considering the title and abstract, and 193 papers were excluded because they did not meet the specified inclusion and exclusion criteria. Most of these papers were literature reviews from the environmental science field, studies about IAQ exposure and its effects on people’s health, studies on building-related problems, IoT and WSN architectures for IAQ supervision, research on HVAC systems and sensors, computational fluid dynamics IAQ models, or IAQ prediction methods using theoretical and mathematical approaches, and were not related to artificial intelligence methods.
After applying the above-mentioned eligibility criteria, the authors obtained 40 papers for the third stage, which were studied in detail. In this list, two papers focused only on outdoor air quality [64,97], and eight papers did not include any AI-specific prediction algorithms [22,31,47,50,84,91,116,122] or were based on mathematical approaches. Three papers [12,96,137] focused only on thermal comfort (temperature and/or humidity data) or other smart building aspects instead of air quality. Moreover, two studies [89,120] were rejected because they were limited to a monitoring system design and no prediction system was implemented. One paper [9] had a relevant abstract, but the authors did not specify the prediction methods. Similarly, the authors of [108] did not specify the AI method used for prediction. Therefore, these studies did not meet the first inclusion criterion and were excluded. Besides this, the document presented in [124] was excluded from the study. Its authors used a fuzzy control method, but evaluation parameters such as prediction accuracy or period were not defined. Therefore, this study does not fulfil the fourth inclusion criterion. The researchers in [23] applied advanced fuzzy control theory to control IAQ and reduce energy consumption. They worked on the prediction of the Air Quality Index, and the parameters considered to control indoor environment conditions were indoor temperature, air quality and humidity. The fuzzy prediction system helped to strike a balance between IAQ and energy use, and energy consumption was optimized at the same time. The authors implemented real-time monitoring and concluded that the system is correct and feasible. However, this study does not provide details on prediction accuracy and period. Therefore, it did not fulfil inclusion criterion no. 4 and was not included in the meta-analysis. 
Finally, this systematic review analyzed 21 papers that were relevant to address the research questions identified at the beginning of this section. Clear insights concerning the selection process as per the PRISMA guidelines are presented in Fig. 1.

PRISMA flow diagram for studies included in this systematic review.
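The counts reported in this section can be checked with a short tally. The sketch below only restates the numbers given in the text. The grouping labels for the full-text exclusions are paraphrases introduced here for illustration, not categories defined by the PRISMA guidelines.

```python
identified = 235                      # records returned by the initial query
duplicates = 2                        # removed before screening
screened = identified - duplicates
excluded_on_title_abstract = 193
full_text_assessed = screened - excluded_on_title_abstract

# Full-text exclusions as itemised in the text (labels are paraphrased).
full_text_exclusions = {
    "outdoor air quality only": 2,
    "no AI-specific prediction algorithm": 8,
    "thermal comfort or other smart building aspects": 3,
    "monitoring system only, no prediction": 2,
    "prediction method not specified": 2,
    "no prediction accuracy or period reported": 2,
}
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_assessed - excluded_full_text
```

The tally gives 233 screened records, 40 full-text papers, 19 full-text exclusions and 21 included studies, which matches the figures stated in the text.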
The relevant data was extracted from the selected publications for further analysis. In order to conduct this systematic review, the following information was extracted:
Author details, titles and abstracts.
Year of publication and associated database.
Focused geographical area and application.
Pollutant type, sensors used, and calibration status of sensors.
AI method used for IAQ prediction.
After data extraction, the included publications were synthesized and analyzed in detail to obtain answers for the pre-defined research questions.
Risk of bias
The main limitation of a systematic review is that it is influenced by bias. The first risk of bias arises during the selection of the initial keywords or search string used to query the databases. Moreover, the subjectivity of the eligibility criteria defined by the authors increases bias at the screening stage. Furthermore, the search only included three databases (PubMed, IEEE and ACM). However, based on the PRISMA guidelines, the authors tried to follow the best possible criteria and procedures for completing this systematic review. Although earlier researchers have conducted several reviews of AI-based IAQ prediction systems [11,55,90,101], they did not consider all of these RQs, especially RQ2, RQ3 and RQ4. The information provided for all these factors makes this review a valuable addition to the scientific community.
Results and discussion
This systematic review includes 21 studies on AI-based IAQ prediction systems from three different databases out of which eight studies (38.09%) were included from PubMed, seven (33.3%) from IEEE and six (28.57%) from ACM (Table 1).
Year-wise distribution of papers from the three databases
Table 2 summarizes the studies based on the origin of the selected papers. Of the included publications, six studies (28.57%) were conducted in China, four (19%) in the USA and three (14.28%) in Korea. One study each was included from the other countries listed in Table 2. However, no study was included from other developing countries such as India, Nepal and Bangladesh, which are greatly affected by IAP due to inadequate ventilation arrangements [13,42,48,53,83,100]. As the majority of the population in these countries uses biomass fuels for cooking and heating, researchers in these locations need to participate actively in the development of IAQ monitoring and prediction systems that can provide more accurate results based on specific geographic conditions and pollutant concentrations [61,85,138].
Country-wise distribution of included publications
Furthermore, the synthesis process that focused on extracting relevant information from the included publications is presented in Tables 3, 4 and 5. Table 3 lists technical insights of the system designed by previous researchers, Table 4 presents details about the focus IAQ parameters, and Table 5 provides an analysis of extracted features and performance parameters of the existing systems.
List of papers included in the study
Answer to RQ1
The first research question concerned the types of system architectures used and the methods of data collection. Accordingly, the studies can be divided into four groups: (1) studies based on real-time monitoring systems designed by the researchers, (2) studies using commercial monitoring solutions, (3) studies that used data obtained from already installed or government-operated systems and (4) studies using mobile stations or wearable sensors. The results are summarized in Table 6.
Most of the reviewed studies use data acquisition systems either developed by the authors or commercially available ones for data collection. In total, 57.14% (
To ensure accuracy in real-time data collection, researchers used either expensive, highly calibrated sensor units or low-cost sensors with specific calibration arrangements. Several air pollutants affect the indoor environment. However, different researchers have focused on different sets of pollutants to predict future conditions. An analysis of the main parameters for data collection is provided in Table 4.
Different IAQ pollutants measured by researchers in different studies
F.: Formaldehyde; Tol: Toluene; Eth.: Ethanol; Ben.: Benzene; Air.: Airborne bacteria; Fun.: Fungi; T.: Temperature; R.H.: Relative Humidity.
Essential details extracted from all papers
Data collection methods
This analysis reveals that 66.6% (
Besides this, particulate matters (
Answer to RQ2
The second research question concerned the features or input parameters used for designing a prediction system. Feature extraction and input parameter selection play an essential role in designing an AI-based prediction system. The performance of the prediction model is highly dependent on the type of features used for network training. The list of features used in the 21 included papers is presented in Table 5 (column 3).
In total, four studies [24,66,119,133] presented an analysis of the sensitivity of selected features. To achieve higher forecasting accuracy, it is essential that the network is trained with the most relevant features, because irrelevant or less relevant features can degrade network performance [16,69,86,92,102].
Seven studies used measured input parameters as training parameters [4,21,66,113,119,134] and five studies considered statistical analysis of features to ensure that the most relevant features are fed to the network [2,75,77,132,133]. One study [66] provided a clear analysis of the relevance of features and how their inclusion or exclusion affects the performance of the IAQ prediction system. The researchers in this paper executed different cases with unique input parameter selections and visualized network performance for those changes. The analysis shows that bad or irrelevant parameters degrade the performance of the prediction system.
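A statistical screening of candidate features can be sketched as follows. This is an illustrative example rather than the procedure of any reviewed study: it assumes Pearson correlation with the target as the relevance score, and the data (an occupancy count, an unrelated noise channel and a CO2 target) are synthetic.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)
n_samples = 200
# Synthetic data: CO2 rises with occupancy, while "noise" is unrelated to it.
occupancy = [random.randint(0, 20) for _ in range(n_samples)]
noise = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
co2 = [400 + 30 * o + random.gauss(0.0, 20.0) for o in occupancy]

features = {"occupancy": occupancy, "noise": noise}
# Rank candidate features by the absolute correlation with the target.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], co2)),
    reverse=True,
)
```

In this synthetic setting the occupancy feature ranks first, so a screening step of this kind would keep it and flag the noise channel as a candidate for removal. Correlation only captures linear dependence, so other relevance measures may be preferable for nonlinear models.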
Answer to RQ3
The third research question focuses on the AI methods used for IAQ prediction systems. As can be seen in Table 3, most of the researchers worked on different versions of neural networks. In total, the researchers of 10 studies (47.61%) used neural network-based methods. Two studies [131,136] followed ARIMA and two other studies used the GRU method for IAQ prediction. Besides this, one study each focused on ANFIS [134], Kalman filter [49], GA-based SVM [114], time slicer method [106], Bayesian inference [132] and decision tree regression [119]. However, none of these studies included fuzzy logic, which otherwise offers potential scope for forecasting problems [56,57]. Future researchers should focus on the application of fuzzy logic and other relevant machine learning methods for forecasting IAQ conditions [32]. LSTM is another crucial solution, and several researchers considered this technique (see Table 5) to compare the performance of their proposed methods and to validate the quality of results. It is also possible to create hybrid forecasting techniques by combining these available methods or by utilizing the potential of optimization techniques such as PSO, GA and simulated annealing [63,98].
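To make the idea of time-series forecasting concrete, the following sketch fits a first-order autoregressive model, which is the simplest special case of the ARIMA family mentioned above. It is a deliberately reduced illustration and not the model used in any reviewed study: the function names `fit_ar1` and `forecast` and the noise-free synthetic series are assumptions made here.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast(series, steps):
    """Iterate the fitted recursion `steps` times from the last observation."""
    c, phi = fit_ar1(series)
    value = series[-1]
    predictions = []
    for _ in range(steps):
        value = c + phi * value
        predictions.append(value)
    return predictions


# Synthetic series generated exactly by x[t] = 100 + 0.8 * x[t-1].
history = [1000.0]
for _ in range(20):
    history.append(100 + 0.8 * history[-1])

next_values = forecast(history, steps=3)
```

Because the synthetic series is noise-free, the fit recovers the generating coefficients and the forecasts continue the decay towards the fixed point of 500. Real IAQ series are noisy and often seasonal, which is why the reviewed studies rely on richer models such as neural networks, GRU or full ARIMA.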
Answer to RQ4
In terms of the accuracies and prediction times of the existing models, essential details are given in Table 5. Researchers focused on common parameters to evaluate the performance of the prediction system, which are listed in Table 7.
As can be seen, 42.85% (
Evaluation parameters used in different studies
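As a reminder of how such evaluation parameters are computed, the sketch below implements RMSE, which the abstract names among the preferred metrics, together with the mean absolute error (MAE). MAE is included here as an assumption, since it is a common companion metric. The readings are made-up values in ppm.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Illustrative CO2 readings in ppm (made-up values).
actual = [600.0, 650.0, 700.0, 800.0]
predicted = [610.0, 640.0, 720.0, 770.0]

rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

For these values the squared errors sum to 1500, so the RMSE is the square root of 375 (about 19.4 ppm), while the MAE is 17.5 ppm. RMSE penalizes the single 30 ppm miss more heavily than MAE does.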
Answer to RQ5
This RQ focuses on the application domains that are addressed by existing publications. As can be seen from Table 3, 11 (52.38%) out of 21 studies were executed on IAQ data collected from office buildings, which were institute labs, staff rooms or traffic-prone workspaces. The data collected in six studies (28.57%) [2,4,38,78,119,131] are related to residential buildings. Three studies [24,113,134] focused on other commercial buildings such as gyms and shopping malls, and two studies were conducted in indoor spaces such as waiting rooms of subway stations. IAQ has been a considerable challenge for people who spend most of their routine time indoors. A considerable number of health issues among employees in offices and industrial units are reported due to unfavourable environmental conditions. The excessive use of chemical-rich cleaning agents and fragrance solutions poses a further threat to the overall health and well-being of employees [46,71]. Furthermore, the risks are more significant in remote areas where people use traditional sources such as wood, coal and kerosene for cooking and heating [60]. Women, children and elderly members of such poor families are at a higher risk since they spend 80–90% of their routine time indoors [103]. The main concern while designing IAQ monitoring and prediction systems is that the final product must be cost-effective, easy to use and simple to install in rural as well as urban areas [68]. Besides this, future researchers need to address the issues related to battery consumption, type of sensors, communication mechanism and system architecture [104]. An adequate combination of hardware and software is a must to achieve real-time IAQ monitoring and prediction goals. At the same time, policymakers need to raise awareness about the use of real-time monitoring systems so that more people consider installing them.
Answer to RQ6
RQ6 concerns the opportunities for integrating IAQ prediction systems with smart building systems. In total, seven (33.3%) out of 21 studies [2,24,33,49,106,119,134] were based on smart building solutions where the IAQ prediction system was integrated with other smart solutions on the premises for improved lifestyle and well-being. The remaining 14 studies (66.6%) were independent solutions where researchers worked solely on IAQ monitoring and prediction. Ventilation is one of the main concerns in modern as well as traditional houses. New IAQ prediction systems must be integrated into automated ventilation management so that adequate arrangements for the circulation of fresh air can be made on time. The prediction systems can provide updates about future IAQ levels, and the smart building management can be adjusted accordingly to prevent serious health consequences for the building occupants. As can be seen in Table 5, four studies [2,33,77,119] provided information about the number of predicted hours using their proposed method. One study [33] claimed prediction for the next 24 hours, whereas the authors of [77] proposed prediction for 6, 12, 18 and 24 hours. Furthermore, the authors of [2] reported predictions for horizons of 30 minutes and one hour only. The number of predicted hours is crucial for real-time systems as it can help occupants make prior arrangements for expected critical changes in pollutant concentrations [30]. This information could be essential for disabled patients and those suffering from chronic conditions such as respiratory health problems or cardiovascular disease.
Answer to RQ7
Finally, RQ7 focuses on the methods used by earlier researchers to present IAQ system results to the end-users. The field of research is not restricted to the design and development of the IAQ prediction system. Future researchers also need to be careful about how the predictions or final results of monitoring systems are presented to the end-users. As can be seen in Table 3, four studies (19.04%) presented the results of the prediction system on a web-based solution. Alternatively, four other studies (19.04%) preferred designing a smartphone application. Details about the end-user interface were missing from the remaining studies. The overall effectiveness of the IAQ prediction system depends on how accessible the results are to end-users. The design should not be limited to smartphone applications and web-based platforms. It is equally relevant to provide alerts for predicted critical situations so that building occupants can take immediate actions for ventilation arrangements [30]. The triggers must be further connected to the smart building management systems to control all mechanisms accordingly.
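The difference between a threshold-triggered alert and an alert raised from a predicted critical situation can be sketched as follows. All names and numbers are hypothetical: the 1000 ppm limit, the current reading and the forecast values are illustrative assumptions and are not taken from the reviewed studies.

```python
def threshold_alert(current_ppm, limit_ppm):
    """Reactive alert: fires only once the current reading exceeds the limit."""
    return current_ppm > limit_ppm


def predictive_alert(forecast_ppm, limit_ppm, step_minutes):
    """Return minutes until the first forecast value above the limit, or None."""
    for step, value in enumerate(forecast_ppm, start=1):
        if value > limit_ppm:
            return step * step_minutes
    return None


LIMIT_PPM = 1000.0                               # hypothetical CO2 limit
current = 850.0                                  # hypothetical current reading
forecast_ppm = [900.0, 960.0, 1020.0, 1080.0]    # hypothetical 30-minute steps

reactive = threshold_alert(current, LIMIT_PPM)
lead_time = predictive_alert(forecast_ppm, LIMIT_PPM, step_minutes=30)
```

With these numbers the reactive check stays silent because the current reading is below the limit, while the predictive check reports a lead time of 90 minutes. That lead time is what would allow a building management system to start ventilation before the limit is reached.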
Conclusion
This study conducted a systematic literature review on IAQ prediction systems based on AI methods. The review was performed by studying and analyzing academic papers published in PubMed, IEEE and ACM databases. The most relevant articles were analyzed as per the pre-defined RQs and eligibility criteria, which helped to highlight the potential of AI to address IAQ-related problems.
The trend for IAQ monitoring has become a dominant concept in most developing countries where a significant part of the population is dependent on traditional cooking, heating measures and use of inadequate ventilation arrangements [103]. Furthermore, the forecasting of IAQ conditions ahead of time has become an essential concern for improved public health and well-being for enhanced ambient intelligence and smart environments. In this study, 47.61% of the reviewed papers (
The researchers have shown interest in measuring a variety of IAQ pollutants, 66.6% (
The analyzed literature presents the potential of deep learning, machine learning and neural networks for enhanced living environments and occupational health in the smart environments. Nevertheless, this literature review has limitations. For this study, only papers in English from PubMed, IEEE and ACM were considered. This study may help to outline crucial possibilities in the field of IAQ and public health management. At the same time, this literature review states multiple challenges regarding the current state-of-the-art for smart environments. Future research also needs to evaluate the impact of different pollutants based on different geographical conditions and variable living arrangements. Another critical area of work is the development of the most adequate and highly calibrated sensor networks to measure IAQ levels on a real-time basis. Furthermore, future research needs to ensure that developed systems are useful on a real-time basis for rural areas, where people might not be able to afford more expensive solutions.
Conflict of interest
None to report.
