Abstract
The present work reports the impacts on urban mobility and air quality in Lisbon, Portugal, of the restrictions imposed to curb the transmission of the SARS-CoV-2 virus, which causes COVID-19. We applied a data-driven approach to Lisbon smart city data collected from several sources, such as traffic and pollution data. During the first Portuguese emergency period (18-03-2020 to 03-05-2020), the sharp reductions in anthropogenic activities, most importantly road traffic, resulted in generally reduced criteria air pollutant concentrations compared to a homologous 2013–2019 baseline measured at the six air quality monitoring stations throughout the city. The most strongly affected air pollutant was NO2, with a reduction of 54.35% in traffic stations and 28.62% in background stations. The Google mobility indicator for local commerce was found to be the main anthropogenic activity indicator for Lisbon, with a moderate positive correlation with NO2 concentration (r=+0.54). A regression ML pipeline was trained to predict NO2 concentration from the available anthropogenic activity, weather, and air pollutant inputs from March/2020 to March/2021, achieving R2 = 0.925 on the test set.
Introduction
In April 2020, the month with the most severe levels of confinement, the consumption of liquid fuels in Portugal suffered unprecedented homologous reductions (Sugawara et al., 2021), −58.6% for gasoline and −47.0% for diesel, and electrical energy consumption had a homologous drop of −13.8%, while 69% of the electricity came from renewable sources, 14% was imported and the remaining 17% came from non-renewable sources (REN - Consumo de energia elétrica recua 12% em abril). Changes in economic activity and telework promoted drastic changes in energy consumption (Hook et al., Aug. 2020), namely in domestic energy consumption, which in April 2020 had a homologous increase of 31%, while industry energy consumption dropped 17% and services dropped 43%. In April 2020, the maritime port of Lisbon registered a decrease of 47.70% in ships docked (Estatísticas, Statistics Port of Lisbon), and Lisbon International Airport registered a drop of 95.88% in aircraft movements (Passenger traffic at the Lisbon Airport).
Throughout the most severe lockdown periods, government agencies, environmental associations and the press reported exceptional improvement of air quality in Lisbon and the cause was associated with the sharp reduction of anthropogenic pollutant activities. It is of utmost importance to study the phenomena of air pollution response to the variation of anthropogenic activities to better understand and quantify its causal relation and support the decision process related to environmental policies.
In Portugal, in 2019, of all registered vehicles (Motorised road vehicles (No.) by Vehicle Type and Fuel Type), diesel engines accounted for 65.47%, gasoline engines for 32.46%, LPG engines for 0.83% and other motorization types, which include all types of electric vehicles, for 1.24% (Associação de Utilizadores de Veículos Elétricos (UVE), Jan. 5, 2023). Moreover, in 2019, 62.02% of the Portuguese vehicle fleet was over 10 years old (Motorised road vehicles (No.) by Vehicle Type and Fuel Type), and these older engines comply with less stringent European emission standards than newer engines [Commission Regulation (EU)].
The 2017 Mobility Survey for the Lisbon Metropolitan Area [Mobility and functionality of the territory in the Metropolitan Areas of Porto and Lisbon, 2017] reports that 60.8% of all trips are made using private passenger vehicles (cars and motorbikes), and that the occupation ratio of passenger vehicles is 1.60 persons. On the other hand, only 15.8% of trips are made using public transportation and 23.5% of trips are made with soft transportation means, of which only 0.5% of trips are made using bicycles.
The low electrification of the Portuguese vehicle fleet, the large share of diesel engines and older vehicles (pre-EURO 5/6), high urban road congestion levels and low adoption of public transport, make urban air pollution concentrations, namely NOx and particulate matter (PM) pollutants, in an urban setting such as Lisbon a tough phenomenon to tackle (Hooftman et al., Feb. 2016).
In the Lisbon and Tagus Valley area, the main contributing anthropogenic sources of primary air pollutants are predominantly road transport vehicles (Coordenação et al., 2019) with NOx contribution of 63%, Carbon Monoxide (CO) contribution of 78% and PM10 contribution of 62%, as well as the minor contribution from other forms of transportation like air and sea transport, which were all gravely impacted by restrictive mobility measures to control the COVID-19 pandemic.
Air pollution in urban settings is a major source of concern due to its adverse effects on human health, leading to increased respiratory and cardiovascular disease development and premature death (Ghorani-Azam et al., Sept. 2016). Other impacts of air pollution include increased sick leave for workers and students, lower standards of living for vulnerable groups, such as asthmatic and elderly citizens, and an added cost burden on public health systems. WHO estimates that 4.2 million yearly deaths were related to air pollution in 2016 (W. H. Organization, 2021). EEA estimates that 6,690 premature deaths in Portugal (2018) were caused by three main air pollutants (NO2, PM2.5 and O3) (E.E.A., 2018), and a study of 858 European cities for the year 2015 estimates that 1,837 premature yearly deaths in the Lisbon Metropolitan Area are related to two main air pollutants (NO2, PM2.5) (Khomenko et al., 2021), ranking Lisbon in the 116th worst position for NO2 related premature deaths and the 514th worst position for PM2.5 related premature deaths.
The COVID-19 pandemic management by the Portuguese government and local Lisbon Metropolitan Area authorities included highly restrictive measures that severely reduced urban mobility in Lisbon for all transportation forms and for a large period of time. It is a once in a lifetime opportunity to measure and study the effects of reduced pollutant anthropogenic activities in an urban setting, which were previously only possible to simulate using atmospheric chemistry simulation techniques, in order to aid the definition and prioritization of air pollution reduction policies.
That said, we will use sensor data from the city and correlate mobility and pollution in order to perform a complete data analysis process. Once we understand the relationships between the variables, we will apply a machine-learning algorithm to our data to predict the concentration of NO2.
City sensors allow measuring traffic and air quality. This data can be integrated with other sources to produce knowledge and check correlations among mobility patterns and pollution in a city.
Literature review
Google Scholar was used to search the existing relevant scholarly literature. Because Google Scholar is an academic search engine rather than a scholarly literature database, it has no editorial review board and content quality is evaluated solely by means of specialized algorithms. Special care was therefore taken to assess the source and quality of the papers, namely whether they come from peer-reviewed journals, and to sort out conference proceedings, reports, and other documents as grey literature.
Literature for three different relevant research topics was searched separately. The literature research topics are as follows:
COVID-19 pandemic restrictions impact on urban air pollutant concentration (full-text articles analyzed: 17)
Machine learning methods for air pollutant concentration prediction (full-text articles analyzed: 11)
In addition to scholarly literature, several searches for grey literature, in the form of official technical and statistical reports, as well as other publications like fact sheets, were conducted in trusted and/or official Portuguese, European Union and International agencies related to the environment, demography, mobility, etc. (13/38 references). Finally, press articles related to pandemic management and the environment in Lisbon were also searched through Google News.
Impact of COVID-19 restrictions on urban air quality
The impact of the sharp reductions in anthropogenic activities during COVID-19 lockdowns on primary and secondary air pollutant concentrations was widely analyzed in many countries. Rana et al. (2021) carried out a systematic literature review of the impact of COVID-19 restrictions on air quality in China using PRISMA guidelines, in which 35 studies out of 396 met the eligibility criteria and were thoroughly analyzed. All articles were published in 2020 and used either satellite (12) or ground monitoring (23) air pollutant concentration data, with temporal comparison to both pre and post lockdown periods in 2020 (8), comparison to the same period in 2019 (15), or comparison using period ranges from 2015 to 2019 (12). The most studied air pollutant was
Among other studies, Bauwens et al. (2020) used satellite observations of the NO2 vertical column from two satellite instruments, TROPOMI (TROPOspheric Monitoring Instrument) on board the Sentinel-5P satellite and OMI (Ozone Monitoring Instrument) on board the Aura satellite, to reveal sharp reductions of NO2 concentrations during the COVID-19 related lockdown phases in multiple cities around the world when compared to pre-lockdown periods and homologous periods in 2019. Notable examples include −43% to −57% in Wuhan, China, and −31% to −32% in Barcelona, Spain.
Connerton et al. (2020) used air quality data from ground monitoring stations in four megacities during the initial COVID-19 lockdown period in March 2020 and, through statistical analysis, measured pollutant concentration reductions of, for instance in Paris, France, 67% for CO, 39% for NO2 and 29% for PM2.5 when compared to a 2015–2019 air pollutant baseline. To isolate the contribution of reduced anthropogenic activities to the measured concentration changes during the pandemic period, the authors accounted for natural atmospheric phenomena that can disperse air pollutants or facilitate the atmospheric chemical reactions that consume them, such as wind speed, air temperature and relative humidity. A general linear model was fitted with meteorological measures and a lockdown indicator variable as independent variables and air pollutant concentration as the dependent variable. It was found that, while both anthropogenic activities and meteorology significantly influence air pollutant concentrations, the contribution of the reduction in anthropogenic activities was larger.
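To make the idea of a linear model with meteorological predictors and a lockdown indicator more concrete, the following is a minimal sketch in plain Python. It is not the model fitted by Connerton et al.: the data are synthetic and noise-free, only one meteorological predictor (wind speed) is used, and the helper names `solve_linear_system` and `fit_ols` are hypothetical. The coefficient of the 0/1 lockdown indicator is the estimated change in concentration attributable to the lockdown once wind speed is held fixed.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical daily records: (wind speed in m/s, lockdown indicator 0/1).
predictors = [(2.0, 0), (4.0, 0), (6.0, 0), (3.0, 1), (5.0, 1), (7.0, 1)]
# Synthetic NO2 values generated as 50 - 2 * wind - 15 * lockdown (no noise).
no2 = [50 - 2 * wind - 15 * lockdown for wind, lockdown in predictors]

intercept, wind_coef, lockdown_coef = fit_ols(predictors, no2)
print(intercept, wind_coef, lockdown_coef)
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients; with real measurements the lockdown coefficient would come with an uncertainty that has to be tested for significance.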
Sicard et al. (2020) investigated the effect of sharp reductions of NOx emissions on O3 concentrations during the COVID-19 related lockdowns in European and Chinese cities. Their results suggest that local urban O3 concentrations greatly increased as NOx emissions decreased, due to reduced O3 titration by NO, which raises the need to control VOC emissions in order to balance the NOx : VOC ratio that is key in tropospheric O3 formation. For instance, during the 2020 COVID-19 related lockdown period in Wuhan, China, the daily mean O3 concentration increased by 36% while NO2 concentrations decreased by 57%, when compared to a computed 2017–2019 baseline for the same period.
Machine learning based air pollutant concentration prediction
Advanced atmospheric chemical transport frameworks (E.P.A.) for modeling air pollution, such as CAMx or CMAQ, take as inputs a weather forecast model and an emission inventory for anthropogenic and biogenic sources and, using a chemical transport model, can output a 3D air pollution concentration map and compute a source apportionment analysis. Alternatively, instead of a framework that simulates complex atmospheric chemical interactions and can rely on outdated or incorrectly estimated emission inventories, data-driven approaches use statistical and machine learning techniques to model air pollutant concentrations, capturing the complex non-linear relationships among the several variables that determine the concentration of a given air pollutant at a specific place and time. Rybarczyk and Zalakeviciute (2018) performed a systematic review of machine learning approaches to outdoor air pollution modeling following the PRISMA guidelines, in which 46 out of 103 papers published from 2010 to 2018 met the eligibility criteria. They found that the volume of publications applying machine learning techniques to air quality has ramped up from 2016 onwards, that most publications concern northern hemisphere geographies and criteria pollutants (NO2, SO2, CO, PM10, PM2.5 and O3) or the air quality index (AQI), and that the largest group of publications deals with identifying relevant predictor variables and modeling non-linear relationships between variables in air pollution. The main algorithm classes found in the publications, ordered by prevalence, were Ensembles (mainly tree-based predictors), ANNs, SVMs and LRs.
Further analyzing more recent publications, Vu et al. (2019) trained a Random Forest ML algorithm and compared prediction results for 2017 PM2.5 concentrations in Beijing with those outputted by a WRF-CMAQ model and were able to produce a slightly more accurate value for the yearly mean PM2.5 concentration, 61.8–62.4 µg/m3 for the WRF-CMAQ model and 61.0 µg/m3 for the RF ML model, whereas the observed value was 58.0 µg/m3. At the month granularity for 2017, the WRF-CMAQ model concentration predictions ranged from 3% to 33.6% difference when compared to the observed values, a mean difference of 7.8%, whereas the RF ML model predictions ranged from 0.4% to 7.9%, a mean difference of 1.5%. Castelli et al. (2020) employed SVR (Support Vector Regression) algorithm to predict air pollutant concentrations such as NO2, CO, SO2, O3 and PM2.5 in California for years 2016 to 2018, using meteorological measures, pollutant concentration rolling means and timeseries features as predictors, achieving R2 = 0.937 on the validation set using the RBF (Radial Basis Function) kernel for NO2 forecasting. Luna et al. (2014) also employed an SVR (Support Vector Regression) algorithm to predict O3 concentrations in Rio de Janeiro using ground monitoring stations data from 2011 and 2012, using as data features the chemical precursors, such as NO, NO2, NOx and CO, as well as meteorological factors, namely wind speed, solar radiation, air temperature and relative humidity. This model achieved R2 = 0.912 on the validation set, having also trained an ANN (Artificial Neural Network) using the same data source to solve the same problem and achieving R2 = 0.915 on the same validation set.
Data analysis
To pursue the objectives of this work, and in line with the literature review, several data sources are required to properly model the urban air pollution phenomena: (1) air pollutant concentrations, (2) meteorological parameters and (3) direct or indirect indicators of anthropogenic activity. Apart from data available online, several public entities and companies were contacted directly to acquire such data. A high-level summary of the most relevant data sources used in the present work can be found in Juma (2021).
After the data extraction, transformation and loading process, the resulting analytical data model loaded into PowerBI was used to support all of the visualizations and consolidated data extractions for machine learning purposes. A simplified diagram of the data model can be found in Figure 1, the summary of the datasets in Table 1, and a more detailed description of all the datasets used in (Juma, Oct. 2021).

Simplified analytical data model built in the present work.
Datasets description.
The present work mainly focuses on the months between March and July 2020, where the two main confinement periods occurred, to understand the impacts of restriction measures introduced by the authorities to manage the pandemic in the urban environment and mobility. The preemptive approach to the management of the pandemic by the Portuguese government, which included strict lockdowns, resulted in the limited spread of COVID-19 in the Lisbon and Tagus Valley health region as depicted in Figure 2. This region (a NUTS II unit until 2000) is often used as a standard for transport-related problems in the Lisbon area due to its high internal connectivity.

Main phases of the pandemic management in the Lisbon Metropolitan Area.
According to the index published by a public transportation mobility app (Moovit App, 2020) depicted in Figure 3, the most affected period was 09-04-2020 to 16-04-2020, when the 7-day rolling percentage change against the baseline indicated a 78.30% decrease in public transport usage. Even after the first state of emergency ended on the 2nd of May 2020, demand for public transportation recovered only slightly, to values far lower than those of pre-pandemic periods, potentially due to the general practice of telework, large scale lay-offs, the reduced offer and capacity of public transportation equipment, nonexistent tourism, and fear of contagion due to the enclosed nature of public transportation equipment. During the first national emergency period (18-03-2020 to 03-05-2020), the mean reduction in public transportation demand is estimated at 75.63% against the index baseline. The index would only return to pre-pandemic baseline levels on the 13th of July 2021.
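The exact construction of the Moovit index is not described here. As a rough sketch, assuming it is a daily percentage change against a fixed baseline value smoothed with a trailing 7-day mean, the computation could look as follows. The function names, the baseline of 100 and the daily usage values are all hypothetical and chosen only to illustrate the arithmetic.

```python
from typing import List


def pct_change_vs_baseline(values: List[float], baseline: float) -> List[float]:
    """Daily percentage change of each value against a fixed baseline."""
    return [100.0 * (v - baseline) / baseline for v in values]


def rolling_mean(values: List[float], window: int = 7) -> List[float]:
    """Trailing rolling mean; emits one value per complete window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical daily public transport usage, with a pre-pandemic baseline of 100.
daily_usage = [30.0, 25.0, 20.0, 20.0, 20.0, 25.0, 28.0, 22.0]
daily_change = pct_change_vs_baseline(daily_usage, baseline=100.0)
smoothed = rolling_mean(daily_change, window=7)
print(smoothed)
```

Smoothing over seven days removes the weekly cycle, so a weekend trough is not mistaken for a lockdown effect.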

Moovit Insights Public Transportation Index for Lisbon 2020.

Apple Mobility Trend Report for Lisbon.

Google Community Report for Lisbon 2020.
The Apple Mobility Trend Report (Apple Inc.), which is based on data from the navigation feature of Apple Maps, gives the daily percentage change of direction requests in the city of Lisbon, Portugal, against the baseline day of 13-01-2020 (a pre-lockdown Monday). It provides details on other urban mobility means, namely walking and driving, which are two of the Apple Maps app modes.
According to this index, the day with the largest reduction of walking navigation requests was the 29th of March 2020, with a reduction of 92.60%, and the day with the largest reduction of driving navigation requests was the 12th of April, with a reduction of 86.02%, as depicted in Figure 4. These results must be interpreted cautiously, since the baseline for comparison is a pre-lockdown Monday and both dates with the largest reductions during the lockdowns are Sundays, when urban mobility is naturally reduced.
Nevertheless, the results are consistent with the high initial adherence to lockdowns by the population: in the first national emergency state (18-03-2020 to 03-05-2020) walking trips fell by 89% and driving trips by 78.84% against the baseline, while during the first national calamity state (04-05-2020 to 01-07-2020) walking trips fell by 72% and driving trips by 43.05% against the baseline. From then on, lockdown erosion and continuous de-escalation of confinement contributed to a steady increase in both mobility patterns until the summer.
From the 1st of August 2020 onwards, the remaining 19 Lisbon City Hall Assembly parishes still in calamity state, which still had special mobility restrictions due to the virus incidence, joined the rest of the Lisbon parishes in the lower state of contingency. This appears to be correlated with a sharp increase in walking trips starting in early August, with additional potential contributions from tourism, which peaked in August 2020, and the vacation periods that might have been spent closer to home.
The Waze App index (Waze) follows a similar distribution to the Apple Driving index but with a negative offset that widens from the beginning of the lockdown period until the summer period, potentially because its values are expressed in driven kilometers and not in number of trips. It is possible that the average driven kilometers per trip during the lockdown periods were, overall, lower due to more direct trips and non-leisure use of the private vehicle.
During the first national emergency period (18-03-2020 to 03-05-2020), the Waze App registered a 75.63% decrease in daily driven kilometres, while the first state of calamity (04-05-2020 to 01-07-2020), which was marked by lockdown erosion and continuous de-escalation of confinement, registered a 56.13% decrease against the baseline.
According to the Google Community Mobility Index (Juma, Oct. 2021), a report on city mobility based on mobile phone data, depicted in Figure 5, the day with the sharpest reduction of activity in Grocery and Pharmacy (Commerce) was the 12th of April 2020 with a decrease of 84%, in Workplace was the 10th of April 2020 with a decrease of 89%, in Parks was the 5th of April 2020 with a decrease of 91%, in Retail and Leisure was the 12th of April 2020 with a decrease of 91%, and in Public Transportation was the 10th of April 2020 with a decrease of 90% (Juma, Oct. 2021). On the other hand, Residential activity showed a general increase, peaking on the 10th of April 2020 with a maximum of +47%.
As suggested by the mobility report of Figure 5, the Lisbon Subway has followed the same trend. Being one of the three main public transportation modes used in the city, it showcases the dramatic effect of the lockdown in the city.
The YoY analysis of subway ridership (based on a database with daily counts of subway trips in Lisbon) should be done with care, since from September 2019 onwards a change in the prices of recurrent titles (From April, your pass costs less - Metropolitano de Lisboa) increased the baseline usage of the subway network starting in April 2019. In the first two months of 2020, when no restrictions were in place, regular passes had an increase in ridership of 23%, free child passes of 39%, and overall ridership of 13% compared to the homologous months in 2019.
The most impacted month was April/2020, with a homologous decrease in total ridership of 84% when compared to 2019. For the same month, when compared to the January/2020 baseline, there is an 85% decrease in total ridership. The most impacted type of trip was the occasional title, which measured a homologous ridership decrease of 92% and a decrease of 89% when compared to January/2020. This was expected, since leisure and tourist ridership were virtually halted during the initial lockdowns.
Bike trips fell by 68.47% when compared to the same period in 2019 but recovered during the first state of calamity (04-05-2020 to 01-07-2020), measuring an increase of 22.59% in ridership when compared to the same period in 2019. After the first state of emergency, the usage of the shared GIRA bikes during the first state of calamity recovered far more strongly than that of other types of urban transportation, namely the subway, which still measured a 72.56% retraction against the same period in 2019. This could be attributed to increased demand for soft transportation means due to fear of contagion, or to GIRA bikes also being used as an outdoor leisure activity and not only as a means of transportation.
International non-resident business or leisure visitors make their way to Lisbon from their home countries primarily by means of air transportation (S.E.F./G.E.P.F., 2020). Besides aircraft emitted air and noise pollution, visitors temporarily enlarge Lisbon population and, therefore the human footprint in terms of anthropogenic pollutant activities and additional pressure on public transportation systems.
With regard to the Lisbon Airport data, the total number of non-transferred passengers during COVID-19 restrictions had a homologous reduction of 99.11% in the 15th week of the year, with only 4,423 passengers in 2020 (2020-04-05 to 2020-04-11) against 498,565 passengers in 2019 (2019-04-07 to 2019-04-13). The total number of aircraft movements had a homologous reduction of 96.44% in the same analysis period, with only 152 movements in 2020 against 4,273 movements in 2019, and some of the few flights were related to citizen repatriation efforts.
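The homologous reductions quoted for the airport can be reproduced directly from the passenger and movement counts given in the text. The short sketch below does that; the function name `homologous_change` is hypothetical, and the only inputs are the four counts reported for week 15 of 2019 and 2020.

```python
def homologous_change(current: float, previous: float) -> float:
    """Percentage change of `current` against the same period one year earlier."""
    return 100.0 * (current - previous) / previous


# Week 15 counts reported in the text (2020 versus 2019).
passenger_change = homologous_change(current=4423, previous=498565)
movement_change = homologous_change(current=152, previous=4273)

print(f"passengers: {passenger_change:.2f}%")
print(f"aircraft movements: {movement_change:.2f}%")
```

Comparing calendar weeks rather than exact dates keeps the weekday composition of both periods identical, which matters for traffic with a strong weekly cycle.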
There was a partial recovery from July 2020 onwards, related to the lifting of self-imposed travel restrictions by the Portuguese government, and the additional relaxation of travelling rules from August 2020 onwards also had an effect (Portugal, 2020). The overall volume of passengers was also affected by travel restrictions imposed by the countries of origin of tourists or expatriates that visit Portugal (England, Germany, France, Italy, Spain, The Netherlands, etc.) and by reciprocal restrictions that did not allow Portuguese citizens to travel to those countries.
By computing the Pearson Correlation of both public transportation mobility indexes for the year 2020 with an independent public transportation variable (subway ridership), it is found that Google Mobility Index for Public Transport correlates better than the Moovit COVID-19 Public Transport Impact Index (r = 0.99 vs r = 0.91) and is thus a mobility index that correctly models public transport commuter routines in the city. This could be useful to gauge changes in public transportation usage and to be used as a proxy indicator for public transportation usage in, for instance, the training of machine learning models.
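For reference, the Pearson correlation coefficient used to compare a mobility index with subway ridership can be computed as in the sketch below. The series are invented for illustration and are not the Lisbon data; the function name `pearson_r` is hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical mobility index (% change against baseline) and daily ridership.
mobility_index = [-10.0, -50.0, -90.0, -60.0, -30.0]
ridership = [900.0, 500.0, 100.0, 400.0, 700.0]  # exactly 1000 + 10 * index

r_index = pearson_r(mobility_index, ridership)
r_reversed = pearson_r(mobility_index, [-v for v in ridership])
print(r_index, r_reversed)
```

A coefficient close to 1 only indicates that the index tracks ridership linearly; it does not say that the index has the same scale as ridership.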
A visual summary of the main impacts on urban mobility indicators in Lisbon during the first national emergency state period from 18-03-2020 to 03-05-2020 and during the first national calamity state period from 04-05-2020 to 01-07-2020 can be found in Figure 6.

Impact on urban mobility indicators in Lisbon during the first national emergency state period from 18-03-2020 to 03-05-2020 (1) and during the first national calamity state period from 04-05-2020 to 01-07-2020 (2), against a pre-pandemic baseline.
While the present work used in-situ measurements acquired by ground air quality monitoring stations, satellite observations can provide a high-level view of the impact of the confinement measures on nitrogen dioxide (NO2), a pollutant emitted directly and indirectly by internal combustion vehicles. During the first national emergency state lockdown and the first half of the first calamity state (March-June), when the most restrictive measures affecting anthropogenic activities were in place, we can observe a clear reduction in the density of the NO2 vertical column measured by the TROPOMI instrument aboard the Sentinel-5P satellite when compared to the same period in 2019, as shown in Figure 7.

Lisbon and Tagus Valley Monthly Average of Total NO2 vertical column (μmol/m2) measured by the TROPOMI instrument onboard the Sentinel-5P satellite.
Besides expert analysis of weather and pollutant emission data using traditional statistical methods to assess the relationship between independent variables and the dependent variable and, in the case of air pollution modelling, the use of advanced and complex atmospheric chemical transport simulation models, it is also possible to use Machine Learning techniques to model the physical and chemical phenomena involved in air pollution. In this work, the AutoML framework TPOT (Tree-Based Pipeline Optimization Tool) (TPOT Home Page - Data Science Assistant, 2021) is used to discover a performant and interpretable machine learning model capable of explaining the relationship between independent and dependent variables as well as their strength. This AutoML tool represents candidate machine learning pipelines as trees and searches over pipeline stages and hyperparameters with genetic programming, scoring each candidate pipeline by cross-validation.
The present work intends to train a sufficiently accurate model capable of identifying the most relevant independent data features used to model the NO2 urban concentration phenomena and to identify any relation between independent features that might be of interest.
The list of initial raw data features used to model NO2 concentration was selected from the datasets described in Table 1, taking into consideration known variables that potentiate the formation and destruction of the NO2 air pollutant also described in the literature.
To perform this initial analysis, the TPOT genetic algorithm optimizer ran for 100 generations with 1000 individuals retained in every generation, for a total of 101,000 model fit evaluations, using sklearn (EUROCONTROL, Jul.–Sep. 2020) KFold cross-validation with 5 folds on 75% of the original data (N = 381), whereas the remaining 25% was kept for model performance testing purposes. In this initial run no specific template was used and TPOT ran with full autonomy, using all the 30 processors in the default Regressor configuration (Pennsylvania), to build a pipeline with whatever stages and hyperparameters maximize the NO2 model accuracy.
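The search budget quoted above follows from the way the classic TPOT releases count evaluations: the initial population plus one batch of offspring per generation, with the offspring size defaulting to the population size. The sketch below reproduces that arithmetic and shows how such a run might be configured. It assumes the classic (0.x) `TPOTRegressor` API; the checkpoint folder name and the random seed are hypothetical, and this is not the authors' actual script.

```python
from typing import Optional


def total_pipeline_evaluations(
    population_size: int, generations: int, offspring_size: Optional[int] = None
) -> int:
    """Pipelines evaluated: initial population plus offspring of every generation."""
    if offspring_size is None:
        # Classic TPOT uses the population size when no offspring size is given.
        offspring_size = population_size
    return population_size + generations * offspring_size


def build_optimizer():
    """Configure a TPOT regressor search (assumes the classic TPOT 0.x API)."""
    from tpot import TPOTRegressor  # imported lazily; requires TPOT to be installed

    return TPOTRegressor(
        generations=100,
        population_size=1000,
        cv=5,  # 5-fold cross-validation on the training split
        scoring="neg_mean_squared_error",  # MSE, negated so that higher is better
        n_jobs=-1,  # use all available cores
        memory="auto",  # cache intermediate pipeline results
        periodic_checkpoint_folder="tpot_checkpoints",  # hypothetical folder name
        random_state=42,  # hypothetical seed, for reproducibility only
        verbosity=2,
    )


evaluations = total_pipeline_evaluations(population_size=1000, generations=100)
print(evaluations)
```

Each evaluated pipeline is itself fitted once per cross-validation fold, so the number of underlying model fits is several times larger than the number of evaluated pipelines.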
This initial TPOT execution took roughly six days and two hours on a D4a_v4 Azure Virtual Machine (4 vCores, 16GB RAM and SSD Volume) using all the machine cores, keeping a cache of intermediate pipeline results to avoid unnecessary recomputations and periodically checkpointing the progress of the optimization process. This execution resulted in a complex pipeline with nine stages (four pre-processors, one feature-selector and four stacked regressors), with an average KFold cross-validation score of 12.87 (MSE) on the training set and a score of 11.92 (MSE) on the 25% of records that were held out for model testing. A similar MSE on both the CV (train + validate) set and the test set is a good sign that the model is neither overfitting nor underfitting. While another pipeline performed better on the training set, with MSE = 12.32, the reported pipeline performed better on the test set (Figure 8 and Table 2) and was therefore selected as the reference NO2 prediction regressor ML pipeline (Figure 9).
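The two metrics used to report model quality in this work, the mean squared error (MSE) and the coefficient of determination (R2), can be computed as in the following sketch. The observed and predicted values are invented for illustration and are not taken from the NO2 model; the function names are hypothetical.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Share of the variance of y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted NO2 concentrations (µg/m3).
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]

mse = mean_squared_error(observed, predicted)
r2 = r2_score(observed, predicted)
print(mse, r2)
```

MSE is expressed in squared concentration units and therefore depends on the scale of the data, whereas R2 is dimensionless, which is why R2 is the figure used when comparing against models trained on other datasets.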

Exploratory AutoML based NO2 prediction pipeline stages and hyperparameters.
Metrics for the prediction of NO2 concentration using AutoML.

Exploratory AutoML based NO2 prediction performance.
During this first run, the pipeline cross-validation score improved rapidly during the first phases of the optimization process. It slowed down in later phases, with each generation taking longer to complete, which is expected since the average complexity of the pipelines increases over time with additional pipeline stages, unions, and stacks that take longer to compute. The time required to complete a generation stabilized at around generation 40, since the optimizer focused on a search space of similar cost and complexity for the remainder of the optimization process. It took 87 generations out of 100 to achieve the best score, corresponding to 122 h out of a total of 146 h, and it took 36 generations out of 100 to achieve 80% of the optimization from a min-max perspective, corresponding to 32 h out of a total of 146 h. A depiction of how this process unfolded can be found in Figure 10, with the highlighted area representing the fulfilment of 80% of the training CV score optimization.
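One plausible reading of "80% of the optimization from a min-max perspective" is that the best cross-validation score of each generation is rescaled between the worst and the best value seen during the run, and the first generation whose rescaled improvement reaches 0.8 is reported. The sketch below implements that reading; it is an interpretation rather than the authors' code, the per-generation scores are hypothetical, and generations are counted from zero.

```python
from typing import Optional, Sequence


def first_generation_reaching(
    scores: Sequence[float], fraction: float = 0.8
) -> Optional[int]:
    """First generation whose min-max scaled improvement reaches `fraction`.

    `scores` holds the best CV error (e.g. MSE, lower is better) per generation.
    """
    worst, best = max(scores), min(scores)
    if worst == best:
        return 0  # no improvement over the run, so any generation qualifies
    for generation, score in enumerate(scores):
        progress = (worst - score) / (worst - best)
        if progress >= fraction:
            return generation
    return None


# Hypothetical best cross-validation MSE after each generation.
cv_mse_by_generation = [20.0, 17.0, 15.0, 13.5, 13.0, 12.9]
generation_80 = first_generation_reaching(cv_mse_by_generation, fraction=0.8)
print(generation_80)
```

A threshold of this kind gives a simple stopping heuristic: most of the achievable improvement is obtained in a fraction of the total compute time.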

TPOT AutoML training efficiency.
As for the impact of COVID-19 pandemic related restrictions in Lisbon on urban mobility, it is estimated that during the first national emergency period (18-03-2020 to 03-05-2020) public transportation ridership suffered a homologous drop of 75% to 80%, while during the first calamity state period (04-05-2020 to 01-07-2020) it is estimated to have dropped 57% to 62%. The usage of private vehicles is estimated to have fallen by 78% to 84% during the first national emergency period and by 43% to 59% during the first calamity state period. Residential area mobility activity increased 33% during the first national emergency period, an increase that fell to 19% during the first calamity state period. The urban mobility indicators used in the present work correlate moderately with the NO2 air pollutant, which is usually associated with anthropogenic activity in the city. The local commerce indicator (Google) has the strongest Pearson correlation (r = 0.54) and has also been identified as the main anthropogenic data feature contributing to NO2 concentration by the trained NO2 concentration prediction pipeline.
Regarding the impact of COVID-19 pandemic related restrictions in Lisbon on air quality, during the first national emergency period (18-03-2020 to 03-05-2020) and the subsequent calamity state period (04-05-2020 to 01-07-2020) the main criteria air pollutants generally decreased in both urban background and urban traffic stations when compared to a 2013-2019 baseline, with the exception of Ozone (O3) in urban traffic stations, which increased. With a sharp reduction in anthropogenic activities, most importantly in road traffic, and with low-pressure weather conditions unfavorable to NO2 build-up in the first phase, NO2 registered a 54.35% drop in the first phase and a 40.39% drop in the second phase in urban traffic stations, while in background stations it dropped 28.62% and 22.99%, respectively.
These results are consistent with measurements (Olson et al.; Pedregosa et al., 2011) taken in Wuhan, China, in similar periods, where the daily mean O3 concentration increased by 36% and NO2 concentrations decreased by 57%.
An AutoML framework (TPOT) was used to build, train, and optimize a regressor ML pipeline to predict NO2 concentration with the available anthropogenic activity, weather, and air pollutant inputs from March/2020 to March/2021, achieving R2 = 0.925 out of the box on the test set. This is an acceptable result for an AutoML approach when compared to other recent purpose-built NO2 prediction models, which report R2 = 0.920 (Sicard et al., 2020), R2 = 0.890 (Srivastava et al., May 2021) and R2 = 0.937 (Castelli et al., 2020; Zhan et al., 2020).
Acknowledgments
This work was partially supported by Fundação para a Ciência e a Tecnologia, I.P. (FCT) [ISTAR Projects: UIDB/04466/2020 and UIDP/04466/2020].
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly funded through national funds by FCT - Fundação para a Ciência e Tecnologia, I.P. under project UIDB/04466/2020 & UIDP/04466/2020 (ISTAR).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
