Abstract
Estimating building energy performance remains a challenge due to uncertainties in user behaviour and model simplifications. This study investigates how data availability influences discrepancies between measured and simulated performance, focussing on occupant behaviour uncertainties and modelling assumptions. A comprehensive analysis distinguishes different types of uncertainties and errors in energy modelling. A calibration study is conducted using an existing residential building in the Netherlands. The model is calibrated by progressively increasing data availability levels, incorporating high-quality measurements of environmental conditions, system operations, and occupant behaviour inputs across multiple scenarios. The results show that the integration of detailed occupant data significantly improves the accuracy of the model, reducing the heating usage deviations from 100% (the base case scenario established on the default assumptions) to approximately 5%. Post-calibration evaluations confirm that the monthly mean biased error and the coefficient of variation of the root mean squared error for heating and indoor temperature predictions align with ASHRAE Guideline 14 thresholds. However, refinements in ventilation modelling and behavioural data, such as radiator usage, are needed for further accuracy. This study quantifies the impact of data availability on simulation reliability and provides practical guidance for selecting appropriate abstraction levels based on available data quality rather than simply maximizing model complexity.
Practical Application
This study presents practical implications for stakeholders. The results guide the modellers, emphasising transparency in assumptions, simplifications, input parameters, and modelling methods, while considering error margins to improve decision-making accuracy. Robust calibration practices are vital to improve the accuracy of the model, communicate uncertainties, and account for potential errors. For policymakers, the findings stress the need to establish clear data collection standards for new buildings and promote best practices that ensure transparency in modelling assumptions. These insights are crucial for strengthening confidence in building models, improving their reliability, and supporting well-informed decisions across diverse contexts.
Keywords
Introduction
Building energy modelling and simulation (BES) serves a wide range of applications, including design optimisation, regulatory compliance, certification, and operational performance assessment. Over the past decades, various modelling approaches have been developed to address these diverse purposes. As with all models, BES represents a simplified version of reality, with abstraction levels ranging from highly simplified to more detailed representations. A typical BES model comprises several submodels, e.g., thermal models, airflow models, and occupant behaviour models, each with its own abstraction level and complexity. The appropriate abstraction level for each (sub)model depends on the specific purpose of the simulation. While some purposes can be effectively supported by simplified models, others require a more detailed representation of reality. Generally, models with lower abstraction levels tend to be more complex and require a larger number of input parameters. The key input parameters in BES often include building envelope characteristics, occupant behaviour, and climate data.
Every input parameter in BES comes with inherent assumptions and uncertainties. These uncertainties can be categorised into aleatory uncertainty (the inherent variability of systems or environments) and epistemic uncertainty (which arises from limited knowledge, for example, imprecise insulation values or user-defined heating setpoints). Epistemic uncertainties arise from various sources and can be further categorised as physical, design, or scenario uncertainties.1 Physical uncertainties arise from the physical properties that are modelled, e.g., material properties such as conductivity or density. Design uncertainties, also referred to as undecided parameter uncertainty,2 emerge during the planning phase and are often due to incomplete knowledge or evolving decisions (e.g., undetermined thermal mass or glazing choices). In many cases, certain aspects of a building design or context remain undetermined not because they are unpredictable by nature, but because they depend on decisions or developments that occur later in the process. As such, some uncertainties can be reduced if additional time is taken to gather information, explore alternative options, or wait for related design elements to be finalised. Scenario uncertainties evolve over the lifespan of the building and can be categorised into internal factors (e.g., occupant behaviour) and external factors (e.g., climate change, policy changes and energy prices). Uncertainties can also arise from errors. A common distinction is made between unacknowledged errors, such as those caused by human mistakes (for example, coding errors), and acknowledged errors, which are intentionally introduced through model simplifications (e.g., abstraction errors resulting from representing a building as a single zone). Abstraction errors are also referred to as model (or model-form) uncertainty and numerical uncertainty.2
This highlights the importance of distinguishing between uncertainties that are truly irreducible and those that may be clarified or constrained through deeper investigation or a sequence of decisions.
The key challenge in modelling lies in balancing abstraction errors and input uncertainty. Although complex models can represent reality with greater accuracy, they also risk being overly detailed without necessarily improving overall accuracy.3,4 Simpler models, on the other hand, have greater abstraction errors but require fewer input parameters, which typically increases the reliability of those inputs.
Conversely, complex models tend to reduce abstraction errors but introduce higher input uncertainty due to the need for more extensive (and often less certain) data estimation. Therefore, the goal should be to minimise the overall error by finding a trade-off between the complexity of the model and the certainty of the input (see Figure 1).5
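The trade-off can be illustrated with a toy calculation. The sketch below is not taken from this study: it assumes purely hypothetical functional forms (an abstraction error that shrinks with model complexity and an input uncertainty that grows with it) and combines the two in quadrature, only to show that the total error can have a minimum at an intermediate complexity level.

```python
import math


def abstraction_error(complexity: float) -> float:
    """Hypothetical form: abstraction error shrinks as detail is added."""
    return 50.0 / complexity


def input_uncertainty(complexity: float) -> float:
    """Hypothetical form: more detailed models need more uncertain inputs."""
    return 2.0 * complexity


def total_error(complexity: float) -> float:
    """Combine both contributions in quadrature (assumed independent)."""
    return math.hypot(abstraction_error(complexity), input_uncertainty(complexity))


# Evaluate ten illustrative complexity levels and pick the one with least error.
levels = range(1, 11)
best_level = min(levels, key=total_error)

for level in levels:
    print(level, round(total_error(level), 1))
print("lowest total error at complexity level", best_level)
```

With other assumed forms the optimum shifts, which is the point of the figure: the appropriate complexity depends on how quickly input uncertainty grows relative to the reduction in abstraction error.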
Pidd’s modelling principles emphasise simplicity in modelling and the avoidance of unnecessary complexity. 6 Other studies also recommend sparse models, using the simplest viable approach.7,8 This complexity trade-off is particularly evident in building performance simulation through the concept of level of detail (LoD), the degree of granularity used to represent building elements, systems, and occupant behaviour. For example, simplified single-zone models may suffice for city-scale energy estimates, while multizone or component-level detail is needed for precise building-level assessments.
The selection of LoD should align with the specific simulation purpose, such as design support, code compliance, or operational forecasting, and the performance indicators of interest (e.g., energy use or thermal comfort).9,10 Recent research demonstrates that, while adding more information generally improves model accuracy, it can also change energy savings projections by a factor of up to two, highlighting the importance of selecting the minimum level of information (data) required for the model’s intended purpose rather than simply maximising LoD.11
In general, adopting an appropriate LoD is essential for a reliable, efficient and scalable building performance simulation, illustrating the broader modelling principle of optimising the trade-off between model complexity and practical applicability.
Aim of the study
The discrepancy between the actual and simulated performance of a building (commonly called the performance gap) is well known in building energy modelling. This gap is primarily caused by input uncertainties and model simplifications. While incorporating detailed input data can substantially reduce this gap, such comprehensive datasets are often unavailable, especially for newly constructed or yet-to-be-built buildings. In this context, the aim of this study is to investigate the impact of data availability on model accuracy, using a Dutch terraced house as a case study. Specifically, we examine how different levels of detail in available data influence input assumptions related to occupant behaviour, ventilation, and heating systems. By progressively increasing the availability of data regarding occupant behaviour, we evaluate how these inputs affect the overall accuracy of the model. Furthermore, this study assesses how varying levels of model abstraction influence the simulation results, with a specific focus on ventilation and heating system models. Through this approach, this research highlights the importance of managing uncertainties related to data availability and model simplifications to improve the reliability of building energy simulations.
Novelty and contribution
The contribution of this work is to quantify the impact of data availability on model accuracy, demonstrating that data availability can significantly influence simulation outcomes. Although previous studies (e.g.,14–16) have emphasised the importance of input parameters, particularly those influenced by the occupants, this study goes further by systematically evaluating their influence within a controlled case study. In addition, we explore alternative modelling approaches for heating and ventilation using a comprehensive dataset of environmental conditions, system performance, and occupant behaviour.
A series of simulation experiments is conducted, each reflecting varying levels of data detail. This approach enables a practical evaluation of how simplified models can still maintain acceptable levels of predictive reliability. Additionally, this study provides guidance for practitioners by underscoring the importance of recognising and addressing uncertainties and (abstraction) errors in building energy simulations.
The remainder of this paper is structured as follows. The Methodology section outlines the approach used in this study. The Results section presents the findings in detail, while the Discussion section interprets them. Finally, the Conclusions section summarises the main findings and highlights the key implications of this study.
Methodology
This study employs a simulation-based approach to assess how varying levels of data availability and specific modelling assumptions influence the predicted energy performance of a case study building. The analysis focuses on input uncertainties related to occupant behaviour. A comprehensive dataset was first collected from a terraced house in the Netherlands, which was monitored throughout 2022. Data collection included both sensor-based monitoring and a resident questionnaire conducted in the first quarter of 2022. The resulting dataset includes detailed information on the geometry of the building, construction details, heating set points, hourly occupancy, use of shower and bath, ventilation and window operation, use of radiators, and operation of heating and ventilation systems.
Subsequently, three sets of simulation experiments were defined, each representing a different level of data availability. The first set uses low-availability data to assess the uncertainty range associated with common occupant behaviour assumptions. The second set incorporates medium-availability data, aiming to demonstrate the influence of integrating occupant behaviour data derived from questionnaires, as well as implementing site-specific data, such as local weather data and contextual shading. The third set uses high-availability data to evaluate the impact of incorporating occupant behaviour data obtained through detailed measurements, as well as several specific modelling assumptions.
Simulation results are evaluated by quantifying the deviation between simulated and measured results using the Mean Biased Error (MBE) and the Coefficient of Variation of the Root Mean Squared Error (Cv(RMSE)). The key building performance indicators analysed include the use of energy for space heating (excluding domestic hot water, DHW) and the indoor air temperature.
The following subsections provide further details on the case-study building, data collection, simulation models, and the design of simulation experiments.
Case study building
Geometry
The case study building is a typical Dutch three-storey postwar corner house. It was built in 1974 and has a total surface area of 93 m². The house is oriented along a northwest-southeast axis, with the main façade facing southwest. A detailed floor plan illustrating the orientation is provided in Figure 2. The ground floor consists of a living room (southwest side) and a centrally located toilet. An open kitchen and a storage area are located at the northeast end. The first floor includes a main bedroom (southwest), a home office (centre), and an additional room (northeast). The attic is open without interior partitions. A centrally positioned staircase connects all floors.
Figure 2. Layout of the ground floor (left) and the first floor (right) of the case study building.
Construction of the house
The case study building consists of various construction elements with different thermal properties. The ground floor is an uninsulated concrete slab (thickness 200 mm) with an Rc value of 0.15 m²K/W. The side façade is an uninsulated cavity wall (Rc = 0.35 m²K/W). The remaining external walls are masonry cavity walls with 70 mm insulation (Rc = 1.92 m²K/W). The roof has 40 mm insulation (Rc = 1.11 m²K/W). The house has double-glazed windows with a U-value of 2.9 W/m²K and a solar transmittance (g-value) of 70%, allowing a significant amount of solar gain. The external doors are not insulated and have a U-value of 3.4 W/m²K, indicating relatively high heat loss.
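To compare the opaque elements (given as Rc values) with the windows and doors (given as U-values), the Rc values can be converted to approximate U-values. The sketch below assumes that Rc excludes the surface resistances and uses the conventional values for horizontal heat flow through walls (0.13 and 0.04 m²K/W); these surface resistances are assumptions and are not stated in the source.

```python
# Assumed surface resistances for horizontal heat flow through walls (m²K/W).
R_SI = 0.13  # internal surface resistance (assumption, not from the source)
R_SE = 0.04  # external surface resistance (assumption, not from the source)


def u_value(rc: float, r_si: float = R_SI, r_se: float = R_SE) -> float:
    """Approximate U-value (W/m²K) from a construction resistance Rc (m²K/W)."""
    return 1.0 / (r_si + rc + r_se)


# Rc values of the two wall types as reported in the text.
walls = {"uninsulated side facade": 0.35, "insulated cavity wall": 1.92}

for name, rc in walls.items():
    print(f"{name}: Rc = {rc} m²K/W, U is roughly {u_value(rc):.2f} W/m²K")
```

Under these assumptions the uninsulated side façade loses roughly four times as much heat per square metre as the insulated cavity walls.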
HVAC
The house is equipped with a mechanical ventilation system (Stork EMC), which extracts air from the toilet and the bathroom and supplies fresh air to the other rooms through natural ventilation openings. The ventilation system is equipped with a three-speed control panel and a “boost” function with a timer. The boost function allows residents to temporarily increase the ventilation system to the highest setting for a set period (10/30 minutes) before automatically returning to the previous setting. The space heating is provided by a condensing gas boiler (Vaillant VHR 18–22°C). The boiler has a nominal heating output of 16 kW at 80/60°C with an annual utilisation efficiency of 80.1%. No cooling system is present in the building.
Data collection
Three types of data are collected: (i) environmental data, (ii) occupant behaviour data, and (iii) system-related data. Environmental data include CO2 concentration, temperature, and relative humidity levels in each room. Occupant behaviour data include thermostat set points (see also Figure 3) and window open/close status, providing insights into natural ventilation patterns. System-related data cover warm water usage (flow rate and supply/return temperature) for space heating, as well as the electricity consumption of the ventilation system. More details on data types and collection frequency are provided in Table 1.
Figure 3. Heating setpoints reported by the resident in the questionnaire in the first quarter of 2022 (the handwritten note ‘Elke dag een beetje zelfde’ means ‘Every day is a little bit similar’).
Table 1. Details of the data collected.
Simulation model and tools
EnergyPlus was selected as the calculation core of the developed model due to its robust capabilities. 17 While EnergyPlus relies on text-based input files (e.g., IDF files), which require in-depth knowledge of its syntax, a graphical user interface (GUI) is generally preferred for ease of use and visualisation. This study used DesignBuilder, a widely used GUI for EnergyPlus. 18 DesignBuilder offers an intuitive interface and a comprehensive set of tools, simplifying the creation and analysis of building energy models while maintaining EnergyPlus’s flexibility for detailed inputs and modelling choices.
The geometry is modelled according to the provided drawings of the case study building, including a site drawing, floor plans and a section plan. Figure 4 shows a 3-D visualisation of the spatial context modelled in DesignBuilder. The grey house is the case study building, and the pink volume is the side extension of the house, which is used as a carpark/storage. Surrounding buildings (in brick red) are represented by shading objects in the model to account for potential shading and reflection effects of the context on the case study building. Figure 5 depicts the internal zoning of the modelled case study building. Zones are created per room.
Figure 4. Visualisation of the geometry model.
Figure 5. Visualisation of the internal zones per floor in the simulation model: (a) ground floor zones, (b) first floor zones, (c) the attic zone, (d) the storage room zone on the ground floor.

In terms of the heating system, parameters such as efficiency and nominal capacity are set according to the technical specifications described in the HVAC subsection. Since technical information on the ventilation fans is not available, the maximum ventilation capacity of the fan is assumed to be a typical rate of 0.9 L/s·m², as defined in the Dutch Bouwbesluit.19 Note that, due to measurement errors, the measured heating setpoint data were not of sufficient quality for this study. Therefore, the measured average air temperature of each room was adopted as the heating setpoint.
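As a rough indication of what the assumed ventilation rate implies, the arithmetic can be sketched as follows. The sketch assumes that the rate of 0.9 L/s·m² is applied to the total area of 93 m² reported for the house; the source does not state which reference area was used in the model.

```python
RATE_L_PER_S_M2 = 0.9  # typical rate quoted in the text
FLOOR_AREA_M2 = 93.0   # total area reported for the house (assumed reference area)


def max_ventilation_flow(rate_l_per_s_m2: float, area_m2: float) -> tuple[float, float]:
    """Return the maximum flow in L/s and in m³/h for a given rate and area."""
    flow_l_per_s = rate_l_per_s_m2 * area_m2
    flow_m3_per_h = flow_l_per_s * 3.6  # 1 L/s = 3.6 m³/h
    return flow_l_per_s, flow_m3_per_h


flow_l_s, flow_m3_h = max_ventilation_flow(RATE_L_PER_S_M2, FLOOR_AREA_M2)
print(f"{flow_l_s:.1f} L/s, i.e. {flow_m3_h:.0f} m³/h")
```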
Design of simulation experiments in relation to the level of data availability
Occupant-related modelling assumptions and considered data availability levels.
Other modelling assumptions and their considered options.
Simulation set 1 - Low data availability and default modelling assumptions
Description of occupant behaviour scenarios for a most energy consuming case and a least energy consuming case.
Furthermore, the following assumptions are defined for Simulation Set 1: the radiator capacity is automatically sized by EnergyPlus; the mechanical ventilation rate is automatically sized by EnergyPlus; the boiler is modulated according to a weather compensation curve; natural ventilation is constant throughout the year and is not influenced by weather conditions; the internal thermal mass of furniture, papers and any other items is considered, and a zone capacitance multiplier of 20 is assumed;20 the thermostat is controlled based on the zone mean air temperature; standard weather data are adopted according to the NEN 5060 standard;21 and shading effects from surrounding buildings are ignored.
Simulation set 2 - Medium data availability and site-specific inputs
Two subsets, 2.1 and 2.2, have been defined to illustrate the effect of input parameters derived from insights obtained through questionnaires. Subset 2.1 (QUE) applies only the occupant behaviour inputs obtained from the questionnaire. All previously outlined assumptions remain consistent with those of Simulation Set 1. In Subset 2.2 (QUE_site), two additional site-specific inputs are introduced alongside the questionnaire-based occupant behaviour inputs: (i) weather data for 2022 obtained from the nearest KNMI meteorological station, located in Heino, and (ii) contextual shading, specifically accounting for shading from adjacent buildings.
Simulation set 3 - High data availability and specific modelling assumptions
Five subsets have been defined to further illustrate the effect of input parameters regarding heating and ventilation behaviour with varying levels of modelling complexity.
Subset 3.1 (Heating_Tair): Heating control based on measured air temperature; natural and mechanical ventilation settings remain consistent with Simulation Set 2.
Subset 3.2 (Heating_Top): Similar to Subset 3.1, but with thermostat control based on operative temperature.
Subset 3.3 (Ventilation_Constant): Incorporates window operation and mechanical ventilation behaviour retrieved from measurements, retaining a constant ventilation flow.
Subset 3.4 (Ventilation_Wind): Adds wind effects to ventilation modelling, using flow coefficients.22
Subset 3.5 (Ventilation_AFN): Uses an Airflow Network Model for detailed ventilation dynamics, with measured inputs for window and mechanical ventilation behaviour.
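For reference, the nine simulation experiments described above can be summarised as a small data structure. The sketch below is only an illustrative encoding: the field names are hypothetical, and entries marked "not stated" are left open because the text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    exp_id: str
    label: str
    data_level: str   # "low", "medium" or "high"
    thermostat: str   # thermostat control variable
    ventilation: str  # ventilation modelling approach


# Field values paraphrase the text; "not stated" marks gaps in the description.
EXPERIMENTS = [
    Experiment("1.1", "most energy-consuming defaults", "low", "zone mean air temperature", "constant"),
    Experiment("1.2", "least energy-consuming defaults", "low", "zone mean air temperature", "constant"),
    Experiment("2.1", "QUE", "medium", "zone mean air temperature", "constant"),
    Experiment("2.2", "QUE_site", "medium", "zone mean air temperature", "constant"),
    Experiment("3.1", "Heating_Tair", "high", "measured air temperature", "as in Simulation Set 2"),
    Experiment("3.2", "Heating_Top", "high", "operative temperature", "as in Simulation Set 2"),
    Experiment("3.3", "Ventilation_Constant", "high", "not stated", "constant flow, measured operation"),
    Experiment("3.4", "Ventilation_Wind", "high", "not stated", "wind-dependent flow coefficients"),
    Experiment("3.5", "Ventilation_AFN", "high", "not stated", "airflow network"),
]


def by_level(level: str) -> list[Experiment]:
    """Return all experiments belonging to one data availability level."""
    return [e for e in EXPERIMENTS if e.data_level == level]


for exp in by_level("high"):
    print(exp.exp_id, exp.label, "|", exp.thermostat, "|", exp.ventilation)
```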
Key performance indicators
The simulation outcomes are assessed by measuring the deviation between the simulated and measured data using the MBE and Cv(RMSE). The error indicators are examined at different time resolutions, namely hourly, daily, and monthly. The building performance metrics examined are energy consumption for space heating (excluding domestic hot water, DHW) and indoor air temperature. In addition, the results are evaluated in terms of the annual heating use intensity, which is the heating energy normalised by the floor area (m²).
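The indicators described here can be sketched in a few lines. The example below aggregates hourly values to daily totals, normalises annual heating energy by floor area, and computes a relative difference. The function names and the example numbers are hypothetical, and the definition of the relative difference as (simulated − measured) / measured is an assumption, since the source does not give it explicitly.

```python
def daily_totals(hourly_values: list[float]) -> list[float]:
    """Sum consecutive blocks of 24 hourly values into daily totals."""
    if len(hourly_values) % 24 != 0:
        raise ValueError("expected a whole number of days")
    return [sum(hourly_values[i:i + 24]) for i in range(0, len(hourly_values), 24)]


def heating_use_intensity(annual_heating_kwh: float, floor_area_m2: float) -> float:
    """Annual heating energy normalised by floor area (kWh/m²)."""
    return annual_heating_kwh / floor_area_m2


def relative_difference_percent(simulated: float, measured: float) -> float:
    """Assumed definition: positive when the simulation over-predicts."""
    return (simulated - measured) / measured * 100.0


# Hypothetical example values, not measurements from the case study.
hourly = [1.0] * 24 + [2.0] * 24
print(daily_totals(hourly))
print(heating_use_intensity(9300.0, 93.0))
print(relative_difference_percent(105.0, 100.0))
```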
In terms of the measured data used in the comparison, the space heating use was calculated using the measured volumetric flow rate, the supply, and the return water temperature at the system level.
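A calculation of this kind typically follows the heat balance Q = ρ · c_p · V · (T_supply − T_return). The sketch below illustrates it under stated assumptions: constant water properties (about 1 kg/L and 4.18 kJ/kg·K), a flow rate expressed in L/s, and a fixed logging interval. The actual units, sampling interval and fluid properties used in the study are not given in the source.

```python
RHO_WATER_KG_PER_L = 1.0     # assumed constant density of water
CP_WATER_KJ_PER_KG_K = 4.18  # assumed constant specific heat of water


def heating_power_kw(flow_l_per_s: float, t_supply_c: float, t_return_c: float) -> float:
    """Heat delivered by the water circuit in kW (kJ/s)."""
    mass_flow_kg_per_s = RHO_WATER_KG_PER_L * flow_l_per_s
    return mass_flow_kg_per_s * CP_WATER_KJ_PER_KG_K * (t_supply_c - t_return_c)


def heating_energy_kwh(samples: list[tuple[float, float, float]], step_h: float) -> float:
    """Integrate power over equally spaced samples of (flow, T_supply, T_return)."""
    return sum(heating_power_kw(*sample) for sample in samples) * step_h


# Hypothetical hourly samples: (flow in L/s, supply in °C, return in °C).
samples = [(0.1, 60.0, 40.0), (0.0, 30.0, 30.0)]
print(heating_power_kw(*samples[0]))
print(heating_energy_kwh(samples, step_h=1.0))
```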
Quantifying errors
The analysis results are presented as MBE and Cv(RMSE) values. For MBE, larger ranges indicate higher sensitivity to, and uncertainty in, the input parameters, leading to greater deviations of the simulated data from the measured data. Similarly, larger ranges for Cv(RMSE) suggest a high level of uncertainty in the input parameters, reflected in greater variation or deviation from the measured data. Conversely, smaller ranges for both metrics imply that the model is robust and performs consistently across different parameter conditions.
The mean biased error (MBE) and the coefficient of variation of the root mean squared error (Cv(RMSE)) are selected as the metrics to quantify the error. A positive MBE indicates that, on average, the model overestimates the measured values, while a negative MBE suggests an underestimation. The MBE is calculated according to the following equation23,24:
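The equation itself is not recoverable from this version of the text. One commonly used form that is consistent with the sign convention stated above (positive for overestimation) is given below, where s_i is the simulated value, m_i the measured value, and n the number of data points at the chosen time resolution. The exact normalisation used in the study is an assumption here; for indoor temperature, where the MBE is reported in °C, the unnormalised mean difference (1/n) Σ (s_i − m_i) appears to be used instead.

```latex
\mathrm{MBE} = \frac{\sum_{i=1}^{n} \left( s_i - m_i \right)}{\sum_{i=1}^{n} m_i} \times 100\%
```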
The root mean squared error (RMSE) provides a measure of the model’s predictive accuracy. A lower RMSE value indicates better predictive performance, as it signifies smaller errors between the predicted and measured values. Cv(RMSE) is calculated as follows23,24:
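The equation is likewise missing from this version of the text. A commonly used form is given below, with s_i the simulated value, m_i the measured value, n the number of data points, and m̄ the mean of the measured values. Whether the study divides by n or by a degrees-of-freedom term such as n − 1 is not known and is an assumption here.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( s_i - m_i \right)^2},
\qquad
\mathrm{Cv(RMSE)} = \frac{\mathrm{RMSE}}{\bar{m}} \times 100\%,
\qquad
\bar{m} = \frac{1}{n} \sum_{i=1}^{n} m_i
```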
Threshold of statistical indices for calibrating building energy models for heating use comparison.
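A minimal sketch of how the two error metrics and a threshold check could be computed is given below. The metric definitions follow a common normalised form (an assumption, since the source equations are not reproduced here), and the limits are the commonly cited ASHRAE Guideline 14 calibration criteria (±5% MBE and 15% Cv(RMSE) for monthly data, ±10% and 30% for hourly data). The threshold table of the study itself may contain additional values, for example from the IPMVP.

```python
import math

# Commonly cited ASHRAE Guideline 14 limits: (|MBE| in %, Cv(RMSE) in %).
ASHRAE_14_LIMITS = {"monthly": (5.0, 15.0), "hourly": (10.0, 30.0)}


def mbe_percent(simulated: list[float], measured: list[float]) -> float:
    """Normalised mean biased error; positive means over-prediction."""
    bias = sum(s - m for s, m in zip(simulated, measured))
    return bias / sum(measured) * 100.0


def cv_rmse_percent(simulated: list[float], measured: list[float]) -> float:
    """RMSE divided by the mean of the measured values, in percent."""
    n = len(measured)
    rmse = math.sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)) / n)
    return rmse / (sum(measured) / n) * 100.0


def complies(simulated: list[float], measured: list[float], resolution: str) -> bool:
    """Check both metrics against the limits for the given time resolution."""
    mbe_limit, cv_limit = ASHRAE_14_LIMITS[resolution]
    return (abs(mbe_percent(simulated, measured)) <= mbe_limit
            and cv_rmse_percent(simulated, measured) <= cv_limit)


# Hypothetical monthly heating values (kWh), not data from the case study.
measured = [100.0, 200.0, 300.0]
simulated = [110.0, 190.0, 310.0]
print(round(mbe_percent(simulated, measured), 2))
print(round(cv_rmse_percent(simulated, measured), 2))
print(complies(simulated, measured, "monthly"))
```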
Results
Space heating energy use
Simulation set 1 - Low data availability
Comparison of model accuracy on heating use for Simulation Set 1; 1.1 is the most energy-consuming case, whereas 1.2 is the least energy-consuming case.

Comparison between simulated and measured daily space heating energy use for baseline simulation experiments.
Simulation set 2 - Medium data availability with questionnaire data and site-specific inputs
Comparison of model accuracy on heating use for Simulation Set 2; 2.1 is the questionnaire-only case, whereas 2.2 is the case with questionnaire and site-specific inputs.
Compared with the results of simulation experiments 1.1 and 1.2, as shown in Figure 7, the Cv(RMSE) decreases by 53.5% and 19.4% at hourly, 70.2% and 16.3% at daily, and 94.6% and 40.9% at monthly time resolution, respectively. Simulation experiment 2.2, which adds the local weather file and the surrounding context as site-specific input parameters to simulation experiment 2.1, reduced the Cv(RMSE) of heating use by approximately 11.8% hourly and 28.5% daily, while increasing it by 2.9% monthly.
Figure 7. Comparison of the model accuracy on heating use for the simulation experiments with medium data availability; for Simulation Set 1, the blue bar represents the most energy-consuming case and the orange bar the least energy-consuming case: (a) Cv(RMSE) for hourly heating use, (b) Cv(RMSE) for daily heating use, (c) Cv(RMSE) for monthly heating use.
Figure 8 depicts two scatter plots that compare simulated daily space heating energy (y-axis) against measured daily space heating energy (x-axis) for simulation experiments 2.1 and 2.2, respectively. Each plot includes a 1:1 (diagonal) reference line indicating perfect agreement. In Figure 8(b), the data cluster more closely around the 1:1 line, indicating better agreement between simulated and measured data. These results highlight that adding context-specific data provides more accurate and realistic results. At the annual level, simulation experiment 2.2 results in a higher relative difference in heating use, highlighting the significant influence of context-related parameters.
Figure 8. Comparison between simulated and measured daily space heating energy use for the simulation experiments with medium data availability.
Figure 9. Cv(RMSE) for monthly heating use for the simulation experiments with high data availability level.

Simulation set 3 - High data availability with in-depth measurements and relevant modelling aspects
Comparison of model accuracy on heating use for the high data availability simulation experiments.
Simulation experiment 3.2 has the lowest Cv(RMSE) at all time resolutions, which makes it the most appropriate for this case study and this KPI (see Figure 9). Simulation experiment 3.2 uses operative temperature for thermostat control. Compared with simulation experiment 3.1, this reduced Cv(RMSE) by 2.3% hourly, 4.7% daily, and 6.0% monthly. Positive MBE values indicate slight over-prediction, but the errors are relatively small compared with the other simulation experiments. The MBE at all time resolutions and the monthly Cv(RMSE) of simulation experiment 3.2 comply with the threshold values outlined in ASHRAE Guideline 14 and IPMVP standards. Simulation experiment 3.5 results in the highest annual heating use intensity (170 kWh/m²), while simulation experiment 3.4 results in the lowest (120 kWh/m²), as shown in Table 10. Simulation experiment 3.5 shows the highest MBE, Cv(RMSE), and relative difference in annual heating energy use, i.e., the largest deviation in both model bias and RMSE, which confirms a large over-prediction of heating use. Accordingly, simulation experiment 3.5 does not comply with the threshold values outlined in ASHRAE Guideline 14 and IPMVP standards. This indicates that using the AFN as a ventilation model introduces significant deviations from the measured values. Simulation experiment 3.4 also shows significant deviations, with negative errors indicating that it consistently underpredicts the heating use. As with simulation experiment 3.5, including wind impact in the calculation introduces a significant deviation from the measured values.
Table 10. Comparison of annual heating use intensity for the simulation experiments with high data availability level.
In summary, as Figure 10 demonstrates, the results across all simulation sets consistently show that increased data availability significantly improves model accuracy in predicting heating energy use intensity. In Simulation Set 1 (low data availability), occupant behaviour caused wide variation in heating demand (80–296 kWh/m² annually), with large errors due to the absence of specific behavioural inputs. Simulation Set 2 (medium data availability), which incorporated questionnaire and site-specific inputs, showed substantial reductions in Cv(RMSE), especially at hourly and daily resolutions, highlighting the value of contextual data. In Simulation Set 3 (high data availability), simulation experiment 3.5 exhibits the largest deviations across all metrics, indicating significant inaccuracies and over-prediction of heating use. Simulation experiment 3.4 shows the most negative bias and under-prediction, coupled with high error metrics. Simulation experiment 3.2 performs best in terms of Cv(RMSE) across all indicators, making it the most consistent simulation experiment for this case study. Simulation experiment 3.3 has the most balanced and moderate results, with low bias and moderate errors, while simulation experiment 3.1 demonstrates moderate under-prediction but higher variability than simulation experiment 3.2. Overall, the presented study shows that by providing more detailed input data to the simulation model, the simulated heating use can be aligned more closely with the measured conditions. However, increasing the model complexity does not always lead to better results (e.g., compare experiment 3.5 with 3.2).
Figure 10. Comparison of annual heating energy intensity across simulation experiments with increasing data availability (green circle: low data availability; orange triangle: medium data availability; yellow square: high data availability).
Indoor temperature
Simulation set 1 - Low data availability
Comparison of model accuracy on indoor temperature for the baseline simulation experiments.
Simulation set 2 - Medium data availability with questionnaire data and site-specific inputs
Comparison of model accuracy on indoor temperature for the simulation experiments with medium data availability.
Figure 11 depicts the comparison between the measured temperature and the simulated temperature in the living room for a typical winter week. The orange line shows the measured temperature in the living room. The sensor is located next to the thermostat on a wall in the living room of the case study building. The green line shows the simulated mean air temperature of the living room. The yellow line shows the operative temperature of the living room. The purple line represents the predicted temperature of the surface where the sensor is located. In general, the simulated temperature is 1°C to 3°C lower than the measured temperature. Moreover, the simulated air temperature (green line) declines more steeply than the measured temperature after the residents’ adjustment of the heating setpoint in the afternoon. In contrast, the decrease in the simulated surface temperature and the operative temperature is less pronounced than that of the simulated mean air temperature.
Figure 11. The hourly temperature in the living room for a typical winter week for simulation experiment 2.2.
Simulation set 3 - High data availability with in-depth measurements and relevant modelling aspects
Since the air temperature sensor is located next to the thermostat on a wall in the living room, it measures the temperature of the air close to the wall. This temperature can differ considerably from the mean air temperature, as indicated in Figure 11. Therefore, separate simulation experiments (3.1 and 3.2) are performed to investigate the use of operative temperature to control the space heating system. Measurements of the ventilation system and window opening behaviour are applied in simulation experiments 3.3 to 3.5, in which different airflow modelling methods are also explored. The resulting model accuracy is listed in Table 13. For most of the simulation experiments, the accuracy of the indoor temperature prediction improves after implementing more detailed measurements of heating system and ventilation use. The MBE decreased from around 1°C for simulation experiment 1.2 to around 0.4°C for simulation experiments 3.1–3.4. Simulation experiment 3.5 shows the lowest MBE and the highest Cv(RMSE) for the prediction of indoor temperature, i.e., the largest deviation in model bias and RMSE, and therefore does not comply with the threshold values. Simulation experiment 3.4 has the smallest bias error among the simulation experiments, but a slightly higher Cv(RMSE) than simulation experiments 3.2 and 3.3, which show similar model accuracy in terms of both MBE and Cv(RMSE).
Table 13. Comparison of model accuracy on indoor temperature for the simulation experiments with high data availability level.
Figure 12 illustrates the difference between the simulated and measured monthly average temperatures in the living room across simulation experiments 3.1 to 3.5. The solid lines depict outcomes from simulation experiments in which only heating system measurements, related occupant behaviour, and pertinent modelling aspects are incorporated. In contrast, the dashed lines illustrate results from scenarios focusing on ventilation usage and associated modelling aspects. A comparison of the three dashed lines shows that the results vary considerably when different ventilation modelling methods are employed. This underscores the considerable influence of ventilation behaviour and ventilation modelling approaches on the accuracy of indoor temperature predictions.
Figure 12. Monthly average temperature difference between the measured data and the simulation experiments with high data availability level.
The predicted hourly temperatures in the living room from simulation experiment 3.3 and simulation experiment 3.5 are compared in Figures 13 and 14 for a typical winter week and in Figures 15 and 16 for a typical summer week. The results indicate that the simulated temperature from simulation experiment 3.3 corresponds more closely with the measured data for the winter week, whereas simulation experiment 3.5 aligns more closely with the measured data for the summer week.
Figure 13. Hourly temperature in the living room in a winter week for simulation experiment 3.3-Ventilation_Constant.
Figure 14. Hourly temperature in the living room in a winter week for simulation experiment 3.5-Ventilation_AFN.
Figure 15. Hourly temperature in the living room in a summer week for simulation experiment 3.3-Ventilation_Constant.
Figure 16. Hourly temperature in the living room in a summer week for simulation experiment 3.5-Ventilation_AFN.



In summary, simulation experiment 2.2, which includes site-specific inputs such as local weather data, significantly improves the prediction accuracy compared to simulation experiment 2.1, reducing the mean bias error (MBE) by approximately 0.5°C and increasing the accuracy by 1% to 3% at various temporal resolutions. Simulation experiments 3.1 to 3.5, which incorporate detailed measurements of heating, ventilation, and occupant behaviours, yield additional improvements in accuracy. Simulation experiment 3.4 achieves the lowest bias error, whereas simulation experiment 3.5 exhibits the highest deviation from the observed data and does not comply with the accuracy thresholds. Ventilation behaviour and modelling approaches considerably influence temperature predictions, as seen in the different outcomes obtained with different ventilation modelling methods. Simulation experiments 3.3 and 3.5 align more closely with the measured data in the winter and summer weeks, respectively, highlighting the effect of seasonal conditions on model predictions.
Discussion
Influence of data availability on the performance gap
The results of this study demonstrate that the level of data availability strongly influences the performance gap, particularly for the simulation of heating energy use. For example, in Simulation Set 1, characterized by Low Data Availability, the predicted heating energy use showed a wide spread (80–296 kWh/m²), with an hourly MBE ranging from −45% to 101%. This reflects the uncertainty introduced by generic occupant behaviour scenarios and non-site-specific inputs. In Simulation Set 2, which incorporated Medium Data Availability through site-specific inputs and questionnaires (heating setpoints, radiator use, window and ventilation grille use, mechanical ventilation and occupancy), the model accuracy improved substantially. The improvement is evident in the reductions in both MBE and Cv(RMSE) across all resolutions (monthly, daily and hourly). In particular, the MBE was reduced to around −10% and −30% in the two Simulation Set 2 experiments. A detailed investigation of Simulation Set 3 reveals that further improving data availability (High Data Availability) can lead to greater simulation accuracy. Simulation experiment 3.2 achieves the best overall performance in predicting heating use, with an MBE of around 5% across all temporal resolutions.
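As an illustration of how the two accuracy metrics discussed above behave at different temporal resolutions, the following minimal sketch computes a normalised MBE and Cv(RMSE) for an hourly series and for the same series aggregated to daily totals. It rests on several assumptions: the sign convention (simulated minus measured) is inferred from the reported values and may differ from the authors' definition, the sums are divided by n rather than by n − p as in some formulations of ASHRAE Guideline 14, and all function names and data are hypothetical.

```python
import math


def mbe_percent(measured, simulated):
    """Normalised mean bias error in %, assumed as (simulated - measured)."""
    if len(measured) != len(simulated) or len(measured) == 0:
        raise ValueError("series must be non-empty and of equal length")
    bias = sum(s - m for m, s in zip(measured, simulated))
    return 100.0 * bias / sum(measured)


def cv_rmse_percent(measured, simulated):
    """Coefficient of variation of the RMSE in %, normalised by the measured mean."""
    if len(measured) != len(simulated) or len(measured) == 0:
        raise ValueError("series must be non-empty and of equal length")
    n = len(measured)
    rmse = math.sqrt(sum((s - m) ** 2 for m, s in zip(measured, simulated)) / n)
    return 100.0 * rmse / (sum(measured) / n)


def aggregate(values, block_size):
    """Sum consecutive blocks, e.g. block_size=24 turns hourly into daily totals."""
    return [sum(values[i:i + block_size]) for i in range(0, len(values), block_size)]


# Hypothetical two-day hourly heating use (kWh); not data from the study.
measured_hourly = [1.0] * 48
simulated_hourly = [1.3, 0.9] * 24

for label, block in (("hourly", 1), ("daily", 24)):
    m = aggregate(measured_hourly, block)
    s = aggregate(simulated_hourly, block)
    print(f"{label}: MBE = {mbe_percent(m, s):.1f}%, "
          f"Cv(RMSE) = {cv_rmse_percent(m, s):.1f}%")
```

With these made-up numbers the MBE stays at 10% at both resolutions, while the Cv(RMSE) drops from about 22% (hourly) to 10% (daily) because alternating hourly errors partly cancel within each day. This is why the two metrics are reported separately for each resolution.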
Further analysis of the temperature simulations reveals similar trends. By incorporating local weather data and other site-specific inputs, simulation experiment 2.2 reduces the MBE by approximately 0.5°C compared to simulation experiment 2.1 and achieves a 1% to 3% increase in accuracy across different temporal resolutions. Among the five experiments of Simulation Set 3, experiment 3.2 provided the most accurate and consistent results with minimal bias, while experiment 3.5 showed significant overpredictions because the more complex ventilation model relied on incomplete input data. The findings indicate that while detailed data improve model accuracy, careful data selection and verification are crucial to avoid introducing new errors.
Although a comprehensive input dataset improves the model results, accessing a reliable and detailed dataset remains a challenge for most buildings.25,26 For practitioners who work with limited input data, our findings reveal that medium-level data, such as basic schedules and occupancy patterns, which can be relatively easy to obtain through a questionnaire, can already reduce the performance gap for key performance indicators such as heating usage. However, to meet higher accuracy simulation requirements, for example, for detailed indoor temperatures for overheating assessment, investing in richer input data becomes crucial.
Practical implications
The results offer guidance for modelers working with both existing and planned buildings. They highlight the need for transparency around assumptions and simplifications made for input parameters and modelling methods. Modelers must also account for error margins to improve decision-making accuracy. Dealing with uncertainties is critical in improving model reliability. Several strategies can mitigate these uncertainties, such as scenario analysis, model calibration and validation,27,28 relying on expert judgment and guidelines, and performing sensitivity analysis. Scenario analysis tests various plausible scenarios to understand potential outcomes, while model calibration and validation ensure that models reflect real-world behaviour. Similarly, sensitivity analysis identifies the parameters that most influence the results and evaluates the effect of their variability on those results. In addition, relying on expert judgment and guidelines leverages established standards and experienced professionals to inform the assumptions of the model.
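As a simple illustration of the sensitivity analysis mentioned above, the sketch below applies a one-at-a-time perturbation to a deliberately simplified steady-state heating model. Both the toy model and its baseline values are hypothetical stand-ins chosen for illustration; they do not represent the building performance simulation model or the case study building used in this work, where the same idea would be applied by re-running the full simulation for each perturbed input.

```python
def toy_heating_demand_kwh(params):
    """Hypothetical steady-state heating demand (kWh); not the study's model."""
    # 0.33 Wh/(m3 K) approximates the volumetric heat capacity of air.
    ventilation_w_per_k = 0.33 * params["air_changes_per_h"] * params["volume_m3"]
    loss_w_per_k = params["envelope_ua_w_per_k"] + ventilation_w_per_k
    delta_t = max(params["setpoint_c"] - params["mean_outdoor_c"], 0.0)
    return loss_w_per_k * delta_t * params["heating_hours"] / 1000.0


def one_at_a_time(model, baseline, names, relative_step=0.1):
    """Relative output change per relative input change, ranked by magnitude."""
    base_output = model(baseline)
    indices = {}
    for name in names:
        perturbed = dict(baseline)
        perturbed[name] = baseline[name] * (1.0 + relative_step)
        indices[name] = (model(perturbed) - base_output) / (base_output * relative_step)
    return sorted(indices.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical baseline inputs, chosen only to make the example run.
baseline = {
    "envelope_ua_w_per_k": 150.0,
    "air_changes_per_h": 0.5,
    "volume_m3": 300.0,
    "setpoint_c": 20.0,
    "mean_outdoor_c": 6.0,
    "heating_hours": 5000.0,
}
ranking = one_at_a_time(
    toy_heating_demand_kwh,
    baseline,
    names=["setpoint_c", "air_changes_per_h", "envelope_ua_w_per_k"],
)
for name, index in ranking:
    print(f"{name}: {index:+.2f}")
```

One-at-a-time perturbation is the simplest option and ignores interactions between inputs; global methods may be preferable when inputs such as setpoints and ventilation rates are expected to interact.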
It is important to recognize that model accuracy and decision-making value are not equivalent. The value of increased data granularity and model complexity depends on the specific purpose of the model. Beyond modelers, these findings have practical implications for stakeholders at different building lifecycle stages. For early-stage design decisions requiring relative comparisons between alternatives (e.g., comparing insulation levels), medium-level data provides sufficient reliability for ranking options. However, for applications requiring absolute predictions, such as performance contracting, or detailed overheating assessment, the investment in high-fidelity data becomes justified. For facility managers and energy auditors, these insights inform monitoring investment decisions, while for designers and engineers during detailed design and commissioning, understanding when medium-level versus high-level input data is appropriate helps allocate resources effectively.
This study also presents several practical implications for other stakeholders. For policymakers, the findings emphasise the importance of setting clear data collection standards for new buildings and promoting best practices for transparency in modelling assumptions. For modelers and building service consultants, the study highlights the importance of robust calibration practices to improve model accuracy while clearly communicating uncertainties and considering potential error margins. These insights are pivotal for enhancing confidence in building models and ensuring their reliability across various contexts.
Limitations
Several limitations are acknowledged in this study. Primarily, reliance on real-world data collection, as opposed to controlled experiments or chamber studies, limits the scalability of the results. The assumptions and modelling simplifications are specific to the case study, potentially impacting the generalisability of findings. For example, using the flow coefficient that accounts for the effect of wind (the DOE-2 coefficients) for natural ventilation could produce different results compared to using the BLAST coefficients in different climate and contextual settings, underscoring the importance of context-specific assumptions and methods. Applying identical assumptions to other scenarios may not yield comparable outcomes, making it essential to be cautious when transferring these findings to different settings.
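To illustrate why the choice between the two coefficient sets matters, the sketch below evaluates the multiplier applied to a design flow rate under each set. It assumes that the coefficients refer to the design-flow-rate formulation used in EnergyPlus, in which the flow is scaled by A + B|ΔT| + C·v + D·v², an assumption not stated explicitly in the text. The coefficient values are the commonly documented DOE-2 and BLAST defaults and should be checked against the EnergyPlus documentation; the weather conditions are hypothetical.

```python
# Coefficient sets (A, B, C, D); assumed to follow the EnergyPlus
# design-flow-rate formulation, values as commonly documented.
DOE2_COEFFS = (0.0, 0.0, 0.224, 0.0)
BLAST_COEFFS = (0.606, 0.03636, 0.1177, 0.0)


def flow_multiplier(coeffs, delta_t_k, wind_speed_m_s):
    """Multiplier on the design flow rate: A + B*|dT| + C*v + D*v**2."""
    a, b, c, d = coeffs
    return a + b * abs(delta_t_k) + c * wind_speed_m_s + d * wind_speed_m_s ** 2


# Hypothetical conditions: (indoor-outdoor temperature difference in K, wind in m/s)
conditions = {
    "calm, cold": (20.0, 1.0),
    "windy, mild": (5.0, 6.0),
}
for label, (delta_t, wind) in conditions.items():
    doe2 = flow_multiplier(DOE2_COEFFS, delta_t, wind)
    blast = flow_multiplier(BLAST_COEFFS, delta_t, wind)
    print(f"{label}: DOE-2 = {doe2:.2f}, BLAST = {blast:.2f}")
```

Under these assumptions the DOE-2 set responds to wind speed only, so it yields a much smaller flow than the BLAST set on calm, cold days, while the two sets give similar values in windy, mild conditions. This is consistent with the caution that results may not transfer across climates.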
Conclusions
This study demonstrates that building performance simulation accuracy is significantly influenced not only by the level of data availability but also by specific modelling choices, particularly those related to occupant behaviour and ventilation systems.
Unlike previous studies that emphasised parameter importance in isolation, this work systematically quantifies these impacts through a controlled case study with three distinct data availability levels, enabling a practical evaluation of how different modelling approaches interact with available data to shape simulation reliability. By systematically evaluating three simulation sets with progressively increasing data availability, we provide quantitative evidence of how detailed and context-specific inputs reduce discrepancies between measured and simulated results. Our results show that improved data availability, specifically related to building characteristics and building occupants, substantially reduces the discrepancy between simulated and measured building performance.
Our systematic evaluation reveals substantial improvements in model accuracy with increased data availability. The progression from low to medium data availability (incorporating questionnaires and context-specific data) reduced the mean bias error (MBE) of hourly heating use from around 100% to around 30%. Further enhancement to high data availability, incorporating measured occupant behaviour data, reduced the difference between simulated and measured annual heating use to around 5% (simulation experiments 3.2 and 3.3). At the monthly level, the MBE was 9.6% (simulation experiment 3.2) and 11.5% (simulation experiment 3.3) for heating use and −0.4°C for indoor temperature. These results provide the first systematic quantification of how data availability affects both energy and indoor climate outcomes, addressing a widely acknowledged but rarely empirically demonstrated factor in building performance simulation.
However, the study also reveals that greater model complexity does not guarantee improved accuracy. Certain modelling limitations remain, particularly related to behavioural uncertainties that could not be fully captured, such as radiator and window use patterns. Additionally, our findings highlight the importance of certain modelling choices: for example, experiment 3.5, despite high data availability, overpredicted heating use due to unresolved uncertainties introduced by advanced ventilation modelling. This underlines the importance of aligning the model abstraction level with the level of detail in the available input data, and of managing uncertainty not only through richer data but also through transparent modelling assumptions and validation practices.
For the building simulation community, our findings emphasise the need for a more nuanced approach to model development, especially in contexts where detailed inputs are not readily available. In such cases, such as early-stage design or policy assessment, using medium-level, easy-to-obtain data (e.g., schedules or occupant surveys), combined with scenario testing and sensitivity analysis, can still yield reliable insights. Importantly, practitioners must recognize that increased accuracy does not automatically translate to better decisions. The potential benefit of additional data collection and increased model complexity should be evaluated against the specific decision context, acknowledging that increasing data availability and model complexity do not automatically lead to higher fidelity. The focus should be on whether any improvement in accuracy would actually alter design choices, equipment selections, or operational strategies. This study encourages practitioners to move beyond the binary choice of “simple versus detailed models” and instead adopt a data-informed modelling strategy that balances realism, transparency, and robustness.
Overall, this research underscores the need to align model complexity with the granularity of available input data. It also offers practical insights for modelers and practitioners, showing that even when full datasets are not accessible, as in the case of new or early-stage designs, transparent assumptions, sensitivity analyses, and scenario testing can help ensure reliable and informative simulation outcomes. By highlighting how varying levels of input detail affect model performance, this study contributes to a more informed and nuanced application of building performance simulation in practice.
Footnotes
Acknowledgments
The authors gratefully acknowledge the project IEBB (Integrale Energietransitie in Bestaande Bouw) Theme 2: Data-driven Optimization of Renovation Concepts, for providing access to the monitoring data that made this research possible.
Author contribution
All authors contributed to the study conception and design. Material preparation and analysis were performed by Luyi Xu and Günsu Merin Abbas. The first draft of the manuscript was written by Luyi Xu and Günsu Merin Abbas, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is part of the Renovation Explorer project (Renovatieverkenner in Dutch), which was funded by the Government of the Netherlands, represented by The Ministry of Economic Affairs and Climate Policy (BZK) and the Netherlands Enterprise Agency (RVO) (202012074).
Ethical considerations
The Ethical Review Board TU/e approved the reuse of the monitoring data from the project IEBB (Integrale Energietransitie in Bestaande Bouw) Theme 2: Data-driven Optimization of Renovation Concepts on January 19, 2023 (reference ERB2023BE3).
