Abstract
Background
Improving awareness of Technical Debt (TD) has been demonstrated in prior work using the serious game TD-SIM. Assessing how effectively TD-SIM supports epistemic change is essential for its continued development. Examining participant play strategies offers a way to quantify a serious game’s impact on shifting player epistemic states, while also revealing opportunities for refinement. Yet the literature on serious game player behavior remains largely silent on strategy assessment, a gap our methodology directly addresses.
Method
The serious game TD-SIM was developed in prior work using real-world players. We analyze prior participant game data for actions taken. Combinations of actions define the concept of agent archetypes. Bayesian analysis is used to calculate probabilities of real-world play conditioned on agent archetypes. Principal Component Analysis (PCA) is used to classify conditional probabilities to which K-Means clustering is applied. Clusters of probability data are the strategies players of TD-SIM use, referred to as player archetypes.
Results
Our method identifies four player archetypes, characterizing playing strategies as Gaming, Managers, Learning, or Testing. Player archetypes inform and quantify behaviors of players, providing insight into improvement opportunities.
Conclusion
Measuring player archetypes is complex, but not impossible. We evaluate player archetypes of TD-SIM, quantifying design improvements. Identifying and modelling participant playing strategies is a novel enhancement to general serious game evaluation. The benefit of our approach is that it allows us to characterize a player’s behavior in a quantifiable manner, which we can measure against intended purpose.
Remote Operations Centres (ROCs) operate industrial equipment from several hundred kilometres away to deliver nationally and commercially important resources and services in remote and often harsh environments (Lakshmanan et al., 2019). These ROCs orchestrate human decision makers and multiple software systems from numerous vendors. All software systems have Technical Debt (TD), the long-term impact of shortcuts and compromises taken during the software development lifecycle (Besker et al., 2019; Kruchten et al., 2019). Managing technical debt within ROCs is complicated by interactions between software systems and decision makers, producing adaptive and emergent behaviours characteristic of Complex Adaptive Systems (CAS) (Bekebrede et al., 2015; Gould et al., 2024).
Our use case is a ROC owned by an Australian mining company with export revenue of $37 billion AUD (US$ 25bn) in 2025. Significant technical debt issues can affect company profitability and State and Federal budgets (Government of Western Australia, 2025; Ramasubbu & Kemerer, 2021). However, the potential risk of technical debt is not visible to senior management decision makers.
Gould et al. (2024) developed the serious game TD-SIM based on a real-world ROC case study to address CAS complexities and improve awareness of technical debt. Improved situational awareness is a precursor to improved decision making (Wehrle et al., 2022) and thus by extension more effective technical debt management. Evaluating TD-SIM’s effectiveness for different player strategies, which we call player archetypes, is essential for its ongoing refinement as a technical debt management tool. This paper presents a method to enhance serious game evaluation, with targeted insight into TD-SIM’s effectiveness in fostering epistemic change in technical debt awareness, by measuring player archetypes.
Background
Serious games, simulation games, and gaming simulations concern human participants playing a role in an artificial setting (Meijer, 2009). These formats abstract real-world scenarios for learning and are collectively referred to as Serious Games (SGs) (Hauge et al., 2012; Kriz, 2003; E. Leigh & Spindler, 2005). Evaluating serious games concerns measuring, validating and verifying serious game outcomes (Mayer et al., 2014; Petri & von Wangenheim, 2016). It is an integral part of serious game development (E. E. Leigh & Levesque, 2024; van den Hoogen et al., 2014). The primary form of serious game evaluation when focusing on player experience is achieved using pre/post-game questionnaires (Catalano et al., 2014), which suffer from issues of subjectivity (Loh et al., 2015).
Objective measures of play behaviour are possible by analysing in-game player data (Loh et al., 2007). This can be used to assess if the strategy a player employs is in line with, or is exploiting an aberration of, a serious game’s abstraction of reality (Klabbers, 2009) such as gaming-the-game (Frank, 2012). By quantifying player strategies, TD-SIM’s efficacy in promoting epistemic change in technical debt awareness can be measured, thereby quantifying opportunities to improve game design. Of the many perspectives in learning science on epistemic change (Arthars et al., 2024), we adopt that of Shaffer (2006), focusing on pedagogical goals for developing domain specific expert capabilities, to preserve relevance across serious game contexts.
Serious games research has largely examined strategy through pedagogical lenses (Bellotti et al., 2011; Cheng et al., 2015). Westera (2017) proposes a computational model of player strategy, but it has not been applied to an actual serious game. This leaves a gap in evaluating player strategy empirically. We address this by introducing a method for measuring strategies used by real-world players of TD-SIM (participants), analysing in situ game data within a Bayesian framework to quantify distinct player archetypes.
Overview of TD-SIM
A mid-game view of TD-SIM developed by Gould et al. (2024) is shown in Figure 1. TD-SIM was a single player serious game hosted on a stand-alone laptop, written in the Python language using the PyGame package. TD-SIM was designed to support an epistemic state change in industrial supply chain knowledge of technical debt. The bolded items in the following section correspond to the game design elements depicted in Figure 1.
Figure 1. TD-SIM mid-game Showing Waste, Spanner and Cog Movements.
TD-SIM players move randomly created red
Balancing technical debt creation, removal, and cash generation, players explore the impact of their decisions (management). Strategies employed by players were central to their learning experience. For instance, players whose strategy was to proactively minimise technical debt creation whilst generating as much cash as possible act in line with TD-SIM’s intent. A less appropriate strategy (from the game designer’s perspective) was for players to perform well by achieving a high cash outcome with minimal technical debt management, signalling that TD-SIM was being gamed. Alternatively, when players’ choices reflected prolonged engagement with basic mechanics rather than task-relevant strategy, TD-SIM was imposing unnecessary cognitive load, making it harder to master. Evidence of these latter behaviours indicates limitations in TD-SIM’s ability to foster intended epistemic change in technical debt awareness, whilst providing insight into game design improvement opportunities.
Methodology
Research Question
We address two questions: (RQ1) How to measure strategies used by simulation participants? and (RQ2) What insights do such strategies reveal?
Using statistical analysis within a Bayesian framework applied to prior TD-SIM data, our methodology identifies, classifies, and clusters individual strategies into player archetypes. These archetypes provide insight into player behaviour and inform the game’s effectiveness in supporting epistemic change. First, we outline our Bayesian framework, followed by a description of our methodology.
Bayesian Framework Overview
Our model calculates which predefined play strategies, our agent archetypes, a participant’s game data most probably corresponds to. When repeated for all participants, clustered results provide insight into strategies used by players, which we refer to as player archetypes. These probabilities cannot be calculated directly from game data. We therefore use a Bayesian framework to express them in terms of conditional probabilities that can be calculated from available game data. A discussion of our Bayesian framework is provided in our supplementary information.
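As a hedged illustration of the general idea only (the authors' actual formulation is in their supplementary information and may differ), Bayes' rule turns the probability of observed play under each agent archetype into the probability of each agent archetype given the observed play. The function name, the uniform prior, and the likelihood values below are assumptions made for this sketch, not values from the study.

```python
def posterior_over_archetypes(likelihoods, priors=None):
    """Return P(archetype | data) from P(data | archetype) using Bayes' rule."""
    names = list(likelihoods)
    if priors is None:
        # Assumption for this sketch: a uniform prior over agent archetypes.
        priors = {name: 1.0 / len(names) for name in names}
    # Joint probability P(data, archetype) for each archetype.
    joint = {name: likelihoods[name] * priors[name] for name in names}
    evidence = sum(joint.values())  # P(data), the normalising constant
    if evidence == 0:
        raise ValueError("data has zero probability under every archetype")
    return {name: joint[name] / evidence for name in names}


# Hypothetical likelihoods for one participant game (illustrative only).
example_likelihoods = {"AA_1": 0.02, "AA_2": 0.06, "AA_3": 0.12}
posterior = posterior_over_archetypes(example_likelihoods)
for name, probability in posterior.items():
    print(name, round(probability, 3))
```

Each participant game would yield one such row of probabilities, one value per agent archetype.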
Model Methodology
Steps of our methodology are illustrated in Figure 2, each outlined below.
Figure 2. Methodology.
Step 1: Define Agent Archetypes
Table 1. Actions and Parameters Used to Identify Each Agent Archetype’s Strategy. Technical Debt (TD) Growth Parameters are Minimum (MIN), Maximum (MAX), or Randomised (RND).
Among the actions in Table 1, Target is the only one parameterized as continuous data, affording sensitivity testing of our method to closely related agent archetypes. To prevent trivial attainment of a target, prime number targets are used (53%, 43%, 23%), given technical debt upgrades occur in +5% (MIN) or +15% (MAX) increments. RND is a random strategy with equal probability of MIN and MAX occurring. The maximum achievable target under all schemes is 85%. Remaining parameters are binary (T/F), either enabled or disabled respectively. Each agent archetype is uniquely defined by concatenating actions (e.g., paying down TD, bypassing technical debt management) and associated parameters.
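The concatenation of actions and parameters into a unique identifier can be sketched as follows. The abbreviations TG, SC, MG, OP, A1 and A2 are those used later in the paper, but the class, its field names, and the exact label format are assumptions made for this illustration, not the authors' implementation.

```python
from dataclasses import dataclass

SCHEMES = ("MIN", "MAX", "RND")  # technical debt growth schemes named in the text


@dataclass(frozen=True)
class AgentArchetype:
    target: int      # TG: target technical debt, in percent
    scheme: str      # SC: one of SCHEMES
    manage: bool     # MG: technical debt management enabled?
    optimise: bool   # OP: outcome optimisation enabled?
    a1: bool         # A1: first functionality action enabled?
    a2: bool         # A2: second functionality action enabled?

    def __post_init__(self):
        if self.scheme not in SCHEMES:
            raise ValueError(f"unknown scheme: {self.scheme!r}")
        if not 0 <= self.target <= 85:  # 85% is the stated maximum target
            raise ValueError("target must lie between 0% and 85%")

    def label(self) -> str:
        """Concatenate actions and parameters into one identifier."""
        def flag(enabled: bool) -> str:
            return "T" if enabled else "F"
        return "-".join([
            f"TG({self.target}%)",
            f"SC({self.scheme})",
            f"MG({flag(self.manage)})",
            f"OP({flag(self.optimise)})",
            f"A1({flag(self.a1)})",
            f"A2({flag(self.a2)})",
        ])


example = AgentArchetype(53, "MIN", True, True, True, True)
print(example.label())
```

Because the dataclass is frozen and compares by value, two archetypes with identical parameters are equal and hash to the same key, which makes it easy to check that a set of archetypes is unique.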
Step 2: Apply Strategy Model
Using identified agent archetypes, we calculate the probability of each agent archetype’s strategy occurring given participant game data. We augment probabilities with participant outcome data so that player performance can be incorporated into subsequent classification and clustering. Specifically, technical debt generated and cash reward earned per game are extracted from participant games. The resulting data matrix contains columns of conditional probabilities for all agent archetypes alongside technical debt generated and cash rewards achieved, with each row representing a single participant’s game. This matrix is imported into Python 2025.2 as a pandas DataFrame for analysis.
Step 3: Classify and Cluster Strategies
Principal Component Analysis (PCA) is used to reduce the columns of our data matrix (called dimensions) down to a smaller set of Principal Components (PCs). Each principal component is a unique combination of dimensions from the data matrix that captures a specific portion of data matrix variance (Abdi & Williams, 2010). PCA is an established means of classification in serious games, used for instance by Rodrigues and Brancher (2018). We perform PCA using the Python library sklearn and associated packages.
We first apply a StandardScaler to normalize column data to a mean of zero with a standard deviation of one. This is important because our analysis incorporates agent archetype probabilities, technical debt generated, and cash rewards achieved, with different scale dimensions, something PCA is sensitive to. Second, PCA is fitted to the normalized data. The standard practice of considering principal components with eigenvalues greater than one is adopted to identify principal components for further analysis (Orji et al., 2014).
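A minimal sketch of these two steps is given below. It uses NumPy directly rather than sklearn so that the arithmetic is visible: columns are standardised, the correlation matrix is eigendecomposed, and only components with eigenvalues greater than one are retained. The function names and the synthetic data are assumptions for illustration and do not reproduce the study's data matrix.

```python
import numpy as np


def standardise(X):
    """Scale each column to zero mean and unit (population) standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # constant columns stay at zero
    return (X - mean) / std


def pca_with_eigenvalue_rule(X):
    """Return all eigenvalues, plus eigenvectors and scores for eigenvalues > 1."""
    Z = standardise(X)
    corr = (Z.T @ Z) / Z.shape[0]            # correlation matrix of the columns
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]    # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    keep = eigenvalues > 1.0                 # eigenvalue-greater-than-one rule
    components = eigenvectors[:, keep]
    scores = Z @ components                  # rows projected onto kept components
    return eigenvalues, components, scores


# Synthetic stand-in for a data matrix: 340 rows, 6 columns on different scales.
rng = np.random.default_rng(0)
latent = rng.normal(size=(340, 2))
mixing = rng.normal(size=(2, 6))
noise = 0.5 * rng.normal(size=(340, 6))
X = (latent @ mixing + noise) * np.array([1.0, 10.0, 100.0, 0.01, 1.0, 1000.0])

eigenvalues, components, scores = pca_with_eigenvalue_rule(X)
explained = eigenvalues / eigenvalues.sum()
n_kept = components.shape[1]
print(n_kept, "components kept, explaining", round(float(explained[:n_kept].sum()), 3))
```

Without the standardisation step, the column multiplied by 1000 would dominate the variance and therefore the first principal component, which is the scale sensitivity the text refers to.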
K-Means clustering is applied to identified principal components. It is an unsupervised clustering method used in serious game data analysis, often in conjunction with PCA (Kanungo et al., 2002; Rodrigues & Brancher, 2018). Clustering is achieved by minimizing the within-cluster sum of squares (WCSS) of distances between each point and its corresponding centroid, for a number, k, of pre-selected centroids (Kanungo et al., 2002).
Over an increasing number of k centroid trials, WCSS decreases. The elbow method for cluster analysis is used to assess the optimum number of clusters, the point at which WCSS decreases become insignificant to the analysis (Humaira & Rasyidah, 2020). The player archetype of each cluster is assessed using techniques from principal component analysis.
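The relationship between k and WCSS can be sketched with a small NumPy implementation of Lloyd's algorithm, shown here instead of a library routine so the WCSS calculation is explicit. The function name, the random initialisation, and the synthetic four-blob data are assumptions for illustration only; reading the elbow off the resulting values remains a judgement call.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns labels, centroids and the final WCSS."""
    rng = np.random.default_rng(seed)
    # Initialise centroids at k distinct, randomly chosen data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        sq_dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    sq_dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = sq_dist.argmin(axis=1)
    wcss = float(sq_dist[np.arange(len(points)), labels].sum())
    return labels, centroids, wcss


# Synthetic data: four well-separated blobs in two dimensions.
rng = np.random.default_rng(1)
centres = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0], [6.0, 6.0]])
points = np.vstack([c + rng.normal(scale=0.7, size=(60, 2)) for c in centres])

wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 9)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 1))
```

A single random initialisation can settle in a local minimum, so in practice several restarts per k are usually run and the lowest WCSS kept before the elbow is assessed.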
Step 4: Characterise Player Archetypes
Significant principal components for a cluster are the ones with the highest average principal component importance scores across members. Cluster member importance scores are calculated as principal component squared cosine values (Abdi & Williams, 2010). Strategic characteristics are derived for each cluster’s significant principal components.
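One way to compute such importance scores is sketched below: the squared cosine of an observation on a component is its squared score on that component divided by the sum of its squared scores over the components considered, and cluster importance is the average over cluster members. Whether the authors normalise over all components or only the retained ones is not stated here, so that choice, the function names, and the toy scores are assumptions.

```python
import numpy as np


def squared_cosines(scores):
    """Share of each row's squared length captured by each component."""
    squared = scores ** 2
    total = squared.sum(axis=1, keepdims=True)
    total = np.where(total == 0, 1.0, total)  # rows at the origin score zero
    return squared / total


def cluster_importance(scores, labels):
    """Average squared cosine per component for every cluster."""
    cos2 = squared_cosines(scores)
    return {int(c): cos2[labels == c].mean(axis=0) for c in np.unique(labels)}


# Toy principal component scores for three games on two components.
scores = np.array([[3.0, 4.0],
                   [1.0, 0.0],
                   [0.0, 2.0]])
labels = np.array([0, 0, 1])

importance = cluster_importance(scores, labels)
for cluster, values in importance.items():
    print(cluster, np.round(values, 2))
```

In this toy case the first cluster is best described by the first component and the second cluster entirely by the second, which is the kind of reading used to pick a cluster's significant principal components.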
Strategic characteristics of each significant principal component are identified by analysing the dimensions from the data matrix that contribute most to it. Overall strategic characteristics of a cluster’s most significant principal component are its player archetype, subsequently named to reflect the behaviours observed. The amount a dimension contributes to a principal component is measured as the square of the dimension’s correlation coefficient with that component, its so-called loading score (Abdi & Williams, 2010). To reduce analytical effort, the six highest-ranking dimensions by loading score of each principal component are considered for strategy analysis.
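The ranking of dimensions by loading score can be sketched as below, taking the loading score to be the squared Pearson correlation between a data matrix column and a component's scores. The function name, the dimension names, and the synthetic data are hypothetical and serve only to illustrate the ranking step.

```python
import numpy as np


def top_loading_dimensions(data, component_scores, names, n_top=6):
    """Rank dimensions by squared correlation with one component's scores."""
    squared_loadings = np.array([
        np.corrcoef(data[:, j], component_scores)[0, 1] ** 2
        for j in range(data.shape[1])
    ])
    order = np.argsort(squared_loadings)[::-1][:n_top]  # highest first
    return [(names[j], float(squared_loadings[j])) for j in order]


# Synthetic example: eight dimensions, with the component driven mainly by dim_0.
rng = np.random.default_rng(2)
data = rng.normal(size=(200, 8))
component_scores = 2.0 * data[:, 0] + 0.5 * data[:, 1]
names = [f"dim_{j}" for j in range(data.shape[1])]  # hypothetical labels

top_six = top_loading_dimensions(data, component_scores, names)
for name, score in top_six:
    print(name, round(score, 3))
```

Limiting the list to six dimensions is the paper's stated simplification; the `n_top` parameter makes it straightforward to test how sensitive a strategy reading is to that cut-off.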
Results
Agent Archetypes
We identify 57 unique agent archetypes. Equation (1) provides an agent archetype example. The strategy of this agent archetype is to target mid-range technical debt growth TG (53%) by balancing management MG(T) with optimised game outcomes OP(T), whilst making use of TD-SIM’s functionality, A1(T) and A2(T). Technical debt growth is achieved using minimal increments of +5%, SC(MIN). From a design perspective, this strategy reflects a balanced combination of management, optimisation, and mechanics-focused play, making it suited to eliciting behaviours that TD-SIM is designed to provoke.
Model Application
In prior work 17 participants play four games each with five turns, providing 340 games in total, to which our model is applied. We augment calculated probabilities with percentage technical debt generated, and quantity of cash achieved per game. Our complete data matrix consists of 57 columns of conditional probabilities alongside one column of percentage technical debt generated and one column of cash reward obtained, 59 dimensions in total, by 340 rows of participant data.
Principal Component Classification and K-Means Clustering
Principal component analysis reduces 59 dimensions to 17 principal components, accounting for over 70% accumulated data matrix variance. Principal components are ordered and labelled from PC_1 (12.6% variance), down to PC_17 (1.8% variance).
Successive trials of K-Means clustering are applied to the 17 principal components. Four clusters are optimal using the elbow method. Final clusters are presented in Figure 3 below, with clusters labelled A, B, C and D for reference.
Figure 3. PCA Fitted Data as 4 Clusters, each Labelled Separately as A, B, C, D in a Black Diamond, each Designated with Corresponding Player Archetype.
Significant Principal Components (Importance Score Analysis)
Table 2. Normalised Aggregated Importance Scores for Each Cluster for PC_1 to PC_17.
a Cluster size as a percentage of participant game data in the cluster (340 games in total).
b See Figure 3 for cluster labels.
Strategic Characteristics (Loading Score Analysis)
Table 3. Dimension Summary for the Six Highest Contributing Dimensions of Principal Components.
a AA = Agent Archetype / TD = Technical Debt.
b T = Percentage True / F = Percentage False.
In Table 3, TG is the arithmetic average of percentage target technical debt, and SC is the percentage contribution of each scheme used. For management (MG), optimisation (OP), and functionality (A1, A2), the percentage enabled (T) or disabled (F) is shown, whichever is greater.
Cluster A Strategic Characteristics (PC_2)
The single significant principal component of cluster A is PC_2. Referring to Table 3, PC_2 has an average mid-range technical debt target of 53%, and balanced technical debt growth with scheme MIN (50%)/RND (50%). Some rapid technical debt growth is present as the RND scheme is an equal mix of MIN and MAX. Moderate technical debt management occurs T (67%), with no attempt to optimise cash rewards F (100%), or use A1 or A2 functionality, both F (100%). The bias toward technical debt management, with some exploratory technical debt growth and no attempt to optimise cash, implies cluster A represents a Managers player archetype.
Cluster B Strategic Characteristics (PC_1)
The single significant principal component of cluster B is PC_1 with a mid-range target of technical debt creation (53%), using a minimalistic scheme, MIN(100%). There is a slight bias towards managing technical debt T (67%) across contributing agent archetypes, with a single focus on optimizing cash creation T (100%). There is largely no attempt to utilise TD-SIM functionality with A1 and A2 both being F (83%). The strategic behaviour of this cluster is to cautiously target mid-range TD, with a focus on optimising cash, suggesting an attempt to game TD-SIM rather than experiment with management. For these reasons we assess cluster B as a Gaming player archetype.
Cluster C Strategic Characteristics (PC_3)
Cluster C also has a single significant principal component. PC_3 strategy is about targeting high technical debt TG (72%), creating it as quickly as possible with a MAX (80%) scheme, whilst optimising in-game cash reward T (100%). Minimal technical debt management occurs T (40%)/F (60%) with no attempt to use A1 or A2 mechanics, both F (100%). For these reasons cluster C is assessed as a Learning player archetype, with cluster members experimenting with how far to push technical debt growth but not necessarily with how to manage technical debt. This assessment is reinforced by the occurrence of the TD dimension, PC_3 being the only principal component to feature a non-agent archetype dimension.
Cluster D Strategic Characteristics (PC_1, PC_3, PC_4)
Cluster D has three significant principal components, of which all but PC_4 have been addressed above. PC_4 strategy has a low technical debt target of 43%, achieved predominantly using a RND technical debt growth strategy RND (83%). MG, OP, A1 and A2 are all employed suggesting an all-in strategy with no obvious single intent. We attribute a Testing strategic theme to PC_4. Cluster D is a combination of strategies with cluster members spreading their game efforts approximately equally between Gaming (34%), Learning (36%) and Testing (29%) when apportioned by importance scores in Table 2 using data from the Significant/PCs column.
Discussion
Player Archetypes, Epistemic Change, and Insights
Evaluation of TD-SIM participant game data yields four clusters, representing four unique player archetypes, shown in Figure 3. TD-SIM’s efficacy for epistemic change in technical debt awareness and insight into improvement opportunities stem from understanding player archetypes as a reflection of player behaviour.
Managers are exploring technical debt management through creating and paying it down, a positive reinforcement that TD-SIM provided these cluster members an opportunity for epistemic change. This cluster, however, represents only 20% of total participant games (Table 2). On this measure, TD-SIM is only 20% effective in achieving intended outcomes for the total player cohort. For TD-SIM to become an embedded technical debt management tool, its efficacy for epistemic change must increase. By quantifying current performance, future improvements have a baseline to compare against.
Cluster C players correspond to a Learning player archetype, exhibiting strategic patterns that closely align with Managers, differentiated primarily by use of optimisation mechanics (see Table 3). This proximity indicates a design lever. Targeted adjustments to optimisation-related mechanics may promote future Learning archetypes to adopt behaviours more aligned with Managers. Accordingly, our method identifies and quantifies specific areas of the game for refinement.
The least desired player archetype, and one with the most opportunity for improvement, is cluster D, a mix of Gaming, Learning and Testing. Over 50% of participant game data resides in cluster D, suggesting further player archetype breakdown is warranted. As a data driven approach our method provides information for deep diving into potential sub-clusters. Further work is needed in this space to understand opportunities and capabilities of our methodology.
Cluster B player archetype is Gaming. The ability of players to game TD-SIM indicates an imbalance of reward and challenge. It is considered a positive outcome that only 11% of total games played used this strategy, small compared to the 50% of participant games employing a mixed player archetype. This contrast indicates where design improvements could be prioritised.
Our findings provide quantitative confirmation of insights reported in earlier TD-SIM research (Gould et al., 2024). For instance, TD-SIM was previously assessed to be unbalanced regarding technical debt management versus gaming-the-game. Our evaluation of player archetypes agrees, quantifying 11% of TD-SIM players employed a Gaming strategy.
Method Execution
Our model and methodology are conceptually complex. In addition to the mathematical rigour of our Bayesian analysis, the application of importance scoring and loading score evaluation represent non-trivial processes essential to achieving our outcomes. Despite being challenging, and the literature being silent on player strategy assessment, it is possible for player strategies to be measured. Notwithstanding this complexity, the execution of our method is relatively straightforward once established, as it is supported by automated Python scripts.
Some subjective interpretation of clustering, classification, and strategic analysis is necessary, such as the number of contributing principal component dimensions to use in strategy characterisation, or the elbow method to determine the optimal number of clusters, albeit guided by accepted data science principles (Humaira & Rasyidah, 2020). Sensitivity analysis of our method to such subjective decisions could provide guidance on the extent to which such choices influence the outcomes observed and thus the overall objectivity of our method. Correspondingly, no significant sensitivity to closely related agent archetypes is observed.
Model Findings
Our goal is to address the research questions: (RQ1) How can we measure strategies used by simulation participants? and (RQ2) What insights do such strategies reveal?
We answer (RQ1) by evaluating TD-SIM using our method, finding and measuring four player archetypes. Our player archetypes are: cluster A with a Managers player archetype, cluster B with a Gaming player archetype, cluster C with a Learning player archetype, and cluster D with a mixed player archetype consisting of Learning, Gaming, and Testing archetypes.
In answer to (RQ2) we assess TD-SIM’s efficacy for promoting epistemic state change, finding that 20% of participant games exhibit a Managers strategy, which best aligns with the intent of TD-SIM. Additionally, we find quantified opportunities to improve this outcome. Our approach advances serious game evaluation by offering a systematic method for characterising player strategies, measuring serious game effectiveness, and quantifying opportunities for improvement. Our method, while analytically complex, remains feasible to implement.
Limitations and Suggestions for Future Research
Notwithstanding these findings, the work presented here remains experimental and requires further validation. The available sample is limited in size and derived from a single serious game. Because our method is generalisable, it can be applied across a broader range of games, creating opportunities to extend this research with larger and more diverse data sets.
Our method requires subjective choices albeit guided by accepted data science principles such as the elbow method (Humaira & Rasyidah, 2020). Such choices naturally influence our outcomes and potentially our findings. Future work on sensitivity analysis of our method could address such weaknesses.
Our Bayesian model assumes all game states are visible, which holds for agent archetypes, but not necessarily for participants, as players may not, for instance, visualise all game states during a game. Future work to align player and agent archetype state visibility, potentially through the use of Hidden Markov Models, could address this.
Conclusions
This paper presents a novel method for measuring serious-game participant strategies through the construction of player archetypes. Using Bayesian analysis, principal component analysis classification, and K-means clustering, it is possible to derive and measure player archetypes of real-world game data. Although analytically demanding, our methodology demonstrates that strategy measurement is feasible. We improve serious game evaluation by providing a means of measuring a game’s effectiveness in supporting players’ epistemic development within a target knowledge domain, quantifying opportunities for serious game improvement.
Supplemental Material
Supplemental material - Measuring Simulation Participant Playing Strategies
Supplemental material for Measuring Simulation Participant Playing Strategies by David Gould, Tim French, Melinda Hodkiewicz in Simulation & Gaming.
Footnotes
Declaration of Conflicting Interests
All opinions and conclusions drawn are those of the author alone. To the extent that this publication draws on information provided by study participants in this or prior work, it should not be assumed that any views expressed herein are also necessarily those of such participants. The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
No funding was required for this research; it is a contribution toward the lead author’s PhD.
Informed Consent
The research submitted for consideration only uses data from prior research. The prior research did use human participants; to this extent, the outcome of the ethics review has been provided as confirmation of informed consent. Please note, no human participants were used in this current research.