Abstract
This study presents a data-driven approach for smart tire management through the application of unsupervised machine learning techniques. Using a real-world dataset composed of synchronized records collected at 1 Hz from a sensor-equipped fleet vehicle, the research investigates how clustering algorithms – K-Means and BIRCH with Agglomerative Clustering – can be employed to identify distinct operational stages in the usage cycle of tires. A comprehensive descriptive analysis was first conducted to understand the behavior and correlations among pressure, temperature, and speed data. The clustering analysis, applied both globally and by individual tire positions, revealed that the optimal number of clusters can vary depending on the tire’s location. The findings highlight the importance of position-aware tire analytics and support the development of intelligent tire management systems capable of optimizing performance, enhancing safety, and extending tire lifespan.
Introduction
Rubber, a crucial commodity in various daily products, reached a production volume of 29.6 million metric tons by 2022, with a substantial portion allocated to tire manufacturing. 1 The quality of tires on the market can be substantially improved through advanced monitoring, with the automotive industry investing in solutions such as Tire Pressure Monitoring Systems (TPMS), which will soon be mandatory in Europe. 2
In this context, the project “Traceable Smart Products and Interoperability in the Automotive Industry Supply Chain with the Application to Sensorized Tires – Tire IoT,” financed by FINEP (Brazil), aims to create a smart tire equipped with Radio-Frequency Identification (RFID) tags, as well as pressure and temperature sensors, for ongoing tire monitoring from production to disposal. This device has been engineered to withstand the demanding conditions of the vulcanization process and daily use; it should also feature real-time communication capabilities, connecting the tire to the vehicle’s Controller Area Network (CAN), to other instruments, and to the cloud, ensuring traceability and data connectivity throughout the production chain.
Internet of Things (IoT) tires’ benefits to the market include optimizing manufacturing and maintenance processes, reducing operating expenses, and increasing vehicle safety. 3 For consumers, these tires provide greater safety through continuous tire condition monitoring, and sustainability through extended tire life and reduced waste. 4 However, these benefits can only be achieved through the development of models that allow the analysis of the acquired sensor data. Thus, the development and application of IoT and Artificial Intelligence (AI) technologies in tires represent a significant advance for the automotive industry and for the end-user experience.
It is important to note that the use of machine learning algorithms to identify patterns in sensor-equipped tires has the potential to alert about out-of-specification usage conditions and contribute to the prediction of tire lifespan. 5 This can lead to an extended tire lifespan, a better maintenance schedule, and consequently a lower Cost per Kilometer (CPK). These capabilities support efficient management of operating conditions and aid in stakeholder decision-making, providing advantages for both consumers and the environment.
However, traditional supervised machine learning techniques are not straightforward to apply in this domain because labeled data are lacking. 6 For example, fleet managers do not have the labeled data (i.e. known examples of “good,” “acceptable,” or “disposable” conditions) required to train classification algorithms. This gap necessitates an alternative approach.
Accordingly, this study adopts an exploratory, data-driven approach that combines descriptive analysis and unsupervised machine learning to analyze the tire usage cycle. By applying clustering models to the observable variables – tire pressure, temperature, and vehicle speed – the primary objective is to identify the distinct operational stages a tire naturally undergoes during normal service.
The unlabeled dataset, which consists of 1 Hz time-series data from six sensor-equipped tires on a fleet bus over a 30-day period, is leveraged to achieve this goal. It was provided by Prometeon Tyre Group (publicly available in Harvard Dataverse 7).
Thus, this work seeks to investigate operational patterns in fleet tires and address the following research question: can an exploratory approach combining descriptive analysis and unsupervised machine learning applied to fleet tire sensor data reveal distinct operational states associated with tire position and usage conditions, thereby supporting the development of intelligent, data-driven tire management strategies?
Determining the optimal number of clusters in a dataset of fleet tire data can be a crucial step toward uncovering underlying patterns in tire usage behavior. Clustering facilitates the identification of intrinsic groupings within the dataset, which may correspond to variations in operational conditions, such as route types, vehicle loads, or driving styles. 3 These groupings enable a more nuanced understanding of tire performance, supporting data-driven approaches to predictive maintenance, resource allocation, and lifecycle optimization. Additionally, clustering serves to reduce the dimensional complexity of the data, allowing for clearer interpretation and the development of targeted strategies for each subgroup. 8 Establishing the most appropriate number of clusters also provides a foundation for subsequent modeling tasks, including anomaly detection and condition-based forecasting. As such, this analysis plays a pivotal role in advancing intelligent fleet management practices grounded in empirical evidence.
This work makes three primary contributions: (i) the development of an integrated framework that combines descriptive statistics and clustering algorithms to extract operational patterns from fleet tire data, enabling a deeper understanding of this domain; (ii) the application of unsupervised clustering techniques (K-Means and BIRCH) to identify tire operational phases without prior labeling; and (iii) the characterization of tire-specific behaviors across different wheel positions, highlighting the importance of position-aware analysis to improve maintenance strategies, safety, and service life. Collectively, these contributions demonstrate the potential of integrating Internet of Things (IoT) technologies with machine learning approaches for intelligent tire management.
This paper is organized as follows: the next section reviews the state-of-the-art in tire wear modeling and the application of machine learning to tire data. Subsequently, unsupervised learning techniques for clustering tasks are introduced. The following section presents the proposed methodology, which encompasses a descriptive analysis of the data and the application of two clustering approaches, K-Means and BIRCH followed by Hierarchical Agglomerative Clustering, to group tires based on similarity. The experimental results are then presented and discussed. Finally, the paper concludes with a summary of the main findings and future perspectives.
Literature review
This section provides a comprehensive overview of existing research on smart tire technology, focusing on two key domains: the development of tire wear models and the diverse applications of machine learning for tire analysis. The aim is to contextualize the present study within the broader academic landscape by examining the state-of-the-art and identifying current research gaps.
Tire wear models
Accurately predicting tire wear is a critical aspect of smart tire technology, with ongoing research dedicated to improving model fidelity under real-world conditions.9,10 The literature reveals two primary streams of investigation: one centered on physics-informed models and another on data-driven methods for real-time monitoring.
The first stream focuses on developing comprehensive, physics-based, and hybrid models. A notable example is the work of Sakhnevych and Genovese, 10 who presented an integrated model unifying adhesive and hysteretic friction by incorporating rubber viscoelastic properties, road roughness, and thermodynamic states. Their theoretical work was complemented by an experimental methodology using camera-assisted observation of micro-damage, which broadened the understanding of abrasion processes by highlighting the critical role of material relaxation. Similarly, Napolitano Dell’Annunziata et al. 11 proposed a hybrid model that combines physical principles with statistical analysis to create a more robust predictive methodology than purely phenomenological approaches. Further blending these techniques, Kim et al. 12 integrated finite element analysis with a one-dimensional convolutional neural network (CNN), demonstrating how physics-based simulations can deepen the understanding of wear dynamics and improve predictive accuracy.
In parallel, a second research stream has prioritized data-driven and sensor-based methods for adaptable, real-time applications. Seeking to dispense with complex physical models, Han et al. 13 developed a machine learning system that uses acceleration signals from an intelligent tire to train a deep learning model for wear detection. This approach is highly viable for practical, on-the-fly monitoring. Focusing on sensor integration, Chang et al. 14 engineered a solution combining three-axis and Hall sensors to estimate vehicle mileage and tire wear with over 99% accuracy. Its high precision and low power consumption make it a promising method for embedded intelligent tire systems.
While these approaches have advanced the field, a persistent gap remains in integrating heterogeneous fleet data and validating models under the full spectrum of real-world operating conditions. To address this gap, the present study shifts the focus from direct wear prediction to a more foundational analysis of tire operational behavior. Instead of predicting a final wear value, our primary objective is to apply clustering models to identify and characterize the distinct operational stages a tire naturally undergoes during normal service. This approach provides a crucial contextual layer – defining what a tire is doing at any given moment – that is essential for developing the next generation of granular and context-aware wear models.
Applications of machine learning for tire analysis
The increasing availability of tire data, often facilitated by Internet of Things (IoT) technologies, has spurred significant research into the application of machine learning (ML) for tire analysis. ML algorithms provide powerful tools for pattern identification, prediction, and optimization, addressing a range of tire-related challenges. 15 Recent contributions can be broadly grouped by their primary objective: monitoring tire health through indirect means, predicting tire life and manufacturing quality, and estimating real-time dynamic states.
A prominent research trend involves the indirect monitoring of tire health, which leverages existing vehicle signals to infer tire conditions without dedicated sensors. Wei et al., 16 for instance, developed a low-cost framework that detects pressure loss by extracting features from wheel-speed signals, achieving 96.18% accuracy with a support vector machine (SVM). In a similar vein, Svensson et al. 17 created axle-specific classifiers to detect both incorrect pressure and insufficient tread depth across a vehicle fleet. These sensor-less approaches are complemented by studies like that of Rahman et al., 18 who demonstrated that supervised algorithms like XGBoost could robustly classify tire condition with over 95% accuracy using simple physical measurements of tread-depth and pressure.
Beyond real-time monitoring, another critical application area is the prediction of tire longevity and the analysis of manufacturing quality. Zhu et al. 19 integrated image processing with ML to estimate tire life by classifying texture features from images of tires at various wear levels. In the context of manufacturing, Biantoro and Hernadewita 20 applied the unsupervised K-means clustering algorithm to identify sources of quality issues from Radial Run Out data. Their approach successfully linked defect clusters to specific manufacturing processes, enabling targeted improvements. This study is particularly notable for its use of unsupervised learning for root-cause analysis.
In a more dynamic context, other researchers have focused on the real-time estimation of tire-road forces and slip parameters using intelligent tire systems. Xu et al. 21 developed a slip ratio estimation model using accelerometer data from the tire’s inner liner, finding that vertical acceleration provided the most robust features. A subsequent study by the same authors extended this to estimate longitudinal, lateral, and vertical forces with high accuracy. 22 To overcome the challenge of extensive data collection, Strano et al. 23 proposed a novel framework that uses a physics-based model to generate large virtual datasets for training supervised algorithms, which can then predict tire forces from real-time sensor data.
The reviewed literature demonstrates that supervised machine learning is a well-established and effective tool for predictive tasks in tire analysis, such as estimating pressure loss, 16 predicting wear, 19 and estimating forces. 22 However, there has been less focus on applying unsupervised learning for the exploratory analysis of tire behavior from real-world operational data. While Biantoro and Hernadewita 20 successfully used clustering to analyze manufacturing data, its application to in-service tire performance remains largely unexplored. Therefore, the present study addresses this gap by leveraging an unsupervised, data-driven methodology not to predict a predefined variable, but to discover and characterize the distinct operational stages of a tire throughout its service life.
Methodological background
This section provides the methodological background for the unsupervised machine learning techniques employed in this study. It introduces the fundamental principles of data clustering and details the specific algorithms applied: K-Means and BIRCH followed by Agglomerative Clustering. Furthermore, it describes the internal validation metrics, such as the silhouette score and the Davies-Bouldin score, used to evaluate the quality and determine the optimal number of clusters in the subsequent analysis.
Unsupervised machine learning for clustering task
Unsupervised machine learning usually handles two main problems: clustering and dimensionality reduction. For data clustering in particular, the idea is to use algorithms to analyze and group unlabeled datasets without human intervention. There are several clustering algorithms, for example, K-Means Clustering, 8,24 Hierarchical Clustering, 25,26 and BIRCH. 27
The selection of K-Means and BIRCH algorithms for this study was influenced not only by their effectiveness in handling unlabeled data but also by their computational efficiency for potential real-time deployment scenarios.
The computational complexity of the K-means algorithm is
K-Means is a clustering algorithm that divides a data set into distinct groups, where each data point is assigned to the cluster whose centroid, representing the cluster’s center, is closest. 29
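The assignment-to-nearest-centroid idea can be illustrated with a minimal sketch of the standard Lloyd iteration. This is not the implementation used in the study (which relies on scikit-learn); the function names, the random initialization, and the toy readings are assumptions made purely for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd iteration; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialization: pick k distinct observations as starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment: each point joins the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments are stable, so the centroids will not move.
        labels = new_labels
        # Update: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # Keep the old centroid if a cluster ends up empty.
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical standardized (pressure, temperature) readings: two obvious groups.
readings = [(-1.0, -1.1), (-0.9, -1.0), (-1.1, -0.9),
            (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
centroids, labels = kmeans(readings, k=2)
print(labels)
```

The sketch uses plain random initialization for brevity; the k-means++ initialization available in scikit-learn chooses better-spread starting centroids and usually converges faster.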
The basic idea of the K-Means algorithm can be described in four fundamental steps 30: definition of the number of clusters (
A common challenge with K-Means is determining the ideal number of clusters, represented by
Another two approaches to determine the optimal number of clusters
Silhouette score evaluates how similar a data point is to its own cluster (cohesion) compared to other clusters (separation). The score ranges from
For each piece of data
To evaluate the overall quality of the clustering, the average of the silhouette coefficients of all points is calculated according to equation (2).
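As an illustration, the sketch below computes silhouette coefficients using the standard textbook definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other members of its own cluster and b(i) is the smallest mean distance to any other cluster. It is assumed, not confirmed by the text here, that this matches the paper's equations (1) and (2); the function name and the toy data are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Per-point silhouette coefficients; requires at least two clusters."""
    clusters = sorted(set(labels))
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        mean_dist = {}
        for c in clusters:
            # Members of cluster c, excluding the point itself.
            others = [q for j, (q, lab) in enumerate(zip(points, labels))
                      if lab == c and j != i]
            mean_dist[c] = (sum(math.dist(p, q) for q in others) / len(others)
                            if others else None)
        a = mean_dist[own]  # cohesion
        if a is None:
            values.append(0.0)  # usual convention for a single-point cluster
            continue
        b = min(d for c, d in mean_dist.items() if c != own)  # separation
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical one-dimensional data with two well-separated groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels = [0, 0, 1, 1]
per_point = silhouette_values(points, labels)
overall = sum(per_point) / len(per_point)
print(round(overall, 4))
```

Values close to 1 indicate points that sit well inside their own cluster, which is why the average is used to compare candidate numbers of clusters.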
The Davies-Bouldin score measures the average similarity between each cluster and its most similar one, based on intra-cluster dispersion and inter-cluster separation. It thus penalizes high intra-cluster distances (loose clusters) and low inter-cluster distances (clusters too close). 33
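A minimal sketch of this score follows, using the common definition in which each cluster's dispersion is the mean distance of its points to the centroid and the separation is the distance between centroids. It is assumed that this corresponds to the paper's equation (3); the function name and toy data are hypothetical.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index: lower values mean tighter, better-separated clusters."""
    clusters = sorted(set(labels))
    centroids, scatter = {}, {}
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        # Intra-cluster dispersion: mean distance of members to their centroid.
        scatter[c] = sum(math.dist(p, centroids[c]) for p in members) / len(members)
    worst_ratios = []
    for i in clusters:
        # Similarity to the most similar other cluster (the worst case for i).
        worst_ratios.append(max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters if j != i
        ))
    return sum(worst_ratios) / len(clusters)


labels = [0, 0, 1, 1]
tight = davies_bouldin([(0.0,), (1.0,), (10.0,), (11.0,)], labels)
loose = davies_bouldin([(0.0,), (4.0,), (10.0,), (14.0,)], labels)
print(tight, loose)
```

In the toy data the centroids are equally far apart in both cases, so the looser clusters receive the higher (worse) score.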
Equation (3) presents how this score is calculated, where
Another approach to clustering is Hierarchical Clustering. This method creates a hierarchy of clusters, in which clusters are successively grouped into larger ones, forming a tree structure or dendrogram. Hierarchical clustering methods can be broadly categorized into divisive and agglomerative. For divisive clustering, the main questions are how to select a cluster for the next splitting step according to dissimilarity and how to split the selected cluster. For the agglomerative approach, clustering begins with each cluster containing one object. It recursively merges the two most similar clusters in terms of the similarity measure until all objects are included in a single cluster. Although both methods yield a dendrogram representing the data’s hierarchical structure, the clustering results can vary significantly based on the similarity or dissimilarity measure applied. 34
One key point of the Agglomerative method 35 is the similarity measure used to select the two most similar clusters for the next merge. Many agglomerative clustering algorithms have been proposed, differing in how the similarity measure is defined. The merging of clusters is usually done by using linkage algorithms based on the distance between clusters. There are four important types of linkage algorithms: ward, single, average, and complete.
While it yields good results, the Agglomerative Hierarchical Clustering is not scalable due to its computational and memory complexity. The primary limitation arises from the need to compute and store a pairwise distance matrix between all data points. In practice, this means that as the dataset grows, the number of required computations increases quadratically. For large datasets, this results in prohibitive memory usage and excessively long processing times, making hierarchical clustering impractical without significant approximations or subsampling.
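To make the quadratic growth concrete, the following back-of-the-envelope sketch counts the unique pairwise distances for a hypothetical single sensor sampled at 1 Hz without interruption for 30 days. The sample count is an assumption derived only from the sampling rate and the acquisition period; it is not the actual size of the dataset.

```python
def pairwise_distance_cost(n_points, bytes_per_value=8):
    """Number of unique pairwise distances and the bytes needed to store them."""
    n_pairs = n_points * (n_points - 1) // 2
    return n_pairs, n_pairs * bytes_per_value


# Assumption: one reading per second, around the clock, for 30 days.
seconds_in_30_days = 30 * 24 * 60 * 60
pairs, size_bytes = pairwise_distance_cost(seconds_in_30_days)
print(f"{pairs:,} distances, about {size_bytes / 1e12:.1f} TB as float64")
# prints "3,359,230,704,000 distances, about 26.9 TB as float64"
```

Even this conservative single-sensor estimate is far beyond the memory of a typical workstation, which motivates summarizing the data before applying a hierarchical method.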
To address this limitation, BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) 26 was introduced as a scalable alternative that combines the strengths of hierarchical clustering with efficient data summarization. BIRCH incrementally builds a compact tree structure known as a Clustering Feature Tree (CF Tree), which stores small groups of similar data points, called subclusters, along with summary statistics such as the number of points, linear sums, and squared sums.
These subclusters can either serve as the final clustering result or be used as input to another clustering algorithm, such as Agglomerative Clustering, for further refinement. The final clustering configuration, including the optimal number of clusters, can be evaluated using internal validation metrics such as the silhouette score.
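The two-stage idea can be sketched with scikit-learn, whose `Birch` estimator accepts another clustering estimator as its `n_clusters` argument and applies it to the subcluster centroids. The synthetic data, the threshold of 0.5 (scikit-learn's default), and the choice of three clusters are assumptions for illustration only and are not the values used in the study.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, Birch
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic stand-in for standardized (pressure, temperature, speed) readings.
X = np.vstack([
    rng.normal(loc=-2.0, scale=0.3, size=(200, 3)),
    rng.normal(loc=0.0, scale=0.3, size=(200, 3)),
    rng.normal(loc=2.0, scale=0.3, size=(200, 3)),
])

# Stage 1: BIRCH summarizes the data into CF-tree subclusters.
# Stage 2: the estimator passed as n_clusters groups those subclusters.
model = Birch(
    threshold=0.5,  # hypothetical value, not taken from the paper
    n_clusters=AgglomerativeClustering(n_clusters=3, linkage="ward"),
)
labels = model.fit_predict(X)

n_subclusters = len(model.subcluster_centers_)
n_final = len(set(labels))
score = silhouette_score(X, labels)
print(n_subclusters, "subclusters ->", n_final, "clusters; silhouette:", round(score, 3))
```

Because the agglomerative step only sees the subcluster centroids rather than every reading, its pairwise-distance cost depends on the number of subclusters, which the threshold controls.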
Proposed methodology and dataset description
This work aims to present methods to manage and analyze tires based on real data. To this end, the Fleet Tire Metrics dataset 37 was used. This dataset comprises over
The tires were labeled as Front Right (FR), Front Left (FL), Rear Right Inner (RRI), Rear Left Inner (RLI), Rear Right Outer (RRO), and Rear Left Outer (RLO), indicating the position of each tire, according to Figure 1. The dataset also includes an attribute labeled Old Data, which indicates whether a record is newly acquired (

Bird’s-eye view of the position of tires on the fleet bus during data acquisition. FR stands for Front Right, FL for Front Left, RRI for Rear Right Inner, RLI for Rear Left Inner, RRO for Rear Right Outer, and RLO for Rear Left Outer.
External factors such as weather and road conditions were not included because no corresponding data was available. Moreover, as the vehicle is part of a bus fleet, its load conditions can be assumed to remain practically constant during data acquisition.
The internal temperature of the tire is particularly important, as it determines the aging rate of the tire, with higher temperatures accelerating aging and consequently reducing the product’s useful life. 37 Thus, an analysis that takes temperature into account can provide fleet managers with the information needed to decide, for instance, whether to discard or retread a tire.
The methodology was divided into two phases. First, a descriptive analysis was carried out to enhance understanding of the data. In the second phase, K-Means and BIRCH algorithms were employed to explore the usage cycles of a typical fleet tire: the objective was to identify the number of operational stages that emerge during normal service. The number of clusters for each configuration was assessed with internal validation metrics (silhouette and Davies–Bouldin scores) and, exclusively for K-Means, the Elbow Method. External validation metrics were not employed in this work because the dataset does not contain labeled data. 38
Descriptive analysis was performed using statistical information, variable distributions, potential correlations, and temporal behavior of the data. Due to different units and scales among the variables, data standardization was applied when appropriate, depending on the specific requirements of each analysis.
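A minimal sketch of z-score standardization follows, showing how variables with different units end up on a common scale. The readings are hypothetical; the population standard deviation is used, which is also what scikit-learn's `StandardScaler` does.

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean and divide by the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes the values are not all identical
    return [(v - mean) / std for v in values]


temperature_c = [30.0, 35.0, 40.0, 45.0, 50.0]     # hypothetical readings
pressure_psi = [100.0, 102.0, 104.0, 106.0, 108.0]  # hypothetical readings

z_temperature = standardize(temperature_c)
z_pressure = standardize(pressure_psi)
print([round(z, 3) for z in z_temperature])
print([round(z, 3) for z in z_pressure])
```

Although the raw ranges differ (20 °C versus 8 PSI), both series map to the same z-scores here, so neither variable would dominate a distance-based algorithm.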
The clustering analysis was initially conducted on the entire dataset, and subsequently applied to each tire individually, according to its position on the vehicle. In this work, the BIRCH result was always refined by using its subclusters as input for Agglomerative Clustering (BIRCH followed by Agglomerative Clustering).
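The model-selection loop described in this section can be sketched with scikit-learn as follows. The synthetic data, the range of candidate cluster counts, and the parameter values are assumptions for illustration and do not reproduce the study's configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Synthetic stand-in for (pressure, temperature, speed): three separated groups.
raw = np.vstack([
    rng.normal(loc=center, scale=0.4, size=(150, 3))
    for center in (-3.0, 0.0, 3.0)
])
X = StandardScaler().fit_transform(raw)

results = {}
for k in range(2, 7):  # hypothetical candidate range
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    results[k] = {
        "wcss": km.inertia_,  # within-cluster sum of squares, for the Elbow Method
        "silhouette": silhouette_score(X, km.labels_),          # higher is better
        "davies_bouldin": davies_bouldin_score(X, km.labels_),  # lower is better
    }

best_by_silhouette = max(results, key=lambda k: results[k]["silhouette"])
best_by_davies_bouldin = min(results, key=lambda k: results[k]["davies_bouldin"])
print(best_by_silhouette, best_by_davies_bouldin)
```

The same loop can be run once on the pooled data and once per tire position by filtering the rows before standardization; the WCSS values are kept so that the elbow can be inspected visually rather than chosen automatically.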
Several tools were employed to implement the algorithms. Python was used as the programming language, as well as several libraries, such as Pandas 39 for data processing, NumPy 40 for vectorization support, Matplotlib, 41 Seaborn, 42 and Plotly 43 for graphs, and Scikit-Learn 36 for standardization, metrics, and machine learning algorithms.
While the Fleet Tire Metrics dataset comprises time-series data with inherent temporal structure, our methodological approach to the clustering process deliberately treats individual sensor readings as independent observations. This design decision is justified by several theoretical and practical considerations:
Limited temporal window of the dataset: the data acquisition period of approximately 30 days (around 20,000 km) is relatively short for robustly capturing temperature and pressure variations arising from changes in the operational state of fleet vehicle tires. A 30-day window may therefore be sufficient for preliminary analysis, but longer periods may be necessary to observe significant trends in the tires’ behavior.
Operational state versus temporal transitions: The primary objective of this study is to identify distinct operational states that tires experience during normal service, rather than modeling the temporal transitions between these states. Each sensor reading (pressure, temperature, and speed) represents a snapshot of tire operational conditions that can be meaningfully categorized regardless of its temporal context. This approach aligns with the goal of developing position-aware tire analytics for fleet management applications.
Real-time deployment considerations: For practical implementation in intelligent tire management systems, the ability to classify individual sensor readings into operational categories without requiring historical context enables real-time decision-making. This independence assumption allows immediate anomaly detection and operational state identification without maintaining temporal buffers or complex state transition models.
Computational efficiency: The independent observation approach significantly reduces computational complexity from
For both K-Means and BIRCH with Agglomerative Clustering, it is necessary to set initial parameters. Therefore, the following parameters were used for the K-Means:
random_state =
init = k-means++ – ensuring that the initialization method is k-means++ to speed up convergence (default value in scikit-learn).
max_iter =
n_clusters =
For BIRCH with Agglomerative Clustering, the parameters were as follows:
threshold =
linkage = ward – to ensure the formation of well-defined and distinct clusters – Agglomerative Clustering parameter.
n_clusters =
Experiments and results
This section presents the experimental procedures and results of the study. Initially, a descriptive analysis was conducted to explore the statistical properties, distributions, and relationships among the variables, providing a comprehensive understanding of the dataset. Following this, unsupervised machine learning techniques were applied to identify patterns and groupings among the tires based on their usage cycles. The goal was to uncover the number of clusters that emerge during ordinary usage of a fleet tire, in order to support data-driven strategies for tire management and maintenance.
Descriptive analysis
The first experiments were conducted to better understand the behavior of the variables and the relationships between them. Figure 2 shows the evolution of the variables over time. Due to significant fluctuations in vehicle speed, a moving average of

Evolution of the observed variables over time considering all tires’ position. Vehicle speed is the moving average of
It can be seen that there is a correlation between the measurements; however, physical quantities such as temperature (°C), pressure (PSI), and vehicle speed (km/h) have different units and scales, which can complicate the analysis. Therefore, the data were standardized by removing the mean and scaling to unit variance, known as z-score normalization, 44

Evolution of the observed variables over time after standardization considering all tires’ position. Vehicle speed is the moving average of
The strong correlation between pressure and temperature was expected based on the ideal gas law, often written as

Pairwise Pearson correlation.
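For reference, the Pearson coefficient reported in a pairwise correlation matrix can be computed as in the sketch below. The function name and the three short series are hypothetical and chosen so that the results are easy to verify by hand; they are not values from the dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


temperature_c = [30.0, 35.0, 40.0, 45.0, 50.0]     # hypothetical readings
pressure_psi = [100.0, 102.0, 104.0, 106.0, 108.0]  # rises linearly with temperature
speed_kmh = [0.0, 60.0, 20.0, 80.0, 40.0]           # only loosely related

r_temp_pressure = pearson_r(temperature_c, pressure_psi)
r_temp_speed = pearson_r(temperature_c, speed_kmh)
print(round(r_temp_pressure, 3), round(r_temp_speed, 3))
```

A perfectly linear pair yields a coefficient of 1, while the loosely related pair yields 0.5. Note that the coefficient is unchanged by z-score standardization, since it is invariant to shifting and positive rescaling of either variable.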
The subsequent analysis was performed without incorporating the temporal dimension, considering the three variables jointly. The results indicate a clear linear relationship between Temperature and Pressure, which becomes evident in the 3D plot shown in Figure 5. This consistent relationship is further explored through the pairwise interactions among the three variables in Figure 6. Additionally, Figures 7 to 9 provide a closer examination of these relationships using raw data (non-standardized). Figure 7 shows that the most frequently observed tire temperature during operation is approximately

All tires’ positions together: Pressure versus Temperature versus Vehicle speed (standardized data).

Pairwise relationship between the three variables (standardized data).

Tire temperature distribution – all tires’ positions together.

Tire pressure distribution – all tires’ positions together.

Vehicle speed distribution.
A Q–Q (Quantile–Quantile) plot 46 was employed (Figure 10) to visually assess whether Temperature and Pressure data follow a normal distribution. Since the data points align closely with the reference diagonal line, both variables appear to approximate a normal distribution.
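The points of a normal Q–Q plot can be computed with the standard library alone, as in the sketch below. The plotting positions (i - 0.5)/n are one common convention and an assumption here, as are the function name and the simulated temperature sample.

```python
from statistics import NormalDist, fmean, pstdev


def qq_points(sample):
    """Pairs of (theoretical normal quantile, standardized sample quantile)."""
    n = len(sample)
    mean, std = fmean(sample), pstdev(sample)
    ordered = sorted((v - mean) / std for v in sample)
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical temperature sample drawn from a normal distribution.
sample = NormalDist(mu=40.0, sigma=5.0).samples(500, seed=1)
points = qq_points(sample)

# Largest vertical gap between the points and the reference diagonal y = x.
max_gap = max(abs(observed - expected) for expected, observed in points)
print(len(points), round(max_gap, 3))
```

When the sample is approximately normal, the pairs fall close to the diagonal; systematic curvature at the ends would indicate heavier or lighter tails than a normal distribution.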

Q–Q plot for normality check of tire Temperature and Pressure.
After analyzing the overall distributions across all tire positions, the study proceeded to examine the distributions for each specific tire position on the vehicle. Figures 11 and 12 present box plots of Temperature and Pressure, respectively, for each tire position. A box plot summarizes the distribution of the data, showing the median, quartiles, minimum and maximum values, and potential outliers. An in-depth examination of the plots reveals that the tires in the RLO and RRI positions exhibit slightly lower temperatures – median value around

Temperature by tire position in the vehicle.

Pressure by tire position in the vehicle.
Variations in the internal air temperature of the tires are primarily driven by differences in the heat dissipation mechanism, which is influenced by material properties, operational conditions, and contact characteristics. 48 These differences can also be affected by the performance and action of the braking system.48,49
In this case study, these three positions are the most distinct from the others, a pattern that is supported by the computation of effect sizes for each tire position, when compared to all other positions. Table 1 presents the Cohen’s
Cohen’s
Boldface values indicate the negative and positive extremes
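For readers unfamiliar with the effect size, the sketch below computes Cohen's d with the pooled standard deviation, comparing one group against the rest. The pooled-variance variant is an assumption, since the exact formulation used for Table 1 is not given here, and the temperature values are hypothetical.

```python
import math
import statistics


def cohens_d(group, rest):
    """Cohen's d of `group` versus `rest`, using the pooled standard deviation."""
    n1, n2 = len(group), len(rest)
    var1, var2 = statistics.variance(group), statistics.variance(rest)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.fmean(group) - statistics.fmean(rest)) / pooled_sd


# Hypothetical temperatures (°C): one tire position versus all other positions.
one_position = [38.0, 39.0, 40.0, 41.0, 42.0]
other_positions = [42.0, 43.0, 44.0, 45.0, 46.0]

d = cohens_d(one_position, other_positions)
print(round(d, 3))
```

A negative value means the position runs below the rest on that variable; by the usual rule of thumb, magnitudes of roughly 0.8 or more are regarded as large effects.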
Notably, the RLO position shows a large negative effect for both temperature
Unsupervised machine learning for clustering
The experiments in this section aim to identify patterns and groupings among the tires. The goal is to uncover the optimal number of groups that can be formed using the available data and, with this, to understand the number of cycles the fleet tire goes through during ordinary usage. To this end, the experiments considered the temperature and pressure of the tires and the vehicle speed. All the data used by the models of this section were first standardized.
As observed in the statistical analysis, tire temperature and pressure tend to vary depending on the tire position. This suggests that a greater number of clusters may emerge when all tire positions are analyzed together, as the overall data becomes more dispersed.
All tires’ positions together
In this experiment, all tire positions were analyzed together for each of the models considered: K-Means and BIRCH with Agglomerative Clustering.
K-Means
It was used with the number of clusters

Elbow method for number of cluster analysis.

Silhouette score for different number of clusters found by K-Means.
Although

Silhouette analysis of K-Means clustering considering all tire positions together.
Finally, the analysis was performed using the Davies–Bouldin score (Figure 16), which yielded its lowest value (and therefore the best result) for

Davies-Bouldin score for different number of clusters using K-Means.
The application of the K-Means clustering algorithm with

Clusters found by K-Means per tire position for
Overall, the clustering results reinforce that tire position has a significant influence on the analyzed variables (e.g. Temperature and Pressure), which supports the decision to analyze the number of clusters separately for each position.
BIRCH with Agglomerative Clustering
The second method applied was BIRCH with Agglomerative Clustering. It was studied with the number of clusters
To support the selection of the optimal number of clusters (

Silhouette score for different numbers of clusters and thresholds obtained with BIRCH with Agglomerative Clustering.

Silhouette analysis for BIRCH with Agglomerative Clustering.
The configuration with
Alternatively, the configuration with
Then, the analysis was performed using the Davies–Bouldin score (Figure 20) across different values of

Davies-Bouldin score for different number of clusters (
The application of the BIRCH algorithm with a

Clusters found by BIRCH with
Individual tires’ positions
Although the all-together analysis – where all tire data are submitted to the clustering algorithms regardless of position – can reveal general behavioral patterns and assist in unsupervised clustering, it may overlook critical dynamics related to tire location, as each position (FR, FL, RRO, RRI, RLI, and RLO) may exhibit distinct behavior that is lost in an overall evaluation.
The analysis in this section used the same two algorithms as the previous section: K-Means, and BIRCH followed by Agglomerative Clustering.
K-Means
The Elbow Method was initially applied, and the resulting plot is presented in Figure 22. To enable a consistent comparison of the inflection points across all tire positions, the WCSS of each curve was normalized using its corresponding value at

Elbow method per tire position using normalized within-cluster sum of squares (WCSS).
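Normalizing each WCSS curve by a reference value places all positions on a common scale, so that the inflection points can be compared directly. The sketch below illustrates the idea with scikit-learn on hypothetical data for two positions. It assumes that the reference is the first value in the sweep, which is a choice made here for illustration only; the data, the sweep range, and the helper name `normalized_wcss` are likewise assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# ASSUMPTION: placeholder feature matrices for two positions with different
# spreads; these are not the measurements analyzed in the study.
data_by_position = {
    "FR": rng.normal(scale=1.0, size=(300, 3)),
    "RRO": rng.normal(scale=2.0, size=(300, 3)),
}
k_values = list(range(1, 8))


def normalized_wcss(X, k_values, seed=0):
    """WCSS for each k, divided by the WCSS at the first k (hypothetical helper)."""
    wcss = np.array([
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ])
    # ASSUMPTION: the reference value is the first entry of the sweep.
    return wcss / wcss[0]


curves = {pos: normalized_wcss(X, k_values) for pos, X in data_by_position.items()}

for pos, curve in curves.items():
    print(pos, np.round(curve, 3))
```

After normalization every curve starts at 1, so differences in absolute variance between positions no longer mask where each curve flattens.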
The silhouette score was also used to support the selection of the optimal number of clusters. As shown in Figure 23, for all tire positions,

Silhouette analysis across tire positions using K-Means clustering with multiple values of

Davies-Bouldin score per tire position using K-Means.
This may indicate that the front tires go through more wear stages in a normal usage cycle, as they can wear more quickly than the rear ones. In general, the individual analysis of positions was consistent with the analysis of all tire positions together.
BIRCH followed by Agglomerative Clustering
The silhouette score was utilized to support the identification of the optimal number of clusters. The experiment was made with BIRCH

Silhouette analysis for BIRCH with
The analysis of the Davies-Bouldin score across different tire positions, as shown in Figure 26, reveals notable variations in cluster compactness and separation as a function of the number of clusters (

Davies-Bouldin score per tire position using BIRCH with
In turn, while FR had a distinct number of
Conclusion
This study presents a comprehensive and data-driven approach to smart tire management through the application of unsupervised machine learning techniques. Leveraging real-world data acquired from a sensorized fleet vehicle equipped with Tire Pressure Monitoring Systems (TPMS), the research highlights how clustering algorithms, specifically K-Means and BIRCH followed by Agglomerative Clustering, can be employed to identify recurring patterns in tire usage cycles without the need for labeled data.
The first phase of the analysis focused on descriptive statistics, revealing meaningful correlations among temperature, pressure, and vehicle speed. These variables exhibited significant variance across different tire positions, with rear outer tires (RRO and RLO) showing notably more dispersed values compared to other positions (Table 1). This positional dependency reinforces the need for individualized analysis rather than treating the tire system as a homogeneous entity.
In the second phase, the application of unsupervised learning enabled the segmentation of the tire usage data into distinct operational clusters. Internal validation metrics such as the silhouette score and the Davies–Bouldin score were systematically used to guide the selection of the optimal number of clusters. When analyzing all tire positions together, a configuration of four clusters consistently emerged as a suitable choice for both K-Means and BIRCH-based methods. For K-Means, the RRO position stands out as the most distinct (Figure 17), being strongly associated with cluster 3. In the BIRCH-Agglomerative approach, the RLO position was the only one with a majority of examples assigned to class 2 (Figure 21), whereas all other positions had most examples in class 0. This suggests that the RLO position may be treated as an outlier relative to the others, an insight that could be extended to other tires, for example to detect anomalies.
Further refinement through individual tire analysis using K-Means revealed that front tires (FR and FL) tend to progress through more usage phases – indicated by a higher optimal number of clusters (
The practical implications of these findings are significant. By enabling the identification of distinct operational states in tire behavior, the proposed approach lays the groundwork for real-time predictive maintenance, enhanced safety diagnostics, and more sustainable tire lifecycle management. Fleet managers and manufacturers can benefit from position-aware models that inform replacement schedules, load balancing strategies, and anomaly detection frameworks. Moreover, the methods presented here can be generalized to other vehicular components or systems where sensor-rich environments and unlabeled data dominate.
From a scientific perspective, the combination of clustering methods and internal validation metrics demonstrated here offers a robust methodology for extracting knowledge from complex, unlabeled datasets. The integration of BIRCH with Agglomerative Clustering proved particularly effective for the global analysis, in which all tire positions were studied together as a single dataset. In turn, K-Means appears to be more valuable for the individual position analysis.
A current limitation of this work is the lack of end-of-life tire data, which would allow correlating the identified clusters with retreading levels and thus improve their practical interpretability.
Another limitation is that the analysis is restricted to a single vehicle type and sensor configuration; broader validation across diverse fleet compositions would strengthen generalizability. From a computational deployment perspective, while both K-Means and BIRCH demonstrate relatively low computational requirements (evidenced by their successful processing of our dataset with minimal resource demands), practical implementation in embedded tire systems requires addressing hardware constraints including processing power limitations, memory restrictions, and energy consumption optimization.
Future work should extend this research by: (i) integrating temporal modeling to capture transitions between operational states; (ii) incorporating derivative and fluctuation-based features, such as temperature rise rate and pressure fluctuation amplitude; (iii) exploring semi-supervised learning when limited labels become available; (iv) incorporating additional sensor modalities such as vibration and acceleration; and (v) developing optimized implementations for resource-constrained embedded systems. Moreover, deploying the proposed models in real-time fleet management systems would enable adaptive decision-making in vehicle safety and maintenance strategies, although practical recommendations for fleet management based on tire cluster migration patterns still need to be developed.
This work underscores the transformative potential of combining IoT data acquisition with advanced machine learning techniques in the domain of automotive engineering. It not only enhances the understanding of tire behavior under real-world conditions but also provides practical tools to improve the efficiency, reliability, and sustainability of modern transportation systems.
Footnotes
Handling Editor: Chenhui Liang
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Financiadora de Estudos e Projetos (FINEP), grant number 01.22.0262.00 (Ref. 1164/21), as part of the Rota 2030 program, and by the Ministério da Ciência, Tecnologia e Inovação (MCTI) and the Ministério da Economia. This publication was also financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) – Finance Code 001. All these funding agencies are based in Brazil.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
