Abstract
Fleet telematics is facilitating the adoption of new billing models in the automotive insurance industry. Advancements in the Internet of Things domain enable real-time transmission of data that can be utilized to monitor the location and status of vehicles and extract information on drivers’ behavior. While these developments can inform risk assessments of drivers of both private and corporate vehicles, existing research has mainly focused on usage-based insurance models for drivers of private vehicles. This paper analyzes naturalistic data from a sample of drivers (N = 3,854) of corporate vehicles rented from a fleet rental company in South Korea in the year 2018. We compared different algorithms to classify collision-free and collision-involved drivers and found random forest to exhibit the highest area under the receiver operating characteristic (ROC) and precision-recall curves. Further investigation showed that both business- and driver-related variables were related to collision involvement. The amount of running driving time (i.e., non-idling time), trip frequency (i.e., trips per 1,000 km), and rapid speed changes were the most influential variables on collision involvement. Business-related variables (i.e., running driving time and trip frequency) can inform fleet rental companies on how to set insurance and rental rates for their corporate customers; variables that indicate risky driver behaviors (i.e., rapid speed changes) can inform the design of feedback systems incorporated into telematics devices aimed at correcting risky behaviors. Our findings extend the knowledge on collision factors and provide insights into building models for commercial fleet services that are based on business- and driver-related variables.
Latest advancements in the Internet of Things (IoT) domain enable real-time transmission of data that can be used to monitor the location, status, and behavior of a vehicle or fleet of vehicles ( 1 – 3 ). Data related to miles driven as well as time and location can be transmitted—either in real time or at set intervals—to a central server, where fleet managers or insurance providers can extract additional information on drivers’ behavior ( 4 , 5 ). Along with IoT devices, smartphones have also been shown to represent a scalable solution for collecting driving behavior data ( 6 – 8 ).
The growing availability of such telematics data is transforming the automotive insurance industry. In particular, the information extracted from telematics devices and smartphones allows insurers to provide usage-based insurance (UBI) policies that are based not only on mileage and other contextual driving factors (e.g., time of day or road type), but also on driving behaviors ( 9 ). Prior studies have applied telematics data to understand driver behavior and its relationship with road safety, and some have also explored the implications for insurance modeling ( 10 – 12 ). For example, a recent review of published research in road safety telematics indicates that providing drivers with feedback on their driving behavior using telematics data can lead to a 20%–43% reduction in road crashes and to a 37%–50% reduction in crash risk ( 11 ). Similarly, a recent study has examined the use of smartphone-based telematics and showed how personalized feedback via apps can lead to improvements in driver behavior, suggesting the potential for telematics data to influence behavior and UBI pricing ( 12 ).
Insurance policies that are based on telematics data can be classified into two main categories: those based on pay-as-you-drive (PAYD) models and those based on pay-how-you-drive (PHYD) models. While rates in PAYD are calculated mostly based on mileage, rates in PHYD originate from a multitude of trip-related factors, including measures of vehicle control such as speed and acceleration/deceleration rates. However, implementing usage-based rates can be challenging in certain industries. For example, one of the long-standing challenges in the car rental industry is the inability to differentiate pricing based on a user’s accident history or driving behavior. Rental fees are stratified primarily by factors such as age and gender, which fail to reflect individual risk levels. As a result, some high-risk drivers of private vehicles may shift to long-term rentals to avoid elevated insurance premiums, placing an additional burden on rental companies. The possibility for rental companies to align insurance premiums with measures of driver behavior offers many benefits to both insurers, who can provide more accurate pricing, and customers, who can have more control over their premiums and are incentivized to adopt safer driving behaviors ( 13 , 14 ). For these reasons, PHYD is considered the most promising UBI model for both personal and corporate vehicles ( 15 ).
Ensuring customized insurance policies for corporate vehicles requires the investigation of additional business-related variables that are out of the drivers’ control ( 16 ). For example, while mileage is known to increase drivers’ collision risk, not only do corporate drivers tend to travel longer distances than drivers of private vehicles, but also how much and how far they drive is dictated by their businesses ( 17 , 18 ). Despite this difference between corporate and non-corporate drivers, previous studies have mainly examined insurance risk factors for personal vehicles. This paper analyzes naturalistic driving data (i.e., data collected in natural driving contexts) from 3,854 corporate drivers of passenger-vehicle fleets. The two main goals are: 1) extending the knowledge on collision risk factors among corporate drivers, and 2) informing UBI policies for corporate fleets.
Crash Risk Factors among Commercial Drivers
Drivers who regularly drive a company-owned or -financed vehicle for business purposes are commonly referred to as “commercial” drivers ( 19 , 20 ). This categorization includes both drivers who use a company-owned car to reach their worksite or meet customers (i.e., corporate drivers) and drivers who drive a company-owned vehicle as their work (e.g., bus and truck drivers). The focus of this paper is on corporate drivers; however, existing research on commercial drivers is informative of the factors that can put corporate drivers at a greater risk of collision than drivers of private vehicles. One of the established risk factors contributing to the heightened collision risk of commercial drivers is higher mileage ( 18 – 21 ). For example, Lynn and Lockwood investigated collision risk factors for commercial drivers through a postal questionnaire and found that frequency of collision increased with mileage and particularly for younger, less experienced drivers ( 18 ). Further, through surveys, Downs et al. found commercial drivers to have a higher percentage of collisions than non-commercial drivers, when controlling for age, experience, and mileage ( 19 ). Moreover, commercial drivers are often subject to time- and work-related pressures ( 20 , 22 ). This can, in turn, increase their speed and the chance of driving under fatigue, which have also been linked to increased probability of collision involvement ( 19 , 21 ). More recent studies have used naturalistic data to investigate crash risk factors among commercial drivers ( 23 , 24 ). However, these studies have focused on truck and bus drivers, as opposed to corporate drivers of passenger-vehicle fleets. This paper extends this line of research on crash risk factors by comparing different machine learning models trained on naturalistic data from corporate drivers of passenger-vehicle fleets.
Machine Learning Approaches to Predicting Crash Risk
Using machine learning to analyze naturalistic driving data can help inform the risk assessment process in the insurance industry as well as accurately estimate the insurance premium that rental companies could charge to their corporate customers. Existing studies on private vehicles have used a variety of classification methods for collision risk assessment. Among the most common methods, logistic regression (LR) has been used to analyze driving data and assess critical factors related to collisions ( 25 – 27 ). Other methods used include random forest (RF) algorithms, support vector machines (SVM), and k-nearest neighbors (kNN) ( 28 – 31 ). Paefgen et al. evaluated PAYD insurance rate factors by conducting a classification analysis across collision-free and collision-involved vehicles ( 32 ). They compared LR, decision trees, and neural networks, and found that neural networks achieved the highest overall accuracy, while LR showed the highest area under the receiver operating characteristic (AUROC) curve, with vehicle mileage being the strongest predictor. Baecke and Bocca compared LR, RF, and artificial neural networks (ANN) and found that the best-performing model changed based on the combination of the set of variables considered (e.g., customer specific, car specific, claim history, telematics) ( 28 ). For example, when the models were only trained with traditional input variables, such as customer-specific, car-specific, and claim history information, RF showed higher AUROC than both ANN and LR. However, RF showed lower AUROC than ANN and LR, both when the models were only trained with telematics data and when they were trained with a combination of traditional and telematics data. Wang et al. compared RF, kNN, ANN, and SVM and found that ANN showed the highest accuracy, followed by RF ( 30 ). 
Finally, Huang and Meng used LR as a baseline model and compared it with RF, SVM, ANN, and XGBoost (i.e., an advanced realization of gradient boosted trees [GBT]) in collision risk prediction ( 29 ). They found all models had higher AUROC than LR, with XGBoost exhibiting the highest AUROC.
The studies presented above reveal two limitations that offer the opportunity to further investigate collision risk for informing UBI policies. The first limitation is the lack of classification approaches that focus on corporate driving settings, where business-related factors that do not depend directly on drivers’ behavior can affect collision risk. The issue of isolating business-related factors was already acknowledged by Downs et al., but the authors lacked information on how business-related factors were distributed across exposure variables such as mileage and trip frequency ( 19 ). The second limitation relates to the exploration of relatively newer machine learning algorithms for informing UBI. Specifically, while GBT has shown high performance in predicting both auto crash loss cost and collision severity, we found only one study using a variation of GBT for predicting collision risk ( 29 , 33 , 34 ).
Methods
We were provided in-kind with a large fleet telematics dataset by our partner organization. The dataset included both business/environment-related variables and driver-behavior-related variables. After an initial variable selection process, we trained and tested five machine learning classification algorithms (LR, RF, kNN, SVM, and GBT) to classify drivers into two categories: whether they were collision-free or collision-involved during the data collection period. We selected these five algorithms because they span a range of complexity and interpretability, and are commonly used in both foundational and recent crash prediction literature. Although Kiang informed our initial methodology, our selection is further supported by recent studies that demonstrate the continued relevance of these models ( 28 , 29 , 35 ).
Data
We gained access to the fleet dataset through a collaboration with SK Networks, one of the largest car rental companies in South Korea. SK Networks collected naturalistic driving data as a by-product of equipping their vehicles with an advanced fleet management system (FMS) to optimize operating efficiency and manage emergencies. The FMS devices generating the data consisted of five major parts: 1) an On-Board Diagnostics (OBD) connector, 2) a GPS device that recorded location and speed information every 10 s, 3) a long-range (LoRa) modem, 4) a gravity sensor to monitor speed changes, and 5) a Bluetooth chipset connected with drivers’ smartphones. This was one of the first fleet applications of LoRa technology, which is becoming the leading IoT solution for connecting sensors to an online network and enabling real-time transmission of information across that network. The power of LoRa technologies mainly lies in their capacity, relatively low power consumption, and secure data transmission ( 36 ).
The original dataset included information from 9,412 drivers across 141 businesses completing 5,025,867 trips. A “trip” is defined as the period of driving activity that begins when the vehicle’s engine is turned on (ignition-on) and ends when the engine is turned off (ignition-off). Drivers in our sample drove more than one vehicle during the test period. Thus, we matched all trip data with drivers’ ID by Bluetooth pairing with their smartphone and aggregated the trip data by driver. For example, drivers could have completed 50% of their trips using a compact car, another 25% using a sedan, and another 25% using a recreational vehicle. SK Networks allowed drivers to rent their fleets for personal use at cost price outside working hours. The variable personal use (%) (Table 1) captures the percentage of trips drivers traveled for personal use. The dataset was collected from January 1 to December 31, 2018, and it included unique driver IDs and information related to the trip, such as the type of vehicle utilized or the frequency of speed changes, as well as collision records and traffic violation records of each driver while driving a fleet vehicle, issued by the police and municipal governments. Collision records and violation records were matched with drivers’ trip data through the unique driver ID. An analysis of the two-digit province codes from the drivers’ license numbers showed that the distribution of sampled drivers closely mirrored the national distribution of licensed drivers by province, according to data from the Korea National Police Agency.
List of Preliminary Predictor Variables and Summary Statistics
Note: Avg. = average; RAF = Rapid accelerations frequency; RDF = Rapid decelerations frequency; RV = recreational vehicle; SD = standard deviation; SKW = South Korean won; SUV = sports utility vehicle.
Drivers consented to having their data used for academic research. However, their demographic information (i.e., age and gender) was not released to the authors to ensure privacy protection. SK Networks agreed to share information about the age and licensure of 247 drivers who participated in a survey. These respondents were between the ages of 27 and 60 years (
To remove unreliable GPS data, we excluded trips that 1) lasted less than 10 min, 2) had an average speed of over 150 km/h (highest speed limit in South Korean highways is 110 km/h; usual highway speed limit is 100 km/h), and 3) were less than 1 km or more than 600 km long. Then, we removed drivers whose total mileage was less than 1,000 km, as this threshold is usually considered as a minimum requirement to assess driver behavior and underwrite a UBI contract ( 37 , 38 ). These exclusion criteria reduced the data to 3,854 drivers, completing 2,193,723 trips. Out of these drivers, 13.3% (N = 514) had at least one collision during the data collection period, while the remaining 86.7% (N = 3,340) had none.
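The exclusion rules above can be sketched in a few lines of Python. This is an illustrative sketch and not the authors' code: the `Trip` fields are hypothetical names, the average speed is assumed to be distance divided by trip duration, and the 1,000 km threshold is assumed to apply to mileage summed over the trips that survive the trip-level filters.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trip:
    driver_id: str       # hypothetical field names, not taken from the dataset
    duration_min: float  # ignition-on to ignition-off, in minutes
    distance_km: float


def is_reliable(trip: Trip) -> bool:
    """Apply the three trip-level exclusion rules described in the text."""
    if trip.duration_min < 10:
        return False
    if trip.distance_km < 1 or trip.distance_km > 600:
        return False
    # Assumption: average speed is distance over total trip duration.
    avg_speed_kmh = trip.distance_km / (trip.duration_min / 60)
    return avg_speed_kmh <= 150


def eligible_drivers(trips, min_total_km=1000.0):
    """Return {driver_id: total_km} for drivers with enough reliable mileage."""
    totals = defaultdict(float)
    for trip in trips:
        if is_reliable(trip):
            totals[trip.driver_id] += trip.distance_km
    return {d: km for d, km in totals.items() if km >= min_total_km}
```

Applying the trip filter before the driver filter means that a driver can fall below the mileage threshold only because unreliable trips were discarded; the reverse order would be a different design choice.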
Dependent Variable
Collision involvement was used as a dichotomous dependent variable. Drivers who had experienced any collision in 2018 with a fleet vehicle were assigned a value of 1; otherwise, 0. Most of the collisions involved another vehicle (66.2%), followed by collisions with objects (23.5%) and people (2.0%). In this study, collisions were defined based on SK Networks corporate customer (i.e., businesses) reports that indicated the need of vehicle repair. Damage cost of collisions was used to assess the severity of collisions. The monetary threshold used to differentiate between non-severe and severe collisions was set at 2,000,000 South Korean won (KRW) (∼1,617 USD) based on data reported by the Korea Insurance Development Institute ( 39 ). The damage costs ranged from 0 to 9,741,961 KRW, with 94% (N = 483) of drivers involved in non-severe collisions and 6% (N = 31) of drivers involved in severe collisions. We counted collisions involving people as “severe,” regardless of the damage cost.
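As a minimal sketch, the outcome coding and the severity rule described above could be written as follows. The function names are hypothetical, and treating a damage cost exactly equal to the threshold as severe is an assumption, since the text does not state how the boundary case was handled.

```python
SEVERE_DAMAGE_KRW = 2_000_000  # monetary threshold stated in the text


def collision_involved(collision_count: int) -> int:
    """Dichotomous outcome: 1 if the driver had any collision, otherwise 0."""
    return 1 if collision_count > 0 else 0


def collision_severity(damage_krw: float, involved_person: bool) -> str:
    """Classify one collision as 'severe' or 'non-severe'."""
    # Collisions involving people count as severe regardless of damage cost.
    # Assumption: ">=" at the threshold; the boundary case is not specified.
    if involved_person or damage_krw >= SEVERE_DAMAGE_KRW:
        return "severe"
    return "non-severe"
```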
Predictor Variables
Predictor variables were selected based on previous research that utilized naturalistic data to identify factors affecting collision risk. For example, elevated gravitational force events (i.e., acceleration and deceleration) were found to have high correlation (r = 0.60) with both crash and near-crash involvement ( 40 ). In addition, collision-involved drivers who had at least one collision on their insurance record showed significant differences in rapid speed changes and overspeed frequency compared with collision-free drivers, while at the same time showing relatively small differences in mileage and average speed ( 25 ). Earlier studies found that the impact of speed on crash rates was more significant on minor and urban roads than on major and rural ones, and highlighted that not just absolute speed but speed variance between vehicles plays a major role in crash risk ( 41 ). Later research showed that the correlation between speeding and collision involvement varies considerably depending not only on the type of road but also the time of day ( 42 ). Further, the rate of trips completed at night, the rate of trips driving above 90 km/h, and the amount of hard braking had a stronger relationship with collisions than mileage did ( 26 ). Traffic violation records have also proven to be useful in predicting collision risk. A survey in Australia of speeding offenders found that higher speed offenders (over 30 km/h above limit) have been involved in more multiple-vehicle collisions than lower speed offenders ( 43 ). Based on these studies, we selected 47 preliminary predictor variables, which we further divided into two groups (Table 1). Table 1 presents the initial set of predictor variables considered in the analysis, along with descriptive statistics for collision-free and collision-involved drivers and their point-biserial correlations with crash involvement.
The first group of variables related to the drivers’ business and driving environment. Drivers could exercise little control over these variables, as they were mainly dictated by the context of their business. These business/environment variables characterize how much drivers traveled, in what type of vehicle, on what types of road, and during what times of the day. We adopted our time zone and road type categories from Aarts and Van Schagen ( 41 ). Variables under the “Road” heading (Table 1) were obtained by matching GPS data to a mapping tool produced by SK Telecom, which provided information on road type, number of lanes, and speed limit. These data were then aggregated to the driver level to obtain the percentage of trips traveled on mutually exclusive road types (e.g., highway, urban, rural) by a driver. Variables under the “Time” heading were obtained by calculating the percentage of trips a driver had in the respective time slot. Variables Start: 10 p.m.–2 a.m. (percentage of trips starting between 10 p.m. and 2 a.m.) and End: 2–7 a.m. (percentage of trips ending between 2 and 7 a.m.) were created to capture night-time driving; preliminary exploration of the data showed that trips traveled in these time slots had the greatest differences in collision rate between collision-involved drivers and collision-free drivers.
The second group of variables related to driving behavior, including speeding, percentage of uninterrupted driving exceeding 2 h (to capture fatigue), rapid speed changes, and violations of traffic regulations. Under the “Speed” heading (Table 1), the variables speed, running speed, and overspeed time (OST) were directly generated by the FMS device. The FMS device could only record the cumulative time of driving above 110 km/h (the speed limit on South Korean expressways ranges between 100 and 120 km/h) and did not record information related to the roadway, such as the speed limit for different roads. Therefore, we calculated the variable overspeed frequency (OSF) by matching GPS data with SK Telecom mapping data. We further classified speeding frequency into three categories depending on the extent to which drivers exceeded the speed limit: 20 km/h, 40 km/h, and 60 km/h. In South Korean expressways, speed limits for passenger vehicles are uniform across lanes. Variables related to rapid speed changes were calculated by proprietary internal logic in the IoT FMS chipset. Finally, we selected variables related to traffic violation activities based on a study on habitual traffic violators ( 43 ).
Analysis
We selected the predictors to train our classification models based on whether any of the preliminary variables listed in Table 1 differed significantly (p < 0.05) between the collision-free and the collision-involved groups and had a point-biserial correlation coefficient with the outcome variable exceeding ±0.10. Likely because of the large sample size, the resulting p-values were all significant, except for the variables speed (p = 0.13) and running speed (p = 0.16). In addition, the magnitude of the differences between collision-free and collision-involved drivers was relatively small for the variables average speed limit and highway. For this reason, they were excluded from the analysis. We also excluded the variable 9 a.m.–noon as it showed no contribution to the models during preliminary explorations. Further, multicollinearity was checked through Pearson correlation coefficients to ensure that highly correlated variables (±0.8) that satisfied both the significance and the point-biserial correlation thresholds were not used together in the final classification models (Figure 1). Figure 1 shows the Pearson correlation coefficients among variables that passed both the statistical significance and point-biserial correlation thresholds, allowing assessment of potential multicollinearity.

The correlation matrix of significantly tested explanatory variables.
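For readers unfamiliar with the point-biserial coefficient used in the screening step, the sketch below computes it from the group means and the pooled standard deviation and then applies the two thresholds stated in the text (p < 0.05 and a coefficient beyond ±0.10). The function names are hypothetical, and the p-value is assumed to come from a separate group-difference test that is not shown here.

```python
import math


def point_biserial(values, labels):
    """Point-biserial correlation between a continuous variable and a 0/1 label."""
    n = len(values)
    group1 = [v for v, y in zip(values, labels) if y == 1]
    group0 = [v for v, y in zip(values, labels) if y == 0]
    p = len(group1) / n  # share of collision-involved drivers
    q = 1 - p
    mean_all = sum(values) / n
    # Population standard deviation of the whole sample.
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd_all * math.sqrt(p * q)


def passes_screen(r_pb, p_value, r_min=0.10, alpha=0.05):
    """Keep a variable only if it is significant and correlated beyond +/- r_min."""
    return p_value < alpha and abs(r_pb) > r_min
```

The coefficient is numerically identical to the Pearson correlation between the variable and the 0/1 outcome, which is why it can be read on the same scale as the entries of a correlation matrix.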
Although multicollinearity did not pose a problem to the analysis, we decided, for practical reasons, to exclude variables when multicollinearity was present: this way, we avoided considering too many variables in the models, which may be hard to track in real-world applications. Variables from the “Travel” category were highly correlated with each other, except for trip frequency. For this reason, we decided to use running driving time as the most comprehensive and representative travel-related variable and to include the variable trip frequency in the classification models even though its point-biserial correlation coefficient fell under the threshold of ±0.10. Further, to remediate the multicollinearity between the variables related to rapid speed changes, we summed RAF and RDF into one new variable labelled RAF+RDF.
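The multicollinearity check can be illustrated with a short sketch that flags variable pairs whose absolute Pearson correlation exceeds 0.8 and then sums one such pair into a single predictor, as was done for RAF and RDF. The data values and the dictionary layout are made up for illustration and are not taken from the dataset.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| above the threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Toy driver-level values (made up): RAF and RDF move together.
columns = {
    "RAF": [1.0, 2.0, 3.0, 4.0, 5.0],
    "RDF": [1.2, 1.9, 3.1, 4.2, 4.8],
    "trip_frequency": [3.0, 1.0, 4.0, 1.0, 5.0],
}
flagged = collinear_pairs(columns)

# One remedy described in the text: sum the correlated pair into one predictor.
columns["RAF+RDF"] = [a + d for a, d in zip(columns["RAF"], columns["RDF"])]
```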
Finally, given the limited number of observations for each type of violation, we grouped the variables from the “Violation” category into two new variables: safety-related violations (i.e., summed count of traffic sign, lane, and speed violations) and non-safety-related violations (i.e., summed count of toll and parking violations). This feature selection process resulted in nine final predictor variables: running driving time and trip frequency from the “Travel” category, older car and compact car from the “Vehicle” category, 6–9 a.m. and end: 2–7 a.m. from the “Time” category, RAF+RDF from the “Rapid Speed Change” category, and safety violations and non-safety violations from the “Traffic Violation” category.
The modeling was conducted in the programming environment R version 3.5.3, using the package caret, version 6.0–48 ( 44 , 45 ). A 5-fold cross-validation was applied for building the models ( 46 ). The data were split into a 70% training set for cross-validation and 30% for testing. Before running the models, we standardized the data (i.e., variables were centered around 0 and scaled with respect to their standard deviations). Standardization was applied to both training and test datasets. Because of the class imbalance (13.3% versus 86.7%), different sampling techniques were applied to the training dataset, including oversampling (up), undersampling (down), and synthetic minority oversampling technique (SMOTE) to balance the classes and improve classification accuracy. Each sampling technique, including non-sampling, was used to train each of the five machine learning algorithms, thus producing 20 candidate models. The models were evaluated using established methods, including accuracy, AUROC, precision and recall (and their trade-off through the area under the precision and recall curve [AUPRC]), and the F1 score ( 47 ).
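The preprocessing steps can be sketched as follows. The study itself used R and the caret package, so this Python sketch is only an illustration of the ideas, with hypothetical function names. Two assumptions are made: the test set is standardized with the means and standard deviations estimated on the training set (the text does not say which statistics were used), and only random oversampling and undersampling are shown, with SMOTE omitted.

```python
import random
import statistics


def fit_scaler(rows):
    """Column means and sample standard deviations estimated from training rows."""
    cols = list(zip(*rows))
    return [statistics.fmean(c) for c in cols], [statistics.stdev(c) for c in cols]


def standardize(rows, means, sds):
    """Center each column at 0 and scale by its standard deviation."""
    # Note: a constant column (standard deviation 0) would need special handling.
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def resample(rows, labels, method, seed=0):
    """Balance two classes by random undersampling ("down") or oversampling ("up")."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    minority, majority = sorted(by_class, key=lambda c: len(by_class[c]))
    if method == "down":
        # Keep a random subset of the majority class, same size as the minority.
        by_class[majority] = rng.sample(by_class[majority], len(by_class[minority]))
    elif method == "up":
        # Draw minority rows with replacement until the classes match in size.
        extra = len(by_class[majority]) - len(by_class[minority])
        by_class[minority] = by_class[minority] + rng.choices(by_class[minority], k=extra)
    elif method != "none":
        raise ValueError(f"unknown method: {method}")
    out = [(row, c) for c in (0, 1) for row in by_class[c]]
    rng.shuffle(out)
    return [r for r, _ in out], [c for _, c in out]
```

Resampling is applied to the training rows only, so the test set keeps the original class proportions and the reported metrics reflect the real imbalance.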
Results
Table 2 reports the performance metrics for each classification algorithm under the four sampling strategies, with the highest value per metric highlighted in bold. The highest AUROC for the algorithms ranged from 0.773 to 0.823, indicating a 77%–82% probability of correct discrimination between collision-free and collision-involved drivers. Concerning trade-offs between precision and recall, the highest AUPRCs for the algorithms ranged from 0.362 to 0.417, with a baseline of 0.133. RF (sampling: none) showed the highest performance for both AUROC (0.823) and AUPRC (0.417).
Performance of Classification Models
Note: AUPRC = area under the precision and recall curve; AUROC = area under the receiver operating characteristic [curve]; Down = undersampling; SMOTE = synthetic minority oversampling technique; Up = oversampling. The highest performance value across sampling methods is highlighted in bold.
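To make the two ranking metrics concrete, the sketch below computes AUROC as the probability that a randomly chosen collision-involved driver receives a higher score than a randomly chosen collision-free driver, and approximates AUPRC with step-wise average precision. Both functions are hypothetical helpers, and average precision is only one of several ways to compute the area under the precision-recall curve; the method used in the study is not stated. The AUPRC baseline of 0.133 corresponds to the share of collision-involved drivers in the sample (514 of 3,854).

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve (tied scores not grouped)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / hits


# Toy scores (made up): two collision-involved drivers among six.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0, 0, 0]
toy_auroc = auroc(scores, labels)
toy_ap = average_precision(scores, labels)

# A classifier with no skill has an expected AUPRC equal to the positive rate.
auprc_baseline = 514 / 3854
```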
High recall may be prioritized over high precision if the objective is to protect against potential losses that the insurance or rental companies would face by giving lower insurance rates to risky drivers who are misclassified as non-risky (i.e., false negatives). This may particularly be the case for rental companies given that they are not privy to historical crash records of individual drivers as insurance companies are, and thus miss an important piece of information for identifying risky drivers. Figure 2 shows the ROC and precision-recall curves of the five models that had the highest recall values; all were down-sampled, except for kNN (up-sampled). It can be observed how, at low levels of recall (up to about 15%), down-sampled GBT was the only model holding high levels of precision (75%–100%), indicating that at this cut-off, the rate of false positives was low. However, the five classifiers reached high levels of recall only at low levels of precision, and thus correctly classified many collision-involved drivers only at the cost of misclassifying many collision-free drivers as collision-involved. Confusion matrices for the five models with highest recall are included in the Appendix.

(a) Receiver operating characteristic (ROC) and (b) precision-recall curves of the five models with highest recall.
Relative Variable Importance and Sensitivity Analysis
We further analyzed the contribution each predictor made to the classification performance. Figure 3 illustrates the contribution of each predictor variable to the performance of the classification models. For the sake of visual clarity in the variable importance graph, we only included the three models with the highest recall. Since kNN does not provide a value for variable importance, we replaced it with the next model with highest recall (i.e., GBT down-sampled). Thus, the three models illustrated in Figure 3 are GBT, LR, and RF (all down-sampled). The variables with highest relative importance belonged to both the business/environment and the driver behavior categories. Variables such as running driving time and RAF+RDF showed consistent contribution across the three models. This is not surprising, considering the existing knowledge on the effect of mileage on collision risk and the high exposure of commercial fleet drivers to the road ( 17 , 20 , 22 , 26 , 32 ). Further, the influence of RAF+RDF confirms findings from both Bian et al. and Simons-Morton et al., who found that acceleration and deceleration events were highly correlated with collisions and that collision-involved drivers showed significant differences in rapid speed changes compared with collision-free drivers ( 25 , 40 ). The contribution of the other variables was less consistent across models: for instance, the variable End: 2–7 a.m. had higher importance in tree-based models (RF and GBT) than LR. Conversely, the variable safety violations (i.e., signal, speed, and lane violations) was particularly important for LR (down-sampled) but not for the other two models. Finally, trip frequency (trip rate per 1,000 km) showed high contribution for LR (down-sampled) and RF (down-sampled), but not for GBT (down-sampled). These inconsistencies may reflect the different ways in which models capture variable interactions and nonlinearity. 
For example, higher contributions of the time-related variables in the tree-based models may suggest better suitability of those models to identify relationships between nighttime driving and collision risk, which may relate to fatigue, lower visibility, or riskier road conditions. These model-driven differences underscore the importance of algorithm choice when interpreting feature importance.

Relative variable importance.
The LR model (down-sampled) was further analyzed by conducting a sensitivity analysis and by observing changes in the probability of collision involvement. Table 3 summarizes the LR outcomes for the down-sampled dataset, showing odds ratios for each predictor and sensitivity analysis based on a 50% increase from mean values. Significant predictors are highlighted in bold. These results were obtained from non-standardized data to aid the reader in the interpretation of odds ratios and the sensitivity values. We increased each of the predictor variables by 50% from their mean values while controlling for other variables to calculate the changes in expected probability. The results of the sensitivity analysis were generally consistent with the effects observed in Figure 3. RAF+RDF, trip frequency and running driving time were the most influential variables, increasing the probability of collision involvement by 14.7%, 13.0% and 11.7%, respectively.
Down-Sampled Logistic Regression Results and Sensitivity Analysis
Note: RAF = Rapid accelerations frequency; RDF = Rapid decelerations frequency. Bold type indicates significant predictors.
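The sensitivity analysis can be sketched as follows: every predictor is held at its mean, one predictor at a time is raised by 50%, and the change in the predicted probability of collision involvement is recorded. The coefficients and means below are hypothetical and are not the values in Table 3. Reporting the change as an absolute difference in probability is also an assumption about how the percentages were computed.

```python
import math


def predict_probability(intercept, coefs, x):
    """Logistic regression probability for predictor values x."""
    z = intercept + sum(coefs[name] * x[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def sensitivity(intercept, coefs, means, increase=0.5):
    """Change in probability when one predictor rises by `increase` from its mean."""
    base = predict_probability(intercept, coefs, means)
    changes = {}
    for name in coefs:
        shifted = dict(means)  # all other predictors stay at their means
        shifted[name] = means[name] * (1 + increase)
        changes[name] = predict_probability(intercept, coefs, shifted) - base
    return changes


# Hypothetical coefficients and means, for illustration only.
intercept = -1.0
coefs = {"running_driving_time": 0.004, "trip_frequency": 0.01, "raf_rdf": 0.02}
means = {"running_driving_time": 100.0, "trip_frequency": 40.0, "raf_rdf": 15.0}

odds_ratios = {name: math.exp(b) for name, b in coefs.items()}
changes = sensitivity(intercept, coefs, means)
```

Because the odds ratio depends on the unit of each predictor, fitting on non-standardized data, as described above, keeps both the odds ratios and the 50% shifts interpretable in the original units.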
Precision-at-k
In contrast to precision over the entire test sample, precision-at-k analysis assesses precision when the models are used to identify k cases that belong to a particular class (e.g., collision-involved drivers, or collision-free drivers) from the test sample. The five models with the highest precision reported earlier were used for this analysis for both the collision-involved and collision-free classes. Figure 4 compares the precision-at-k metrics for the five models with the highest overall precision scores. Unsurprisingly, precision was generally lower for collision-involved drivers across all models. The two models with highest precision for the collision-involved class were GBT and RF. With k set at less than 5, only GBT held a precision of 100%, but it decreased gradually toward 55% with k set around 50. RF was also the best-performing model for the collision-free class, although all five models held high precision at all levels of k, indicating that 90%–100% of the drivers predicted to be collision-free were actually collision-free. Precision-at-k analysis enables the selection of models that are better at detecting the highest-risk drivers and models that are better at detecting the lowest-risk drivers; combining multiple models to identify these extremes can then inform the setting of penalties or incentives.

Figure 4. Precision-at-k comparison of top five models for (a) high-risk drivers and (b) low-risk drivers.
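The precision-at-k metric discussed in this section can be sketched in a few lines of Python: cases are ranked by a model's score for the class of interest, and precision is computed over the k top-ranked cases only. The scores and labels below are hypothetical and do not come from the study data; ranking the collision-free class by the complementary probability is an assumption that fits a binary classifier.

```python
def precision_at_k(scores, labels, k, target=1):
    """Fraction of the k highest-scored cases whose true label equals `target`."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of cases")
    # Rank cases from highest to lowest score for the class of interest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    return sum(1 for _, label in top_k if label == target) / k

# Hypothetical predicted collision probabilities and true labels
# (1 = collision-involved, 0 = collision-free).
collision_prob = [0.91, 0.85, 0.70, 0.40, 0.30, 0.10]
true_label = [1, 0, 1, 0, 0, 0]

# (a) High-risk drivers: rank by predicted collision probability.
high_risk_p3 = precision_at_k(collision_prob, true_label, k=3, target=1)

# (b) Low-risk drivers: rank by the complementary probability.
collision_free_prob = [1.0 - p for p in collision_prob]
low_risk_p3 = precision_at_k(collision_free_prob, true_label, k=3, target=0)

print(high_risk_p3, low_risk_p3)
```

Evaluating the function over a range of k values for each model would yield curves of the kind compared in Figure 4.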
Discussion
The results of this study are in line with previous naturalistic studies investigating factors related to collision involvement. Running driving time, trip frequency, and rapid speed changes emerged as important factors according to both the relative variable importance and the sensitivity analysis. We also found that critical factors associated with collision involvement related to both drivers’ behavior (e.g., rapid speed changes and safety violations) and business requirements (e.g., running driving time and trip frequency). While business-related variables can inform insurance and fleet rental companies on how to set insurance and rental rates for their corporate customers, variables that indicate risky driver behaviors can inform both the design of driver feedback systems incorporated into fleet telematics devices and training programs aimed at correcting drivers’ specific hazardous behaviors ( 48 – 51 ).
With regard to classification performance, the models in this study showed higher accuracy and AUROC than previous classification studies for collision risk assessment with telematics data. For example, all five of our models showed higher accuracy (ranging from 0.86 to 0.87) than the models in Wang et al. (up to 0.83), possibly because of the smaller number of drivers in their sample (i.e., 64) ( 30 ). Further, while in Baecke and Bocca the AUROC for RF and LR were, respectively, 0.594 and 0.608 for the models including both traditional and telematics data, the AUROC for RF and LR in our study reached values of 0.823 and 0.785, respectively ( 28 ). This discrepancy might be because of differences between the two datasets: while the data analyzed in Baecke and Bocca was collected by a European insurance company, our data was collected by a South Korean fleet rental company, and thus includes additional information, such as traffic violation records ( 28 ).
While all five classifiers achieved relatively high recall, this came at the cost of lower precision: the models correctly identified many collision-involved drivers but also misclassified a substantial number of collision-free drivers (see Appendix). This tradeoff has important implications for real-world use. In PAYD/PHYD insurance schemes, a high rate of false positives may reduce customer trust and lead to unfair premium adjustments. To mitigate these issues, insurers could integrate model predictions with domain-specific rules or enrich the input data with additional contextual variables to better discriminate between high- and low-risk drivers. Ensemble approaches or post-prediction calibration techniques applied in other high-risk domains such as healthcare may also help balance precision and recall more effectively ( 52 , 53 ).
The results in this paper suggest that IoT-connected fleet rental companies could become a useful source of naturalistic driving data for collision risk assessment. Currently, although both insurance and rental companies offer PAYD premiums (e.g., based on mileage), only insurance companies offer PHYD premiums based on telematics data and driver behavior. Generally, fleet rental companies pay the insurance premiums for all their vehicles in advance to insurance companies, and then charge their corporate customers rental fees that include the average insurance premium costs. If the crash rate of a fleet rental company exceeds the cost coverage paid in advance, the insurance company increases its premiums. In response, the fleet rental company either increases the rental fees in its next contract with its corporate customers or rejects high-risk customers.
The insights from this study can inform the calculation of UBI premiums for different markets and countries. First, this study suggests the possibility for fleet rental companies to also offer PHYD premiums. Although we noted that there could be differences in the data of insurance and rental car companies, we propose that more customized PHYD premiums could originate from cooperative efforts between insurance and rental car companies. The possibility for rental car companies to offer PHYD premiums would allow for better customization of their fees based on their customers’ driving behavior. For drivers with sufficient behavioral data, rental companies could calculate premiums by applying risk multipliers proportional to the predicted collision probabilities generated by statistical or machine learning models, enabling a tiered premium structure reflecting individual risk. For users with limited or no driving history, such as first-time renters, rental companies can implement an initial premium determined by demographic information, coupled with a short-term monitoring period using telematics devices or driving apps. Once sufficient behavioral data is collected during this monitoring phase (e.g., a minimum threshold of driving hours), premiums can be updated to reflect the individual risk profile. In cases where telematics data is unavailable, proxy measures such as driving scores from navigation apps may serve as preliminary indicators.
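One way the premium logic outlined above might look in practice is sketched below in Python. A driver keeps an initial premium until a minimum amount of monitored driving has accumulated, after which the premium is scaled by the predicted collision probability relative to a fleet average, within a floor and a cap. The function name, the 50-hour threshold, the fleet-average probability, and the floor and cap values are all hypothetical choices for illustration and are not proposed by this study.

```python
def rental_premium(base_premium, collision_prob=None, monitored_hours=0.0,
                   min_hours=50.0, fleet_avg_prob=0.2, floor=0.8, cap=1.5):
    """Scale a base premium by predicted risk relative to the fleet average.

    All default parameter values are hypothetical placeholders.
    """
    # Limited or no driving history: keep the initial premium
    # (e.g., one determined from demographic information).
    if collision_prob is None or monitored_hours < min_hours:
        return base_premium
    # Risk multiplier proportional to the predicted collision probability.
    multiplier = collision_prob / fleet_avg_prob
    # Bound the multiplier so premiums stay within a fixed range.
    multiplier = max(floor, min(cap, multiplier))
    return round(base_premium * multiplier, 2)

# Hypothetical customers.
new_renter = rental_premium(100.0)                                   # no history yet
low_risk = rental_premium(100.0, collision_prob=0.10, monitored_hours=120.0)
high_risk = rental_premium(100.0, collision_prob=0.90, monitored_hours=120.0)
print(new_renter, low_risk, high_risk)
```

Bounding the multiplier is a design choice that limits the effect of false positives on any single customer, a concern raised earlier in this discussion.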
Second, this study shows that collision risk in corporate driving is not exclusively determined by driving behavior, but also by business-related factors. The calculation of insurance premiums for corporate drivers, in South Korea and worldwide, should consider that some variables related to collision involvement can be outside of drivers’ control. In addition, results from the precision-at-k analysis suggest that our models performed better at identifying drivers with a higher probability of being collision-free, rather than the opposite (i.e., identifying drivers with higher probability of being collision-involved). GBT showed high precision in identifying drivers with high probability of being collision-free, and RF showed high precision in identifying drivers with high probability of belonging in each collision group, which suggests that—at least in this sample—tree-based models might be more reliable than similarity-based models (e.g., kNN and SVM).
Limitations
First, SK Networks’ privacy protection policy prevented us from including drivers’ demographic data in our analysis. Since collision rates follow different patterns across different age groups and genders, including this kind of information could have offered the opportunity to control for these variables while assessing the impact of others, such as traffic violation records and trip frequency. Additionally, since we lacked specific information on drivers’ education levels, there remains uncertainty about how representative our sample is in this regard. We acknowledge this limitation when interpreting the transferability of our findings. In addition, because the vehicles were corporate-owned and not privately owned by the drivers, behavior may differ from that of private vehicle owners. On the one hand, this may reduce personal investment; on the other hand, drivers could be more cautious because of potential monitoring or reprimand from their employers. We note this dual possibility as a limitation when considering the transferability of our findings to broader driving populations.
Second, this study treated drivers who were involved in only one collision the same as those who were involved in more than one. The 1-year period considered in this analysis was not long enough to build models that could further classify drivers who were involved in more than one collision, or who were involved in more-severe collisions. Future studies using samples collected over longer time periods could explore models that predict, for example, individual collision counts or the collision rate per 1,000 km. At the same time, while the proposed models were trained on long-term driving, we have not yet validated their performance on short-term driving data. This is an important area for future work, as accurately classifying high- and low-risk drivers based on limited mileage is critical for UBI applications. Future research will focus on evaluating and adapting the models so that risk assessment remains applicable when only short-term driving data is available. Finally, while our dataset does not include the detailed trajectory or interaction data necessary to compute surrogate safety measures such as time-to-collision, future research could leverage such data to simulate crash risk and complement analyses based on observed collision outcomes ( 54 , 55 ).
Conclusions
We conducted a classification analysis with the objective of investigating collision risk factors in a corporate driving sample and informing UBI policies for corporate customers. This paper suggests the possibility for fleet rental companies to offer PHYD premiums based on measures of their customers’ driving behavior, but also notes that variables affecting collision risk are based on both business- and driver-related factors. We found that running driving time, rapid speed changes, and trip frequency had a consistent effect on the probability of collision involvement across the different models built. Our results from the precision-at-k analysis suggest that low-risk drivers can be identified more effectively than high-risk drivers. Insurance and rental companies could use this information to give lower rates to low-risk drivers, while charging the remaining customers fixed fees. When analyzing naturalistic data, insurance and rental companies should be mindful of differentiating between factors that are directly under drivers’ control and those that depend on their business requirements. Future studies could investigate the implications of model interpretability in the context of naturalistic data and automotive insurance. In conclusion, the analysis presented here suggests not only a way to reduce high collision rates in corporate driving settings, but also an opportunity for fleet rental companies to charge their corporate customers based on the adoption of safe driving behaviors.
Supplemental Material
Supplemental material, sj-docx-1-trr-10.1177_03611981251372464 for Assessing Risk of Collision with Fleet Telematics for Usage-Based Insurance: Case Study from South Korean Car Rental Operations by Davide Gentile, Birsen Donmez, Dongsoon Min and Trevor Waite in Transportation Research Record
Footnotes
Acknowledgements
The authors acknowledge and thank SK Networks Co. Ltd. for their collaboration. In particular, the authors thank Seungwoo Chung, Hyunjoong Kim (rent-a-car sales and development team leader) and Young Ee Cho (executive of rent-a-car business division) of SK Networks for providing the data and pre-processing them in support of this research.
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: D. Gentile, B. Donmez, D. Min, T. Waite; data collection: SK Networks; analysis and interpretation of results: D. Gentile, T. Waite; draft manuscript preparation: D. Gentile, D. Min. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Natural Sciences and Engineering Research Council of Canada (Grant No. RGPIN-2016-05580).
Supplemental Material
Supplemental material for this article is available online.
References
