Abstract
This paper investigates the impact of in-vehicle noise on the ride comfort of range-extended electric vehicles (REEVs) under the parked idle condition. To address the need to improve the noise quality of the range extender, systematic subjective and objective sound quality evaluation experiments were carried out, and a grid-search-optimized support vector regression (GS-SVR) model was constructed for sound quality prediction and analysis. Steady-state noise samples of the range extender were collected in a semi-anechoic environment. Psychoacoustic parameters and key physical acoustic features were extracted, and a correlation analysis against the subjective scores was used to screen out special acoustic indicators significantly related to auditory perception. Subsequently, sound quality prediction models were established with traditional indicators, special indicators, and their combination as inputs, and a grid-search-optimized random forest model was introduced for comparison. The results show that the GS-SVR model performs best when the special indicators are used as inputs: its RMSE, MAE, R², and precision are superior to those of the other models, demonstrating stronger nonlinear fitting ability and generalization stability. This study not only validates the effectiveness of the GS-SVR model in range extender sound quality modeling but also establishes a set of special acoustic indicators with clear physical meaning and engineering applicability, providing a methodological reference for the quantitative evaluation of REEV noise quality and the optimization of subjective comfort.
Keywords
Introduction
In recent years, range-extended electric vehicles (REEVs) have gained significant market attention due to their dual advantages: the high efficiency of electric propulsion and the extended driving range offered by traditional internal combustion engines.1–3 Within the REEV system, the range extender functions as a core auxiliary power unit, and its operational characteristics are closely linked to overall vehicle ride comfort. Under idling conditions in particular, vibration and noise issues become more perceptible to vehicle occupants, emerging as primary sources of user dissatisfaction and deteriorated driving experience. Compared to driving conditions, idling is characterized by a lower background noise level, a more concentrated vibration source, and a heightened subjective sensitivity to sound quality. These factors make idling a critical scenario for analyzing and improving acoustic performance. The interior noise at idle speed reflects not only the structural and control design characteristics of the range extender but also plays a pivotal role in shaping user perceptions of vehicle noise, vibration, and harshness (NVH) performance. Therefore, conducting sound quality prediction and analysis under idle conditions is essential for enhancing user experience and vehicle competitiveness.
With increasing consumer expectations for NVH refinement, noise quality—beyond traditional sound pressure level (SPL) metrics—has emerged as a key dimension in the evaluation of in-vehicle comfort. Blauert 4 introduced the concept of sound quality, emphasizing that the human auditory response involves complex perceptual mechanisms that cannot be captured by SPL alone. Recent studies have proposed various objective indices for evaluating vehicle sound quality, combining subjective assessments with experimental data. For instance, Huang 5 developed a feature fusion approach integrating psychoacoustic metrics and critical frequency band energy features, improving both prediction accuracy and perceptual relevance. Liu et al. 6 utilized ensemble empirical mode decomposition and Hilbert transform to extract energy characteristics of diesel engine noise, demonstrating strong correlation with subjective ratings. Lee et al. 7 introduced a whine index for evaluating electric vehicle warning sounds. Zhao et al. 8 applied a genetic algorithm-optimized random forest model to assess vibration comfort in construction machinery, establishing a predictive system using both subjective and objective data. Dai et al. 9 proposed a binaural measurement-based sound quality evaluation system for construction equipment, incorporating a PSO-RF algorithm to enhance adaptability. Ali et al. 10 identified key subjective parameters of in-cabin noise and proposed a benchmark evaluation method based on both subjective and objective measures. Fang et al. 11 established a sensitivity frequency band energy ratio parameter that demonstrated strong linearity with subjective annoyance ratings. Additionally, Huang et al. 12 developed an abnormal noise identification system for shock absorbers using wavelet packet sample entropy feature extraction.
In recent years, some researchers have explored the optimization of the NVH performance of range extenders from the perspectives of structure and control.13–16 Andert et al. 17 achieved the coordinated optimization of gear meshing noise and NVH performance by combining a pre-loaded split-gear mechanism with a real-time torque regulation strategy. Hooper et al. 18 integrated the intake regulation and balance shaft system through mechatronic design, taking into account both vibration control and fuel economy. Guo et al. 19 precisely controlled the crankshaft stop position phase using an electric motor, effectively reducing the vibration during the start-stop process of the range extender. Although the above-mentioned research has made certain progress in structural optimization and control strategies, most of them focus on NVH control at the mechanical and dynamic levels. However, research on the sound quality characteristics of range extender operating noise and the subjective-objective correlation laws is still scarce. Existing range extender control strategies usually take fuel economy as the main optimization goal. Their frequent intervention during vehicle operation leads to obvious noise characteristics, which has become an important source affecting driving and riding comfort. 20 At the same time, the noise of the range extender is formed by the superposition of mechanical, electromagnetic, and combustion-exhaust noise, with a complex composition and diverse frequency band characteristics, posing significant challenges to sound quality evaluation.21,22 Therefore, in-depth research on the noise quality characteristics and the subjective-objective correspondence of range extenders under the parked idle condition is of great significance for enhancing user experience and product competitiveness. 23
Subjective evaluation remains a crucial component of sound quality analysis, as it directly reflects human auditory perception. Huang et al. 24 proposed a TRS noise optimization system that incorporates group pairwise comparison and interval analysis under engineering uncertainty, offering a new approach for improving acoustic quality in electric vehicles. However, conventional subjective evaluations are resource-intensive and lack efficiency and repeatability. Early objective models also performed poorly in capturing nonlinear relationships, resulting in limited prediction accuracy. To address these limitations, researchers have increasingly adopted data-driven approaches to model the nonlinear mapping between noise features and subjective ratings, thereby reducing the burden of subjective testing.25–30 Among these, support vector machines (SVMs) have shown effectiveness in handling small-sample and nonlinear problems in sound quality prediction.31–34 He et al. 35 applied SVM to vehicle acceleration sound quality prediction, while Zhang et al. 36 compared SVM and backpropagation neural networks for in-cabin noise modeling in electric vehicles, with SVM demonstrating superior performance for small datasets. Neural networks, due to their nonlinear learning capabilities, have also been widely utilized. Huang et al.37,38 proposed a hybrid knowledge graph and network fusion method to optimize noise in tire–road interactions, enhancing both computational efficiency and interpretability. Zhu et al. 39 addressed broadband acoustic comfort prediction in electric vehicles using weighted multi-task learning and a vibration-acoustic knowledge graph. The XGBoost algorithm, introduced by Chen et al., 40 has also been adopted for sound quality modeling. Wang et al. 41 applied XGBoost to predict in-cabin sound quality in both electric and fuel-powered vehicles. Zhang et al. 42 used XGBoost to establish a predictive model for electric bus noise, achieving an average relative error of 4.67%, which meets the 5% target threshold.
In summary, two primary issues persist in current research on range extender sound quality. First, there is a lack of systematic subjective-objective evaluation frameworks tailored to the complex acoustic features of range extender noise. Traditional psychoacoustic metrics are often inadequate for accurately capturing these characteristics. Second, the time-varying nature and rich harmonic content of range extender noise, coupled with environmental variability and the nonlinearities inherent in subjective evaluations, pose challenges for existing prediction models. To address these challenges, this study establishes a specialized feature index system for idle conditions to compensate for the limitations of conventional indicators. Furthermore, a prediction framework is developed based on a grid-search-optimized support vector regression (GS-SVR) model, enabling accurate nonlinear mapping between characteristic acoustic parameters and subjective evaluations, and thereby enhancing model applicability and reliability.
The remainder of this paper is structured as follows: “The proposed method” section outlines the methodology; “Experiment” section details the experimental setup and dataset construction; “Sound quality feature extraction of range extender” section presents the feature extraction process, including both traditional psychoacoustic indices and specialized indicators relevant to range extender noise; “Prediction of sound quality of range extender noise based on GS-SVR model” section discusses the development and comparative analysis of the prediction models. Finally, the last section provides the conclusions. The overall research framework is presented in Figure 1.
Figure 1. The overall research framework.
The proposed method
Introduction of SVR
SVM, originally proposed by Vapnik et al., 43 has been widely applied to classification tasks. Its core principle involves mapping nonlinear input data into a high-dimensional feature space via a kernel function, where a linear model is subsequently constructed. Building upon this foundation, the SVR algorithm extends the SVM framework to address regression problems, while preserving its strengths in handling high-dimensional data, capturing nonlinear relationships, and ensuring strong generalization performance. SVR aims to construct an optimal regression hyperplane in the high-dimensional feature space, thereby identifying a smooth function within a defined error tolerance to predict continuous target variables. 44
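The fundamental objective of SVR is to identify a function that fits the training data while keeping the prediction error within a specified threshold $\varepsilon$. The optimization balances the minimization of model complexity (i.e., the smoothness of the regression function) against the control of prediction error. Given a training dataset:
$$D = \{(x_i, y_i)\}_{i=1}^{n}, \quad x_i \in \mathbb{R}^{d},\; y_i \in \mathbb{R},$$
SVR seeks to determine a regression function defined as follows:
$$f(x) = w^{\top}\varphi(x) + b,$$
where $\varphi(\cdot)$ maps the input into the high-dimensional feature space, $w$ is the weight vector, and $b$ is the bias.
In order to solve the regression problem, Vapnik introduces the $\varepsilon$-insensitive loss function, which in its standard form reads
$$L_{\varepsilon}\bigl(y, f(x)\bigr) = \max\bigl(0,\; |y - f(x)| - \varepsilon\bigr).$$
When the difference between the predicted value $f(x)$ and the true value $y$ does not exceed $\varepsilon$, the loss is zero; otherwise the loss grows linearly with the excess.
The SVR optimization objective is to minimize the norm of the weight vector while allowing a small fraction of samples to exceed the $\varepsilon$-tube. With slack variables $\xi_i$, $\xi_i^{*}$ and a regularization parameter $C > 0$, the standard primal problem is
$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \bigl(\xi_i + \xi_i^{*}\bigr)$$
Subject to the following constraints:
$$y_i - w^{\top}\varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w^{\top}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i,\ \xi_i^{*} \ge 0, \quad i = 1, \dots, n.$$
By introducing the kernel function $K(x_i, x) = \varphi(x_i)^{\top}\varphi(x)$ and solving the dual problem with Lagrange multipliers $\alpha_i$ and $\alpha_i^{*}$, the regression function becomes
$$f(x) = \sum_{i=1}^{n} \bigl(\alpha_i - \alpha_i^{*}\bigr) K(x_i, x) + b.$$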
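As a minimal sketch of the ε-insensitive loss and of the kernel form of the SVR regression function, the following Python snippet may help fix ideas. The RBF kernel, the function names, and the parameter `gamma` are assumptions made for illustration; they are not taken from this paper.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss is zero inside the eps-tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption of this sketch."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion f(x) = sum_i coef_i * K(x_i, x) + b.

    Each coef_i stands for the difference of the two dual multipliers
    (alpha_i - alpha_i^*) of one support vector.
    """
    return bias + sum(
        c * rbf_kernel(sv, x, gamma) for sv, c in zip(support_vectors, dual_coefs)
    )
```

Only samples with a nonzero dual coefficient (the support vectors) contribute to the prediction, which is why the expansion can stay compact on small datasets.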
Reliability control and screening of subjective scores
To ensure the reliability and consistency of the subjective evaluation data used for the training of the models proposed in this paper, two-stage screening and quality control were carried out on all subjective scores prior to modeling. The aim was to eliminate samples and evaluators with significant scoring discrepancies or inconsistent evaluation criteria, thereby improving the statistical stability and credibility of the model training data.
At the sample level: Consistency variance test of scores
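The consistency of scores given by different evaluators for the same sample is tested on a sample-by-sample basis. Let the score set of the $i$-th sample be $S_i = \{s_{i1}, s_{i2}, \dots, s_{iN}\}$, where $N$ is the number of evaluators, and let $\sigma_i^{2}$ denote the variance of these scores about their mean.
When $\sigma_i^{2}$ exceeds a preset threshold, the evaluators are considered to disagree substantially on that sample, and the sample is excluded.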
At the evaluator level: Consistency correlation coefficient test of scores
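On an evaluator-by-evaluator basis, the consistency between each evaluator’s scoring results and the overall trend is evaluated. Let the scoring sequences of two evaluators $p$ and $q$ over the retained samples be $x_p$ and $x_q$; their agreement is measured by the Pearson correlation coefficient
$$r_{pq} = \frac{\sum_{i} (x_{pi} - \bar{x}_p)(x_{qi} - \bar{x}_q)}{\sqrt{\sum_{i} (x_{pi} - \bar{x}_p)^{2}}\,\sqrt{\sum_{i} (x_{qi} - \bar{x}_q)^{2}}}.$$
An evaluator whose average correlation coefficient falls below a preset threshold is considered to deviate from the group trend and is excluded.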
Through the above two-stage screening strategy, abnormal samples and unstable evaluators in the subjective evaluation data can be effectively removed. This ensures the credibility and stability of the input data in the modeling stage, providing a reliable subjective evaluation basis for the training and prediction of the models proposed later in this paper.
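The two-stage screening can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the population variance is used for the sample-level test, and an evaluator's consistency is taken as the mean Pearson correlation with every other evaluator. The function names, the toy score matrix, and the thresholds passed in the example are hypothetical.

```python
import math
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def screen_samples(scores, var_threshold):
    """scores[i][j] is the score of sample i by evaluator j.

    Returns the indices of samples whose score variance (population
    variance, an assumption) does not exceed the threshold.
    """
    return [i for i, row in enumerate(scores) if pvariance(row) <= var_threshold]


def screen_evaluators(scores, kept_samples, corr_threshold):
    """Keep evaluators whose mean correlation with all others reaches the threshold."""
    n_eval = len(scores[0])
    columns = [[scores[i][j] for i in kept_samples] for j in range(n_eval)]
    kept = []
    for j in range(n_eval):
        others = [pearson(columns[j], columns[k]) for k in range(n_eval) if k != j]
        if mean(others) >= corr_threshold:
            kept.append(j)
    return kept


# Toy data: 6 samples x 4 evaluators on a 1-5 scale (hypothetical values).
toy_scores = [
    [1, 1, 1, 3],
    [2, 2, 2, 1],
    [3, 3, 3, 3],
    [4, 4, 4, 5],
    [5, 5, 4, 3],
    [1, 5, 1, 5],  # evaluators disagree strongly on this sample
]
kept_samples = screen_samples(toy_scores, var_threshold=0.8)
kept_evaluators = screen_evaluators(toy_scores, kept_samples, corr_threshold=0.6)
```

In the toy data the last sample is removed at the first stage, and the fourth evaluator, whose scores track the others only weakly, is removed at the second stage.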
The proposed GS-SVR
After the reliability screening of the subjective scoring data, the retained, highly consistent samples are used to construct an SVR model for predicting the noise quality of the range extender under the idle condition. To enhance the predictive performance of the SVR model, several parameter optimization methods have been widely employed, including cross-validation, gradient descent, and grid-search techniques. 46 Among these, the cross-validation method offers stability and ease of implementation; however, it involves a high computational cost and is often insufficient for optimizing SVR parameters on its own. The gradient descent method, commonly applied to unconstrained optimization problems, may converge to a local optimum during iteration and thereby miss the global optimum. In contrast, the grid-search method systematically explores the parameter space by discretizing it. Specifically, the value ranges of the parameters are predefined, and the space is divided into a grid in which each grid point represents a unique parameter combination. All combinations are then exhaustively evaluated, and the step size and search range are iteratively adjusted based on the value of the objective function until the optimal solution is identified. 47
In the SVR model, the regularization parameter and the other hyperparameters are tuned by grid search. The procedure is as follows:
(1) Define the search ranges for the parameters.
(2) Set the step sizes within these ranges and discretize the parameter space into a grid, where each node corresponds to a unique combination of parameter values.
(3) Traverse all grid nodes, constructing and training an SVR model for each parameter combination.
(4) Evaluate model performance based on training error and identify the parameter combination that yields the optimal performance.
(5) Apply the optimal parameter combination to the final SVR model for prediction.
Figure 2. Algorithm flow chart of the GS-SVR.
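The grid traversal described in the steps above can be sketched generically as follows. This is not the authors' implementation: the parameter names `C` and `gamma`, the power-of-two spacing, and the toy error surface are assumptions chosen only to make the example self-contained. In practice `evaluate` would train an SVR with the given parameters and return its error.

```python
import itertools
import math


def log2_grid(lo_exp, hi_exp, step):
    """Powers of two from 2**lo_exp to 2**hi_exp (an assumed, common spacing)."""
    n = int(round((hi_exp - lo_exp) / step)) + 1
    return [2.0 ** (lo_exp + i * step) for i in range(n)]


def grid_search(c_values, gamma_values, evaluate):
    """Score every (C, gamma) node exhaustively and return the best one.

    `evaluate(C, gamma)` is supplied by the caller and returns an error
    to be minimised.
    """
    best_params, best_error = None, float("inf")
    for c, gamma in itertools.product(c_values, gamma_values):
        error = evaluate(c, gamma)
        if error < best_error:
            best_params, best_error = (c, gamma), error
    return best_params, best_error


def toy_error(c, gamma):
    """Hypothetical error surface with its minimum at C = 8, gamma = 0.25."""
    return (math.log2(c) - 3.0) ** 2 + (math.log2(gamma) + 2.0) ** 2


grid = log2_grid(-5, 5, 1)
best_params, best_error = grid_search(grid, grid, toy_error)
```

Because every node is visited, the cost grows with the product of the grid sizes, which is acceptable for two parameters and a dataset of this size.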

To evaluate the predictive performance of the model, this paper uses precision, root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) as evaluation metrics.
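For reference, RMSE, MAE, and the coefficient of determination can be computed as in the sketch below, using their standard definitions. The precision metric is omitted because its definition is not recoverable from the text here; the function name and the example values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for paired true and predicted scores."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical scores: one prediction is off by a full point.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

RMSE penalises the single large error more heavily than MAE does, which is why the two are usually reported together.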
Experiment
Range extender noise sample collection
In this study, the range extender of a hybrid electric vehicle was selected as the test object, and the noise data were collected in a semi-anechoic chamber. The experimental setup is illustrated in Figure 3. During testing, the ambient temperature was maintained at 25°C. Tests were conducted in accordance with the standard GB/T 18697-2002 Acoustic measurement methods for vehicle interior noise. 49
Interior noise signals were recorded using an LMS SCADAS SM32 data acquisition and playback system in combination with binaural sound pressure sensors. The binaural sensors were worn on the ears of the test personnel (see Figure 4(c)) to reproduce the actual listening positions of human ears in the vehicle and thus capture the spatial sound field as it is perceived. Data were acquired at four representative seating positions: the driver seat, the front passenger seat, the right seat in the middle row, and the left seat in the rear row. All measurements were performed under engine idle conditions with the vehicle in a steady state. The sensor positions, at the headrest of each seat, are shown in Figure 4. The acquisition parameters were set as follows: a sampling frequency of 51,200 Hz, a frequency resolution of 1 Hz, and a sampling duration of 10 s.
Figure 3. Test site configuration.
Figure 4. Arrangement of sound pressure sensor. (a) Main driver, (b) co-driver, (c) right seat in the middle row, and (d) left seat in the back row.

Following acquisition, the raw noise signals collected by the binaural sound pressure sensor were imported into the signature data post-processing module of LMS Test.Lab 21A software for further analysis. To ensure consistent evaluation, each signal was truncated to a duration of 6 s. The left and right ear channel signals (C13 and C14) were combined into a single stereo channel (C17) to restore, as far as possible, the auditory experience of the driver in an actual driving state. To verify the accuracy of this synthesis process, a noise sample was randomly selected, and the time-domain waveforms of its original left and right ear channels and of the synthesized stereo channel were compared (see Figure 5). The synthesized signal is consistent with the original signals in waveform amplitude, rhythmic fluctuation, and energy distribution. No phase drift or amplitude distortion was observed, indicating that the synthesis only integrates the channels without introducing significant errors and faithfully reflects the in-vehicle spatial sound field. Subsequently, for each working condition and measurement point, the interception interval was selected based on the steady-state characteristics of the signal (such as a section with stable sound pressure level), and segments with abnormal pulses or environmental noise interference were removed. Two professional NVH engineers then screened the samples manually to ensure their representativeness and consistent quality. A total of 105 noise samples were finally obtained, covering different engine speeds and seat positions. It should be noted that the left and right ear signals of each sample are derived from the binaural channels of the same measurement event. During playback, they are presented synchronously in stereo, that is, the two ears receive the two channels of the same noise event, which maintains the consistency and authenticity of the spatial auditory perception. In the subsequent subjective evaluation experiment, all subjects listened to the same 105 binaural stereo samples to ensure the comparability and repeatability of the evaluation results.
Figure 5. Time-domain signals before and after stereo synthesis.
Subjective evaluation of range extender noise
In this study, a total of 45 healthy subjects with normal hearing were recruited to participate in the subjective evaluation experiment. Among them, 35 were male and 10 were female. Their ages were mainly distributed in the range of 20∼60 years old, with the 20∼35-year-old group being the majority (36 subjects aged 20∼35 years old and 9 subjects aged 35∼60 years old). According to their work experience, the subjects were divided into a professional group (engaged in automotive NVH-related work for at least 3 years, 26 subjects) and a non-professional group (19 subjects) to ensure the representativeness and stability of the evaluation results.
Subjective evaluation table of sound quality of range extender.
To ensure the reliability and consistency of the subjective evaluation results of sound quality, this paper conducts two-stage data screening and quality control on all evaluation data after the sample evaluation is completed. This process is carried out according to the methods described in Section 2.2, and the discreteness and consistency of the evaluation results are examined from the sample level and the evaluator level, respectively.
In the first stage, for each sound sample, the variance of the scores across evaluators is calculated to measure their dispersion. When the score variance of a sample exceeds the set threshold (0.8), the evaluators differ significantly in their perception of this sample and their subjective judgments are inconsistent. Such a sample is regarded as having poor evaluation stability and is therefore excluded. 51 After this screening, a total of 23 samples with excessive variance are excluded, and 82 sound samples with stable evaluation results are retained. Their variance distribution is shown in Figure 6.
Figure 6. The variance distribution of 105 samples.
In the second stage, the consistency among evaluators is examined. Using the scoring results of all evaluators for the 82 samples, the Pearson correlation coefficient between each evaluator’s scores and the overall scores is calculated to identify differences in consistency among evaluators. If the average correlation coefficient of an evaluator is lower than 0.6, their scoring results are considered to deviate significantly from the group trend and to have low reliability, and the evaluator is excluded from the final analysis. According to this principle, a total of 9 evaluators are excluded, and the data of 36 evaluators are finally retained. The distribution of their average correlation coefficients is shown in Figure 7.
Figure 7. Average correlation coefficient of 36 evaluators.
Acoustic sample comfort score statistics table (summarized excerpt).
To further analyze the acoustic differences among noise samples with different rating levels, three typical samples with average ratings of low (1.3 points), medium (3.0 points), and high (4.8 points) were selected from the 82 samples, representing auditory experiences of different comfort levels. Their spectral characteristics are shown in Figure 8. As can be seen from Figure 8, the three types of samples all exhibit significant energy differences within 2000 Hz, mainly concentrated in the low-frequency order range below 200 Hz, the mid-frequency range of 200∼800 Hz, and the mid-to-high-frequency range of 1000∼2000 Hz. Among them, the low-rated sound samples have high energy in the 200∼800 Hz frequency band, with large fluctuations in local frequency components. The medium-rated samples have moderate energy in this range, with a relatively clear but not overly prominent harmonic structure. The high-rated sound samples have relatively low energy in this frequency band, with a smoother spectral line distribution and a harmonious harmonic structure. Overall, the low-rated samples are characterized by uneven energy distribution and more local sharp frequency peaks, reflecting a strong sense of auditory roughness. The high-rated samples have a more uniform energy distribution and a smooth spectral curve, demonstrating better acoustic comfort.
Figure 8. Spectral comparisons of sound samples with different subjective scores.
Sound quality feature extraction of range extender
Extraction of psychoacoustic and physical acoustic indices for range extender noise
Objective parameters of sound quality of range extender (summarized excerpt).
Correlation between subjective evaluation and objective parameters.
The A-weighted SPL exhibited a strong negative correlation with subjective evaluation scores, with a coefficient of −0.85. This indicates that higher A-weighted SPL values are associated with lower subjective comfort ratings, suggesting a deterioration in perceived noise quality. This outcome is consistent with auditory perception principles, as A-weighted SPL reflects sound intensity adjusted for human ear sensitivity. Higher values typically correspond to louder and more discomforting sound, leading to reduced subjective acceptance.
Loudness, a psychoacoustic index that quantifies perceived sound intensity, also showed a significant negative correlation with subjective scores (correlation coefficient: −0.80). This result indicates that increases in loudness are generally associated with lower subjective ratings. The calculation of loudness incorporates time-domain integration and critical band analysis, which emphasize low-frequency energy—characteristic of the steady-state noise produced by range extenders. Such low-frequency components are often perceived as more oppressive or fatiguing, contributing to lower auditory comfort.
In contrast, sharpness and unweighted SPL showed only moderate correlations with subjective evaluation, with coefficients of 0.49 and −0.43, respectively. The limited correlation of sharpness may be attributed to its emphasis on high-frequency content, whereas the energy of range extender noise is predominantly concentrated in the low- to mid-frequency bands, with relatively weak high-frequency components. Similarly, the unweighted SPL does not incorporate perceptual weighting and fails to account for the nonlinear characteristics of auditory perception, which likely contributes to its weak association with subjective evaluations.
The remaining parameters—AI, roughness, fluctuation strength, prominence ratio, and tonality—exhibited low absolute correlation coefficients (all below 0.25), indicating minimal linear relationship with subjective evaluation. This can be explained by the nature of the range extender noise, which under idle conditions produces relatively stable sound with limited temporal fluctuation. Consequently, roughness and fluctuation strength, which assess modulation in the time domain, are less relevant in this context. Additionally, indicators such as AI and tonality are designed primarily for speech and tonal or high-frequency sounds, which differ significantly from the low-frequency mechanical and combustion noise of range extenders. The prominence ratio, intended to quantify the salience of narrow-band components, is also ineffective under these conditions due to the broad-spectrum nature of the idle-state noise.
Extraction of special indices for range extender noise
Based on the analysis of representative acoustic samples with low, medium, and high subjective ratings (as discussed in Section 3.2), it was observed that specific frequency domain features vary significantly across different scoring groups. First, the RMS value within the 200∼800 Hz frequency band was identified as an effective indicator of signal energy distribution in this range, which generally encompasses the dominant energy components of range extender noise. Measurement results revealed that samples with lower subjective ratings exhibited significantly higher energy densities within this frequency band, suggesting a strong correlation between this feature and subjective perception. Second, the 2nd-order frequency component and its corresponding energy distribution were found to be closely related to mechanical resonance and structural vibration modes within the range extender. In automotive noise analysis, pronounced harmonic distortion or structural imbalance can lead to auditory discomfort and reduced perceived sound quality. A clear distinction in 2nd-order frequency energy was identified among samples with varying subjective scores, with low-rated samples typically exhibiting higher energy levels in the second order.
Figure 9 presents the order analysis diagrams for sound samples with average scores of 1.3, 3.0, and 4.8. It can be seen that low-rated samples exhibit higher sound pressure levels in the second and fourth orders, while medium-rated samples show reduced energy levels, and high-rated samples display the lowest values. Since the A-weighted SPL accounts for the frequency sensitivity of the human auditory system, the sum of the A-weighted SPLs of the 2nd and 4th orders (Order 2&4 SPLs) serves as a more perceptually aligned measure of the influence of low-order harmonics on subjective auditory perception.
Figure 9. Order spectra of range extender noise samples with different subjective ratings.
Based on these findings, three special indices were selected to characterize the objective acoustic properties of range extender noise: (1) RMS value within the 200∼800 Hz frequency band. (2) 2nd-order frequency component. (3) Order 2&4 SPLs.
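The quantities behind these three indices can be illustrated with a short, dependency-free sketch. Several points are assumptions rather than details from this paper: the band RMS is obtained from a direct DFT via Parseval's relation, the two order levels are combined as an energetic (power) sum, and the engine speed, test signal, and function names are hypothetical.

```python
import cmath
import math


def band_rms(signal, fs, f_lo, f_hi):
    """RMS of the content of `signal` between f_lo and f_hi (Hz).

    Sums the power of the positive-frequency DFT bins inside the band
    (DC and Nyquist excluded). A direct DFT keeps the sketch free of
    dependencies; an FFT would be used on real recordings.
    """
    n = len(signal)
    k_lo = max(1, math.ceil(f_lo * n / fs))
    k_hi = min(math.floor(f_hi * n / fs), (n - 1) // 2)
    power = 0.0
    for k in range(k_lo, k_hi + 1):
        xk = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        power += 2.0 * abs(xk) ** 2 / n ** 2  # factor 2 adds the mirrored negative bin
    return math.sqrt(power)


def order_frequency(rpm, order):
    """Frequency in Hz of an engine order at a given crankshaft speed."""
    return order * rpm / 60.0


def combined_level(levels_db):
    """Energetic sum of levels in dB (assumed way of combining orders 2 and 4)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


# Hypothetical test signal: 500 Hz (amplitude 2) inside the band, 1500 Hz outside.
fs, n = 4000, 400
tone = [
    2.0 * math.sin(2 * math.pi * 500 * i / fs) + math.sin(2 * math.pi * 1500 * i / fs)
    for i in range(n)
]
rms_200_800 = band_rms(tone, fs, 200.0, 800.0)
second_order_hz = order_frequency(1200, 2)
order_2_4_level = combined_level([60.0, 60.0])
```

Only the 500 Hz tone contributes to the band value, so the result equals its amplitude divided by the square root of two.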
Special indicators of range extender noise samples (summarized excerpt).
Correlation between subjective evaluation and special indices.
Prediction of sound quality of range extender noise based on GS-SVR model
Development of GS-SVR model
Among the objective sound quality indicators, loudness, A-weighted SPL, sharpness, and unweighted SPL exhibit medium to high correlation with subjective evaluation scores. Specifically, the Pearson correlation coefficient between loudness and A-weighted SPL is as high as 0.964, indicating strong collinearity. In contrast, the correlation coefficients between the remaining indicators and the subjective scores are all below 0.52. To avoid multicollinearity in the model inputs, A-weighted SPL, sharpness, and unweighted SPL were selected as the input parameters for sound quality characterization. To investigate the influence of different types of features on model performance, three input feature groups were constructed for modeling: (1) Special index group: comprising the RMS value in the 200∼800 Hz frequency band, the 2nd-order frequency, and the order 2&4 SPLs. (2) Sound quality index group: including A-weighted SPL, sharpness, and unweighted SPL. (3) Combined index group: integrating both the special index group and the sound quality index group.
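The selection of the sound quality index group can be mimicked by a simple greedy redundancy filter, sketched below. The correlation values are those reported in the text; the 0.9 collinearity limit, the function name, and the treatment of unreported pairwise correlations as zero are assumptions of this sketch.

```python
def drop_collinear(relevance, pairwise, limit):
    """Greedy redundancy filter.

    Features are visited by decreasing |correlation with the subjective
    score|; a feature is kept only if its |correlation| with every
    feature already kept does not exceed `limit`. Pairs missing from
    `pairwise` are treated as uncorrelated (an assumption).
    """
    kept = []
    for name in sorted(relevance, key=lambda f: abs(relevance[f]), reverse=True):
        if all(
            abs(pairwise.get(frozenset((name, other)), 0.0)) <= limit
            for other in kept
        ):
            kept.append(name)
    return kept


# Correlations with the subjective score, as reported in the text.
relevance = {
    "A-weighted SPL": -0.85,
    "loudness": -0.80,
    "sharpness": 0.49,
    "unweighted SPL": -0.43,
}
# The only pairwise correlation stated in the text.
pairwise = {frozenset(("loudness", "A-weighted SPL")): 0.964}

selected = drop_collinear(relevance, pairwise, limit=0.9)
```

With these inputs loudness is dropped in favour of the A-weighted SPL, which reproduces the three-feature sound quality index group.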
For all groups, the subjective evaluation score was used as the model’s output target, and a regression prediction model for range extender sound quality was established based on SVR. During the data preprocessing phase, all input features and target scores were normalized to ensure uniform scaling. The dataset, consisting of 82 samples, was randomly divided into training and test sets in a 7:3 ratio, resulting in 57 samples for training and 25 samples for testing.
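A minimal sketch of the preprocessing step is given below. Min-max scaling and the fixed random seed are assumptions, since the normalization method and the seed are not stated; the function names are hypothetical.

```python
import random


def min_max_normalize(column):
    """Scale one feature column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def train_test_split(n_samples, train_ratio, seed):
    """Shuffle sample indices reproducibly and cut them at the given ratio."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_ratio)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split(82, 0.7, seed=0)
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

With 82 samples and a 7:3 ratio this yields the 57 training and 25 test samples stated above. If the scaling statistics should not see the test set, they can be computed on the training indices only and then applied to both subsets.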
To optimize the hyperparameters of the SVR model, the grid-search method was employed. The penalty parameter
Results of sensitivity analysis of hyperparameters for GS-SVR.
Sound quality prediction and analysis of range extender noise
Figure 10 presents a comparison between the predicted and actual subjective scores obtained using the GS-SVR model under different input feature configurations. It should be noted that the true values in the three sub-figures are not exactly the same. This difference stems from the independent random partitioning strategy employed during the model validation phase. To avoid data dependencies and information leakage among different input feature schemes, this paper conducts independent data partitioning and model training for the three types of input features: special indicators, sound quality indicators, and their combination. Specifically, within the same overall sample library, 70% of the samples are randomly selected as the training set and 30% as the test set. Since the partitioning processes are independent of each other, the test sample sets corresponding to each model do not completely overlap. Therefore, there are slight differences in the distribution of true scores in the three sub-figures. This design aims to ensure the statistical independence and generalization reliability of the performance evaluation under different feature input conditions, avoiding potential over-fitting or data memorization effects that may occur when using the same test set. This approach does not affect the validity of the overall conclusion; instead, it allows each model to be assessed more objectively under independent conditions.
Figure 10. Prediction results of GS-SVR model under different input feature sets.
When the special indicators were used as input variables, the GS-SVR model demonstrated the highest predictive accuracy, with a maximum absolute error of 0.7 and an average absolute error of only 0.26. These results indicate a strong fitting capability and close consistency with the subjective evaluation. In contrast, when traditional sound quality indices were employed as model inputs, the predictive performance declined noticeably: the maximum error increased to 0.9 and the average error rose to 0.38. This outcome suggests that conventional sound quality parameters are limited in their ability to capture the acoustic characteristics specific to range extender noise. When the special indicators and sound quality indices were combined as joint input features, the maximum error was 1.1 and the average error 0.33. The combined input model thus achieved a lower average error than the model using only sound quality indices, but it produced the largest maximum error and remained inferior to the model using special indicators alone. This indicates that the inter-relationships among feature sets have a significant impact on model performance. Further analysis reveals a certain degree of correlation and overlap between the special indicators and the traditional sound quality indicators. For example, some frequency-domain energy parameters and sharpness indices reflect similar information about the spectral distribution of the sound. When these highly correlated features are input into the model simultaneously, feature collinearity increases. Consequently, the model’s ability to discriminate along the key dimensions is weakened, the effective margin of the support vector distribution is reduced, and generalization ability declines. The combined input therefore brings no performance gain, mainly because of the combined effects of feature redundancy, collinearity, and scale conflicts.
This suggests that in small-sample sound quality prediction tasks, correlation tests and redundancy elimination should be carried out before feature fusion. Input dimensions should be controlled through feature selection or regularization constraints to achieve a balance between information content and model complexity.
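As one possible form of the correlation test and redundancy elimination recommended above, the sketch below drops any feature whose absolute Pearson correlation with an already retained feature reaches a threshold. The feature names, the toy values, and the 0.9 threshold are all hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / (sd_x * sd_y)

def drop_redundant(features, threshold=0.9):
    """Keep features in insertion order, skipping any that is highly
    correlated (|r| >= threshold) with a feature already kept."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy data (hypothetical): "loudness" nearly duplicates "rms_200_800".
features = {
    "rms_200_800": [1.0, 2.0, 3.0, 4.0, 5.0],
    "loudness": [2.0, 4.0, 6.0, 8.0, 10.5],
    "sharpness": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_redundant(features))  # ['rms_200_800', 'sharpness']
```

Because the screening is greedy, the order in which features are listed decides which member of a correlated pair survives. Listing the physically motivated indicators first keeps them in preference to their correlated counterparts.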
To further validate the effectiveness of the proposed indicators and the GS-SVR modeling approach, a comparative analysis was conducted using a GS-RF model. As shown in Figure 11, the GS-RF model exhibited lower prediction accuracy than the GS-SVR model under all input configurations. When special indicators were used as input, the GS-RF model produced a maximum error of 1.0 and an average error of 0.39. The predictive performance deteriorated further when sound quality indices were used, resulting in a maximum error of 1.2 and an average error of 0.46. When both indicator types were input simultaneously, the GS-RF model yielded an intermediate average error of 0.44 but the largest maximum error, 1.4. These findings further confirm the superior generalization ability and robustness of the GS-SVR model in predicting the subjective evaluation of range extender noise, particularly when special acoustic indicators are used as input features.
Prediction results of GS-RF model under different input feature sets.
Subjective rating prediction of GS-SVR and GS-RF models under different input.
The robustness of the GS-SVR model can be attributed to the global optimization of its hyperparameters via grid search, which allows it to maintain high prediction accuracy even with a limited dataset. In contrast, the GS-RF model consistently performs worse across all input configurations. For instance, when the special indicators are used, the GS-RF model yields RMSE = 0.49, MAE = 0.39, R2 = 0.7056, and precision = 90.01%. The relatively poor performance of GS-RF may result from its ensemble decision tree architecture, which, due to its local feature-splitting strategy, may struggle to capture complex global nonlinear relationships effectively. The model’s prediction capability further deteriorates when only sound quality indices are used as input, with RMSE reaching 0.56 and R2 dropping to 0.6073. These findings underscore the critical role of physically grounded acoustic features in accurately modeling perceived noise quality. When both types of indicators are input into the GS-RF model, RMSE slightly improves to 0.54; however, this remains substantially inferior to the GS-SVR model’s performance, suggesting that the random forest method is more sensitive to the negative effects of feature redundancy.
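For reference, the error measures quoted in this section (maximum absolute error, MAE, RMSE, and R2) can be computed as in the sketch below, using their standard definitions. The score values are toy numbers, not data from the study, and the precision metric is omitted because its definition is not given in this section.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return max absolute error, MAE, RMSE and R2 for paired score lists."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "max_abs_error": max(abs(e) for e in errors),
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Toy subjective scores and predictions (hypothetical values).
y_true = [6.0, 7.0, 8.0, 5.0]
y_pred = [6.5, 7.0, 7.5, 5.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

For these toy values the errors are 0.5, 0, -0.5, and 0, giving MAE = 0.25, RMSE = sqrt(0.125) ≈ 0.354, and R2 = 1 - 0.5/5 = 0.9.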
Although the GS-RF model’s key hyperparameters, such as the number of estimators and maximum depth, are also optimized using a grid-search approach, its fundamental structure as an ensemble of decision trees limits its capacity to express highly nonlinear relationships. The reliance on local partitioning mechanisms prevents the model from fully capturing the intricate dependencies between subjective ratings and acoustic features. In contrast, the GS-SVR model benefits from the high-dimensional mapping capability provided by the kernel function, enabling it to learn complex nonlinear relationships between features and outputs. The incorporation of a global hyperparameter optimization strategy further enhances its predictive accuracy and generalization ability.
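The grid-search strategy applied to both models amounts to evaluating every combination in a predefined hyperparameter grid and keeping the one with the lowest validation error. The sketch below shows only that search loop. The grid values are hypothetical, and `toy_cv_error` is a stand-in for a real cross-validated error of an SVR or random forest model, which is not reproduced here.

```python
import itertools
import math

def grid_search(param_grid, error_fn):
    """Exhaustively evaluate every parameter combination and return
    the combination with the lowest error together with that error."""
    names = list(param_grid)
    best_params, best_error = None, math.inf
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        error = error_fn(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

def toy_cv_error(C, gamma):
    """Stand-in error surface (assumption): smallest at C = 10, gamma = 0.01."""
    return (math.log10(C) - 1.0) ** 2 + (math.log10(gamma) + 2.0) ** 2

# Hypothetical logarithmic grid; the ranges used in the study are not assumed.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
best_params, best_error = grid_search(param_grid, toy_cv_error)
print(best_params)  # {'C': 10, 'gamma': 0.01}
```

In practice `error_fn` would train the model on the training folds and return the mean validation error, so the cost grows with the product of the grid sizes. That cost stays manageable for a two-parameter grid and a dataset of this size.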
In summary, the GS-SVR model, when trained with a special indicator set composed of the RMS values in the 200–800 Hz range, the 2nd-order frequency, and the 2nd- and 4th-order SPLs, enables high-precision prediction of the sound quality of the range extender under idle conditions. These findings offer a practical and effective approach for modeling complex noise characteristics and provide valuable insights for future research and engineering applications in acoustic evaluation.
Conclusion
In this study, a subjective score prediction model based on GS-SVR was developed to assess the steady-state noise quality of range-extended electric vehicles (REEVs) under the parked idle condition. A total of 105 sample groups were collected, of which 82 valid samples remained after screening. Three types of input features (special indicators, sound quality indicators, and their combination) were employed, with subjective ratings used as the model output. A comparative modeling analysis was conducted, in which the GS-RF model served as a benchmark. The results demonstrated that the GS-SVR model outperformed the GS-RF model across all input configurations. The best performance was achieved when the special indicators were used as input, yielding an RMSE of 0.34, MAE of 0.26, R2 of 0.8847, and precision of 93.36%. These values were markedly better than those obtained using only the sound quality indicators (R2 = 0.7794). Although the combined features led to marginal improvements over the sound quality indicators alone, their performance did not exceed that of the special indicators alone, suggesting that feature redundancy limits further gains in predictive accuracy. Moreover, the GS-SVR model exhibited strong robustness under limited sample conditions, highlighting its suitability for nonlinear noise quality prediction tasks. This work confirms the effectiveness of employing physically grounded acoustic features in modeling the subjective perception of range extender noise. The proposed approach offers a reliable methodological basis and practical reference for future efforts in sound quality evaluation and optimization in automotive engineering.
Although this study has systematically investigated the sound quality prediction of the REEV range extender under the parked idle condition, certain limitations remain and should be addressed in subsequent research. First, the sample size is relatively limited. After screening, only 82 sets of valid sample data were obtained. Although this quantity can support model training and validation, the small number of samples may affect the statistical stability of the model on specific sample sets to some extent. In future research, the generalization performance of the model can be further enhanced by introducing multi-batch and multi-environment test data with a larger sample size. Second, only a single operating condition was studied. This paper focuses on the steady-state noise characteristics of the vehicle under the parked idle condition, without considering dynamic operation processes such as acceleration, deceleration, and energy recovery. Under driving conditions, the load changes of the powertrain are more complex, structural coupling is present, and aerodynamic noise is superimposed, so the prediction results of the existing model may not be fully applicable to real-world driving scenarios. Future work can therefore address sound quality feature extraction and time-varying models under dynamic conditions to achieve a comprehensive sound quality evaluation of the vehicle’s operation process. Third, the REEV tested in this study is a specific model, and its range extender layout, sound insulation design, and control strategy are unique to it. The conclusions reflect the sound quality characteristics of this model to a certain extent, but they may differ for other platforms, engine displacements, or acoustic package structures. Subsequent work can conduct comparative research based on multiple models to explore the common patterns in the sound quality of REEVs with different structural types, thereby enhancing the generalizability and engineering guidance value of the research results.
In summary, the findings of this study provide a basic framework for the quantitative prediction of the idle sound quality of REEVs. However, further work on sample expansion, additional operating conditions, and cross-platform verification is still required to make the model robust and practical across a wider range of application scenarios.
Supplemental Material
Supplemental Material for Exploration on sound quality prediction and evaluation of a range-extended electric vehicle under idle conditions by Wei Duan, Jingyuan Peng, WeiWei Dai, Hong Jin, Zhuo Chen, and Ruijun Liu in Journal of Low Frequency Noise, Vibration, and Active Control.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
