Abstract
Exhaust sound quality is an important part of vehicle performance. In this paper, the sporty exhaust sound quality of an economical vehicle equipped with a 4-cylinder, 4-stroke engine is evaluated, analyzed, and improved under acceleration. Firstly, a sporty feeling evaluation method that divides the engine speed range into intervals is proposed, and the influence of exhaust sound order components on sporty exhaust sound is analyzed. The results show that the exhaust sound is sportier when the A-weighted sound pressure level (ASPL) of Order 2 is lower and the ASPLs of Orders 4 and 6 are higher. Then, a hybrid prediction model of vehicle sporty exhaust sound under acceleration is established based on a convolutional neural network (CNN) and the support vector regression (SVR) algorithm. The relative errors between the predicted results of the CNN-SVR hybrid model and the subjective evaluation results are within 2%, which indicates that the CNN-SVR hybrid prediction model achieves high accuracy in assessing the sporty feeling of exhaust sound. Finally, considering the frequency ranges corresponding to the above order components under practical acceleration conditions, a strategy is proposed to enhance the sporty feeling of exhaust sound by reducing the sound energy below 100 Hz and increasing the sound energy within 100–450 Hz. Based on this strategy, a muffler with a different structure is selected and installed on the economical vehicle, and the sporty feeling score of the exhaust sound is 0.63 points higher than before.
Introduction
The sound quality of a vehicle is an important factor in the willingness of customers to buy. Nowadays, economical vehicles with sporty sound are favored by young customers. The design of exhaust sound is one of the main methods to enhance the sporty feeling of economical fuel vehicles. Therefore, it is interesting to study the sporty exhaust sound of economical fuel vehicles.
The evaluation of sound quality includes both subjective and objective evaluations. Subjective evaluation can directly reflect people’s perception, but it depends strongly on the experience and working state of the evaluators. The stability of subjective evaluation results is relatively poor, which is not conducive to the accumulation of data in enterprises.1 Therefore, many researchers have attempted to model the correlation between subjective evaluation results and objective metrics of sound, with the aim of using objective metrics to predict subjective evaluation results. Many methods, such as multiple linear regression (MLR),2 artificial neural network (ANN),3–6 and support vector machine (SVM),7 have been successfully applied to build sound quality prediction models. Based on interior sound samples under wide open throttle (WOT) acceleration, Kwon et al.8 analyzed the influence of psychoacoustic objective metrics such as roughness, sharpness, and pure tone on vehicle sporty sound quality and established a prediction model for vehicle sporty sound quality using MLR. Some researchers have pointed out that the prediction accuracy of ANN models for vehicle sound quality is better than that of MLR models.9,10 Deep learning methods have been applied to establish sound quality prediction models under stable operating conditions, and the experiments show that the prediction results match well with the actual subjective evaluation results.11,12 Recently, the deep convolutional neural network (CNN) has also been used to build an interior sound quality prediction model for vehicles with the temporal spectrum of sound as input, and satisfactory results have been obtained.13
In terms of sporty exhaust sound, many studies on subjective and objective evaluation, analysis, and improvement have also been reported. Hetherington et al.14 synthesized sound samples based on the simulated exhaust order sound and the measured background noise and guided the design of the exhaust order sound according to the subjective evaluation results. Hiscutt and Ishikawa15 pointed out that an appropriate increase in the sound level of harmonic orders and half-orders can improve the sporty feeling of vehicle sound. Reddy et al.16 established a quantitative model of exhaust sound quality with loudness, roughness, sharpness, and pure tone as objective metrics and carried out the improved design of mufflers based on this model. Yan et al.17 analyzed the influence of Orders 2, 4, and 6 on the exhaust sound quality of a 4-cylinder, 4-stroke engine based on subjective evaluation results and proposed a design strategy of order components to enhance the sporty feeling. These results provide a useful reference for analyzing and improving the sporty feeling of exhaust sound. However, two problems still call for new solutions. The first is that, during subjective evaluation, the sound samples under acceleration have a long duration, which makes it difficult for the evaluator to fully remember the sound performance over the full engine speed range and may easily cause misjudgment. The second is that the components of different orders, such as Orders 2, 4, and 6, overlap in frequency over the full engine speed range, which increases the difficulty of sound tuning and muffler design. In this study, we analyze, evaluate, and improve the sporty feeling of the exhaust sound of economical vehicles under acceleration, and we circumvent both problems by dividing the engine speed range into multiple intervals.
The main contributions of this paper are as follows: (1) A sporty feeling evaluation method that divides the engine speed range into intervals is proposed to avoid the misjudgment of evaluators due to the long duration of sound samples. (2) A CNN-support vector regression (SVR) hybrid model is proposed to evaluate sporty exhaust sound quality. (3) Based on the CNN-SVR hybrid model, the sporty feeling of the exhaust sound of an economical vehicle is analyzed and evaluated, and by re-matching the muffler to the economical fuel vehicle, the subjective evaluation score of the sporty feeling of the exhaust sound is improved by 0.63 points.
The remainder of this paper is organized as follows. In Section 1, the CNN-SVR hybrid model is described. In Section 2, the subjective evaluation of the sporty feeling of the exhaust sound under acceleration is carried out, and the engine speed range is divided into intervals to avoid misjudgment by the evaluators due to the long duration of the sound samples. In Section 3, a sporty feeling prediction model of exhaust sound under vehicle acceleration is established based on the CNN-SVR hybrid algorithm. In addition, the prediction results of the model are compared with those of prediction models based on CNN, SVR, and MLR algorithms to demonstrate its advantages in terms of accuracy. In Section 4, a strategy compatible with the frequency crossover characteristics of different order components within the full engine speed range is proposed to enhance the sporty feeling of the exhaust sound. Based on this strategy, a muffler with a different structure is chosen as an improvement measure, which enhances the sporty feeling of the exhaust sound of this vehicle. In Section 5, the conclusions of this paper are summarized.
CNN-SVR hybrid model
The exhaust sound during acceleration is a typical non-stationary signal with complex nonlinear characteristics, which increases the difficulty of subjective and objective evaluation of sound quality. CNN, which achieves excellent performance in image classification,18 object detection,19,20 and other fields, is directly driven by data and establishes mapping relationships between inputs and outputs through multi-dimensional nonlinear feature extraction. CNN has strong data characterization and mapping capabilities21,22 and can achieve higher accuracy and robustness than traditional shallow ANNs.23 The SVR algorithm is a machine learning method based on statistical learning theory and the structural risk minimization principle. It maps multi-dimensional inputs into a higher dimensional feature space through a nonlinear kernel function and then performs regression to obtain the nonlinear mapping relationship between inputs and outputs, which helps to address the problems of small sample size, nonlinearity, the curse of dimensionality, and local minima.7 In view of this, this paper proposes a CNN-SVR hybrid model to evaluate the sporty feeling of the exhaust sound of an economical vehicle under acceleration. The model combines the adaptive multi-dimensional nonlinear feature extraction and characterization capability of CNN with the nonlinear mapping advantage of the SVR algorithm in small sample size scenarios.24 That is, we combine the SVR algorithm with the CNN in order to obtain good sporty feeling evaluation results with only a small number of exhaust sound samples under acceleration.
The structure of the CNN-SVR hybrid model is shown in Figure 1.
Figure 1. The structure of the CNN-SVR hybrid model.
This hybrid model includes an input layer, two convolutional layers, a pooling layer, a fully connected layer, and an SVR output layer. The input layer accepts the input data and performs data preprocessing. The convolutional layer is the core of CNN, and its main function is to automatically extract data features through convolution kernels. The convolution can be expressed as follows:
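Equation (1) is not reproduced in this text. As a hedged sketch only, a commonly used form of a stride-1 two-dimensional convolution followed by an activation is given below. All symbols are assumptions introduced for illustration and may differ from the authors' notation: $X_c$ is the $c$-th input channel, $W_{k,c}$ is the $K_h \times K_w$ kernel linking input channel $c$ to output channel $k$, $b_k$ is a bias, and $f$ is the activation function.

```latex
Y_{k}(i,j) = f\!\left( \sum_{c=1}^{C} \sum_{m=1}^{K_h} \sum_{n=1}^{K_w}
  W_{k,c}(m,n)\, X_{c}(i+m-1,\; j+n-1) + b_{k} \right)
```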
The pooling layer is mainly used to reduce the dimension of the feature matrix obtained by the convolutional layer for improving the calculation efficiency and reducing the risk of overfitting. Maximum pooling and average pooling are common pooling methods. Maximum pooling is applied in this paper. The output features of the pooling layer are expanded into a one-dimensional vector, which is then fed into the fully connected layer network expressed by equation (2).
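The pooling and flattening steps described above can be sketched in a few lines of plain Python. This is an illustrative example, not the authors' implementation; the function names and the 4 × 4 feature map are hypothetical.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max pooling over a 2-D list; windows that do not fit are dropped."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + dr][c + dc]
                      for dr in range(size) for dc in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


def flatten(feature_map):
    """Expand a 2-D feature map into a one-dimensional vector (row-major)."""
    return [value for row in feature_map for value in row]


# Hypothetical 4 x 4 feature map produced by a convolutional layer.
example = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [2.0, 6.0, 3.0, 1.0],
]
pooled = max_pool_2d(example)   # 2 x 2 map of window maxima
vector = flatten(pooled)        # input to the fully connected layer
print(pooled, vector)
```

Each 2 × 2 window is reduced to its largest value, so the feature map shrinks by a factor of four before it is passed on.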
The final layer is SVR output layer, which can be expressed in equation (3):
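Equation (3) is not reproduced in this text. The standard kernel form of an SVR regression function is sketched below as an assumption; the authors' notation may differ. Here $\mathbf{x}$ would be the feature vector delivered by the fully connected layer, $\mathbf{x}_i$ are the training feature vectors, $\alpha_i$ and $\alpha_i^{*}$ are the Lagrange multipliers, $K$ is the kernel function, and $b$ is the bias.

```latex
f(\mathbf{x}) = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) K(\mathbf{x}_i, \mathbf{x}) + b
```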
Common activation functions include the Sigmoid function, the Tanh function, and the ReLU function. The Sigmoid and Tanh functions involve exponential operations, which increase the computational effort, and they are prone to the vanishing gradient problem. The ReLU function is relatively simple, which can significantly improve computational efficiency and speed up model training, and it is more widely used when computing power is limited.25 Therefore, the ReLU function is selected in this paper.
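The difference between these activation functions can be illustrated numerically. The sketch below is an illustration only (the function names are chosen here, not taken from the paper): it evaluates the derivative of each function, showing that the Sigmoid and Tanh gradients shrink toward zero for large inputs while the ReLU gradient stays at 1 for positive inputs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def d_sigmoid(x):
    """Derivative of the Sigmoid function; its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def d_tanh(x):
    """Derivative of the Tanh function; its maximum is 1 at x = 0."""
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def d_relu(x):
    """Derivative of ReLU (taken as 0 at x = 0 by convention)."""
    return 1.0 if x > 0 else 0.0


for x in (0.5, 2.0, 5.0):
    print(f"x={x}: sigmoid'={d_sigmoid(x):.5f}, "
          f"tanh'={d_tanh(x):.5f}, relu'={d_relu(x):.1f}")
```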
In summary, the hybrid CNN-SVR model developed in this paper extracts data features of exhaust sound based on convolutional neural networks and uses the extracted features as input to SVR to evaluate the sporty feeling of exhaust sound.
Exhaust sound acquisition and subjective evaluation
As shown in Figure 2(a), the exhaust sound of an economical vehicle equipped with a 4-cylinder, 4-stroke engine is acquired by the Head Measure System (HMS). The vehicle was accelerated at full throttle on a chassis dynamometer in a semi-anechoic chamber with an engine speed range of 1000–4500 r/min. The height of the ears of HMS from the ground is the same as that of the exhaust tailpipe. The connecting line between the center point of ears of HMS and the exhaust tailpipe nozzle forms 45° with the axis of the exhaust tailpipe and is 0.5 m away from the exhaust tailpipe nozzle. The total A-weighted sound pressure level (ASPL) and order components of exhaust sound are shown in Figure 2(b). In order to obtain better generalization performance of the model, the exhaust sound of 29 other competing vehicles was also collected.
Figure 2. Data acquisition of exhaust sound. (a) Site photo and (b) overall ASPLs and ASPLs of typical order components.
10-rating scale method.
Design of order components for exhaust sound samples.
The subjective evaluation is performed in a special sound quality evaluation room, as shown in Figure 3(a). The sound playback system consists of Sennheiser HD600 headset, LabP2 sound card, and computer. Kendall correlation analysis is used to validate the evaluation results of the 52 evaluators, as shown in Figure 3(b). From Figure 3(b), it can be concluded that the average correlation values of 44 evaluators are above 0.75, which indicates that the subjective evaluation results of the 44 evaluators achieve good consistency. Therefore, we average the subjective evaluation scores of these 44 evaluators and use the obtained average score as the sporty feeling score of each exhaust sound sample.
Figure 3. Subjective evaluation. (a) Site photo and (b) the Kendall correlation coefficient of evaluators.
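The screening of evaluators can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not specify which Kendall statistic is used, so the example uses the simple tau-a rank correlation between pairs of evaluators, averages it per evaluator, and keeps those above 0.75. The evaluator names and scores are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def mean_correlation_per_evaluator(scores):
    """Average tau of each evaluator against all other evaluators."""
    result = {}
    for name in scores:
        taus = [kendall_tau_a(scores[name], scores[other])
                for other in scores if other != name]
        result[name] = sum(taus) / len(taus)
    return result


# Hypothetical scores of four evaluators for the same five sound samples.
scores = {
    "E1": [3, 5, 6, 8, 9],
    "E2": [2, 5, 7, 8, 9],
    "E3": [3, 4, 6, 7, 9],
    "E4": [5, 3, 4, 8, 6],  # ranks the samples differently from the others
}
mean_tau = mean_correlation_per_evaluator(scores)
kept = [name for name, tau in mean_tau.items() if tau > 0.75]
print(mean_tau, kept)
```

The scores of the retained evaluators would then be averaged per sample to give the sporty feeling score.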
Establishment of predicted model
Objective parameters
Establishing a mathematical model for predicting the sporty feeling of exhaust sound can not only avoid misjudgments caused by individual differences of evaluators but also provide effective guidance for further improvement of exhaust sound. A reasonable choice of objective parameters is the key to the validity of the prediction model. Pearson correlation coefficients are often used to verify the correlation between objective parameters and subjective evaluation results. When the absolute value of the Pearson correlation coefficient is greater than 0.6, the objective parameter is usually considered to correlate strongly with the subjective evaluation results and can be used as an input of the prediction model. Figure 4 shows the Pearson correlation coefficients between sporty feeling scores of exhaust sounds and the ASPLs of Orders 2, 4, and 6 within each engine speed interval. From Figure 4, it can be concluded that the ASPLs of Orders 2, 4, and 6 correlate highly with sporty feeling and can be used to build the prediction model.
Figure 4. The Pearson correlation coefficients between sporty feeling scores of exhaust sounds and the ASPLs of Orders 2, 4, and 6 within each engine speed interval. (a) Order 2, 1000–2000 r/min, (b) Order 4, 1000–2000 r/min, (c) Order 6, 1000–2000 r/min, (d) Order 2, 2000–3000 r/min, (e) Order 4, 2000–3000 r/min, (f) Order 6, 2000–3000 r/min, (g) Order 2, 3000–4500 r/min, (h) Order 4, 3000–4500 r/min, and (i) Order 6, 3000–4500 r/min.
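The selection rule based on the Pearson correlation coefficient can be sketched as below. The ASPL values and scores are made-up numbers for illustration, not data from this study, and the helper names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def is_candidate_input(aspl, scores, threshold=0.6):
    """Keep a parameter when |r| exceeds the 0.6 threshold from the text."""
    return abs(pearson_r(aspl, scores)) > threshold


# Hypothetical average ASPLs in dB(A) for five samples in one speed interval.
order2_aspl = [78.0, 75.0, 72.0, 70.0, 66.0]
order4_aspl = [60.0, 63.0, 65.0, 69.0, 72.0]
scores = [5.0, 5.6, 6.1, 6.9, 7.4]  # hypothetical sporty feeling scores

print(round(pearson_r(order2_aspl, scores), 3))  # strongly negative
print(round(pearson_r(order4_aspl, scores), 3))  # strongly positive
```

A negative coefficient for Order 2 and positive coefficients for Orders 4 and 6 would match the trend reported in this paper; both signs pass the selection rule because it uses the absolute value.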
Because the regression coefficients of MLR directly reflect the influence of the objective parameters on the sporty feeling of exhaust sound, MLR is first applied to establish the prediction model corresponding to each engine speed interval, expressed as equation (4).
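Equation (4) is not reproduced in this text. A generic three-predictor MLR form consistent with the description is sketched below. The symbols are assumptions introduced for illustration: $S_k$ is the predicted sporty feeling score in engine speed interval $k$, $L_{2,k}$, $L_{4,k}$, and $L_{6,k}$ are the average ASPLs of Orders 2, 4, and 6 in that interval, and $\beta_{0,k},\dots,\beta_{3,k}$ are the regression coefficients.

```latex
S_k = \beta_{0,k} + \beta_{1,k}\, L_{2,k} + \beta_{2,k}\, L_{4,k} + \beta_{3,k}\, L_{6,k}
```

Given the reported finding that a lower Order 2 ASPL and higher Order 4 and 6 ASPLs sound sportier, one would expect $\beta_{1,k} < 0$ and $\beta_{2,k}, \beta_{3,k} > 0$ in such a form.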
Regression coefficients of MLR within each engine speed interval.
Predicted model of CNN-SVR
Exhaust sound under acceleration is a typical non-stationary signal. The MLR prediction model described above uses the average ASPLs over each speed interval as objective parameters, ignoring the variation of ASPLs with engine speed and the effect of this variation on the evaluators' perception. In addition, the nonlinear characteristics of human sound perception and the complexity of extracting objective features from non-stationary signals make it difficult for a linear model to achieve satisfactory results. Therefore, the CNN-SVR hybrid model, described in detail in the section "CNN-SVR hybrid model", is applied in this subsection to establish a prediction model of the sporty feeling of exhaust sound.
From the conclusion in Section 3.1, the ASPLs of Orders 2, 4, and 6 have high correlations with the sporty feeling of exhaust sound. Therefore, sound feature matrices containing Orders 2, 4, and 6 are chosen as the input of the CNN-SVR model. The ASPLs of Orders 2, 4, and 6 are calculated for every 100 r/min of engine speed and used as the elements of the sound feature matrices. Therefore, within 1000–2000 r/min and 2000–3000 r/min, the dimensions of the sound feature matrices are 3 × 10. Within 3000–4500 r/min, the dimensions of the sound feature matrix are 3 × 15, as shown in Figure 5.
Figure 5. Sound feature matrices of exhaust sound. (a) 1000–2000 r/min, (b) 2000–3000 r/min, and (c) 3000–4500 r/min.
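The construction of the feature matrices can be sketched as follows. This is an illustration under assumptions: the text does not state whether each 100 r/min bin is represented by its lower edge, center, or an average, so the sketch simply indexes bins by their lower edge, and `placeholder_aspl` is a hypothetical stand-in for the measured order ASPLs.

```python
ORDERS = (2, 4, 6)
STEP_RPM = 100


def speed_bins(start_rpm, stop_rpm, step=STEP_RPM):
    """Lower edges of the 100 r/min bins covering [start_rpm, stop_rpm)."""
    return list(range(start_rpm, stop_rpm, step))


def build_feature_matrix(order_aspl, start_rpm, stop_rpm):
    """One row per order (2, 4, 6), one column per 100 r/min bin."""
    bins = speed_bins(start_rpm, stop_rpm)
    return [[order_aspl(order, rpm) for rpm in bins] for order in ORDERS]


def placeholder_aspl(order, rpm):
    """Synthetic ASPL in dB(A); a real study would use measured values."""
    return 50.0 + 0.005 * rpm + order


intervals = [(1000, 2000), (2000, 3000), (3000, 4500)]
shapes = []
for start, stop in intervals:
    matrix = build_feature_matrix(placeholder_aspl, start, stop)
    shapes.append((len(matrix), len(matrix[0])))
print(shapes)
```

The resulting shapes reproduce the 3 × 10, 3 × 10, and 3 × 15 dimensions stated above.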
In the CNN-SVR hybrid model of the sporty feeling of exhaust sound, the kernel size of the first convolutional layer is 2 × 3 and the number of channels is 6. The stride is 1 and the padding is 2 in the first convolutional layer, where the activation function is the ReLU function. The kernel size of the second convolutional layer is 3 × 3 and the number of channels is 12. The stride is 1 and the padding is 0 in the second convolutional layer, where the activation function is also the ReLU function. The pooling layer uses maximum pooling with a size of 2 × 2 and a stride of 2. The last layer is the SVR output layer. The CNN-SVR prediction model is trained over 400 epochs with an initial learning rate of 0.0001 and a mini-batch size of 4. The Stochastic Gradient Descent with Momentum (SGDM) optimization algorithm is used as the solver.
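The layer settings above determine the size of the feature vector that reaches the fully connected and SVR layers. The sketch below works through that arithmetic with the usual output-size formula. It assumes that the kernel size is given as height × width, that the padding is applied symmetrically in both dimensions, that pooling discards incomplete windows, and that the layers are ordered convolution, convolution, pooling; none of these details is confirmed by the text.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_feature_count(height, width):
    """Return (flattened feature count, (channels, height, width))."""
    # First convolutional layer: 2 x 3 kernel, stride 1, padding 2.
    h = conv_out(height, kernel=2, stride=1, padding=2)
    w = conv_out(width, kernel=3, stride=1, padding=2)
    # Second convolutional layer: 3 x 3 kernel, 12 channels, stride 1, no padding.
    h = conv_out(h, kernel=3)
    w = conv_out(w, kernel=3)
    # Maximum pooling: 2 x 2 window, stride 2.
    h = conv_out(h, kernel=2, stride=2)
    w = conv_out(w, kernel=2, stride=2)
    channels = 12
    return channels * h * w, (channels, h, w)


print(cnn_feature_count(3, 10))  # 3 x 10 input (1000-2000 and 2000-3000 r/min)
print(cnn_feature_count(3, 15))  # 3 x 15 input (3000-4500 r/min)
```

Under these assumptions the 3 × 10 input yields a 12 × 2 × 5 feature map and the 3 × 15 input yields 12 × 2 × 7, so the model for the highest speed interval receives a longer feature vector than the other two.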
In each engine speed interval, six exhaust sounds are selected randomly as the test set to verify the validity and fitness of the prediction model. In total, 187 exhaust sound samples are randomly selected as the training set and 47 exhaust sound samples are selected as the validation set. In the CNN-SVR model, the sound feature matrices of the training set and validation set are the inputs and the subjective evaluation results are the outputs. Figure 6 shows the change of the root mean square errors (RMSEs), expressed in equation (5). The RMSEs and the mean absolute errors (MAEs), expressed in equation (6), of the CNN-SVR hybrid model, the traditional CNN model, the SVR model, and the MLR model are listed in Table 4. From Figure 6 and Table 4, it can be concluded that the RMSEs of the training set and validation set of the CNN-SVR hybrid model gradually decrease as the number of epochs increases. After 400 epochs, the CNN-SVR hybrid model converges. The RMSEs and MAEs of the CNN-SVR hybrid model are smaller than those of the traditional CNN model, the SVR model, and the MLR model, which indicates that the CNN-SVR hybrid model achieves higher prediction accuracy.
Figure 6. The training process of CNN-SVR hybrid model for sporty feeling of exhaust sound. (a) 1000–2000 r/min, (b) 2000–3000 r/min, and (c) 3000–4500 r/min.
Table 4. The RMSEs and MAEs of predicted models.
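The two error metrics can be sketched with their standard definitions, which are assumed here to match equations (5) and (6) since those equations are not reproduced in this text. The score values are hypothetical.

```python
import math


def rmse(predicted, target):
    """Root mean square error between predicted and target scores."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, target):
    """Mean absolute error between predicted and target scores."""
    absolute = [abs(p - t) for p, t in zip(predicted, target)]
    return sum(absolute) / len(absolute)


# Hypothetical predicted and subjective sporty feeling scores.
predicted = [6.0, 7.0, 5.0, 8.0]
subjective = [6.5, 6.5, 5.5, 7.5]
print(rmse(predicted, subjective), mae(predicted, subjective))
```

Because RMSE squares the errors before averaging, it penalizes occasional large deviations more than MAE does, which is why reporting both gives a fuller picture of model accuracy.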

In order to verify the validity and fitness of the prediction model, the test set of six exhaust sound samples is analyzed. The comparisons between the predicted results of the above four models and the subjective evaluation results of the test set are shown in Figures 7 and 8. Figure 7 shows the comparisons within each engine speed interval. Figure 8 shows the comparisons within the full engine speed range, where the averages of the predicted scores and of the subjective evaluation scores over the engine speed intervals are regarded as the predicted and subjective evaluation results in the full engine speed range. From Figures 7 and 8, it can be seen that the relative errors of the CNN-SVR hybrid model are mostly less than 2% and are smaller than those of the other three models, which indicates that the CNN-SVR prediction model can more accurately predict the sporty feeling of exhaust sound. Therefore, the CNN-SVR hybrid model is more favorable for future muffler design and exhaust acoustics tuning.
Figure 7. The comparisons between the predicted results and the subjective evaluation results within each engine speed interval. (a) Scoring results of 1000–2000 r/min, (b) relative errors of 1000–2000 r/min, (c) scoring results of 2000–3000 r/min, (d) relative errors of 2000–3000 r/min, (e) scoring results of 3000–4500 r/min, and (f) relative errors of 3000–4500 r/min.
Figure 8. The comparisons between the predicted results and the subjective evaluation results within the full engine speed range. (a) Scoring results and (b) relative errors.
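The comparison over the full engine speed range can be sketched as below. Two assumptions are made: the relative error is taken with respect to the subjective score, and the full-range score is the unweighted mean of the three interval scores. The numbers are hypothetical.

```python
def relative_error_percent(predicted, subjective):
    """Relative error of a predicted score, in percent of the subjective score."""
    return abs(predicted - subjective) / subjective * 100.0


def full_range_score(interval_scores):
    """Unweighted mean of the scores of the engine speed intervals."""
    return sum(interval_scores) / len(interval_scores)


# Hypothetical scores of one sample in the three engine speed intervals.
predicted_by_interval = [6.1, 6.8, 7.4]
subjective_by_interval = [6.0, 6.9, 7.5]

predicted_full = full_range_score(predicted_by_interval)
subjective_full = full_range_score(subjective_by_interval)
error_full = relative_error_percent(predicted_full, subjective_full)
print(round(error_full, 2))
```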

Improvement on sporty feeling of exhaust sound for the economical vehicle
From the above analysis in Section 3.1, reducing the sound energy in the low frequency band of 33–100 Hz and increasing the sound energy in the middle frequency band of 100–450 Hz can improve the sporty feeling of the exhaust sound of the economical vehicle. Therefore, the structure of the front and rear mufflers is modified, as shown in Figure 9. The external dimensions of the original muffler and the improved muffler are the same. The transmission loss of the original and improved mufflers is shown in Figure 10. Within the frequency band of 0–75 Hz, the transmission loss of the improved muffler is greater than that of the original muffler. Within the frequency band of 75–100 Hz, the transmission loss of the original and improved mufflers is almost equivalent. Within the frequency band of 100–450 Hz, the transmission loss of the improved muffler is less than that of the original muffler. Therefore, the difference in muffler transmission loss before and after the improvement is consistent with the tuning strategy developed above for the sporty feeling of the exhaust sound. Figure 11 shows the exhaust sound of the economical vehicle with the original muffler and the improved muffler at full throttle acceleration in 3rd gear. Within 1000–4500 r/min, the 4th and 6th order ASPLs of the economical vehicle with the improved muffler are 10–20 dB(A) higher than those with the original muffler.
Figure 9. The original and improved mufflers. (a) Original front muffler, (b) original rear muffler, (c) improved front muffler, and (d) improved rear muffler.
Figure 10. The transmission loss of the original and improved mufflers.
Figure 11. The ASPL comparison of exhaust sound of the economical vehicle with the original muffler and the improved muffler at full throttle acceleration in 3rd gear. (a) Overall, (b) Order 2, (c) Order 4, and (d) Order 6.
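The frequency bands quoted above follow from the relation between engine order, engine speed, and frequency, f = order × n / 60 with n in r/min. The short sketch below evaluates this relation over 1000–4500 r/min; the function names are chosen here for illustration, and the reading that the 33–100 Hz band corresponds to Order 2 up to 3000 r/min is an interpretation rather than a statement from the text.

```python
def order_frequency_hz(order, rpm):
    """Frequency of an engine order at a given crankshaft speed."""
    return order * rpm / 60.0


def order_band_hz(order, rpm_min, rpm_max):
    """Frequency range swept by an order between two engine speeds."""
    return order_frequency_hz(order, rpm_min), order_frequency_hz(order, rpm_max)


for order in (2, 4, 6):
    low, high = order_band_hz(order, 1000, 4500)
    print(f"Order {order}: {low:.1f}-{high:.1f} Hz")
```

Order 2 starts at about 33 Hz and reaches 100 Hz at 3000 r/min, while Order 6 spans 100–450 Hz. The three ranges overlap, which is the frequency crossover that makes a fixed-band muffler design affect more than one order at once.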


The sporty feeling of exhaust sound of the economical vehicle with the original and improved mufflers.
Conclusions
In this paper, the sporty feeling of the exhaust sound of an economical vehicle equipped with a 4-cylinder, 4-stroke engine during acceleration is evaluated, analyzed, and improved. Firstly, a sporty feeling evaluation method that divides the engine speed range into intervals is proposed to avoid the misjudgment of evaluators due to the long duration of sound samples. The influences of exhaust sound order components on sporty feeling are analyzed. The results show that the exhaust sound has a better sporty feeling when the ASPL of Order 2 is lower and the ASPLs of Orders 4 and 6 are higher. Then, a CNN-SVR hybrid model for predicting the sporty feeling of exhaust sound is developed. The model has a prediction error of less than 2% and outperforms the traditional CNN, SVR, and MLR models. Finally, a sound tuning strategy adapted to the frequency crossover characteristics of different order components within the full engine speed range is proposed to enhance the sporty feeling of the exhaust sound of the economical vehicle. Based on this strategy, a new muffler with a different structure is selected and installed on the economical vehicle, and the sporty feeling score of the exhaust sound is 0.63 points higher than before.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Open Foundation of the State Key Laboratory of Vehicle NVH and Safety Technology (Grant No. NVHSKL-202202).
