Abstract
With the increasing demand for marine resource development, national security, and marine environmental monitoring, underwater target detection technology has become increasingly important. The radiated noise signal of an underwater target is the core basis for identifying it, so accurate recognition of this signal is crucial. This paper proposes a classification method for underwater target radiated noise signals based on enhanced images and a bio-inspired algorithm. The signals collected by the hydrophone are converted into enhanced images, and a convolutional neural network is established for image classification, leveraging the strengths of convolutional neural networks in image processing to improve classification accuracy. To enhance computational efficiency and classification accuracy, bio-inspired optimization algorithms are used to optimize the hyperparameters of the convolutional neural network, providing a new approach for the classification and recognition of underwater target radiated noise signals.
Keywords
Introduction
With the growing demand for ocean resource exploration, national security, and marine environmental monitoring, the importance of underwater target detection technology has become increasingly prominent. Underwater targets such as submarines, unmanned underwater vehicles (UUVs), and marine organisms generate unique radiated noise, which serves as the core basis for sonar systems to identify, track, and classify targets. However, the ocean itself is a nonlinear system, and underwater acoustic signals exhibit non-stationary, non-Gaussian, and nonlinear characteristics. Additionally, continuous advancements in stealth technology for underwater targets have rendered traditional detection methods increasingly ineffective. Meanwhile, civilian applications such as fishery resource monitoring, subsea pipeline maintenance, and ecological protection also rely on high-precision underwater target recognition. As the primary information source for passive sonar, radiated noise plays a crucial role, and its effective feature extraction directly determines the performance of detection systems. The complexity of the underwater environment and the stealthy nature of target noise present significant challenges to feature extraction.
Traditional methods rely heavily on time-frequency analysis (e.g. short-time Fourier transform, wavelet transform)1–4 and statistical features (e.g. Mel-frequency cepstral coefficients),5–7 but they have difficulty handling non-stationary signals. Machine learning methods developed in recent years (e.g. support vector machines, convolutional neural networks)8–10 improve classification accuracy but suffer from insufficient model generalization ability and high computational complexity. In addition, physical mechanism-based characterization (e.g. modulation spectrum analysis)11–13 depends heavily on a priori knowledge and is less adaptable.
Current research is moving toward multimodal fusion (acoustic, image, etc.), adaptive noise reduction and small sample learning. Sun et al.14,15 proposed a deep learning-based single-channel multi-target underwater acoustic signal recognition method. Experimental results show that when using complex Short-Time Fourier Transform (STFT) spectrum, magnitude STFT spectrum, and logarithmic Mel spectrum as network inputs, this method can effectively identify synthetic multi-target ship signals. Ren et al.16–19 proposed a Feature Enhancement Network (FEAN) based on an attention mechanism for underwater acoustic target recognition. This method utilizes the line spectrum and modulation information of underwater target radiated noise signals and incorporates a learnable attention module. Experimental results on the publicly available DeepShip and ShipsEar datasets demonstrate that, compared to traditional recognition models using Short-Time Fourier Transform (STFT) and Fbank features as inputs, FEAN achieves an accuracy improvement of 4.5% and 1.2% on the two datasets, respectively. Yang et al.20–24 proposed a novel end-to-end deep neural network called the Auditory Perception-Inspired Deep Convolutional Neural Network. This model incorporates a set of multi-scale deep convolutional filters to decompose raw time-domain signals into signals with different frequency components. Experimental results indicate that the auditory perception-inspired deep learning approach shows promising potential in enhancing the classification performance of underwater acoustic signals.
This paper presents a classification method for underwater target radiated noise signals based on enhanced images and bio-inspired optimization algorithms. The hydrophone-collected underwater target radiated noise signals are converted into enhanced images, and a convolutional neural network (CNN) is established for image classification. Leveraging the image processing capabilities of CNNs improves classification accuracy. To further enhance computational efficiency and classification performance, bio-inspired optimization algorithms are applied to optimize the hyperparameters of the CNN. This approach provides a novel perspective for the classification and recognition of underwater target radiated noise signals.
The structure of this paper is as follows: Section “Background” provides an overview of the enhanced image method, convolutional neural networks, and bio-inspired optimization algorithms. Section “Proposed method” details the mathematical foundation of the joint classification method. Section “Experimental results” presents experimental comparisons of various optimization methods, including descriptions of the experimental data, methodology, and results. Section “Discussion” discusses the findings, and Section “Conclusions” concludes the study.
Background
In this section, we discuss the fundamentals of the enhanced image conversion method, Convolutional Neural Networks (CNNs), and bio-inspired optimization algorithms.
Enhanced image conversion method
The detailed steps for the Enhanced Image Conversion Method are outlined as follows:
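Step1: Given an underwater target radiated noise signal $X = \{x_1, x_2, \ldots, x_N\}$, the extracted trajectory is:

$$\vec{s}_i = \left(x_i,\ x_{i+\tau},\ \ldots,\ x_{i+(m-1)\tau}\right), \quad i = 1, 2, \ldots, N-(m-1)\tau$$

where $m$ is the dimension of the trajectory and $\tau$ is the time delay.
Step2: Calculate the matrix $R$, which represents the similarity relation between the points in the phase space:

$$R_{i,j} = \Theta\left(\varepsilon - \lVert \vec{s}_i - \vec{s}_j \rVert\right)$$

where $\Theta$ is the Heaviside function and $\varepsilon$ is the threshold.
Step3: Calculate the image matrix $I$. Through color mapping, the matrix $R$ and a color mapping function are used to obtain the image matrix $I$.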
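The three conversion steps can be illustrated with a minimal sketch. It assumes a delay embedding for the trajectory, the Euclidean norm for the distance between trajectory points, and a simple two-color mapping; the function names, the test signal, and the values of m, τ, and ε are illustrative assumptions, not settings taken from this paper.

```python
import math

def embed(signal, m, tau):
    """Step 1: delay-embed a 1-D signal into m-dimensional trajectory points."""
    n_points = len(signal) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for the chosen m and tau")
    return [[signal[i + k * tau] for k in range(m)] for i in range(n_points)]

def recurrence_matrix(points, eps):
    """Step 2: R[i][j] = 1 when two trajectory points lie within eps, else 0."""
    return [[1 if math.dist(p, q) <= eps else 0 for q in points] for p in points]

def to_image(R, on=(255, 255, 255), off=(0, 0, 0)):
    """Step 3: hypothetical two-colour mapping from R to RGB pixels."""
    return [[on if value else off for value in row] for row in R]

# Illustrative input: a sine wave with a period of 20 samples.
signal = [math.sin(math.pi * t / 10) for t in range(100)]
points = embed(signal, m=3, tau=2)
R = recurrence_matrix(points, eps=0.2)
image = to_image(R)
```

Because the test signal is periodic, trajectory points one period apart coincide, which appears in the image as lines parallel to the main diagonal.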
CNN
A Convolutional Neural Network (CNN) is a powerful deep learning model commonly applied to tasks such as image processing, natural language processing, and computer vision. A typical CNN architecture comprises several key components: convolutional layers, pooling layers, activation functions, and fully connected layers. CNNs are particularly effective in tasks like object detection, image classification, and facial recognition, owing to their ability to efficiently capture spatial hierarchies in data. This makes CNNs an ideal choice for the classification of underwater target radiated noise signals. The basic framework of a CNN is illustrated in Figure 1.

Figure 1. The basic framework of a CNN.
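To make the roles of the four components concrete, the following toy forward pass chains a convolution, a ReLU activation, max pooling, and a fully connected layer. It is a sketch in plain Python with hand-picked weights, assumed purely for illustration; it is not the network used in this paper.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation with stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Element-wise activation: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling (stride equals the window size)."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def dense(features, weights, bias):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, bias)]

# 6x6 toy image containing a vertical edge, and a vertical-edge kernel.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
kernel = [[-1, 0, 1] for _ in range(3)]

pooled = max_pool(relu(conv2d(image, kernel)))
features = [v for row in pooled for v in row]          # flatten
scores = dense(features, [[0.25] * 4, [-0.25] * 4], [0.0, 0.0])
```

The convolution responds only where the edge is present, pooling keeps the strongest response in each 2 × 2 window, and the fully connected layer turns the flattened features into one score per class.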
Bio-inspired optimization algorithms
With the increasing complexity of practical problems, traditional mathematical optimization methods face many challenges in solving high-dimensional, nonlinear, and multi-objective problems. For this reason, Bio-inspired Algorithms have gradually become an important tool for solving complex optimization problems. These algorithms learn from the mechanisms of biological evolution, group cooperation, and adaptive adjustment in nature, and explore the large-scale search space to find the optimal or near-optimal solution with the ability of global search and adaptive optimization. The algorithms used in this work include:
The Bitterling Fish Optimization algorithm (BFO) is a bio-inspired optimization algorithm that simulates the reproductive behavior of bitterling fish to find optimal solutions in complex search spaces. The algorithm is inspired by how male bitterlings search for and defend spawning sites and how females select mates, and these behaviors are modeled to perform efficient optimization. It belongs to the category of swarm intelligence algorithms, where multiple individuals (or agents) cooperate and share information to converge toward an optimal solution.
The Horned Lizard Optimizer (HLO) is a bio-inspired optimization algorithm based on the unique defense and survival behaviors of the horned lizard, a reptile known for its remarkable ability to adapt to harsh environments and escape predators. The algorithm simulates the horned lizard’s adaptive mechanisms, such as its ability to change its behavior based on threats and its defense strategies, to solve optimization problems in complex, high-dimensional search spaces.
The Red-tailed Hawk Algorithm (RTH) is a bio-inspired optimization algorithm based on the hunting behaviors and strategies of the red-tailed hawk, a bird of prey known for its exceptional agility, keen vision, and strategic hunting techniques. This algorithm mimics the hawk’s hunting tactics, particularly its approach to tracking, capturing, and handling prey, to search for optimal solutions in complex optimization problems.
The Wild Horse Optimizer (WHO) is a bio-inspired optimization algorithm based on the natural behaviors and social interactions of wild horses, particularly their herding and foraging behaviors. It belongs to the class of swarm intelligence algorithms and is designed to find optimal solutions in complex, high-dimensional search spaces. The algorithm takes inspiration from how wild horses move in herds and adapt to their environment in search of food and safety.
The Love Evolution Algorithm (LEA) is a bio-inspired optimization technique that simulates the behaviors and processes associated with romantic love and mating rituals in nature. The algorithm is based on the concept of “love,” where solutions (representing individuals) in a population evolve over time through interactions, exploration, and competition, similar to how living organisms evolve through the process of natural selection and courtship behaviors.
The Butterfly Optimization Algorithm (BOA) is a bio-inspired optimization algorithm based on the foraging and mating behaviors of butterflies, which are known for their graceful flight patterns and complex interactions with their environment. The algorithm mimics these natural behaviors to find optimal solutions in complex optimization problems, particularly in high-dimensional and multimodal landscapes.
The Starfish Optimization Algorithm (SFOA) is a bio-inspired optimization algorithm based on the natural behaviors of starfish, a marine species known for its remarkable regenerative abilities and efficient foraging techniques. The algorithm mimics the movement, foraging, and regenerative capabilities of starfish to explore and exploit the solution space effectively, making it a promising tool for solving complex optimization problems.
Proposed method
Underwater target radiated noise signals are nonlinear, non-stationary, and highly noisy, which makes it challenging to classify them accurately while preserving essential signal characteristics. To address these issues, we propose a BFO-optimized VGG-16 model to classify the enhanced images generated by the enhanced image conversion method. This approach aims to improve classification accuracy while retaining as many crucial features of the underwater target radiated noise signal as possible. We describe the pretrained CNN model VGG-16 in Part a, the bitterling fish optimization algorithm in Part b, and the proposed framework in Part c.
VGG-16
We select VGG-16 to complement the Bio-inspired Optimization Algorithm primarily due to its consistent use of 3 × 3 convolution kernels throughout the architecture. This design allows the network to capture fine-grained details and contextual information while effectively expanding the receptive field. Such a configuration enables deeper network structures, facilitates the extraction of richer feature representations, and enhances overall model performance. Additionally, the combination of multiple small convolution kernels increases the nonlinearity of the network, thereby improving its expressive capability. Furthermore, VGG-16 employs 2 × 2 max-pooling with a stride of 2, which efficiently preserves the most salient features by selecting the maximum value within local regions, thereby strengthening feature representation and improving discriminative power.
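The effect of stacking small kernels can be checked with a short calculation. The sketch below computes the receptive field of stacked stride-1 convolutions, compares the weight count of three 3 × 3 layers with a single 7 × 7 layer, and tracks the feature-map size through repeated 2 × 2 max pooling. The channel count of 64 and the 224-pixel input size are assumptions chosen for illustration.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)

def conv_weights(kernel, channels, num_layers=1):
    """Weight count (biases ignored) with `channels` inputs and outputs per layer."""
    return num_layers * kernel * kernel * channels * channels

def pooled_size(size, window=2, stride=2):
    """Spatial size after one max-pooling layer without padding."""
    return (size - window) // stride + 1

rf_two = stacked_receptive_field(2)      # two 3x3 layers see a 5x5 region
rf_three = stacked_receptive_field(3)    # three 3x3 layers see a 7x7 region

stacked = conv_weights(3, channels=64, num_layers=3)   # 27 * C^2
single = conv_weights(7, channels=64)                  # 49 * C^2

sizes = [224]                            # assumed input resolution
for _ in range(5):                       # VGG-16 has five pooling stages
    sizes.append(pooled_size(sizes[-1]))
```

Three stacked 3 × 3 layers cover the same 7 × 7 region as one 7 × 7 layer with roughly 45% fewer weights and two additional nonlinearities, while each pooling stage halves the spatial resolution.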
Bitterling fish optimization algorithm
In the animal kingdom, many species aim to reproduce as early as possible. For fish, reproduction typically occurs when the male and female approach each other, releasing sperm and eggs into the water. However, this method of reproduction has a significant drawback: the newly hatched fish are exposed to various external dangers, making them highly vulnerable to predation by other animals.
The bitterling fish exhibits a distinct reproductive mechanism. This species spawns in freshwater mollusks, described here as oysters, and it is the male’s responsibility to locate suitable shells for spawning. Males seek larger oysters, which provide more space to accommodate the eggs. Once the male has found one or more suitable oysters, it must defend them from potential competitors. Male bitterlings exhibit aggressive behaviors during this process, as other males also attempt to claim the oysters as their own. In these territorial battles, the males’ body color darkens, signaling their readiness to fight and defend their chosen spawning site.
During the mating phase, sexual selection plays a key role. Female bitterlings select mates based on the males’ color and physical strength. Females tend to prefer larger males with more vibrant body colors. Once a mate is chosen, the female lays her eggs inside the oyster, and the male fertilizes them with sperm. This reproductive strategy not only ensures the successful fertilization of eggs but also facilitates the survival of offspring through selective mate choice, with the larger, more colorful males often being selected due to their superior traits.
The steps of the algorithm are as follows:
Step 1: Initialization
The initial positions of the bitterling fish in the search space are defined randomly. Let the population size be N, and the number of decision variables be D. The position of the
Step 2: Objective function
The objective function
Step 3: Bitterling fish movement
The movement of each bitterling fish combines exploration and exploitation. The movement is defined as:
Where
Step 4: Evaluation
After updating the positions, each fish’s fitness is evaluated by the objective function
Step 5: Update
The position of the fish is updated based on its fitness. If the fitness of fish
Step 6: Termination criteria
Define criteria to terminate the algorithm. Mathematically, the termination condition can be expressed as:
Where T is the Maximum population.
Step 7: Result analysis
Once the algorithm terminates, the best solution found is:
Where
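The seven steps follow the usual structure of a population-based optimizer, which can be sketched as below. The movement rule in Step 3 is a generic placeholder (a drift toward the best individual plus a decaying random perturbation) and is explicitly not the BFO update rule; the function name, the minimization convention, and all parameter values are assumptions made for illustration.

```python
import random

def population_search(objective, lower, upper, pop_size=20, max_iter=100, seed=0):
    """Generic population-based minimiser following Steps 1 to 7."""
    rng = random.Random(seed)
    dim = len(lower)
    # Step 1: random initial positions inside the search bounds.
    pop = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(pop_size)]
    # Step 2: evaluate the objective function for every individual.
    fit = [objective(x) for x in pop]
    best_i = min(range(pop_size), key=fit.__getitem__)
    best_x, best_f = pop[best_i][:], fit[best_i]

    for t in range(max_iter):            # Step 6: stop after max_iter iterations
        scale = 1.0 - t / max_iter       # perturbation shrinks over time
        for i in range(pop_size):
            # Step 3: placeholder move (NOT the BFO rule).
            cand = []
            for d in range(dim):
                drift = rng.random() * (best_x[d] - pop[i][d])
                noise = rng.gauss(0.0, 0.1) * (upper[d] - lower[d]) * scale
                cand.append(min(upper[d], max(lower[d], pop[i][d] + drift + noise)))
            f = objective(cand)          # Step 4: evaluate the new position
            if f < fit[i]:               # Step 5: keep the move only if it improves
                pop[i], fit[i] = cand, f
                if f < best_f:
                    best_x, best_f = cand[:], f
    return best_x, best_f                # Step 7: best solution found

def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)

best_x, best_f = population_search(sphere, [-5.0, -5.0], [5.0, 5.0])
```

For hyperparameter tuning, the objective would be a quantity to minimize, such as the validation error of the network trained with the hyperparameters encoded by a position.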
The architecture of proposed method
We employ the enhanced image conversion method, CNN model, bio-inspired optimization algorithms, and t-SNE for the visualization and classification of underwater target radiated noise signals. The overall architecture of the proposed method is illustrated in Figure 2 and the framework proceeds as follows:
Step1: Transform the original underwater target radiated noise signals into enhanced images.
Step2: Load the pretrained VGG-16 model.
Step3: Use the bitterling fish optimization algorithm to optimize the hyperparameters of the VGG-16 network and find the best values.
Step4: Extract 70% of the enhanced images as the training set, and use it to train and validate the optimized VGG-16 convolutional neural network.
Step5: Extract the remaining 30% of the enhanced images as the test set, and use it to evaluate the trained, optimized VGG-16 convolutional neural network.
Step6: Obtain the recognition results and classification accuracy for the underwater target radiated noise signals.
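The data handling in Steps 4 to 6 can be sketched as a random 70/30 split followed by an accuracy computation. The file names, the fixed random seed, and the toy predictions below are hypothetical; the class labels are those used later in the experiments.

```python
import random

def split_dataset(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into a training set and a test set."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

classes = ["jiang4", "jiang5", "jiang6", "speed10", "speed20"]
# Hypothetical (image file, label) pairs standing in for the enhanced images.
samples = [(f"image_{i:03d}.png", classes[i % 5]) for i in range(100)]

train_set, test_set = split_dataset(samples)

# Toy example of scoring four predictions against their true labels.
toy_accuracy = accuracy(["jiang4", "jiang5", "jiang4", "speed10"],
                        ["jiang4", "jiang5", "speed20", "speed10"])
```

Fixing the seed makes the split reproducible, so that different optimization methods can be compared on the same training and test images.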

Figure 2. Overall architecture of the proposed method.
Experimental results
Overview of the dataset
Each underwater target radiated noise signal used in this paper has a duration of 1 s and a sampling frequency of 50 kHz. The data parameters are shown in Table 1, and the underwater target radiated noise signal is shown in Figure 3.
Table 1. The parameters of the underwater target radiated noise signal simulation.

Figure 3. The simulated underwater target radiated noise signal.
Experimental result
In this study, 70% of the converted images are randomly selected as the training set for model training, utilizing the same enhanced image conversion method. The remaining 30% are randomly assigned to the test set for model evaluation. The experiments were conducted in a computing environment with a Windows 10 operating system, equipped with a single GPU, specifically the NVIDIA GeForce RTX 4070 SUPER.
We compare VGG-16 models whose hyperparameters are tuned by different bio-inspired optimization methods (BFO, HLO, RTH, WHO, LEA, BOA, and SFOA) against a manually tuned VGG-16 baseline (Max Epochs set to 30 and an initial learning rate of 1e-5). The methods yielded different classification accuracies. Figure 4 shows the classification accuracy obtained on images generated by the enhanced image conversion method with each optimized VGG-16 convolutional neural network. Figure 5 shows t-SNE visualizations of the classification of underwater target radiated noise signals for VGG-16 optimized with each method, illustrating how performance varies across the optimization techniques.
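One common way to connect a continuous optimizer to training hyperparameters is to decode each search position into concrete values. The sketch below maps a point in the unit square to a Max Epochs value and an initial learning rate on a logarithmic scale. The search ranges and the function name are hypothetical and are not the settings used in these experiments.

```python
import math

def decode(position, epoch_range=(5, 50), lr_range=(1e-6, 1e-3)):
    """Map a point in [0, 1]^2 to (max_epochs, initial_learning_rate)."""
    # Clamp both coordinates so out-of-range positions stay valid.
    u_epochs, u_lr = (min(1.0, max(0.0, u)) for u in position)
    low_e, high_e = epoch_range
    max_epochs = low_e + round(u_epochs * (high_e - low_e))
    # Interpolate the learning rate in log10 space.
    low_lr, high_lr = (math.log10(v) for v in lr_range)
    learning_rate = 10 ** (low_lr + u_lr * (high_lr - low_lr))
    return max_epochs, learning_rate

lowest = decode((0.0, 0.0))
highest = decode((1.0, 1.0))
middle = decode((0.5, 0.5))
```

Searching the learning rate in log space gives small values such as 1e-5 the same resolution as larger ones, which matters because useful learning rates often span several orders of magnitude.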

Figure 4. Experimental accuracy results of different bio-inspired algorithms.

Figure 5. t-SNE 2D scatter plots: (a) BFO, (b) HLO, (c) RTH, (d) WHO, (e) LEA, (f) BOA, (g) SFOA, and (h) VGG-16.
Analysis of classification results
In the comparison of the 2D scatter plots generated by the BFO-optimized VGG-16 and the manually tuned VGG-16, for the classification tasks of jiang5 and jiang6, as highlighted in the red box in Figure 6, we can observe that BFO successfully identifies suitable hyperparameters for these tasks, leading to clear and well-defined clusters. In contrast, the manually tuned VGG-16, without bio-inspired optimization, fails to classify jiang5 and jiang6 effectively, with a noticeable overlap of speed20 and jiang4 samples in these categories.

Figure 6. Comparison of 2D scatter plots for BFO-optimized and manually tuned VGG-16 in classifying underwater target radiated noise signals.
For the classification tasks of jiang4, speed10, and speed20, as shown in the blue box in Figure 6, we can see that the BFO-optimized VGG-16 demonstrates improved convergence and better clustering compared to the manually tuned VGG-16. The results reveal that BFO optimization enhances the classification ability, making the model more focused and better at distinguishing between different categories, while the manually tuned VGG-16 shows greater dispersion and less accurate classification. This comparison further underscores the advantage of using bio-inspired optimization methods to fine-tune the hyperparameters of VGG-16, particularly when dealing with complex, noisy underwater target radiated noise signals.
For the classification tasks of jiang5 and jiang6 (highlighted in the red box in Figure 7), the BFO-optimized model demonstrates significant performance improvements, with scatter points forming tightly clustered distributions that reveal distinct classification boundaries. This optimization effectively mitigates inter-class confusion, particularly when handling noisy and complex signal patterns. In comparison, the LEA-optimized approach shows comparatively inferior performance for these categories. Although partial clustering is observable in the scatter plot, it exhibits reduced fine-grained discriminative capability between jiang5 and jiang6 compared to the BFO optimization.

Figure 7. Comparison of 2D scatter plots for BFO-optimized and LEA-optimized VGG-16 in classifying underwater target radiated noise signals.
Regarding the jiang4 classification task (denoted by the green box in Figure 7), the BFO-optimized model maintains competent classification performance. While the jiang4 data points show less compact clustering than those of jiang5 and jiang6, they still maintain moderate spatial concentration. The LEA-optimized method achieves comparable but less effective clustering for this category, with visibly lower aggregation density and diminished classification precision relative to the BFO approach.
For the speed10 and speed20 classification tasks (indicated by the blue box in Figure 7), the BFO-optimized model exhibits strong dimensional discriminability. Along Dimension 1, data points demonstrate high-density clustering with minimal inter-point distances, indicating superior feature convergence and classification efficacy. Conversely, the LEA-optimized results display suboptimal performance across both Dimensions 1 and 2, characterized by dispersed point distributions that reflect weaker categorical separation capabilities.
Discussion
In this study, we demonstrated the effectiveness of a VGG-16 model optimized by a bio-inspired optimization algorithm (BFO) for classifying underwater target radiated noise signals. The results indicated improvements in classification accuracy, highlighting the value of bio-inspired optimization methods for tuning the hyperparameters of deep learning models. BFO, which simulates the spawning-site search and mate selection behavior of bitterling fish, exhibits strong global search capabilities. This makes it well suited to dynamic and complex problems, such as the classification of underwater target radiated noise signals, which are often nonlinear, noisy, and prone to high levels of environmental interference. BFO demonstrated its strength in optimizing the hyperparameters of the VGG-16 model, leading to improved classification accuracy for complex signal categories like jiang5 and jiang6. However, the approach faced challenges in classifying jiang4, speed10, and speed20 signals, where choosing the optimal Max Epochs and initial learning rate proved difficult. These difficulties suggest that further refinement of the optimization algorithms is needed to better handle the complexity and noise inherent in these tasks. The enhanced image conversion method played a crucial role in improving the signal-to-noise ratio, making the classification task more feasible, while t-SNE visualizations provided valuable insights into the feature clustering process.
Despite the promising results, some limitations were identified, particularly the sensitivity of the model to hyperparameter selection and the complexity of certain signal categories. Future work should focus on improving the robustness of the model through other optimization strategies, such as grid search or adaptive methods, and on exploring hybrid models that incorporate recurrent neural networks (RNNs) or attention mechanisms to capture temporal and spatial dependencies. Moreover, testing the model on real-world underwater signals and examining other bio-inspired optimization algorithms could further enhance performance and generalization.
Conclusions
Accurate classification of underwater target radiated noise signals is crucial for applications such as underwater monitoring, sonar signal processing, and marine life tracking, as it aids in better detection and understanding of underwater environments. This study applied VGG-16, optimized by a bio-inspired optimization algorithm (BFO), to classify underwater target radiated noise signals, achieving notable improvements in classification accuracy, particularly for categories like jiang5 and jiang6. The enhanced image conversion method is designed to preserve the relevant information in the signal, minimizing the loss of key features that may occur with some time-frequency conversion methods. This results in better feature extraction and improved classification performance.
The use of CNN in this study plays a critical role in effectively learning hierarchical spatial features from the enhanced images, enabling the model to distinguish between complex underwater target radiated noise signals. CNNs excel in handling image data, automatically learning features without the need for manual intervention. The bio-inspired optimization algorithm (BFO) contributes by optimizing the hyperparameters of the VGG-16 network, allowing the model to converge toward better solutions and improve overall performance. The combination of CNN’s ability to capture spatial relationships and the optimization power of bio-inspired optimization algorithms provides a strong foundation for improving classification accuracy.
Despite these promising results, challenges remain in classifying more complex signal categories, and the model’s sensitivity to hyperparameter choices suggests that further optimization is needed.
Footnotes
Handling Editor: Divyam Semwal
Author contributions
Conceptualization, Lei Zhufeng and Lei Xiaofang; methodology, Zhou Chuanghui; software, Shi Zanrong; validation, Yuan Muye and Wang Tianjiao.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data are contained within the article.
