Abstract
Directly using original high-dimensional data as input to machine learning leads to the curse of dimensionality, reduced generalization ability, and even misleading conclusions. Feature engineering, which can effectively reduce feature size and data dimensionality, is therefore a core task in data mining. Feature selection offers strong interpretability and low computational expense but cannot explore deep information, whereas feature extraction can capture deep, complex information but incurs high computational cost and poor interpretability. To integrate the advantages of the two feature engineering techniques, this paper proposes a novel method based on feature subset selection and multi-feature extraction. The proposed method first performs feature selection with an improved binary nutcracker optimization algorithm to generate initial feature subsets. These preliminarily dimension-reduced subsets are then passed through dynamic convolution for feature extraction, yielding optimal feature subsets. As an independent component, the feature selection stage is compared with five high-performing metaheuristic wrapper-based methods and five widely used filter-based methods. The complete method, which incorporates dynamic convolution for feature extraction, is compared with the proposed method without feature extraction, the proposed method without feature selection, and six other effective feature dimensionality reduction methods. All these methods are experimentally analyzed and comparatively evaluated on twenty datasets of various sizes. The results demonstrate the superior performance of the proposed method over similar techniques.
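The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration only: a random binary search stands in for the paper's improved binary nutcracker optimizer, a fixed 1-D convolution kernel stands in for dynamic convolution, and all data shapes, the fitness function, and the penalty weight are assumptions, not the paper's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 samples, 12 features (sizes are illustrative).
X = rng.normal(size=(100, 12))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

def fitness(mask):
    # Simple wrapper fitness: correlation of the masked features' mean
    # with the label, minus a penalty on subset size. A stand-in for the
    # classifier-accuracy fitness a wrapper method would normally use.
    if mask.sum() == 0:
        return -np.inf
    score = np.abs(np.corrcoef(X[:, mask.astype(bool)].mean(axis=1), y)[0, 1])
    return score - 0.01 * mask.sum()

# Stage 1: feature subset selection via random binary search
# (placeholder for the improved binary nutcracker optimization algorithm).
best_mask, best_fit = np.ones(X.shape[1], dtype=int), -np.inf
for _ in range(200):
    mask = rng.integers(0, 2, size=X.shape[1])
    f = fitness(mask)
    if f > best_fit:
        best_mask, best_fit = mask, f

X_sel = X[:, best_mask.astype(bool)]  # initial feature subset

# Stage 2: feature extraction by convolving over the selected features
# (a fixed smoothing kernel here, standing in for dynamic convolution).
kernel = np.array([0.25, 0.5, 0.25])
X_ext = np.stack([np.convolve(row, kernel, mode="valid") for row in X_sel])

print(X.shape, X_sel.shape, X_ext.shape)
```

The key point the sketch captures is the ordering: the cheap, interpretable selection stage shrinks the feature set first, so the more expensive extraction stage operates on an already-reduced input.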
