Research on real-time software defect prediction system based on deep neural network

Abstract

Traditional software defect prediction models generalize poorly and make feature selection difficult. To address these problems, this paper uses deep neural networks to build an automated, efficient software defect prediction system that updates in real time and processes large-scale data. Code features, historical defect records, and developer activity data are collected from the version control system Git and the defect tracking system JIRA. The raw data are then cleaned and features are extracted: outliers are handled with the interquartile range (IQR) method, and missing values are filled by mean imputation and forward filling. A Deep Neural Network (DNN) is used for the model architecture and training. For real-time prediction, Apache Kafka and Spark Streaming are used to acquire, process, and analyze software data as it arrives. Across multiple experiments on several open-source projects, the model achieves an accuracy of 92.5%, a recall of 88.3%, a precision of 90.2%, and an F1 score of 89.2%. The results show that the system maintains high prediction performance on large-scale data in complex environments and can help improve the efficiency and quality of software development.
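The preprocessing steps named in the abstract can be illustrated with a minimal sketch using only the Python standard library. The abstract does not say whether outliers are clipped or dropped, nor which fields receive mean imputation versus forward filling, so the clipping behaviour, the 1.5 multiplier, the function names, and the sample values below are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean, quantiles


def clip_outliers_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR]; None (missing) passes through.

    Clipping (rather than dropping) and k=1.5 are assumptions.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [None if v is None else min(max(v, low), high) for v in values]


def fill_mean(values):
    """Replace missing entries with the mean of the observed entries."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


def forward_fill(values):
    """Replace each missing entry with the most recent observed entry.

    Leading missing entries stay None because nothing precedes them.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical per-commit metric (lines changed) with one gap and one outlier.
lines_changed = [10, 12, None, 11, 500, 9, 13]
clipped = clip_outliers_iqr(lines_changed)
numeric_ready = fill_mean(clipped)

# Hypothetical time-ordered field (issue assignee) with gaps.
assignee = [None, "alice", None, "bob", None]
assignee_ready = forward_fill(assignee)
```

Handling outliers before mean imputation keeps a single extreme value from distorting the imputed mean. Forward filling suits time-ordered records such as commit or issue histories, where the previous observation is a plausible stand-in for a missing one.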

Keywords

software defect prediction; artificial intelligence algorithms; deep neural networks; real-time data processing; large-scale data

