Research on adaptive optimization of college students’ innovation and entrepreneurship education based on machine learning and big data analysis

Abstract

Auto-PLD is an algorithm framework for the automatic design of machine learning pipelines in whole-process data analysis scenarios. First, a machine learning pipeline with five stages is defined, which supports processing continuous and discrete features separately. Then, the automated pipeline design problem is decomposed into two subproblems, structure search and hyperparameter optimization, and an algorithm combining reinforcement learning and Bayesian optimization is proposed to optimize the two subproblems alternately. Finally, to improve the efficiency of automated pipeline design, two parallelized pipeline construction methods are proposed. Experimental results show that Auto-PLD outperforms auto-sklearn on most datasets. Moreover, as the number of computing nodes increases, the parallelized Auto-PLD further improves pipeline construction performance. At 1, 4, and 8 hours, the Auto-PLD-random method scored 19, 22, 23; 27, 26, and 25. The Automated Pipeline Design with Q-learning (Auto-PLD-Q) method scored 18, 20, 26; 23, 27, and 27. The Auto-PLD-DeepC method scored 21, 23, 27; 22, 26, and 26. The Auto-PLD-PG method scored 19, 21, 27; 26, 26, and 25. Auto-LLE is an automated machine learning algorithm framework for lifelong learning scenarios. For classification tasks with concept drift and data imbalance, an algorithm based on weighted ensemble learning of adaptive models is proposed. Concepts are divided into “long-term concepts” and “short-term concepts,” and the two types are handled separately using incremental learners and adaptive weight updates, so that concept drift is captured automatically and prediction performance improves. Based on Auto-PLD and Auto-LLE, we design and implement a system that supports both automated pipeline design and automated lifelong learning. In system design, high accessibility and scalability are achieved through an easy-to-use high-level programming interface and pluggable module integration. In terms of task types, common data analysis tasks such as classification, regression, and clustering are supported.
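
The alternation between structure search and hyperparameter optimization described in the abstract can be illustrated with a minimal sketch. It is not the Auto-PLD implementation: the structure names, the toy scoring function, and all parameter values are hypothetical, an epsilon-greedy rule stands in for the reinforcement learning policy, and a local random search stands in for Bayesian optimization.

```python
import random

# Hypothetical toy search space: structure name -> (best hyperparameter, peak score).
# The values are invented so the sketch runs without any dataset.
TOY_STRUCTURES = {
    "impute>scale>linear": (0.2, 0.70),
    "impute>encode>tree": (0.6, 0.85),
    "impute>select>knn": (0.8, 0.75),
}


def evaluate(structure, hp):
    """Stand-in for training and validating a pipeline; returns a toy score."""
    best_hp, peak = TOY_STRUCTURES[structure]
    return max(0.0, peak - (hp - best_hp) ** 2)


def tune_hyperparameter(structure, incumbent, rng, trials=5):
    """Hyperparameter step: local random search around the incumbent value.
    (A simple stand-in for Bayesian optimization.)"""
    best_hp, best_score = incumbent, evaluate(structure, incumbent)
    for _ in range(trials):
        candidate = min(1.0, max(0.0, best_hp + rng.uniform(-0.2, 0.2)))
        score = evaluate(structure, candidate)
        if score > best_score:
            best_hp, best_score = candidate, score
    return best_hp, best_score


def alternate_search(rounds=60, epsilon=0.2, seed=0):
    """Alternate between choosing a structure and tuning its hyperparameter."""
    rng = random.Random(seed)
    names = list(TOY_STRUCTURES)
    value = {n: 0.0 for n in names}  # running mean reward per structure
    count = {n: 0 for n in names}    # how often each structure was chosen
    hp = {n: 0.5 for n in names}     # incumbent hyperparameter per structure
    best = (None, None, -1.0)
    for _ in range(rounds):
        # Structure step: try each structure once, then epsilon-greedy
        # (a stand-in for the reinforcement learning policy).
        untried = [n for n in names if count[n] == 0]
        if untried:
            s = untried[0]
        elif rng.random() < epsilon:
            s = rng.choice(names)
        else:
            s = max(names, key=lambda n: value[n])
        # Hyperparameter step for the chosen structure.
        hp[s], score = tune_hyperparameter(s, hp[s], rng)
        # Feed the validation score back as the reward for the structure step.
        count[s] += 1
        value[s] += (score - value[s]) / count[s]
        if score > best[2]:
            best = (s, hp[s], score)
    return best


if __name__ == "__main__":
    print(alternate_search())
```

In this sketch the score returned by the hyperparameter step serves as the reward for the structure step, which is one simple way to couple the two subproblems; the coupling used in Auto-PLD may differ.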

Keywords

automated machine learning, CASH problems, lifelong learning, machine learning, big data analysis

