Abstract
With the rapid growth of audio information resources, the efficient management and accurate retrieval of massive music data have become a research focus. To improve the accuracy and efficiency of music data retrieval, a method combining local maximum chroma energy with note onset detection is proposed. A piano audio fingerprint is constructed by extracting key features and generating digital identifiers. The innovation of the method lies in using note onset detection to capture transient features of the audio signal and combining them with local maximum chroma energy points to enhance the uniqueness and robustness of the fingerprint. Experimental results showed that the recognition accuracy of the constructed fingerprint technology remained around 93% across different dataset sizes, decreasing slightly once the dataset exceeded 400 items; when the dataset reached 1,000 items, the recognition accuracy was about 90%. In terms of audio fingerprint extraction, the technology yielded a maximum fingerprint distance of 0.80 between For Forever and Shape of You and a minimum fingerprint distance of 0.56 between Despacito and For Forever. In terms of retrieval performance, the system's hit counts ranged from a minimum of 84 to a maximum of 112, and retrieval accuracy was mostly above 90%. The proposed audio fingerprint method was superior in recognition accuracy, robustness, and scalability. The study provides technical support and a research basis for applications in audio watermarking, copyright protection, and audio classification.
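The pipeline described above can be illustrated with a minimal sketch: detect note onsets from frame-energy jumps, take the maximum-energy chroma class per frame, and hash (onset, chroma) pairs into a fingerprint set. This is a simplified illustration with assumed parameters (frame length, hop size, energy-jump ratio), not the paper's actual implementation.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    # Split the signal into overlapping analysis frames.
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def detect_onsets(frames, ratio=1.5):
    # Flag frames whose energy jumps by `ratio` over the previous frame
    # (a crude stand-in for a note onset detector; `ratio` is assumed).
    energy = np.sum(frames ** 2, axis=1)
    return [i for i in range(1, len(energy))
            if energy[i] > ratio * energy[i - 1] + 1e-12]

def chroma_peaks(frames, sr=22050, n_chroma=12):
    # Map FFT bins to 12 chroma (pitch-class) bands and keep the
    # maximum-energy band per frame (simplified "local maximum chroma
    # energy" feature).
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], 1 / sr)
    peaks = []
    for row in spec:
        chroma = np.zeros(n_chroma)
        for f, e in zip(freqs[1:], row[1:]):  # skip DC bin
            pc = int(round(12 * np.log2(f / 440.0))) % n_chroma
            chroma[pc] += e
        peaks.append(int(np.argmax(chroma)))
    return peaks

def fingerprint(x, sr=22050):
    # Combine onset positions with per-frame chroma peaks and hash the
    # (onset frame, chroma class) pairs into a compact identifier set.
    frames = frame_signal(x)
    onsets = detect_onsets(frames)
    peaks = chroma_peaks(frames, sr)
    return {hash((i, peaks[i])) & 0xFFFFFFFF for i in onsets}
```

Two fingerprints could then be compared by a set-similarity measure (e.g. Jaccard distance) to obtain a fingerprint distance like those reported in the abstract.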
