Abstract
We created a robust and secure forensic marking algorithm through the process of hiding information in a two-dimensional (2D) barcode and embedding it into the discrete wavelet transformation-discrete fractional random transformation (DWT-DFRNT) domain using the quantization technique. We hid information in the 2D barcode, encoded it with the block code that we developed, and then converted it through scrambling. The security of the algorithm was greatly improved by increasing the calculation complexity through hiding the embedded information. Forensic marks were embedded into the DWT-DFRNT dual domain. The 2D-DWT used for this was applied to the frequency division and the DFRNT was applied to increase the algorithm security by randomly mixing the pieces of information so that they could be embedded in unpredictable locations in a certain frequency space. The bit error generated in the extraction process was corrected by the self-error-correction function of the block code and 2D barcode. The experimental result showed that the information contained in the 2D barcode was accurately extracted from the forensic marks within the error correction range.
1. Introduction
The rapid growth of the digital contents industry and the development of the relevant technology, as well as the diversification of the service, have instigated the illegal duplication and circulation of digital content; thus, violation of copyright and ownership is increasing day by day. For this reason, the illegal market based on the illicit circulation of digital content greatly influences the legal market. Digital rights management (DRM) and watermarking technology have been applied to prevent this situation. However, Apple recently announced that it had become “DRM free,” and a number of recording companies have joined it. Thus, forensic marking technology is drawing attention as a solution for this problem.
Forensic marking technology is a more active copyright protection technology because it contains not only ownership information but also information about the user who received the content, so that a user who has made an illegal copy can be tracked and identified. However, forensic marking that tracks user information faces a technological challenge: it must carry more information than conventional watermarking technology while still safeguarding robustness and security.
Audio forensic marking began with a first fundamental study [1] and proceeded to various methods, including the spread spectrum method [2], echo hiding [3], and the quantization method [4]. In addition to the method of embedding them into the time domain [5], studies have been conducted on methods of robustly embedding forensic marks into the frequency domain; these include discrete cosine transform (DCT), DWT, singular value decomposition (SVD), and cepstrum transform (CT) [6–9]. Recently, studies have also been carried out on methods that use dual domains such as DWT-DCT, DWT-SVD, DCT-SVD, and WT-complex cepstrum transform (CCT) [10–13]. However, because most of the studies have focused on a single purpose such as robustness or mass embedment, there is the need for a study considering all aspects of the problem, including robustness, algorithm security, and extraction performance.
In this study, therefore, we employed the DWT-DFRNT domain and a two-dimensional (2D) barcode to ensure robustness and security; furthermore, we developed and used a block code pattern for accurate extraction by increasing the extraction performance.
2. Related Works
2.1. 2D Barcode and Forensic Mark
2D barcodes, which contain hidden information, are widely used in various areas such as newspapers, magazines, posters, TV, the internet, tickets, receipts, and advertisements. 2D barcodes retain information in two directions, horizontally and vertically, and thus the amount of recordable information is drastically greater than in a one-dimensional (1D) barcode. A 2D barcode is also applicable to digital content: a visible mark can be embedded into digital content such as a research article or an image so that it contains the information relevant to the content.
Figure 1 shows representative examples of widely used 2D barcodes: (a) the quick response (QR) code, (b) DataMatrix, and (c) PDF417. Each encodes the same message, “123456789,” in a different form. PDF417 is a stacked barcode, whereas the QR code and DataMatrix are matrix codes. The QR code holds the greatest amount of information, followed by DataMatrix and PDF417. Among the various types of 2D barcodes, the QR code is known to perform well in many respects: the code remains small even when it contains a great deal of information, and it can be scanned and read rapidly.

Figure 1: Types of 2D barcode.
The information capacity and code size of 2D barcodes are dependent on the module size, error correction level, and types of encoding. Generally, the information capacity increases as the code size of the 2D barcode increases but decreases as the error correction level rises. For example, a
2.2. DWT-DFRNT Dual Domain
In this study, we embedded forensic marks into the DWT-DFRNT dual domain in order to ensure the robustness of the forensic marks and the security of the algorithm based on the frequency decomposition ability of DWT and the unpredictable random distribution of DFRNT.
A 2D-DWT was used in this study, so the 1D audio signal was converted to a 2D signal to serve as its input. The 2D-DWT decomposes the converted audio signal into the detail subbands H (LH), V (HL), and D (HH), which have different frequency characteristics from one another. A single 2D-DWT pass therefore allows at least three forensic marks to be embedded. This not only embeds the forensic marks robustly into a specific frequency band but also allows information about further copyright holders and users, such as a secondary copyright holder or a holder of neighboring rights, to be added to content that the original copyright holder has already put into circulation. The embedded marks then reveal the pathways by which the content is circulated and thereby enable effective multistage circulation tracking.
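As an illustration of this step, the following minimal sketch reshapes one 65536-sample segment (the segment size reported in the experiments) into a square array and applies a single-level 2D DWT. The Haar wavelet, the row-major 256 × 256 reshaping, and the function names haar_dwt2 and haar_idwt2 are assumptions made for the example. The paper does not state which wavelet or reshaping was used, and the assignment of LH and HL to the H and V subbands varies between conventions.

```python
import numpy as np

def haar_dwt2(x):
    """One level of an orthonormal 2D Haar DWT; returns (A, H, V, D)."""
    x = np.asarray(x, dtype=float)
    s = np.sqrt(2.0)
    # Filter along each row: pairwise sums (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter along each column.
    a = (lo[0::2, :] + lo[1::2, :]) / s  # approximation
    h = (lo[0::2, :] - lo[1::2, :]) / s  # horizontal detail (assumed to be H)
    v = (hi[0::2, :] + hi[1::2, :]) / s  # vertical detail (assumed to be V)
    d = (hi[0::2, :] - hi[1::2, :]) / s  # diagonal detail (D)
    return a, h, v, d

def haar_idwt2(a, h, v, d):
    """Inverse of haar_dwt2."""
    s = np.sqrt(2.0)
    lo = np.empty((a.shape[0] * 2, a.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2, :], lo[1::2, :] = (a + h) / s, (a - h) / s
    hi[0::2, :], hi[1::2, :] = (v + d) / s, (v - d) / s
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / s, (lo - hi) / s
    return x

# Hypothetical segment: random samples stand in for real audio.
rng = np.random.default_rng(0)
segment = rng.standard_normal(65536)
image = segment.reshape(256, 256)          # 1D audio -> 2D array (assumed layout)
a, h, v, d = haar_dwt2(image)
restored = haar_idwt2(a, h, v, d).reshape(-1)
```

Each detail subband here is a 128 × 128 array that could carry one forensic mark, and the inverse transform returns the original segment when no coefficient is modified.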
The DFRNT takes specific frequency coefficients generated by the 2D-DWT as its input and mixes them randomly, with the result controlled by its parameters. This increases the calculation complexity, so that unauthorized users cannot infer the statistical characteristics of the data. The DFRNT [14] is generally computed as follows.
Firstly, matrix H is generated using P generated as a random seed value, which is one of the parameters shown in (1):
To generate an eigenvector from matrix H, SVD matrix decomposition is performed with respect to H, as shown in (2):
Here, the generated
Next, the
Then,
In this way, DFRNT can transform the input signals to arbitrary unpredictable signals with three parameters and restore them through inverse transformation.
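The construction described above can be sketched in Python with NumPy. This is a minimal sketch that follows the DFRNT definition commonly given in the literature, not the authors' code. The function names, the use of a standard normal random matrix for P, the name m for the third parameter (a period), and the use of numpy.linalg.eigh in place of an SVD to obtain the eigenvectors of the symmetric matrix H are all assumptions of this example.

```python
import numpy as np

def dfrnt_kernel(n, seed, alpha, m):
    """Build an n x n kernel R = V diag(phases) V^T from a random seed."""
    rng = np.random.default_rng(seed)
    p = rng.standard_normal((n, n))   # random matrix P generated from the seed
    h = (p + p.T) / 2.0               # symmetric matrix H
    _, v = np.linalg.eigh(h)          # columns of v are orthonormal eigenvectors
    phases = np.exp(-2j * np.pi * np.arange(n) * alpha / m)
    return (v * phases) @ v.T         # scales column k of v by phases[k]

def dfrnt2(x, seed, alpha, m):
    """Apply the 2D transform R x R^T to a square array x."""
    r = dfrnt_kernel(x.shape[0], seed, alpha, m)
    return r @ x @ r.T

# Hypothetical 8 x 8 block of subband coefficients.
rng = np.random.default_rng(1)
coeffs = rng.standard_normal((8, 8))
mixed = dfrnt2(coeffs, seed=42, alpha=0.3, m=1.0)
restored = dfrnt2(mixed, seed=42, alpha=-0.3, m=1.0)   # inverse: negate alpha
wrong_key = dfrnt2(mixed, seed=43, alpha=-0.3, m=1.0)  # wrong seed
```

Because V is orthogonal, applying the kernel with the sign of alpha reversed undoes the transform, while a different seed yields a different eigenvector basis and does not recover the input. The transformed coefficients are complex in general even for real input.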
Several studies have applied the DFRNT or 2D barcodes to forensic marking. Guo et al. [15] studied a watermarking algorithm using high-amplitude selection and phase-shift keying in the DFRNT domain; Luo et al. [16] used the DFRNT domain to embed an image watermark into subimage blocks subsampled from the original image; and Jin and Kim [17] proposed a watermarking algorithm using visual cryptography and quantization of DFRNT coefficients. Algorithms based on the DFRNT are secure because the DFRNT has a random key, but they are less robust against attacks because they were not combined with a frequency transform.
Many studies have used a 2D barcode as the watermark. Premaratne and Safaei [18] embedded a DataMatrix code into the DWT-DFT domain, Kim et al. [19] enhanced the DataMatrix watermarking algorithm using encryption keys, and J.-H. Chen and C.-H. Chen [20] studied a detection scheme using the QR code and DCT. Gunalan and Nithya [21] embedded a QR code using the histogram shifting method, and Seenivasagam and Velumani [22] embedded a QR code in the CT-SVD domain. However, these methods suffer from weak security, because they lack a security measure such as the DFRNT, as well as small information capacity and low robustness. For audio content, Poomvichid et al. [23] embedded a QR code in the DWT domain using a genetic algorithm. This method improved robustness somewhat, but its information capacity is small and its security is weak. Nah et al. [24] proposed embedding a DotCode, formed into a Hadamard matrix, in the DCT domain. This method also has small information capacity and low robustness, and it is nonblind: extraction requires the original audio.
We have previously studied various forensic marking methods for images and audio in multiple domains. Li and Kim [25, 26] embedded a hologram forensic mark, generated from a random binary image, into the DCT-SVD or DWT-SVD domain of a gray image. Li and Kim [27] embedded a hologram generated from a random binary image into the DWT domain for audio. Li and Kim [28, 29] embedded a binary watermark into the DWT-SVD or DWT-DCT domain of audio using the quantization method. Most of these methods are strongly robust but share the disadvantages of small information capacity and weak security.
To overcome these problems, this paper proposes using a 2D barcode in a dual domain that combines the DFRNT with a frequency transform, the DWT, so that inaudibility, sufficient capacity, robustness, and security are all achieved.
3. Proposed Forensic Marking Algorithm
3.1. Generation of Forensic Mark
The information to be embedded into the audio signal is first converted to a barcode by a 2D barcode encoder. The generated barcode is then passed to the block code encoder that we designed, which encodes it as a binary image. Finally, the image is scrambled to produce the forensic mark image.
Since the error correction of the 2D barcode targets burst errors rather than random errors, errors other than burst errors are corrected by other methods such as block coding.
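To illustrate how a block code can absorb scattered bit errors before the barcode's own error correction is applied, the sketch below uses plain repetition with majority-vote decoding. This is a stand-in technique, not the block code patterns of Figure 2, which are not reproduced here. The 3 × 3 block size and the function names block_encode and block_decode are assumptions of the example.

```python
def block_encode(bits, size=3):
    """Expand every module of a binary image into a size x size block."""
    out = []
    for row in bits:
        expanded = [b for b in row for _ in range(size)]
        out.extend(list(expanded) for _ in range(size))
    return out

def block_decode(image, size=3):
    """Recover each module by majority vote over its size x size block."""
    rows = len(image) // size
    cols = len(image[0]) // size
    decoded = []
    for r in range(rows):
        line = []
        for c in range(cols):
            ones = sum(image[r * size + i][c * size + j]
                       for i in range(size) for j in range(size))
            line.append(1 if 2 * ones > size * size else 0)
        decoded.append(line)
    return decoded

# Hypothetical 2 x 2 barcode fragment with three scattered bit errors.
barcode = [[1, 0], [0, 1]]
coded = block_encode(barcode)
coded[0][0] ^= 1
coded[1][1] ^= 1
coded[4][4] ^= 1
decoded = block_decode(coded)
```

With an odd block size there are no ties, and a block is decoded correctly as long as fewer than half of its pixels are flipped. The price is capacity: each module costs size × size embedded bits.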
Figure 2(a) shows the

Figure 2: Block code patterns.
When the encoding is performed in the
When the decoding is performed in a
Like Figure 2(a), Figure 2(b) shows the encoding into a
When the decoding is performed using a
The image that has been encoded by the block code goes through scrambling. Since scrambling mixes the image pixel values in a meaningless order, it enhances the reverse engineering security of the embedded information.
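A keyed scrambling step can be sketched as a seeded permutation of the pixel order. This is only an illustrative stand-in: the scrambling of (12) is not reproduced here, and the permutation approach, the integer key, and the function names scramble and descramble are assumptions of the example.

```python
import random

def scramble(pixels, key):
    """Reorder a flat list of pixels with a permutation derived from key."""
    order = list(range(len(pixels)))
    random.Random(key).shuffle(order)
    return [pixels[i] for i in order]

def descramble(scrambled, key):
    """Invert scramble() by regenerating the same permutation from key."""
    order = list(range(len(scrambled)))
    random.Random(key).shuffle(order)
    restored = [None] * len(scrambled)
    for position, source in enumerate(order):
        restored[source] = scrambled[position]
    return restored

# Hypothetical flattened image; distinct values make misplacement visible.
pixels = list(range(64))
mixed = scramble(pixels, key=7)
same_key = descramble(mixed, key=7)
other_key = descramble(mixed, key=8)
```

Descrambling with the embedding key restores the original order, whereas a different key regenerates a different permutation and leaves the pixels in a meaningless order.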
3.2. Forensic Mark Embedding Algorithm
The forensic marks generated through the procedures of 2D barcode generation, block coding, and scrambling are embedded into the DWT-DFRNT domain. Equation (11) shows the procedure in which the embedded information is encoded through a barcode generator (BACG) and block code encoder (BLCE):
where c denotes
Equation (12) shows the procedure of the scrambling accepting the encoded image K. In (12), a refers to the size extended by padding the pixel at the rim of the image K of
Meanwhile, the original audio signal undergoes the 2D-DWT and DFRNT, as shown in (13). First, the original signal is decomposed into the H, V, and D subbands through the two-stage 2D-DWT, and the subband coefficients are passed to the DFRNT function. The DFRNT not only has the random seed value
The information F is embedded into the DFRNT-transformed subband coefficients S by the quantization process shown in (14) and (15):
where floor indicates that only the integer part is taken. The T calculated by (14) is used for the calculation and conditional judgment in (15), where mod refers to modular operation and Q the quantization coefficient as follows:
Through the procedure, the forensic mark information is embedded into S. Equation (15) shows how +1 bit and −1 bit are embedded, respectively.
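The quantization step can be illustrated with a parity-based sketch built from the operations named above (floor, mod, and the quantization coefficient Q). It is a minimal example consistent with that description rather than a transcription of (14) and (15). The choice that a +1 bit maps to an odd quantization index and a −1 bit to an even one, the treatment of S as a real scalar, and the function names are assumptions.

```python
import math

def embed_bit(s, bit, q):
    """Move coefficient s to the center of the nearest cell whose index parity encodes bit."""
    t = math.floor(s / q)                 # quantization index T
    want_odd = (bit == 1)                 # assumed mapping: +1 -> odd, -1 -> even
    if (t % 2 == 1) != want_odd:
        # Step to whichever neighboring cell is closer to s.
        t += 1 if (s - t * q) >= q / 2 else -1
    return t * q + q / 2                  # cell center maximizes the noise margin

def extract_bit(s, q):
    """Read the bit back from the parity of the quantization index."""
    return 1 if math.floor(s / q) % 2 == 1 else -1

# Hypothetical coefficients, with Q = 0.05 as in the no-attack experiment.
q = 0.05
samples = [-1.23, -0.01, 0.0, 0.37, 0.81]
marked = [embed_bit(s, bit, q) for s in samples for bit in (1, -1)]
```

Placing the marked value at the cell center means that extraction survives any perturbation smaller than Q/2, while the change made to each coefficient stays within Q. This is why a larger Q trades audio quality for robustness.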
The S to which the information has been embedded again goes through the inverse DFRNT (IDFRNT). The IDFRNT is performed by simply changing the sign of α, the parameter for the DFRNT, as
Therefore, the process of the sequential IDFRNT and inverse DWT (IDWT) can be expressed as
Through the process, we finally obtain the audio signal Y to which the forensic marks have been embedded.
Figure 3 shows the forensic mark embedment process step by step.

Figure 3: Forensic mark insertion process.
3.3. Forensic Mark Extracting Algorithm
Forensic mark extraction reverses the embedment process: it comprises the 2D-DWT and DFRNT, requantization, descrambling, block code decoding, and 2D barcode generation and reconstruction. First, the audio signal Y in which the forensic marks have been embedded is decomposed through the 2D-DWT into subband frequency elements, which then go through the DFRNT. The DFRNT parameters must have the same values that were used for the forensic mark embedment:
The signal
The image
In (21),
Through these procedures, the embedded information can be finally restored. The restored information
Figure 4 shows the forensic mark extraction process step by step.

Figure 4: Forensic mark extraction process.
4. Experimental Result
4.1. Experimental Environment
The sampling rate of the sample audio used for the experiment in this study was 44100 Hz. The segment size, which is the embedment unit, was set to 65536 samples (1.4861 s) to suit the DWT. The 2D barcodes used were QR codes of the
DWT decomposes the input signal into the three subbands of H, V, and D through two-stage 2D-DWT and applies the DFRNT to each of them. The default setting for the parameters of the DFRNT function was
The web-based generator RACO [30] was used for the generation of the 2D barcode. The robustness experiment was performed to evaluate robustness through the attacking experiment using “Stirmark for audio” [31], an audio watermark experimental tool.
The quality evaluation after the forensic mark embedment was conducted with reference to the signal-to-noise ratio (SNR), and the extracted forensic mark was evaluated in terms of bit error rate (BER) and normalized cross-correlation (NC). The mean SNR value was set to be near to 25 dB by controlling the forensic mark embedment strength (the Q value). The formulas to calculate BER and NC are as follows:
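For reference, the three measures can be computed as in the following sketch, which uses the standard textbook definitions of SNR, BER, and NC. The paper's exact formulas are not reproduced here, so the normalization used for NC and the function names are assumptions.

```python
import math

def snr_db(original, marked):
    """Signal-to-noise ratio in dB between the original and marked audio."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, marked))
    return 10.0 * math.log10(signal / noise)

def ber(embedded, extracted):
    """Fraction of mark bits that differ after extraction."""
    errors = sum(1 for a, b in zip(embedded, extracted) if a != b)
    return errors / len(embedded)

def nc(embedded, extracted):
    """Normalized cross-correlation between embedded and extracted marks."""
    num = sum(a * b for a, b in zip(embedded, extracted))
    den = math.sqrt(sum(a * a for a in embedded)) * math.sqrt(sum(b * b for b in extracted))
    return num / den

# Hypothetical values for illustration only.
mark = [1, -1, 1, 1]
one_error = [1, 1, 1, 1]
quality = snr_db([1.0, 1.0], [1.1, 0.9])
```

Here ber returns a fraction, so it must be multiplied by 100 to compare with the percentages reported in the tables.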
4.2. Multistage Embedment and Extraction (No Attack)
We evaluated the performance of the suggested algorithm for the case where the parameters used for the forensic mark embedment and extraction are not changed and no attack is made.
Figure 5(a) shows the 2D barcode generated from the embedded information, where the size of the cell is

Figure 5: Forensic mark generation by
Figure 6(a) shows the original audio signal and Figure 6(b) the audio signal after the embedment of the forensic marks into the V band of the 2D-DWT. The Q value for the embedment was set to be 0.05 and the acquired SNR was 26.10 dB. The message hidden in the QR code was “Always be my baby.”

Figure 6: Original audio and forensic marked audio.
In multistage circulation tracking, information should be embedded into music contents in stages. Table 1 shows the extraction results after embedding the information into the V band, then the H band, and later the D band. The messages “Eric7501232345678” and “Dana7203121234567” were, respectively, embedded to the H band and the D band through the QR codes. The SNRs calculated following the embedment were 22.77 dB and 21.01 dB, respectively, indicating that the SNR values decreased in stages by the information embedment.
Table 1: Extraction results of multiple embedding.
The experimental result showed that the descrambled image had a bit error within 2%, but the 2D barcode restored through block code decoding had no bit error.
Table 2 shows the SNR values before and after the forensic mark embedment into the audio signals of various genres of music. The mean SNR was 26.19 dB and BER was 0% in all the extracted samples.
Table 2: SNR of the marked audio.
4.3. Security Experiment
In this study, we claimed that the DFRNT-based algorithm is secure. To verify this, we tested whether the forensic marks could still be extracted when some of the DFRNT parameters or the scrambling order were changed.
To evaluate the security of the DFRNT, we changed the partial parameters of the DFRNT function and determined whether the extraction was possible. As shown in Table 3, a small change in the three parameters
Table 3: Security by DFRNT.
Additionally, simply changing the order of scrambling gave totally different extraction results. Figure 7 shows the extraction results when the

Figure 7: Security by scrambling.
4.4. Robustness Experiment
Table 4 shows the result after attacks were made, including compressor, “add noise,” and low pass filtering. The state of the 2D barcode was at the level where the embedded messages could be restored in all cases. In the cases of “add noise” and low pass filtering, three symbols of the restored 2D barcode were partially damaged but they could be restored by the standard code system and put into the 2D barcode restorer so that the restoration rate of the embedded information could be increased.
Table 4: Extraction results after Stirmark attacks (QR_Ver = 1).
Table 4 also compares the extraction performance, in terms of BER, between the DWT single domain and the DWT-DFRNT dual domain, whose security has been ensured, with the SNR fixed at 26.10 dB. The two types of domain showed similar extraction results, and the QR codes of both domains were restorable. This experiment showed that the DWT-DFRNT exhibited a level of performance similar to that of the DWT single domain while maintaining security. A similar level of difference may be found if the experiment is performed with different QR code versions and block code patterns.
To include more information in the 2D barcode, we performed an experiment using QR code version 2 (
Table 5 shows the result after attacks were made, including compressor, “add noise,” and low pass filtering with Stirmark. The BER was similar to that of the result shown in Table 4, but more information could be hidden in the QR code.
Table 5: Extraction results after Stirmark attacks (QR_Ver = 2).
We encoded the QR code version 1 (
Table 6 shows the result after attacks were made on the audio signal processed by the
Table 6: Extraction results after Stirmark attacks (QR_Ver = 1).
Table 7 shows the results of the robustness test with audio samples of various genres of music. In most cases, the audio samples were robust to compressor, “add noise,” and low pass filtering by Stirmark.
Table 7: Robustness (QR_Ver = 1,
Table 8 shows the extraction result after the representative attacks by Stirmark. The audio samples were robust to various types of attacks, including add_brumm, add_noise, compressor, dyn_noise, exchange, extra_stereo, and low pass filtering. The experiment was performed with the
Table 8: Extraction results after Stirmark attacks.
Figure 8 shows the bit error extracted after the compressor, “add noise,” and low pass filtering attacks depending on the change of the quantization coefficient. Overall, the bit error decreased as the quantization coefficient was increased. Regarding each attack type, the bit error decrease was the greatest for low pass filtering and the smallest for compressor. The experiment was performed with the
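The qualitative trend, fewer bit errors as the quantization coefficient grows, can be reproduced with a toy simulation. The sketch below is not a Stirmark attack: bounded uniform noise stands in for signal processing, real scalars stand in for DFRNT coefficients, and the parity-based embedding rule, the noise amplitude of 0.2, and all function names are assumptions of the example.

```python
import math
import random

def embed(s, bit, q):
    """Parity quantization: odd index carries +1, even index carries -1 (assumed)."""
    t = math.floor(s / q)
    if (t % 2 == 1) != (bit == 1):
        t += 1
    return t * q + q / 2

def extract(s, q):
    return 1 if math.floor(s / q) % 2 == 1 else -1

def ber_after_noise(q, noise_amp, n=2000, seed=1):
    """Bit error rate after adding uniform noise in [-noise_amp, noise_amp]."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)             # hypothetical coefficient
        bit = rng.choice((1, -1))
        noisy = embed(s, bit, q) + rng.uniform(-noise_amp, noise_amp)
        errors += extract(noisy, q) != bit
    return errors / n

trend = {q: ber_after_noise(q, noise_amp=0.2) for q in (0.1, 0.3, 0.5, 1.0)}
```

In this toy model the error rate drops to zero once Q/2 exceeds the noise amplitude, because the noise can no longer push a coefficient out of its quantization cell. Real attacks are not bounded in this way, so the curves of Figure 8 fall more gradually.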

Figure 8: BERs by changing the quantization coefficient.
Figure 9 shows the bit error extracted after the signal processing including compressor, “add noise,” and low pass filtering when the two types of QR codes, version 1 (

Figure 9: BERs by QR code and block code pattern.

Figure 10: SNRs and BERs by mp3 compression.
Figure 10 shows the trends of the SNR and the extraction bit error of the audio signal according to the bit rate of the mp3 compression. As the bit rate decreased, the SNR decreased slightly. When the compression was performed at 96 kbps, the bit error was less than 2%, which allows accurate information extraction. However, when the compression was performed at 64 kbps or less, the SNR dropped drastically to 16 dB or lower. At this level, the sound quality is too low for a music service to be provided.
Figure 11 shows the trend of the extraction bit error according to the filtering cut-off frequency. When the cut-off frequency was 3 kHz or lower, the bit error exceeded 3%, so the embedded information could not be restored. The experiment for Figures 10 and 11 was performed with the

Figure 11: BERs by filtering cut-off frequency.
Table 9 shows the results of a comparative analysis with [23, 24]. Computational complexity, information capacity, robustness, and security are compared analytically under the same experimental condition with respect to inaudibility for the audio samples. The results indicate that the proposed method has sufficient capacity and higher robustness and security than the other two methods. Because the proposed method uses the DWT-DFRNT dual domain, robustness and security are obtained at the same time. Information capacity is provided by the quantization method, and detection performance is enhanced by the block code method.
Table 9: Overall comparison between related research and the proposed method.
5. Conclusion
In this study, we ensured robustness and security by embedding forensic marks, generated from a 2D barcode image through block coding and scrambling, into the DWT-DFRNT dual domain. The DWT domain gives robustness and is suitable for multistage embedment, while the DFRNT domain contributes to the security of the algorithm by randomly mixing the information so that it is embedded in an unpredictable position within a given frequency band. The block coding and 2D barcode increase the extraction performance by reducing the errors that occur during extraction.
The experimental result showed that the forensic marks were secure, because the extraction failed when the embedment/extraction key, composed of a series of parameters, was partially changed. The result also showed that the forensic mark was robust to a number of attacks by “Stirmark for audio.” Additionally, the forensic marks could be accurately embedded into the frequency subbands by multistage embedment and accurately extracted, showing that multistage illegal circulation can also be tracked.
In this paper, the DFRNT parameters were selected empirically for the experimental environment; a more stable parameter range needs to be determined for security. In addition, the 2D barcode and the block code limit the capacity. Applications that need only low robustness but a high payload can therefore use a low error correction level or a larger 2D barcode, or treat the block code as optional. In future research, an adaptive optimization algorithm should also be studied to improve robustness by embedding the watermark into the optimal space with appropriate strength.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research project was supported by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Copyright Commission in 2011.
