Abstract
This paper reports a novel symbol-to-signal mapping for DNA sequences based on the concept of categorical periodograms. A categorical periodogram is a numeric sequence whose n-th element gives the number of cycles of period n occurring in the symbolic sequence, where the period of a cycle is defined as the number of intervening events plus one. Spectral analysis was carried out on the Cumulative Categorical Periodogram (CCP) of 10 genes from the data set of Burset and Guigo. The spectral signatures in the CCP are observed to be functionally equivalent to the established N/3 peak in the spectrum of indicator sequences of genomes. Because the CCP is a single sequence, whereas the indicator sequence representation requires four sequences, the method is claimed to be functionally equivalent but computationally less expensive for identifying gene coding regions in sequences.
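
The definition above can be illustrated with a minimal Python sketch. It rests on two assumptions that the abstract does not state: a cycle is taken to be the gap between two consecutive occurrences of the same nucleotide, and the cumulative periodogram is taken to be the element-wise sum of the four per-nucleotide periodograms. The function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def categorical_periodogram(seq, symbol):
    """Return a list whose n-th element (1-based) counts cycles of period n.

    Assumption: a cycle is the gap between two consecutive occurrences of
    `symbol`; its period is the number of intervening symbols plus one,
    which equals the difference of the two positions.
    """
    positions = [i for i, s in enumerate(seq) if s == symbol]
    counts = Counter(b - a for a, b in zip(positions, positions[1:]))
    max_period = len(seq) - 1  # longest possible gap in a sequence of this length
    return [counts.get(n, 0) for n in range(1, max_period + 1)]


def cumulative_categorical_periodogram(seq, alphabet="ACGT"):
    """Assumption: the CCP is the element-wise sum over the four nucleotides."""
    per_symbol = [categorical_periodogram(seq, s) for s in alphabet]
    return [sum(column) for column in zip(*per_symbol)]


if __name__ == "__main__":
    # Every nucleotide in this toy sequence recurs after two intervening symbols,
    # so all cycles have period 3.
    print(cumulative_categorical_periodogram("ATGATGA"))  # [0, 0, 4, 0, 0, 0]
```

Under these assumptions the result is one numeric sequence per DNA sequence, so a single spectrum is computed instead of the four needed for the indicator sequence representation.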
