Classification of Short Kinetics by Shape

Abstract

Discerning significant relationships in small data sets remains challenging. We introduce the Hamming distance matrix and show that it is a quantitative classifier of similarities among short time series. Its elements are obtained by computing a modified form of the Hamming distance between pairs of symbol sequences derived from the original data sets. The values of the Hamming distance matrix are then amenable to statistical analysis. Examples from stem cell research are presented to illustrate different aspects of the method. The approach is likely to have applications in many fields.

Keywords

symbolic data transformation, Hamming distance, kinetics, hematopoietic stem cells, trading volume
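
The pipeline summarized in the abstract (short time series, then symbol sequences, then a matrix of pairwise distances) can be illustrated with a minimal sketch. Two points are assumptions made only for illustration: the symbol alphabet (U, D, F for an upward, downward, or flat step between consecutive time points) and the use of the standard Hamming distance, since the modified form used by the method is not specified in this section. The function names, the `tol` parameter, and the sample values are hypothetical and carry no experimental meaning.

```python
from itertools import combinations


def symbolize(series, tol=0.0):
    """Encode each step of a series as U (up), D (down) or F (flat).

    Assumed encoding for illustration; `tol` is a hypothetical threshold
    below which a change counts as flat.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > tol:
            symbols.append("U")
        elif delta < -tol:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def hamming(a, b):
    """Standard Hamming distance: number of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_matrix(series_list, tol=0.0):
    """Symmetric matrix of pairwise Hamming distances between symbolized series."""
    seqs = [symbolize(s, tol) for s in series_list]
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = hamming(seqs[i], seqs[j])
        matrix[i][j] = d
        matrix[j][i] = d
    return matrix


# Made-up short kinetics, four time points each.
kinetics = [
    [1.0, 2.0, 3.0, 2.5],  # rises twice, then falls
    [0.5, 0.9, 1.4, 1.0],  # same shape at a different scale
    [3.0, 2.0, 1.0, 1.5],  # falls twice, then rises
]
matrix = hamming_matrix(kinetics)
for row in matrix:
    print(row)
```

In this sketch the first two series differ in magnitude but share a shape, so their distance is zero, while the third series is maximally distant from both. The entries of such a matrix are the kind of values that could then be passed to a statistical analysis.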