Sequential pattern mining can prove to be very useful for
predicting future activities, interpreting recurring phenomena, extracting
similarities in a series of events, etc. For example, in the NASDAQ market, the
problem of finding stocks whose closing prices are always about $\beta_0$ higher
than, or $\beta_1$ times, those of a given company reduces to linear pattern
retrieval: given a query X, find all sequences Y in the database S such that
$Y = \beta_0 + \beta_1 X$ with confidence C.
In this paper, we introduce a novel approach using the Simple Linear
Regression (SLR) model to match and retrieve sequential patterns. We extend the
one-dimensional $R^2$ model to $ER^2$ for
multi-dimensional sequence matching. In addition, we present the SLR + FFT
pruning technique to speed up data retrieval without incurring any false
dismissal. Experimental results on both synthetic and real datasets show that
the pruning ratio of SLR + FFT can be above 99%. Applying the retrieval
technique to real stocks resulted in the discovery of many interesting patterns,
some of which are presented in the paper. Also, using $ER^2$
as the similarity measure for on-line signature recognition yielded high
accuracy.
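
As an illustration of the one-dimensional case described above, the following is a minimal sketch of linear pattern retrieval by brute-force scan. It fits $Y = \beta_0 + \beta_1 X$ by ordinary least squares and keeps the sequences whose $R^2$ reaches a threshold. The function names `fit_slr` and `retrieve_linear_patterns`, the dictionary layout of the database, and the use of a threshold `min_r2` as a stand-in for the confidence C are assumptions made for this example. The sketch covers neither the $ER^2$ extension nor the SLR + FFT pruning step.

```python
from statistics import mean


def fit_slr(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # R^2 is undefined when either sequence is constant.
        raise ValueError("constant sequence: R^2 is undefined")
    b1 = sxy / sxx
    b0 = my - b1 * mx
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return b0, b1, r2


def retrieve_linear_patterns(query, database, min_r2):
    """Scan every sequence and keep those with R^2 >= min_r2 (assumed threshold)."""
    hits = []
    for name, seq in database.items():
        try:
            b0, b1, r2 = fit_slr(query, seq)
        except ValueError:
            continue  # skip sequences that cannot be fitted
        if r2 >= min_r2:
            hits.append((name, b0, b1, r2))
    return hits


# Hypothetical toy data: "A" is exactly 1 + 2 * query, "B" is unrelated.
query = [1, 2, 3, 4, 5]
database = {"A": [3, 5, 7, 9, 11], "B": [5, 1, 4, 2, 3]}
hits = retrieve_linear_patterns(query, database, min_r2=0.9)
print(hits)
```

On this toy data only sequence "A" is retrieved, with $\beta_0 = 1$, $\beta_1 = 2$ and $R^2 = 1$, while "B" has $R^2 = 0.09$ and is rejected. A linear scan of this kind touches every sequence in the database, which is the cost that a pruning step is meant to avoid.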