Abstract
In dynamic environments, predictive models face the challenge of concept drift: changes in data distributions that degrade model performance over time. Detecting and visualizing concept drift can yield valuable insight into data dynamics, especially for multidimensional data, and is closely related to visual knowledge discovery. To this end, we introduce Parallel Histograms through Time (PHT), a novel visualization model based on parallel coordinates. PHT supports the visualization and identification of concept drift in the input data and lets the evolution of several features be observed simultaneously. When paired with a classifier, the PHT model visualizes data changes in terms of (i) histograms over time and (ii) variation in the characteristics of the features. Although a classifier alone can identify changes, the PHT model enhances this capability by visually representing the detected drift, which provides a sound basis for a comprehensive justification of the observed changes. Experimental evaluations on synthetic (CIRCLES, SINE1) and real-world (WEATHER, ELECTRICITY) datasets demonstrate that our method detects drift with high accuracy. In conjunction with domain-specific causality analysis, our model can be used to justify the identified and visualized concept drifts.
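As a rough illustration of what "histograms over time" means for a single feature, the sketch below splits a stream into consecutive windows, builds one normalized histogram per window, and compares two windows. It is not the PHT model itself: the function names, the fixed window size, the bin range, and the use of total variation distance as a shift measure are all assumptions made for this example.

```python
from typing import List, Sequence


def windowed_histograms(
    values: Sequence[float],
    window_size: int,
    n_bins: int,
    lo: float,
    hi: float,
) -> List[List[float]]:
    """Return one normalized histogram per full, non-overlapping window.

    Hypothetical helper: the windowing and binning scheme are assumptions,
    not taken from the paper.
    """
    if window_size <= 0 or n_bins <= 0 or hi <= lo:
        raise ValueError("window_size and n_bins must be positive and hi > lo")
    width = (hi - lo) / n_bins
    histograms: List[List[float]] = []
    for start in range(0, len(values) - window_size + 1, window_size):
        counts = [0] * n_bins
        for v in values[start:start + window_size]:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        histograms.append([c / window_size for c in counts])
    return histograms


def histogram_shift(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two normalized histograms (0 to 1).

    Assumed stand-in for a drift indicator; the paper may use another measure.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy stream for one feature: values move from the low bin to the high bin.
stream = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
hists = windowed_histograms(stream, window_size=4, n_bins=2, lo=0.0, hi=1.0)
shift = histogram_shift(hists[0], hists[1])
```

Running the same procedure for each feature and placing the per-window histograms side by side gives the kind of raw material a parallel-coordinates style view could display; a large shift between consecutive windows marks a candidate drift point to inspect.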
