Abstract
In this paper, we study the privacy breach caused by unsafe correlations in transactional data, where an individual may have multiple tuples in a dataset. We provide two safety constraints to guarantee safe correlation of the data: (1) the safe grouping constraint, which ensures that quasi-identifier and sensitive partitions are bounded by l-diversity, and (2) the schema decomposition constraint, which eliminates non-arbitrary correlations between non-sensitive and sensitive values, protecting privacy while improving support for aggregate analysis. In our technique, values are grouped together in unique partitions that enforce l-diversity at the level of individuals. We also propose an association-preserving technique to increase the ability to learn from and analyze the anonymized data. To evaluate our approach, we conduct a set of experiments that measure the privacy breach and investigate the anonymization cost of safe grouping and of preserving associations.
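As an illustration only, the idea of enforcing l-diversity at the level of individuals can be sketched as a check over a grouped dataset. The sketch below is one plausible reading and not the paper's algorithm: it assumes that a grouping is acceptable when (a) all tuples of an individual fall in a single group and (b) every group contains at least l distinct individuals and at least l distinct sensitive values. The function name `check_safe_grouping` and the row layout `(group_id, individual_id, sensitive_value)` are hypothetical.

```python
from collections import defaultdict


def check_safe_grouping(rows, l):
    """Return True if the grouping meets the assumed conditions.

    rows: iterable of (group_id, individual_id, sensitive_value) tuples.
    l:    required diversity level (assumed semantics, see lead-in).
    """
    groups_of_individual = defaultdict(set)
    individuals_in_group = defaultdict(set)
    sensitive_in_group = defaultdict(set)

    for group_id, individual_id, sensitive in rows:
        groups_of_individual[individual_id].add(group_id)
        individuals_in_group[group_id].add(individual_id)
        sensitive_in_group[group_id].add(sensitive)

    # Assumed condition (a): an individual's tuples never span several groups.
    if any(len(groups) > 1 for groups in groups_of_individual.values()):
        return False

    # Assumed condition (b): each group holds at least l distinct individuals
    # and at least l distinct sensitive values.
    return all(
        len(individuals_in_group[g]) >= l and len(sensitive_in_group[g]) >= l
        for g in individuals_in_group
    )


# Toy data: alice has two tuples, both placed in group g1.
example_rows = [
    ("g1", "alice", "flu"),
    ("g1", "alice", "cold"),
    ("g1", "bob", "asthma"),
    ("g2", "carol", "flu"),
    ("g2", "dave", "diabetes"),
]
print(check_safe_grouping(example_rows, 2))
```

Counting distinct individuals rather than distinct tuples is the point of the sketch: a group filled with many tuples from one person would pass a tuple-level count but would not pass this check.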
