Abstract
In this paper we introduce the Independence National Historical Park dataset, a freely available collection of thin-shell ceramic fragment images and their metadata. We describe the contents of this dataset and its possible uses in vessel and decorative pattern classification of thin-shell ceramic artifacts. We employ computer vision algorithms to help automate the ceramic classifications used in archaeological vessel reconstructions. This methodology promises significant time and cost savings for the laboratory mending of ceramic objects, freeing scarce human resources to focus on the analyses of objects needed to produce new historical insights. We propose a vector space model of pyramid histogram of visual words (PHOW) features that can, in turn, be used by a supervised learning scheme for decorative pattern classification. Inspired by explicit semantic analysis of textual documents using term frequency-inverse document frequency (tf-idf), we propose a tf-idf model to represent thin-shell ceramic artifacts. The effectiveness of this representation is then demonstrated in decorative pattern classification using various machine learning classification algorithms, including support vector machines (SVMs) and k-nearest neighbors.
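As a rough illustration of the tf-idf representation described above, the following minimal sketch treats each image as a "document" whose "terms" are visual words. It is not the authors' implementation: PHOW feature extraction is not shown, the visual-word count histograms are assumed to be given, the smoothed idf formula is one common variant chosen here for convenience, and the function names, toy counts, and pattern labels are all hypothetical. A 1-nearest-neighbor rule with cosine similarity stands in for the SVM and k-nearest-neighbor classifiers mentioned in the abstract.

```python
import math


def tfidf(histograms):
    """Weight visual-word count histograms by tf-idf and L2-normalize them.

    histograms: list of equal-length lists; entry [i][j] is how often
    visual word j occurs in image i (assumed to come from a PHOW-style
    pipeline that is not shown here).
    """
    n_images = len(histograms)
    n_words = len(histograms[0])
    # Document frequency: number of images that contain each visual word.
    df = [sum(1 for h in histograms if h[j] > 0) for j in range(n_words)]
    # Smoothed idf (an assumed variant), so that a word present in every
    # image keeps a small positive weight instead of dropping to zero.
    idf = [math.log((1 + n_images) / (1 + df[j])) + 1.0 for j in range(n_words)]
    weighted = []
    for h in histograms:
        total = sum(h) or 1  # guard against an empty histogram
        vec = [(h[j] / total) * idf[j] for j in range(n_words)]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        weighted.append([v / norm for v in vec])
    return weighted


def nearest_label(query, vectors, labels):
    """Return the label of the most cosine-similar vector.

    All vectors are assumed to be unit length, so the dot product
    equals the cosine similarity.
    """
    sims = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    best = max(range(len(sims)), key=sims.__getitem__)
    return labels[best]


# Hypothetical toy data: four images, four visual words, two pattern classes.
toy_histograms = [[5, 0, 1, 2], [4, 1, 0, 2], [0, 6, 3, 2], [1, 5, 4, 2]]
toy_labels = ["floral", "floral", "banded", "banded"]

toy_vectors = tfidf(toy_histograms)
# Leave-one-out: classify the first image using the remaining three.
predicted = nearest_label(toy_vectors[0], toy_vectors[1:], toy_labels[1:])
```

The idf term down-weights visual words that occur in most images, so the comparison between fragments is driven by the words that distinguish one decorative pattern from another.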
