Abstract
This article describes how transportation distances such as the Earth Mover's Distance can be used to measure melodic similarity in notated music. We represent music notation as weighted point sets in a two-dimensional space of onset time and pitch. The Earth Mover's Distance then compares two point sets by determining how much work it would take to transform one into the other by moving weight between their points.
To evaluate how well this and other methods agree with human perception of melodic similarity, we established a ground truth for the RISM A/II collection based on the opinions of human experts.
The RISM A/II collection contains about half a million musical incipits. For each of 22 queries, we filtered the collection down to about 50 candidates, each of which we then presented to about 30 human experts (out of a group of 37 experts) for a final ranking. We present our filtering methods, the experiment design, the resulting ground truth, and a new measure (called “Average Dynamic Recall”) that can be used for comparing different similarity measures with the ground truth.
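To make the point-set comparison described above concrete, the following is a minimal sketch of an Earth Mover's Distance between two small melodies, not the implementation used in this article. Several things here are assumptions made for illustration: each note is an `(onset, pitch, weight)` tuple with a positive integer weight, the ground distance is Euclidean in the onset-pitch plane, the total flow equals the smaller of the two total weights, and the cost is normalised by that flow. The function names and the simple min-cost-flow solver are also hypothetical choices.

```python
from math import hypot, inf

def add_edge(graph, u, v, cap, cost):
    # Forward edge plus zero-capacity reverse edge (residual graph).
    graph[u].append([v, cap, cost, len(graph[v])])
    graph[v].append([u, 0, -cost, len(graph[u]) - 1])

def earth_movers_distance(a, b):
    """EMD between two lists of (onset, pitch, weight) notes.

    Assumption: weights are positive integers. Solved as a min-cost flow
    with successive shortest paths (Bellman-Ford), fine for small inputs.
    """
    n, m = len(a), len(b)
    source, sink, size = 0, n + m + 1, n + m + 2
    graph = [[] for _ in range(size)]
    for i, (_, _, w) in enumerate(a):
        add_edge(graph, source, 1 + i, w, 0.0)
    for j, (_, _, w) in enumerate(b):
        add_edge(graph, 1 + n + j, sink, w, 0.0)
    for i, (ta, pa, _) in enumerate(a):
        for j, (tb, pb, _) in enumerate(b):
            # Ground distance: Euclidean distance in (onset, pitch) space.
            add_edge(graph, 1 + i, 1 + n + j, inf, hypot(ta - tb, pa - pb))

    target = min(sum(w for _, _, w in a), sum(w for _, _, w in b))
    if target <= 0:
        raise ValueError("both point sets need positive total weight")

    flow, cost = 0, 0.0
    while flow < target:
        # Cheapest augmenting path from source to sink.
        dist = [inf] * size
        prev = [None] * size
        dist[source] = 0.0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == inf:
                    continue
                for k, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + c < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + c, (u, k), True
            if not changed:
                break
        if dist[sink] == inf:
            break
        # Find the bottleneck along the path, then push that much weight.
        push, v = target - flow, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        flow += push
        cost += push * dist[sink]
    return cost / flow  # work per unit of weight moved

# One long note versus two short notes (pitches as MIDI numbers, an assumption).
melody_a = [(0, 60, 2)]
melody_b = [(0, 62, 1), (1, 60, 1)]
distance = earth_movers_distance(melody_a, melody_b)
```

In this toy case, one unit of weight moves two semitones up and the other moves one time unit later, so three units of work are spread over two units of weight. How onset and pitch should be scaled against each other, and how note weights are chosen, are modelling decisions this sketch leaves open.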
