Abstract
In recent years, gazetteers based on Semantic Web technologies have been discussed as an effective way to describe, formalize and standardize place data, using contextual information to structure places and distinguish them from each other. While research on semantic gazetteers for historical places has pointed out the importance of enabling the creation of a global and epoch-spanning gazetteer, we want to emphasize the importance of taking a domain-oriented approach as well – in our case, focusing on places set in medieval and early modern times. By discussing the topic from the historians’ perspective, we identify a number of challenges that are specific to the semantic representation of places set in these time periods. We then survey existing gazetteer projects that take historical places into account. This enables us to find out which existing technologies and practices can meet the demands of a gazetteer that considers the time-specific geographic, social and administrative structures of medieval and early modern times. Finally, we develop a catalogue of design practices for such a semantic gazetteer. Our recommendations are derived from these existing solutions as well as from the epoch-specific challenges identified before.
Introduction
If the humanities and especially historical research are going to make use of the methods and techniques provided by computer sciences, in order to enhance and improve their methodological scope, they need to represent their domain and the knowledge belonging to it in a formalized, standardized, and machine-readable way. To this end, it is necessary to develop models in order to be able to capture the historical data, to enrich and to process them.
When studying the development of human societies from a historical perspective, three dimensions are of major relevance: people, time and space. In this paper we will discuss how to model historic places and spaces, focusing on the central European area and with regard to the medieval and early modern era, by means of information technologies. We propose the usage of well-established standards that enable data to be easily shared, linked, enriched with data from different domains, and reused for a wide variety of research questions.
Standardization inevitably leads to simplifications regarding different aspects of our understanding of historical places. We must therefore examine how such a need for simplification can be met while preserving the required level of complexity. According to the geographer Yi-Fu Tuan, a place can be understood as “a center of meaning constructed by human experience” [63, p. 152]. According to this definition, a place in general can be anything as small and concrete as a gate, or as vast and abstract as an empire. Thus, to understand and work with places, it is crucial to distinguish them from one another along three dimensions: the geographic extent of the place, its
The actual distinction of places is commonly achieved by using specific geographical data that can be structured either with maps (thus focusing on the geometric extent) or by enriching place names with contextual information, which can be done with gazetteers. Both approaches can be represented with the help of computer-based methods and used for automated analysis. Historical cartographers are employing GIS for their needs while the structure of a gazetteer can be modeled by using relational or graph databases. Of course, the idea of a gazetteer pre-dates the digital age. For an overview on the use and genesis of gazetteers throughout history, see [43].
For historians of the pre-modern eras, it is not only important where the places described in historical sources are geographically located. Rather, it is of much more interest in which way they were related to certain groups of people and to agents wielding governmental, juridical or religious power or influence over them, as well as how different places were related to each other. Such a contextual approach, focusing on what humans attribute to a place rather than on its geometrical extent, offers a more accurate description of what distinguishes one place from another. The name as well as the location of a place is just a designation for a space that is in some way meaningful to someone, while the relation of a place to the cultural and temporal setting in which it has certain properties makes it unique as an entity [55, p. 56], [58, p. 138]. Multiple and complex relations can be modelled by using digital gazetteers. A gazetteer uses an implicit structure to arrange and distinguish places by their various properties [19, p. 1042]. In its basic structure, described by Linda Hill, a place in a gazetteer consists of at least one An emerging standard using this specification is
Some challenges related to the development of historical gazetteers have recently been discussed by [3,19,58]. However, these works have mostly focused on the 19th, 20th and 21st centuries. Places, set in these epochs, are more commonly distinguished by clear borders due to an increasingly high administrative penetration of space since the 1800s. A standardization of administrative institutions inside the states means fewer ambiguities to consider [53, pp. 99–100, 106].
Other works have taken a broader approach with upper-level ontologies and encompassed a model for human activity in relation to places [23]. Recent projects try to connect existing gazetteers to create an interconnected model of a global (historical) gazetteer. An example for such a project is Pelagios, interconnecting places and documents with a focus on antiquity [56]. However, such a global gazetteer has to be created as a “federated system”, derived from multiple gazetteers, highly specialized on certain demands [24, p. 87].
This paper will take a step back from this global perspective and adopt a domain-oriented focus on the distinct problems related to the representation of places set in medieval and early modern times, with regard to constraints specific to Europe, especially to the Holy Roman Empire. Places from these epochs share a few properties that are not inherent to later periods. The main reason for this is a different kind of society predominant in Europe before the 19th century. Unlike modern democratic cultures, medieval and early modern societies were split into multiple sub-societies which existed next to one another or in conflict with each other. This also meant a much more ambivalent construction of the meaning attributed to the places that were relevant for these societies. A lack of clearly defined administrative units and a general fuzziness of borders between places are only two prominent examples of such ambiguities. In part, these problems are made even more complicated by incomplete or conflicting historical source material. Modern gazetteers, on the other hand, can more easily disregard such aspects when structuring place data.
In Sections 2 and 3, we discuss the general methodological challenges that arise when creating a gazetteer for historical places – like modelling time, (proto-)administrative hierarchies or conceptualization, under special and more detailed consideration of the medieval and early modern situation. In Section 4 we examine how these problems have been addressed by existing gazetteers and ontologies. These are not dedicated explicitly to medieval and early modern places, but may in some cases be adapted for these specific domains. With this survey, we are able to get an overview of the technologies currently in use and the methodological problems that have already been addressed. Finally, we summarize these results in a catalogue of design-practices for the creation of historical gazetteers that take into account the particular properties of medieval and early modern places, and look ahead at necessary future developments.
In terms of technology we promote the usage of Semantic Web technologies for the creation of gazetteers. The standardization of data and the rules on how to organize them, which are intrinsic properties of Semantic Web technologies, enable a high degree of interoperability among different data sources. This allows researchers to use the same model to represent historical places and to interlink their different collections of data within the web of data. The formalized set of rules is developed and published as an
To begin with, it is imperative that place data are systematically structured, so that places can be better distinguished within gazetteers. In general, there are two basic approaches for structuring place data:
See for example ISO 19112, an international standard for the implementation of a gazetteer based on geospatial features; see [57, p. 70].
A
If marking the geographic position of a place is necessary but the use of coordinates is impossible or not desired (e.g. due to lack of accurate data), places can also be distinguished by using Qualitative Spatial Reasoning. This can be implemented by representing the location of places via topological relations (see for instance [6]). The discovery and visualization of these relations can be facilitated by means of applying Semantic Web technologies and known GIS (Geographic Information Systems) standards [33]. In this case, a place is described in its abstract spatial relations to other places. For example, if one wants to determine which towns and villages were within the dominion of the 16th century Prince-Bishopric of Münster, the extension of the territory3 In this example territory is grasped as an administrative unit of some sort. The example works as well in a narrower form in which territory is understood for example as a parcel of land.
Modelling places and territories in the pre-modern periods poses many conceptual and methodological challenges. In this section we elaborate on these challenges, including specific ontological issues regarding different ways to describe a place as a concept. We also discuss which problems arise when dealing with multiple and changing toponyms and with modelling temporal, territorial and hierarchical structures. Finally, we discuss the issue of capturing data provenance with regard to modelling historical places.
Toponymy
Modelling the designation of places over time poses a challenge in itself. A town in the contemporary Federal Republic of Germany is called
To address these problems, a gazetteer must distinguish between the place as such and its designations.4 For a more detailed discussion of technical solutions for these problems, see [62] with regard to the career biographies of researchers.
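The separation of a place from its designations can be sketched as follows. This is a minimal illustration under stated assumptions, not the data model of any particular gazetteer: the class names `Place` and `PlaceName`, the identifier, and all names and years in the usage example are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlaceName:
    """A designation with the time span in which it is attested (assumed model)."""
    label: str
    language: str
    valid_from: Optional[int] = None  # year; None = unknown or open start
    valid_to: Optional[int] = None    # year; None = unknown or open end

    def covers(self, year: int) -> bool:
        starts_ok = self.valid_from is None or self.valid_from <= year
        ends_ok = self.valid_to is None or year <= self.valid_to
        return starts_ok and ends_ok


@dataclass
class Place:
    """The place as such, identified independently of any of its names."""
    identifier: str
    names: List[PlaceName] = field(default_factory=list)

    def names_at(self, year: int) -> List[str]:
        """All designations attested for the given year."""
        return [n.label for n in self.names if n.covers(year)]


# Hypothetical data: two overlapping names and one later parallel name.
place = Place("place:0001", [
    PlaceName("Oldname", "la", valid_to=1200),
    PlaceName("Newname", "de", valid_from=1150),
    PlaceName("Othername", "nl", valid_from=1300, valid_to=1500),
])
print(place.names_at(1180))
print(place.names_at(1400))
```

Because the identifier never changes, statements about the place remain stable even when several names are in use at once or a name falls out of use.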
In order to describe places based on historical and contemporary human-made attestations, categories have to be developed. Especially when dealing with historical places, the development of these categories already implies an interpretation. This interpretation is either based on historical research or, when using concepts directly derived from historical sources, based on a specific worldview, held by the creators of these sources. When focusing on the latter, a gazetteer cannot be merely understood as a collection of places, but as a “cultural gazetteer” [58, p. 141]. In addition, there are two major aspects to consider when creating concepts for categorizing places.
Firstly, one can distinguish between
Although fiat objects and physical objects can be treated differently, it must nevertheless be possible to create dependencies and relations between them. A physical place like a church building can be located inside an administrative unit, which has its own properties. This means that at least some of the properties concerning the administrative unit must be valid for the church as well. Therefore, the actual place and the meaning attributed to the place by humans are modelled as two distinct concepts. Another example of this could be Vehmic oaks, which served as courts in medieval times. Modelling the tree object and the attributed meaning as a place of jurisdiction as two concepts can also make it easier to represent changes to both aspects separately. The function of any specific Vehmic oak ended at some point in time; the tree itself, on the other hand, can exist much longer and even acquire another cultural meaning, attributed later in time.
Secondly, it is possible to use either specific/contextualized or general terms for place concepts. Place concepts like
The given examples for specific fiat concepts illustrate an additional challenge related to medieval and early modern places in particular. Not all historic ways of ruling can be grasped the same way as contemporary administrative units. Going further back in time, a clearly defined authority ruling a certain territory is less likely to be found. Instead, administration existed in the form of various privileges, e.g. different forms of jurisdiction or the right to raise taxes. These, in a given territory, could be enforced by different agents – e.g. a prince, the church, a town, or a local nobleman. Sometimes the same or similar privileges could be claimed by multiple ruling actors for the same group of people. Therefore an accurate depiction of an administered space with a multitude of privileges based only on fiat concepts would require the creation of a large and heterogeneous number of such concepts. A more accurate way would be to model ruling as relations between agents, privileges and places. This way, the vast ambiguity of medieval administration, as well as small conflicts of interest, could be captured more effectively. An example for the connection of data about places and people to model rulership is provided by the
Place concepts are similar to the
To better grasp these challenges about multiple assertions how a place is regarded, we can again take a look at the history of the city of Münster. During the
Hierarchies
An extensive gazetteer should be able to structure not only the places as singular entities, but also model relations between them. We can distinguish semantic and geometric relations between places. In this section we will focus on the semantic relations between fiat places, while the issue of geometric relations has been discussed in Section 2. When considering semantic relations a historical gazetteer should especially be able to represent the membership of a place in an administrative structure. Modelling these relations has several advantages. When for instance the name of a town is not known to a user, the place can nevertheless be found by its context information e.g. by its association to a certain political entity. Furthermore, the position of a place in a hierarchy of rulership already delivers a basic understanding for the people associated with a place. Their rights, duties and status as ruled subjects can in part be derived from the legal status of the place they inhabit. This legal status can express itself in its relation to other places.
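Finding a place through its context, and reading off the chain of units it belongs to, can be sketched with a simple part-of mapping. This is an illustrative sketch only: the dictionary `PART_OF`, the function names, and the hierarchy labels are assumptions, and the entries ignore time entirely.

```python
from typing import Dict, List, Tuple

# (member, hierarchy type) -> parent unit. Entries are illustrative and undated.
PART_OF: Dict[Tuple[str, str], str] = {
    ("Town of Münster", "administrative"): "Prince-Bishopric of Münster",
    ("Prince-Bishopric of Münster", "administrative"): "Holy Roman Empire",
    ("Town of Münster", "ecclesiastical"): "Diocese of Münster",
}


def chain(place: str, hierarchy: str) -> List[str]:
    """Walk upwards from a place through one hierarchy tree."""
    result: List[str] = []
    seen = {place}
    current = place
    while (current, hierarchy) in PART_OF:
        current = PART_OF[(current, hierarchy)]
        if current in seen:  # guard against accidental cycles in the data
            break
        seen.add(current)
        result.append(current)
    return result


def members(unit: str, hierarchy: str) -> List[str]:
    """All places that belong to a unit, directly or indirectly."""
    return sorted(
        child
        for (child, h) in PART_OF
        if h == hierarchy and unit in chain(child, h)
    )


print(chain("Town of Münster", "administrative"))
print(members("Holy Roman Empire", "administrative"))
```

Keeping the hierarchy type in the key lets the same place sit in several trees at once without the trees interfering with each other.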
When modelling places as defined administrative units, one has to decide between distinguishing the levels in a general (
Firstly, for the medieval and early modern period, we can distinguish between an Especially itinerant craftsmen, for example tinkers, were organized in tinker-districts. These districts stood in relation to a territorial lord who was responsible for their protection; see [32, pp. 831–932].
Until this point, we presumed that hierarchies were not in conflict with each other and therefore not contradictory. But places can also be part of two or more A contemporary example for such a place would be the Crimean peninsula.
To illustrate the problems described above we can take a look at the jurisdictional situation in the city of Münster from the 16th to the 18th century. As a city set in the prince-bishopric of the same name, it fell under the jurisdiction of the diocesan ecclesiastical court (
If territories are modelled in a gazetteer, it has to be asked how continuity and changeability of their borders and coverage are structured in the ontology. Furthermore, processes like merging or splitting of territories need to be modelled, so some relations for stating if there is any continuity between such operations should be created.
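One way to record merging and splitting, together with an explicit statement of continuity, is sketched below. The `Change` record, the `continuity` pairs, and the territories "A", "B", "AB", "C" and "D" are all hypothetical; the sketch only shows that succession and continuity can be kept as separate relations.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Change:
    """A merge or split of territories (assumed record layout)."""
    kind: str                      # "merge" or "split"
    year: int
    predecessors: Tuple[str, ...]
    successors: Tuple[str, ...]
    # (old, new) pairs where the new territory is regarded as continuing the old
    continuity: Tuple[Tuple[str, str], ...] = ()


# Hypothetical data: A and B merge into AB (seen as continuing A); AB later splits.
CHANGES: List[Change] = [
    Change("merge", 1400, ("A", "B"), ("AB",), continuity=(("A", "AB"),)),
    Change("split", 1500, ("AB",), ("C", "D")),
]


def descendants(territory: str) -> Set[str]:
    """Every territory that emerged, directly or indirectly, from the given one."""
    found: Set[str] = set()
    frontier = [territory]
    while frontier:
        current = frontier.pop()
        for change in CHANGES:
            if current in change.predecessors:
                for successor in change.successors:
                    if successor not in found:
                        found.add(successor)
                        frontier.append(successor)
    return found


def continuity_line(territory: str) -> List[str]:
    """Follow only those successions that are asserted to be continuous."""
    line = [territory]
    while True:
        nxt = [new for ch in CHANGES for old, new in ch.continuity
               if old == line[-1] and new not in line]
        if not nxt:
            return line
        line.append(nxt[0])


print(descendants("B"))
print(continuity_line("A"))
```

Whether a successor counts as continuing a predecessor is an interpretive decision, so the sketch stores it as data rather than deriving it from the merge or split itself.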
The representation of the territory as an area poses another problem. When dealing with territories that are encompassed by clearly defined boundaries, it is possible to store area information as polygons to maps. However, with medieval and early modern territories this is rarely feasible, although there were material (like using boundary stones,
Even if border demarcations of this kind are preserved, they do not necessarily mark historical territories accurately. Borders were often in dispute, so that this status has to be captured as well. These considerations only apply, however, when there was indeed a spatial concept of borders, that is, when a dominion was linked to a territory. The polygon-based representation of administrative structures, which is used in modern maps, does not reflect the medieval or early modern situation. In the Middle Ages, ‘ruling’ did not mean
Clear boundaries were more simply established for small spaces in which ruling agents were defined, and where they could unambiguously be marked, e.g. by walls, as it is the case for towns, or part of towns like cathedral immunities [52, p. 10].7 One should note, that even such clearly defined borders usually only marked the core of a sphere of influence. The whole sphere of influence often expanded further to a peripheral state. An example for this was he
Because of the continuous changes of toponyms, place concepts, administrative affiliation or the mere existence of places, a historical gazetteer requires a model of time. There are numerous possibilities to tackle this issue. Some projects, focusing on contemporary place data, simply use a
Using GIS practices as an example, the easiest way to model time is to understand the whole data set as an approximate representation of the world at a certain point in time. This
In fact, there are two general ways of conceiving a model for time: the first is based on the concept of timespan in which a statement is
When every statement in the gazetteer is attributed with a valid time, it is possible to distinguish different aspects of a place at a high level of granularity. Not only the place as a whole, but also individual aspects (e.g. its population or predominant religion) can be distinguished by different time intervals. On the other hand, the
This problem, as well as the problem of continuity, can also be addressed by using the second approach to model time – an event-based approach. Numerous ways for modelling events exist on a theoretical level. In general the valid-time approach assigns time spans as attributes to places while with the event-based approach event-objects with or without a time index are modeled. These can be associated with one or more places. One possibility to further distinguish the events in use is to build an additional ontology to represent the events in the gazetteer.8 For examples of event-ontologies see [39] and [12]. A description of historical periods is provided by the event-gazetteer
A number of these problems result from the use of numerical dates and arise in all the described approaches. Kauppinen et al. distinguish three types of fuzziness regarding the representation of time:
One also has to keep in mind that dates in historical sources may make use of different reference systems for dates. Like the Julian and Gregorian calendars, these can exist simultaneously. Although a standardization of these systems may be considered advantageous for comparability, a gazetteer drawing heavily from historical sources should be able to model different reference systems as well to keep the information provided by the sources.10 A few of the concepts discussed in this section are already implemented in the
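Keeping a date as given in the source while also deriving a comparable value can be sketched as follows. The class `SourceDate` and its field names are assumptions made for illustration. The conversion uses the standard Julian Day Number formula for Julian-calendar dates and only handles dates whose Gregorian equivalent falls in year 1 or later; other historical dating conventions, such as differing starts of the year, are not covered.

```python
from dataclasses import dataclass
from datetime import date


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a Julian-calendar date (years AD only)."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


@dataclass(frozen=True)
class SourceDate:
    """A date as written in the source, with its reference system preserved."""
    year: int
    month: int
    day: int
    calendar: str  # "julian" or "gregorian"

    def normalized(self) -> date:
        """Proleptic Gregorian equivalent, intended for comparison only."""
        if self.calendar == "gregorian":
            return date(self.year, self.month, self.day)
        if self.calendar == "julian":
            # date.fromordinal counts days from 0001-01-01 (Gregorian),
            # whose Julian Day Number is 1721426.
            jdn = julian_to_jdn(self.year, self.month, self.day)
            return date.fromordinal(jdn - 1721425)
        raise ValueError(f"unsupported calendar: {self.calendar}")


d = SourceDate(1582, 10, 5, "julian")
print(d, "->", d.normalized())
```

The original year, month, day and calendar stay untouched, so the information provided by the source is not lost when dates are compared across reference systems.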
Designing an ontology as well as modelling a database for historical places is based on the study of historical sources. It is therefore important to state from which sources place data, concepts and properties are derived, since historical data are rarely unambiguous and undisputed. For historians, it is vital to distinguish between different sources that may present different perspectives on the same event or which have different levels of trustworthiness. Thus, for instance, it makes a difference whether data derive from a charter certifying a certain act of law between different parties, or from a chronicle written many years after the event by one of the interested parties, for instance after a dispute about this act of law.
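Attaching a source to every statement, so that diverging accounts stay visible side by side, can be sketched like this. The classes `Source` and `Statement`, the property name `held_by`, and all example values are hypothetical; the sketch makes no judgement about which source is more trustworthy.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Source:
    title: str
    kind: str  # e.g. "charter" or "chronicle"


@dataclass(frozen=True)
class Statement:
    """One assertion about a place, always tied to the source it comes from."""
    subject: str
    prop: str
    value: str
    source: Source


def conflicts(statements: List[Statement]) -> Dict[Tuple[str, str], List[Statement]]:
    """Group statements by (subject, property) and keep groups with diverging values."""
    grouped: Dict[Tuple[str, str], List[Statement]] = defaultdict(list)
    for s in statements:
        grouped[(s.subject, s.prop)].append(s)
    return {key: group for key, group in grouped.items()
            if len({s.value for s in group}) > 1}


# Hypothetical data: a charter and a later chronicle disagree about one property.
charter = Source("Charter X", "charter")
chronicle = Source("Chronicle Y", "chronicle")
data = [
    Statement("place:0001", "held_by", "Agent A", charter),
    Statement("place:0001", "held_by", "Agent B", chronicle),
    Statement("place:0001", "type", "village", charter),
    Statement("place:0001", "type", "village", chronicle),
]
for key, group in conflicts(data).items():
    print(key, [(s.value, s.source.kind) for s in group])
```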
On the other hand, one should note that not only actual places but also mythological places played a role in historical sources. Fictitious places were occasionally even depicted in maps. The most popular example may be the
Furthermore, in the interest of citability, it might also be desirable to capture which person entered which data. This could become much more important in the future if a contribution to a database by a researcher should be counted as a form of publication [18,41].
Ontological approaches
Existing digital gazetteers mostly try to cover the state of the contemporary world. Some also incorporate historical places, mostly by simply adding a
In the following section, we thus focus on ontologies and projects more aware of the necessity of a historical perspective. Moreover, due to their different perspectives, an overview of these projects also establishes the current state-of-the-art. The following prefixes are going to be used throughout this section:
The
The
Besides these three central projects providing an ontology to describe cultural heritage, a normed vocabulary in the shape of an authority file and a repository to collect structured data, we also examine two historical gazetteer initiatives which focus on different time periods.
The
Of course this list is far from being complete. With examples serving as
In this section, we examine how these projects represent historical places, whether the challenges described above have been addressed by them, and in which way they did so. This is meant to provide conclusions for the modelling of a historical gazetteer for medieval and early modern places and spaces.
CIDOC CRM
The For an overview of the classes and properties discussed in this section, please consult Fig. 1 on page 503

UML representation of the
The question of the diverging toponyms can be addressed by the property
The problem of topological relations between different territories is tackled in the
To solve the issues of ambiguity and uncertainty with historical source data, Hiebel et al. have provided a new approach in their extension of the
This distinction clarifies if one models the actual historical place, or if one talks about a contemporary and possibly flawed representation of it. A fictitious example to illustrate this could be a medieval village that does not exist anymore. Through descriptions in sources about it and archaeological findings we can make educated guesses about its true location and shape. But regardless of how exact our findings may be, we can never be sure that the data we acquire represents all the information of the village as it truly was in the past. To separate our incomplete findings from the historic object, the
Concerning the representation of temporal change and the matter of temporal disambiguation, the
By using
With Version 6.2.2 some features from CRMgeo proposed by Hiebel et al. [28] have become part of the
To describe the provenance of any information modelled with the
One should also note that the classes discussed in this section are only meant to describe real objects. However the

UML representation of the
For the For an overview of the classes and properties discussed in this section, please consult Fig. 2 on page 504

An extract about the town of Münster (Westf)
A conceptual disambiguation of places can be done in different ways. Firstly, there are a few subclasses for The subclasses are called:

An extract about the Prince-Bishopric of Münster
As shown in line 1, the Prince-Bishopric of Münster can be understood as a
The number of the subclasses provided by the
The
It is not possible to model territories as polygons with the
For temporal disambiguation, the
The hierarchy model of the
Since the
In For a simple UML representation of the example see Fig. 3 on page 506 A principality is defined by

UML representation of the Prince-Bishopric of Münster in
Like the
The primary intent of
Temporal data is also modeled in part with topological relations. The respective properties are called
Because of the wide scope of The names of the corresponding properties are
The

UML representation of the
It thus uses specific place concepts that are embedded in a historical context. Although some of the concepts are more general than others, they are not related to each other in a structure of inheritance. Most concepts are defined in a documentation [65]. Because the project’s definition is based on Yi-Fu Tuan’s experience-based approach to places [63], the list consists not only of man-made objects like
The For a simplified UML representation of the data model see Fig. 4.

An extract about Istanbul
As one can see the properties
Furthermore, the
The
Concerning the time model, the properties For a complete list of the predefined time periods see [66].
Since the
Finally, since the
Whereas the
The ontology of the For a complete list of the concepts see [21]. For Prince-Bishopric territories for example exists the concept
The

An extract about Roztoka
Listing 4 shows how two name changes of the village have been modeled in lines 3 and 9 by using See lines 5 and 10 in Listing 4.
The
With the See lines 5 and 10 in Listing 4 for an example.
The hierarchy levels used are specific and historically contextualized. On a conceptual level, there is a distinction between an administrative, an ecclesiastical, and a jurisdictional hierarchy tree. Conflicting memberships at one of these trees can be resolved by the use of timestamps as is shown in Listing 5, depicting the memberships of the

An extract about Freistaat Preußen
For a human user, it is possible to see at one glance the hierarchies a place was part of at a certain time. To achieve this, the whole hierarchy tree for an object is traversed and then visualized when the page of an object is requested. The whole tree for the town of Münster is shown in Fig. 5. Note that the nodes are named with historical terms to achieve a better distinction. The

An extract about Freistaat Preußen
Even if none of the projects discussed above is explicitly designed to take the specific problems of places in medieval and early modern times into account, a number of the challenges stated in Section 3 could already be considered resolved by them. Other issues yet remain untouched. (See Table 1 for an overview.)
Challenges concerning categorization of place have been addressed insufficiently. The distinction between fiat places and physical places as well as general and specific concepts is only done by the
When modelling territories, a geographical approach is used by the

The town of Münster as an example for administrative hierarchies in the genealogical gazetteer.
All ontologies introduced in this paper solve the problem of multiple names. However, only
Problems of temporal disambiguation mostly remain unsolved. Except for the
By separating the historical source from the editor of a data set, only the
Challenges addressed by the ontologies discussed in this paper
Place concepts
Finally, we summarize which of the different approaches discussed above are required for a gazetteer that covers medieval and early modern places.
With regard to place concepts, the most important question is the decision between using general or specific categories. General concepts allow for more interoperability and comparability, while specific concepts can be more historically accurate and allow for deeper levels of historical analysis. We propose the use of both in combination with an inheritance structure, so that the specific concepts are specializations of the general ones. The use of multiple inheritance guarantees the specification of more complex and ambiguous historical concepts. An enrichment with context information – for example through the property
As shown in the
Toponymy
The importance of modelling multiple names for each place has been demonstrated. The most common and efficient practice to resolve this issue is to distinguish between a place and its names. The names can be single (place name) objects themselves, and therefore be provided with their own temporal attribution, shown in the
Territories
To be useful for qualitative reasoning, territories can be modelled by their spatial (e.g. topological) relations in addition to (or instead of) their geometric representation. This makes it possible to enrich the models by inferring new relations, for instance relations between current and historical places (see [34,36]). Although topological relations do not offer the positional accuracy of a coordinate-based map representation, they better meet the requirements for representing places that lack clearly defined boundaries.
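Inferring new relations from stated ones can be illustrated with the simplest case, the transitivity of containment. This is a toy sketch, not a full qualitative spatial reasoner: the relation name "within", the function `infer_within`, and the place labels are hypothetical.

```python
from typing import Set, Tuple

Relation = Tuple[str, str]  # (inner place, outer place), read as "inner within outer"


def infer_within(facts: Set[Relation]) -> Set[Relation]:
    """Transitive closure: if a is within b and b is within c, then a is within c."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical stated facts, with no coordinates involved.
stated = {
    ("Village X", "Parish Y"),
    ("Parish Y", "Territory Z"),
    ("Farmstead W", "Village X"),
}
inferred = infer_within(stated) - stated
print(sorted(inferred))
```

None of the inferred relations requires a boundary or a coordinate, which is why this style of reasoning remains usable where geometries are unknown or disputed.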
If it becomes necessary to visualize the territory on a map, we propose that its extent be approximated by selecting places that belong to the respective territory and whose position is less uncertain (like towns and villages). This provides a much more accurate depiction of medieval and early modern realities, in which authority was defined by power over towns, villages, farmsteads and rights to use forests or stretches of water.
In some cases, it can be useful to include geometries, for example if a gazetteer serves as an object catalog for a GIS application. If territories also capture administrative authority in terms of dominion over spaces, one should consider using a model that represents different degrees of administrative penetration. In this case, it should also be possible to represent overlaps.
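Approximating a territory by its member places can be sketched as follows. The territory is kept as a set of points, and a bounding box is derived only to position a map view. All names and coordinates are hypothetical, and the bounding box is an assumption of this sketch rather than a boundary claim.

```python
from typing import Dict, List, Optional, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude)

# Hypothetical places with comparatively certain positions.
PLACES: Dict[str, Coordinate] = {
    "Town A": (7.0, 52.0),
    "Village B": (7.5, 51.5),
    "Village C": (8.0, 52.25),
    "Town D": (9.0, 50.0),
}

# Hypothetical membership of places in a territory.
MEMBERSHIP: Dict[str, List[str]] = {
    "Territory Z": ["Town A", "Village B", "Village C"],
}


def territory_points(territory: str) -> List[Coordinate]:
    """The territory represented as the positions of its member places."""
    return [PLACES[p] for p in MEMBERSHIP.get(territory, []) if p in PLACES]


def bounding_box(points: List[Coordinate]) -> Optional[Tuple[float, float, float, float]]:
    """(min lon, min lat, max lon, max lat) for a map viewport, or None if empty."""
    if not points:
        return None
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


points = territory_points("Territory Z")
print(points)
print(bounding_box(points))
```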
The problem remains that most of the approaches examined have been developed for a world largely pervaded by homogeneous administrative structures. The distinctive features of medieval and early modern dominion, which have been stated in Sections 2 and 3, are ignored by current gazetteer projects when it comes to modelling ruling structures on the level of towns or villages. To capture these features in a gazetteer for the medieval and early modern world, a new model that understands ruling as an interconnection between places, non-governmental institutions and people has yet to be developed. Such a model of ruling would have to be developed from scratch because of its high dependence on history as a knowledge domain. Nevertheless, it should be applicable as an extension to every gazetteer that tries to take pre-modern political realities into account.
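Purely as an illustration of what "ruling as an interconnection" could mean at the data level, the following sketch stores each claim as a relation between an agent, a privilege and a place. It is not the model called for above: the `Claim` record, the function names, and all agents, privileges and years are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Claim:
    """An agent claims a privilege over a place for some time span (assumed layout)."""
    agent: str
    privilege: str
    place: str
    valid_from: Optional[int] = None
    valid_to: Optional[int] = None

    def covers(self, year: int) -> bool:
        return ((self.valid_from is None or self.valid_from <= year)
                and (self.valid_to is None or year <= self.valid_to))


def holders(claims: List[Claim], privilege: str, place: str, year: int) -> List[str]:
    """All agents claiming the given privilege over the place in that year."""
    return sorted(c.agent for c in claims
                  if c.privilege == privilege and c.place == place and c.covers(year))


def contested(claims: List[Claim], year: int) -> Dict[Tuple[str, str], List[str]]:
    """Privileges claimed by more than one agent for the same place."""
    keys = {(c.privilege, c.place) for c in claims if c.covers(year)}
    result = {k: holders(claims, k[0], k[1], year) for k in keys}
    return {k: v for k, v in result.items() if len(v) > 1}


# Hypothetical data: jurisdiction is claimed twice, taxation only once.
claims = [
    Claim("Prince", "jurisdiction", "Village X", 1500, 1600),
    Claim("Town council", "jurisdiction", "Village X", 1550, 1650),
    Claim("Local nobleman", "taxation", "Village X", 1500, None),
]
print(holders(claims, "jurisdiction", "Village X", 1520))
print(contested(claims, 1560))
```

Overlapping claims are simply several rows in the same structure, so no separate fiat concept is needed for each combination of privileges.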
Temporal disambiguation
From the different approaches to model time introduced in Section 3.5, in practice, the
Furthermore, the valid time-interval in itself claims a continuity which is not covered by the sources. In most of the cases historical texts report events. In this case, the continuity is already an interpretation of the data from today’s perspective. Only an event-based approach can account for the problem of modelling continuity between the transition from one time span to another, and the creation of time spans at all. Since historical texts in general inform us about events that resulted in changes to the world rather than predefined time spans in which there was no change, event-based models bear a closer proximity to the source material. Therefore, the aspect of temporal disambiguation has to take a more prominent place in the design and usage of ontologies for historical data.
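The difference between the two approaches can be made concrete with a small sketch in which only events are stored and time spans are derived from them on request. The `Event` record, the function names, and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """A reported change to one attribute of a place (assumed record layout)."""
    year: int
    place: str
    attribute: str
    value: str
    source: str


def state_at(events: List[Event], place: str, year: int) -> Dict[str, str]:
    """Replay all events up to the given year to obtain the state of a place."""
    state: Dict[str, str] = {}
    for e in sorted(events, key=lambda e: e.year):
        if e.place == place and e.year <= year:
            state[e.attribute] = e.value
    return state


def derived_intervals(events: List[Event], place: str,
                      attribute: str) -> List[Tuple[str, int, Optional[int]]]:
    """Time spans between successive events; an interpretation, not source data."""
    relevant = sorted((e for e in events
                       if e.place == place and e.attribute == attribute),
                      key=lambda e: e.year)
    intervals = []
    for i, e in enumerate(relevant):
        end = relevant[i + 1].year if i + 1 < len(relevant) else None
        intervals.append((e.value, e.year, end))
    return intervals


# Hypothetical events drawn from two hypothetical sources.
events = [
    Event(1300, "place:0001", "status", "village", "Charter X"),
    Event(1450, "place:0001", "status", "town", "Charter Y"),
    Event(1400, "place:0001", "lord", "Agent A", "Chronicle Z"),
]
print(state_at(events, "place:0001", 1420))
print(derived_intervals(events, "place:0001", "status"))
```

The stored data contain only what the sources report; the continuity implied by each interval appears solely in the derived view and can be recomputed when a new event is added.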
Finally, all aspects of a place have to be distinguished by temporal properties as shown in the
Hierarchies
The need for modelling hierarchies depends on the geographical, temporal, and conceptual scope of an ontology for pre-modern places. The
Provenance of data
Since the data in a gazetteer for historical places heavily relies on information drawn from historical sources, it is imperative that the provenance of all statements in such a database is made visible. Ideally, the data model used for capturing provenance is able to distinguish different concepts of historical sources. As shown with the
Conclusions
There is not only a trend in research concerning gazetteers to develop ontologies that are able to model places from different temporal, cultural and geopolitical reference frames on a general level, but also a need for this development in order to ensure comparability and interoperability. At the same time, some design problems have to be approached not from a general but from a domain-oriented perspective. In this paper, we have shown this by integrating the domain-based perspective of the historical humanities with the broader approach developed in the geographical and computer sciences. We focused on the challenges arising when modelling historical places which are set in medieval and early modern times. We examined the nature of such places from different perspectives: the naming of places (toponymy), their categorization (place concepts), their relation to other places (hierarchies), their spatial extension (territories), their change over time (temporal disambiguation) and their validity in research as well as in historical sources (provenance of data). We argued that each of these aspects comes with its own specific design challenges. Furthermore, the challenges regarding place concepts and temporal disambiguation in particular have shown how strongly a certain historical thinking and understanding of the past can shape particular aspects of an ontology. This observation shows again how important a domain-oriented design perspective is.
We have then surveyed a number of ontologies from existing projects (none of them specializing in the medieval and/or early modern time period) and discussed how they approach the challenges we identified. Thereby we have shown which existing technological solutions can meet our demands and which cannot.
Derived from this discussion as well as from the challenges identified before, we have developed a catalog of design practices for the creation of a domain-specific semantic gazetteer covering medieval and early modern places.
