Abstract
The sharing of product and process information plays a central role in coordinating supply chain operations and is a key driver for their success. “Linked pedigrees”, linked datasets that encapsulate event based traceability information about artifacts as they move along the supply chain, provide a scalable mechanism to record and facilitate the sharing of track and trace knowledge among supply chain partners. In this paper we present “OntoPedigree”, a content ontology design pattern for the representation of linked pedigrees that can be specialised and extended to define domain specific traceability ontologies. Events captured within the pedigrees are specified using EPCIS, a GS1 standard for the specification of traceability information within and across enterprises, while certification information is described using PROV, a vocabulary for modelling the provenance of resources. We exemplify the utility of OntoPedigree in linked pedigrees generated for supply chains within the perishable goods and pharmaceuticals sectors.
Introduction
One of the most important challenges in supply chains and logistics is information integration. Sharing of data and knowledge in a standardised manner along the supply chain is crucial not only to enable visibility, i.e., tracking and tracing of artifacts, but also to enable the more effective management of the supply chain.
Barcodes and more recently RFID tags have provided initial solutions to this challenge. GS1,1
However, a critical limitation of the current EPCIS specification is that, although it proposes a mechanism to exchange and share data, the EPCIS XML schemas define only the structure of the data to be recorded. The semantics of the data and of the data curation processes are only informally defined. Their interpretation is left to the individual engines implementing the EPCIS specification, which greatly increases the possibility of interoperability issues arising between supporting applications, e.g., validation and discovery Web services built over the event repositories.
To overcome the limitations highlighted above, we propose the concept of event based “linked pedigrees” [9]: interlinked datasets described in RDF, curated by consuming real time EPCIS events in the supply chain and encapsulated as linked data, which enable the capture of a variety of tracking and tracing information, such as the Chain of Custody (CoC) and Chain of Ownership (CoO), about products as they move among the various trading partners. The notion of linked pedigrees is motivated by the widely prevalent use of pedigrees for tracking and tracing commodities in the pharmaceutical industry.
Linked pedigrees specifically overcome a significant limitation of conventional pedigree exchange: information is available only from partners one-up or one-down in the supply chain. Dereferencing URIs makes it possible to sequentially traverse the chain of pedigrees exchanged between partners and retrieve traceability information, provided that adequate authentication, authorisation and access control mechanisms are in place.
In this paper we present “OntoPedigree”, a content ontology design pattern for the data modelling of linked pedigrees that can be specialised and extended to define domain specific or indeed product specific pedigree ontologies. In particular the paper makes the following scientific contributions:
We formalise the notion of a track and trace artifact, the “linked pedigree”, using the underlying standards for the Semantic Web and the principles of linked data to represent traceability-specific domain knowledge in supply chains.
We illustrate how EPCIS governing supply chain events can be exploited to generate linked pedigrees using the OntoPedigree design pattern.
The paper is structured as follows: Section 2 presents the requirements and competency questions we considered for the design of the pattern. Section 3 presents the intent, conceptual entities and the graphical illustration of OntoPedigree. It also highlights the relationship of OntoPedigree to other patterns and ontologies and provides details of implementation support for OntoPedigree. Section 4 presents the OWL axiomatisation of the pattern. Section 5 presents scenarios from the agri-food and pharmaceutical supply chains where OntoPedigree has been applied. Section 6 discusses related work. Section 7 presents conclusions.
Pedigrees were initially defined in EPCglobal’s ratified Pedigree Standard (Version 1.0, 2007).
While the above description of a pedigree is given within the context of pharmaceutical supply chains, the interpretation of the definition highlights certain key requirements for the design of a generic content ontology design pattern for pedigrees, that could be reused across multiple sectors.
Recently the concept of the “Event based Pedigree” has been proposed, which utilises EPCglobal’s EPCIS specification for capturing events in the supply chain and generates traceability datasets based on a relevant subset of the captured events. OntoPedigree, the design pattern for the generation of linked pedigrees presented in this paper, builds on the event based pedigree approach.
Below we outline some of the central requirements for pedigrees which form the basis of the entities defined in OntoPedigree.
As highlighted in Section 1, a linked pedigree is a dataset, represented as a named graph [2], identified using an http URI, described and accessed using linked data principles and represented using the RDF data model. The definition of a pedigree must include URIs for certification, product, transaction and consignment information asserted in the trading partner’s knowledge base. It must include URIs to the pedigrees sent by the immediate upstream or downstream partners.
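As a rough illustration of this definition, the sketch below models a linked pedigree as a named graph, i.e., a set of statements grouped under the pedigree's own HTTP URI, and prints it as N-Quads. The namespace, the property names (such as `hasConsignmentInfo`) and all URIs are hypothetical placeholders chosen for this example; they are not the terms defined by OntoPedigree.

```python
# Hypothetical namespace and URIs, used for illustration only.
PED = "http://example.org/ped#"
GRAPH = "http://example.org/pedigrees/franz/0001"  # URI naming the pedigree graph


def quad(s, p, o, g=GRAPH):
    """Return one N-Quads line in which all four terms are IRIs."""
    return f"<{s}> <{p}> <{o}> <{g}> ."


# The four kinds of information a pedigree must link to (assumed property names).
links = {
    "hasCertificationInfo": "http://example.org/franz/cert/organic-2013",
    "hasProductInfo": "http://example.org/franz/product/tomato",
    "hasTransactionInfo": "http://example.org/franz/events/transaction/17",
    "hasConsignmentInfo": "http://example.org/franz/events/shipping/42",
}

quads = [quad(GRAPH, PED + prop, target) for prop, target in links.items()]

# A downstream partner's pedigree would add one more statement per pedigree
# it received, pointing at the URI of the upstream pedigree.
print("\n".join(quads))
```

Because every value is a URI, a consumer holding only the pedigree can dereference each link to obtain the underlying event, product or certification data.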
Competency questions
Given the requirements above, we define certain competency questions that govern the design of an ontology pattern for pedigrees.
Who is the creator of the pedigree?
What is the supply chain creation status of a given pedigree?
Which are the business transactions recorded against a particular consignment?
What are the events associated with pedigrees created between dates X and Y?
Which products have been shipped together?
Which other pedigrees are included in the received pedigree?
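To make the competency questions more concrete, the following sketch answers a few of them over a toy in-memory store. The record layout, field names, status values and URIs are all assumptions made for this example; over real linked pedigrees the same questions would more naturally be posed as queries against the RDF data.

```python
from datetime import date

# Toy pedigree records keyed by URI; every name and value here is hypothetical.
pedigrees = {
    "http://example.org/ped/franz/1": {
        "creator": "Franz farmer", "status": "Initial",
        "created": date(2013, 5, 2),
        "events": ["http://example.org/franz/ev/1"],
        "products": ["tomato-batch-A"],
        "received": [],
    },
    "http://example.org/ped/joe/1": {
        "creator": "Joe trader", "status": "Intermediate",
        "created": date(2013, 5, 4),
        "events": ["http://example.org/joe/ev/7", "http://example.org/joe/ev/8"],
        "products": ["tomato-batch-A", "tomato-batch-B"],
        "received": ["http://example.org/ped/franz/1"],
    },
}


def creator_of(uri):
    """Who is the creator of the pedigree?"""
    return pedigrees[uri]["creator"]


def events_between(start, end):
    """Events of all pedigrees created between two dates (inclusive)."""
    return [ev for p in pedigrees.values()
            if start <= p["created"] <= end
            for ev in p["events"]]


def shipped_together(uri):
    """Which products have been shipped together under this pedigree?"""
    return pedigrees[uri]["products"]


def included_pedigrees(uri):
    """Which other pedigrees are included in the given pedigree?"""
    return pedigrees[uri]["received"]


print(creator_of("http://example.org/ped/joe/1"))
print(events_between(date(2013, 5, 3), date(2013, 5, 5)))
print(included_pedigrees("http://example.org/ped/joe/1"))
```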
Pattern description and graphical representation
Intent
OntoPedigree provides a minimalistic abstraction and defines conceptual entities for the modelling of semantically enriched knowledge required to enhance data visibility in a supply chain. The pattern can be specialised to define domain specific pedigrees.

Fig. 1. Graphical representation of OntoPedigree.
Some of the key concepts and relationships encapsulating the data model defined by the pattern are described below. Note that several entities in OntoPedigree extend from or exploit concepts and relationships defined in PROV-O8
A pedigree defined by a repacking trader, who combines goods from several consignments received from upstream stakeholders.
A pedigree defined by a downstream retailer who returns goods to an upstream trader.
Traceability information in OntoPedigree regarding consignments and transactions is specified as sets of EPCIS events described using the EEM [10] ontology. Certification information is defined using the PROV-O vocabulary. Extending from PROV-O serves the dual purpose of attaching provenance to pedigrees and facilitating the querying of pedigrees for certified knowledge using a dedicated vocabulary. Product master data is specified using domain specific vocabularies that in many scenarios exploit the GoodRelations ontology.
Pedigrees generated using OntoPedigree are based on GS1 standards. Supply chain processes that implement these standards for the generation of linked pedigrees can declare their conformance and enforcement to these standards using the
Figure 1 illustrates the graphical representation of OntoPedigree. It depicts the entities defined for the pattern and their relationships with entities from PROV-O, EEM and GoodRelations.
Implementation support for OntoPedigree
In order to facilitate the creation of linked pedigrees that exploit EEM and OntoPedigree, we have implemented the LinkedEPCIS reference implementation and Java library.
A pedigree is required to have certain mandatory properties and relationships as part of its definition. In this section we present an incremental axiomatisation of OntoPedigree. A pedigree is a PROV entity. It must include a creation time and a serial number, and must describe its status. It must assert the creating organisation and may include the service used for its creation. It must include at least one consignment relation, a transaction relation and at least one relationship capturing product master data. Finally, it may include links to pedigrees received from the immediate upstream or downstream partners.
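The constraints listed above can be read as minimum cardinalities on the properties of a pedigree. The sketch below checks a candidate record against them. It is only an approximation under stated assumptions: the field names are hypothetical stand-ins for the OntoPedigree properties, and the check mirrors the prose rather than the OWL axioms themselves.

```python
# Minimum number of values per field, paraphrasing the constraints in the text.
# Field names are hypothetical; 0 marks an optional ("may include") field.
MIN_COUNT = {
    "creation_time": 1,
    "serial_number": 1,
    "status": 1,
    "creator": 1,
    "creation_service": 0,
    "consignment_info": 1,
    "transaction_info": 1,
    "product_info": 1,
    "received_pedigrees": 0,
}


def violations(pedigree):
    """Return the names of fields that have fewer values than required."""
    return [name for name, minimum in MIN_COUNT.items()
            if len(pedigree.get(name, [])) < minimum]


complete = {
    "creation_time": ["2013-05-02T10:00:00Z"],
    "serial_number": ["0001"],
    "status": ["Initial"],  # assumed status value
    "creator": ["http://example.org/org/franz"],
    "consignment_info": ["http://example.org/franz/events/shipping/42"],
    "transaction_info": ["http://example.org/franz/events/transaction/17"],
    "product_info": ["http://example.org/franz/product/tomato"],
}
incomplete = {k: v for k, v in complete.items() if k != "status"}

print(violations(complete))    # []
print(violations(incomplete))  # ['status']
```

Optional fields are listed with a minimum of zero so that the table documents the whole pattern, not only its mandatory part.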
Combining the above assertions, the overall definition of

Fig. 2. Manchester syntax serialisation of OntoPedigree.

Fig. 3. Generalised agri-food chain scenario for tomatoes.
In this section we present scenarios from the perishable goods and pharmaceutical supply chains that exemplify the use of OntoPedigree in the exchange of traceability information. Perishable goods is one of the scenarios considered in the FI-PPP projects SmartAgrifood13

Fig. 4. Linked pedigrees created using OntoPedigree and exchanged between the supply chain partners.
The lifecycle of perishable goods in the agri-food sector, e.g., tomatoes, is complex because of the number of stakeholders involved before the goods reach the end consumer and the diverse set of data that is produced. The tomato supply chain involves thousands of farmers, hundreds of traders and a few retail groups. Figure 3 shows a generalised food chain scenario with a reduced level of complexity. This scenario covers 90% of the supply scenarios for fresh food products. The general workflow involving the capturing of events, generation of linked pedigrees and exchange of pedigrees related to the sale of tomatoes between stakeholders such as the farmer (Franz), the trader (Joe), the distribution centre (FreshFoods Inc.) and the supermarket (Orchard) is outlined below. We assume that all supply chain trading partners have a supporting EPCIS implementation infrastructure installed that can be exploited for capturing and querying event and pedigree datasets.
Franz farmer specialises in growing tomatoes. The tomatoes are packaged in punnets, each of which is tagged with an RFID label. They are shipped to downstream partners in cardboard boxes, each of which is tagged with an RFID tag. Joe trader bundles tomatoes procured from multiple farmers into larger product batches before dispatching them to distribution centres. FreshFoods Inc. sources tomatoes from multiple traders and splits large product batches into smaller batches for distribution to retail supermarkets. Orchard is a supermarket that receives fresh produce from distribution centres such as FreshFoods Inc.
Joe trader requests pedigree information on an identified tomato batch that has been delivered to him by Franz farmer. The request is made by RESTfully invoking Franz farmer’s agent which is part of the FMS (farm management system) installed at Franz farmer’s end [7]. Joe trader receives an authenticated and certified message containing the pedigree URI, where the pedigree has been generated using the OntoPedigree design pattern. Joe trader’s agent dereferences the URI and receives the pedigree dataset. Object property value resources in the pedigrees are asserted using EPCIS event data URIs.
Joe trader combines the tomato produce received from Franz with that received from other farmers (e.g., Bob) into shipments, which are then forwarded to FreshFoods Inc. On receiving a pedigree request from FreshFoods Inc., Joe trader’s agent sends the pedigree, which includes URIs to the pedigrees provided by Franz farmer and Bob farmer.
The pedigrees generated and exchanged in the workflow defined above are illustrated in Fig. 4.
The significant advantage of exchanging traceability information using linked pedigrees over conventional mechanisms is that the pedigree received by FreshFoods Inc. from Joe trader includes URIs to the pedigree datasets provided by Franz farmer and Bob farmer, even though they are not FreshFoods Inc.’s one-up or one-down partners. Consuming EPCIS event data curated as linked data to generate and exchange linked pedigrees as outlined above can help derive implicit knowledge that can expose inefficiencies such as shipment delay, inventory shrinkage and out-of-stock situations.
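The traversal that this workflow enables can be sketched as a breadth-first walk over pedigree URIs. In the sketch below a dictionary stands in for the Web, and `dereference` is a placeholder for an authenticated HTTP request; the URIs are hypothetical and the access control assumed in the scenario is not modelled.

```python
from collections import deque

FRANZ = "http://example.org/ped/franz/1"
BOB = "http://example.org/ped/bob/1"
JOE = "http://example.org/ped/joe/1"
FRESHFOODS = "http://example.org/ped/freshfoods/1"

# Stand-in for the Web: pedigree URI -> URIs of the pedigrees it includes.
WEB = {
    FRESHFOODS: [JOE],
    JOE: [FRANZ, BOB],
    FRANZ: [],
    BOB: [],
}


def dereference(uri):
    """Placeholder for retrieving a pedigree and reading its upstream links."""
    return WEB[uri]


def upstream_pedigrees(start_uri):
    """Collect every pedigree reachable upstream, nearest partners first."""
    seen, queue, order = {start_uri}, deque([start_uri]), []
    while queue:
        for parent in dereference(queue.popleft()):
            if parent not in seen:  # guard against revisiting shared suppliers
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


print(upstream_pedigrees(FRESHFOODS))
```

Starting from its own pedigree, FreshFoods Inc. reaches the pedigrees of Franz and Bob through Joe's pedigree, although it never traded with them directly.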
Pharmaceutical supply chains
We outline the scenario of a pharmaceutical supply chain, where trading partners exchange product track and trace data using linked pedigrees. Figure 5 illustrates the flow of data for four of the key partners in the chain.

Fig. 5. Trading partners in a pharmaceutical supply chain and the flow of information.
The Manufacturer commissions,15
Associating the serial number with the physical product.
As the serialised items, cases and pallets move through the various phases of the supply chain at a trading partner’s premises, EPCIS events are generated and recorded at several RFID reader nodes.
When the pallets with the cases are shipped from the manufacturer’s premises to the warehouse, pedigrees encapsulating the minimum set of EPCIS events are published at a URI based on a predefined URI scheme. At the warehouse, when the shipment is received, the URI of the pedigree is dereferenced to retrieve the manufacturer’s pedigree. When the warehouse ships the cases to the distribution center, it incorporates the URI of the manufacturer’s pedigree in its own pedigree definition. As the product moves, pedigrees are generated, with received pedigrees being dereferenced and incorporated, until the product reaches its end-of-life stage.
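The publish-and-incorporate steps described above might look as follows. The URI scheme (`<partner base>/pedigree/<serial>`), the record fields and all identifiers are invented for this sketch, since the predefined scheme itself is not specified here.

```python
def pedigree_uri(base, serial):
    """Mint a pedigree URI under an assumed scheme: <base>/pedigree/<serial>."""
    return f"{base.rstrip('/')}/pedigree/{serial}"


def make_pedigree(base, serial, event_uris, received=()):
    """Build a pedigree record that links to any pedigrees received upstream."""
    return {
        "uri": pedigree_uri(base, serial),
        "events": list(event_uris),
        "received": list(received),
    }


# The manufacturer publishes a pedigree when the pallets are shipped.
manufacturer = make_pedigree(
    "http://manufacturer.example.org", "P-0001",
    ["http://manufacturer.example.org/events/commission/1",
     "http://manufacturer.example.org/events/shipping/1"],
)

# The warehouse incorporates the manufacturer's pedigree URI in its own pedigree.
warehouse = make_pedigree(
    "http://warehouse.example.org/", "W-0042",
    ["http://warehouse.example.org/events/receiving/9",
     "http://warehouse.example.org/events/shipping/3"],
    received=[manufacturer["uri"]],
)

print(warehouse["uri"])
print(warehouse["received"])
```

Each partner only ever stores the URIs of the pedigrees it received, so the full history stays distributed across the partners that created it.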
While supply chain information visibility [6] has received significant attention in recent years, to the best of our knowledge our work is the first attempt at utilising Semantic Web standards and linked data principles for the representation of EPCIS events and at exploiting that representation for provenance-based tracking and tracing.
Closely related to our work is the use of Semantic Web technologies for capturing and managing data across the supply chain as first proposed in [1] although the focus was on the environmental impact of food in the organic food supply chain. The CASSANDRA project16
In [4] the authors present a solution that utilises both RFID and GPS for tracking and tracing international shipments. Although the solution uses EPCIS, the focus there is on the implementation of a system rather than on utilising a traceability artifact such as a pedigree, as done in our approach.
In [8] a data model and algorithm for managing and querying event data have been proposed. The data model is illustrated as an extended entity relationship diagram and is close in spirit to EEM as proposed in this paper. A critical limitation of this model is that it is overlaid on top of relational databases and is not available in a form that can be shared and reused between organisations as linked data.
In [12] the authors propose to use the InterDataNet (IDN)17
The Fosstrack18
Data visibility in supply chains has received considerable attention in recent years. Information systems are now being designed to facilitate the process of making data available in real time to stakeholders in the supply chain.
In this paper, we have proposed a design pattern “OntoPedigree” that provides a minimalistic abstraction for designing domain specific pedigree ontologies. OntoPedigree builds upon EPCIS, a GS1 standard for facilitating event based traceability and PROV-O, a vocabulary for specifying provenance of linked datasets. OntoPedigree can be mapped with other content ontology patterns such as SEP to declare the conformance of supply chain processes to these standards when they generate pedigrees using OntoPedigree.
The pattern design is supported by a robust set of requirements and a rigorous set of competency questions. An OWL axiomatisation of OntoPedigree has been provided. Finally, an implementation framework that facilitates the seamless uptake of the generation of linked pedigrees in supply chains has been presented. It is worth noting that OntoPedigree is domain independent and can be widely applied to most traceability scenarios. We strongly believe that linked pedigrees generated using OntoPedigree can make a significant difference to current visibility approaches in supply chains. We are in the process of applying the pattern to the generation of pedigrees in the wine supply chain, where traceability information is required to be integrated with datasets for the environmental conditions prevailing in the vineyards.
