Abstract
Ontology-driven systems with reasoning capabilities in the legal field are now better understood. Legal concepts are not discrete, but make up a dynamic
Keywords
Introduction
The Semantic Web has been devoted to addressing social issues from the outset. For instance, in the field’s best-known seminal article [11], Tim Berners-Lee, Hendler, and Lassila describe intelligent agents that communicate and interact with each other in order to solve a particular medical scenario. However, in the early years of the Semantic Web, the main effort went into background studies, where Artificial Intelligence (AI) and Knowledge Engineering (KE) were the main actors on the scene. Activities mostly involved the development of new formal languages (e.g., RDF [33], RDFS [18] and OWL [75]1 Even if the first drafts of such specifications were published starting from the end of the 20th century, here we decided to cite only their earliest versions.
Only later, and standing on the shoulders of the aforementioned works, did people start to approach the Semantic Web from different perspectives, such as engineering (i.e., Linked Data [12]), the social sciences (i.e., Social Semantic Web [57]), and the hard sciences (i.e., Web Science [59]). The Semantic Web then started to broaden its reach, moving from academia to other (more “applicative”) domains, such as industry, administration, and, last but not least, law – the very topic of this special issue.
In the following sections we want to spend a few words on examples of applications of Semantic Web technologies to these domains before presenting, in Section 7, five high-quality articles that have been selected for this special issue of the
According to the vision provided in foundational works [35,95,109,111],
Since at least 2010, a number of initiatives have been proposed to promote Semantic Publishing to a broader audience, particularly triplestores (e.g., Open Citation Corpus [92,110] and the Open University Open Linked Data [126]), workshops (e.g., SePublica [44–47] and Linked Science [56,67,68]), special issues of academic journals (e.g., the Semantic Web Journal special issues on Force11:
Probably due to the success the Semantic Publishing movement is having in academia, several publishing companies (e.g., the Nature Publishing Group4 Nature Linked Data Platform: Elsevier Linked Data Repository: Semantic Web Journal Linked Data: http://semantic-web-journal.com/sejp/page/semanticWebJournal. Journal of Universal Computer Science bibliographic database:
On the other hand, the work carried out by Legal Information Institutes (LII)8
Their success is incontrovertible, given their global scope. For example, the Cornell LII was visited last year by about 30 million different individuals from 246 countries and territories.11 Cornell Legal Information Institute: Australasian Legal Information Institute: Graham Greenleaf, personal communication.
LII and FALM have been paying close attention to Semantic Web developments [93] to enable easier access to legal texts and improve their Tom Bruce, personal communication. “On the technical side, we employ Semantic Web technologies in a number of our features and collections. For example, search of some portions of the US Code is enhanced with Linked Data from the DrugBank database, which directs users who search for a brand-name drug to the regulations that deal with the components of that drug (so, for example, a search for ‘panadol’ would give results similar to a search for ‘acetaminophen’). We have, in prototype, a number of features based on entity-linking techniques (so, for instance, mentions of medical conditions in the sections dealing with benefits for military veterans are linked to their corresponding MESH entries, from which the user can navigate to further medical information; in the past, we classified agricultural regulations using AGROVOC, and linked regulations to related scientific papers classified with that ontology). These prototype features come and go as we test them for viability, but will very soon be aggregated into a system of infocards that will show the user a great deal of related, linked data having to do with the things that a particular law regulates or mentions, or that have to do with its creator or enforcer. We’ve done a lot of related work on metadata models for legislation for the Library of Congress, and some of our work on that has been adapted for use in the latest incarnation of the Government Publishing Office’s FD/SYS online publications. We’ve also done a lot with the automated extraction of statutory and regulatory definitions, with particular attention to scoping language, in order to link all defined terms in the statutes and regulations back to the relevant definitions.”
Thus, LIIs set out the principles of legal information and FALM principles – the so-called Hague principles (2008)17 See the following presentations by Graham Greenleaf: Australasian Legal Information Institute: Journal of Open Access to Law:
This constitutes an example of the mixed public-private business models that will proliferate in the new Web of Data scenarios [52], changing top-down and exclusively market-based approaches into more relational and flexible ways of handling regulations, services, and rights [20].
Setting aside certain differences between civil- and common-law countries, we take
For instance, since 2009, the US Government has started20 Open Government Initiative: Data.gov homepage:
On the basis of that experience, the UK government’s project data.gov.uk22 Data.gov.uk homepage:
Similarly, the London Gazette homepage: An example of an HTML+RDFa page in the London Gazette is available at The Gazette Ontology:
A sister project of the previous one, i.e., legislation.gov.uk,26 Legislation.gov.uk homepage: The National Archives homepage: Government of the United Kingdom homepage:
These kinds of initiatives by the aforementioned and other local, regional, and national governments (Catalonia,29 Open public data of the government of Catalonia: Open platform for French public data: Italian public administration open data: Dutch national open data platform: Singapore government data: DCAT-AP is an action conducted by improving semantic interoperability in the European eGovernment systems of the European Commission’s Interoperability Solutions for European Public Administrations (ISA) programme http://eurovoc.europa.eu/drupal/sites/all/files/eurovoc-consolidated.owl.
At the European level, EUR-Lex and the Publications Office have also made great strides during the last three years.37 EUR-Lex Access to European Union Law: Fulgencio Sanmartín, personal communication.
The Re-use of Public Sector Information Directive,40 Directive 2013/37/EU of the European Parliament and of the Council of 26 June 2013 amending Directive 2003/98/EC on the re-use of public sector information: Legal Aspects of Public Sector Information Thematic Network Outputs (LAPSI): https://ec.europa.eu/digital-agenda/en/news/legal-aspects-public-sector-information-lapsi-thematic-network-outputs. https://www.gov.uk/government/publications/open-data-charter.
For this reason the Open Government Data movement is no longer limited to government organizations but extends its reach to other public bodies, especially to parliamentary bodies (e.g., the Italian Parliament45 Italian Senate (
The rise of the Web of Data, the increase of linked
Early accounts of Law and the Semantic Web stressed the work done on ontologies as early as the 1990s and in the first stages of the Web [10]. Several national and EU Esprit, FP5, FP6, and FP7 projects funded the construction or refinement of upper (foundational), core, and domain and linguistic ontologies [24].53 (
The methodological stages of knowledge representation, the iterative lifecycle of legal ontology building, and lessons learned have been carefully explained and discussed in several monographs [14,26,61,89]. There are five dimensions of law that have been addressed through computational modelling: the structure of legal documents, norms and normative systems, concepts and legal conceptions, cases and precedents, argumentation and legal reasoning [107]. They were initially related to the development of domain-independent ontologies for knowledge-sharing and reuse, mainly as knowledge-interchange formats for knowledge-based systems. Early legal ontologies were formalised using ONTOLINGUA, LOOM, and DAML+OIL, with an increasing preference for W3C standards, Semantic Web languages (OWL DL), editors such as Protégé, and reasoners such as Pellet [17].
Scalability, reusability, and end user-centred approaches were taken into account to model specific legal domains, mainly e-commerce, e-administration, e-governance, criminal law, consumer law, mediation, drafting, and contracting. The increasing weight of Semantic Web languages has helped to shape a more pragmatic approach, envisaging Web services, community participation, and a growing involvement of legal experts, especially in knowledge-acquisition, validation, and implementation processes.
Thus, the way of modelling legal knowledge can be quite flexible, depending on the epistemic assumptions in the knowledge acquisition process, the selected requirements, and the final purposes of the tool, application, or system at stake. The next step is to lean on lessons learned over the past fifteen years.
In early 2005, a project funded by the United Nations, Akoma Ntoso, was launched to enable citizens to exercise their right to access African parliamentary proceedings and deliberations, while supporting Parliaments in managing legislative documentation [8,82,87,125].55 Architecture for Knowledge-Oriented Management of African Normative Texts using Open Standards and Ontologies, a.k.a., Akoma Ntoso,
Almost in parallel, MetaLex emerged in 2002 from several past EU projects and was proposed as a legal XML standard. In 2006, it evolved into CEN-MetaLex as a general format for the exchange and interoperability of legal documents [14]. It adopts and adapts the concepts introduced in the FRBR specification [Functional Requirements for Bibliographic Records of the International Federation of Library Associations]. This work had a follow-up in the EU Project ESTRELLA,56 Standardized Transparent Representations in order to Extend Legal Accessibility, ESTRELLA (2006–2008):
A quick look at the 2015 Legal Technology Survey Report of the ABA Legal Technology Resource Center57
“Imagine a lawyer stranded at the airport. Her flight home to visit family has been delayed, as so often happens. She’s debating how to fill her time when her Apple Watch issues a gentle tap on her wrist. It’s an alert about an email from an important client – a small business she has represented in a wide range of matters.
“With time to spare, she pulls out her iPad and jumps online using her own data plan – not the suspicious public WiFi. She quickly reviews the email, which details the client’s issues with a recently hired employee. Seconds later she has an employment contract she drafted open on the iPad, accessed from her secure cloud-based document management system.
“She uses a simple annotation app on the iPad to highlight key language in the contract and add explanatory comments. She then exports the annotated contract to her email app, types a brief memo explaining her thoughts on the matter, and sends it off to the client (with a CC: to her practice management system so the email is filed and flagged for later time entry).
“Throughout this brief interlude, the lawyer has never left her seat in the airport terminal, and she’s used nothing more than her iPad and her Apple Watch. Her client need never know. From his perspective, she may as well have been behind a vast mahogany desk aided by a small army of legal assistants. The quality of service is the same.”
This is not an uncommon situation. The picture could be completed by adding some utilities: semantic contracts, automated structuring of the content of contracts, personal access to large databases, smart tools for reorganising the relevant legal knowledge. This is not so far off now. Shared information from heterogeneous systems, personalisation, context awareness, and interaction are features of the Semantic Web. Consumer-generated data is growing at an unprecedented pace – a 2,000% increase in global data is expected by 2020 [55] – and companies, lawyers, and law firms are quite aware of this.
New emerging issues are also important in the development of markets and the evolving entwinement of law and the Web of Data. Trust, transparency, metadata markets, licensing, statistical and big-data governance, privacy, data protection, security, and intangible legal goods such as intellectual property and patents are fostering new modes of cooperation between legal experts and Semantic Web developers.
Ontology-driven systems with reasoning capabilities in the legal field are now better understood. Legal concepts are not discrete but make up a dynamic
This has given a new boost to Digital Rights Management (DRM)58
Consider, too, that consensus and disagreement are equally present on the Web and in actual law-making processes. Recent path analyses of the evolution of intellectual property rights show that international patent systems were still under construction moving into 2000 [73]. This situation changed dramatically in the 21st century, owing in part to a new, extremely competitive framework, nationally and internationally alike, and to the economic success of the Web. The launch of Creative Commons in 2001,60
An important standardisation effort is under way at the intersection of Web languages, the Semantic Web, and legal knowledge. It is true that most behaviours on the Web seem to be implicit. Best practices and norms are not often made explicit [100]. However, even recognizing that standards and technologies make sense only to the extent that they are deployed, and that many
Alternatives – common practices in the Web as against formal standards – are not necessarily antagonistic. Industry demands that common languages, formats, and specifications that are already in use receive the imprimatur of standardisation bodies to encourage fluent exchanges of data and digital goods. This demand is slowly shifting from XML formats to RDF models and Semantic Web technologies. One example of this is what has taken place within the MPEG-21 ecosystem, which initially had space for specifying XML schemata but more recently turned to OWL ontologies, as illustrated in this Special Issue [106].
Much of this shift can be attributed to trends in the technology world, but some can be grounded in the fact that the overall focus when dealing with legal data is shifting from a need to publish documents (on paper and online) to an effort to produce complex applications providing some kind of legal reasoning. This is certainly risky, because imposing a textual or positivistic approach on all countries, cultures, and political bodies of different natures should be avoided. The challenge is to balance the use of technical standards with the emerging patterns of social and political behaviour.
Standards for legal documents are reacting accordingly. While the first and second generations of standards for legal and legislative documents were mostly national, and mostly aimed at generating printed and online representation of legislative texts, current standards are looking to reach agreements on a more general framework.
Given that text documents are still very much the core material produced by legal professionals, and that references to text documents will remain the basis for grounded and verifiable legal reasoning regardless of the actors and technologies employed, current generation standards in the legal domain are providing a layered organization of their offerings: presentation-oriented XML is being replaced with structured XML with ample room for metadata and annotations; naming mechanisms based on URIs and IRIs provide linkable anchors both to entire documents and to smaller fragments; and document-oriented ontologies provide the necessary glue between abstract legal reasoning and the textual pieces of supporting evidence.
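The layering described above can be illustrated with a minimal sketch. It assumes a made-up IRI scheme under `example.org` and a hypothetical `Annotation` record; neither comes from any of the standards discussed here. The point is only that a fragment IRI identifies a piece of text, that the enclosing document can be recovered from it, and that ontology-level statements can be anchored to either.

```python
from dataclasses import dataclass
from urllib.parse import urldefrag

# Hypothetical IRI of a legislative document (illustration only;
# real naming conventions differ from this scheme).
BASE = "http://example.org/legal/act/2015/12"


@dataclass(frozen=True)
class Annotation:
    """A statement from a (hypothetical) ontology, anchored to a text span."""
    target: str   # IRI of a whole document or of a fragment within it
    concept: str  # label of the legal concept asserted about the target


def document_of(iri: str) -> str:
    """Return the IRI of the whole document that a fragment IRI belongs to."""
    doc, _fragment = urldefrag(iri)
    return doc


def annotations_for(doc_iri, annotations):
    """Collect annotations anchored in the document or in any of its fragments."""
    return [a for a in annotations if document_of(a.target) == doc_iri]


annotations = [
    Annotation(BASE + "#art_3__para_2", "Obligation"),
    Annotation(BASE + "#art_7", "Definition"),
    Annotation("http://example.org/legal/act/2014/5#art_1", "Permission"),
]

# Concepts asserted anywhere inside the 2015 act, whatever the fragment.
concepts = sorted(a.concept for a in annotations_for(BASE, annotations))
print(concepts)
```

Keeping the anchor as an IRI rather than as copied text is what lets the reasoning layer stay verifiable against the source document.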
Within OASIS, for instance, three Technical Committees are actively working to produce a flexible, comprehensive, and wide-reaching platform for the digital representation of legal documents and their content: the LegalDocML TC64
The LegalCiteM TC65 http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=URISERV%3Ajl0068.
Finally, LegalRuleML TC70
But this is all still under development, and must be carefully validated and tested through an on-going implementation process in different scenarios. Policies, social regulations, ethics, and law are deeply intertwined. Thus, Semantic Web technologies can also be applied to specific service-oriented approaches, focusing on social problems such as the administrative implementation of immigration laws and policies in a specific jurisdiction. In this way, methodology, ontology-building, and epistemology can be kept in separate clusters, and different dimensions can be combined ad hoc to tailor specific solutions. Legal isomorphism and the so-called scoping problem [120] – the extraction of implicit meaning from general regulations with concrete aims and targets in a given context – can be tackled in an ordered and relational manner, making it possible to create scenarios with lay and expert participation alike [112,113].
So, too, new general frameworks can be added to an emerging contemporary legal landscape for the Web of Data. Customary international private law cannot be easily modelled without taking all stakeholders into account. The general balance between privacy, data protection, and security [58] seems to broaden the legal normative scope for regulating, among other elements, linked data markets [116], co-regulatory instruments [98], self-regulated collective awareness and informed consent [88], the behaviour of LEAs (law enforcement agents)71
These new trends are still to be explored further, but they certainly suggest the promise of hybrid models of regulation and synergy
This special issue on the use of Semantic Web technologies to address Legal Domain issues and scenarios brings together five high-quality contributions, out of eight submissions we originally received.
In “An OWL Ontology Library Representing Judicial Interpretations” [30], Marcello Ceci and Aldo Gangemi introduce an OWL 2 DL ontology library making it possible to describe the interpretations a judge makes of the law while engaging in the legal reasoning on which basis a case is adjudicated. This ontology library is based on a theoretical model and on some specific patterns that exploit some new features introduced by OWL 2, and it provides meaningful legal semantics, while retaining a strong connection to source documents (i.e., fragments of legal texts).
In “Semantic Model for Legal Resources: Annotation and Reasoning over Normative Provisions” [39], Enrico Francesconi presents an OWL 2 DL ontology for describing normative provisions (in terms of Hohfeldian fundamental legal relations) and related axioms in order to enable advanced access to legislative documents. The discussion is supported by examples of semantic annotations of legal textual resources using RDF/OWL standards and of SPARQL queries for accessing and reasoning over provisions. The work is framed within CELLAR.
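To give a flavour of what reasoning over provisions annotated with Hohfeldian relations can mean, here is a minimal sketch in plain Python. It is not the ontology of [39] and uses neither OWL nor SPARQL: the tuple layout, the provision identifiers, and the party names are all hypothetical. It relies only on Hohfeld's correlative pairs (right/duty, privilege/no-right, power/liability, immunity/disability), so that a provision annotated as a duty of one party can also be retrieved as a right of the other.

```python
# Hohfeld's correlative pairs: if X holds the first relation towards Y,
# then Y holds the second relation towards X.
CORRELATIVE = {
    "Right": "Duty",
    "Privilege": "NoRight",
    "Power": "Liability",
    "Immunity": "Disability",
}
# Make the lookup symmetric (Duty -> Right, and so on).
CORRELATIVE.update({v: k for k, v in list(CORRELATIVE.items())})

# Hypothetical annotations: (provision id, relation, bearer, counterparty).
provisions = [
    ("art5", "Duty", "Supplier", "Consumer"),
    ("art9", "Power", "Authority", "Supplier"),
]


def with_correlatives(annotated):
    """Add, for every annotated relation, its correlative seen from the other party."""
    inferred = set(annotated)
    for pid, relation, bearer, counterparty in annotated:
        inferred.add((pid, CORRELATIVE[relation], counterparty, bearer))
    return inferred


def query(facts, relation=None, bearer=None):
    """Return facts matching the given relation and/or bearer, in sorted order."""
    return sorted(
        f for f in facts
        if (relation is None or f[1] == relation)
        and (bearer is None or f[2] == bearer)
    )


facts = with_correlatives(provisions)

# No provision was annotated as a right, yet the consumer's right is found.
rights_of_consumer = query(facts, relation="Right", bearer="Consumer")
print(rights_of_consumer)
```

In an OWL setting the same effect would typically be obtained declaratively, through axioms evaluated by a reasoner, rather than by a hand-written loop as in this sketch.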
In “LOTED2: An Ontology of European Public Procurement Notices” [36] Isabella Distinto, Mathieu d’Aquin, and Enrico Motta describe the construction of the LOTED2 ontology for the representation of European public procurement notices. LOTED2 is a legal ontology that supports the identification of legal concepts and, more generally, of legal reasoning. In particular, it seeks to strike a compromise between the accurate representation of legal concepts and the usability of the ontology as a knowledge model for Semantic Web applications, while creating connections to other relevant ontologies in the domain.
In “PPROC, an Ontology for Transparency in Public Procurement” [77], Jose Félix Muñoz-Soro, Guillermo Esteban, Oscar Corcho, and Francisco Serón introduce the PPROC Ontology – an ontology that enables the description of procurement processes and contracts. The authors focus in particular on making the ontology appropriate for describing the standard data relating to the tender (i.e., objectives, deadlines, awardees) and the details of the whole process of publishing and performing contracts.
In “Overview of the MPEG-21 Media Contract Ontology” [106] Víctor Rodríguez-Doncel, Jaime Delgado, Sílvia Llorente, Eva Rodríguez, and Laurent Boch present the MPEG-21 Media Contract Ontology (MCO), an ontology enabling the description of contracts dealing with rights to multimedia assets and, more generally, with any content protected by intellectual property. The ontology is composed of a core model (describing permissions, obligations, and prohibitions in contracts) and a specific vocabulary representing the common rights and constraints in the audio-visual context. The paper also includes a description of the design principles and the methodology followed in developing the ontology, as well as several examples of it in RDF and a description of related tools.
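The core model mentioned above, with its permissions, obligations, and prohibitions, can be pictured with a small sketch. This is an illustration under stated assumptions, not the MCO vocabulary: the `Contract` class, the party and action names, and the closed-world rule (an action is allowed only if explicitly permitted and not prohibited) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """Toy contract holding three kinds of deontic clauses as (party, action) pairs."""
    permissions: set = field(default_factory=set)
    obligations: set = field(default_factory=set)
    prohibitions: set = field(default_factory=set)

    def may(self, party, action):
        """Assumed policy: allowed only if explicitly permitted and not prohibited."""
        key = (party, action)
        return key in self.permissions and key not in self.prohibitions

    def pending(self, party, done):
        """Obligations of a party that are not yet in the set of completed actions."""
        return sorted(a for (p, a) in self.obligations if p == party and a not in done)


# Hypothetical licensing deal between a rights holder and a broadcaster.
deal = Contract(
    permissions={("Broadcaster", "broadcast"), ("Broadcaster", "sublicense")},
    obligations={("Broadcaster", "pay_fee"), ("Broadcaster", "credit_author")},
    prohibitions={("Broadcaster", "sublicense")},
)

print(deal.may("Broadcaster", "broadcast"))          # permitted, not prohibited
print(deal.may("Broadcaster", "sublicense"))         # prohibition overrides permission
print(deal.pending("Broadcaster", done={"pay_fee"}))
```

Letting a prohibition override a permission is one possible conflict rule among several; a real contract language has to state which rule applies.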
Footnotes
Acknowledgements
We would like to thank all the authors of accepted and rejected articles, as well as the editors-in-chief of
We also warmly thank Graham Greenleaf, Tom Bruce, Fulgencio Sanmartín, and Enrico Francesconi for providing us with useful, up-to-date information about AustLII, Cornell’s LII, and CELLAR.
A special mention needs to go to Rinke Hoekstra: not only has he given us continuous support and encouragement, but he has also managed to review three of the articles.
The edition of this special issue on the
