Abstract
Recent progress in cloud computing has encouraged healthcare organizations to outsource the management of electronic health records to cloud service providers using a hybrid cloud. A hybrid cloud is an infrastructure consisting of a private cloud (managed by the organization) and a public cloud (managed by the cloud service provider). The use of a hybrid cloud enables electronic health records to be exchanged between medical institutions and supports multipurpose usage of electronic health records. Along with these benefits, cloud-based electronic health records also raise security and privacy problems, specifically in terms of electronic health records access. A comprehensive and exploratory analysis of privacy-preserving solutions revealed that most current systems do not support fine-grained access control or consider additional factors such as privacy preservation and relationship semantics. In this article, we investigate the need for a privacy-aware fine-grained access control model for the hybrid cloud. We propose a privacy-aware relationship semantics–based XACML access control model that performs hybrid relationship and attribute-based access control using the extensible access control markup language. The proposed approach supports fine-grained relation-based access control with a state-of-the-art privacy mechanism named Anatomy for enhanced multipurpose electronic health records usage. The proposed model provides and maintains an efficient privacy versus utility trade-off. We formally verify the proposed model and implement it to check its effectiveness in terms of privacy-aware electronic health records access and multipurpose utilization.
Experimental results show that, in the proposed model, access policies based on relationships and electronic health records anonymization perform well in terms of access policy response time and storage space.
Introduction
Recent developments in information technology have had a powerful and positive impact on the field of medical information. Electronic health records (EHRs) are defined as, “the EHRs means a repository of patient data in digital form, stored and exchanged securely, and accessible by multiple authorized users.” 1 EHRs are increasingly adopted to collect and store various types of patient data, including patients’ personal details, medical treatments, and laboratory test results. EHRs are generated and maintained 2 in digital format within a healthcare organization (HCO) or community. EHRs are mainly used by different health professionals and administration staff. Users of the different components of EHRs include physicians, nurses, radiologists, pharmacists, and laboratory staff, as well as patients and their dependents. Regulations such as the Health Insurance Portability and Accountability Act (HIPAA) oblige EHRs to provide interoperability to promote information sharing between healthcare institutions and organizations. 3
Traditional EHR systems work in a centralized database environment where medical information is stored and managed by the hospital itself. This approach is expensive in terms of initial system development and maintenance, and the resulting medical information also becomes incompatible with other healthcare systems. 4 Given these inherent issues of traditional EHR systems, health organizations are compelled to use the services of cloud service providers (CSPs) to manage EHRs on their behalf, 5 which has advantages in terms of organizational cost and system scalability compared with traditional systems. Among cloud deployment models, the hybrid cloud is mostly preferred by HCOs to host their EHR data. Along with the benefits, it creates serious security and privacy issues in terms of EHR access. 2
Security and privacy issues are inevitable when outsourcing personal EHR data to the cloud because of its sensitive nature and the “legal and social” repercussions of personal information disclosure. In cloud-based EHRs, on one hand, sharing patients’ information is necessary and beneficial, but on the other hand, it must be performed carefully enough that patients’ privacy is preserved. Privacy in cloud-based EHRs is essential as users and data are transparent in the public cloud. Moreover, it is also necessary to give EHR access to improve quality of service and EHR data utilization.
Privacy preservation of cloud-based EHR data can be achieved in a straightforward way, namely, by encrypting the EHR data before transmitting it to the cloud.2,6,7 Nonetheless, encrypted data processing is not efficient and is limited to certain operations, thereby making it unsuitable for EHR data with multipurpose usage. 8 Most cryptographic approaches are computationally expensive and require complex key management and public key infrastructure (PKI), thereby making them less efficient for data outsourced to the cloud.9,10 Attribute-based encryption (ABE) is a prominent scheme that provides a solution to most of the above-mentioned problems. 11 ABE is basically a cryptographic access control scheme. Cryptographic access control schemes use cryptography and attribute-based access mechanisms to preserve the privacy of EHRs. However, ABE is computationally expensive, and there are also access control policy management issues.12,13 Even its variants, such as ciphertext-policy attribute-based encryption (CP-ABE) and key policy attribute-based encryption (KP-ABE), are not sufficient to provide a refined access control mechanism, enhanced data utility, and privacy protection in cloud-based EHRs.
Therefore, an immediate alternative to cryptographic and cryptographic access control–based approaches is the set of privacy-aware anonymity-based techniques. These privacy techniques, for example, generalization, suppression, and Anatomy, are used for protecting persons’ private sensitive data when it is publicly released.6,14–18 Intel conducted a proof of concept to show that anonymization techniques like generalization and suppression (used in k-anonymity and l-diversity) can be used in cloud computing to achieve anonymity. 19 Some partitioning-based techniques8,20 and differential privacy 21 (to name a few) have also been applied to outsourced healthcare data. In Zhang et al., 22 a modified MapReduce system is proposed for outsourcing anonymized data to the public cloud while the sensitive data is stored in the private cloud. A “privacy-aware data retrieval system” in the hybrid cloud is also proposed in Zhou et al. 23
As privacy-aware anonymity-based techniques alone are not sufficient to preserve privacy, there must be an access control mechanism that provides fine-grained access control to patient EHRs. Access control is very important for protecting cloud-based EHRs from unauthorized access. However, most recent access control systems for healthcare services are not flexible because they use role-based access control (RBAC) schemes. 24 Moreover, RBAC 25 also fails when the number of potential users is very high and most of the users are transparent beforehand. To provide a fine-grained access control mechanism for outsourced EHRs, we cannot use an access control model like RBAC directly in cloud computing because it lacks scalability and flexibility in attribute management. Diverse access control policies and access control interfaces can also hinder interoperability. However, eXtensible Access Control Markup Language (XACML)-ABAC can provide a better solution to most of these access control issues in the cloud.26–31 There is also the issue of protecting the privacy of the access policies themselves. Access control policies in plain form for cloud-based EHRs create a source of collusion between CSPs and data users; therefore, a mechanism to hide access control policies is also necessary to provide a more robust solution. 32 Access policy anonymization can protect EHR data from being used for malicious activities in the healthcare domain. EHRs, when outsourced to the cloud, are vulnerable to more sophisticated attacks. For instance, data that are outsourced to the cloud for multiple users can come under a collusion attack, that is, the CSP and data users may collude with each other for various incentives. In these scenarios, the whole data set stored in the cloud, along with the privacy mechanism, can be exposed. 8
Attribute-based access control (ABAC) 33 is the most recent access control mechanism used in privacy-preserving solutions for cloud-based EHRs.34,35 However, in the majority of solutions, most attention is given to providing privacy (using cryptographic and hybrid access control techniques), and only limited fine-grained access control is provided. We have noticed that multipurpose EHR usage and the relationship-based access control (Rel BAC) aspect of proposed privacy-preserving solutions need proper and timely attention. In the Rel BAC model, access permissions are modeled as relations between users (subjects) and data (objects), while access control rules are the instantiations of relations between specific sets of users and objects. The Rel BAC model is represented as an entity relationship (ER) model, while permissions are defined as relations between classes of subjects and objects. 36 Moreover, as XACML lacks semantic interoperability, the use of semantic-based access control in XACML can simplify policy specification by incorporating semantic inference in the access control process.37,38
To provide fine-grained relationship-based EHR access with privacy preservation of EHR data, it is crucial to have an efficient privacy-aware Rel BAC solution. For this purpose, we extend the open and widely accepted XACML standard in the context of relationship semantic access control and privacy preservation. Our research is mainly about the use of Rel BAC with the privacy-preserving technique Anatomy for enhanced utility. The main purpose of this work is to propose a privacy-preserving access control model (PPX-AC) that provides a privacy-aware fine-grained access control solution that is interoperable and scalable, using extended XACML-Rel BAC in the hybrid cloud. The proposed privacy model will provide maximum utilization of patient EHRs to users of different domains: original data users (ODU), private data users (PRDU), and public data users (PBDU). EHR authorization is given based on the permissions of each specific user domain in the access control policies. The proposed model will provide defense against internal privacy threats through the privacy technique Anatomy. Policy anonymization is also used to prevent privacy disclosure and possible collusion attacks in the public cloud. The main contributions of our work are given below:
A research gap in related work is identified: existing privacy-preserving solutions (using anonymization techniques and privacy models) and relationship semantics–based access control solutions for cloud-based EHRs are not used together to achieve privacy, relationship semantic access control, and EHR data utility.
A privacy-aware relationship semantics–based XACML access control model (PRSX-AC) for EHRs in hybrid cloud is proposed, and its main features are as follows:
Provide fine-grained access control for cloud-based EHRs;
Provide relationship semantics–based access control with XACML that will be semantically interoperable in hybrid cloud;
Use the privacy technique Anatomy for EHR anonymization, as it provides high-quality data utilization;
Provide relationship-based EHR access and improve information sharing in terms of primary and secondary use of EHRs (medical usage, personal usage, institutional research, data analysis, and information sharing) in the hybrid cloud.
The PRSX-AC model is formally verified, and a prototype that compiles XACML policies is implemented to verify its effectiveness.
In section “Related work,” the related work on cloud-based EHR privacy preservation is given. Section “PRSX-AC” describes the proposed PRSX-AC model, including its main design goals, refined conceptual-level details, and a technical description of its components. In section “Formal specification, modeling, and verification of PRSX-AC model,” we formally verify the properties of the PRSX-AC model. Experimental evaluation is given in section “Experimental results and discussion.” Finally, section “Conclusion” concludes the whole work.
Related work
There are many approaches that are used to solve security- and privacy-related issues of EHR access in the cloud. In this section, a brief review of the relevant work on privacy preservation techniques of cloud-based EHRs is given. An overview of these privacy-preserving techniques along with the related work in different cloud deployment models in EHRs will be described.
EHR privacy-preserving techniques and cloud deployment models
This section provides a comprehensive overview and analysis of privacy-preserving techniques used in cloud-based EHRs. We have categorized the privacy-preserving techniques for cloud-based EHRs into cryptographic techniques, cryptographic hybrid access control techniques, and privacy-aware anonymity-based techniques.
Cryptographic techniques
In these techniques, various cryptographic mechanisms are used for privacy preservation. Some cryptographic techniques are described here, as this helps in understanding the analysis of privacy-preserving approaches. Symmetric key encryption (SKE) uses the same key for encryption and decryption to secure the data. The current standard SKE algorithm is the Advanced Encryption Standard (AES), recommended by NIST. In the public key encryption (PKE) technique, a pair of private and public keys is used instead of a single key as in SKE. Although encryption through PKE is secure, it is computationally expensive and not efficient, and is thus mainly used in combination with SKE. ABE is another technique that is based on PKE. In ABE, encryption and decryption are performed on the user’s attributes. ABE allows users to share specific attribute-based encrypted data and provides fine-grained access.24,39,40 There are two variants of ABE: CP-ABE and KP-ABE. In ABE, encryption is performed based on an access policy. In CP-ABE, the user’s private key is the set of attributes. CP-ABE usage is limited as it involves the specification of access control policies. Management of users’ attributes is another issue in CP-ABE. 41 In KP-ABE, the access policy is associated with the private key, and the encrypted text is associated with a set of descriptive user attributes. Decryption is only possible if the access policy and the user attributes match. There are many variations of cryptographic techniques, such as multi-authority attribute-based encryption (MA-ABE), searchable encryption, and fully homomorphic encryption (FHE).42–44
Cryptographic hybrid access control techniques
Cryptographic hybrid access control–based approaches combine the above-mentioned cryptographic techniques with access control mechanisms such as RBAC, ABAC, ABE, CP-ABE, and KP-ABE, to name a few. In some hybrid approaches, pseudo-anonymity and statistical data partitioning techniques are combined to obtain their maximum benefit for the privacy preservation of cloud-based EHRs. Hybrid techniques thus represent combinations of different complex cryptographic, access control, and data partitioning techniques.2,9,24,34,35,45–52,53–59
Privacy-aware anonymity-based techniques
Privacy-preserving techniques have different sanitization mechanisms to transform data into anonymized form. Privacy-aware anonymity-based techniques, such as generalization, suppression, Anatomy, Angel, and differential privacy, are used to transform microdata into anonymized form. These privacy techniques also try to achieve a balance between privacy and data utility. Privacy-preserving techniques are used to prevent disclosure of identity and sensitive attribute data when it is publicly released.12,6,14–18,60 We have tried to give a precise description of the privacy-preserving techniques used for EHRs. The above-mentioned privacy-preserving techniques have been used in different cloud deployment models: public, private, and hybrid. Next, we describe each cloud deployment model and the various privacy techniques used to achieve privacy of EHR data.
Public cloud
This cloud deployment model is available to public users, and it is monitored by the CSP. There can be different EHR recipient entities, such as HCOs, healthcare professionals, and insurance and pharmaceutical companies. In the public cloud, EHRs are stored on off-premise servers and managed by the CSPs.12,23 Public access to data stored in the cloud makes the public cloud more vulnerable. There is always a high risk that malicious activities can be performed on EHR data by internal as well as external entities. According to the security and privacy risks given in Pino and Di Salvo, 61 denial-of-service, man-in-the-middle, eavesdropping, IP-spoofing-based flooding, and masquerading are possible attacks. Consequently, there is a strong need for privacy mechanisms to ensure the confidentiality of EHRs. Cryptographic techniques and efficient signature verification schemes are already used, but limited work exists on EHR privacy preservation through anonymity-based techniques. Most EHR privacy preservation work for the public cloud uses cryptographic techniques7,32,36,48,49,62–65 and cryptographic access control hybrid techniques.2,9,46,49–51,53,54,56
Private cloud
A private cloud is managed by the HCO or a third party, and it may exist on or off the premises of the health organization. 12 EHRs stored in the private cloud are considered more secure than those in the public and hybrid cloud deployment models, because EHRs in a private cloud are accessed only by the trusted authority of the HCO. Some work on cryptographic hybrid access control techniques in the private cloud is given in previous works.2,55
Hybrid cloud
Public and private cloud deployment models are combined in the hybrid cloud, which is more significant in healthcare scenarios. Healthcare providers that do not have enough infrastructure resources can store healthcare data in the hybrid cloud. 12 The hybrid cloud ensures an efficient and robust solution for future healthcare applications. It takes maximum advantage of cloud computing and overcomes the drawbacks of private and public clouds.12,66,67 Security and privacy preservation of EHRs are major issues in hybrid clouds, so novel solutions are needed in this context. Privacy-aware anonymity-based techniques have been applied in the hybrid cloud8,38,68 and the public cloud. 10 Table 1 presents a comprehensive overview and analysis of privacy-preserving techniques in cloud-based EHRs.
Table 1. Privacy techniques.
DP: data privacy; MU: multipurpose utility; AC: access control; SB: semantic-based; RB: relation-based; EHR: electronic health record; ABE: attribute-based encryption; RBAC: role-based access control; ABAC: attribute-based access control; PKI: public key infrastructure; CP-ABE: ciphertext-policy attribute-based encryption; CSP: cloud service provider; EMR: electronic medical record; PHR: personal health record; ECC: elliptical curve cryptography; MSK: multi authority symmetric key.
Symbols used for security and privacy metrics: √: Satisfied; ×: Not satisfied; ■: Limited.
Discussion
We have used the evaluation metrics relationship-based (RB), data privacy (DP), multipurpose utility (MU), access control (AC), and semantic-based (SB) to evaluate the privacy approaches given in Table 1. We have selected recent studies related to privacy preservation of cloud-based EHRs for comparison. It is clear from Table 1 that solutions using cryptographic techniques provide only data privacy, and all the remaining metrics are not satisfied. Hybrid cryptographic access control approaches are used in the majority of the work and show effectiveness in terms of data privacy, access control, and limited multipurpose utility. These cryptographic hybrid access control approaches fail to provide relationship- and semantic-based features in cloud-based EHRs. Cryptographic hybrid access control solutions use cryptographic and access control mechanisms like RBAC, 25 ABAC, 33 ABE, CP-ABE, and KP-ABE. However, these techniques have their limitations. RBAC has a scalability issue in the cloud as the number of users and resources increases. In KP-ABE, the data owner is not the authority who decides on the access control structure; instead, it is the key distribution center. 9 In CP-ABE, although the data owner has full control over the access policy, it is a complicated technical solution that is surely not affordable in all cloud-based EHRs. There are also some privacy-aware anonymity-based solutions8,10,21,22 that show some potential toward providing a less complicated alternative. However, a privacy-aware anonymity-based solution alone fails to achieve any evaluation metric other than data privacy. Data partitioning techniques like MapReduce are also used, but their emphasis is on data partitioning based on computations, not on providing a privacy-aware defensive solution.
There is another direction of semantic-based approaches. These approaches provide semantic access control only; data privacy, relationship, and multipurpose usage are not the focus of their solutions. Overall, privacy-preserving solutions for cloud-based EHRs lack relationship-based access control with semantic meanings and interoperability. Solutions should also provide data privacy at less computational cost and should support an optimal balance between privacy and EHRs’ multipurpose utilization. The proposed model differs from existing approaches mainly in terms of the evaluation metrics mentioned above. In the proposed solution, we have extended the XACML authorization architecture that is based on the attribute-based access control model (XACML-ABAC). We have combined relationship-based access control with semantic reasoning and a privacy-preserving technique (Anatomy). In addition to addressing privacy threats, the solution also provides collusion prevention in the public cloud.
PRSX-AC
In this section, the design goals of the proposed PRSX-AC hybrid cloud model are described first. The following sections describe the phases of the PRSX-AC model with their detailed logical flow. The algorithms of the proposed PRSX-AC model are described in detail in the last section.
PRSX-AC model: design goals
Access control: in the proposed model, XACML-ABAC provides fine-grained access control; it is a logical fit for achieving authorization and a flexible policy creation environment in the hybrid cloud. When EHR data are outsourced to the public cloud, a fine-grained access control mechanism is needed to avoid unauthorized EHR access.
Relationship with semantics: the proposed PRSX-AC model will provide a novel feature of relationship-based EHR access with semantic reasoning. Relationship-based access and semantics will refine EHR multipurpose usage in the hybrid cloud.
EHR data privacy: EHR data will be anonymized with the privacy-preserving technique Anatomy, chosen for its simple and effective mechanism in the EHR access and outsourcing scenario. An access policy request will contain the requested attributes, and the response attributes will be given in their original form. Our solution will provide defense against external threats (through policy anonymization) and internal threats, as authorized users in the public cloud will also get an anonymized version of the EHR data, not the original data. The privacy technique Anatomy protects EHRs’ sensitive information against disclosure and provides maximum EHR utility in cloud-based EHRs.
EHR multipurpose usage: most cryptographic access control solutions are so expensive and complicated that entities in the healthcare domain cannot afford to share EHRs on the public cloud. EHR data owners are also reluctant to share data on the public cloud due to external threats. The proposed PRSX-AC model will provide efficient and improved EHR usage in terms of primary and secondary use with additional relationship-aware access.
PRSX-AC model: description and phase details
A relationship-aware privacy-based access model (PRSX-AC) for EHRs with a fine-grained access control mechanism is described in detail in this section. In our proposed privacy model (PRSX-AC), we assume that the EHR data user is authentic and that, due to our two levels of privacy preservation, the integrity of EHRs is not compromised. The proposed privacy model is an extension of standard XACML-ABAC33,73 with a semantic Rel BAC hybrid approach and an EHR privacy mechanism. Moreover, the proposed model will also protect the access policy during transmission from public to private cloud. The proposed model (PRSX-AC) operates in three main phases as follows:
Phase A: XACML-hybrid RS-ABAC;
Phase B: XACML-EHR anonymization;
Phase C: XACML-policy anonymization.
In our proposed PRSX-AC model, we have divided EHR data users into three levels of domain users: ODU (hospitals, health professionals, family, and patients), PRDU (friends, relatives, and colleagues), and PBDU (medical research and institutions, pharmaceutical companies, and public users). We present the PRSX-AC model information flow with our proposed extension in the hybrid cloud in Figure 1, and we briefly explain each phase in the next sections. It is important to note that all three phases of the proposed model are performed at the private cloud. One benefit is that HCOs are relieved of all infrastructure and storage operations because all such operations are performed at the private cloud. Another benefit is that the vulnerability of the public cloud is reduced due to this processing shift.

Figure 1. Block diagram of PRSX-AC hybrid cloud model.
First, the HCO uploads the original EHR data to the private cloud. Domain users (ODU, PRDU, PBDU) send an EHR access request to the policy enforcement point (PEP). The PEP sends the access request to the context handler. The context handler converts it into an XACML request context and sends it to the policy decision point (PDP). The PDP requests subject or resource attributes (EHRs) from the context handler. The context handler requests the remaining missing attributes from a policy information point (PIP). The PIP obtains the requested attributes from the EHR data and returns the requested subject/resource attributes to the context handler. (a) If the access request is from an ODU, the context handler sends the request to the PDP, which evaluates the access policy and gives the access response to the PEP; the PEP then sends the response to the ODU. (b) For a PRDU access request, the context handler sends the request to the relationship reasoner to get the semantics of the relationship; once obtained, the relationship semantics are given to the context handler, which sends the access request to Phase B for EHR attribute anonymization; after receiving the response from Phase B, the context handler forwards it to the PDP through the PEP. The PDP evaluates the access policy and gives the access response to the PEP, which then sends the access response to the PRDU. (c) For a PBDU, the policy access request is given to the context handler, which sends the access request to Phase B and receives the response from it. Next, the context handler sends the access policy with the anonymized response to Phase C, where the access policy is anonymized with hashing. Phase C returns the response to the context handler, which then follows the same steps as in (b), and the access response is given to the PBDU.
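The three request paths (a) to (c) described above can be summarized as a small routing sketch. This is only an illustration under stated assumptions: the `Domain` enum, the `processing_steps` function, and the step names are hypothetical labels introduced here, not identifiers from the PRSX-AC prototype, and the common PEP, context handler, and PIP steps that precede the branching are omitted.

```python
from enum import Enum
from typing import List


class Domain(Enum):
    """The three user domains named in the text."""
    ODU = "original data user"
    PRDU = "private data user"
    PBDU = "public data user"


def processing_steps(domain: Domain) -> List[str]:
    """Return the ordered, domain-specific steps for an access request.

    Step names are hypothetical labels for the components in the text:
    the relationship reasoner, Phase B (EHR anonymization),
    Phase C (policy anonymization), and PDP policy evaluation.
    """
    if domain is Domain.ODU:
        # (a) ODU requests go straight to policy evaluation.
        return ["pdp_evaluate"]
    if domain is Domain.PRDU:
        # (b) PRDU requests need relationship semantics, then Phase B.
        return ["relationship_reasoner", "ehr_anonymization", "pdp_evaluate"]
    if domain is Domain.PBDU:
        # (c) PBDU requests pass through Phase B and then Phase C.
        return ["ehr_anonymization", "policy_anonymization", "pdp_evaluate"]
    raise ValueError(f"unknown domain: {domain!r}")


for d in Domain:
    print(d.name, "->", " -> ".join(processing_steps(d)))
```

Keeping the routing in one function makes it easy to see that only PBDU requests trigger policy anonymization, while only PRDU requests involve the relationship reasoner.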
Phase A: XACML-Hybrid RS-ABAC
We have extended the XACML-attribute-based access control model33,73 in the relationship and semantic context as XACML-Hybrid RelS-ABAC. In the proposed model, we use the Rel BAC model 36 concept in EHR access scenarios. In our proposed PRSX-AC model, we have used a hybrid relationship semantics–based and ABAC model for access control decisions on all requests that come from EHR users. Access requests from ODU and PBDU will get an access response from XACML-ABAC in the hybrid model. However, access requests from PRDU will be processed by the relationship semantics–based approach in the proposed model. In this category, different patient relationships are introduced as shown in Figure 3(a). Standard XACML cannot interpret patient relationship access requests with their semantic meaning. For this purpose, we have used a relationship reasoner in the PRSX-AC model. When PRDU request EHRs, the request is forwarded to the hybrid Rel ABAC model in extended XACML. Next, missing relation attributes are requested from the relationship reasoner, and the Ontology Point contains the ontology for the various relationships. As given in Giunchiglia et al., 36 “An ontology is capable of describing concepts, e.g. persons, which exist in a certain domain and relationships among them.” Ontologies are described in the Semantic Web in the Web Ontology Language (OWL). The process of drawing conclusions and gaining new information from ontologies takes place through inference engines. Simple inferences can be made with Resource Description Framework Schema (RDFS) and OWL, for instance, through inheritance; however, complex custom inference rules require a special rule language like the Semantic Web Rule Language (SWRL). In our proposed model, the inference engine performs relationship reasoning based on logical inference rules. Extended XACML decides on the access response and gives it to the PRDU. The process of XACML policy evaluation with relationship semantics is given in Figure 2.

Figure 2. XACML-hybrid RelS-ABAC policy evaluation with relationship-based semantics.
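To make the idea of inheritance-based relationship reasoning more concrete, the following is a minimal sketch in plain Python rather than OWL or SWRL. Everything in it is an assumption for illustration: the relationship hierarchy in `SUBCLASS_OF`, the permission names in `PERMISSIONS`, and both function names are hypothetical and are not taken from the ontology or inference rules of the paper. It only shows how a reasoner can infer that a specific relationship (for example, a brother) inherits the permissions of the more general concepts it belongs to.

```python
# Hypothetical relationship hierarchy: each concept maps to its parent,
# in the spirit of rdfs:subClassOf inheritance.
SUBCLASS_OF = {
    "brother": "sibling",
    "sister": "sibling",
    "sibling": "relative",
    "cousin": "relative",
    "relative": "private_data_user",
    "friend": "private_data_user",
    "colleague": "private_data_user",
}

# Hypothetical permissions attached to concepts in the hierarchy.
PERMISSIONS = {
    "relative": {"view_appointments"},
    "private_data_user": {"view_anonymized_ehr"},
}


def ancestors(relationship):
    """Return the relationship followed by every concept it inherits from."""
    chain = [relationship]
    seen = {relationship}
    while chain[-1] in SUBCLASS_OF:
        parent = SUBCLASS_OF[chain[-1]]
        if parent in seen:
            raise ValueError("cycle in relationship hierarchy")
        chain.append(parent)
        seen.add(parent)
    return chain


def inferred_permissions(relationship):
    """Collect the permissions of the relationship and all its ancestors."""
    perms = set()
    for concept in ancestors(relationship):
        perms |= PERMISSIONS.get(concept, set())
    return perms


print(ancestors("brother"))
print(sorted(inferred_permissions("brother")))
print(sorted(inferred_permissions("friend")))
```

In a deployment along the lines described in the text, this lookup would be replaced by an inference engine operating over an OWL ontology, with custom rules expressed in a rule language such as SWRL.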
We have assumed Subject, Object, and Permission hierarchies in our proposed EHR access approach. For this purpose, we present a mapping of subjects to patient relationships, objects to EHR data, and permissions to EHR permissions, and present their hierarchies. For access control decisions, we use the Rel BAC logic, which allows us to express and reason about patient relationships with objects (EHR data) to form permissions and rules. We present a short description of how it can be used to write more expressive relationship-based access control policies for EHR access scenarios. We have sets of users and objects formalized as atomic concepts. Permissions are formalized as description logics (DL) roles (not to be confused with the RBAC roles)
where

Figure 3. (a) Subject (patient relationship), (b) object (EHR data), and (c) permission (EHR permissions) hierarchies.
RelS-BAC rules and description.
RelS-BAC: EHR policy rules and representation.
EHR: electronic health record.
Phase B: XACML-EHR anonymization
The privacy technique Anatomy was developed to overcome the defects of generalization and to achieve better utility in data publishing. Anatomy 14 produces two tables, a Quasi Attribute Table (QAT) and a Sensitive Attribute Table (SAT), which separate the QI-values from the sensitive values. Anatomy does not modify the quasi-identifier or the sensitive attribute, but it releases the QAT and SAT separately to dissociate the relationship between the two. The QAT contains the quasi attributes, the SAT contains the sensitive attributes, and both QAT and SAT have one common attribute, Group-ID. All records in the same group have the same Group-ID value in both tables, which helps in linking the sensitive attribute values to the group. Every group must have distinct sensitive attribute values, and each distinct sensitive value occurs exactly once in the group. In generalization, quasi attribute values are generalized, whereas in Anatomy, QAT values remain in their original form; therefore, Anatomy is considered a better approach than generalization. Figure 4 shows the process of EHR anonymization performed with the privacy technique Anatomy. The anatomization process that we perform on EHR data tables is described in Algorithm 2, with complete details in the next section.

The process of EHR anonymization in XACML-privacy by Anatomy.
Phase C: XACML-policy anonymization
After EHR data anonymization, policy anonymization is performed in the PRSX-AC model. Although EHR data at the public cloud is in anonymized form, transmitting the access policy without anonymization would open a channel for collusion between the CSP and an unauthorized malicious entity. Once the policy is anonymized, the CSP and unauthorized data users at the public cloud cannot gain information that could be used for malicious purposes. In the proposed model, policy anonymization is performed with the MD5 hash function. First, the access policy is parsed to extract the logical and relational operators; then, the remaining attributes are anonymized using the hashing algorithm. Figure 5 shows the process of access policy anonymization and the policy format before and after anonymization.
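The two steps just described (keep the operators, hash everything else) can be sketched as follows. This is a minimal illustration, not the prototype's code: the flat textual policy syntax, the operator set `OPERATORS`, and the function name `anonymize_policy` are assumptions made for the example, whereas a real XACML policy is an XML document.

```python
import hashlib
import re

# Operators kept in clear text. This set is an assumption for illustration;
# the article does not list the exact operators it preserves.
OPERATORS = {"AND", "OR", "NOT", "=", "!=", "<", ">", "<=", ">=", "(", ")"}

# Split a policy string into operator symbols and attribute/value words.
TOKEN_RE = re.compile(r"<=|>=|!=|[=<>()]|[^\s=<>!()]+")


def md5_hex(text):
    """Return the MD5 digest of a string as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def anonymize_policy(policy):
    """Hash every token that is not a logical or relational operator."""
    tokens = TOKEN_RE.findall(policy)
    return " ".join(
        tok if tok.upper() in OPERATORS else md5_hex(tok) for tok in tokens
    )


anonymized = anonymize_policy("role = doctor AND dept = cardiology")
```

Because the hash is deterministic, the same attribute name or value always maps to the same digest, so the structure of the anonymized policy can still be matched at the public cloud without exposing the plain attribute names.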

The process of access policy anonymization with hashing and policy format.
PRSX-AC model: access control and anonymization algorithms
In this section, we define the three algorithms of the PRSX-AC model in full detail: the privacy-aware relation-based access control (PR-AC) algorithm, the XACML anonymization algorithm, and the XACML-policy anonymization algorithm. We present the formal specification and modeling of the algorithmic details in the next section.
In Algorithm 1, first, Original EHR outsourcing from HCO to the private cloud is performed. EHR original entities
In Algorithm 2, given an EHR data table ET and a parameter l, we obtain a pair of tables QAT and SAT for publication. First, an l-diverse partition of ET is computed, and then the QAT and SAT are produced from the l-diverse partition. To compute the partition, the algorithm hashes the tuples of ET into hash buckets by their sensitive values SV so that each bucket contains the tuples with the same SV value. The QI-group-creation step is performed in iterations and continues as long as there are at least l non-empty hash buckets. To build a new QI-group QGc, the algorithm first obtains a set Sl consisting of the l hash buckets that currently hold the largest number of tuples. Then, from each hash bucket in Sl, a random tuple is selected, removed from its bucket, and added to QGc. Therefore, QGc contains l tuples with distinct SV values. The next step is tuple-residue-assignment, which is performed for each residue tuple t: the algorithm collects a set SO of QI-groups (produced in the previous step) in which no tuple has the same SV value as t, and t is assigned to one of these groups. Finally, the anonymized QAT and SAT tables are published.
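The steps of Algorithm 2 can be sketched in a few lines. The version below is an illustrative reading of the description above, not the authors' implementation: rows are assumed to be dictionaries, the function and parameter names (`anatomize`, `qi_attrs`, `sensitive_attr`, `seed`) are hypothetical, and the SAT simply lists one row per tuple because each sensitive value occurs once per group.

```python
import random
from collections import defaultdict


def anatomize(rows, qi_attrs, sensitive_attr, l, seed=0):
    """Split rows into a QAT and a SAT linked only by Group-ID."""
    rng = random.Random(seed)

    # Hash tuples into buckets by sensitive value (SV).
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[sensitive_attr]].append(row)

    # QI-group creation: repeat while at least l buckets are non-empty.
    groups = []
    while sum(1 for b in buckets.values() if b) >= l:
        non_empty = [sv for sv in buckets if buckets[sv]]
        largest = sorted(non_empty, key=lambda sv: len(buckets[sv]), reverse=True)[:l]
        group = []
        for sv in largest:
            bucket = buckets[sv]
            # Remove one random tuple from each of the l largest buckets.
            group.append(bucket.pop(rng.randrange(len(bucket))))
        groups.append(group)

    # Tuple-residue assignment: place each leftover tuple in a group
    # that does not yet contain its sensitive value.
    for sv, bucket in buckets.items():
        for t in bucket:
            candidates = [g for g in groups if all(r[sensitive_attr] != sv for r in g)]
            if not candidates:
                raise ValueError("table does not admit an l-diverse partition")
            rng.choice(candidates).append(t)

    # Publish QAT (quasi attributes + Group-ID) and SAT (Group-ID + SV).
    qat, sat = [], []
    for gid, group in enumerate(groups, start=1):
        for r in group:
            qat.append({**{a: r[a] for a in qi_attrs}, "Group-ID": gid})
            sat.append({"Group-ID": gid, sensitive_attr: r[sensitive_attr]})
    return qat, sat
```

Calling `anatomize(rows, ["age", "zip"], "disease", l=2)` on a small table returns a QAT whose quasi attribute values are unchanged and a SAT in which every group holds at least two distinct disease values.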
XACML-policy anonymization algorithm, given with plain access policy
Formal specification, modeling, and verification of PRSX-AC model
In this section, we reduce the level of abstraction through detailed modeling and formal analysis of the proposed PRSX-AC model. We use high-level Petri nets (HLPN) and the Z language for the modeling and analysis of the proposed model. According to Malik et al., 74 HLPN serves two purposes: (a) to simulate the proposed system and (b) to provide a mathematical representation from which the behavior and structural properties of the proposed model can be analyzed. A formal model and analysis of the proposed system exposes (a) the interconnection of the model components and processes, (b) the fine-grained details of the flow of information among the processes, and (c) how the information processing takes place. The verification of the proposed model is performed using SMT; for this purpose, the Petri net models are first converted into SMT together with the specified properties. After that, the Z3 solver is used to check whether the model satisfies the required properties. In this study, we use HLPN to perform the formal specification and modeling of the proposed algorithms. An HLPN is a 7-tuple, N = (P, T, F, ϕ, R, L, M0):
T represents a set of finite transitions, such that
F denotes the flow relation from place to transition or transition to place, such that
R represents the set of rules that maps T to logical formulas, such that
L denotes the labels that are mapped on each flow in F, such that
M0 represents the initial state where the flow can be initiated, such that
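To make the roles of these components more concrete, the following sketch encodes a very small net in Python. It is an illustration under simplifying assumptions rather than the formal HLPN of this article: tokens are not consumed when a transition fires, flow labels L are omitted, the rule R is an ordinary function, and the place and transition names only loosely follow the outsourcing step of the model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Transition:
    inputs: List[str]    # places with an arc into the transition (part of F)
    outputs: List[str]   # places with an arc out of the transition (part of F)
    # R: given the tokens of the input places, return the tokens to add to
    # output places, or None when the transition is not enabled.
    rule: Callable[[Dict[str, Set]], Optional[Dict[str, Set]]]


class HLPN:
    def __init__(self, types, marking, transitions):
        self.types = types              # phi: place name -> token data type
        self.marking = marking          # current marking, initially M0
        self.transitions = transitions  # T: transition name -> Transition

    def fire(self, name):
        """Fire a transition if its rule is satisfied; return True on success."""
        t = self.transitions[name]
        produced = t.rule({p: self.marking[p] for p in t.inputs})
        if produced is None:
            return False
        for place, tokens in produced.items():
            if place not in t.outputs:
                raise ValueError(f"{name} has no arc to {place}")
            if not all(isinstance(tok, self.types[place]) for tok in tokens):
                raise TypeError(f"token of wrong type for place {place}")
            self.marking[place] |= tokens
        return True


def outsource(inp):
    """Hypothetical rule: copy every EHR record from HCO to EHRo."""
    records = inp["HCO"]
    return {"EHRo": set(records)} if records else None


net = HLPN(
    types={"HCO": tuple, "EHRo": tuple},
    marking={"HCO": {("p1", "flu")}, "EHRo": set()},
    transitions={"Outsrc_EHRo": Transition(["HCO"], ["EHRo"], outsource)},
)
fired = net.fire("Outsrc_EHRo")
```

After the call, the place `EHRo` holds the record that was initially in `HCO`, which mirrors how a rule in R constrains what a transition may write to its output places.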
To represent a system in HLPN, we first define the set P of places and the associated data types; after that, we define the set of rules involved in the HLPN. Figure 6 depicts the HLPN of the PRSX-AC model. The notations used are presented in Table 4. Table 5 shows the places, mappings, and descriptions involved in the PRSX-AC model HLPN. As shown in Figure 7, there are 20 places and 18 transitions involved in the PRSX-AC model, so we have divided its HLPN model into Phase A, Phase B, and Phase C, as in the previous section. The logical details of the proposed model were described in the previous section; here, we focus on the specification and modeling of the PRSX-AC model phases.

HLPN for privacy aware relationship semantics-based XACML access control model (PRSX-AC).
Summary of notations.
EHR: electronic health records.
Places and mapping used in PRSX-AC HLPN.
PRSX-AC: privacy-aware relationship semantics–based XACML access control model; HLPN: high-level Petri nets; EHR: electronic health records; QIG: quasi-identifier group; F-QIG: final quasi-identifier group; QAT: quasi attribute table; SAT: sensitive attribute table.

HLPN of Phase A-XACML-hybrid RS-ABAC.
Modeling and analyzing: Phase A-XACML-hybrid RS-ABAC
The HLPN model of the PRSX-AC model starts by taking inputs from the HCO and storing them in EHRo. The transition Outsrc EHRo stores the EHR in the private cloud. EHR domain users (EHRu) send access requests to the PEP; Phase A describes the request of original users (Ou), but the remaining users (Pru and Pbu) can send requests in the same way. The PEP sends the access request to the context handler, as given in equations (1) and (2). Contxt Handlr converts the user request into an XACML request context and sends it to the PDP; in addition, Contxt Handlr performs request-forwarding activities for the PDP, PEP, and PIP. However, its main functionality is given in equation (3). The HLPN of Phase A-XACML-hybrid RS-ABAC is shown in Figure 7.
The PDP requests attributes from the
In equation (6), transition
Modeling and analyzing: Phase B-XACML-EHR anonymization
In Phase A, we modeled the XACML-based request/response when ODU and PRDU participate in the PRSX-AC model. An ODU receives an XACML-based response drawn from the Original EHR data. When a request is received from a PRDU, the relationship semantics are resolved, because XACML lacks semantic interpretation of the participating entities. The HLPN of Phase B-XACML-EHR anonymization is shown in Figure 8.

HLPN of Phase B-XACML-EHR anonymization.
In Phase B, EHR data anonymization is modeled as follows. EHRs from the PIP that have the same sensitive values are hashed and stored in HB, as shown in equation (8). After that, tuples are taken from the set of l largest buckets SOB, and quasi groups are formed with the union of tuples to quasi groups in function
In residue assignment process,
Modeling and analyzing: Phase C-XACML-policy anonymization
Phase B modeled the XACML anonymization algorithm; in Phase C, we model the policy anonymization algorithm. Access policy anonymization is necessary to prevent privacy disclosures that may occur when the policy is transmitted from the private cloud to the public cloud. The access policy is received from the PEP, as shown in the HLPN of Phase C-XACML-policy anonymization. The transition P-Anonymization compares operators from SOP with the policy and hashes the compared policy attributes, as given in the rule in equation (14). Figure 9 shows the HLPN of Phase C-XACML-policy anonymization.

HLPN of Phase C-XACML-policy anonymization.
In equation (15), the transition Send-Data sends the anonymized EHR tables AnQAT and AnSAT, together with An Policy, to the public cloud. Phase C completes when the access response is given to EHRe in the last transition, Acs-Resp, as given in equation (16).
Formal verification of PRSX-AC model
We presented the formal modeling and analysis of the proposed (PRSX-AC) model in the previous section. In this section, we present the verification of the security and privacy properties of the PRSX-AC model. The verification process demonstrates the correctness of the base system. Verifying a proposed model or system requires a system specification and a set of properties. 74 In this work, we use the bounded model checking 75,76 technique to perform the verification, using SMT-Lib and the Z3 solver. In bounded model checking, the system description is verified by checking whether any valid input drives the system into a state where the system always terminates after a finite number of steps. Bounded model checking involves three tasks: specification, that is, the properties or rules that the system must satisfy to prove its correctness; modeling, that is, the representation of the system; and verification, in which a tool checks whether the model satisfies the specification. The definition of bounded model checking 74 is given as, “Formally, given a Kripke Structure
Property 1. Authorization request: an access request from an ODU or PRDU for EHR data is given to the private cloud. Any EHR access attempt from an unauthorized user at the public or private cloud will be denied.
Property 2. EHR anonymization: EHR data will be anonymized through Anatomy and stored in the anonymized EHR repository. The EHR anonymization property ensures that EHR data are anonymized so that patients' sensitive attributes are preserved against privacy attacks such as identity disclosure and attribute disclosure.
Property 3. Multipurpose utility: the PDP evaluates the access request against the access policy stored in the PAP, and permissions are given depending on the type of user:
If the access request comes from OD users, it will get a response from the Original EHR data at the private cloud.
If the access request comes from PRD users, it will get a specific permission response, based on relationship, from the anonymized EHR data at the public cloud.
If the access request comes from PBD users, it will get a permission response from the anonymized EHR data at the public cloud.
Any other, unauthorized access request, such as PRD users requesting Original EHR data, will result in the response Deny/Not Applicable.
Property 4. Policy anonymization: the access policy will be anonymized before transmission to the public cloud. This property avoids possible attacks such as data spoofing, unintended EHR data modification, and collusion attacks at the public cloud.
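The access-related properties above lend themselves to a small executable check. The sketch below is not the SMT-Lib/Z3 encoding used for the verification; it is a plain exhaustive enumeration over a toy decision function, assuming four user types (`ODU`, `PRDU`, `PBDU`, and a hypothetical `UNKNOWN` user), two data forms, and a Boolean flag for whether a PRD user holds a valid relationship.

```python
from itertools import product

USERS = ["ODU", "PRDU", "PBDU", "UNKNOWN"]   # UNKNOWN is a hypothetical outsider
RESOURCES = ["original", "anonymized"]

# Combinations explicitly permitted by Property 3; everything else is denied
# (treating unlisted combinations as Deny is an assumption of this sketch).
PERMITTED = {("ODU", "original"), ("PRDU", "anonymized"), ("PBDU", "anonymized")}


def decide(user, resource, has_relationship=True):
    """Return 'Permit' or 'Deny' for an access request."""
    if (user, resource) not in PERMITTED:
        return "Deny"
    if user == "PRDU" and not has_relationship:
        return "Deny"   # PRD access depends on a valid relationship
    return "Permit"


def check_properties():
    """Enumerate every request and collect violations of Properties 1 and 3."""
    violations = []
    for user, resource, rel in product(USERS, RESOURCES, [True, False]):
        decision = decide(user, resource, rel)
        # Property 1: unauthorized users are always denied.
        if user == "UNKNOWN" and decision == "Permit":
            violations.append((user, resource, rel))
        # Property 3: only OD users may receive original EHR data.
        if resource == "original" and user != "ODU" and decision == "Permit":
            violations.append((user, resource, rel))
    return violations
```

With this decision function, `check_properties()` returns an empty list, meaning that no combination of user type and data form in the enumerated space violates the two properties.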
The verification results of PRSX-AC model are given in Figure 10.

Verification results of PRSX-AC model.
Experimental results and discussion
In this section, we present the experimental results to check the effectiveness of proposed (PRSX-AC) model-based approach. The performance and optimization parameters are evaluated in terms of response time and space requirement.
Preparation and settings
To evaluate our idea, we implemented a prototype that compiles XACML policies into MSSQL ACLs. For this purpose, we designed a resource database (hospital) populated with random data because sufficient real data were not available. The Patient attribute table consists of 50,000 patient EHRs with 25 attributes (attr0–attr24) each. 77 All experiments were carried out on a 2.4 GHz Intel Core™ i3 machine with 8 GB of memory running Windows 10. We used the MSSQL database server, version 12.0.2000.8, in our experimental verification.
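One way to picture such a compilation step is a function that turns a simplified policy record into a column-level T-SQL permission statement. The sketch below is a guess at the general shape only, not the prototype itself: the dictionary layout of the policy, the action mapping, and the function name `policy_to_statement` are assumptions, and parsing of the actual XACML XML is left out.

```python
import re

# Accept only simple identifiers so that nothing unexpected reaches the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Assumed mapping from policy actions to SQL Server permissions.
_ACTIONS = {"read": "SELECT", "update": "UPDATE"}


def _quote(name):
    """Validate an identifier and wrap it in T-SQL square brackets."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f"[{name}]"


def policy_to_statement(policy, table="Patient", schema="dbo"):
    """Translate a simplified policy dict into a column-level GRANT or DENY."""
    verb = "GRANT" if policy["effect"] == "Permit" else "DENY"
    permission = _ACTIONS[policy["action"]]
    columns = ", ".join(_quote(c) for c in policy["attributes"])
    target = f"OBJECT::{_quote(schema)}.{_quote(table)}"
    principal = _quote(policy["subject"])
    return f"{verb} {permission} ON {target} ({columns}) TO {principal};"


example = policy_to_statement(
    {"effect": "Permit", "action": "read",
     "attributes": ["attr0", "attr1"], "subject": "PRDU"}
)
```

Restricting the statement to the requested columns reflects the attribute-based side of the model: a database principal receives permissions only on the attributes named in the policy.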
Access policy response time
When users from different domains access EHR data, access response time is a critically important factor for HCOs. Figure 11 plots execution time in seconds on the y-axis against the number of policies on the x-axis. We can deduce from Figure 11 that as the number of records increases, the retrieval time also increases linearly. Figure 11 also shows a noticeable increase in the response time of PRDU compared with ODU and PBDU. As reported in Esposito, 78 ontological responses produce a performance overhead; in our hybrid Rel ABAC model, the ontological representation and inference create approximately the same overhead. However, anonymization of EHR data imposes less overhead than PRDU, owing to the use of the privacy technique Anatomy. We have taken the execution time of Anatomy as given in Shyamala and Christopher. 79 Because our proposed model also grants access to users from other domains, it improves multipurpose EHR usage in health scenarios.

Policy response time.
Space requirement
For the space requirement, we take the average in each analysis and round it to the nearest integer. The space requirement increases linearly with the number of attributes, so the approach is scalable in this respect. Figure 12 plots the storage space (in MB) on the y-axis against the number of attributes (n) on the x-axis. Figure 12 shows a negligible difference in space requirement between original attributes and anonymized attributes: Anatomy keeps the original attributes even after the privacy technique is applied, so the anonymized attributes take no additional space. By contrast, applying generalization-based privacy techniques to EHR data records markedly increases the space requirement. This supports the view that privacy techniques of the Anatomy type can be used more effectively in privacy-aware EHR access scenarios.

Space requirement for EHRs.
Discussion
Experimental results show that we have achieved the design goals of the proposed (PRSX-AC) model. These design goals are explained in detail in section “PRSX-AC model: design goals.” The response time results show that different data user entities receive the response they require, depending on their request. The access control mechanism prevents unauthorized access to EHR data; moreover, it saves the system overhead of full EHR data retrieval. However, the ontological response time for relationships is higher than the anonymized and original response times. This increase in execution time is due to the use of ontology in relationships. The anonymized response introduces a minor delay, which is acceptable as protection against the disclosure of personal sensitive EHR information. Although the number of requested attributes in an access policy affects response time (access time increases with the number of attributes), our proposed approach uses a hybrid ABAC model in PRSX-AC. Its advantage is that we can selectively anonymize the requested attributes, so access time does not depend directly on the total number of EHR attributes. Multipurpose EHR usage is achieved because the response to each data user entity is given according to its specific requirement. Privacy preservation of EHR data is performed through the anonymization technique Anatomy. Our solution defends against external threats through policy anonymization and against internal threats because authorized users in the public cloud also receive only the anonymized version of EHR data under the access control mechanism. The privacy technique Anatomy protects EHRs’ sensitive information against disclosure. The space requirement results show that Anatomy is highly suitable, as it creates no space overhead compared with other privacy-preserving techniques.
Conclusion
Providing privacy-aware fine-grained EHR access in the cloud is a challenging task. Cloud-based EHR systems have shown great potential to improve the quality of service and the utilization of EHR data across medical institutions. However, most proposed solutions do not fully address privacy preservation together with multipurpose EHR usage in the hybrid cloud. A comprehensive analysis of privacy-preserving solutions for cloud-based EHRs shows that, although hybrid cryptographic access control schemes provide highly developed solutions, they are still not sufficient to support privacy preservation with multipurpose EHR data utility. These solutions also lack fine-grained relationship semantics for EHR access combined with an efficient privacy preservation mechanism. Our proposed (PRSX-AC) model is based on exploratory research into related work. We have extended the XACML attribute-based access control mechanism with (Rel BAC) semantics, the privacy technique Anatomy, and access policy anonymization in the hybrid cloud. Our (PRSX-AC) model provides privacy with maximum EHR data utility. We have given relationship-based EHR access scenarios in the PRSX-AC model to make the model easier to understand. The proposed model (PRSX-AC) is formally verified along with its security and privacy-preserving properties. Our experimental results show that the implemented prototype of the PRSX-AC model is effective in terms of performance and optimization parameters.
Footnotes
Handling Editor: Mohsin Raza
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
