Optimization of dynamic data traceability mechanism in Internet of Things based on consortium blockchain

Abstract

The Internet of Things (IoT) is widely used in fields such as industry, medical care, education, and supply chain. With the participation of multiple authorized entities, a large amount of dynamic data is generated over time. Operations on these data must be secure and traceable so that they can support forensics and decision-making. The key to protecting dynamic data is therefore to reject tampering by unauthorized users and to preserve evidence of, and trace, every operation on the data. To address this problem, this article proposes an optimized dynamic data traceability mechanism based on consortium blockchain. First, a mathematical model for the security of dynamic data storage is established, followed by an analysis of the motive for honest behavior in individual node decision-making in a group game and of the essence of distributed node cooperation in a specific industry background. Then, the ownership transition function and the architecture of the dynamic data storage system are optimized, and the quality and growth characteristics of the system under a stochastic state model are analyzed. The results show that the solution can effectively resist potential attacks such as tampering and faking under the approved accession mode. The mechanism has good application value while ensuring the security of dynamic data storage.

Keywords

Consensus, consortium chain, dynamic data, game theory

Introduction

The Internet of Things (IoT) connects an extraordinarily large number of devices to the Internet. In fields such as industry, medical care, education, and supply chain, new data are generated over time with the participation of multiple authorized entities; such data are called dynamic data in this article. Handling the massive amount of data generated by these devices in an efficient, secure, and economical way is an essential research subject. Dynamic data and the operations on them require stringent security and traceability to support forensics and decision-making.^1,2 Dynamic data share the following characteristics: (1) Continuity. As time goes on, new dynamic data are generated continuously by multiple entities. (2) Time sensitivity. Dynamic data are sensitive to their generation time and application time; for example, data generated during a certain period may be used to predict future tendencies. (3) Multidimension. In different applications, dynamic data have multiple dimensions besides time, for example, address and trading volume in a supply chain system, or operation history and permission settings in an industrial control system.^3,4 (4) Availability. Dynamic data should be available and meet the management requirements of users, especially enterprise users,⁵ such as log analysis and checking and the investigation of illegal operations. Traceability is an important prerequisite for the integrity and reliability of dynamic data and an important manifestation of their availability. Therefore, it is important to ensure the traceability of dynamic data.

Traceability of dynamic data covers both the data themselves and the historical operations on them. The prerequisite for the availability of dynamic data is their integrity and reliability, namely, that the data are not tampered with or falsified during storage and transfer. However, preventing tampering and forgery of dynamic data is becoming increasingly challenging for two main reasons. First, over the last few years, cyber-crime has changed from an individual act to an organizational act. This transformation has given attackers larger budgets, more resources, and greater sophistication, making them more professional in data tampering and forging.⁶ Second, the existing data infrastructure was originally designed for the legitimate storage of dynamic data and carries potential security risks. For example, dynamic data are usually stored in a central database combined with certain safety measures, such as access authentication, access control, information encryption, and digital watermarking,^7–10 or stored with cloud-based data storage technology, which abstracts all kinds of data resources into resource pools^11,12 that can be provided to users in a more flexible and convenient way. However, these traditional methods have a variety of problems, such as vulnerability to attack because of their concentrated value and the high complexity of their algorithms.

To improve the security of dynamic data storage, the system must be strengthened in two respects. On the one hand, the correctness of the dynamic data must be verifiable, and the data must not be tampered with or faked. On the other hand, the system should make the operation history of dynamic data traceable and provide the capability of data recovery. In view of these problems, this article proposes recording the whole life cycle of dynamic data on a blockchain to achieve secure and reliable storage and to ensure the integrity, reliability, traceability, and permanence of dynamic data.

In this article, we analyze the consistency between the local behavior of consensus terminals maximizing their own benefits and the overall goal of ensuring system security and effectiveness, and we propose a consensus mechanism suited to dynamic data storage accordingly. Compared with proof of work (PoW), the proposed consensus mechanism greatly reduces the waste of power and computing resources. With a key distribution mechanism, dynamic data stored at different levels can be transmitted and verified, and the security of the system is improved through secondary hash iteration in the communication between adjacent levels and through asymmetric encryption operations.

The major work of this article is as follows:

(1) By analyzing the motive for honest behavior of an individual node making decisions in a group game and the essence of distributed node cooperation in a specific industry background, a consensus mechanism applicable to dynamic data storage is proposed, and the security of dynamic data is guaranteed by consensus based on blockchain technology.

(2) A hierarchical traceability mechanism for dynamic data transferred dynamically among multiple entities is put forward. A communication channel is built on the key mechanism, and secure end-to-end encrypted data transmission is achieved. Tampering with or faking dynamic data is effectively prevented by providing a timing relationship while using the positive and negative asymmetric encryption operations.

(3) A universal distributed dynamic data traceability mechanism for IoT is put forward based on the blockchain consensus mechanism and information transmission mechanism proposed in (1) and (2), providing an effective means for dynamic data monitoring and protection.

This article is organized as follows. First, current traceability technologies and the concepts of blockchain are introduced, followed by an analysis of the applicability of consortium blockchain to the security of IoT dynamic data storage. Second, universal methods and processes for the security of dynamic data operations are abstracted, and the dynamic data storage model is proposed. After that, the consensus boundary conditions of IoT are analyzed based on game theory, a consortium blockchain consensus mechanism based on a verification nodes list is proposed, and a dynamic data storage architecture based on this consensus mechanism is put forward. Finally, theoretical analysis and deployment experiments demonstrate the effectiveness of the solution in resisting common attacks and in tracing dynamic data operations.

Related work

Traceability methods

Traceability is a key activity in many industries. Many modern traceability systems are based on radio frequency identification (RFID) technology. Alfian et al.¹³ proposed an e-pedigree food traceability system that uses RFID technology to track and trace product location and a wireless sensor network to collect temperature and humidity during storage and transportation. Depending on the application of the RFID system, security solutions have specific characteristics and requirements. Several security approaches, in various application domains, are based on a central database with large computational resources.¹⁴ An existing centralized database is vulnerable to attackers because of its concentrated value, so the security and authenticity of its data cannot be guaranteed. Nissim et al.¹⁵ reviewed 29 different USB-based attacks and classified them into four major categories. They put forward a method to identify the associated and vulnerable USB peripherals and hardware for each attack, but the method presupposes that the attack is allowed to occur. Dawle et al.¹⁶ proposed a database intrusion detection mechanism that enhances database security by logging all the activities of an intruder who uses SQL injection on a website. The administrator can view the details and prevent attackers from injecting malicious code into the database and from stealing, destroying, or modifying it. However, in the view of Yang and Chen,¹⁷ a single-party trust mechanism cannot prevent malicious tampering with or falsification of dynamic data by staff with advanced access rights. Moreover, the above algorithms are usually very complex. The volume of dynamic data generated, and its growth over time, will be extremely high. In addition, the processing and storage performance of the intelligent processing terminals or field sampling devices used for collecting, coding, and storing dynamic data is usually limited; hence, a high-complexity security algorithm is not suitable for solving the problem above.¹⁸

Cloud storage services have been widely adopted by diverse organizations, and through them users can conveniently share data with others. Zhang et al.¹⁹ pointed out that cloud data are open to multiple authorized users, and that sufficient evidence of data whereabouts and of the operation histories of subjects at all levels can hardly be recorded and provided. Therefore, cloud storage cannot meet the requirement of recording the entire process of dynamic data in some special fields (such as industrial control systems and traceability systems), and it is difficult to determine responsibility under cloud storage. To solve the problem that uncontrolled malicious modifications may wreck the usability of shared data, Yang et al.²⁰ proposed an efficient public auditing solution that preserves identity privacy and identity traceability for group members simultaneously. On a cloud platform, however, users cannot build trust with cloud service providers, nor can they make sure through the web front-end interface that the providers will fulfill the service agreements.²¹ To prevent sensitive information from being stolen, tampered with, or faked, the system needs an absolutely reliable cloud platform service provider, which is unrealistic.

As can be seen from the above, RFID is suitable only for tracking tangible assets, not for dynamic data or intangible assets. Centralized storage, such as cloud computing, can store dynamic data but cannot guarantee the security of the data. These traditional centralized databases have inherent defects in preventing malicious users (including internal personnel with advanced permissions) from tampering with or faking dynamic data.^22,23 Blockchain can provide tamper resistance and other security guarantees without a third party. By analyzing blockchain application scenarios for dynamic data involving multiple agencies, this article puts forward a method that limits the consensus and visibility of blocks to within the consortium blockchain, providing a new way to solve the storage and traceability problems of dynamic data simultaneously.

Blockchain

Blockchain technology is a new “distributed accounting system.”²⁴ Since S Nakamoto²⁵ published the groundbreaking paper “Bitcoin: A peer-to-peer electronic cash system” in 2008, blockchain has emerged as a new trust mechanism. It adopts simple technologies such as hash chains, work-induced delay, and incentive mechanisms to bypass problems that traditional academic approaches had not resolved. Through decentralization and de-trust, the problem of cooperation and mutual trust among multiple parties has been solved almost perfectly. “Universal participation” and “shared writing” of data are realized, so that a reliable database and point-to-point value registration and transfer are maintained collectively.^26,27 As an underlying security technology, blockchain has attracted attention from cryptography and other domains. Researchers have conducted extensive research on blockchain technology, including analysis of the protocol^28,29 and applications of blockchain technology in special fields.^30–32

Generally, according to the restriction on participating nodes, blockchain can be roughly divided into three types, namely, public blockchain, consortium blockchain, and private blockchain.

Public blockchain. Everyone can check transactions and verify them anonymously and can also participate in the consensus process. For example, Bitcoin²⁵ and Ethereum³³ are both public blockchains.

Consortium blockchain. A consortium chain is limited to members of the alliance. Read–write access permission and accounting qualification are formulated according to the rules of the alliance. The data in the blockchain can be open or private, and the chain is regarded as partly decentralized. R3CEV³⁴ and Hyperledger³⁵ are both consortium blockchains.

Private blockchain. Private blockchain is only used by private organizations. Read–write access permission and accounting qualification are formulated according to the rules of the private organization.

Consortium blockchain can set the openness to the public according to the application scenario. The network is maintained by the member agencies, and nodes are accessed through the gateway of the member agencies. Therefore, consortium blockchain is applicable to storage, management, authorization, monitoring and auditing of dynamic data by multiple agencies in specific background.

Dynamic data storage security model

To address the security of dynamic data storage in the transaction process of the instance system, the following assumptions are made:

The attack source is unlimited, and attackers arrive separately and independently.

The arrival of attackers follows a Poisson distribution with parameter $λ$, the average number of attacks per unit time.

The tampering of dynamic data caused by each attack follows a negative exponential distribution with parameter $μ$.

The intervals and results of the attacks are independent of one another.

Let M be the number of terminals in the instance system and U the set of information codes of dynamic data, $U = \{u_{1}, u_{2}, \dots, u_{n}\}$, $u_{i} = \langle u_{i}.code, u_{i}.state \rangle$, where $u_{i}.code$ is the corresponding dynamic data information code and $u_{i}.state$ represents the state of the data file, with $u_{i}.state \in \{1, 0, -1\}$ $(i \in N)$. The three values correspond to the tampered, unchanged, and faked conditions of the data file, respectively. L is the safe operation condition of the system. $S_{L} (t_{1}, t_{2})$ describes the extent of damage to the system under safe operation condition L within time $[t_{1}, t_{2}]$ under all kinds of attacks. $P_{L} (t_{1}, t_{2}, k)$ is the probability that data files are tampered with or faked k times under safe operation condition L within time $[t_{1}, t_{2}]$ under all kinds of attacks. $R_{B} (S)$ is the risk factor of the system under security algorithm B, also called the system robustness factor. $p_{n} (t)$ represents the probability that the system has been attacked n times by time t.

According to the assumptions, when $Δ t$ is sufficiently small, the probability of one attacker arriving during time $[t, t + Δ t]$ is $λ Δ t$ . Therefore, at the moment $t + Δ t$ , the probability $p_{n} (t + Δ t)$ that the system has been attacked n times is

\begin{matrix} p_{n} (t + Δ t) = p_{n} (t) (1 - λ Δ t - μ Δ t) + p_{n - 1} (t) λ Δ t + p_{n + 1} (t) μ Δ t + o (Δ t) \\ (p_{n} (t + Δ t) - p_{n} (t)) / Δ t = λ p_{n - 1} (t) + μ p_{n + 1} (t) - (λ + μ) p_{n} (t) + o (Δ t) / Δ t \end{matrix}

By $Δ t \to 0$ , there is

\begin{matrix} d p_{n} (t) / dt = λ p_{n - 1} (t) + μ p_{n + 1} (t) \\ - (λ + μ) p_{n} (t) \begin{matrix} (n = 1, 2, \dots) \end{matrix} \end{matrix}

(1)

Consider the special case $n = 0$ . The probability $p_{0} (t + Δ t)$ is divided into three independent situations over the interval $[t, t + Δ t]$ :

If the system has not been attacked at time t and there is no new attack within $[t, t + Δ t]$ , the probability is $(1 - λ Δ t) p_{0} (t)$ ;

If the system has not been attacked at time t and there is one new attack within $[t, t + Δ t]$ , the probability is $λ Δ t μ Δ t p_{0} (t)$ ;

If the system has been attacked at time t and there is no new attack within $[t, t + Δ t]$ , the probability is $(1 - λ Δ t) μ Δ t p_{1} (t)$ .

Thus, there is

d p_{0} (t) / dt = - λ p_{0} (t) + μ p_{1} (t)

(2)

Therefore, $p_{n} (t)$ should be subject to equations (1) and (2).
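As an illustration only, equations (1) and (2) can be integrated numerically. The sketch below uses forward Euler on a truncated state space; the function name, the truncation level `n_max`, the step size, and the initial condition $p_{0} (0) = 1$ are assumptions made for this example and are not part of the model in this article.

```python
def integrate_attack_model(lam, mu, n_max=30, t_end=60.0, dt=0.01):
    """Forward-Euler integration of equations (1) and (2) on states 0..n_max.

    lam and mu are the parameters of the arrival and tampering distributions.
    Returns the list [p_0(t_end), ..., p_n_max(t_end)].
    """
    p = [0.0] * (n_max + 1)
    p[0] = 1.0  # assumption: no attack has occurred at t = 0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dp = [0.0] * (n_max + 1)
        # Equation (2): boundary state n = 0
        dp[0] = -lam * p[0] + mu * p[1]
        # Equation (1): interior states n = 1, 2, ...
        for n in range(1, n_max):
            dp[n] = lam * p[n - 1] + mu * p[n + 1] - (lam + mu) * p[n]
        # Truncation (assumption): no transitions beyond state n_max
        dp[n_max] = lam * p[n_max - 1] - mu * p[n_max]
        p = [p_n + dt * dp_n for p_n, dp_n in zip(p, dp)]
    return p


if __name__ == "__main__":
    probabilities = integrate_attack_model(lam=1.0, mu=2.0)
    print(probabilities[:4])
```

The truncated update conserves total probability, so the returned values sum to 1 up to rounding. The step size must keep $(λ + μ) Δ t$ well below 1 for the Euler scheme to remain stable.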

$S_{L} (t_{1}, t_{2})$ can be calculated by the following equation

S_{L} (t_{1}, t_{2}) = \sum_{i = 1}^{n} | u_{i} . state | (0 \leq t_{1} < t_{2}, u_{i} \in U)

(3)

The model established in this article is as follows

\left\{ \begin{matrix} d p_{n} (t) / dt = λ p_{n - 1} (t) + μ p_{n + 1} (t) - (λ + μ) p_{n} (t) & (n = 1, 2, \dots) \\ d p_{0} (t) / dt = - λ p_{0} (t) + μ p_{1} (t) & \end{matrix} \right.

(4)

Min S_{L} (t_{1}, t_{2}) = \sum_{i = 1}^{n} | u_{i} . state | (0 \leq t_{1} < t_{2}, u_{i} \in U)

(5)

u_{1} \lor u_{2} \lor \dots \lor u_{n} = 1 \begin{matrix} (u_{i} \in U) \end{matrix}

(6)

P_{L} (t_{1}, t_{2}, k) = P \{ S_{L} (t_{1}, t_{2}) = k \}, (0 \leq t_{1} < t_{2}, k = 0, 1, 2, \dots)

(7)

Min R_{B} (S) = \sum_{k = 1}^{\infty} k \cdot P_{L} (t_{1}, t_{2}, k), \begin{matrix} (0 \leq t_{1} < t_{2}, k = 0, 1, 2, \dots) \end{matrix}

(8)

Equation (4) is the probabilistic equation describing the system having been attacked n times at moment t; equation (5) is the constraint equation for the extent of system damage; equation (6) is the system constraint equation requiring at least one data file; equation (7) is the probability that data files are destroyed k times within time $[t_{1}, t_{2}]$ ; equation (8) is the system target equation, which measures the average number of attacks on the system within a given interval. The smaller this value is, the more robust the system is.
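The two quantities minimized in equations (5) and (8) can be evaluated directly once the file states and the distribution $P_{L}$ are known. The following is a minimal sketch; the function names and the dictionary representation of $P_{L} (t_{1}, t_{2}, k)$ are assumptions made for illustration.

```python
def damage_extent(states):
    """Equation (3): S_L is the sum of |u_i.state| over all data files."""
    if not states:
        # Equation (6) requires at least one data file
        raise ValueError("at least one data file is required")
    if any(s not in (1, 0, -1) for s in states):
        raise ValueError("u_i.state must be 1, 0, or -1")
    return sum(abs(s) for s in states)


def risk_factor(pmf):
    """Equation (8): R_B is the sum over k of k * P_L(t1, t2, k).

    pmf is a dict mapping k to P_L(t1, t2, k) (hypothetical representation).
    """
    return sum(k * prob for k, prob in pmf.items())


if __name__ == "__main__":
    # One tampered file (1), one faked file (-1), two unchanged files (0)
    print(damage_extent([1, 0, -1, 0]))
    print(risk_factor({0: 0.5, 1: 0.25, 2: 0.25}))
```

Because tampered and faked files both contribute 1 to the sum, $S_{L}$ counts damaged files regardless of the kind of damage.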

The traditional security model of data storage is built from the perspective of centralized defense. Centralized database storage combined with traditional cryptographic safety methods, such as access control, access authentication, information encryption, and digital watermarking, can improve storage security to some extent, but it cannot avoid the security threats caused by potential vulnerabilities of the system or by malicious sabotage by staff. In addition, the traditional security model based on a centralized IoT framework has a number of significant disadvantages, such as security and trust issues, high server maintenance costs, and weak support for data traceability in IoT applications, which impede its wide adoption. The dynamic data security model described by equations (4)–(8) is built on two elements, de-trust and defense capability, and it cannot be fundamentally solved by such traditional methods. In contrast, the dynamic data storage security model we propose remains applicable in Byzantine situations, so it can address the aforementioned problems and be used to design new decentralized frameworks for IoT. Therefore, a method to optimize the dynamic data storage mechanism based on blockchain technology is proposed in this article.

Optimization of dynamic data storage mechanism

Boundary conditions for consensus

Distributed consensus is the key problem that has to be solved to build a “zero trust” dynamic data traceability mechanism based on blockchain technology. However, the conditions for consensus differ considerably between public anonymous scenarios and scenarios with rights management.³⁶ For example, financial systems such as Bitcoin adopt an economic incentive mechanism so that, in a system with highly decentralized decision-making power, each node reaches agreement on the validity of block data efficiently. This is simple and effective when nodes can join the public chain freely. However, dynamic data are usually internal industry data closely related to a specific work process, so a consortium chain that allows only approved nodes to join is more suitable for managing them. Consequently, consensus incentives designed for a monetary system do not apply to the management of dynamic data under the consortium blockchain.

Under the consortium chain, multi-party participation comes with certain prerequisites of trust and interest restraints. In this section, by analyzing the honesty motive in the decision-making of a single node during the group game, we propose that the essence of distributed node cooperation in a particular industry is to maximize the cumulative utility of each node in its interaction with the environment and in distributed computing. We then analyze the boundary conditions under which the nodes in the dynamic data traceability system reach agreement, in order to optimize the consensus mechanism.

Assume that the set of the nodes involved in the block information validation of the dynamic data traceability system is a finite set. For each node i involved in the verification of information, there is a policy space $S_{i}$ and a revenue function $U_{i}$ , that is, Von Neumann utility for each node i under the strategy space $Σ_{i} = (s_{1}, s_{2}, \dots, s_{n})$ is $U (S_{i})$ . In this article, the expected utility of nodes $U (S_{i})$ under the strategy space $S_{i}$ is used as the value function to evaluate each action.

The goal of each participating node is to maximize its own revenue. To simplify the problem, all other nodes except node i are marked as $- i$ . By analyzing the interaction between nodes i and $- i$ to reach a consensus with binding agreement, the yield matrix of nodes i and $- i$ is obtained, as shown in Table 1.

Table 1.

Yield matrix of nodes i and $- i$ .

Node i / Node $- i$	Cooperative (C)	Betray (B)
Cooperative (C)	$P_{i} CC, P_{- i} CC$	$P_{i} CB, P_{- i} CB$
Betray (B)	$P_{i} BC, P_{- i} BC$	$P_{i} BB, P_{- i} BB$

In Table 1, C indicates that a node cooperates and B indicates that it betrays. The first term in each cell is the revenue of node i under the corresponding strategy combination ( $P_{i} CC, P_{i} BC, P_{i} CB, P_{i} BB$ , respectively), and the second term is the revenue of node $- i$ ( $P_{- i} CC, P_{- i} BC, P_{- i} CB, P_{- i} BB$ , respectively).

The construction of consensus model for each node of traceability system is based on the following hypothesis:

1. For node i, the yield under various strategic combinations satisfies

P_{i} BC > P_{i} CC > P_{i} BB > P_{i} CB

(9)

Equation (9) shows that, when node behavior is inconsistent, the party adopting the betrayal strategy obtains a higher return, at the expense of the cooperating nodes, than it would if all nodes cooperated. When all nodes cooperate to reach a consensus, each benefits more than when all betray. If a single party cooperates while the rest betray, the cooperating party suffers a great loss and receives the lowest benefit.

2. Node i estimates that the probability of betrayal by node $- i$ is $λ$ ; that is, the extent to which node i trusts node $- i$ is $1 - λ$ . The system is formed by nodes belonging to the regulatory agencies, social organizations, and volunteers of the related traceability system. Among the n nodes that participate in the verification of traceable information, honest nodes are the majority (a relatively large proportion). Betrayal by node $- i$ would mean that the remaining nodes, other than node i, reach a false consensus. The above analysis shows that this possibility is very small, that is, $λ \to 0^{+}$ .

3. According to hypothesis 2, there is little chance that node $- i$ will betray. Given that node $- i$ cooperates, node i gains $P_{i} CC$ if it cooperates. If node i opportunistically adopts the non-cooperation strategy, it obtains the short-term maximum self-gain $P_{i} BC$ in Table 1, but the system will identify the betrayal of node i within time $[t, t + Δ t]$ and penalize it. The penalty cost is represented by a function $P (S_{i})$ . Therefore, the overall benefit of betrayal for node i is

U_{BC} (S_{i}) = P_{i} BC - P (S_{i})

(10)

Further analysis based on the above hypotheses leads to the following conclusion: whether node i cooperates or betrays depends on comparing the expected benefit $E [U (S_{i})]$ of cooperating with node $- i$ against the expected benefit $E [U_{BC} (S_{i})]$ of betraying. There are two cases

E [U (S_{i})] \geq E [U_{BC} (S_{i})]

(11)

E [U (S_{i})] < E [U_{BC} (S_{i})]

(12)

If equation (11) holds, node i will cooperate; if equation (12) holds, node i will betray.

$θ$ $(0 < θ < 1)$ is the discount factor of node i, used to adjust the effect of long-term gains on current benefits. As in hypothesis 2, $λ$ is the probability of betrayal by node $- i$ estimated by node i in a round of verification. When node i cooperates, the expected value of benefits can be expressed as

\begin{matrix} E [U (S_{i})] = λ (P_{i} CB + \frac{θ}{1 - θ} P_{i} BB) + λ (1 - λ) (P_{i} CC + θ P_{i} CB + \frac{θ^{2}}{1 - θ} P_{i} BB) \\ + λ {(1 - λ)}^{2} (P_{i} CC + θ P_{i} CC + θ^{2} P_{i} CB + \frac{θ^{3}}{1 - θ} P_{i} BB) + \dots \\ \approx (1 - λ) [1 + (1 - λ) θ + {(1 - λ)}^{2} θ^{2} + \dots + {(1 - λ)}^{n} θ^{n}] P_{i} CC \\ + λ [1 + (1 - λ) θ + {(1 - λ)}^{2} θ^{2} + \dots + {(1 - λ)}^{n} θ^{n}] P_{i} CB \\ + λ \frac{θ}{1 - θ} [1 + (1 - λ) θ + {(1 - λ)}^{2} θ^{2} + \dots + {(1 - λ)}^{n} θ^{n}] P_{i} BB \end{matrix}

(13)

From equation (13), summing the geometric series with ratio $(1 - λ) θ < 1$ as $n \to \infty$ gives

E [U (S_{i})] = \frac{λ}{1 - (1 - λ) θ} (\frac{1 - λ}{λ} P_{i} CC + P_{i} CB + \frac{θ}{1 - θ} P_{i} BB)

(14)

The derivation process is similar to the above. When node i betrays, the expected value of benefits can be expressed as

E [U_{BC} (S_{i})] = (1 - λ) P_{i} BC + \frac{1}{1 - θ} [(1 - λ) θ + λ] P_{i} BB

(15)

By equations (11), (14), and (15), the condition for node i to take cooperation strategy is

\begin{array}{l} \frac{λ}{1 - (1 - λ) θ} (\frac{1 - λ}{λ} P_{i} CC + P_{i} CB + \frac{θ}{1 - θ} P_{i} BB) \\ \geq (1 - λ) P_{i} BC + \frac{1}{1 - θ} [(1 - λ) θ + λ] P_{i} BB \end{array}

(16)

From equation (16)

\begin{matrix} \frac{1}{P_{i} BC - P_{i} BB} \\ [\frac{1}{1 - λ} (P_{i} BC - P_{i} CC) + \frac{λ}{{(1 - λ)}^{2}} (P_{i} BB - P_{i} CB)] < θ \end{matrix}

(17)

Equation (17) is the boundary condition of consensus reached by all nodes in the dynamic data traceability system, which is denoted as $Θ (λ)$ .
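As a numerical illustration, the threshold $Θ (λ)$ of equation (17) can be checked against a direct comparison of equations (14) and (15). The sketch below is not part of the original analysis; the function names and the payoff values in the example are assumptions chosen only to satisfy the ordering in equation (9).

```python
def expected_cooperate(lam, theta, cc, cb, bb):
    """Equation (14): expected benefit of node i when it cooperates."""
    scale = lam / (1 - (1 - lam) * theta)
    return scale * ((1 - lam) / lam * cc + cb + theta / (1 - theta) * bb)


def expected_betray(lam, theta, bc, bb):
    """Equation (15): expected benefit of node i when it betrays."""
    return (1 - lam) * bc + ((1 - lam) * theta + lam) / (1 - theta) * bb


def consensus_threshold(lam, bc, cc, bb, cb):
    """Left-hand side of equation (17), denoted Theta(lambda)."""
    return ((bc - cc) / (1 - lam) + lam * (bb - cb) / (1 - lam) ** 2) / (bc - bb)


if __name__ == "__main__":
    # Hypothetical payoffs with P_BC > P_CC > P_BB > P_CB, as in equation (9)
    bc, cc, bb, cb = 5.0, 3.0, 1.0, 0.0
    lam = 0.1
    print("threshold:", consensus_threshold(lam, bc, cc, bb, cb))
    for theta in (0.5, 0.7):
        coop = expected_cooperate(lam, theta, cc, cb, bb)
        betray = expected_betray(lam, theta, bc, bb)
        print(theta, "cooperate" if coop >= betray else "betray")
```

With these illustrative payoffs, a discount factor above the threshold makes cooperation the better strategy and a discount factor below it makes betrayal the better one, which is the behavior the boundary condition describes.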

Ownership state representation and state transition function

Similar to cryptocurrency trading, the transaction process in the instance system can be regarded, technically, as a state transition system. The system includes the “state” of all existing objects and a “state transition function.” An object here is a general concept and can be a tangible good or a digital asset in the instance system. The relevant definitions are given below.

Definition 1

The “state” of an instance system is a set of all objects that have been encoded, distributed, unsold or ownership transferred (coded and unsale products outputs (CUPO)).

Each CUPO has an amount and an owner (defined by the hash value of its cryptographic public key address). A transaction involves one or more inputs and outputs. Each input contains a reference to an existing CUPO and a cryptographic signature created by the private key corresponding to the owner’s address. Each output contains a new CUPO to be added to the state.

Definition 2

State transformation function of instance system is defined as follows

APPLY (S, TX) \to S' \text{ or } ERROR

(18)

Rules for each input definition of a transaction are defined as follows:

Rule 1. $QUOTE (CUPO) \notin S \to ERROR$ ;

Rule 2. $SIGN (UAPO) \neq SIGN_{owner} (CUPO) \to ERROR$ ;

Rule 3. $\forall INPUT (CUPO) < OUTPUT (CUPO) \to ERROR$ ;

Rule 4. $\forall OUTPUT (CUPO) - INPUT (CUPO) \to S'$ ;

Rule 5. $\forall CREAT (CUPO) = 0$ , $OUTPUT (CUPO) \geq 0$ $\to ERROR$ .

Rule 1 prevents sender of the transaction from selling or transferring objects that do not exist; Rule 2 prevents sender of the transaction from selling or transferring other people’s objects; Rules 3, 4, 5 ensure conservation of the value.
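A minimal sketch of the state transition function in equation (18) is given below. It is an illustration under stated assumptions rather than the implementation used in this article: the dictionary layout of the state and of a transaction is hypothetical, the signature check of Rule 2 is replaced by a plain comparison of owner names, duplicate inputs are rejected as an extra safeguard, and the creation of new objects (Rule 5) is not modeled.

```python
class TransactionError(Exception):
    """Stands for the ERROR outcome of equation (18)."""


def apply_transaction(state, tx):
    """Return the new state S' or raise TransactionError.

    state: dict mapping a CUPO id to an (owner, amount) tuple
    tx: dict with keys "signer", "inputs" (list of CUPO ids) and
        "outputs" (list of (new_id, owner, amount) tuples)
    """
    inputs = tx["inputs"]
    if len(set(inputs)) != len(inputs):
        # Assumption: the same CUPO may not be referenced twice
        raise TransactionError("duplicate input")
    total_in = 0
    for cupo_id in inputs:
        # Rule 1: a referenced CUPO must exist in the current state
        if cupo_id not in state:
            raise TransactionError(f"unknown CUPO {cupo_id}")
        owner, amount = state[cupo_id]
        # Rule 2: stand-in for verifying the owner's signature
        if tx["signer"] != owner:
            raise TransactionError(f"signer does not own {cupo_id}")
        total_in += amount
    total_out = sum(amount for _, _, amount in tx["outputs"])
    # Rule 3: the inputs must cover the outputs
    if total_in < total_out:
        raise TransactionError("outputs exceed inputs")
    # Rule 4: remove the spent inputs and add the new outputs
    new_state = {k: v for k, v in state.items() if k not in inputs}
    for new_id, owner, amount in tx["outputs"]:
        new_state[new_id] = (owner, amount)
    return new_state


if __name__ == "__main__":
    state = {"c1": ("alice", 5)}
    tx = {"signer": "alice", "inputs": ["c1"], "outputs": [("c2", "bob", 5)]}
    print(apply_transaction(state, tx))
```

The function returns a new dictionary and leaves the input state untouched, so a rejected transaction cannot leave the state partially updated.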

Consensus mechanism based on verification nodes list

One of the core advantages of blockchain technology is the use of an incentive mechanism in decentralized systems so that nodes reach consensus on the validity of block data effectively. However, this mechanism is clearly inadequate for a dynamic data storage system.

In this article, we propose a consensus mechanism based on a verification nodes list (VNL). By authorizing some of the trusted nodes according to the boundary conditions in this article (see section “Boundary conditions for consensus”), the agencies assign a VNL and allocate different credibility values to its nodes. Each node maintains its credibility by serving others. When a node exhibits long-term denial of service or selfish behavior, its credibility is lowered. When the credibility of a node falls below a certain threshold, the node is removed from the list. When more than 1/3 of the nodes in the VNL have been removed, the agencies must authorize a new list.
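The credibility bookkeeping described above can be sketched as follows. The class name, the integer credibility scores, the step size, and the threshold value are all assumptions made for illustration; the article does not specify how credibility is quantified.

```python
class VerificationNodeList:
    """Hypothetical sketch of VNL credibility maintenance."""

    def __init__(self, credibility, threshold):
        # credibility: dict mapping node id to an integer score
        self.credibility = dict(credibility)
        self.initial_size = len(self.credibility)
        self.threshold = threshold
        self.removed = 0

    def record(self, node, served, step=1):
        """Raise the score after service, lower it after denial or selfishness."""
        if node not in self.credibility:
            return
        self.credibility[node] += step if served else -step
        # A node whose credibility falls below the threshold leaves the list
        if self.credibility[node] < self.threshold:
            del self.credibility[node]
            self.removed += 1

    def needs_reauthorization(self):
        """True once more than 1/3 of the original nodes have been removed."""
        return 3 * self.removed > self.initial_size


if __name__ == "__main__":
    vnl = VerificationNodeList({"a": 6, "b": 6, "c": 6}, threshold=5)
    for _ in range(2):
        vnl.record("a", served=False)
    print(sorted(vnl.credibility), vnl.needs_reauthorization())
```

Integer scores are used here so that the threshold comparison is exact; any monotone credibility measure would serve the same purpose.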

In the system, the full-node servers are responsible for maintaining the VNL. The verification nodes consider only the verification results of members of the VNL when completing block generation. This consensus mechanism greatly improves the efficiency of system consensus while ensuring security. Moreover, because the verification nodes are authority nodes, once a node betrays, the system can readily verify its identity and hold it accountable.

From the analysis of the consistency between the local behavior of consensus terminals maximizing their own benefits and the overall goal of ensuring system security and effectiveness, it can be concluded that when all terminals have blocks to submit for verification, none of them will change its verification results for other blocks in order to maximize its own revenue.

Each node participating in the verification collects all valid operations that were not recorded before the consensus and publishes them as its “candidate set.” Each verification node then merges the candidate sets of all other verification nodes in the VNL and verifies and votes on all the operations. Valid operations on dynamic data fall into two cases: the release of new data, and the flow of dynamic data between different entities. Both operations must be carried out through the authority nodes, and each is treated as a transaction. New data can be released without an input but must have an output. The node that holds the private key corresponding to the public key of the output address is the authorized node that can operate on the data at that address. The transfer of dynamic data between different entities requires both an input and an output. The input must be an unused output of a transaction in the system, and it must be signed with the private key corresponding to the previous output address to verify that the current node is an authorized node. The trust foundation is established by having the top management of the industry pre-issue the root CA certificate and by building a complete certificate trust chain from the root CA through the middle-tier CAs to the lowest-level entity CA.

The mathematical form of the block consensus process is described below. In the dynamic data storage system, $A = \{A_{1}, A_{2}, \dots, A_{n}\}$ is the set of terminals in the system. $G_{i} = \{S_{i 1}, \dots, S_{in} : u_{i}\}$ represents the verification combination, together with its benefit $u_{i}$ , for the block packaged and submitted by terminal $A_{i}$ . In each verification combination $(S_{i 1}, \dots, S_{in})$ of a block packaged by terminal $A_{i}$ , the result of terminal $A_{j}$ verifying the block submitted by $A_{i}$ is denoted $S_{ij}$ and satisfies

S_{ij} = \begin{cases} 1, & \text{after verification by } A_{j} \text{, the block submitted by } A_{i} \text{ is legal} \\ - 1, & \text{after verification by } A_{j} \text{, the block submitted by } A_{i} \text{ is not legal} \end{cases}

(19)

A_{j} \overset{S_{ij} = 1}{\to} A_{i} : u_{i} = u_{i} + 1

(20)

A_{j} \overset{S_{ij} = - 1}{\to} A_{i} : u_{i} = u_{i} - 1

(21)

Thus, a round of block consensus proceeds by the following steps:

Step 1. While the current round is not yet over, select the block of $A_{i}$ as the best block of this round if $A_{i} \in A$ is the first terminal for which $u_{i} (S_{i 1}, \dots, S_{ij}, \dots, S_{in}) = n$ . That is, select the earliest fully verified block of the system.

Step 2. When the current round is over, select the block of $A_{i}$ as the best block of this round if $\exists A_{i}$ such that, $\forall A_{j} \in A$ with $j \neq i$ , $n > u_{i} (S_{i 1}, \dots, S_{ij}, \dots, S_{in}) > u_{j} (S_{j 1}, \dots, S_{ji}, \dots, S_{jn})$ . That is, select the block with the maximum benefit as verified by the terminals of the VNL.

Step 3. When the current round is over, if $\exists A_{i}$ , $A_{j}$ such that, $\forall A_{k} \in A$ with $k \neq i, j$ , $n > u_{i} (S_{i 1}, \dots, S_{i j}, \dots, S_{i n}) = u_{j} (S_{j 1}, \dots, S_{j i}, \dots, S_{j n}) > u_{k} (S_{k 1}, \dots, S_{kj}, \dots, S_{kn})$ , select as the best block whichever of the blocks of $A_{i}$ and $A_{j}$ reached its current benefit value earlier. That is, select the earliest block with the maximum benefit as verified by the terminals of the VNL.
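The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the instance system: votes are modeled as hypothetical tuples `(time, verifier, submitter, s_ij)`, the names `Tally` and `select_best_block` are invented here, and "reaching the current value earlier" is interpreted as the time of the last change of $u_{i}$.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    u: int = 0               # running benefit u_i of the block submitted by A_i
    reached_at: float = 0.0  # time at which u_i last changed (assumed tie-breaker)


def select_best_block(votes, n, round_end):
    """Pick the best block of one round.

    votes: iterable of (time, verifier, submitter, s_ij) with s_ij in {+1, -1}.
    n: number of terminals whose votes are counted.
    round_end: preset end time of the round.
    """
    tallies = {}
    for time, _verifier, submitter, s_ij in sorted(votes, key=lambda v: v[0]):
        if time > round_end:
            break
        tally = tallies.setdefault(submitter, Tally())
        tally.u += s_ij           # equations (20) and (21)
        tally.reached_at = time
        if tally.u == n:          # Step 1: first block approved by all n terminals
            return submitter
    if not tallies:
        return None
    # Step 2: the highest benefit wins; Step 3: ties go to the earliest block.
    return min(tallies, key=lambda i: (-tallies[i].u, tallies[i].reached_at))
```

Sorting on the pair (negated benefit, time) handles Steps 2 and 3 in one pass; a deployment would additionally have to restrict the counted votes to VNL members.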

Dynamic data storage architecture

The dynamic data storage system uses a multilevel access control model that supports the dynamic transfer of data between adjacent entities. Ownership transfer of dynamic data is described in section “Ownership state representation and state transition function.”

The key distribution mechanism generates a key pair $(P K_{i}, S K_{i})$ for each entity in the instance system, which is used for communication between adjacent levels. The system allows only adjacent entities to communicate with each other. An authorized node is required to include its own entity certificate when publishing a new dynamic data file or operating on an existing one. Dynamic data files and related information (including the addresses of the issuer and the recipient, the hash value of the dynamic data file, etc.) that have been verified and have reached consensus are stored in the blockchain.

After each round of consensus, the verification nodes hash all transactions that meet the requirements in groups and store the hash values in a Merkle tree, which facilitates fast summarization of blocks and integrity checking. The block generation mechanism of the blockchain is used to generate the data block, and blocks are linked into a chain using the hash values of block headers. After receiving a dynamic data file, the receiver calculates its hash value locally and compares it with the corresponding data in the blockchain using the “simplified payment verification” protocol of the Merkle tree. If the values are inconsistent, the file has been changed and an alarm is raised to the monitoring center. The process of data delivery between adjacent entities is shown in Figure 1.

Figure 1.

Traceability information flows between adjacent entities.
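The Merkle-tree integrity check described above can be illustrated with a short Python sketch. It is a minimal example under assumptions that the article does not specify: SHA-256 is used as the hash function, an odd-sized level duplicates its last node (as Bitcoin does), the leaf list is non-empty, and the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Root hash of a non-empty list of byte strings."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed padding rule for odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes needed to recompute the root from leaves[index]."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Receiver-side check: only the proof path is needed, not the whole block."""
    acc = h(leaf)
    for sibling, is_left in proof:
        acc = h(sibling + acc) if is_left else h(acc + sibling)
    return acc == root
```

A receiver holding only the root and a proof can confirm that a file hash belongs to a block; any change to the file makes `verify` return `False`, which corresponds to the alarm condition in the text.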

For any two adjacent entities i and $i + 1$ , assume that entity i is the sender and entity $i + 1$ is the receiver. The messages to be sent are denoted $R_{i}$ , $S_{i}$ , $M_{i}$ , and the corresponding messages received by the receiver are denoted $R'_{i i}$ , $S'_{i}$ , $M'_{i}$ . Signing the whole dynamic data has two drawbacks. On one hand, storing digital signatures of whole messages requires a large amount of space; on the other hand, processing a whole message with asymmetric cryptography is computationally expensive and slow. Therefore, communication between entities at adjacent levels in this article adopts a two-pass hash iteration method. The sender’s public key and the message $S_{i}$ are both inputs to the hash function, which yields a hash-based message authentication code (HMAC). It can be used as an eigenvalue of the message and is calculated as follows

HMAC (PK, S_{i}) = H (PK \oplus opad | H (PK \oplus ipad | S_{i}))

(22)

Here, PK is the public key of the sender, $S_{i}$ is the message to be sent, H is the hash function, opad and ipad are two different pre-specified padding strings, ⊕ denotes bitwise XOR, and ∣ denotes concatenation.
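Equation (22) can be written out directly and compared with the HMAC routine of the Python standard library. The sketch below assumes SHA-256 as H and the standard HMAC padding constants (0x36 for ipad, 0x5C for opad, 64-byte blocks); the article does not fix these choices, and the function name `hmac_eq22` is hypothetical.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_eq22(pk: bytes, s_i: bytes) -> bytes:
    """H((PK xor opad) | H((PK xor ipad) | S_i)) with SHA-256 as H."""
    # Keys longer than one block are hashed first, then zero-padded to block size.
    key = hashlib.sha256(pk).digest() if len(pk) > BLOCK_SIZE else pk
    key = key.ljust(BLOCK_SIZE, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + s_i).digest()   # first hash pass
    return hashlib.sha256(opad + inner).digest()  # second hash pass


pk = b"example-public-key-bytes"      # placeholder key material
s_i = b"traceability message body"    # placeholder message
assert hmac_eq22(pk, s_i) == hmac.new(pk, s_i, hashlib.sha256).digest()
```

The result is a fixed 32-byte value regardless of the message length, which is why signing it is much cheaper than signing the whole message.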

The sender’s private key is then used to sign the message authentication code obtained from equation (22). Because this value is small, the signing operation is fast. The entity passes the signed message authentication code, together with the message body, to its neighboring entity. The steps by which dynamic data traceability information flows between adjacent entities are as follows:

Step 1. Entity i signs the dynamic data traceability information with its private key to prove that it is the authorized node;

Step 2. Entity i calculates the hash value $H_{i}$ of $S_{i}$ and $P K_{i}$ to reduce the size of the information to be signed;

Step 3. Entity i signs $H_{i}$ with its private key $S K_{i}$ and gets $M_{i}$ ;

Step 4. Entity i encrypts $M_{i}$ and $S_{i}$ with the public key of entity $i + 1$ and sends the result;

Step 5. Entity $i + 1$ decrypts the received encrypted information $R'_{i i}$ with its own private key $S K_{i + 1}$ and gets $S'_{i}$ , $M'_{i}$ ;

Step 6. Entity $i + 1$ verifies the signature $M'_{i}$ with the public key of entity i and gets $H'_{i}$ ;

Step 7. Entity $i + 1$ calculates the hash value of $S'_{i}$ and $P K_{i}$ and gets $H_{i}^{″}$ ;

Step 8. Entity $i + 1$ checks whether $H'_{i}$ and $H_{i}^{″}$ are equal. If they are, it accepts the data; otherwise, it turns to exception handling.
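The signing and checking steps (Steps 2, 3, 6, 7, and 8) can be traced with a toy example. The sketch below is an assumption-laden illustration only: it uses unpadded textbook RSA built from two small known Mersenne primes, which is not secure and is not the signature scheme of the instance system, and it omits the encryption and decryption of Steps 4 and 5. All names (`digest`, `sender`, `receiver`) are hypothetical.

```python
import hashlib
import hmac

# Toy textbook-RSA key pair for entity i (illustration only, NOT secure).
P, Q = 2**127 - 1, 2**107 - 1           # two known Mersenne primes
N = P * Q
E = 65537                                # public exponent, part of PK_i
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent, SK_i (Python 3.8+)
PK_BYTES = f"{N}:{E}".encode()           # assumed byte encoding of PK_i


def digest(pk: bytes, s: bytes) -> int:
    """HMAC(PK_i, S_i) as in equation (22), reduced to an integer below N."""
    return int.from_bytes(hmac.new(pk, s, hashlib.sha256).digest(), "big") % N


def sender(s_i: bytes):
    h_i = digest(PK_BYTES, s_i)          # Step 2: H_i
    m_i = pow(h_i, D, N)                 # Step 3: M_i, signature with SK_i
    return s_i, m_i                      # Step 4 would encrypt both for entity i+1


def receiver(s_recv: bytes, m_recv: int) -> bool:
    h_prime = pow(m_recv, E, N)          # Step 6: H'_i recovered with PK_i
    h_double = digest(PK_BYTES, s_recv)  # Step 7: H''_i recomputed locally
    return h_prime == h_double           # Step 8: accept only if equal
```

For example, `receiver(*sender(b"record"))` returns `True`, while altering the message body after signing makes the comparison in Step 8 fail.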

The formation process of a block in the instance system is shown in Figure 2. The account of each node is the hash value of its public key, and verification information is signed with the node’s own private key. The creation of a new transaction TX is defined in section “Ownership state representation and state transition function,” and the transaction is broadcast through the P2P network. Each node in the blockchain constantly listens to the network and collects the transactions that have not yet been included in the blockchain. Candidate blocks are then generated; each node validates the blocks it receives and checks whether they contain invalid transactions, such as transactions with incorrect signatures or duplicates. The verification results for each block are again broadcast through the P2P network, and the system finally selects the current consensus block as the newborn block in accordance with the consensus mechanism put forward in section “Consensus mechanism based on verification nodes list,” and appends it to the ledger by linking it to the header of the previous block. The new ledger is now the longest blockchain of the system and is broadcast to the whole network by the account node. On receiving it, the other nodes in the network compare it with their local blockchain; if the new chain is longer, the local blockchain is replaced with it.

Figure 2.

The formation of dynamic data blocks.

Traceability mechanism: After a user releases a transaction and it gains consensus through verification by the VNL, the system stores the summary information of the released transaction on the blockchain. This summary information includes the addresses of both publisher A and receiver B, the hash value of the file, the timestamp, and so on, and it can be queried by release order number. After receiving the transaction file, receiver B decrypts the file with its own private key and compares its hash value with the data on the blockchain retrieved by the release order number. If they are inconsistent, the file has been tampered with and the system alerts the monitoring center.

Assuming that the genesis block exists and the newborn block B is not empty, the block validity verification algorithm is shown in Algorithm 1.

Algorithm 1. Block validity verification algorithm.

Inputs: blockchain $C$ , newborn block B
Outputs: If blockchain $C$ or newborn block B does not exist, return error; else if B is correct, add B to blockchain $C$ , then return $C$ ; if B is incorrect, return B.

Steps:
Function validate_block $(C, B)$
    $B \leftarrow V (x_{c})$
    If $C \land (B \neq ε)$ then
        do {
            $〈 Num, Type, Code, Len, S 〉 \leftarrow (TX_{i} \in M)$
            $i++$
        } While $(ValidTX (Type_{i}, Code_{i}, Len_{i}) \land S_{i} = 1 \land (i < Max (B)))$
        If $(Φ^{Γ} (n, m) = 1) \land (i = Max (B)) \land (ρ == 1)$ then
            $C \leftarrow B \| h (tail (C))$
        Else
            $B \leftarrow False$
            Return(B)
        End if
        Return( $C$ )
    Else
        Return(Error)
    End if
End function

Function $V (x_{c})$ holds the current transactions and packages them into a block. If the newborn block B is not empty and the genesis block exists, the transactions in the block are validated using the method defined by the five-tuple $〈 Num, Type, Code, Len, S 〉$ . On the premise that honest nodes form the majority of the blockchain, if all transactions in block B are verified and gain consensus during the round, B is appended as the newborn block to the end of the chain $C$ and the newly generated chain is returned as the current longest chain; otherwise, block B is marked as false and returned.
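The control flow of Algorithm 1 can be mirrored in Python as follows. This is a hedged sketch rather than the system’s code: the structural check in `valid_tx` is a hypothetical stand-in for ValidTX, the consensus predicates $Φ^{Γ} (n, m)$ and $ρ$ are collapsed into a single boolean argument because they are not defined in this section, and an exception replaces the error return.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Tx:
    num: int      # Num
    type: str     # Type
    code: str     # Code
    length: int   # Len
    s: int        # S: verification result, 1 means accepted


@dataclass
class Block:
    txs: list
    prev_hash: str = ""


def valid_tx(tx: Tx) -> bool:
    # Hypothetical check: known type and declared length matches the payload.
    return tx.type in {"release", "transfer"} and tx.length == len(tx.code)


def block_hash(block: Block) -> str:
    payload = block.prev_hash + "|".join(f"{t.num}:{t.type}:{t.code}" for t in block.txs)
    return hashlib.sha256(payload.encode()).hexdigest()


def validate_block(chain: list, block: Block, consensus_reached: bool):
    """Return the extended chain if the block is valid, otherwise the rejected block."""
    if not chain or block is None or not block.txs:
        raise ValueError("blockchain or newborn block does not exist")
    all_valid = all(valid_tx(t) and t.s == 1 for t in block.txs)
    if all_valid and consensus_reached:
        block.prev_hash = block_hash(chain[-1])  # link to the tail of the chain
        chain.append(block)
        return chain
    return block
```

Returning either the chain or the rejected block follows the output convention of Algorithm 1; production code would more likely return a status value alongside the chain.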

Performance analysis and verification

We propose a dynamic blockchain consensus mechanism based on the VNL. In the system, the VNL nodes are assigned by agencies, and the system provides database services that cannot be tampered with and can be restored at any time. Dynamic data blocks are distributed and stored in each active node of the system. Together, these nodes make up the dynamic data storage system and its robust distributed database. Damage to the data in any node can be detected through the “Simplified Transaction Verification Protocol,” that is, by accessing only part of the hash nodes in the database. At the same time, since all other healthy nodes store the complete database, the normal operation of the entire database is not affected if the dynamic data blocks in any single node are destroyed. The performance analysis and verification are described in detail below.

Quality and growth characteristics analysis

The experimental system environment in this article is denoted Z. Assuming that the number of dishonest terminals (A) among the n participants (P) is t, the system state is denoted ${STATE}_{Γ, Θ (λ), Z}^{t, n}$ . Suppose n participants $P_{1}, \dots, P_{n}$ have executed protocol $Γ$ in environment Z; the concatenation of the running states ${{STATE}_{Γ, Θ (λ), Z}^{P_{i}, t, n}}, (i = 1, \dots, n)$ of the individual participants is denoted ${STATE}_{Γ, Θ (λ), Z}^{t, n}$ . Owing to the uncertainty of the communication model, participants cannot know the total number of participants who are executing protocol $Γ$ at the same time. However, participants must be verified before entering the system, so the number of participants is relatively fixed during the execution of the protocol. Assume that the honest participants in the system are the majority and satisfy the conditions

\begin{matrix} t / (n - t) \leq 1 - δ, (0 < δ < 1) \\ t \leq (1 - δ) (n - t) \end{matrix}

(23)

Next, by quantifying over all possible dishonest participants A and polynomially bounded environments Z, the quality and growth characteristics of the dynamic data blockchain in the stochastic state model ${STATE}_{Γ, Θ (λ), Z}^{t, n}$ are analyzed under the system consensus boundary condition $Θ (λ)$ .

Property 1

Quality characteristics of dynamic data blockchain. In the stochastic state model ${STATE}_{Γ, Θ (λ), Z}^{t, n}$ , assume that the kth block B in the chain $C$ is generated by an honest node in the kth round. If another dynamic data blockchain $C'$ exists at other nodes in the system, then the kth block in $C'$ is either B or generated by a dishonest node.

Proof

Suppose, to the contrary, that the kth block in $C'$ is a block $B'$ generated by an honest node, and that $B'$ differs from B, as shown in Figure 3. Because block B is the kth block in the dynamic data blockchain $C$ and is generated by an honest node in the kth round, and because the consensus mechanism proposed in section “Consensus mechanism based on verification nodes list” produces only one consensus block in each round, $B'$ cannot have been generated in the kth round. Suppose $B'$ is generated in the rth round, and take the common prefix $C^{k - 1}$ of the dynamic data blockchains $C'$ and $C$ . When the $MIN (k, r) th$ block is broadcast, all honest nodes in the instance system receive it and extend their local dynamic data blockchain to length k; when the $MAX (k, r) th$ block is then broadcast, all honest nodes in the instance system receive it and extend their local dynamic data blockchain to length $k + 1$ . Hence B and $B'$ cannot both be the kth block, which contradicts the hypothesis. The proof is complete.

Figure 3.

Quality characteristics analysis of dynamic data blockchain.

Property 2

Growth characteristics of dynamic data blockchain. Define the random variable $X_{i}$ as follows: $X_{i} = 1$ if a consensus block of dynamic data is generated by an honest node in the ith round, and $X_{i} = 0$ otherwise. In the stochastic state model ${STATE}_{Γ, Θ (λ), Z}^{t, n}$ , assuming that the length of the dynamic data blockchain received by an honest node in the rth round is l, the length of the dynamic data blockchain received by each honest node in the sth round $(s \geq r)$ is at least $l + \sum_{i = r}^{s - 1} X_{i}$ , and any $k \geq 2 η kf$ consecutive blocks in the chain are generated in at least $η k$ consecutive rounds.

Proof

1. It is known that $s - r \geq 0$ . When $s = r$ , assume that the length of the dynamic data blockchain recorded by an honest node in the rth round is l. By the consensus mechanism of the dynamic data blockchain, the node broadcasts the blockchain before the end of the rth round; thus, each honest node receives a dynamic data blockchain of length l in the rth round.

Induction is used to prove that the length of the dynamic data blockchain received by each honest node is at least $l + \sum_{i = r}^{s - 1} X_{i}$ when $s > r$ . Assume that the length of the dynamic data blockchain recorded by an honest node in the $(s - 1) th$ round is $l' = l + \sum_{i = r}^{s - 2} X_{i}$ . When $X_{s - 1} = 0$ , the proposition clearly holds. When $X_{s - 1} = 1$ , the length of the dynamic data blockchain received by each honest node in the $(s - 1) th$ round is at least $l'$ , and the length of the dynamic data blockchain broadcast by each honest node before the end of the $(s - 1) th$ round is at least $l' + 1$ , where $l' + 1 = l + \sum_{i = r}^{s - 1} X_{i}$ .

2. The probability that a valid dynamic data block is generated by at least one honest node in a round is

f = 1 - {(1 - p)}^{q (n - t)} \geq \frac{pq (n - t)}{1 + pq (n - t)}

(24)

Here, p is the probability that a new block is verified by other nodes after being broadcast. Let S be a set of at least $η k$ consecutive rounds; then

(1 - ε) f | S | < X (S) < (1 + ε) f | S |

(25)

Z (S) < (1 + ε) \cdot \frac{A}{n - A} \cdot \frac{f}{1 - f} \cdot | S | \leq (1 + ε) (1 - δ) \cdot \frac{f | S |}{1 - f}

(26)

(1 + ε) (1 - δ) < (1 - ε) {(1 - f)}^{2}, f + ε \leq \frac{δ}{2}

(27)

It can be derived from equations (25)–(27)

Z (S) < (1 + \frac{δ}{2}) \cdot \frac{A}{n - A} \cdot X (S) < (1 - \frac{δ}{2}) X (S)

(28)

X (S) + Z (S) < (1 + ε) f | S | (1 + \frac{1 - δ}{1 - f}) < 2 η kf \leq k

(29)

That is, any $k \geq 2 η kf$ consecutive blocks in the chain are generated in at least $η k$ consecutive rounds. Proof is completed.
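The lower bound in equation (24) and the honest-majority condition (23) can be checked numerically. The sketch below is only a sanity check under assumptions: q is not defined in this section and is treated here as the number of attempts per honest node per round, and the parameter values in the loop are illustrative rather than taken from the experiment.

```python
def f_exact(p, q, n, t):
    """Left-hand side of equation (24): 1 - (1 - p)^(q (n - t))."""
    return 1.0 - (1.0 - p) ** (q * (n - t))


def f_lower_bound(p, q, n, t):
    """Right-hand side of equation (24): pq(n - t) / (1 + pq(n - t))."""
    x = p * q * (n - t)
    return x / (1.0 + x)


def honest_majority(t, n, delta):
    """Condition (23): t <= (1 - delta) (n - t)."""
    return t <= (1.0 - delta) * (n - t)


# Illustrative parameter grid (assumed values).
for p in (1e-4, 1e-3, 1e-2):
    for n, t in ((100, 30), (500, 100), (1000, 400)):
        print(p, n, t, f_exact(p, 1, n, t), f_lower_bound(p, 1, n, t))
```

The bound holds because $(1 - p)^{m} \leq e^{- pm} \leq 1 / (1 + pm)$ for $m = q (n - t)$ , so the printed exact value is never below the printed bound.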

Properties 1 and 2 show that, in the protocol designed in this article under the approved access mode of the consortium chain, as long as the proportion of honest nodes is high enough, the uniqueness and growth characteristics of the dynamic data blockchain are ensured when network transmission delay is ignored, which in turn guarantees the consistency and growth of the dynamic data at each participating terminal. The transmission delay in the actual communication between nodes in the instance system is analyzed below.

Assume that the communication delay between nodes within the same instance system is small enough to be neglected, so that, to simplify the problem, only the communication delay between different agencies is considered here. For example, suppose the block $B_{i}$ packed by node $M_{i}$ obtains consensus at time t; $M_{i}$ links $B_{i}$ to the blockchain $C$ and broadcasts the newly formed blockchain $C B_{i}$ . Another node $M_{j}$ generates another new block $B_{j}$ at time $t' \in [t, t + Δ t)$ . Because of transmission delay, $M_{j}$ receives blockchain $C B_{i}$ from $M_{i}$ only at time $t + Δ t$ . At time $t'$ , $M_{j}$ therefore regards its local blockchain $C$ as the longest chain, links the new consensus block $B_{j}$ that it packaged to $C$ , and broadcasts the newly formed blockchain $C B_{j}$ to the other nodes of the system. Thus, two blockchains $C B_{i}$ and $C B_{j}$ of the same length temporarily coexist in the system; this is resolved by the arrival of the next new block $B_{new}$ . The node that generates the new consensus block $B_{new}$ chooses the chain to extend according to whether the longest chain in its local records is $C B_{i}$ or $C B_{j}$ .

Assuming that the new chain finally formed is $C B_{i} B_{new}$ (the other case is symmetric), the longest chain received by the other nodes is then $C B_{i} B_{new}$ , and the block $B_{j}$ becomes an orphan block. The system should repackage the transactions in $B_{j}$ that are not in the existing blockchain into a new block and attach it to the blockchain.
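The fork scenario above can be replayed with a small Python sketch. It is a simplified model under assumptions: a block is represented as a tuple of transaction identifiers, chains are plain lists, and the helper names are hypothetical.

```python
def resolve_fork(local_chain, received_chain):
    """Longest-chain rule: adopt the received chain only if it is strictly longer."""
    return received_chain if len(received_chain) > len(local_chain) else local_chain


def orphaned_transactions(old_chain, new_chain):
    """Transactions of the abandoned chain that the adopted chain does not contain."""
    recorded = {tx for block in new_chain for tx in block}
    pending = []
    for block in old_chain:
        for tx in block:
            if tx not in recorded and tx not in pending:
                pending.append(tx)
    return pending


C = [("genesis",)]
CB_i = C + [("tx1", "tx2")]          # chain broadcast by M_i
CB_j = C + [("tx2", "tx3")]          # chain built by M_j before CB_i arrived
CB_i_new = CB_i + [("tx4",)]         # B_new extends CB_i

adopted = resolve_fork(CB_j, CB_i_new)                # M_j switches to the longer chain
repackaged = orphaned_transactions(CB_j, adopted)     # transactions only in orphan B_j
final_chain = adopted + [tuple(repackaged)] if repackaged else adopted
```

Here only `tx3` is repackaged, because `tx2` is already recorded in $B_{i}$ ; two chains of equal length leave the local chain unchanged until a longer one arrives.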

A formal analysis of the above process follows. Assume that the rate at which an instance system generates new blocks is $r_{1}$ and the rate at which the rest of the system generates new blocks is $r_{2}$ ( $r_{2} > r_{1}$ ). Transmission delay within the same agency is negligible, so only the transmission delay between agencies needs to be considered. Suppose that, starting from the initial state, the number of blocks generated inside the agency of the instance system is k and the number of blocks generated by the rest of the system is l; the new state is denoted by $(k, l)$ , and the transition rates are

q ((k, l), (k + 1, l)) = r_{1}, k \geq 0, l \geq 0

(30)

q ((k, l), (k, l + 1)) = r_{2}, k \geq 0, l \geq 0

(31)

q ((k, l), (k', l')) = 0, otherwise

(32)

The relationship between the new state $(k, l)$ and the initial state $(0, 0)$ is expressed in the following equation

π (0, 0) (r_{1} + r_{2}) = \sum_{k = 0}^{\infty} \sum_{l = 0}^{\infty} π (k, l)

(33)

If $k \neq l$ , then

π (k, l) (r_{1} + r_{2}) = π (k - 1, l) r_{1} I (k > 0) + π (k, l - 1) r_{2} I (l > 0)

(34)

By equations (33) and (34)

π (k, l) = π (0, 0) r_{1}^{k} r_{2}^{l} \cdot \sum_{i = 0}^{\min (k, l)} \frac{(| k - l | + i) 2^{i} \binom{k + l - i}{k}}{(k + l - i) {(r_{1} + r_{2})}^{i} {(r_{1} + r_{2})}^{k + l - i}}

(35)

Assuming $r_{1} = 0.2 (r_{1} + r_{2})$ , that is, the agency holds 20% of the total computing power of the system, the state distribution probabilities of the two-tuple $(k, l)$ , $k, l = 0, 1, 2$ , are shown in Table 2.

Table 2.

Status distribution probability of $(k, l)$ .

$k \backslash l$	$l = 0$	$l = 1$
$k = 0$	0.981	0.016
$k = 1$	0.001	0.004
$k = 2$	0.000	0.000

As shown in Table 2, the probability of consensus between the nodes within the same agency and the rest of the instance system is 98.1%. The probability that a new block generated within the agency is not received by the rest of the instance system is 0.1%, and the probability that a new block generated by the rest of the system is not received by the nodes within the agency is 1.6%. The probability that new blocks are generated within the agency and in the rest of the system at the same time because of network transmission delay is 0.4%, and the probability of the other cases is less than 10⁻³. The analysis shows that the dynamic data blockchain of the instance system is unlikely to produce multiple chains whose lengths differ by more than 1 because of communication propagation delays; as described earlier in this section, forks of this kind are resolved by the arrival of new blocks.

Deployment and experiment

We conducted a public experiment to evaluate the performance of the mechanism proposed in this article. A total of 1219 volunteers recruited from the Internet participated in the open study on the ChainSQL platform for 1 month. The platform is part of our long-term research on blockchain-based IoT security.

We revised part of the consensus mechanism of ChainSQL. We kept the transaction set generation consensus module of ChainSQL unchanged and revised its block generation module and VNL management module according to the description in section “Consensus mechanism based on verification nodes list” in this article:

We replaced the threshold-increasing, multi-round voting block consensus mechanism of ChainSQL with the time-benefit–based block consensus mechanism proposed in this article, which improved the speed of block consensus.

We replaced the static VNL management of ChainSQL with the dynamic management method based on verification node reputation values proposed in this article, which enhanced the credibility of the verification list and the security of the system.

The improved consensus mechanism is already available on the ChainSQL platform. In addition, the company has signed a cooperative R&D (research and development) agreement with our research team and is ready to work with us to improve the ChainSQL platform further.

Figure 4 shows the quality and growth characteristics of traceability data on the ChainSQL platform, where the total number of operational entities is n, the number of nodes in the verification nodes list is $n_{VNL}$ , and the consensus time r takes different values.

Figure 4.

Quality and growth characteristics of traceability data under ChainSQL platform: (a) $n = 500, n_{VNL} = 100$ , (b) $n = 1000, n_{VNL} = 100$ , (c) $n = 500, n_{VNL} = 200$ , and (d) $n = 1000, n_{VNL} = 200$ .

In Figure 4(a), $n = 500, n_{VNL} = 100$ . The data size at $r = 5 s$ is greater than that at $r = 10 s$ , but less than twice the data size at $r = 10 s$ , which indicates that, compared with $r = 5 s$ , some rounds at $r = 10 s$ are completed in step 1 of the consensus mechanism proposed in section “Consensus mechanism based on verification nodes list.” Similarly, compared with $r = 5 s$ or $r = 10 s$ , the data size at $r = 1 s$ increases significantly, but the increase is smaller than the corresponding multiple.

In Figure 4(b), $n = 1000, n_{VNL} = 100$ . Compared with Figure 4(a), the number of active nodes is doubled, but the data size drops for all three values of r, and the drop is greater at $r = 5 s$ and $r = 10 s$ than at $r = 1 s$ . This indicates that, as the scale of the system increases, the speed of reaching consensus slows down, and the rounds completed in step 1 of the consensus mechanism are affected more than the rounds completed in step 2 or 3.

In Figure 4(c), $n = 500, n_{VNL} = 200$ . Compared with Figure 4(a), the number of nodes in the VNL authorized by agencies is doubled. The consensus speed slows down for all three values of r; thus, the data size drops correspondingly. At $r = 1 s$ , the impact is relatively weak, indicating that the round time is so short that most rounds are completed in step 2 or step 3 of the consensus mechanism, which sets an upper limit on the time to reach consensus.

In Figure 4(d), $n = 1000, n_{VNL} = 200$ . Compared with Figure 4(b), the number of nodes in the VNL authorized by agencies is doubled. The consensus speed slows down for all three values of r; thus, the data size drops correspondingly. Compared with Figure 4(c), the number of active nodes is doubled, but the increase in data size is not remarkable. This indicates that an increase in the number of verification nodes leads to an increase in consensus time, so that most of the rounds are completed in step 2 or step 3 of the consensus mechanism.

The comparison of Figure 4(a)–(d) shows that, as the numbers of active nodes and of VNL nodes authorized by the agencies increase, the speed of system consensus is robust to both parameters and is mainly affected by the preset time of each round.

Conclusion

In this article, a consensus mechanism based on the VNL is proposed to store and manage dynamic data under the consortium chain. In each round, the block that receives the most votes in the least time is selected and linked to the chain. All dynamic data and operations are permanently logged in the blockchain for authorized access. The quality and growth characteristics of the dynamic data blockchain under this mechanism have been proved, and the influence of transmission delay between agencies on the formation of the blockchain has been analyzed. It is concluded that the dynamic data blockchain is unlikely to generate multiple chains whose lengths differ by more than 1 because of transmission delay. When the number of nodes in the verification list is large enough, a block generated by an attacker has only a minimal chance of being elected as the best block of the round, and only if the attacker successfully compromises all terminals and is fast enough. Even though this chance exists in theory, the probability of it actually happening is very small.

Footnotes

Handling Editor: Ximeng Liu

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by National Key R&D Project of China (no. 2016YFB0800203), Program for Innovative Research Team (in Science and Technology) in University of Henan Province (no. 17IRTSTHN009).

ORCID iD

Rui Qiao

References

1. You, Wang XP. Survey of cyber security research in power system. Power Syst Prot Control 2011; 39(10): 140–147.

2. Lee, Palekar, Qualls. Supply chain efficiency and security: coordination for collaborative investment in technology. Eur J Oper Res 2011; 210(3): 568–578.

3. Ning QY. Research on global Internet of Things’ developments and it’s construction in China. Acta Electron Sin 2010; 11: 2590–2599.

4. Hou, Bai. Research on the shortage of Internet of Things standards—take the development of key technical standards as an example. Sci Technol Prog Policy 2015; 12: 61–66.

5. Wen, Liang. Suggestions on effective application of continuous internal audit. Enterp Econ 2010; 4: 155–157.

6. Ullah, Edwards, Ramdhany, et al. Data exfiltration: a review of external attack vectors and countermeasures. J Netw Comput Appl 2017; 101: 18–54.

7. Wagner, Rasin, Glavic, et al. Carving database storage to detect and trace security breaches. Digit Invest 2017; 22: 127–136.

8. Khanduja. Database watermarking, a technological protective measure: perspective, security analysis and future directions. J Inf Secur Appl 2017; 37: 38–49.

9. Trivedi, Zavarsky, Butakov. Enhancing relational database security by metadata segregation. Procedia Comput Sci 2016; 94: 453–458.

10. El-Hajj, Brahim, Hajj, et al. Security by construction in web applications development via database annotations. Comput Secur 2016; 59: 151–165.

11. Feng, Zhang, et al. Study on cloud computing security. J Softw 2011; 22(1): 71–83.

12. Wei, Zhu, Cao, et al. Security and privacy for storage and computation in cloud computing. Inform Sciences 2014; 258: 371–386.

13. Alfian, Rhee, Ahn, et al. Integration of RFID, wireless sensor networks, and data mining in an e-pedigree food traceability system. J Food Eng 2017; 212(11): 65–75.

14. Gandino, Montrucchio, Rebaudengo. A security protocol for RFID traceability. Int J Commun Syst 2017; 30(6): e3109.

15. Nissim, Yahalom, Elovici. USB-based attacks. Comput Secur 2017; 70: 675–688.

16. Dawle, Naik, Vande, et al. Database security using intrusion detection system. Int J Sci Eng Res 2017; 8(2): 30–34.

17. Yang, Chen. Blockchain principle, design and application. Beijing, China: China Machine Press, 2017, pp.9–19.

18. Choi, Kim, Lee, et al. A fully integrated CMOS security-enhanced passive RFID tag. ETRI J 2014; 36(1): 141–150.

19. Zhang, Wang, Liu, et al. Survey on cloud computing security. J Softw 2016; 27(6): 1328–1348.

20. Yang, Shen, et al. Enabling public auditing for shared data in cloud storage supporting identity privacy and traceability. J Syst Software 2016; 113: 130–139.

21. Top threats to cloud computing V1.0, http://wenku.baidu.com/view/db3506ea81c758f5f61f67e5.html

22. Liu TT. Research on key technologies of data security towards cloud computing. PhD Dissertation, PLA Information Engineering University, Zhengzhou, China, 2013, pp.35–57.

23. Chen, Liang, et al. Cloud data storage security and privacy protection policies under IoT environment. Comput Sci 2012; 5: 62–65+90.

24. Yuan, Wang FY. Blockchain: the state of the art and future trends. Acta Autom Sin 2016; 42(4): 481–494.

25. Nakamoto. Bitcoin: a peer-to-peer electronic cash system, 2009, https://bitcoin.org/bitcoin.pdf

26. Fan, Shu JW. Research on the technologies of Byzantine system. J Softw 2013; 24(6): 1346–1360.

27. Pass, Seeman, Shelat. Analysis of the blockchain protocol in asynchronous networks. In: Annual international conference on the theory and applications of cryptographic techniques, Paris, 30 April–4 May 2017, pp.643–673. Berlin: Springer.

28. Garay, Kiayias, Leonardos. The Bitcoin backbone protocol: analysis and applications. In: Annual international conference on the theory and applications of cryptographic techniques, Sofia, 26–30 April 2015, pp.281–310. Berlin: Springer.

29. Eyal, Gencer, Sirer, et al. Bitcoin-NG: a scalable blockchain protocol. NSDI 2016, pp.45–59, https://www.usenix.org/system/files/conference/nsdi16/nsdi16-paper-eyal.pdf

30. Nakasumi. Information sharing for supply chain management based on blockchain technology. In: IEEE 19th conference on business informatics (CBI), 24–27 July 2017, Thessaloniki, pp.140–149. New York: IEEE.

31. Pinzón, Rocha. Double-spend attack models with time advantange for Bitcoin. Electron Notes Theor Comput Sci 2016; 329: 79–103.

32. Sikorski, Haughton, Kraft. Blockchain technology in the chemical industry: machine-to-machine electricity market. Appl Energ 2017; 195: 234–246.

33. Wood. Ethereum: a secure decentralised generalised transaction ledger. Ethereum Project Yellow Paper, 2014, https://gavwood.com/paper.pdf

34. R3, http://www.r3cev.com/ (accessed 10 December 2017).

35. Hyperledger, https://www.hyperledger.org/ (accessed 10 December 2017).

36. Gramoli. From blockchain consensus back to Byzantine consensus. Future Gener Comp Sy. Epub ahead of print 21 September 2017. DOI: 10.1016/j.future.2017.09.023.