Abstract
The Internet of Things is widely used in fields such as industry, medical care, education, and supply chains. With the participation of multiple authorized entities, large volumes of dynamic data are generated over time. Operations on these data must be secure and traceable so that they can support forensics and decision-making. The key to protecting dynamic data is therefore to reject tampering by unauthorized users and to keep evidence of, and trace, every operation on the data. To address this problem, this article proposes an optimized dynamic data traceability mechanism based on a consortium blockchain. First, a mathematical model of dynamic data storage security is established, followed by an analysis of the motive for honest behavior in individual node decision-making in a group game and of the essence of distributed node cooperation in a specific industry setting. The ownership transition function and the architecture of the dynamic data storage system are then optimized, and the quality and growth characteristics of the system under a stochastic state model are analyzed. The results show that the solution can effectively resist attacks such as tampering and forgery under the approved-access mode. The mechanism has practical value while ensuring the security of dynamic data storage.
Introduction
The Internet of Things (IoT) connects an extraordinarily large number of devices to the Internet. In fields such as industry, medical care, education, and supply chains, new data are generated continuously over time with the participation of multiple authorized entities; such data are called dynamic data in this article. Handling the massive amount of data generated by these devices in an efficient, secure, and economical way is an essential research subject. Dynamic data and the operations on them require stringent security and traceability to support forensics and decision-making.1,2 Dynamic data share the following characteristics: (1) Continuity. As time goes on, new dynamic data are generated continuously by multiple entities. (2) Time sensitivity. Dynamic data are sensitive to their generation time and application time, for example, when data generated during a certain period are used to predict future trends. (3) Multidimensionality. Depending on the application, dynamic data have further dimensions besides time, for example, address and trading volume in a supply chain system, or operation history and permission settings in an industrial control system.3,4 (4) Availability. Dynamic data should be available and should meet the management requirements of users, especially enterprise users, 5 such as analyzing and checking log information and investigating illegal operations. Traceability is an important prerequisite for the integrity and reliability of dynamic data, and an important manifestation of their availability. It is therefore important to ensure the traceability of dynamic data.
The traceability of dynamic data covers both the data themselves and the historical operations on them. The prerequisite for the availability of dynamic data is their integrity and reliability, namely, that the data are not tampered with or falsified during storage and transfer. However, preventing the tampering and forging of dynamic data is becoming increasingly challenging for two main reasons. First, over the last few years, cyber-crime has shifted from individual acts to organized acts. This shift has given attackers the budget, resources, and sophistication to become more professional in data tampering and forging. 6 Second, the existing data infrastructure was originally designed for the legitimate storage of dynamic data and carries potential security risks. For example, dynamic data are usually stored in a central database combined with safety measures such as access authentication, access control, information encryption, and digital watermarking.7–10 They are also stored using cloud-based data storage technology, which abstracts all kinds of data resources into resource pools11,12 that can be provided to users in a more flexible and convenient way. However, these traditional methods have a variety of problems, such as vulnerability to attack because value is concentrated in one place and high algorithmic complexity.
To improve the security of dynamic data storage, the system must be strengthened in two respects. On the one hand, the correctness of the dynamic data must be verifiable, and the data must not be tampered with or forged. On the other hand, the system should make the operation history of dynamic data traceable and provide the capability of data recovery. In view of these problems, this article proposes recording the whole life cycle of dynamic data on a blockchain to achieve secure and reliable storage and to ensure the integrity, reliability, traceability, and permanence of dynamic data.
In this article, we analyze the consistency between the local behavior of consensus terminals, which maximize their own benefits, and the overall goal of ensuring system security and effectiveness. A consensus mechanism suited to dynamic data storage is proposed accordingly. Compared with proof of work (PoW), the proposed consensus mechanism greatly reduces the waste of power and computing resources. With the key distribution mechanism, dynamic data stored at different levels can be transmitted and verified, and system security is further improved by secondary hash iteration in the communication between adjacent levels together with asymmetric encryption operations.
The major work of this article is as follows:
(1) By analyzing the motive for honest behavior in individual node decision-making in a group game and the essence of distributed node cooperation in a specific industry setting, a consensus mechanism applicable to dynamic data storage is proposed, and the security of dynamic data is guaranteed through consensus based on blockchain technology.
(2) A hierarchical traceability mechanism is put forward for dynamic data that are transferred dynamically among multiple entities. The communication channel is built on the key mechanism, achieving secure end-to-end encrypted data transmission. Tampering with or forging dynamic data is effectively prevented by providing a timing relationship while using the forward and reverse asymmetric encryption operations.
(3) A universal distributed dynamic data traceability mechanism for IoT is put forward based on the blockchain consensus mechanism and the information transmission mechanism proposed in (1) and (2), providing an effective means of monitoring and protecting dynamic data.
The rest of this article is organized as follows. First, current traceability technologies and the concepts of blockchain are introduced, followed by an analysis of the applicability of consortium blockchain to the problem of IoT dynamic data storage security. Second, general methods and processes for securing dynamic data operations are abstracted, and the dynamic data storage model is proposed. After that, the boundary conditions for consensus in IoT are analyzed based on game theory, a consortium blockchain consensus mechanism based on a verification nodes list is proposed, and the dynamic data storage architecture based on this consensus mechanism is put forward. Finally, theoretical analysis and deployment experiments demonstrate the effectiveness of the solution in resisting common attacks and the traceability of dynamic data operations.
Related work
Traceability methods
Traceability is a key activity in many industries. Many modern traceability systems are based on radio frequency identification (RFID) technology. Alfian et al. 13 proposed an e-pedigree food traceability system that uses RFID technology to track and trace product location and a wireless sensor network to collect temperature and humidity during storage and transportation. Security solutions for RFID systems have characteristics and requirements specific to the application. Several security approaches, in various application domains, are based on a central database with large computational resources. 14 A centralized database is vulnerable to attackers because value is concentrated in it, so the security and authenticity of its data cannot be guaranteed. Nissim et al. 15 reviewed 29 different USB-based attacks and classified them into four major categories. They put forward a method to identify the associated vulnerable USB peripherals and hardware for each attack, but the method does not prevent the attack from occurring. Dawle et al. 16 proposed a database intrusion detection mechanism that enhances database security by logging all the activities of an intruder who uses SQL injection on a website. The administrator can view the details and stop attackers from injecting malicious code into the database and from stealing, destroying, or modifying its contents. However, in the view of Yang and Chen, 17 a single-party trust mechanism cannot prevent malicious tampering with or falsification of dynamic data by staff with advanced access rights. Moreover, the above algorithms are usually very complex, and the volume of generated dynamic data, together with its growth over time as data accumulate, will be extremely high.
In addition, the processing and storage performance of the intelligent processing terminals or field sampling devices used for collecting, coding, and storing dynamic data is usually limited; hence, high-complexity security algorithms are not suitable for solving the above problem. 18
Cloud storage services have been widely adopted by diverse organizations, and users can conveniently share data with others through them. Zhang et al. 19 pointed out that cloud data are open to multiple authorized users and that sufficient evidence of data whereabouts and of the operation histories of subjects at all levels can hardly be recorded and provided. Therefore, cloud storage cannot meet the requirement of recording the entire process of dynamic data in some special fields (such as industrial control systems and traceability systems), and it is difficult to determine responsibility under cloud storage. To solve the problem that uncontrolled malicious modifications may wreck the usability of shared data, Yang et al. 20 proposed an efficient public auditing solution that preserves identity privacy and identity traceability for group members simultaneously. However, on a cloud platform, users cannot build trust with cloud service providers, nor can they confirm through the web front-end interface that providers fulfill their service agreements. 21 To keep sensitive information from being stolen, tampered with, or forged, the system would need an absolutely reliable cloud platform service provider, which is unrealistic.
As can be seen from the above, RFID is suitable only for tracking tangible assets, not dynamic data or other intangible assets. Centralized storage, including cloud-based storage, can store dynamic data but cannot guarantee their security. Such traditional centralized databases have inherent defects in resisting malicious users (including internal personnel with advanced permissions) who tamper with or forge dynamic data.22,23 Blockchain can provide tamper resistance and other security guarantees without a third party. By analyzing blockchain application scenarios for dynamic data shared among multiple agencies, this article puts forward a method that limits the consensus and visibility of blocks to within the consortium blockchain, providing a new way to solve the storage and traceability problems of dynamic data simultaneously.
Blockchain
Blockchain technology is a new “distributed accounting system.” 24 Since the groundbreaking paper “Bitcoin: A peer-to-peer electronic cash system” was published by S Nakamoto 25 in 2008, blockchain has emerged as a new trust mechanism. It combines simple technologies such as hash chains, artificial work delays (proof of work), and incentive mechanisms to sidestep problems that traditional academic approaches had struggled with. Through decentralization and de-trust, it largely solves the problem of cooperation and mutual trust among multiple parties. “Universal participation” and “shared writing” of data allow the participants to collectively maintain a reliable database and to register and transfer value point to point.26,27 As an underlying security technology, blockchain has attracted attention from cryptography and other domains. Researchers have conducted extensive research on blockchain technology, including analysis of the protocol28,29 and applications of blockchain technology in special fields.30–32
Generally, according to the restriction on participating nodes, blockchain can be roughly divided into three types, namely, public blockchain, consortium blockchain, and private blockchain.
Consortium blockchain can set its openness to the public according to the application scenario. The network is maintained by the member agencies, and nodes access it through the gateways of the member agencies. Therefore, consortium blockchain is well suited to the storage, management, authorization, monitoring, and auditing of dynamic data by multiple agencies in a specific industry setting.
Dynamic data storage security model
To address the security problem of dynamic data storage in the transaction process of the instance system, the following assumptions are made:
The attack source is unlimited, and each attacker comes separately and independently.
The arrival of the attackers reaches the Poisson distribution of the parameters
Tamper of dynamic data caused by each attack is under the negative exponential distribution of the parameter
Intervals and results of each attack are independent.
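The assumptions above describe a compound Poisson process: attackers arrive as a Poisson stream, and each attack causes an independent, exponentially distributed amount of tampering. The following minimal sketch simulates that process. The parameter names `lam` (arrival rate) and `mu` (rate of the exponential tampering amount) and all numeric values are illustrative assumptions, since the original parameter symbols are not recoverable here; the sketch is not the article's own model code.

```python
import random


def simulate_tampering(lam, mu, horizon, rng):
    """Simulate one path of cumulative tampering over [0, horizon].

    lam: assumed attacker arrival rate (Poisson process parameter).
    mu:  assumed rate of the exponential tampering amount per attack.
    Returns (number_of_attacks, total_tampering).
    """
    t = 0.0
    attacks = 0
    total = 0.0
    while True:
        # Inter-arrival times of a Poisson process are exponential with rate lam.
        t += rng.expovariate(lam)
        if t > horizon:
            break
        attacks += 1
        # Each attack tampers an independent exponential amount (mean 1/mu).
        total += rng.expovariate(mu)
    return attacks, total


def mean_over_runs(lam, mu, horizon, runs, seed=0):
    """Average attack count and tampering amount over independent runs."""
    rng = random.Random(seed)
    results = [simulate_tampering(lam, mu, horizon, rng) for _ in range(runs)]
    mean_attacks = sum(a for a, _ in results) / runs
    mean_damage = sum(d for _, d in results) / runs
    return mean_attacks, mean_damage


if __name__ == "__main__":
    attacks, damage = mean_over_runs(lam=2.0, mu=4.0, horizon=10.0, runs=3000)
    print(f"mean attacks: {attacks:.2f}, mean tampering: {damage:.2f}")
```

Under these assumptions the expected number of attacks over the horizon is `lam * horizon` and the expected total tampering is `lam * horizon / mu`, which gives a quick sanity check on the simulated averages.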
Set
According to the assumption, when
By
Considering the special case, when
If system has not been attacked at time
If system has not been attacked at time
If system has been attacked at time
Thus, there is
Therefore,
The model established in this article is as follows
Equation (4) is the probabilistic equation which shows the system has been attacked by
Traditional security models of data storage take the perspective of centralized defense. Centralized database storage combined with traditional cryptographic safety methods, such as access control, access authentication, information encryption, and digital watermarking, can improve storage security to some extent, but it cannot remove the security threat posed by potential system vulnerabilities or by malicious sabotage from staff. In addition, a traditional security model based on a centralized IoT framework has a number of significant disadvantages, such as security and trust issues, high server maintenance costs, and weak support for data traceability in IoT applications, which impede its wide adoption. The dynamic data security model described by equations (4)–(8) is built on two elements, de-trust and defense capability, and the problem it describes cannot be fundamentally solved by such traditional methods. In contrast, the dynamic data storage security model we propose applies to Byzantine situations, so it can address the aforementioned problems and be used to design new decentralized frameworks for IoT. Accordingly, this article proposes a method to optimize the dynamic data storage mechanism based on blockchain technology.
Optimization of dynamic data storage mechanism
Boundary conditions for consensus
Distributed consensus is the key problem to be solved in building a “zero trust” dynamic data traceability mechanism based on blockchain technology. However, the conditions for consensus differ considerably between public anonymous scenarios and scenarios with rights management. 36 For example, financial systems such as Bitcoin adopt an economic incentive mechanism, in a system with highly decentralized decision-making power, so that nodes can efficiently reach agreement on the validity of block data. This is simple and effective when nodes can join a public chain freely. However, dynamic data are usually internal industry data closely related to a specific work process, so a consortium chain that admits only approved nodes is more suitable for managing them. Consensus incentives designed for a monetary system therefore do not apply to the management of dynamic data under a consortium blockchain.
Under a consortium chain, multi-party participation comes with certain prerequisites of trust and constraints of interest. In this section, by analyzing the honesty motive behind a single node’s decisions during the group game, we propose that the essence of distributed node cooperation in a particular industry is that each node maximizes its cumulative utility through interaction with the environment and distributed computing. We then analyze the boundary conditions under which the nodes in the dynamic data traceability system reach agreement, in order to optimize the consensus mechanism.
Assume that the set of the nodes involved in the block information validation of the dynamic data traceability system is a finite set. For each node
The goal of each participating node is to maximize its own revenue. To simplify the problem, all other nodes except node
Yield matrix of nodes
In Table 1,
The construction of consensus model for each node of traceability system is based on the following hypothesis:
1. For node
The above equation shows that when node behavior is inconsistent, the party adopting the betrayal strategy obtains a higher return, at the expense of the cooperating nodes, than it would if all nodes cooperated. If all nodes cooperate to reach consensus, each benefits more than if all betray. If a single node cooperates while the rest betray, the cooperating node suffers a great loss and receives the lowest benefit.
2. Node
3. According to hypothesis 2, there is little chance that the node
Further analysis based on the above hypothesis leads to the following conclusions: the strategy of cooperation or betrayal by node
In these equations, if equation (11) is established, node
From equation (13)
The derivation process is similar to the above. When node
By equations (11), (14), and (15), the condition for node
From equation (16)
Equation (17) is the boundary condition of consensus reached by all nodes in the dynamic data traceability system, which is denoted as
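The payoff ordering stated in hypothesis 1 (betraying cooperators pays most, mutual cooperation pays more than mutual betrayal, and cooperating alone pays least) matches the classic prisoner's dilemma. The minimal sketch below checks that ordering and compares cumulative payoffs over repeated rounds. The names `t`, `r`, `p`, `s`, the numeric payoffs, and the punishment assumption (one betrayal is followed by mutual betrayal in all later rounds) are illustrative textbook choices; the sketch is not a reconstruction of equation (17).

```python
def is_dilemma_ordering(t, r, p, s):
    """True if payoffs satisfy t > r > p > s.

    t: payoff for betraying while the others cooperate (assumed name)
    r: payoff when everyone cooperates
    p: payoff when everyone betrays
    s: payoff for cooperating while the others betray
    """
    return t > r > p > s


def cumulative_payoffs(t, r, p, rounds):
    """Cumulative payoff of always cooperating vs. betraying in round one.

    Assumes a betrayal is answered by mutual betrayal in every later round.
    """
    cooperate = r * rounds
    betray = t + p * (rounds - 1)
    return cooperate, betray


def min_rounds_for_cooperation(t, r, p):
    """Smallest number of rounds for which cooperating strictly beats betraying."""
    if not r > p:
        raise ValueError("cooperation can never pay off unless r > p")
    rounds = 1
    while r * rounds <= t + p * (rounds - 1):
        rounds += 1
    return rounds


if __name__ == "__main__":
    t, r, p, s = 5, 3, 1, 0
    print(is_dilemma_ordering(t, r, p, s))
    print(cumulative_payoffs(t, r, p, rounds=3))
    print(min_rounds_for_cooperation(t, r, p))
```

The point of the comparison is that betrayal dominates in a single round, while a node that values its cumulative utility over enough rounds does better by cooperating, which is the intuition behind deriving a boundary condition for consensus.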
Ownership state representation and state transition function
Similar to cryptocurrency trading, the transaction process in the instance system can be regarded, technically, as a state transition system. The system consists of the “state” of all existing objects and a “state transition function.” An object here is a general concept and can be a tangible good or a digital asset in the instance system. The relevant definitions are given below.
Definition 1
The “state” of an instance system is the set of all objects that have been encoded and distributed but not yet sold or had their ownership transferred (coded and unsold product outputs, CUPO).
Each CUPO has an amount and an owner (identified by the hash value of the owner’s cryptographic public key address). A transaction involves one or more inputs and outputs. Each input contains a reference to an existing CUPO and a cryptographic signature created by the private key corresponding to the owner’s address. Each output contains a new CUPO to be added to the state.
Definition 2
State transformation function of instance system is defined as follows
Rules for each input definition of a transaction are defined as follows:
Rule 1 prevents the sender of a transaction from selling or transferring objects that do not exist; Rule 2 prevents the sender from selling or transferring other people’s objects; Rules 3, 4, and 5 ensure conservation of value.
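The checks summarized in these rules can be sketched as a small state transition function over a set of CUPOs. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, a plain `signer` string stands in for real signature verification, and the value-conservation rules are collapsed into a single comparison of input and output totals because their exact form is not recoverable here. A release of new data is modeled as a transaction with no inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cupo:
    cupo_id: str
    owner: str   # stands in for the hash of the owner's public key
    amount: int


@dataclass(frozen=True)
class TxInput:
    cupo_id: str
    signer: str  # stands in for a verified signature over the input


class TransactionError(Exception):
    pass


def apply_transaction(state, inputs, outputs):
    """Return the new state (dict of cupo_id -> Cupo) or raise TransactionError.

    The given state is never modified, so a rejected transaction has no effect.
    """
    if not outputs:
        raise TransactionError("a transaction must have at least one output")
    spent = set()
    total_in = 0
    for tx_in in inputs:
        cupo = state.get(tx_in.cupo_id)
        if cupo is None or tx_in.cupo_id in spent:
            # Rule 1: the referenced object must exist and be used only once.
            raise TransactionError(f"unknown or reused CUPO {tx_in.cupo_id}")
        if cupo.owner != tx_in.signer:
            # Rule 2: only the owner may sell or transfer the object.
            raise TransactionError(f"{tx_in.signer} does not own {tx_in.cupo_id}")
        spent.add(tx_in.cupo_id)
        total_in += cupo.amount
    if any(out.amount <= 0 for out in outputs):
        raise TransactionError("output amounts must be positive")
    total_out = sum(out.amount for out in outputs)
    if inputs and total_out > total_in:
        # Conservation of value: outputs may not exceed the referenced inputs.
        raise TransactionError("outputs exceed inputs")
    new_state = {cid: c for cid, c in state.items() if cid not in spent}
    for out in outputs:
        if out.cupo_id in new_state:
            raise TransactionError(f"duplicate CUPO id {out.cupo_id}")
        new_state[out.cupo_id] = out
    return new_state


if __name__ == "__main__":
    state = apply_transaction({}, [], [Cupo("c1", "alice", 10)])           # release
    state = apply_transaction(state, [TxInput("c1", "alice")],
                              [Cupo("c2", "bob", 10)])                     # transfer
    print(sorted(state))
```

Returning a fresh dictionary instead of mutating the old one keeps every intermediate state available, which fits the traceability goal of being able to replay the history of an object.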
Consensus mechanism based on verification nodes list
One of the core advantages of blockchain technology is to use incentive mechanism in decentralized systems to reach consensus among nodes on the validity of block data effectively. However, the application of this mechanism in dynamic data storage system is obviously inadequate.
In this article, we propose a consensus mechanism based on a verification nodes list (VNL). Agencies authorize some of the trusted nodes according to the boundary conditions derived in this article (see section “Boundary conditions for consensus”), assign them to the VNL, and allocate a credibility value to each node. Each node maintains its credibility by serving others. When a node exhibits long-term denial of service or selfish behavior, its credibility is lowered. When the credibility of a node falls below a certain threshold, the node is removed from the list. When more than 1/3 of the nodes in the VNL have been removed, the agencies must authorize a new list.
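The credibility bookkeeping described here can be sketched as a small class. This is an illustrative sketch only: the class name, the reward and penalty sizes, and the threshold value are assumptions, and the article does not specify how credibility is updated numerically.

```python
class VerificationNodeList:
    """Toy VNL that tracks credibility and drops nodes below a threshold."""

    def __init__(self, credibility, threshold):
        self.credibility = dict(credibility)      # node id -> credibility value
        self.threshold = threshold
        self.authorized_size = len(self.credibility)
        self.removed = set()

    def record_service(self, node_id, reward=1.0):
        """A node keeps or raises its credibility by serving others."""
        if node_id in self.credibility:
            self.credibility[node_id] += reward

    def record_misbehavior(self, node_id, penalty=1.0):
        """Lower credibility after denial of service or selfish behavior."""
        if node_id not in self.credibility:
            return
        self.credibility[node_id] -= penalty
        if self.credibility[node_id] < self.threshold:
            del self.credibility[node_id]
            self.removed.add(node_id)

    def members(self):
        return sorted(self.credibility)

    def needs_reauthorization(self):
        """True once more than 1/3 of the authorized nodes have been removed."""
        # Integer comparison avoids floating-point rounding around 1/3.
        return 3 * len(self.removed) > self.authorized_size

    def reauthorize(self, credibility):
        """Agencies issue a fresh list with new credibility values."""
        self.credibility = dict(credibility)
        self.authorized_size = len(self.credibility)
        self.removed = set()


if __name__ == "__main__":
    vnl = VerificationNodeList({f"n{i}": 5.0 for i in range(6)}, threshold=3.0)
    for node in ("n0", "n1", "n2"):
        vnl.record_misbehavior(node, penalty=3.0)
    print(vnl.members(), vnl.needs_reauthorization())
```

With six authorized nodes, removing two (exactly one third) does not yet trigger reauthorization, while removing a third node does, which mirrors the "more than 1/3" wording.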
In the system, the full-node servers are responsible for maintaining the VNL. The verification nodes consider only the verification results of VNL members when completing block generation. This consensus mechanism greatly improves the efficiency of system consensus while preserving security. Meanwhile, because the verification nodes are authority nodes, once a node betrays, the system can readily establish its identity and hold it accountable.
From the analysis of the consistency between the local behavior of consensus terminals, which maximize their own benefits, and the overall goal of ensuring system security and effectiveness, it can be concluded that when all terminals have blocks to be submitted for verification, no terminal will change its verification results for other blocks in order to maximize its own revenue.
Before consensus, each node participating in the verification collects all valid operations that have not yet been recorded and publishes them as its “candidate set.” Each participating node then merges the candidate sets of all other verification nodes in the VNL, and verifies and votes on all the operations. Valid operations on dynamic data fall into two cases: the release of new data, and the flow of dynamic data between different entities. Both kinds of operation must be carried out through the authority nodes and are treated as transactions. New data can be released without an input but must have an output. The node holding the private key that corresponds to the public key of the output address is the authorized node that can validly operate on the data at that address. The transfer of dynamic data between different entities requires both an input and an output. The input must be an unused output of a transaction in the system, and it must be signed by the private key corresponding to the previous output address to verify that the current node is an authorized node. The trust foundation is established by having the top management of the industry pre-issue the root CA certificate and by building a complete certificate trust chain from the root CA through the middle-tier CAs to the lowest-level entity CAs.
The mathematical form of the block consensus process is described below. In dynamic data storage system,
Thus, there are several steps in a certain round of block consensus:
Dynamic data storage architecture
The dynamic data storage system uses a multilevel access control model that supports the dynamic transfer of data between adjacent entities. The ownership transfer of dynamic data follows the description in section “Ownership state representation and state transition function.”
The key distribution mechanism generates the pair key
After each round of consensus, the verification nodes hash, in groups, all transactions that meet the requirements and store the hash values in a Merkle tree, which allows blocks to be summarized rapidly and their integrity checked. The block generation mechanism of the blockchain is used to generate the data block, and blocks are linked into a chain by the hash values of their block headers. After receiving a dynamic data file, the receiver calculates its hash value locally and compares it with the corresponding data in the blockchain using the “simplified payment verification” protocol of the Merkle tree. If the values are inconsistent, the file has been changed, and an alarm should be sent to the monitoring center. The process of data delivery between adjacent entities is shown in Figure 1.

Figure 1. Traceability information flows between adjacent entities.
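The Merkle tree storage and the simplified verification of a received file can be illustrated with a short sketch. The choices here are assumptions for illustration: SHA-256 as the hash function, a single hash per node, and duplication of the last node on odd-sized levels (a common convention); the article does not state which variant its system uses.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Pair up nodes and hash each pair; duplicate the last node if odd."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root hash over a non-empty list of byte strings (e.g. transactions)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf `index` up to the root.

    Each entry is (sibling_hash, sibling_is_left).
    """
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1                      # neighbour within the pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    """Recompute the root from one leaf and its proof; no other leaves needed."""
    current = sha256(leaf)
    for sibling, sibling_is_left in proof:
        current = sha256(sibling + current) if sibling_is_left else sha256(current + sibling)
    return current == root


if __name__ == "__main__":
    txs = [b"tx-a", b"tx-b", b"tx-c"]
    root = merkle_root(txs)
    proof = merkle_proof(txs, 2)
    print(verify_proof(b"tx-c", proof, root), verify_proof(b"tampered", proof, root))
```

The receiver only needs the file, the short proof, and the root stored in the block header, so a mismatch can be detected without downloading every transaction in the block.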
For any two entities
Thereinto,
The sender’s private key is used to sign the message authentication code obtained by equation (22). Because the data are small, the operation is fast. The entity passes the message body and the message authentication code signed by the sender to its neighboring entities. The steps by which dynamic data traceability information flows between adjacent entities are as follows:
The formation process of a block in the instance system is shown in Figure 2. The account of each node is the hash value of its public key, and the verification information is signed with the node’s own private key. Creation process of new transaction

Figure 2. The formation of dynamic data blocks.
Traceability mechanism: After a user releases a transaction and it gains consensus through verification by the VNL, the system stores the summary information of the released transaction on the blockchain. The summary information includes the addresses of both publisher A and receiver B, the hash value of the file, the timestamp, and so on, and it can be queried by release order number. After receiving the transaction file, receiver B decrypts the file with its own private key and compares the hash value with the data on the blockchain according to the release order number. If they are inconsistent, the file has been tampered with and the system alerts the monitoring center.
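The lookup-and-compare step of the traceability mechanism can be sketched with an in-memory stand-in for the on-chain summary records. The class name, field names, and the use of SHA-256 are assumptions for illustration; consensus and the decryption with the receiver's private key are deliberately left out.

```python
import hashlib
import time


class SummaryLedger:
    """In-memory stand-in for the summary information stored on the blockchain."""

    def __init__(self):
        self._records = []

    def publish(self, sender, receiver, file_bytes, timestamp=None):
        """Append a summary record and return its release order number."""
        record = {
            "sender": sender,
            "receiver": receiver,
            "file_hash": hashlib.sha256(file_bytes).hexdigest(),
            "timestamp": time.time() if timestamp is None else timestamp,
        }
        self._records.append(record)
        return len(self._records) - 1

    def query(self, order_no):
        """Look up a summary record by its release order number."""
        return self._records[order_no]


def check_received_file(ledger, order_no, file_bytes):
    """True if the received (already decrypted) file matches the ledger hash."""
    expected = ledger.query(order_no)["file_hash"]
    return hashlib.sha256(file_bytes).hexdigest() == expected


if __name__ == "__main__":
    ledger = SummaryLedger()
    order_no = ledger.publish("A", "B", b"sensor batch 17")
    if not check_received_file(ledger, order_no, b"sensor batch 17 (edited)"):
        print("hash mismatch: alert the monitoring center")
```

Because only the hash and metadata are recorded, the ledger stays small even when the transferred files are large.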
Assuming that the genesis block exists and the newborn block
Function
Performance analysis and verification
We propose a dynamic blockchain consensus mechanism based on the VNL. In the system, agencies assign the VNL nodes, and the system provides database services that cannot be tampered with and can be restored at any time. Dynamic data blocks are distributed and stored on every active node of the system. Together these nodes make up the dynamic data storage system and its robust distributed database. The destruction of data on any node can be detected through the “Simplified Transaction Verification Protocol,” that is, by accessing only part of the hash nodes in the database. At the same time, since all other healthy nodes store the complete database, the normal operation of the entire database is not affected if the dynamic data blocks on any single node are destroyed. The performance analysis and verification are described in detail below.
Quality and growth characteristics analysis
The experimental system environment in this article is recorded as
Next, by quantifying all possible dishonest participants A and the polygon-bounded environment
Property 1
Quality characteristics of dynamic data blockchain. In the stochastic state model
Proof
Suppose that the

Figure 3. Quality characteristics analysis of dynamic data blockchain.
Property 2
Growth characteristics of dynamic data blockchain. Define random variable
Proof
1. It is known that
The following inductive method is used to prove that length of dynamic data blockchain each honest node received is at least
2. The probability that a valid dynamic data block is generated by at least one honest node in a round is
It can be derived from equations (25)–(27)
That is, any
From Property 1 and Property 2, in the protocol designed in this article under the approved-access mode of the consortium chain, as long as the proportion of honest nodes is high enough, the uniqueness and growth characteristics of the dynamic data blockchain are ensured when network transmission delay is ignored, so that dynamic data remain consistent and continue to grow at every participating terminal. The transmission delay in the actual communication between nodes in the instance system is analyzed below.
Assuming that the communication delay of each node within the same instance system is small enough, in order to simplify the problem, only communication delay between different agencies is considered here. For example, the block
Assuming that the new chain finally formed is
Following is a formal analysis of the above process. Assuming that the rate at which an instance system generates new blocks is
The relationship between the new state
If
By equations (33) and (34)
Assuming
Status distribution probability of
As shown in Table 2, the probability of consensus between the nodes within the same agency and the rest of the instance system is 98.1%. The probability that new blocks generated within an agency are not received by the rest of the instance system is 0.1%. The probability that new blocks generated by the rest of the system are not received by the nodes within the agency is 1.6%. The probability that new blocks are generated within the agency and in the rest of the system at the same time because of network transmission delay is 0.4%, and the probability of all other cases is less than 10^-3. The analysis shows that the dynamic data blockchain of the instance system is unlikely to fork into multiple chains whose lengths differ by more than 1 because of propagation delay; as described earlier in this section, such a fork is resolved by the arrival of new blocks.
Deployment and experiment
We conducted a public experiment to evaluate the performance of the mechanism proposed in this article. A total of 1219 volunteers recruited from the Internet participated in the open study on the ChainSQL platform for 1 month. The platform is part of our long-term research on blockchain-based IoT security.
We revised part of the consensus mechanism of ChainSQL. We kept the transaction set generation consensus module of ChainSQL unchanged and revised its block generation module and VNL management module according to the description in section “Consensus mechanism based on verification nodes list”:
We replaced ChainSQL’s multi-round voting block consensus mechanism, which is based on an increasing threshold, with the time-benefit-based block consensus mechanism proposed in this article, which improved the speed of block consensus.
We replaced ChainSQL’s static management of the VNL with the dynamic management method proposed in this article, which is based on the reputation value of verification nodes; this enhanced the credibility of the verification list and the security of the system.
The improved consensus mechanism is already available on the ChainSQL platform. In addition, the company has signed a cooperative R&D (research and development) agreement with our research team and is ready to work with us to improve the ChainSQL platform further.
Figure 4 shows the quality and growth characteristics of traceability data under ChainSQL platform, while the total number of operational entities is

Quality and growth characteristics of traceability data under ChainSQL platform: (a)
In Figure 4(a),
In Figure 4(b),
In Figure 4(c),
In Figure 4(d),
The comparison of Figure 4(a)–(d) shows that as the number of active nodes and the number of VNL nodes authorized by the agencies increase, the speed of system consensus is robust to both parameters and is mainly affected by the preset time of each round.
Conclusion
In this article, a consensus mechanism based on the VNL is proposed to store and manage dynamic data under a consortium chain. In each round, the qualified block with the most votes and the shortest time is selected and linked to the chain. All dynamic data and operations are permanently logged on the blockchain for authorized access. The quality and growth characteristics of the dynamic data blockchain under this mechanism have been proved, and the influence of transmission delay between agencies on the formation of the blockchain has been analyzed. It is concluded that the dynamic data blockchain is unlikely to generate multiple chains whose lengths differ by more than 1 because of transmission delay. When the number of nodes in the verification list is large enough, a block generated by an attacker has only a minimal chance of being elected as the best block of a round, and only if the attacker compromises all terminals and is fast enough. Although this chance exists in theory, the probability of it happening in practice is very small.
Footnotes
Handling Editor: Ximeng Liu
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by National Key R&D Project of China (no. 2016YFB0800203), Program for Innovative Research Team (in Science and Technology) in University of Henan Province (no. 17IRTSTHN009).
