Abstract
Crowdsourcing schemes that use social networks to solve complex tasks are an important part of open cooperation over the Internet. Although blockchain-based crowdsourcing schemes have considerable advantages in decentralization and data sharing, guaranteeing the security of sensitive crowdsourced information and the fairness of crowdsourcing on the blockchain remains a challenge. To this end, this article investigates a crowdsourcing scheme based on blockchain. First, we define the basic requirements of blockchain-based crowdsourcing schemes: fairness, confidentiality, and integrity. Then, using secure hash functions, commitments, and homomorphic encryption, we propose BFC, a blockchain-based secure and fair crowdsourcing scheme. Our analysis shows that the scheme satisfies these requirements. Finally, the experimental results show that the computational overhead of the BFC scheme is acceptable to both the requester and the workers. In short, the proposed crowdsourcing scheme has good extensibility in practice.
Introduction
In recent years, crowdsourcing has attracted increasing attention, and many companies have adopted crowdsourcing models on the Internet to solve difficult problems in a distributed manner. Crowdsourcing is a distributed problem-solving model in which solutions are recruited publicly.1,2 Traditional social-network-based crowdsourcing schemes include three types of participants: 3 requester, workers, and a centralized crowdsourcing platform. The requester publishes a crowdsourcing task and a corresponding incentive policy. A group of workers interested in the task submit their solutions to the crowdsourcing platform. The platform then selects the appropriate solutions and rewards the outstanding workers. When a dispute arises between the requester and a worker, the third-party platform arbitrates.4,5 For example, voting 6 and auctions 7 are typical crowdsourcing tasks in real life.
Traditional centralized crowdsourcing schemes rely on third-party platforms to ensure that the crowdsourcing process is conducted properly and that solutions and rewards are exchanged fairly. 8 However, this model brings several unavoidable challenges. First, crowdsourcing schemes that rely entirely on centralized servers are not suitable for distributed environments. Second, when there is a dispute 9 between users, the platform may make dishonest decisions leading to “false reporting” and “free-riding.” Third, sensitive crowdsourcing information is typically stored on centralized servers, which may lead to data leakage. 10 Fourth, centralized crowdsourcing schemes are vulnerable to denial-of-service (DoS) attacks, remote hijacking, and Sybil attacks, any of which can cause the crowdsourcing service to fail. 11
Introducing decentralization into crowdsourcing is therefore a pressing need. The blockchain based on peer-to-peer network13,14 is an open, transparent, and distributed data storage model. All nodes in the network jointly maintain the operation of the blockchain. In recent years, many blockchain-based cryptographic protocols have been proposed.15–17 The blockchain is therefore a promising foundation for decentralized crowdsourcing schemes.
Compared with traditional crowdsourcing schemes based on centralized management, a blockchain-based crowdsourcing scheme should satisfy the following requirements. First, it is essential to guarantee the verifiability of sensitive crowdsourced information while ensuring its confidentiality. Second, because users interact directly on the blockchain, the integrity of the information should also be protected. Most importantly, because the blockchain is decentralized, the fairness of crowdsourcing must be ensured; that is, completing a fair exchange between the requester and the workers is a crucial challenge. Fairness can be undermined in several ways during the crowdsourcing process. First, the requester may refuse to execute the incentive policy or may execute it dishonestly. Second, a requester who is dissatisfied with the existing solutions may collude with dishonest workers to create an unfair distribution of rewards. For example, in an auction, if the owner thinks that the bids fall short of expectations, it may collude with other buyers to disrupt the auction with a higher bid. Third, the incentive policy may include private information about the requester, such as location and identity, and disclosing it before the solutions are received may let workers submit false reports. For example, in a vote, the requester may be unwilling to disclose in advance the condition for a candidate to be elected (such as receiving more than half of the votes), because doing so could cause workers to vote dishonestly. Fourth, workers may change their own solutions after observing those of others on the blockchain. In summary, all of the above behaviors may undermine the fairness of the crowdsourcing process.
However, using only the common cryptosystems, such as public key authentication, 18 password authentication, 19 attribute-based encryption, 20 digital signature, 21 privacy-preserving outsourced calculation,22,23 and location-based verification, 24 cannot completely solve the problems in the blockchain-based crowdsourcing schemes, especially in terms of ensuring fairness.
Our contributions
In this article, we focus on a blockchain-based crowdsourcing scheme that satisfies the requirements of confidentiality, integrity, and fairness. We first adopt the homomorphic encryption25,26 to simultaneously ensure the confidentiality and verifiability of the solutions. In this way, any user can verify the reward distribution based on the solution ciphertexts. And then, the secure hash27,28 and commitment29,30 are used to satisfy the requirement of fairness, that is, to ensure that the requester’s incentive policy and the workers’ solutions are unchangeable and unavailable at some phases.
Our contributions in this article are fourfold, as follows:
We classify the required properties of blockchain-based secure crowdsourcing scheme including confidentiality, integrity, and fairness, especially to defend against the collusion of the dishonest requester and workers.
We propose BFC, a novel blockchain-based secure and fair crowdsourcing scheme. The BFC scheme can provide the fair exchange between the requester and the workers, while protecting the confidentiality and integrity of the crowdsourcing information.
We theoretically analyze that BFC satisfies the above required properties.
We report experimental evaluations and show that BFC is feasible for both the requester and the workers.
Related work
Centralized crowdsourcing scheme
Since Jeff Howe proposed the concept of crowdsourcing in 2006, it has received considerable attention and adoption. Existing centralized crowdsourcing schemes include Upwork, 31 Amazon Mechanical Turk (MTurk), 32 Waze, 33 and Airbnb. 34 In these schemes, a centralized crowdsourcing platform provides services such as worker selection, incentive policy implementation, and solution verification. The requester can hire workers through the platform to obtain correct solutions for the task. However, taking MTurk as an example, because the workers send their solutions in plaintext to the third-party crowdsourcing platform, data leakage and a single point of failure are possible. At the same time, this scheme cannot completely prevent dishonest behavior by the crowdsourcing platform. X Zhang et al. 35 proposed two auction-based schemes, EFF and DFF: the EFF mechanism arbitrates disputes with the help of a trusted third party to eliminate dishonesty, whereas the DFF mechanism can deter dishonest behavior without any third party. Y Zhang and M van der Schaar 36 introduced a reputation-based incentive mechanism in the crowdsourcing system. Although the above schemes address fairness to a certain extent, they still neglect the confidentiality of crowdsourcing information. Most importantly, none of them is fully applicable to distributed crowdsourcing scenarios.
Blockchain-based crowdsourcing scheme
Buccafurri et al. 37 proposed an alternative to blockchain, based on online social networks, to implement crowdsourcing services, but it did not guarantee the confidentiality of the information and had no incentive mechanism. C Tanas et al. 38 used the blockchain as a channel to pay rewards, but their scheme cannot defend against some attacks by malicious users, such as the requester’s dishonest execution of the incentive policy, which undermines the fairness of crowdsourcing. Nor does it protect the confidentiality of crowdsourcing information. M Li et al. 16 designed CrowdBC, a distributed crowdsourcing framework based on blockchain. Although it can enforce the incentive policy by deploying smart contracts, the requester may still collude with workers to dishonestly interrupt the crowdsourcing process. Y Lu et al. 17 proposed ZebraLancer, a private and anonymous crowdsourcing system based on the blockchain. Although the solutions are encrypted with the requester’s public key, this scheme, like CrowdBC, cannot prevent collusion between a dishonest requester and the workers. In short, because none of the above schemes can prevent dishonest users from colluding, they fall short of ensuring the fairness of crowdsourcing.
Preliminaries
Secure hash function
For the function
Definition 1
For a given hash function
1.
2.
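The behavior that the scheme relies on from a secure hash function can be illustrated with a minimal sketch. SHA-256 from Python's standard library is used here as an assumption (the evaluation later in the article uses MD5 and SHA-1), and the two policy strings are hypothetical.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical incentive policies that differ in a single character.
policy = b"reward the first 3 correct solutions"
tampered = b"reward the first 2 correct solutions"

h1 = digest(policy)
h2 = digest(tampered)

# The digest is deterministic and has a fixed length (256 bits = 64 hex chars).
same_input_same_digest = digest(policy) == h1
fixed_length = len(h1) == len(h2) == 64

# A small change in the input yields a different, unrelated-looking digest.
differing_bits = bin(int(h1, 16) ^ int(h2, 16)).count("1")
print(same_input_same_digest, fixed_length, h1 != h2, differing_bits)
```

Publishing only the digest lets anyone later check that a revealed policy is the one originally hashed, without the digest itself exposing the policy text.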
Homomorphic encryption
Homomorphic encryption is a cryptographic technique whose security rests on the computational hardness of certain mathematical problems. It allows computation directly on ciphertexts: when the result of a computation on homomorphically encrypted data is decrypted, it equals the result of performing the corresponding computation on the original data.25,26 In this section, we mainly introduce homomorphic encryption based on public key cryptosystem.
Definition 2
Let
The homomorphic encryption algorithm based on public key cryptosystem should satisfy the following conditions:
1.
2.
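As an illustration of the homomorphic property, the following is a minimal sketch of textbook Paillier encryption, one of the algorithms evaluated later in the article. The tiny primes, the choice of generator n + 1, and all function names are assumptions made for readability; the parameters are far too small to be secure.

```python
import math
import secrets

def keygen(p: int, q: int):
    """Toy Paillier key generation from two small primes (illustrative only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to n + 1
    return n, (lam, mu)

def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) as (n+1)^m * r^n mod n^2 with random r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n: int, sk, c: int) -> int:
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = sk
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n * mu) % n

def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

n, sk = keygen(1009, 1013)
c1, c2 = encrypt(n, 17), encrypt(n, 25)
total = decrypt(n, sk, add_encrypted(n, c1, c2))
print(total)
```

Only the holder of the secret key can decrypt, yet anyone holding the ciphertexts can combine them, which is the property the scheme uses to make reward distribution checkable.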
Commitment scheme
Typically, the commitment scheme is a two-phase protocol between the sender and the receiver of two polynomial-time algorithms, and the protocol satisfies validity, completeness, binding, and hiding.29,30 The specific description is as follows.
Definition 3
The general commitment scheme consists of three polynomial-time algorithms:
The general commitment scheme should satisfy at least the following secure conditions:
1.
2.
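The commit and open algorithms can be sketched with a Pedersen commitment, the scheme the article adopts in its evaluation. The group parameters below (a 10-bit safe prime and the generators 4 and 9) are assumptions chosen for readability. In a real deployment the group must be large, and the discrete logarithm of H to the base G must be unknown to the committer.

```python
import secrets

# Toy group: P = 2*Q + 1 is a safe prime; G and H generate the order-Q subgroup.
P = 1019
Q = 509
G = 4  # 2^2 mod P
H = 9  # 3^2 mod P; its discrete log w.r.t. G is assumed unknown

def commit(m: int, r=None):
    """Commit to m with blinding value r; returns (commitment, r)."""
    if r is None:
        r = secrets.randbelow(Q)
    c = (pow(G, m % Q, P) * pow(H, r, P)) % P
    return c, r

def open_commitment(c: int, m: int, r: int) -> bool:
    """Check that (m, r) is a valid opening of commitment c."""
    return c == (pow(G, m % Q, P) * pow(H, r, P)) % P

c, r = commit(123)
valid_opening = open_commitment(c, 123, r)
forged_opening = open_commitment(c, 124, r)

# Hiding: the same message with different blinding values looks different.
c_a, _ = commit(123, 1)
c_b, _ = commit(123, 2)
print(valid_opening, forged_opening, c_a != c_b)
```

The random blinding value provides hiding, and the hardness of finding a second opening provides binding. These are the two properties the scheme needs to keep policies and solutions both secret and fixed until the opening step.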
Blockchain
The blockchain based on peer-to-peer network13,14 is an open, transparent, and distributed data storage model. A series of chronological transactions are recorded in a structure called a “block.” Since each block contains the hash value of the previous block, these blocks form a chain structure. A blockchain is essentially a public, immutable, and ordered distributed ledger. Miners compete for the right to add new blocks to the blockchain by contributing computing power, and the winners obtain rewards.
We take Bitcoin 39 as an example to introduce the basic construction principle of blockchain. Transactions on the blockchain typically include three parts: input, output, and digital signature. For a transaction to be valid, its input must reference an unspent output of a previous transaction. In Bitcoin, the sender computes a hash over the previous transaction and the next owner’s public key and signs it with its private key. Other users can verify the signature. In addition, users can attach some additional information to the transaction. As a result, all transactions over a period of time are linked to one another. Miners organize valid transactions into a structure called a “Merkle tree” and place it in a new block together with the hash of the previous block. The miners then run the consensus mechanism (proof-of-work), searching for a nonce that allows the new block to be attached to the blockchain. Once the block has been added to the blockchain, its transactions can no longer be modified.
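The chaining of blocks by hash, and why it makes recorded transactions hard to modify, can be shown with a minimal sketch. It deliberately omits signatures, the Merkle tree, and proof-of-work. The block layout and function names are assumptions for illustration and do not reproduce Bitcoin's actual data format.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's full contents, including its link to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain: list, transactions: list) -> None:
    """Append a block that stores the hash of the current last block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})

def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
append_block(chain, ["requester publishes task"])
append_block(chain, ["worker A submits commitment"])
append_block(chain, ["worker B submits commitment"])
valid_before = is_valid(chain)

# Rewriting an earlier transaction breaks the link stored in the next block.
chain[0]["transactions"][0] = "requester publishes a different task"
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

To hide the change, an attacker would have to recompute every later block, which the consensus mechanism is designed to make impractical.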
Problem formulations
System model
The system model of our BFC scheme is shown in Figure 1. There are three participants in this model including requester, workers, and blockchain. Briefly, the requester first publishes a crowdsourcing task on the blockchain. Workers who want to get rewards submit their own solutions on the blockchain after receiving the requester’s task. Finally, the requester rewards the workers for their correct solutions. To be specific, consider the following.

Figure 1. The system model.
Requester
A requester in the crowdsourcing system aims to publish a crowdsourcing task on the blockchain, together with a task description, an incentive policy, a budget, and a validity period. After the task is released, the requester collects the solutions submitted by workers within the validity period and verifies their correctness. The requester then distributes rewards to the corresponding workers according to the incentive policy originally established.
Workers
Workers in the crowdsourcing system aim to submit solutions for the task published by the requester and expect to receive rewards. To ensure a fair exchange between correct solutions and rewards, all users, including the workers, can continually verify the entire crowdsourcing process.
Blockchain
As an open decentralized platform, blockchain applies to crowdsourcing scenarios. The requester and workers communicate with each other via the blockchain and issue crowdsourcing information in the form of transactions.
Threat model
The entities in decentralized crowdsourcing system have the possibility to play the role of attackers.
Requester
A requester is semi-honest and curious. It may exhibit the following adversarial behaviors. First, the requester may launch the change-incentive policy attack after publishing a crowdsourcing task to affect the fairness of crowdsourcing. Second, the requester may launch the dishonestly-run-incentive policy attack, refusing to reward the appropriate workers for their correct solutions in an attempt to reduce its spending on the task. Third, the requester may launch the known-solution attack before reward distribution to conspire with malicious workers, resulting in an unfair distribution of rewards.
Workers
Workers are semi-honest and curious. They may exhibit the following adversarial behaviors. To get more rewards, the workers may attempt the change-own solution attack after submission by snooping on solutions submitted by others on the blockchain. In addition, the workers may launch the known-incentive policy attack before reward distribution to spy on the sensitive information of the requester.
Outsiders
Outsiders are malicious users outside the crowdsourcing system. They may launch the modify-incentive policy attack and the modify-solution attack without the users’ permission to prevent the crowdsourcing processes from proceeding smoothly. For the task that has already been completed, the outsiders may try to launch the illegally-acquire-correct solution attack to maliciously get desired solutions on the blockchain.
Design goals
This article intends to propose a secure and fair crowdsourcing scheme in a decentralized environment. The requester publishes the crowdsourcing request and related information to the blockchain. Then, the workers get the task and submit solutions. After that, the corresponding workers receive rewards from the requester if they offer the proper solutions.
The BFC scheme proposed in this article should have the following attributes:
The incentive policy must be unchangeable after task publishment against the change-incentive policy attack. In the crowdsourcing system, for preventing the requester from achieving the purpose of misallocating rewards, the incentive policy issued by the requester cannot be changed.
The true information of the incentive policy must be unavailable before reward distribution against the known-incentive policy attack. In practical application scenarios, the crowdsourcing policy may contain specific information about how the solutions should be verified. In order to prevent workers from getting rewards through dishonestly submitting solutions, it is necessary to ensure that the policy is unavailable for workers at some phases.
The solutions must be unchangeable after submission against the change-own solution attack. Due to the open and transparent nature of the data on the blockchain, it is easy for workers to get solutions submitted by others. Therefore, in order to prevent workers from changing their own solutions with reference to others, once the solution is submitted, it cannot be changed.
The solutions must be unavailable before reward distribution against the known-solution attack. In order to eliminate the possibility that the requester obtains the solutions in advance and colludes with the malicious workers to undermine the fair distribution of the rewards, the solutions have to be unavailable until the requester begins to distribute the rewards.
The final distribution results of the requester must be verifiable against the dishonestly-run-incentive policy attack. Provided that the incentive policy has not been changed, and to ensure that the requester fairly rewards the workers who submit correct solutions according to the initially published policy, our scheme must guarantee that the requester is honest in verifying the solutions and distributing the rewards. Any user can check whether the requester correctly rewards the workers.
The proposed scheme
Overview
Our BFC scheme consists of the following four phases: initialization phase, task publishment phase, solution submission phase, and reward distribution phase.
In the initialization phase of the BFC scheme, the registered requester generates a homomorphic encryption key pair. During the task publishment phase, the requester commits to the policy and publishes the task, the commitment, and related parameters on the blockchain. Then, in the solution submission phase, the workers encrypt their solutions with the requester’s homomorphic public key, generate commitments to the ciphertexts, and submit these commitments to the blockchain. Finally, in the reward distribution phase, the requester and the workers open their commitments separately. If the crowdsourcing information on the blockchain is valid, the requester verifies the solutions according to the policy and distributes rewards to the workers. Any user can check, on the basis of the solution ciphertexts, whether the requester assigns rewards honestly.
In Figure 2, we indicate the crowdsourcing information that the users need to publish on the blockchain at various phases except the initialization phase. The details of our BFC scheme will be described in the following sections.

Figure 2. Process of the BFC scheme.
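The ordering of the phases described in the overview can be sketched as a commit-then-open flow. This is a rough illustration under several assumptions, not the scheme's actual protocol messages: a salted SHA-256 hash stands in for the commitment scheme, textbook RSA with a toy key stands in for the homomorphic encryption, a Python list stands in for the blockchain, and the policy text and all names are hypothetical.

```python
import hashlib
import secrets

# Toy textbook-RSA key (n = 61 * 53); far too small for real use.
N, E, D = 3233, 17, 2753

def commit(message: bytes):
    """Hash-based stand-in commitment: returns (commitment, opening nonce)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + message).hexdigest(), nonce

def opens_to(commitment: str, message: bytes, nonce: bytes) -> bool:
    """Check that (message, nonce) opens the published commitment."""
    return hashlib.sha256(nonce + message).hexdigest() == commitment

ledger = []  # stand-in for transactions published on the blockchain

# Task publishment: the requester commits to the hash of its policy.
policy = b"reward every solution equal to 42"
policy_hash = hashlib.sha256(policy).digest()
policy_commitment, policy_nonce = commit(policy_hash)
ledger.append(("task", policy_commitment))

# Solution submission: the worker encrypts, then commits to the ciphertext.
solution = 42
ciphertext = pow(solution, E, N).to_bytes(2, "big")
solution_commitment, solution_nonce = commit(ciphertext)
ledger.append(("solution", solution_commitment))

# Reward distribution: both sides open; anyone can check against the ledger.
policy_ok = opens_to(ledger[0][1], hashlib.sha256(policy).digest(), policy_nonce)
solution_ok = opens_to(ledger[1][1], ciphertext, solution_nonce)

# Only the requester, holding the private exponent, recovers the solution.
recovered = pow(int.from_bytes(ciphertext, "big"), D, N)
print(policy_ok, solution_ok, recovered)
```

Because only commitments are on the ledger until the final phase, neither side can see or alter the other's input in the meantime.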
Initialization
As the initialization phase begins, each user
In general, each user in crowdsourcing system obtains the unique identifier and the key pair used to publish transactions. The requester
Task publishment
The main purpose of this phase is for the requester to publish a crowdsourcing task in the form of a new transaction on blockchain. For each transaction inserted with a crowdsourcing task, the requester
At the end of this phase, the requester
Solution submission
The main purpose of this phase is for the workers to submit their solutions in the form of new transactions on the blockchain. When the worker
At the end of this phase, the worker encapsulates its solution
Reward distribution
The main purpose of this phase is for the requester to distribute rewards to workers who correctly submit the solutions. During the valid period
It should be added that steps 1, 5, and 6 of this phase require the users to publish the crowdsourcing information to the blockchain in the form of transactions. The specific processing of the transaction is similar to the previous two phases. Here, we omit them.
After the end of this phase, the crowdsourcing process is completed. Any user in the crowdsourcing system can verify the information on the blockchain. If someone has objections to certain parts, he or she can attach the task identifier to post its opinion in the form of a transaction.
Discussion of the BFC scheme
Analysis of scheme
In this section, we analyze how our BFC scheme satisfies the design goals proposed in section “Design goals.”
Fairness
Our scheme should guarantee the fair exchange between the requester and workers, as follows:
1. The incentive policy published by the requester should be unchangeable against the change-incentive policy attack. If the dishonest requester
Therefore, it is computationally infeasible to find a new policy which is different from the original one for the dishonest requester.
Thus, it is computationally infeasible for the dishonest requester to find a new hash value of incentive policy.
2. The true information of the incentive policy must be unavailable before reward distribution against the known-incentive policy attack. The requester
Based on the above analysis of the commitment, the dishonest workers cannot acquire any information about the hash value
And then, even if the attacker obtains the hash
Thus, it is computationally infeasible to get the specific incentive policy
3. The solutions must be unchangeable after submission against the change-own solution attack. If the worker
4. The solutions must be unavailable before reward distribution against the known-solution attack. Workers submit their solution ciphertext commitments on the blockchain. If the requester
5. The final distribution results of the requester must be correct against the dishonestly-run-incentive policy attack. In our proposed scheme, not only the requester
Therefore, users can verify the correctness of the solution
Confidentiality
Our scheme should guarantee the confidentiality of correct solutions after distribution against the illegally-acquire-correct solution attack. After the crowdsourcing task is completed, the outsiders can only get the requester’s homomorphic public key
Therefore, the outsiders cannot decrypt the correct solution ciphertexts without the homomorphic secret key.
Integrity
Our scheme should guarantee the integrity of incentive policy and solutions against the modify-incentive policy attack and the modify-solution attack. We assume that the block containing the transaction of the task policy
Comparison on security
In this section, we compare our scheme with several related schemes, including MTurk, EFF and DFF, CrowdBC, and ZebraLancer. In Table 1, G1 represents confidentiality, G2 integrity, and G3 fairness. In addition, A1 represents the illegally-acquire-correct solution attack, A2 the modify-incentive policy attack, A3 the modify-solution attack, A4 the change-incentive policy attack, A5 the known-incentive policy attack, A6 the change-own solution attack, A7 the known-solution attack, and A8 the dishonestly-run-incentive policy attack. The table shows the comparison results, where “√” means that the scheme can resist the corresponding attack, “×” means that it cannot, and “−” means that the attack is not considered by the scheme.
Table 1. Comparison with related works.
It is obvious that MTurk, 32 EFF and DFF, 35 CrowdBC, 16 and ZebraLancer 17 cannot fully guarantee the fairness of crowdsourcing. And except for ZebraLancer 17 and our scheme, other schemes also perform poorly in terms of confidentiality and integrity. Therefore, our BFC scheme, which satisfies all the required properties, can be applied to provide secure and fair crowdsourcing service.
Performance evaluation
In this section, we mainly focus on evaluating the computational overhead of our proposed BFC scheme. The performance evaluation covers four phases, that is, initialization, task publishment, solution submission, and reward distribution. Since the proposed scheme is a universal crowdsourcing model, the choice of specific algorithms depends on the actual needs of the users. Therefore, we selected several representative hash algorithms and semi-homomorphic algorithms in this section and chose Pedersen’s algorithm as the commitment scheme to evaluate and analyze their computational overhead. The experiments are implemented on a PC (Intel(R) Core(TM) i3-4170 CPU at 3.70 GHz, 8 GB RAM, and Windows 7 OS) using Crypto++ 5.6.2 and gcc/g++.
Initialization
In the initialization phase of our scheme, requester
Task publishment
In the task publishment phase, the requester first generates a hash value for the established incentive policy and then commits to the hash. We assume that the incentive policy is 1–1024 B long. We adopt Pedersen’s commitment scheme together with the MD5 and SHA-1 algorithms to evaluate the computational overhead of the requester; these two hash functions were selected because the Pedersen commitment requires its input to be at most 160 bits. Figure 3 shows that the choice of hash function has little effect on the computational overhead of this phase. When the policy is 1 KB long, the computational overhead is about 2.76 ms with the MD5 algorithm and about 3.01 ms with the SHA-1 algorithm. This is because, for short inputs, the hash function takes only microseconds, whereas the Pedersen commitment takes milliseconds, so the commitment dominates the cost of this phase. Moreover, a longer incentive policy published by the requester leads to a higher computational cost.

Figure 3. Computational overhead in task publishment phase.
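The 160-bit input limit mentioned for the Pedersen commitment explains the choice of hash functions, and the digest sizes are easy to check. The sketch below uses Python's `hashlib` rather than the Crypto++ library used in the experiments. The constant name is hypothetical, and SHA-256 is included only as a contrasting assumption.

```python
import hashlib

# Input limit stated in the text for the evaluated Pedersen commitment setup.
PEDERSEN_MAX_INPUT_BITS = 160

def digest_bits(name: str) -> int:
    """Return the output length in bits of the named hash algorithm."""
    return hashlib.new(name).digest_size * 8

# For each candidate hash, record whether its digest fits the commitment input.
fits = {
    name: digest_bits(name) <= PEDERSEN_MAX_INPUT_BITS
    for name in ("md5", "sha1", "sha256")
}
print(fits)
```

MD5 (128 bits) and SHA-1 (160 bits) fit within the limit, whereas a 256-bit digest would need a larger commitment group or truncation.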
Solution submission
After receiving the crowdsourcing task published by the requester, the workers who intend to submit solutions first encrypt their solutions with the requester’s homomorphic public key and then commit to the ciphertexts. We assume that the solution created by a worker is 128–1024 B long. We adopt Pedersen’s commitment scheme and the RSA-1024, RSA-2048, Paillier-1024, and Paillier-2048 algorithms to evaluate the computational overhead of the worker. In Figure 4, we first notice that the two RSA key sizes lead to only slightly different computational overhead, whereas the key size of the Paillier algorithm has a relatively large impact on the overhead. In this phase, when the solution is 1 KB long, the computational overhead of using the RSA-1024 algorithm and the RSA-2048 algorithm is about 18.86 and 19.20 ms, respectively. In addition, the computational overhead is approximately 104.07 and 375.55 ms when using the Paillier-1024 algorithm and the Paillier-2048 algorithm, respectively. Second, the computational overhead is positively correlated with the length of the solution. Third, for the same solution length, the Paillier algorithm takes much longer than the RSA-1024 and RSA-2048 algorithms.

Figure 4. Computational overhead in solution submission phase: (a) evaluation with different RSA algorithms; (b) evaluation with different Paillier algorithms.
Reward distribution
In this phase, the requester and the workers open their own commitments, and if the commitments are valid, the requester decrypts the solution ciphertexts with the private key. Based on the incentive policy developed at the time of the task publishment, the requester evaluates the solutions and sends the assessment results and corresponding rewards to the worker. Any user on the blockchain can verify the correctness of the reward distribution based on the solution ciphertext.
The computational overhead of the Pedersen commitment scheme when verifying commitments is similar to that of generating them. We use the RSA-1024, RSA-2048, Paillier-1024, and Paillier-2048 algorithms to evaluate the computational overhead of decrypting a solution ciphertext for the requester in this phase. In Figure 5, we can see that the computational overhead of all four algorithms increases markedly when the input grows from 128 to 1024 B. In addition, for the same solution length, the RSA-1024 algorithm has the shortest decryption time and the Paillier-2048 algorithm the longest. Assume that the requester’s crowdsourcing task is to compute the sum and the product of all valid solutions. By the multiplicative homomorphism of the RSA algorithm and the additive homomorphism of the Paillier algorithm, any user multiplies all valid solution ciphertexts when verifying the distribution. As shown in Figure 6, we describe the relationship between the number of solution ciphertexts

Figure 5. Computational overhead of decryption in reward distribution phase.

Figure 6. Computational overhead of verification in reward distribution phase: (a) evaluation with different RSA algorithms; (b) evaluation with different Paillier algorithms.
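The verification step described above, in which any user multiplies the valid solution ciphertexts, can be sketched for the RSA case. The sketch assumes deterministic textbook RSA with a toy public key and a hypothetical task whose published result is the product of all solutions. It is not the exact check used in the experiments.

```python
from functools import reduce

# Toy textbook-RSA public key (n = 61 * 53); illustrative only.
N, E = 3233, 17

def encrypt(m: int) -> int:
    """Deterministic textbook RSA encryption under the public key."""
    return pow(m, E, N)

def verify_product(ciphertexts, claimed_product: int) -> bool:
    """Check a claimed product of plaintexts using only public information.

    Multiplying the ciphertexts yields an encryption of the product of the
    plaintexts (mod N), so it must equal the encryption of the claimed value.
    """
    combined = reduce(lambda a, b: (a * b) % N, ciphertexts, 1)
    return combined == encrypt(claimed_product % N)

solutions = [2, 3, 5, 7]  # hypothetical worker solutions
ciphertexts = [encrypt(m) for m in solutions]

honest_claim = verify_product(ciphertexts, 210)     # 2 * 3 * 5 * 7
dishonest_claim = verify_product(ciphertexts, 211)  # a misreported result
print(honest_claim, dishonest_claim)
```

The verifier performs one modular multiplication per ciphertext, which is consistent with the verification cost growing with the number of solution ciphertexts.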
Comparison on performance
In this section, we studied some of the other crowdsourcing schemes mentioned in the related work, and we only focused on the part of the crowdsourcing process. We selected two blockchain-based schemes, CrowdBC 16 and ZebraLancer, 17 and a centralized crowdsourcing scheme EFF. 35 Since the schemes CrowdBC 16 and ZebraLancer 17 are based on smart contracts, we ignore the communication overhead and the calculations on the blockchain and only evaluate the computational overhead of the requester and the worker locally at different phases. Similarly, we do not consider the overhead of the crowdsourcing platform for the EFF scheme. 35 In the experiments, it is assumed that the policy published by the requester is 1KB, and the solution submitted by the worker is 127B. In addition, the number of solutions that need to be verified is 100. Table 2 shows the results. Since the ZebraLancer 17 scheme uses the RSA-2048 algorithm, the computational overhead of our scheme shown in Table 2 is also based on the RSA-2048 algorithm. Here, “−” means that the computational overhead of the requester and the worker in this scheme is negligible or there is almost no local computation.
Table 2. Computational overhead in different schemes.
As shown in Table 2, the computational overhead of the proposed BFC scheme is acceptable to both the requester and the worker. Although the centralized crowdsourcing scheme EFF 35 is efficient, the preceding analysis shows that it does not fully satisfy the requirements of crowdsourcing. The other two blockchain-based schemes involve substantial on-chain computation, which may also reduce their performance.
Conclusion
In this study, we propose BFC, a secure and fair crowdsourcing scheme based on blockchain. The proposed scheme allows the requester and workers to share crowdsourcing information without a third-party platform while ensuring confidentiality and integrity. In addition, any user in the crowdsourcing system can verify the entire crowdsourcing process to ensure fairness. Our analysis shows that the scheme satisfies the proposed security requirements and goals. Finally, the experimental results show that the computational overhead is acceptable to both the requester and the worker.
Footnotes
Handling Editor: Francesco Longo
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the National Natural Science Foundation of China (Grant Nos 61872283, U1536202, U1708262, 61672413, 61672415, 61671360, 61602360, and 61702404) and the China 111 project (Grant No. B16037).
