Sage Journals: Discover world-class research

Abstract

Image denoising is a fundamental tool in the fields of image processing and computer vision. With the rapid development of multimedia and cloud computing, it has become popular for resource-constrained users to outsource the storage and denoising of massive images. However, outsourcing may cause privacy concerns and response delays. In this scenario, we propose an efFicient privAcy-preseRving Image deNoising schEme (FARINE) for outsourcing digital images. By introducing a key conversion mechanism, FARINE allows removing noise from a given noisy image using non-local means filtering without leaking any information about the plaintext content. Owing to its low computational latency and communication cost, edge computing is adopted to improve the user experience. To support a dynamic user set efficiently, we design a fine-grained access control mechanism for user authorization and revocation in multi-user scenarios. Extensive experiments over several benchmark data sets show that FARINE obtains comparable performance to plaintext image denoising.

Keywords

Privacy-preserving, image denoising, edge computing, homomorphic encryption, access control

Introduction

Image denoising aims to recover the latent clean content from its noisy version and is widely applied to computer vision tasks, such as image classification,¹ object detection,² and semantic segmentation,³ where high-quality images generally contribute to the performance improvement of these tasks. In recent years, as deep learning has developed rapidly, sufficient high-resolution training samples have been desired to support image processing-related tasks.^4,5 However, digital images are frequently subject to random noise during image acquisition, due to inherent physical limitations of the sensor and the complicated camera processing pipelines.⁶ Therefore, image denoising has long been an important problem in image processing and has been an active research topic for the past few decades. Many solutions have been put forward, such as a relatively classical filtering-based algorithm⁷ and some of its improvements.^8,9 To obtain better performance, researchers have recently proposed deep learning-based denoising approaches.^10,11

With the increasing popularity of various imaging devices in our daily life, a huge number of images are produced all the time. The generation of massive images poses a challenge to local storage and denoising, so more and more users tend to resort to cloud computing. Although cloud computing has achieved tremendous success, large-scale centralized computing is prone to negative effects, such as response delay and high communication costs. To address these issues, in 2011, the industry and academia began to explore the network computing model in the post-cloud computing era.¹² In this process, new and better computing models constantly emerged, among which edge computing pushes computing and storage closer to the data source, making it widely used in various scenarios with low communication costs and strong real-time performance. In the last few years, the development of 5G technology has led to new traffic patterns, bringing new opportunities for multi-access edge computing (MEC), which has become a catalyst for the growth of the market space for edge computing.¹³ The global market size of edge computing is anticipated to reach USD 61.14 billion by 2028, exhibiting a compound annual growth rate of 38.4% over the forecast period, according to a recent report by Grand View Research, Inc.¹⁴

Data is often encrypted before outsourcing to enhance confidentiality, but this limits further processing of the data by the cloud. Hence, it is crucial to perform image denoising directly over outsourced encrypted data without revealing private content. Following the image denoising techniques in the plaintext domain, privacy-preserving image denoising methods can be generally divided into non-local means (NLM)-based methods^15–17 and deep learning-based methods.^18,19 The former leverage the weighted average idea to securely update each pixel of the encrypted image, mainly by employing the Paillier cryptosystem. However, introducing Paillier encryption makes the image-denoising system prone to high computation costs and data expansion. The latter can achieve outstanding denoising performance by resorting to deep learning. The scheme in reference 18 is the first attempt at deep learning-based secure image denoising, but it relies heavily on Paillier encryption and has a similar drawback to the former. To address this drawback, Zheng¹⁹ employed a lightweight encryption technique, additive secret sharing (ASS), to train a deep neural network. Although this scheme greatly improves efficiency, it remains challenging in practice because it demands the acquisition of massive images and real-time synchronization between servers.

Due to the adoption of cloud computing, a common drawback of all these schemes^15–19 is that they are highly susceptible to response latency and bandwidth consumption. These issues can be mitigated by edge computing, which helps bridge the gap between resource-limited users (such as mobile devices) and the cloud. Apps in the system can easily access locally owned infrastructure and services by deploying edge servers and mobile devices on the same LAN,²⁰ where the mobile devices in the hands of data owners efficiently carry out the related computing and are allowed to outsource local data. As far as we know, edge computing has not previously been applied to secure outsourced image denoising. In addition, the above schemes do not address the verification of user authorization/revocation, which is a desired function in practical applications. To solve these challenges, we introduce edge computing for efficient computation/communication and combine key conversion mechanisms to propose an efFicient privAcy-preseRving Image deNoising schEme (FARINE) for outsourced digital images, where edge servers correctly verify user authorization and revocation. The main contributions and novelty of our FARINE are summarized as follows:

Secure low latency outsourced image denoising. Our FARINE constructs a novel architecture to carry out image denoising, without compromising the privacy of outsourced data. This architecture can reduce computing latency by introducing edge computing instead of cloud computing.

Support for multi-user scenarios with unshared keys. We construct a novel key transfer protocol supporting unshared keys. It allows different users to have their own keys, which significantly enhances system security.

Flexible authorization and non-public key authentication. FARINE allows a content owner to authorize/revoke the denoising and decryption privileges for multi-level users. Besides, edge servers can verify the authorized/revoked privileges using their secret keys, without knowing the keys used to sign the permissions.

Ease of use. In FARINE, the content owners/authorized users encrypt or upload data before outsourcing, where any pre-processing is needless. Furthermore, FARINE does not require any interaction between users and servers, which brings low-cost benefit and convenience to users.

The rest of the article is arranged as follows. In the “Preliminaries” section, we review essential technical and cryptographic knowledge. Problem formulations are introduced in the “Problem formulation” section. Detail of the proposed FARINE is described in the “Proposed FARINE scheme” section. Analysis of correctness, security, and performance are shown in the “Analysis of our FARINE” section. Related work is discussed in the “Related work” section. Conclusions and future work are summarized in the “Conclusions and future work” section.

Preliminaries

We review the essential technical and cryptographic background that forms the foundation of FARINE.

Image denoising

The goal of image denoising is to remove the noise $n$ from the noisy image $v = x + n$ while restoring the original clean image $x$ as well as possible. The noise $n$ is commonly assumed to be zero-mean additive white Gaussian noise with standard deviation $σ$ .²¹ NLM is a seminal technique that exploits non-local similar patches within the image to achieve image denoising. It was first proposed by Buades et al.,⁷ and has been widely employed in many patch-based denoising algorithms due to its denoising quality.^22,23 In this article, we also focus on NLM denoising to demonstrate the effectiveness and availability of our FARINE.

Specifically, let $v (i)$ denote the $i$ -th pixel of the gray-scale noisy image $v$ . Based on the NLM mechanism, its new value is calculated by

\begin{matrix} N L [v (i)] = \sum_{j \in D} w (i, j) v (j) \end{matrix}

(1)

where $D$ represents a search window and the weight $w (i, j)$ is produced as follows:

\begin{matrix} w (i, j) = \frac{1}{Z (i)} \exp (- \frac{{‖ N (i) - N (j) ‖}_{2}^{2}}{h^{2}}) \end{matrix}

(2)

where $‖ \cdot ‖_{2}^{2}$ denotes the squared Euclidean distance, and $h$ is a filtering parameter depending on the standard deviation $σ$ of the zero-mean Gaussian noise. $N (γ)$ denotes the pixels that fall in a fixed patch centered at a pixel $γ$ , and $Z (i)$ is the normalizing factor, satisfying

\begin{matrix} Z (i) = \sum_{j \in D} \exp (- \frac{{‖ N (i) - N (j) ‖}_{2}^{2}}{h^{2}}) \end{matrix}

(3)
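As an illustration of equations (1) to (3), the following minimal plaintext sketch computes the NLM value of a single pixel. The function names, the default window sizes, and the border handling (clamping coordinates to the image) are assumptions made for this example and are not specified in the article.

```python
import math

def patch(img, r, c, half):
    """Pixels of the (2*half+1) x (2*half+1) patch N(.) centered at (r, c).
    Coordinates outside the image are clamped to the border (an assumption)."""
    height, width = len(img), len(img[0])
    return [img[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
            for dr in range(-half, half + 1)
            for dc in range(-half, half + 1)]

def nlm_pixel(img, r, c, half_patch=1, half_search=2, h=10.0):
    """NL[v(i)] for the pixel i = (r, c), following equations (1)-(3)."""
    height, width = len(img), len(img[0])
    ref = patch(img, r, c, half_patch)
    weighted_sum = 0.0   # sum_j exp(...) * v(j)
    z = 0.0              # normalizing factor Z(i), equation (3)
    for rr in range(max(r - half_search, 0), min(r + half_search, height - 1) + 1):
        for cc in range(max(c - half_search, 0), min(c + half_search, width - 1) + 1):
            cand = patch(img, rr, cc, half_patch)
            dist = sum((a - b) ** 2 for a, b in zip(ref, cand))  # squared distance
            e = math.exp(-dist / h ** 2)                          # unnormalized weight
            z += e
            weighted_sum += e * img[rr][cc]
    return weighted_sum / z  # equals sum_j w(i, j) v(j)
```

Because the weights are positive and sum to one, the result is always a convex combination of the pixels in the search window; on a constant image the function returns that constant.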

Multi-level homomorphic encryption

In this article, multi-level homomorphic encryption (MHE) proposed by Xiao et al.²⁴ is employed to construct our FARINE. The main thought of MHE is described below:

Description: MHE mainly contains three algorithms, key generation ( $K$ ), encryption ( $E$ ), and decryption $(D)$ . The correlation between these algorithms meets the conditions of $D (E (m, k), k) = m$ , where $m$ is the plaintext message, and $k$ is a private key distributed by $K$ .

Homomorphism: MHE has two important properties, namely additive and multiplicative homomorphism, which are given by

E (m_{1}, k) + E (m_{2}, k) = E (m_{1} + m_{2}, k)

(4)

E (m_{1}, k) \cdot E (m_{2}, k) = E (m_{1} \cdot m_{2}, k)

(5)

Consequently, a polynomial operation $f$ can also be evaluated in the encrypted domain, namely

\begin{matrix} f [E (m_{1}, k), E (m_{2}, k), \dots, E (m_{n}, k)] \\ = E (f (m_{1}, m_{2}, \dots, m_{n}), k) \end{matrix}

(6)

Key conversion: MHE supports re-encrypting a ciphertext by a matrix transformation:

\begin{matrix} (\prod_{i} k_{i}^{'})^{- 1} \cdot E (m, k_{j}) \cdot (\prod_{i} k_{i}^{'}) \\ = E (m, k_{j} \prod_{i} k_{i}^{'}) = E (m, k) \end{matrix}

(7)

where $k = k_{j} \prod_{i} k_{i}^{'}$ . It should be noted that the ciphertext $E (m, k_{j})$ can be converted into another ciphertext $E (m, k)$ without the decryption operation.

We refer readers to Xiao et al.²⁴ for more details on MHE.
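The properties in equations (4) to (7) can be checked on a toy instance. The sketch below is not the MHE construction of Xiao et al.; it is a simplified stand-in that assumes $2 \times 2$ key matrices over a prime field, a ciphertext of the form $k^{-1} \cdot diag (m, x) \cdot k$, and a random filler $x$ instead of values derived from the Chinese remainder theorem. All function names and the modulus `P` are hypothetical, and the code offers no security.

```python
import random

P = 2**61 - 1  # toy prime modulus standing in for N (assumption)

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    return [[(a[i][j] + b[i][j]) % P for j in range(2)] for i in range(2)]

def inverse(a):
    """Inverse of a 2x2 matrix modulo P via the adjugate formula."""
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)  # modular inverse, Python 3.8+
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def random_key(rng):
    """Draw a random invertible key matrix."""
    while True:
        k = [[rng.randrange(P) for _ in range(2)] for _ in range(2)]
        if (k[0][0] * k[1][1] - k[0][1] * k[1][0]) % P != 0:
            return k

def encrypt(m, k, rng):
    """E(m, k) = k^-1 * diag(m, x) * k with a random filler x."""
    diag = [[m % P, 0], [0, rng.randrange(P)]]
    return matmul(matmul(inverse(k), diag), k)

def decrypt(c, k):
    """D(c, k): undo the conjugation and read the first diagonal entry."""
    return matmul(matmul(k, c), inverse(k))[0][0]

def convert(c, k2):
    """Key conversion: E(m, k) -> E(m, k * k2), without decrypting."""
    return matmul(matmul(inverse(k2), c), k2)

rng = random.Random(1)
k, k2 = random_key(rng), random_key(rng)
c1, c2 = encrypt(21, k, rng), encrypt(2, k, rng)
sum_plain = decrypt(matadd(c1, c2), k)                     # equation (4)
prod_plain = decrypt(matmul(c1, c2), k)                    # equation (5)
converted_plain = decrypt(convert(c1, k2), matmul(k, k2))  # equation (7)
```

In this toy setting, addition and multiplication of ciphertexts act entrywise on the hidden diagonal, which is why the first diagonal entry carries the sum or product of the plaintexts.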

Problem formulation

We formalize the FARINE system model, define problem statement/attack model, and outline design goals.

System model

Our FARINE comprises five parties: Content Owner (CO), Authorization User (AU), Edge Platform (EP), Computation Service Provider (CSP), and Trusted third party (TTP). Note that the EP and CSP are two edge servers with different capabilities.

The CO first encrypts the noisy images and then sends the encrypted versions to the EP. Meanwhile, all authorization certificates for specific users are uploaded to the EP.

The AU issues a denoising request to the EP. If the AU is authorized by a CO, it can decrypt and obtain the denoised images.

The EP primarily responds to storage/denoising requests from the COs/AUs. Besides, the EP is responsible for verifying users’ access rights.

The CSP calculates the weight coefficients by decrypting the encrypted Euclidean distances, and sends them to the EP for denoising.

The TTP is tasked with the generation and distribution of all keys required by COs, AUs, EP, and CSP in the system.

Problem statement

The CO is inclined to share images with different users for profit, and high-quality images are always more attractive. Due to its limited resources, however, the CO may be unable to denoise large numbers of images. Thus, the CO tends to outsource the denoising service. For privacy protection, images need to be encrypted before uploading. Image denoising is done independently by edge servers without involving the CO and AU. Furthermore, the AU requires a sufficiently clean set of images for experimentation or commercial applications. Therefore, it must obtain authorization from the corresponding CO and receives the denoised images if and only if its access permission is validated by the edge servers. Moreover, the CO needs to be able to revoke the authorized rights within the authorized period when images are used illegally. In this case, we need to overcome the following challenges because the whole process takes place in the encrypted domain.

Secure weight computation challenge. Securely calculating weight coefficients is the key to the NLM-based image denoising. Therefore, the weight calculation protocol needs to be constructed, where the privacy of the pixels is not leaked.

Efficiency challenge. To reduce response latency and bandwidth costs, we need to introduce an alternative computing model that moves the high-intensity computing tasks closer to COs or AUs.

Key independence challenge. To obtain better security, the key agreement mechanism needs to be built to support different users with independent keys.

Authentication challenge. To guarantee that the system can support flexible user authorization and time-controlled revocation, an access control policy for different levels of users needs to be devised.

Attack model

Following the attack model widely used by Liu et al.^25,26 and Yang et al.,²⁷ we assume that EP and CSP are honest-but-curious parties, which honestly follow the pre-defined protocol but are curious about the private data, such as the image outline and the intermediate distance calculation results. This assumption has also been adopted by other schemes.^25,28,29 Besides, we assume that EP and CSP do not collude with each other, based on the fact that they come from different service providers in commercial competition. TTP only generates keys and is assumed to be fully trusted. During the execution of the entire system, it does not collude with other entities. Meanwhile, we assume that each entity may be compromised by an external adversary $A^{*}$ , which aims to recover the plaintext content of the encrypted images outsourced by CO and of the denoised images returned to AU. The capabilities of $A^{*}$ are defined as follows:

$A^{*}$ may intercept communication links between entities, and get the corresponding encrypted data.

$A^{*}$ may compromise EP in an attempt to recover the plaintext content from the encrypted data sent by CO. Meanwhile, it could try to obtain the plaintext versions of the denoised ciphertext images.

$A^{*}$ may compromise CSP to obtain the valuable statistical information associated with the weights.

$A^{*}$ may compromise some non-challenger COs or AUs and strive to infer the plaintext information from the ciphertext data of the challenger CO/AU.

Note that $A^{*}$ cannot cooperate with EP and CSP simultaneously, and is not allowed to compromise the challenger.

Design goals

To defend against the above attack model, we develop a practical secure outsourced image denoising scheme with user authentication, which achieves efficient denoising without compromising the privacy of the related data. Thus, our FARINE should achieve the following goals:

Data privacy. The privacy of image data should be ensured in our FARINE. Besides, it should not leak the distribution of the intermediate computation data during image denoising.

Secure multi-user and multi-key support. To be flexible, our FARINE should be designed to support a multi-user scenario. Furthermore, users should be allowed to have their own private keys for better security.

Verifiable permission. FARINE should have an access permission mechanism to allow COs to have the rights of flexible authorization and time-controlled revocation on AUs. To maintain the authenticity of user authorization/revocation, the system should provide the verification for the relevant permissions.

Proposed FARINE scheme

In this section, we first give the notation definitions. We then elaborate on the proposed FARINE scheme, as shown in Figure 1, which incorporates privacy-preserving image denoising and verifiable user authorization/revocation.

Figure 1.

Framework of the proposed efFicient privAcy-preseRving Image deNoising schEme (FARINE).

Notations

To better present the proposed scheme, we first define some notations as listed in Table 1.

Table 1.

Notation descriptions in FARINE scheme.

Notations	Descriptions
$k$ or $κ$	Master key for the system
$k_{α}$ or $κ_{α}$	Key of participant $α$
$k_{α}^{- 1}$ / $κ_{α}^{- 1}$	Inverse of $k_{α}$ / $κ_{α}$
$E (., .)$	Encryption function for MHE
$D (., .)$	Decryption function for MHE
$φ (., .)$	Key conversion function
$\{v_{i}\}_{1 \leq i \leq M}$	Noisy image set
$v_{t} (i)$	Pixel value at the $i$ -th position of image $v_{t}$
$c_{t} (i)$	Encrypted version of $v_{t} (i)$
$f_{t} (i)$	Filtered version of $c_{t} (i)$
$T / R$	Authorization/revocation certificate
$\bar{T}$ / $\tilde{T}$	Encrypted version of $T$ for CO/AU
$\bar{R}$ / $\tilde{R}$	Encrypted version of $R$ for CO/AU

FARINE: efFicient privAcy-preseRving Image deNoising schEme; MHE: multi-level homomorphic encryption; CO/AU: content owner/authorization user.

Privacy-preserving image denoising

During the privacy-preserving image denoising, the CO only needs to encrypt images and upload the encrypted versions to the EP. On receiving a denoising request, the EP and CSP jointly perform denoising, without compromising the privacy of the related data. Besides, an authorized user obtains the plaintext denoised image with the help of its private key. In this subsection, we present the details of secure image denoising, which consists of five algorithms as follows:

KeyGen : Firstly, the TTP selects and publishes public parameter for MHE. Next, the TTP generates a master key $k$ over $Z_{N}$ , which is kept secret. Whenever a new user (CO or AU) joins the system, as shown in step ①, TTP randomly generates some private keys over $Z_{N}$ for CO, AU, EP, and CSP. Specifically, it sends $k_{C O}$ to the CO, $k_{A U}$ to the AU, $k_{CO}^{'}$ , $k_{E P}$ , $k_{AU}^{'}$ to the EP, and $k_{C S P}$ to the CSP. Meanwhile, these private keys meet the following requirements: $k_{C O} \cdot k_{C O}^{'} = k$ , $k_{C O}^{' - 1} \cdot k_{C O}^{- 1} = k^{- 1}$ , $k_{C S P} \cdot k_{E P} = k$ , $k_{E P}^{- 1} \cdot k_{C S P}^{- 1} = k^{- 1}$ , $k_{A U}^{'} \cdot k_{A U} = k$ and $k_{A U}^{- 1} \cdot k_{A U}^{' - 1} = k^{- 1}$ .

ImEnUp : As shown in step ②, the CO encrypts each image in $\{v_{t}\}_{1 \leq t \leq M}$ , whose noise follows the distribution $N (0, σ^{2})$ , using the key distributed by the TTP. To be specific, given an image $v_{t}$ , its $i$ -th pixel $v_{t} (i)$ $(i \in \{1, 2, \dots, ϱ\})$ is encrypted into $c_{t} (i)$ by

\begin{matrix} c_{t} (i) = E (v_{t} (i), k_{C O}) \\ = k_{C O}^{- 1} \cdot diag (v_{t} (i), x_{1}, \dots, x_{g}) \cdot k_{C O} \end{matrix}

(8)

where $x_{1}, \dots, x_{g}$ can be calculated by the Chinese remainder theorem, and $diag (v_{t} (i), x_{1}, \dots, x_{g})$ is a $(g + 1)$ -dimensional diagonal matrix with entries $v_{t} (i), x_{1}, \dots, x_{g}$ . After encryption, the CO uploads these encrypted images to EP for storage and sharing with the authorized users.

WeightCal : The goal of this step is to securely calculate the weight in equation (2). It is a crucial step towards performing privacy-preserving image denoising. Algorithm 1 briefly gives the computational process of the WeightCal.

Algorithm 1.

Computational process of WeightCal.

Input: Master key $k$ , private keys $k_{C O}^{'}$ , $k_{E P}$ , $k_{C S P}$ , noisy image $c_{t} (1 \leq t \leq M)$ , scale factor $Q$ .

Output: The ciphertext weight set.

1 Let $E (c_{t} (i), k_{CO}^{'})$ be $c_{t}^{'} (i)$ ;

2 Let $N_{t} (γ)$ be the pixels centered at pixel $v_{t} (γ)$ ;

3 EP re-encrypts $c_{t} \to c_{t}^{'}$ ;

4 for $i = 1$ to $ϱ$ , EP

5 Denote $‖ N (i) - N (j) ‖_{2}^{2}$ as $S_{t} (i, j)$ ;

6 While not at the end of set $D$

7 $\sum_{s = 1}^{d^{2}} (c_{t}^{'} (i_{s}) - c_{t}^{'} (j_{s}))^{2} \to E (S_{t} (i, j), k)$ ;

8 $φ (E (S_{t} (i, j), k), k_{E P}^{- 1}) \to E (S_{t} (i, j), k_{C S P})$ ;

9 Send $E (S_{t} (i, j), k_{C S P})$ to CSP;

10 for $i = 1$ to $ϱ$ , CSP

11 While not at the end of set $D$

12 Decrypt $E (S_{t} (i, j), k_{C S P}) \to S_{t} (i, j)$ ;

13 Calculate $⌊ Q \cdot w_{t} (i, j) ⌉ \to W_{t} (i, j)$ ;

14 Encrypt $W_{t} (i, j) \to E (W_{t} (i, j), k_{C S P})$ ;

15 return $E (W_{t}, k_{C S P}) .$

The secure weight calculation is described in detail below.

On receiving a denoising request, the EP first uses its private key $k_{CO}^{'}$ to re-encrypt $c_{t}$ $(1 \leq t \leq M)$ into another ciphertext form $c_{t}^{'}$ , that is

\begin{matrix} c_{t}^{'} (i) = E (c_{t} (i), k_{CO}^{'}) \\ = k_{C O}^{' - 1} k_{C O}^{- 1} \cdot diag (v_{t} (i), x_{1}, \dots, x_{g}) \cdot k_{C O} k_{C O}^{'} \end{matrix}

(9)

Based on the relations between the keys in KeyGen, $c_{t}^{'} (i)$ is also equal to

\begin{matrix} c_{t}^{'} (i) = k^{- 1} \cdot diag (v_{t} (i), x_{1}, \dots, x_{g}) \cdot k \\ = E (v_{t} (i), k) \end{matrix}

(10)

Secondly, as shown in step ③, the EP calculates the squared Euclidean distance between pixel sets with the same number of elements, without learning any plaintext pixel. Specifically, given any image $v_{t}$ , denote $N_{t} (i)$ / $N_{t} (j)$ as the pixels from the neighborhood window centered at pixel $v_{t} (i)$ / $v_{t} (j)$ , where the window size is $d \times d$ . Besides, $c_{t}^{'} (i_{s})$ / $c_{t}^{'} (j_{s})$ indicates the ciphertext of the $s$ -th pixel in $N_{t} (i)$ / $N_{t} (j)$ . Then the EP computes the sum of the squared differences between the corresponding encrypted pixels of blocks $N_{t} (i)$ and $N_{t} (j)$ through the expression $\sum_{s = 1}^{d^{2}} (c_{t}^{'} (i_{s}) - c_{t}^{'} (j_{s}))^{2}$ . Since MHE has the additive and multiplicative homomorphisms, the ciphertext of the squared Euclidean distance between $N_{t} (i)$ and $N_{t} (j)$ is obtained as

\begin{matrix} E (Dist (N_{t} (i), N_{t} (j)), k) \\ = E (\sum_{s = 1}^{d^{2}} {(v_{t} (i_{s}) - v_{t} (j_{s}))}^{2}, k) \\ = \sum_{s = 1}^{d^{2}} {(c_{t}^{'} (i_{s}) - c_{t}^{'} (j_{s}))}^{2} \end{matrix}

(11)

After the calculation of $E (Dist (N_{t} (i), N_{t} (j)), k)$ , the EP uses the private key $k_{E P}$ to further transform it into

\begin{matrix} φ (E (Dist (N_{t} (i), N_{t} (j)), k), k_{E P}^{- 1}) \\ = E (Dist (N_{t} (i), N_{t} (j)), k_{C S P}) \end{matrix}

(12)

where $φ (\cdot, \cdot)$ is the key transformation function, which satisfies the following relation: for any ciphertext $C = E (x, k)$ , we have $φ (C, k^{'}) = E (x, k k^{'})$ . Since $k_{C S P} \cdot k_{E P} = k$ in KeyGen, it follows that $k \cdot k_{E P}^{- 1} = k_{C S P}$ , which yields equation (12). After that, the EP submits all ciphertexts $E (Dist (N_{t} (i), N_{t} (j)), k_{C S P})$ to the CSP.
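The EP-side computation of equations (11) and (12) can be traced on a toy instance. As a loud caveat, the code below does not implement MHE itself: it assumes $2 \times 2$ key matrices over a prime field, a random filler in place of the CRT-derived values, and hypothetical helper names. It only illustrates that the squared distance can be accumulated on ciphertexts under $k$ and then handed to the CSP under $k_{C S P}$.

```python
import random

P = 2**61 - 1  # toy prime modulus standing in for N (assumption)

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def matadd(a, b, sign=1):
    return [[(a[i][j] + sign * b[i][j]) % P for j in range(2)] for i in range(2)]

def inverse(a):
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def random_key(rng):
    while True:
        k = [[rng.randrange(P) for _ in range(2)] for _ in range(2)]
        if (k[0][0] * k[1][1] - k[0][1] * k[1][0]) % P != 0:
            return k

def encrypt(m, k, rng):
    return matmul(matmul(inverse(k), [[m % P, 0], [0, rng.randrange(P)]]), k)

def decrypt(c, k):
    return matmul(matmul(k, c), inverse(k))[0][0]

def convert(c, k2):  # phi(C, k2): E(x, k) -> E(x, k * k2)
    return matmul(matmul(inverse(k2), c), k2)

rng = random.Random(0)
k = random_key(rng)                # master key
k_ep = random_key(rng)             # held by the EP
k_csp = matmul(k, inverse(k_ep))   # chosen so that k_csp * k_ep = k

patch_i = [12, 200, 37, 90]        # plaintext patches, known only to the CO
patch_j = [10, 190, 40, 95]
enc_i = [encrypt(v, k, rng) for v in patch_i]   # c'_t(i_s) under k
enc_j = [encrypt(v, k, rng) for v in patch_j]   # c'_t(j_s) under k

# EP: accumulate sum_s (c'(i_s) - c'(j_s))^2 on ciphertexts, equation (11).
acc = [[0, 0], [0, 0]]
for ci, cj in zip(enc_i, enc_j):
    diff = matadd(ci, cj, sign=-1)
    acc = matadd(acc, matmul(diff, diff))
to_csp = convert(acc, inverse(k_ep))            # equation (12)

# CSP: decrypt with its own key only.
recovered = decrypt(to_csp, k_csp)
expected = sum((a - b) ** 2 for a, b in zip(patch_i, patch_j))
```

Here `recovered` matches the plaintext squared distance, while the EP never handles any pixel value in the clear.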

With $E (Dist (N_{t} (i), N_{t} (j)), k_{C S P})$ , the CSP can readily decrypt it and recover the plaintext distance $Dist (N_{t} (i), N_{t} (j))$ , because the CSP holds the private key $k_{C S P}$ according to the key allocation strategy given in KeyGen. In this case, as shown in step ④, the CSP can calculate the weight value $w_{t} (i, j)$ based on equations (2) and (3). In order to keep $w_{t} (i, j)$ secret, the CSP encrypts $w_{t} (i, j)$ before uploading it to the EP, that is,

\begin{matrix} E (⌊ Q \cdot w_{t} (i, j) ⌉, k_{C S P}) \end{matrix}

(13)

where $Q$ is a scale factor and $⌊ \cdot ⌉$ denotes the rounding operation. The main reason for computing $⌊ Q \cdot w_{t} (i, j) ⌉$ is that the MHE cryptosystem operates on integers. For simplicity, $⌊ Q \cdot w_{t} (i, j) ⌉$ is rewritten as $W_{t} (i, j)$ .
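The CSP-side weight computation can be sketched in a few lines. The function name and the choice of rounding half up are assumptions; the article only states that $⌊ \cdot ⌉$ is a rounding operation.

```python
import math

def scaled_weights(distances, h, Q):
    """Integer weights W(i, j) = round(Q * w(i, j)) for one pixel i.

    `distances` holds the decrypted squared distances Dist(N(i), N(j))
    for all j in the search window D.
    """
    exps = [math.exp(-d / h ** 2) for d in distances]   # numerators of eq. (2)
    z = sum(exps)                                       # Z(i), eq. (3)
    return [math.floor(Q * e / z + 0.5) for e in exps]  # round half up (assumption)
```

For example, four identical distances with `Q = 1000` give four weights of 250 each. In general the integer weights sum to `Q` only up to rounding error, which is one reason a large `Q` is preferable.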

ImDen : As shown in step ⑤, when getting the ciphertext weight $E (W_{t} (i, j), k_{C S P})$ , the EP leverages the key conversion mechanism to re-encrypt it as the ciphertext under the master key $k$ , namely, $E (W_{t} (i, j), k)$ . By combining the ciphertext $c_{t}^{'} (i)$ , the EP can perform the NLM filtering in the encrypted domain, without learning anything about the plaintext content. Concretely, the filtered value $f_{t} (i)$ of the ciphertext pixel $c_{t} (i)$ is computed by

\begin{matrix} f_{t} (i) & = \sum_{j \in D} E (W_{t} (i, j), k) \cdot c_{t}^{'} (j) \end{matrix}

(14)

Based on the homomorphic properties in equations (4) and (5), the above equation is equivalent to

\begin{matrix} f_{t} (i) & = E (\sum_{j \in D} W_{t} (i, j) \cdot v_{t} (j), k) \end{matrix}

(15)

For any noisy image $v_{t}$ ( $t \in \{1, 2, \dots, M\}$ ), the EP computes all $f_{t} (i)$ $(i \in \{1, 2, \dots, ϱ\})$ , and then refreshes them into new ciphertexts $\{f_{t}^{'} (i)\}$ by the following conversion:

\begin{matrix} f_{t}^{'} (i) = E (f_{t} (i), k_{A U}^{' - 1}) \\ = E (\sum_{j \in D} W_{t} (i, j) \cdot v_{t} (j), k_{A U}) \end{matrix}

(16)

Subsequently, the EP obtains a denoised ciphertext image $f_{t}^{'}$ under the key $k_{A U}$ , which is returned to the AU. Although the EP generates the ciphertext image $f_{t}^{'}$ , it cannot decrypt it because the key $k_{A U}$ is kept secret by the AU.

AUDec : As shown in step ⑥, the AU uses its assigned key $k_{A U}$ to decrypt the returned $f_{t}^{'}$ and restore the corresponding plaintext image. Specifically, the pixel $f_{t}^{'} (i)$ is decrypted as

\begin{matrix} {\tilde{v}}_{t} (i) & = D (f_{t}^{'} (i), k_{A U}) = \sum_{j \in D} W_{t} (i, j) \cdot v_{t} (j) \end{matrix}

(17)

Next, the AU scales down ${\tilde{v}}_{t} (i)$ by the scale factor $Q$ used in equation (13), that is,

\begin{matrix} {\bar{v}}_{t} (i) = ⌊ \frac{{\tilde{v}}_{t} (i)}{Q} ⌉ \end{matrix}

(18)

Collecting all ${\bar{v}}_{t} (i)$ , the AU finally obtains the required denoising result of the noisy image $v_{t}$ .
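To see how the scale factor affects the recovered pixel, the following plaintext-only sketch mimics equations (13), (17), and (18) without any encryption. The helper names and the sample distances and pixel values are illustrative assumptions.

```python
import math

def nlm_weights(distances, h):
    """Real-valued weights w(i, j) from equations (2) and (3)."""
    exps = [math.exp(-d / h ** 2) for d in distances]
    z = sum(exps)
    return [e / z for e in exps]

def to_scaled(weights, Q):
    """W(i, j) = round(Q * w(i, j)), as in equation (13); rounds half up."""
    return [math.floor(Q * w + 0.5) for w in weights]

def decode_after_decrypt(scaled, pixels, Q):
    """What the AU obtains: equation (17) followed by equation (18)."""
    v_tilde = sum(W * v for W, v in zip(scaled, pixels))  # integer sum
    return (v_tilde + Q // 2) // Q                         # rounded division by Q

distances = [0.0, 50.0, 200.0]   # illustrative squared patch distances
pixels = [120, 130, 90]          # illustrative 8-bit pixel values
weights = nlm_weights(distances, h=10.0)
plaintext_result = sum(w * v for w, v in zip(weights, pixels))

results = {Q: decode_after_decrypt(to_scaled(weights, Q), pixels, Q)
           for Q in (100, 10**6)}
```

Each rounded weight deviates from $Q \cdot w$ by at most 0.5, so before the final rounding the scaled sum differs from the plaintext NLM value by at most $0.5 \cdot |D| \cdot \max_{j} v (j) / Q$ , which shrinks as $Q$ grows; the final rounding to an integer pixel adds at most 0.5.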

Remark : In the WeightCal stage, the pixel-to-pixel correlations at different locations in an image may be available to the CSP. To keep these correlations private, the EP should perform the following three steps before uploading the ciphertext squared Euclidean distances $\{E (Dist (N_{t} (i), N_{t} (j)), k_{C S P})\}$ for all $i \in \{1, 2, \dots, ϱ\}$ . (1) Pseudo-randomly permute the order of $i$ in the set $\{1, 2, \dots, ϱ\}$ to hide the real position of the current pixel in the image to be denoised. (2) Once the pixel at position $i$ is determined, shuffle the elements of $\{E (Dist (N_{t} (i), N_{t} (j)), k_{C S P})\}$ for $j \in D$ . (3) Introduce dummy distances to keep the real statistical distribution of the weights confidential from the CSP.

Certainly, the EP in FARINE is allowed to randomly select the squared Euclidean distances to upload across different images for the goal of achieving better security.
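The three obfuscation steps in the remark can be sketched as follows. This is only one possible bookkeeping scheme: the function names, the `make_dummy` callback, and the way the EP remembers the permutation are assumptions, and ciphertexts are treated as opaque objects. How dummy distances interact with the normalizing factor $Z (i)$ computed by the CSP is not specified in this section and is not modeled here.

```python
import random

def obfuscate(batches, n_dummy, make_dummy, rng):
    """batches[i][j] is the encrypted distance for pixel i and neighbor j.

    Returns the shuffled batches to upload and a private plan that lets
    the EP map the CSP's answers back to their original positions.
    """
    order = list(range(len(batches)))
    rng.shuffle(order)                                    # step (1): permute pixels
    upload, plan = [], []
    for i in order:
        items = list(enumerate(batches[i]))               # (j, ciphertext) pairs
        items += [(None, make_dummy()) for _ in range(n_dummy)]  # step (3): dummies
        rng.shuffle(items)                                # step (2): shuffle neighbors
        upload.append([c for _, c in items])
        plan.append((i, [j for j, _ in items]))
    return upload, plan

def restore(answers, plan, n_pixels):
    """Undo the permutation and drop the answers that belong to dummies."""
    result = [None] * n_pixels
    for row, (i, slots) in zip(answers, plan):
        real = [None] * sum(j is not None for j in slots)
        for value, j in zip(row, slots):
            if j is not None:
                real[j] = value
        result[i] = real
    return result
```

The plan never leaves the EP, so the CSP sees only batches whose pixel order, neighbor order, and true size are hidden.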

User authorization and revocation

Considering that the authorization flexibility and the revocation controllability are always desired in the real-world applications, our FARINE provides a fine-grained access control mechanism to achieve a dynamic set of users, which is described as follows.

Denote the identity of CO/AU as $i d_{C O} / i d_{A U}$ . It is assigned when the CO or AU joins the system. Let the AU’s access level be $v i p_{A U}$ . Different access levels indicate that an AU may be granted different permissions, such as a different number of denoised images, a different service period, and so on. For example, an AU with a certain level may be allowed to request a total of 500 denoised images within the service time “20191201–20200110,” and can only decrypt the number of images pre-specified by the corresponding level.
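A policy check of this kind might look like the sketch below. The `AccessLevel` record, its field names, and the `may_serve` function are hypothetical; the article does not prescribe how the EP stores or evaluates access levels.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class AccessLevel:
    """Permissions attached to one access level (hypothetical layout)."""
    max_images: int   # total number of denoised images allowed
    start: date       # first day of the service period
    end: date         # last day of the service period

def may_serve(level, served_so_far, requested, today):
    """Return True if the request fits the quota and the service period."""
    in_period = level.start <= today <= level.end
    within_quota = served_so_far + requested <= level.max_images
    return in_period and within_quota

# The example from the text: 500 images within 20191201-20200110.
example_level = AccessLevel(500, date(2019, 12, 1), date(2020, 1, 10))
```

With this level, a request for 10 more images after 490 have been served is accepted inside the period, while an eleventh image or any request after 10 January 2020 is refused.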

Verifiable user authorization : As shown in the upper half of Figure 2, the EP can verify the authenticity of user privilege, which is first authorized and then signed by the CO with a private key. If the verification passes, then the EP provides the corresponding level of denoising service for the AU. Otherwise, the EP refuses service to the AU. The related details are presented below:

TTP first generates a master key $κ$ over $Z_{N}$ , which is used in MHE and is not available to any user. Then, it sends random keys $κ_{C O}$ to the CO, $κ_{A U}$ , ${κ^{'}}_{A U}$ to AU, and $κ_{E P}$ to EP. These private keys over $Z_{N}$ are generated under the conditions that $κ_{C O} \cdot κ_{A U} = κ$ , $κ_{A U}^{- 1} \cdot κ_{C O}^{- 1} = κ^{- 1}$ , $κ_{E P} \cdot κ_{A U}^{'} = κ$ , and $κ_{A U}^{' - 1} \cdot κ_{E P}^{- 1} = κ^{- 1}$ .

When an AU wishes to obtain a specific level of permission from the CO, it sends the request $(i d_{A U}, v i p_{A U})$ to the CO to apply for the corresponding authorization. If the request is accepted, the CO generates a certificate $T$ for the AU as follows:

⟨ c e r = (c n, i d_{C O}, i d_{A U}, v i p_{A U}), S i g ((c e r, r), κ_{C O}) ⟩

where $c n$ is a certificate number for each request, and $r$ is a random number generated by the CO. $S i g ((c e r, r), κ_{C O})$ is used by the CO to sign the authorization $c e r$ with its private key $κ_{C O}$ , and is equivalent to $E ((c e r, r), κ_{C O})$ in FARINE. After that, the certificate $T$ is distributed to the AU.

For the certificate $T$ , the AU first generates a new signature for $c e r$ by the key conversion, namely,

S i g^{*} ((c e r, r), κ_{E P}) = S i g ((c e r, r), κ_{C O} \cdot κ_{A U} \cdot κ_{A U}^{' - 1}) .

Next, the AU replaces $S i g ((c e r, r), κ_{C O})$ in the certificate $T$ with $S i g^{*} ((c e r, r), κ_{E P})$ to create a new certificate $\bar{T}$ . Finally, the AU submits $\bar{T}$ to the EP to apply for the image denoising service. Note that the only major difference between the signatures $S i g$ and $S i g^{*}$ is that they use different private keys for the authorization $c e r$ . Moreover, these signatures are indistinguishable under chosen message attacks because a random number $r$ is involved in the signature generation.

On receiving the service request, the EP verifies the signature $S i g^{*} ((c e r, r), κ_{E P})$ by using its secret key $κ_{E P}$ . If the verification passes, the AU is permitted to use the corresponding denoising service. Otherwise, the EP refuses the requested service.
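The authorization flow above can be traced end to end on a toy instance. This sketch is an assumption-laden illustration, not the FARINE construction: it uses $2 \times 2$ key matrices over a prime field in place of MHE, packs a truncated SHA-256 digest of $c e r$ together with $r$ into one integer, and uses hypothetical names throughout. It provides no real unforgeability.

```python
import hashlib
import random

P = 2**127 - 1  # toy prime modulus standing in for N (assumption)

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def inverse(a):
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def random_key(rng):
    while True:
        k = [[rng.randrange(P) for _ in range(2)] for _ in range(2)]
        if (k[0][0] * k[1][1] - k[0][1] * k[1][0]) % P != 0:
            return k

def encrypt(m, k, rng):
    return matmul(matmul(inverse(k), [[m % P, 0], [0, rng.randrange(P)]]), k)

def decrypt(c, k):
    return matmul(matmul(k, c), inverse(k))[0][0]

def convert(c, k2):  # E(x, k) -> E(x, k * k2)
    return matmul(matmul(inverse(k2), c), k2)

def digest(cer):
    """96-bit digest of the certificate body (hypothetical encoding)."""
    return int.from_bytes(hashlib.sha256(repr(cer).encode()).digest()[:12], "big")

rng = random.Random(7)
# TTP: keys satisfying kappa_co * kappa_au = kappa and kappa_ep * kappa_au_p = kappa.
kappa = random_key(rng)
kappa_co = random_key(rng)
kappa_au = matmul(inverse(kappa_co), kappa)
kappa_au_p = random_key(rng)
kappa_ep = matmul(kappa, inverse(kappa_au_p))

# CO: sign cer = (cn, id_CO, id_AU, vip_AU) together with a random r.
cer = (1001, "co-1", "au-7", 2)
r = rng.randrange(1 << 16)
sig = encrypt((digest(cer) << 16) | r, kappa_co, rng)

# AU: convert the signature to the EP's key without decrypting it.
sig_star = convert(convert(sig, kappa_au), inverse(kappa_au_p))

def ep_verify(cer_received, sig_received):
    """EP: decrypt with kappa_ep and compare against the received cer."""
    recovered = decrypt(sig_received, kappa_ep)
    return (recovered >> 16) == digest(cer_received)
```

In this sketch the EP accepts the converted certificate and rejects one whose access level has been altered, while the CO's signing key $κ_{C O}$ is never disclosed to the EP.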

Remark : Public-key infrastructure (PKI) is indispensable in a traditional digital signature that is existentially unforgeable under adaptive chosen message attacks. However, heavy dependence on PKI readily causes a public-key certificate escrow issue. In our FARINE, the keys for signature generation and verification are both private and are kept secret by the CO and the EP, respectively. In other words, our FARINE can provide a public-key certificate-less signature.

Verifiable user revocation : As shown in the second half of Figure 2, our FARINE is configured with a verifiable user revocation mechanism, which is suitable for the following two cases. Case 1 is that some users may decide to cancel the denoising service temporarily for saving costs. In this case, our scheme allows users to request the service cancelation from the system. Case 2 is that some users seeking self-interest may illegally use and disseminate the denoised images without the corresponding CO’s permission. To avoid this case, the CO is allowed to have the privilege to forcibly revoke the certificates that are authorized to malicious users earlier. Since the solution to the first case is similar to the stage of Verifiable user authorization, we only give the process for the second case. The concrete process is demonstrated in Algorithm 2.

Figure 2.

Verifiable authorization and revocation mechanism.

Algorithm 2.

Verifiable user revocation mechanism.

Input: Master key $κ$ , private keys $κ_{C O}$ , $κ_{E P}$ , $κ_{C S P}$ , $κ_{E P}^{'}$ , revoked user with $i d_{A U}$ , certificate number $c n$

Output: “Success” or “Failure”

1 CO randomly generates a number $r$ and signs the message $r e v = (c n, r e v o k e, i d_{C O}, i d_{A U}) \to S i g ((r e v, r), κ_{C O})$ ;

2 CO sends the revocation certificate $⟨ r e v = (c n, r e v o k e, i d_{C O}, i d_{A U}), S i g ((r e v, r), κ_{C O}) ⟩$ to CSP;

3 CSP re-signs the signature $S i g ((r e v, r), κ_{C O}) \to S i g ((r e v, r), κ_{C O} \cdot κ_{C S P} \cdot (κ_{E P}^{'})^{- 1})$ ;

4 Denote $S i g ((r e v, r), κ_{C O} \cdot κ_{C S P} \cdot (κ_{E P}^{'})^{- 1})$ as $S i g^{*} ((r e v, r), κ_{E P})$ ;

5 CSP sends the new certificate $⟨ r e v = (c n, r e v o k e, i d_{C O}, i d_{A U}), S i g^{*} ((r e v, r), κ_{E P}) ⟩$ to EP;

6 EP decrypts $S i g^{*} ((r e v, r), κ_{E P})$ and obtains the message $\bar{r e v}$ ;

7 Check $\bar{r e v}$ $\overset{?}{=}$ $r e v$ ;

8 If the above equation holds, output “Success”; otherwise, output “Failure”
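The message flow of Algorithm 2 can be traced with a toy model. The sketch below is only an illustration under strong assumptions: keys are scalars modulo a prime instead of the key matrices of the MHE-based construction, the "signature" is a simple multiplicative blinding, and the key relation $κ_{E P}^{'} = κ_{C O} \cdot κ_{C S P} \cdot κ_{E P}^{- 1}$ is assumed so that the re-signed value can be opened with $κ_{E P}$ . All function names and the modulus are hypothetical, and the code offers no security.

```python
import hashlib
import secrets

P = 2**61 - 1  # toy prime modulus (assumption; not a FARINE parameter)

def rand_key() -> int:
    """Random nonzero element mod P, standing in for a secret key matrix."""
    return secrets.randbelow(P - 1) + 1

def digest(rev: tuple) -> int:
    """Map a revocation message to an integer below P."""
    h = hashlib.sha256(repr(rev).encode()).digest()
    return int.from_bytes(h, "big") % P

def sign(rev: tuple, r: int, key: int) -> tuple:
    """Toy signature on (rev, r): blind both components with the key."""
    return (digest(rev) * key % P, r * key % P)

def convert(sig: tuple, factor: int) -> tuple:
    """Key conversion: multiply every component by the conversion factor."""
    return tuple(c * factor % P for c in sig)

def verify(rev: tuple, sig: tuple, key: int) -> str:
    """EP side (steps 6-8): open the signature and compare with rev."""
    rev_bar = sig[0] * pow(key, -1, P) % P
    return "Success" if rev_bar == digest(rev) else "Failure"

def setup() -> tuple:
    """Hypothetical key setup that makes the composite key equal k_ep."""
    k_co, k_csp, k_ep = rand_key(), rand_key(), rand_key()
    k_ep_prime = k_co * k_csp * pow(k_ep, -1, P) % P
    return k_co, k_csp, k_ep, k_ep_prime

def revoke(cn: int, id_co: str, id_au: str, keys: tuple) -> tuple:
    k_co, k_csp, _, k_ep_prime = keys
    r = rand_key()                                   # step 1: fresh random number
    rev = (cn, "revoke", id_co, id_au)
    sig = sign(rev, r, k_co)                         # steps 1-2: CO signs and sends
    factor = k_csp * pow(k_ep_prime, -1, P) % P
    sig_star = convert(sig, factor)                  # steps 3-4: CSP re-signs
    return rev, sig_star                             # step 5: sent to EP

keys = setup()
rev, sig_star = revoke(42, "CO-1", "AU-7", keys)
print(verify(rev, sig_star, keys[2]))
```

In this toy model the EP accepts only when the opened value matches the revocation message it received, which mirrors the check $\bar{r e v} \overset{?}{=} r e v$ in steps 7 and 8.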

Analysis of our FARINE

In this section, we carry out correctness analysis, security analysis, and performance analysis of our FARINE.

Correctness analysis

Because our FARINE uses the NLM technique, the denoised result mainly depends on whether equation (1) is correctly calculated over the encrypted pixels $\{c_{t} (i)\}$ in the noisy image $v_{t}$ . Thanks to the multiplicative homomorphism of MHE, this calculation can be carried out once the ciphertext weights $\{w_{t} (i, j)\}$ are available. Therefore, the weight calculation in the encrypted domain plays a significant role in the accuracy of image denoising. If the weight errors are controlled, the denoised result of our FARINE is arbitrarily close to that in the plaintext domain.
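For reference, the plaintext computation that the encrypted pipeline is meant to reproduce can be sketched as follows. This is a minimal sketch that assumes equations (1) and (2) take the standard NLM form of Buades et al.,⁷ that is, a normalized weighted average over the search window with weights that decay exponentially in the squared patch distance. The function names, the border replication, and the default value of `h` are illustrative assumptions.

```python
import math

def patch(img, y, x, half):
    """Return the (2*half+1)^2 neighborhood of (y, x), replicating border pixels."""
    rows, cols = len(img), len(img[0])
    return [
        img[min(max(y + dy, 0), rows - 1)][min(max(x + dx, 0), cols - 1)]
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    ]

def nlm_pixel(img, y, x, search_half, patch_half, h):
    """Weighted average of the pixels in the search window D around (y, x)."""
    rows, cols = len(img), len(img[0])
    ref = patch(img, y, x, patch_half)
    weighted_sum = 0.0
    norm = 0.0
    for jy in range(max(y - search_half, 0), min(y + search_half, rows - 1) + 1):
        for jx in range(max(x - search_half, 0), min(x + search_half, cols - 1) + 1):
            cand = patch(img, jy, jx, patch_half)
            # Squared Euclidean distance between the two neighborhoods.
            dist2 = sum((a - b) ** 2 for a, b in zip(ref, cand))
            w = math.exp(-dist2 / (h * h))  # unnormalized weight (assumed form)
            weighted_sum += w * img[jy][jx]
            norm += w
    return weighted_sum / norm  # norm >= 1 because the pixel matches itself

def nlm(img, search_half=10, patch_half=2, h=10.0):
    """Denoise a 2D list of gray values; defaults give 21x21 and 5x5 windows."""
    return [
        [nlm_pixel(img, y, x, search_half, patch_half, h) for x in range(len(img[0]))]
        for y in range(len(img))
    ]

noisy = [
    [100, 102, 98, 101, 99],
    [99, 180, 100, 102, 101],
    [101, 100, 97, 103, 100],
    [98, 101, 102, 99, 100],
]
denoised = nlm(noisy, search_half=2, patch_half=1, h=30.0)
```

Each output pixel is a convex combination of input pixels, so the denoised values always stay within the range of the input image.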

Theorem 1

The difference of the denoised results between FARINE and its plaintext version is negligible when the scale factor $Q$ is large enough.

Proof

In the plaintext domain, the weights $\{w_{t} (i, j)\}$ in equation (1) are real numbers, which are unsuitable for our FARINE because it is based on integer calculation. To solve this issue, a quantization operation is introduced to convert the weights to integers. However, quantization inevitably introduces errors, which affect the correctness of the secure weight calculation. To analyze the quantization errors, we equivalently express the quantized weight $W_{t} (i, j)$ as

$$W_{t} (i, j) = Q \cdot w_{t} (i, j) + ε_{j} \qquad (19)$$

where $ε_{j}$ is the quantization error and $| ε_{j} | \leq \frac{1}{2}$ . Based on equation (19), the decrypted denoised result as shown in equation (17) can be written as

$${\bar{v}}_{t} (i) = \sum_{j \in D} w_{t} (i, j) \cdot v_{t} (j) + \frac{\sum_{j \in D} ε_{j} \cdot v_{t} (j)}{Q} \qquad (20)$$

It can be found from equation (20) that the error introduced by the quantization in our FARINE is $\bar{ε} = (\sum_{j \in D} ε_{j} \cdot v_{t} (j)) / Q$ , such that

$$| \bar{ε} | \leq \frac{\frac{1}{2} \times 255 \times s^{2}}{Q}$$

where the size of the search window is set to $s \times s$ . In general, $s$ is far smaller than $Q$ . When $s$ is fixed and $Q \to + \infty$ , the error $\bar{ε} \to 0$ ; namely, the difference of the denoised results between FARINE and its plaintext version is negligible provided that $Q$ is chosen large enough. However, a larger $Q$ incurs higher computation/communication costs, so there is always a tradeoff between the denoising accuracy and the costs in practice.
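The bound in the proof can be checked numerically. The following sketch quantizes a set of plaintext weights by rounding $Q \cdot w_{t} (i, j)$ to the nearest integer, recovers the pixel value by dividing the integer-weighted sum by $Q$ as in equation (20), and compares the resulting error with $\frac{1}{2} \times 255 \times s^{2} / Q$ . The random weights and pixels are synthetic stand-ins, and the helper name is hypothetical.

```python
import random

def quantization_error(weights, pixels, Q):
    """Return (|error|, bound) for one denoised pixel under scale factor Q."""
    # Quantized integer weights W = Q*w + eps with |eps| <= 1/2 (equation (19)).
    W = [round(Q * w) for w in weights]
    exact = sum(w * v for w, v in zip(weights, pixels))
    # Integer-domain result rescaled by Q (equation (20)).
    recovered = sum(Wj * v for Wj, v in zip(W, pixels)) / Q
    bound = 0.5 * 255 * len(pixels) / Q
    return abs(recovered - exact), bound

rng = random.Random(0)
s = 21                                   # search window of s x s pixels
raw = [rng.random() for _ in range(s * s)]
total = sum(raw)
weights = [x / total for x in raw]       # synthetic normalized weights
pixels = [rng.randint(0, 255) for _ in range(s * s)]

results = {Q: quantization_error(weights, pixels, Q) for Q in (2**8, 2**16, 2**24)}
for Q, (err, bound) in results.items():
    print(f"Q=2^{Q.bit_length() - 1}: error={err:.3e}, bound={bound:.3e}")
```

The bound shrinks in proportion to $1 / Q$ , which matches the tradeoff stated above: each additional bit of $Q$ halves the worst-case error at the price of larger integers to encrypt and transmit.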

To guarantee that the interests of COs are not harmed, our FARINE provides valid authentication of user permissions, as stated in the following theorem.

Theorem 2

The user permission is valid under the honest-but-curious model, provided that $\bar{c e r} = c e r$ .

Proof

When receiving a specific denoising request from an AU, the CO, if it permits the request, distributes a corresponding certificate to the AU, which contains the CO’s signature $S i g ((c e r, r), κ_{C O})$ associated with the message $c e r$ . Since the signature is generated with the private key of the CO, it cannot be forged or tampered with by an AU that has no information about the key $κ_{C O}$ . Therefore, the user authorization is correctly verified once the EP confirms that the $\bar{c e r}$ decrypted from the certificate $\bar{T}$ submitted by the AU is equal to $c e r$ . In particular, the EP is unable to generate the signature of the message $c e r$ , because the signature includes a random number $r$ that is private to the CO. In addition, the CO is assumed to be honest under the honest-but-curious model.

Similarly, $\bar{r e v} = r e v$ proves that the revocation request is legitimate and comes from a particular CO.

Security analysis

In this subsection, we first present the security definition, and then state several theorems on the security of FARINE.

Definition 1

We say that a protocol $π$ is secure if there exists a probabilistic polynomial-time simulator $S$ that can generate a view for the adversary $A$ that is computationally indistinguishable from the view of $A$ in the real world.

Theorem 3

Our FARINE scheme is secure to ensure the privacy of the denoised ciphertext image under the honest-but-curious model.

Proof

In our FARINE, the view of the EP is $V i e w_{1} = \{k_{CO}^{'}, k_{E P}, k_{AU}^{'}, c_{t}^{'}, f_{t}\}$ , where $t \in [1, M]$ . According to the homomorphic cryptosystem²⁴ employed in FARINE, the cryptographic keys $k_{CO}^{'}, k_{E P}, k_{AU}^{'}$ are invertible matrices, each of which is randomly chosen from $Z_{N}$ . Besides, this cryptosystem relies heavily on matrix multiplication, so the generated encryption results $c_{t}^{'}$ and $f_{t}$ are uniformly distributed on $Z_{N}$ owing to the randomness of the involved master key $k$ and the introduction of a random number for each encryption. Meanwhile, we can observe that the EP’s $O u t p u t_{1}$ , namely $f_{t}^{'}$ , is also uniformly random, because the denoised ciphertext image $f_{t}^{'}$ is equivalent to the encrypted version of the NLM-processed original image $v_{t}$ . Thus, $V i e w_{1}$ and $O u t p u t_{1}$ are computationally indistinguishable for the adversary $A$ . In other words, it is impossible for $A$ to distinguish the views between the real world and the ideal world without knowing the corresponding private key.

Theorem 4

Our FARINE is secure against the adversary $A^{*}$ defined in the attack model.

Proof

The goal of the adversary $A^{*}$ is to correctly deduce the original plaintext information from the ciphertext data obtained from the communication links, the EP, or the CSP. In our FARINE, the ciphertext data involved in secure image denoising mainly includes encrypted noisy images, encrypted denoised images, and intermediate data (i.e. Euclidean distances and weights). We show how $A^{*}$ , with different capabilities, is prevented from learning the plaintext contents of these data under the attack model.

Case 1: $A^{*}$ is a malicious external adversary that can eavesdrop on all communication links and obtain all transmitted data in ciphertext form. Since $A^{*}$ is not an entity in the model, it knows nothing about the private keys $k_{C O}$ , $k_{A U}$ , $k_{CO}^{'}$ , $k_{E P}$ , $k_{AU}^{'}$ , and $k_{C S P}$ associated with the entities. In this case, it is infeasible for $A^{*}$ to obtain the plaintext versions from the transmitted ciphertext data, as already proved in Theorem 3.

Case 2: $A^{*}$ is assumed to be able to compromise the EP and obtain its private key $k_{CO}^{'}$ , such that it can accurately calculate $c_{t}^{'}$ from the ciphertext image $c_{t}$ . However, it knows nothing about the master key $k$ or the CO’s private key $k_{C O}$ , so $A^{*}$ can neither decrypt the ciphertext data $c_{t}^{'}$ , which requires the master key $k$ , nor restore the plaintext $v_{t}$ from the ciphertext $c_{t}$ , which is encrypted and sent by the CO using its private key $k_{C O}$ . Having compromised the EP, $A^{*}$ can also obtain the denoised ciphertext images $f_{t}$ and $f_{t}^{'}$ , which are ciphertexts under the master key $k$ and the AU’s private key $k_{A U}$ , respectively. Similarly, $A^{*}$ cannot deduce the plaintext denoised image from $f_{t}$ and $f_{t}^{'}$ without the corresponding private keys. Furthermore, the probabilistic encryption technique MHE is leveraged to perform the secure image denoising, where a random number is required for each encryption. In other words, the same plaintext may be encrypted into different ciphertexts. As a result, it is impossible for $A^{*}$ to determine whether two ciphertext noisy images/Euclidean distances/weights come from the same plaintext image/Euclidean distance/weight. This data unlinkability means that $A^{*}$ cannot measure the distribution of these data and therefore cannot deduce the plaintext contents of the data stored in the EP.

Case 3: $A^{*}$ is supposed to compromise the CSP. In this case, $A^{*}$ may obtain the private key $k_{C S P}$ of the CSP and decrypt to obtain the plaintext distance $Dist (N_{t} (i), N_{t} (j))$ and the weight $w_{t} (i, j)$ . It should be noted that the EP carries out the confusion operation and dummy introduction over these data before uploading, so $A^{*}$ cannot get the statistical information of the Euclidean distances/weights. The real Euclidean distance/weight distribution is thus hidden from $A^{*}$ .

Case 4: $A^{*}$ is supposed to compromise a group of non-challenger COs or AUs and obtain their private keys. It should be stressed that the key transfer protocol in our FARINE supports a multi-user scenario with unshared keys, meaning that different users hold their own private keys. Therefore, as long as $A^{*}$ does not hold the private key of the challenger CO/AU, it cannot infer the plaintext information from the ciphertext data that belongs to the challenger CO/AU. The correctness of this conclusion follows from Theorem 3.
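The confusion operation and dummy introduction mentioned in Case 3 can be illustrated with a small sketch. It works on plaintext numbers for readability, whereas in FARINE the distances are encrypted before they reach the CSP. The way dummies are drawn, the weight formula $\exp (- d / h^{2})$ , and all function names are assumptions made for this illustration only.

```python
import math
import random

def ep_obfuscate(distances, num_dummies, rng):
    """EP side: append dummy distances and shuffle; the order stays secret."""
    upper = max(distances) if distances else 1.0
    padded = list(distances) + [rng.uniform(0.0, upper) for _ in range(num_dummies)]
    order = list(range(len(padded)))
    rng.shuffle(order)                      # secret permutation kept by the EP
    shuffled = [padded[k] for k in order]
    return shuffled, order

def csp_weights(shuffled, h):
    """CSP side: compute one weight per received distance (assumed formula)."""
    return [math.exp(-d / (h * h)) for d in shuffled]

def ep_restore(weights, order, num_real):
    """EP side: undo the permutation and drop the dummy weights."""
    restored = [0.0] * len(order)
    for pos, k in enumerate(order):
        restored[k] = weights[pos]
    return restored[:num_real]

rng = random.Random(1)
real_distances = [0.0, 12.5, 3.0, 40.0, 7.25]
shuffled, order = ep_obfuscate(real_distances, num_dummies=4, rng=rng)
weights = ep_restore(csp_weights(shuffled, h=5.0), order, len(real_distances))
```

The CSP only sees the shuffled list, which mixes real and dummy values in an order it cannot relate to pixel positions, while the EP recovers exactly the weights of the real distances.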

Theorem 5

The FARINE scheme can guarantee the multi-user security.

Proof

According to our key conversion mechanism, different users (including COs and AUs) are allowed to have their own keys, which are not shared with each other. Based on Theorems 3 and 4, our FARINE with key independence provides privacy protection for multiple users.

Performance analysis

We conducted experiments on computation overhead, communication and storage costs, and denoising performance. The experiments were performed using Python on a PC running CentOS Linux with an Intel(R) Xeon(R) E5-2680 v2 CPU @1.90 GHz (10 cores, 20 threads) and two 8 GB NVIDIA Tesla GPUs, where four sub-account systems were used to simulate the CO, AU, EP, and CSP, respectively. Following Hu et al.¹⁵ and Zheng et al.,¹⁷ we used two widely adopted indicators to evaluate the denoising performance, namely peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
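As a reminder of how the first indicator is obtained, the sketch below computes PSNR for 8-bit gray images as $10 \log_{10} (255^{2} / \mathrm{MSE})$ and applies it to a synthetic image corrupted with zero-mean Gaussian noise. The flat test image, the clipping to $[0, 255]$ , and the helper names are assumptions for illustration; SSIM is omitted because it needs local window statistics.

```python
import math
import random

def mse(ref, test):
    """Mean squared error between two equally sized 2D lists of gray values."""
    n = len(ref) * len(ref[0])
    return sum((a - b) ** 2 for ra, rb in zip(ref, test) for a, b in zip(ra, rb)) / n

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma, then clip."""
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row] for row in img]

rng = random.Random(0)
clean = [[128.0] * 32 for _ in range(32)]   # synthetic flat image
noisy = add_gaussian_noise(clean, 10.0, rng)
print(f"PSNR of the noisy image: {psnr(clean, noisy):.2f} dB")
```

For $σ = 10$ the MSE is close to $σ^{2} = 100$ , so the printed value lies near 28 dB, which is the baseline that any denoiser has to improve on.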

As pointed out by Xiao et al.,²⁴ the MHE has two important security parameters $λ$ and $m$ , which affect the security level and overhead of our FARINE. As a compromise between security and efficiency, we choose $λ = 512$ and $m = 2$ , following the setting by Zhang et al.³⁰ All experiments use this setting. To compare denoising performance, we used three popular image datasets from the existing related work. Some examples are illustrated in Figure 3.

Datasets :

Standard test images (STI): Thirty-eight typical $256 \times 256$ images are taken as test images. These images are widely used to evaluate the performance of image reconstruction algorithms because of their challenging properties, such as strong variations, edges, and uniform areas.

Berkeley segmentation dataset¹ (BSD): It is a significant data set for image segmentation. It has two types of images, gray-scale and color images, each of which contains $300$ images sized $481 \times 321$ . These images were taken in different intricate scenes with diverse and complex textures.

FEI Face dataset² (FFD): It contains $200$ aligned frontal face images with a size of $260 \times 360$ . All of these face images contain subtle textures and irregular edges.

Computation overheads : We first test the computational overhead of the basic operations involved in our FARINE. The experimental results on STI are listed in Table 2, where the search window and the neighborhood window are $21 \times 21$ and $5 \times 5$ , respectively. More details are analyzed as follows.

Figure 3.

Examples selected from different datasets: top: STI; middle: BSD; and bottom: FFD.

Table 2.

Computation overheads of basic operations (millisecond, ms).

Basic operations	Time	Basic operations	Time
Initialization	2.6	Addition	0.016
Master key	0.7	Multiplication	0.217
Participant key	4.45	Euclidean distance	4.84
Key conversion	0.031	Non-local means (NLM) (only equation (14))	0.795

The KeyGen algorithm mainly consists of system initialization and key generation/distribution. As shown in Table 2, the total time for initializing our system is $2.6$ ms, which is spent on selecting the MHE’s system parameters. In the key generation/distribution step, the master key and the private keys for participants are required, and cost much less than 5 ms in total under the single-CO and single-AU scenario.

In ImEnUp, the CO only needs to encrypt images for privacy protection before uploading. Here, we characterize the distribution of the encryption cost by using the cumulative distribution function (CDF), where the images in the STI set are tested. The result is illustrated in Figure 4(a). It can be seen that the average encryption time of an image is about 3.5 s. With existing mature parallel techniques, it can readily be reduced to the millisecond level owing to the independence of pixel encryption. Meanwhile, it is evident in Figure 4(a) that decryption is cheaper and requires only about 2.4 s on average for each image in the AUDec stage, since the encryption process needs some additional computations for the Chinese remainder theorem in MHE.

The computation cost of the WeightCal algorithm is relatively large. The major cause is that 441 pixels in the search window are involved in calculating the Euclidean distances for each pixel, so that each image in STI takes about 426.22 s on average for the calculation of these Euclidean distances. Thanks to the independence of the pixel weight calculation in NLM, this overhead can be greatly decreased in a parallel environment. By comparison, the CSP calculates the weights in less time, about 12.27 s per image on average. Based on Table 2, the ImDen algorithm, which performs the NLM operation, is estimated to consume about 52.95 s on average for an image, which includes 6.47 s of key conversion overhead.

Figure 4.

Performance evaluation of different operations: (a) CDF of the image-wise encryption time in STI; and (b) running time of user authorization and revocation for different number of AUs.

Then, we compare the running time of user authorization and revocation. As presented in Figure 4(b), the overheads of both authorization and revocation increase linearly with the number of AUs. They have similar running times because the same number of key conversions is spent on them, but the revocation frequency is far lower than that of authorization in real-world conditions. Specifically, it takes around $4.9 \times 10^{- 5}$ s to generate the authorization/revocation certificate for an AU. As for the user verification process, the EP spends just one decryption, about $3.7 \times 10^{- 5}$ s, on an authorization/revocation certificate.

Communication and storage costs : Before analyzing the two types of overhead theoretically, let the bit-length of the ciphertext in MHE, the maximum value of the plaintext, and the message $c e r$ / $r e v$ in user authentication be $| C |, | P |$ , and $| V | / | R |$ , respectively. The theoretical results associated with the different algorithms in FARINE are shown in Table 3. Specifically, the TTP is responsible for generating one master key $k$ and several private keys for our system, where each key has size $ℓ \times ℓ$ . It is worth noting that the master key is owned by the TTP and is not transmitted, so the storage costs $ℓ^{2} | C |$ bits more than the communication in the KeyGen step. A further cost of this kind arises when another master key $κ$ is allocated to the user authorization/revocation process for better security. When a CO wants to share $t$ noisy images of size $M \times N$ with a specific AU, ImEnUp requires a total of $t M N ℓ^{2} | C |$ bits to represent the ciphertext versions of the $t$ images, which are then uploaded to the EP. To save cost and for convenience, the CO may delete the local images, meaning that no storage cost for images is needed on the CO side. The WeightCal process is more complicated. In this step, the EP first employs the key conversion mechanism to re-encrypt the images encrypted by the CO, and then calculates $s^{2}$ Euclidean distances for each pixel, given that the search window is set to $s \times s$ . Accordingly, $t M N s^{2} ℓ^{2} | C |$ bits are consumed to upload these distances without saving them locally. To hide the weight distribution from the CSP, $η$ dummy distances are constructed, which results in additional communication costs of $η ℓ^{2} | C |$ . After that, the CSP calculates the weight values based on equation (2) and returns them, which again incurs $(t M N s^{2} + η) ℓ^{2} | C |$ communication costs.
Since these weights are calculated in plaintext form, the CSP needs to store them using $(t M N s^{2} + η) | P |$ bits. Correspondingly, the weight ciphertexts returned by the CSP take up $t M N ℓ^{2} | C |$ bits of EP disk space, where the dummy weights are deleted. The MHE’s homomorphism preserves the size of the ciphertext image before and after the denoising operation. In the ImDen process, therefore, the communication and storage costs are the same as the size of the ciphertext images in ImEnUp, namely, $t M N ℓ^{2} | C |$ bits. Finally, the AU performs the AUDec algorithm and recovers the plaintext denoised images with $t M N | P |$ bits, as listed in Table 3, where no communication occurs.

Table 3.

Communication and storage costs (bits) in different algorithms.

Algorithms	Communication costs	Storage costs
KeyGen	$γ ℓ^{2} | C |$	$(γ + 1) ℓ^{2} | C |$
ImEnUp	$t M N ℓ^{2} | C |$	$0$
WeightCal	$2 (t M N s^{2} + η) ℓ^{2} | C |$	$t M N (s^{2} | P | + ℓ^{2} | C |) + η | P |$
ImDen	$t M N ℓ^{2} | C |$	$t M N ℓ^{2} | C |$
AUDec	$0$	$t M N | P |$
UserAut	$ξ | V | + 2 ξ (ℓ^{2} | C | + | V |)$	$2 ξ (ℓ^{2} | C | + | V |)$
UserRev	$2 μ (ℓ^{2} | C | + | R |)$	$2 μ (ℓ^{2} | C | + | R |)$

“ $γ$ ”: the number of private keys; “ $t$ ”: the number of images shared by CO; “ $U s e r A u t$ ”: verifiable user authorization; and “ $U s e r R e v$ ”: verifiable user revocation.

As shown in the last two rows of Table 3, we also give the theoretical analysis of the user authorization/revocation process. Specifically, $ξ$ AUs each send a denoising request of $| V |$ bits to the CO. Accordingly, $ξ$ authorization certificates are returned, one per AU, each of which includes one $| V |$ -bit message $c e r$ and its $ℓ^{2} | C |$ -bit signature. When obtaining the certificates, the EP verifies the validity of the corresponding signatures, where the certificates are not required to be stored. Based on the above analysis, the communication and storage costs in the $U s e r A u t$ process are $ξ | V | + 2 ξ (ℓ^{2} | C | + | V |)$ and $2 ξ (ℓ^{2} | C | + | V |)$ , respectively. Similarly, it is easy to infer that the privileges of $μ$ malicious AUs are revoked at the expense of $2 μ (ℓ^{2} | C | + | R |)$ -bit communication and storage costs in $U s e r R e v$ . Different from $U s e r A u t$ , $U s e r R e v$ does not contain the certificate application operation.
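The entries of Table 3 can be evaluated for concrete parameters with a short helper. The sketch below simply transcribes the formulas of Table 3; the class name, field names, and the parameter values in the usage example are hypothetical and are not the settings used in the experiments.

```python
from dataclasses import dataclass

@dataclass
class CostParams:
    C: int      # |C|: bit-length of an MHE ciphertext
    P: int      # |P|: bit-length of the maximum plaintext value
    V: int      # |V|: bit-length of the message cer
    R: int      # |R|: bit-length of the message rev
    ell: int    # key dimension (keys are ell x ell)
    gamma: int  # number of private keys
    t: int      # number of images shared by the CO
    M: int      # image height
    N: int      # image width
    s: int      # search window side length
    eta: int    # number of dummy distances
    xi: int     # number of AUs requesting authorization
    mu: int     # number of revoked AUs

def table3_costs(p: CostParams) -> dict:
    """Return {algorithm: (communication bits, storage bits)} as in Table 3."""
    unit = p.ell ** 2 * p.C          # one ell x ell ciphertext block
    pixels = p.t * p.M * p.N
    return {
        "KeyGen": (p.gamma * unit, (p.gamma + 1) * unit),
        "ImEnUp": (pixels * unit, 0),
        "WeightCal": (
            2 * (pixels * p.s ** 2 + p.eta) * unit,
            pixels * (p.s ** 2 * p.P + unit) + p.eta * p.P,
        ),
        "ImDen": (pixels * unit, pixels * unit),
        "AUDec": (0, pixels * p.P),
        "UserAut": (p.xi * p.V + 2 * p.xi * (unit + p.V), 2 * p.xi * (unit + p.V)),
        "UserRev": (2 * p.mu * (unit + p.R), 2 * p.mu * (unit + p.R)),
    }

# Hypothetical small parameters, chosen only to make the numbers easy to check.
example = CostParams(C=10, P=8, V=3, R=4, ell=2, gamma=4,
                     t=1, M=2, N=2, s=3, eta=5, xi=2, mu=1)
for name, (comm, store) in table3_costs(example).items():
    print(f"{name:10s} communication={comm:6d} bits  storage={store:6d} bits")
```

Evaluating the formulas this way makes the dominant term visible: the WeightCal communication grows with $s^{2}$ per pixel, whereas every other image-related entry grows only with the number of pixels.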

Denoising performance : To verify the accuracy of our FARINE, we conduct comparisons with other state-of-the-art privacy-preserving NLM-based schemes, including Hu et al.¹⁵ and Zheng et al.¹⁷ To be fair, the comparison experiments with the above schemes were performed under the same conditions, that is, following the settings of the corresponding schemes, such as datasets and parameters. The related experiments are described as follows.

Our first experiment employs the STI dataset used by Hu et al.¹⁵ This experiment is completed in two stages. In the first stage, we randomly select five gray images from STI as examples, namely Lena, House, Goldhill, Barbara, and Peppers, shown in the first row of Figure 3 from left to right and labeled as $I_{1}$ , $I_{2}$ , $I_{3}$ , $I_{4}$ , and $I_{5}$ , respectively. Following Hu et al.,¹⁵ we set the size of the neighborhood window to $5 \times 5$ , the search window to the whole image, $h$ in equation (2) to $0.75$ , and the projection dimension by Hu et al.¹⁵ to $18$ . Besides, the noisy images are simulated by adding Gaussian white noise with standard deviation $σ$ to the five images. The comparison results for $σ$ values of $10$ , $30$ , and $50$ are listed in Table 4, where “Clear” indicates that NLM is carried out in the plaintext domain. It can be seen that our FARINE significantly outperforms the scheme¹⁵ in PSNR, and achieves denoised results closer to those of the plaintext domain. Further, it can be observed that our method has better SSIM results than the scheme.¹⁵ For example, the denoised result of image House is comparable to the plaintext version when $σ = 10$ . Compared with the method,¹⁵ it improves the PSNR by $4.2\%$ and the SSIM by $5.3\%$ , respectively. In the second stage, we used the whole STI, namely $38$ gray images, as test images, and computed the average values of PSNR and SSIM. Table 5 provides the average PSNR/SSIM performance comparison between our FARINE and the method.¹⁵ As observed, our FARINE clearly outperforms the method¹⁵ by $3.5\%$ on average in terms of PSNR and SSIM when $σ = 10$ . This experiment demonstrates the effectiveness of our FARINE, and shows that our scheme has good denoising performance in the encrypted domain for images that pose a challenge to image reconstruction algorithms.

Although both our FARINE and the method¹⁵ are based on the NLM filter technique, FARINE still achieves better denoising results. This is mainly attributed to the absence of the JL transformation in our FARINE. In Hu et al.,¹⁵ the JL technique is employed to encrypt the weight values for the goal of secure denoising. Nevertheless, the JL-based private projection changes the original weights due to the dimension reduction, and thus lowers the final image denoising effect to some extent. Besides, the introduction of JL leads to extra computation costs and inconvenience for content owners. These problems caused by JL are avoided in FARINE, where JL is not needed.

Table 4.

Comparisons of PSNR and SSIM results for different secure denoising methods on five test images.

$I$	Method	$σ = 10$		$σ = 30$		$σ = 50$
		PSNR	SSIM	PSNR	SSIM	PSNR	SSIM
$I_{1}$	Clear	32.37	0.91	25.31	0.61	22.19	0.41
	Hu	31.79	0.87	24.87	0.48	21.63	0.40
	Ours	31.90	0.88	24.97	0.51	21.80	0.41
$I_{2}$	Clear	33.76	0.92	26.52	0.57	22.50	0.43
	Hu	31.91	0.86	25.18	0.51	20.01	0.38
	Ours	33.25	0.90	25.59	0.54	22.18	0.42
$I_{3}$	Clear	32.01	0.91	23.69	0.51	21.60	0.41
	Hu	29.90	0.82	23.04	0.48	20.83	0.38
	Ours	31.62	0.88	23.43	0.50	20.98	0.39
$I_{4}$	Clear	31.66	0.92	24.23	0.53	20.41	0.39
	Hu	29.86	0.82	22.72	0.46	20.11	0.38
	Ours	31.20	0.89	23.10	0.49	20.21	0.38
$I_{5}$	Clear	32.93	0.89	25.45	0.61	21.40	0.47
	Hu	31.69	0.83	24.17	0.50	21.13	0.41
	Ours	32.59	0.88	24.91	0.52	21.19	0.42

PSNR: peak signal-to-noise ratio; SSIM: structural similarity.

Table 5.

Average denoising performance comparison by different $σ$ in terms of PSNR and SSIM.

	$σ = 10$		$σ = 30$		$σ = 50$
Method	PSNR	SSIM	PSNR	SSIM	PSNR	SSIM
Clear	32.47	0.90	24.03	0.58	20.79	0.44
Hu	31.09	0.85	23.26	0.52	20.43	0.41
Ours	32.18	0.88	23.63	0.55	20.58	0.42

PSNR: peak signal-to-noise ratio; SSIM: structural similarity.
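The relative gains quoted for Table 5 can be reproduced from its entries. The short sketch below computes the relative improvement of “Ours” over “Hu” at $σ = 10$ as $(\text{ours} - \text{baseline}) / \text{baseline}$ ; the helper name is hypothetical and the numbers are copied from Table 5.

```python
def relative_gain(ours: float, baseline: float) -> float:
    """Relative improvement of `ours` over `baseline`, in percent."""
    return 100.0 * (ours - baseline) / baseline

# (Ours, Hu) entries of Table 5 at sigma = 10.
table5_sigma10 = {"PSNR": (32.18, 31.09), "SSIM": (0.88, 0.85)}
gains = {metric: relative_gain(*pair) for metric, pair in table5_sigma10.items()}
for metric, gain in gains.items():
    print(f"{metric}: {gain:.1f}%")   # both metrics print 3.5%
```

Because the tabulated SSIM values are rounded to two decimals, gains recomputed from the tables can differ slightly from percentages derived from unrounded measurements.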

To further validate the effectiveness of FARINE, we compare it with the external database-based secure image denoising scheme.¹⁷ To make the comparison more convincing, we follow the parameter setting by Zheng et al.¹⁷ Specifically, the neighborhood window size, the search window, and $h$ in equation (2) are set to $9 \times 9$ , the whole image, and $0.35$ , respectively. Similar to the first experiment, zero-mean Gaussian noise with standard deviation $σ$ is adopted to construct noisy images, where $σ$ ranges from $10$ to $40$ with a step of $10$ . To be fair, we use the same two datasets (BSD and FFD) as the method¹⁷ to evaluate FARINE. First, we choose four images from BSD for testing, which are also chosen by Zheng et al.¹⁷ and shown in the second row of Figure 3. The comparison results are shown in Figure 5. It can be observed that our FARINE consistently outperforms the method.¹⁷ Specifically, the denoised quality of FARINE improves with gains of $6.24\%$ , $4.50\%$ , $3.29\%$ , and $4.49\%$ in PSNR for the four $σ$ values 10, 20, 30, and 40, respectively. Meanwhile, the corresponding SSIM values are improved by $5.05\%$ , $9.52\%$ , $4.59\%$ , and $7.97\%$ . However, we observe from Figure 5 that FARINE still has a slight gap with the denoising performance in the plaintext domain. The reason is that the scale factor $Q$ is introduced in the weight calculation, which affects the denoising accuracy. This gap can be further reduced by increasing $Q$ , as analyzed in Theorem 1. Following Zheng et al.,¹⁷ we randomly select $100$ images from BSD as the test images. This selection is repeated 10 times, and the average PSNR/SSIM value is calculated and given in Figure 6.

The performance of FARINE in image denoising is comparable to the result in the plaintext domain, and FARINE outperforms the method¹⁷ by margins of $2.65\% / 3.17\%$ , $3.34\% / 4.43\%$ , $3.55\% / 6.72\%$ , and $3.60\% / 4.77\%$ in terms of PSNR/SSIM for $σ$ of $10$ , $20$ , $30$ , and $40$ , respectively. This shows that FARINE can obtain more effective similar image patches by restricting the search window to the current image, mainly because the patches within the same image have strong local self-similarity. Based on an external cloud dataset, the method¹⁷ may find a patch with a smaller Euclidean distance than FARINE does, owing to the availability of more candidate patches. However, a smaller distance does not imply greater similarity: when the Euclidean distance is used to measure the similarity among image patches, the matched patch may have completely different content from the given patch. High-quality similar patches are therefore more reliably found within the same image. This conclusion is verified again on another dataset, FFD.

Figure 5.

Quantitative comparison for the images shown in the second row of Figure 3 from left to right: top: PSNR and bottom: SSIM. PSNR: peak signal-to-noise ratio; SSIM: structural similarity.

Figure 6.

Average quantitative comparison on BSD: (a) PSNR and (b) SSIM.

Following Zheng et al.,¹⁷ $100$ face images are randomly selected from the FFD dataset to conduct the comparison. To achieve a reliable and stable result, the selection of the testing set is executed ten times, and the average PSNR and SSIM are used for evaluation, where the experimental parameters are the same as those used for BSD above. The result is presented in Figure 7. It can be found that our FARINE continues to outperform the method¹⁷ in both PSNR and SSIM.

Figure 7.

Average quantitative comparison on FFD: (a) PSNR and (b) SSIM.

Related work

As a challenging and open problem that is a typical ill-posed inverse problem, image denoising has attracted considerable attention for years.^34–36 Image priors are viewed as the core component of image denoising, according to the Bayesian theory.³⁷ A representative methodology of image-prior-based denoising approaches is to employ non-local self-similarity (NSS) to suppress noise. The seminal work of this methodology was proposed by Buades et al.,⁷ and is known as NLM-based image denoising. Its excellent performance depends on the fact that some similar patches can always be found for a given patch within a natural image. At present, the NSS prior has been widely used in state-of-the-art patch-based image denoising approaches and has achieved great success.^8,9,38 However, it is very difficult to find semantically similar patches for all patches of interest within an image by using the Euclidean distance in the NSS-based methods. Recently, owing to the rapid progress in deep learning, some CNN-based methods have been successively developed and achieve better denoising performance than the traditional NSS-based methods.³⁷

In recent years, a few secure denoising methods for encrypted images have also been proposed. SaghaianNejadEsfahani et al.³¹ utilized the secret sharing technique to propose a secure wavelet denoising scheme. In this scheme, some interactive protocols are designed to perform the privacy-preserving normalization of the threshold value after each multiplication, which increases the computing and communication costs during image denoising. To solve this problem, Pedrouzo-Ulloa et al.³² introduced a lattice cipher to achieve the homomorphic computing of single-round filtering and threshold operation, where it is not necessary to interact with the key owner. However, the lattice-based cryptosystems are not efficient in the NLM-based image denoising algorithm. Different from the above methods based on the wavelet domain, several attempts have also been made to perform image denoising in the spatial domain.^15,16 Hu et al.¹⁵ first attempted to design a double-cipher mechanism to achieve privacy-preserving denoising at the pixel level, where both the Paillier cryptosystem³⁹ and the JL transformation were applied to carry out secure NLM denoising over encrypted images. However, the introduction of Paillier cryptography readily results in huge cipher expansion and high computational complexity. Besides, the image owner needs to execute the JL transformation operation himself/herself in addition to encrypting images. It not only causes users inconvenience but also affects the denoising accuracy. Hu et al.¹⁶ further investigated a random NLM denoising algorithm based on two servers. Compared to their previous solution,¹⁵ it reduces cipher expansion and computational costs to a certain extent, but the problem of the JL transformation on the user side is not solved. Furthermore, the image owner is required to interact with two cloud servers during the denoising process, increasing the owner’s communication costs.

In order to achieve better denoising performance, Zheng et al.¹⁷ employed an external cloud database to assist privacy-preserving image denoising, where the computational complexity/communication cost is dramatically increased due to the patch-wise comparison mechanism for finding similar and high-quality image patches from a cloud database. A similar issue arises in Zheng et al.³³ Zheng et al.^18,19 advocated privacy-preserving deep neural networks as a feasible solution to secure image denoising. However, the former relies heavily on the Paillier cryptosystem, with huge computational overhead and data expansion, while the additive secret sharing technique used in the latter requires computing synchronization and multiple rounds of interaction between the two cloud servers.

To the best of our knowledge, the existing related schemes are all based on cloud computing and do not consider the tradeoff between privacy and user experience. Moreover, none of them discusses how to manage the privileges of users at different levels. We therefore propose a new framework to address these problems. Table 6 compares our scheme with the related schemes in terms of functionality.

Table 6. Comprehensive comparison with the related schemes.

Function/algorithm	SaghaianNejadEsfahani et al.³¹	Pedrouzo-Ulloa et al.³²	Hu et al.¹⁵	Hu et al.¹⁶	Zheng et al.¹⁷	Zheng et al.³³	Zheng et al.¹⁸	Zheng et al.¹⁹	Ours
Method	SS	LC	PC	PC	GC+PC	SE	PC	GC+ASS	MHE
User-side non-interaction	No	Full	No	No	Full	Full	Part	Full	Full
Communication round (user)	1	1	1	O(1)	1	2	1	1	1
Unshared key	×	×	×	×	×	×	×	×	✓
Denoising type	DWT	DWT	NLM	NLM	NLM	NLM	DNN	DNN	NLM
Data storage server	One	One	One	Two	One	One	One	Two	One
Bandwidth consumption	High	Medium	High	Medium	High	High	High	High	Low
Multi-owner multi-user	×	×	×	×	×	×	×	✓	✓
Verification type	×	×	×	×	AC	TA	×	×	ARC
Number of servers	Zero	One	One	Two	Two	One	One	Two	Two
Server type	No	Cloud	Cloud	Cloud	Cloud	Cloud	Cloud	Cloud	Edge

SS: secret sharing; LC: lattice cipher; PC: Paillier cryptosystem; GC: garbled circuit; SE: symmetric encryption; ASS: additive secret sharing; MHE: multi-level homomorphic encryption; DWT: discrete wavelet transform; NLM: non-local means; DNN: deep neural network; AC: authorization check; TA: token authorization; and ARC: authorization and revocation check.

Conclusions

In this article, we have investigated the problem of privacy-preserving image denoising in an outsourcing environment. Based on the MHE technique, we proposed FARINE, a secure outsourced denoising scheme that removes noise from encrypted images. Extensive experiments on the STI, BSD, and FFD benchmarks show that FARINE significantly outperforms the other related NLM-based approaches in terms of average PSNR and SSIM. In addition, FARINE is equipped with verifiable access control for user authorization and revocation, and it supports multi-user and multi-key settings.

Our future work includes developing a secure blind denoising method for real-world images and designing more effective strategies to speed up denoising over large outsourced image databases.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the National Natural Science Foundation of China under Grant 62172098; in part by the Natural Science Foundation of Fujian Province under Grant 2020J01497; and in part by the Education Research Project for Young and Middle-Aged Teachers of the Education Department of Fujian Province under Grant JAT200064 and Grant JAT190020.

ORCID iDs

Yongliang Xu

Fei Chen

Notes

References

1. Zhang, et al. ViTAEv2: vision transformer advanced by exploring inductive bias for image recognition and beyond. Int J Comput Vis 2023; 131: 1141–1162.

2. Xie, Shao, Chen, et al. Cross-modality double bidirectional interaction and fusion network for RGB-T salient object detection. IEEE Trans Circuits Syst Video Technol 2023; 33: 4149–4163.

3. Wang, Zhang, et al. Delving deeper into pixel prior for box-supervised semantic segmentation. IEEE Trans Image Process 2022; 31: 1406–1417.

4. Yoon, Fuentes, et al. A comprehensive survey of image augmentation techniques for deep learning. Pattern Recognit 2023; 137: 109347.

5. Dhar, Dey, Borra, et al. Challenges of deep learning in medical image analysis—improving explainability and trust. IEEE Trans Technol Soc 2023; 4: 68–75.

6. Chen, et al. Learning to see in the dark. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp.3291–3300.

7. Buades, Coll, Morel. A non-local algorithm for image denoising. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), volume 2. IEEE, pp.60–65.

8. Hasan, El-Sakka. Improved BM3D image denoising using SSIM-optimized Wiener filter. EURASIP J Image Video Process 2018; 2018: 25.

9. Xie, Meng, et al. Weighted nuclear norm minimization and its applications to low level vision. Int J Comput Vis 2017; 121: 183–208.

10. Wan, Shi, Liu. STN: stochastic triplet neighboring approach to self-supervised denoising from limited noisy images. In 2023 International Conference on Multimedia Modeling. Springer, pp.109–120.

11. Pang, Zheng, Quan, et al. Recorrupted-to-recorrupted: unsupervised deep learning for image denoising. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp.2043–2052.

12. Bonomi. Connected vehicles, the internet of things, and fog computing. In the 2011 eighth ACM international workshop on vehicular inter-networking (VANET), Las Vegas, USA. pp.13–15.

13. Liu, Peng, Shou, et al. Toward edge intelligence: multiaccess edge computing for 5G and Internet of Things. IEEE Internet Things J 2020; 7: 6722–6747.

14. Edge computing market worth $61.14 billion by 2028 CAGR: 38.4%. Grand View Research, May 2021; https://www.grandviewresearch.com/press-release/global-edge-computing-market.

15. Hu, Zhang, et al. Secure nonlocal denoising in outsourced images. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 2016; 12: 1–23.

16. Hu, Zhang, et al. Secure image denoising over two clouds. In 2017 International Conference on Image and Graphics. Springer, pp.471–482.

17. Zheng, Cui, Wang, et al. Privacy-preserving image denoising from external cloud databases. IEEE Trans Inform Foren Secur 2017; 12: 1285–1298.

18. Zheng, Wang, Zhou. Toward secure image denoising: a machine learning based realization. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp.6936–6940.

19. Zheng, Duan, Tang, et al. Denoising in the dark: privacy-preserving deep neural network based image denoising. IEEE Trans Dependable Secure Comput 2021; 18: 1261–1275.

20. Mastorakis, Mtibaa, Lee, et al. ICedge: when edge computing meets information-centric networking. IEEE Internet Things J 2020; 7: 4203–4217.

21. Elad, Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 2006; 15: 3736–3745.

22. Baselice, Ferraioli, Pascazio, et al. Denoising of MR images using Kolmogorov-Smirnov distance in a non-local framework. Magn Reson Imaging 2019; 57: 176–193.

23. Adabi, Ghavami, Fatemi, et al. Non-local based denoising framework for in vivo contrast-free ultrasound microvessel imaging. Sensors 2019; 19: 245.

24. Xiao, Bastani, Yen. An efficient homomorphic encryption protocol for multi-user systems. IACR Cryptology ePrint Archive 2012; 2012: 193.

25. Liu, Deng, Choo KKR, et al. Privacy-preserving outsourced support vector machine design for secure drug discovery. IEEE Trans Cloud Comput 2020; 8: 610–622.

26. Liu, Deng, Choo KKR, et al. Privacy-preserving reinforcement learning design for patient-centric dynamic treatment regimes. IEEE Trans Emerg Top Comput 2021; 9: 456–470.

27. Yang, Deng, Liu, et al. Privacy-preserving medical treatment system through nondeterministic finite automata. IEEE Trans Cloud Comput 2022; 10: 2020–2037.

28. Cheng, Wang, Liu, et al. Person re-identification over encrypted outsourced surveillance videos. IEEE Trans Dependable Secure Comput 2021; 18: 1456–1473.

29. Cheng, Liu, Wang, et al. SecureAD: a secure video anomaly detection framework on convolutional neural network in edge computing environment. IEEE Trans Cloud Comput 2022; 10: 1413–1427.

30. Zhang, Jung, Liu, et al. PIC: enable large-scale privacy preserving content-based image search on cloud. IEEE Trans Parallel Distrib Syst 2017; 28: 3258–3271.

31. SaghaianNejadEsfahani, Luo, Sen-ching. Privacy protected image denoising with secret shares. In 2012 19th IEEE International Conference on Image Processing. IEEE, pp.253–256.

32. Pedrouzo-Ulloa, Troncoso-Pastoriza, Pérez-González. Image denoising in the encrypted domain. In 2016 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, pp.1–6.

33. Zheng, Zhou, Cao, et al. PPOIM: privacy-preserving shape context based image denoising and matching with efficient outsourcing. In 2018 International Conference on Information and Communications Security. Springer, pp.215–231.

34. El Helou, Süsstrunk. Blind universal Bayesian image denoising with Gaussian noise level learning. IEEE Trans Image Process 2020; 29: 4885–4897.

35. Liu, Zhang, Mou. Image denoising based on correlation adaptive sparse modeling. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp.2060–2064.

36. Cheng, Wang, Huang, et al. NBNet: noise basis learning for image denoising with subspace projection. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp.4896–4906.

37. Hou, Liu, et al. NLH: a blind pixel-level non-local method for real-world image denoising. IEEE Trans Image Process 2020; 29: 5121–5135.

38. Dabov, Foi, Katkovnik, et al. Image restoration by sparse 3D transform-domain collaborative filtering. In 2008 Image Processing: Algorithms and Systems VI, volume 6812. International Society for Optics and Photonics, p.681207.

39. Paillier. Public-key cryptosystems based on composite degree residuosity classes. In 1999 International conference on the theory and applications of cryptographic techniques. Springer, pp.223–238.

Edge-based secure image denoising scheme supporting flexible user authorization
