Abstract

In this paper, a novel semi-supervised fuzzy clustering algorithm, MFM-SFCM, based on a membership fusion mechanism is proposed for Diffusion-weighted imaging (DWI) brain infarction lesion segmentation. The proposed MFM-SFCM algorithm addresses the issue of weakened constraints and insufficient influence of labeled samples on the clustering process that arises in the semi-supervised fuzzy C-means clustering (SFCM) when emphasizing supervised information. By using a new membership fusion mechanism, MFM-SFCM eliminates this issue, greatly improving the accuracy of clustering results and accelerating convergence speed. This allows fuzzy clustering to achieve good results in the segmentation of DWI brain infarction lesions using a small amount of labeled information. The effectiveness of the MFM-SFCM algorithm is demonstrated through experiments conducted on a real-world dataset of DWI brain images.

Keywords

Semi-supervised clustering, supervised information, FCM, membership fusion mechanism, medical image segmentation

1 Introduction

Traditional clustering is an unsupervised machine learning method that divides samples in a dataset into several subsets based on the principle of similarity. Clustering can effectively discover hidden structures and patterns in data, providing valuable information for data analysis, mining, and understanding. Clustering is a common image segmentation technique that can group pixels with similar features or attributes into the same category, enabling recognition and extraction of image content or targets. However, image segmentation is a very complex and challenging problem because image data typically has high dimensionality, noise, redundancy, and low contrast, resulting in unclear or discontinuous boundaries between different categories in the image [1].

To address these issues, traditional fuzzy clustering algorithms have been proposed [2]. They are unsupervised clustering algorithms based on the concept of soft partitioning, where each sample belongs to multiple subsets with different memberships, better reflecting the potential fuzziness and uncertainty in the data. Fuzzy clustering algorithms can enhance the flexibility and robustness of clustering results, making them suitable for handling complex situations such as noise, outliers, and overlapping regions. Commonly used fuzzy clustering algorithms include the Fuzzy C-means (FCM) algorithm [2], the Fuzzy Possibilistic C-means (FPCM) algorithm [3], the Fuzzy Adaptive Resonance Theory (FART) algorithm, and the Feature Weighted Entropy-based Feature Reduction Fuzzy Clustering Algorithm (FRFCM) [4, 5], which have been widely used in tasks such as image segmentation [6–17].

Despite these advantages, fuzzy clustering algorithms still have limitations, primarily because they disregard prior knowledge or label information that may exist in the data. In practical applications, useful auxiliary information can often be obtained through expert knowledge or a small amount of annotation. This information can help guide and constrain the clustering process, thereby improving the clustering performance. Therefore, Pedrycz proposed a semi-supervised fuzzy clustering algorithm [6, 18–21], initially referred to as “partially supervised”. Its objective function incorporates a supervised component for labeled samples into the FCM objective function, forming the semi-supervised FCM algorithm (SFCM). It thus combines the advantages of unsupervised and supervised learning, using a handful of labeled samples to improve fuzzy clustering. Fuzzy clustering algorithms of this kind can effectively use label information to adjust membership functions, optimize objective functions, or add constraint conditions, thereby enhancing clustering accuracy and stability [22]. Commonly used semi-supervised fuzzy clustering algorithms include SFCM [18] and Semi-supervised Fuzzy Possibilistic C-means (SFPCM) [23], among others [24].

We propose a novel semi-supervised fuzzy clustering algorithm based on a membership fusion mechanism for the segmentation of brain infarction lesions in DWI images. Acute cerebral infarction is a common disease in clinical practice, characterized by the necrosis of brain tissue due to the sudden interruption of cerebral blood supply. Diffusion-weighted imaging (DWI) can reflect the diffusion changes of water molecules in the local ischemic area by applying diffusion-sensitive gradient fields. In patients with acute cerebral infarction, the brain tissue experiences ischemic reactions, resulting in insufficient cerebral blood flow and restricted diffusion of free water molecules. This is manifested as high signal intensity in DWI images and decreased apparent diffusion coefficient (ADC) values. Therefore, DWI has high sensitivity for the early clinical diagnosis of brain infarction and is one of the main techniques for diagnosing brain infarction patients [25]. When clinically evaluating patients with acute-stage (within one week of onset) brain infarction, the location, size, and shape of the lesion areas are closely related to the extent of functional impairment, serving as crucial reference factors for planning clinical treatments. Precise segmentation of brain infarction lesions is an essential preprocessing step for assessing patient conditions. However, the complicated and variable edema surrounding the infarction gives the entire lesion fuzzy boundaries, irregular shapes, and uneven internal brightness in DWI images, which makes exact segmentation difficult. Therefore, accurately segmenting acute-stage brain infarction lesions on DWI images is of great clinical significance for image-assisted evaluation.
Thus, in this study, our goal is to develop a novel algorithm that is more effective, faster, and more accurate than traditional clustering algorithms [26] for segmenting brain infarction lesions, especially with limited supervised information available.

The innovations of this research are listed below: 1) We identify a problem in the SFCM algorithm: the influence of labeled data on the cluster centers is weakened when the membership of labeled samples is close to the known membership. 2) To address this issue, we propose a semi-supervised FCM algorithm based on a membership fusion mechanism (MFM-SFCM), which eliminates this problem and improves the accuracy and stability of the clustering results. 3) Through experiments on a real-world dataset of brain DWI images, we demonstrate the advantage of MFM-SFCM over the compared algorithms, showing that fuzzy clustering can achieve good results in the segmentation of DWI brain infarction lesions using a small amount of labeled information.

2 Related work

2.1 FCM algorithm

In the field of objective function-based clustering algorithms, FCM is the most well-developed and widely used. This algorithm applies fuzzy techniques to K-Means. Let X be the set of objects to be classified, represented as X = {x₁, x₂, …, x_n}. Each sample (pattern) x_k has f feature indicators, denoted as (x_k1, x_k2, …, x_kf), so that X ∈ R^n×f. X is divided into c clusters (2 ⩽ c ⩽ n), and the matrix composed of the c cluster center vectors is V = (v₁, v₂, …, v_c)^T, V ∈ R^c×f. To obtain the optimal fuzzy classification, a fuzzy classification matrix U is selected from the fuzzy classification space M_fc so as to minimize the objective function:

$M_{fc} = \{U \in R^{c \times n} \mid u_{ik} \in [0, 1] \ \forall i, k; \ \sum_{i = 1}^{c} u_{ik} = 1 \ \forall k; \ \sum_{k = 1}^{n} u_{ik} > 0 \ \forall i\}$ (1)

$J (U, V) = \sum_{k = 1}^{n} \sum_{i = 1}^{c} {(u_{ik})}^{m} {∥ x_{k} - v_{i} ∥}^{2}$ (2)

where m > 1 is the fuzziness index (the exponent in Equation (5) is undefined for m = 1) and ∥x_k - v_i∥ represents the distance between object x_k and the i-th cluster center vector v_i. Using the Lagrange multiplier method to solve this constrained optimization problem, the following alternating update formulas are derived:

$J (U, V, λ) = \sum_{k = 1}^{n} \sum_{i = 1}^{c} {(u_{ik})}^{m} {∥ x_{k} - v_{i} ∥}^{2} + \sum_{k = 1}^{n} λ_{k} [\sum_{i = 1}^{c} u_{ik} - 1]$ (3)

$v_{i}^{(l)} = \frac{\sum_{k = 1}^{n} {(u_{ik}^{(l)})}^{m} x_{k}}{\sum_{k = 1}^{n} {(u_{ik}^{(l)})}^{m}}$ (4)

$u_{ik}^{(l + 1)} = {(\sum_{j = 1}^{c} {(\frac{∥ x_{k} - v_{i}^{(l)} ∥}{∥ x_{k} - v_{j}^{(l)} ∥})}^{\frac{2}{m - 1}})}^{- 1}$ (5)

where k = 1, 2, …, n and i = 1, 2, …, c.

The obtained fuzzy classification matrix U^(l+1) and cluster center matrix V^(l) are local optimal solutions relative to the number of clusters c, the initial fuzzy classification matrix U^(l), the convergence threshold ɛ, and the fuzziness index m.
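The alternating updates in Equations (4) and (5) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the experiments: the function name `fcm`, the random initialization, the stopping rule on the change in U, and the small constant added to the squared distances are all assumptions made here.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy C-means sketch. X has shape (n, f); returns U (c, n) and V (c, f)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial partition whose columns sum to 1 (constraint in Eq. (1)).
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Eq. (4): each center is a membership-weighted mean of the samples.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared Euclidean distances d2[i, k] = ||x_k - v_i||^2 (offset avoids division by zero).
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + 1e-12
        # Eq. (5): u_ik is proportional to d_ik^(-2/(m-1)), normalized over the clusters.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < eps
        U = U_new
        if converged:
            break
    return U, V

# Hypothetical 1-D data with two well-separated groups.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
U, V = fcm(X, c=2)
print(U.argmax(axis=0))  # hard labels obtained from the fuzzy partition
```

As in the text, the returned V corresponds to the partition from the previous iteration, and the result depends on the initial partition, so different seeds may converge to different local optima.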

2.2 SFCM algorithm

The quality of semi-supervised clustering results is strongly associated with the availability of supervised information. Supervised information plays a crucial role in the success of the clustering process and the accuracy of evaluating the clustering results. There are two types of supervised information: (1) labeled samples and (2) constraints. In practice, obtaining class labels may be difficult, but it is easier to obtain pairwise constraints, such as must-link and cannot-link constraints, which indicate whether two samples should be assigned to the same cluster or different clusters, respectively. Typically, these constraints are preferred to be satisfied but not mandatory. In the SFCM algorithm, the given input information is assumed to be labeled samples. From the algorithm’s perspective, it is not difficult to transform must-link and cannot-link constraints into labeled samples for processing.

The key to SFCM is to guide the iterative optimization process using labeled samples. Pedrycz proposed the following objective function J_s (s stands for semi-supervised) in paper [18]: $J_{s} = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{ik}^{2} d_{ik}^{2} + \sum_{i = 1}^{c} \sum_{k = 1}^{n} {(u_{ik} - f_{ik})}^{2} d_{ik}^{2}$ (6)

To distinguish between unlabeled and labeled samples, Pedrycz, the proposer of the SFCM algorithm, introduced a binary vector b = [b_k], k = 1, 2, …, n: $b_{k} = \begin{cases} 1 & \text{if pattern } x_{k} \text{ is labeled} \\ 0 & \text{otherwise} \end{cases}$ (7)

The membership values of labeled samples are given as a known matrix F = [f_ik], where i = 1, 2, …, c and k = 1, 2, …, n. The objective function is then generalized to: $J_{m, α}^{s} (U, V; X, F) = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{ik}^{m} d_{ik}^{2} + α \sum_{i = 1}^{c} \sum_{k = 1}^{n} {(u_{ik} - f_{ik})}^{m} d_{ik}^{2}$ (8)

The scaling factor α (α ⩾ 0) maintains a balance between the unsupervised and supervised optimization mechanisms. The value of α is directly proportional to $\frac{N}{L}$, where N is the total number of samples and L is the number of labeled samples. Another representation of Equation (8) is as follows: $J_{m, α}^{s} (U, V; X, F) = J_{m} + α \sum_{i = 1}^{c} \sum_{k = 1}^{n} {(u_{ik} - f_{ik} b_{k})}^{m} d_{ik}^{2}$ (9)

where $J_{m} = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{ik}^{m} d_{ik}^{2}$ is the objective function of FCM.

3 Semi-supervised FCM based on the membership fusion mechanism utilizing a small amount of label information

Observing the objective function of SFCM, we notice that when the membership of a labeled sample is close to its known membership, i.e., when |u_ij - f_ijb_j|^m is small, the constraint on $d_{ij}^{2}$ is weakened, which in turn weakens the constraint on ∥x_j - v_i∥ (from this section on, samples are indexed by j rather than k). This means that the guidance of the labeled samples’ features x on the cluster centers is reduced. This problem is magnified when the coefficient α of the supervised term is large. To address this issue, we introduce the membership fusion mechanism and propose a semi-supervised FCM algorithm based on the membership fusion mechanism (MFM-SFCM).

First, we modify the objective function as follows: $J = J_{m} + α \sum_{i = 1}^{C} \sum_{j = 1}^{N} ({| u_{ij} - b_{j} f_{ij} |}^{m} + u_{ij}^{m}) d_{ij}^{2}$ (10)

This modification ensures that when |u_ij - f_ijb_j|^m is small, the term $({| u_{ij} - b_{j} f_{ij} |}^{m} + u_{ij}^{m})$ in the supervised term remains in the same order of magnitude as the unsupervised term $u_{ij}^{m}$ , ensuring that the clustering centers are always sufficiently guided by the features x of the labeled samples.
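The effect can be checked numerically with a small sketch. The helper names `sfcm_supervised_weight` and `mfm_supervised_weight` are hypothetical; they simply evaluate the coefficient that multiplies $d_{ij}^{2}$ in the supervised term of Equation (9) and of Equation (10), here for a labeled sample (b_j = 1) with m = 2.

```python
def sfcm_supervised_weight(u, f, b=1, m=2):
    """Coefficient of d_ij^2 in the supervised term of Eq. (9)."""
    return abs(u - f * b) ** m

def mfm_supervised_weight(u, f, b=1, m=2):
    """Coefficient of d_ij^2 in the supervised term of Eq. (10)."""
    return abs(u - b * f) ** m + u ** m

# A labeled sample whose membership already almost matches its label f = 1.
u, f = 0.99, 1.0
print(sfcm_supervised_weight(u, f))  # about 1e-4: the distance term is almost switched off
print(mfm_supervised_weight(u, f))   # about 0.98: same order as the unsupervised weight u**m
```

In this example the SFCM weight is roughly four orders of magnitude smaller than the unsupervised weight u^m, whereas the fused weight stays comparable to it.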

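When the number of labeled samples is 0 (b_j = 0 for all j), the term ${| u_{ij} - b_{j} f_{ij} |}^{m}$ reduces to $u_{ij}^{m}$, so every sample is counted twice in the supervised term and the objective function becomes: $J = J_{m} + α \sum_{i = 1}^{C} \sum_{j = 1}^{N} (u_{ij}^{m} + u_{ij}^{m}) d_{ij}^{2} = (1 + 2 α) J_{m}$ (11)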

To make the objective function (10) more concise and intuitive, we move b_j outside the bracket so that the supervised term acts only on labeled samples: $J = \sum_{i = 1}^{C} \sum_{j = 1}^{N} u_{ij}^{m} d_{ij}^{2} + α \sum_{i = 1}^{C} \sum_{j = 1}^{N} b_{j} ({| u_{ij} - f_{ij} |}^{m} + u_{ij}^{m}) d_{ij}^{2}$ (12)

In this way, when the number of labeled samples is 0, this objective function degenerates into the objective function of standard FCM: $J = J_{m}$ (13)

Furthermore, for the balancing factor α in the objective function of SFCM, the choice of α depends largely on the relative sizes of the labeled and unlabeled pattern sets, and α is taken to be directly proportional to $\frac{N}{L}$. This keeps the supervised and unsupervised components at roughly equal size and prevents the influence of labeled samples from being ignored. However, in many cases, having the supervised and unsupervised components roughly equal in size may not yield satisfactory results, and it may be necessary to emphasize one or the other. Therefore, we further modify the objective function, and the final objective function of MFM-SFCM is defined as follows: $J = (1 - η) \sum_{i = 1}^{C} \sum_{j = 1}^{N} u_{ij}^{m} d_{ij}^{2} + η α \sum_{i = 1}^{C} \sum_{j = 1}^{N} b_{j} ({| u_{ij} - f_{ij} |}^{m} + u_{ij}^{m}) d_{ij}^{2}$ (14)

Where $α = \frac{N}{L}$ , and an additional parameter η ∈ [0, 1] is introduced for artificial adjustment, which is used to further adjust the weight between the supervised and unsupervised terms. η can be regarded as the supervision rate, and when more consideration is needed for the supervised term, increasing η can enhance the guiding effect of the labeled samples on the results.
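To make the roles of η, α, and b concrete, the following sketch evaluates Equation (14) for given memberships and squared distances. The function name `mfm_sfcm_objective`, the array layout (clusters along rows, samples along columns), and the choice of α = 0 when no sample is labeled are assumptions made for illustration.

```python
import numpy as np

def mfm_sfcm_objective(U, D2, F, b, eta, m=2.0):
    """Evaluate Eq. (14). U, D2, F have shape (C, N); b has shape (N,)."""
    N = U.shape[1]
    L = b.sum()
    # alpha = N / L; with no labels the supervised term vanishes anyway (assumed convention).
    alpha = N / L if L > 0 else 0.0
    unsupervised = (U ** m * D2).sum()
    supervised = (b[None, :] * (np.abs(U - F) ** m + U ** m) * D2).sum()
    return (1.0 - eta) * unsupervised + eta * alpha * supervised

# Hypothetical toy values: two clusters, two samples, only the first sample labeled.
U = np.array([[0.9, 0.2], [0.1, 0.8]])
D2 = np.array([[1.0, 4.0], [9.0, 1.0]])
F = np.array([[1.0, 0.0], [0.0, 0.0]])
b = np.array([1.0, 0.0])
print(mfm_sfcm_objective(U, D2, F, b, eta=0.0))  # equals J_m, approximately 1.70
print(mfm_sfcm_objective(U, D2, F, b, eta=0.5))  # approximately 1.85
```

With η = 0 the value equals the FCM objective J_m, and increasing η shifts weight toward the term computed on the labeled sample.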

We use the Lagrange multiplier method to minimize the objective function (14). The resulting necessary conditions are given below.

When the membership matrix U is given, V must satisfy the following conditions for the objective function (14) to achieve an extremum: $v_{i} = \frac{(1 - η) \sum_{j = 1}^{N} u_{ij}^{m} x_{j} + η α \sum_{j = 1}^{N} b_{j} ({| u_{ij} - f_{ij} |}^{m} + u_{ij}^{m}) x_{j}}{(1 - η) \sum_{j = 1}^{N} u_{ij}^{m} + η α \sum_{j = 1}^{N} b_{j} ({| u_{ij} - f_{ij} |}^{m} + u_{ij}^{m})}$ (15)

Equation (15) follows from taking the partial derivative of the objective function (14) with respect to each cluster center v_i, with the fuzzy membership matrix U held fixed, and setting $\frac{\partial J}{\partial v_{i}} = 0$. No Lagrange multiplier is needed in this step because the constraint involves only U.
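For reference, the step can be written out as a short sketch. The shorthand g_ij is introduced here only for illustration and is not part of the original formulation; it collects the coefficients of $d_{ij}^{2}$ in (14), and $d_{ij}^{2} = {∥ x_{j} - v_{i} ∥}^{2}$ is assumed to be the squared Euclidean distance.

$g_{ij} = (1 - η) u_{ij}^{m} + η α b_{j} ({| u_{ij} - f_{ij} |}^{m} + u_{ij}^{m}), \quad J = \sum_{i = 1}^{C} \sum_{j = 1}^{N} g_{ij} {∥ x_{j} - v_{i} ∥}^{2}$

$\frac{\partial J}{\partial v_{i}} = - 2 \sum_{j = 1}^{N} g_{ij} (x_{j} - v_{i}) = 0 \quad \Rightarrow \quad v_{i} = \frac{\sum_{j = 1}^{N} g_{ij} x_{j}}{\sum_{j = 1}^{N} g_{ij}}$

Substituting g_ij back into the last expression gives Equation (15).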

When the center matrix V is given, there are two cases:

For any value of m other than 2, additional computational work is required to satisfy the optimality conditions, because u_ij and the Lagrange multipliers are then coupled through a polynomial equation whose solutions must be computed numerically.

By setting the fuzzifier or fuzziness coefficient to 2, we are able to obtain explicit formulas for updating the partition matrix. U must satisfy the following necessary conditions for the objective function (14) to achieve an extremum:

$u_{ij} = \frac{η α b_{j} d_{ij}^{2} f_{ij} + \frac{1 - \sum_{i = 1}^{C} \frac{η α b_{j} f_{ij}}{(1 - η) + 2 η α b_{j}}}{\sum_{i = 1}^{C} \frac{1}{[(1 - η) + 2 η α b_{j}] d_{ij}^{2}}}}{[(1 - η) + 2 η α b_{j}] d_{ij}^{2}}$ (16)

By utilizing the obtained center matrix V and the constraint $\sum_{i = 1}^{C} u_{ij} = 1$ , we can use Lagrange multipliers to derive the corresponding optimization objective function:

$\begin{matrix} L = (1 - η) \sum_{i = 1}^{C} \sum_{j = 1}^{N} u_{ij}^{m} d_{ij}^{2} + η α \sum_{i = 1}^{C} \sum_{j = 1}^{N} \\ b_{j} ({| u_{ij} - f_{ij} |}^{m} + u_{ij}^{m}) d_{ij}^{2} + \sum_{j = 1}^{N} λ_{j} (1 - \sum_{i = 1}^{C} u_{ij}) \end{matrix}$ (17)

According to the Lagrange multiplier method, we calculate the partial derivatives of L with respect to u_ij and λ_j and set them to zero. With m = 2, the condition $\frac{\partial L}{\partial u_{ij}} = 0$ gives: $u_{ij} = \frac{2 η α b_{j} d_{ij}^{2} f_{ij} + λ_{j}}{2 [(1 - η) + 2 η α b_{j}] d_{ij}^{2}}$ (18)

As in the FCM algorithm, the memberships of each sample must sum to 1 over the clusters. This ensures that each column of the membership matrix, which holds the memberships of one sample in all clusters, forms a distribution over the categories and reflects the affiliation of that sample to each category. Applying this constraint, $\sum_{i = 1}^{C} u_{ij} = 1$, to equation (18) yields: $λ_{j} = 2 \times \frac{1 - \sum_{i = 1}^{C} \frac{η α b_{j} f_{ij}}{(1 - η) + 2 η α b_{j}}}{\sum_{i = 1}^{C} \frac{1}{[(1 - η) + 2 η α b_{j}] d_{ij}^{2}}}$ (19)

By combining equation (18) and equation (19), we can obtain the following equation: $u_{ij} = \frac{η α b_{j} d_{ij}^{2} f_{ij} + \frac{1 - \sum_{i = 1}^{C} \frac{η α b_{j} f_{ij}}{(1 - η) + 2 η α b_{j}}}{\sum_{i = 1}^{C} \frac{1}{[(1 - η) + 2 η α b_{j}] d_{ij}^{2}}}}{[(1 - η) + 2 η α b_{j}] d_{ij}^{2}}$ (20)
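The closed-form update in Equation (20) can be sketched with NumPy broadcasting as follows. The function name `update_memberships` and the array layout (clusters along rows, samples along columns) are assumptions for illustration; the sketch covers only the m = 2 case and assumes η < 1, so that the bracketed factor is non-zero for unlabeled samples.

```python
import numpy as np

def update_memberships(D2, F, b, eta, alpha):
    """Eq. (20) for m = 2. D2 and F have shape (C, N); b has shape (N,)."""
    w = (1.0 - eta) + 2.0 * eta * alpha * b   # bracketed factor in Eq. (18), one value per sample
    s = eta * alpha * b                       # weight on the known memberships f_ij
    # Half of the multiplier in Eq. (19), i.e. lambda_j / 2, one value per sample.
    half_lam = (1.0 - (s * F / w).sum(axis=0)) / (1.0 / (w * D2)).sum(axis=0)
    return (s * D2 * F + half_lam) / (w * D2)

# Hypothetical toy input: sample 0 is labeled as cluster 0, sample 1 is unlabeled.
D2 = np.array([[1.0, 4.0], [4.0, 1.0]])
F = np.array([[1.0, 0.0], [0.0, 0.0]])
b = np.array([1.0, 0.0])
U = update_memberships(D2, F, b, eta=0.5, alpha=2.0)
print(U)              # approximately [[0.88, 0.20], [0.12, 0.80]]
print(U.sum(axis=0))  # each column sums to 1
```

For the unlabeled sample the update reduces to the FCM rule with m = 2 (memberships 0.2 and 0.8), while the label pulls the first sample's membership in cluster 0 from the FCM value 0.8 up to 0.88.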

Based on equations (15) and (20), which are the iterative update formulas, the specific steps of the MFM-SFCM algorithm are listed below:

Step 1: Initialization. Set the number of clusters C, the fuzziness factor m, and the supervision rate η. Obtain the binary vector b = [b_j] and the initial membership matrix F = [f_ij] from the labeled information. Set the maximum number of iterations l_max and the threshold ɛ for algorithm termination.

Step 2: Calculate the distances between all samples from labeled and unlabeled sets to the cluster centers.

Step 3: Update the membership matrix U using equation (16).

Step 4: Update the cluster centers V using equation (15).

Step 5: If ∥V^(l+1) - V^(l) ∥ < ɛ or the iteration count l > l_max, terminate the algorithm. Otherwise, go back to Step 2.
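The steps above can be sketched as a single loop for the m = 2 case. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `mfm_sfcm`, the explicit initial centers `V0` (the steps do not state how the centers are initialized), the Euclidean distance, and the small constant added to the squared distances are all choices made here.

```python
import numpy as np

def mfm_sfcm(X, F, b, eta, V0, eps=1e-5, max_iter=100):
    """MFM-SFCM sketch for m = 2. X: (N, f); F: (C, N); b: (N,); V0: (C, f)."""
    alpha = X.shape[0] / b.sum()                 # alpha = N / L, needs at least one labeled sample
    w = (1.0 - eta) + 2.0 * eta * alpha * b      # bracketed factor from Eq. (18)
    s = eta * alpha * b
    V = V0.astype(float)
    for _ in range(max_iter):
        # Step 2: squared Euclidean distances between centers and samples, shape (C, N).
        D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + 1e-12
        # Step 3: membership update via the closed form in Eq. (20).
        half_lam = (1.0 - (s * F / w).sum(axis=0)) / (1.0 / (w * D2)).sum(axis=0)
        U = (s * D2 * F + half_lam) / (w * D2)
        # Step 4: center update via Eq. (15).
        G = (1.0 - eta) * U ** 2 + eta * alpha * b * ((U - F) ** 2 + U ** 2)
        V_new = (G @ X) / G.sum(axis=1, keepdims=True)
        # Step 5: stop once the centers no longer move.
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < eps:
            break
    return U, V

# Hypothetical 1-D data: samples 0 and 3 are labeled as cluster 0 and cluster 1.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
b = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0])
F = np.zeros((2, 6))
F[0, 0] = 1.0
F[1, 3] = 1.0
U, V = mfm_sfcm(X, F, b, eta=0.5, V0=np.array([[1.0], [4.0]]))
print(U.argmax(axis=0))  # [0 0 0 1 1 1]
```

Columns of F that belong to unlabeled samples are never used, because they are always multiplied by b_j = 0.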

The working principle of MFM-SFCM is illustrated in Fig. 1.

Fig. 1

The working principle of the MFM-SFCM algorithm.

Remark

The MFM-SFCM algorithm and the FCM algorithm differ in their equations as follows:

FCM’s objective function only considers the distance between the sample points and the cluster centers, while MFM-SFCM’s objective function adds a penalty term that measures the difference between the membership degree of the sample points and the true membership degree given by the known labels.

FCM’s membership matrix and cluster center update formulas are both based on the distance matrix and the membership exponent, while MFM-SFCM also introduces a parameter η, which adjusts the weight balance between the supervised and unsupervised terms.

These differences bring the following advantages and disadvantages:

MFM-SFCM utilizes the information of the known labels, which can improve the accuracy and robustness of clustering, especially when the data noise is large or the number of clusters is uncertain.

MFM-SFCM requires additional label information, which may not be easy to obtain or completely reliable.

MFM-SFCM requires a suitable value of the parameter η; if η is too large or too small, the clustering performance suffers.

MFM-SFCM and SFCM differ in their equations as follows:

In the penalty term of the SFCM objective function, the squared distance between the sample points and the cluster centers is multiplied by the coefficient |u_ij - f_ijb_j|^m, while in MFM-SFCM the coefficient is $({| u_{ij} - b_{j} f_{ij} |}^{m} + u_{ij}^{m})$.

The objective function of the SFCM algorithm uses α to adjust the penalty term. In the objective function of the MFM-SFCM algorithm, α is fixed to the ratio of the total number of samples to the number of labeled samples, and an additional parameter η is added to manually adjust the balance between the unsupervised term and the supervised term (penalty term). We therefore only need to find a suitable value for η between 0 and 1.

These two differences bring the following advantages:

When |u_ij - f_ijb_j|^m is very small, the coefficient $({| u_{ij} - b_{j} f_{ij} |}^{m} + u_{ij}^{m})$ in the supervised term of the MFM-SFCM objective function still remains on the same order of magnitude as $u_{ij}^{m}$ in the unsupervised term, ensuring that the cluster centers are always sufficiently guided by the features x of the labeled samples.

In the MFM-SFCM algorithm, the search interval for η is clear compared to that of α in the SFCM algorithm: it ranges from 0 to 1, is finite and known, and can be covered easily by a grid, making it easier to find a suitable value for η. In contrast, the search interval for α in the SFCM algorithm is unbounded, so a suitable value could be missed during parameter optimization.

4 Experiments

To validate the effectiveness of MFM-SFCM in handling clustering tasks, MFM-SFCM algorithm was evaluated using real DWI images provided by the hospital. We first describe the construction of the DWI dataset and the experimental setup. Then, we compare and discuss the clustering performance of MFM-SFCM with 4 other algorithms.

4.1 Data sets and settings

4.1.1 Data sets

The experimental dataset was obtained from the clinical data of the First People’s Hospital of Changshu. Twenty DWI images of brain infarction patients, exhibiting uneven brightness, were randomly selected by radiologists from the clinical data for the purpose of comparing the segmentation results.

4.1.2 Comparison algorithms

In our experiment, we compared MFM-SFCM with 4 algorithms: the Fuzzy C-means clustering algorithm (FCM), Kernel-GFCM (KGFCM) [27], POCS-based clustering (POCS-based) [28], and SFCM. The labeled samples for all semi-supervised algorithms were selected to be the same, and their initial membership values were also the same. The FCM, SFCM, and MFM-SFCM algorithms used the Euclidean distance in the experiment. The number of clusters was set to 3 based on the composition of the dataset, representing brain infarction lesions, other brain regions, and background, respectively. Each algorithm was tested 20 times using the best parameters selected by grid search over the values listed in Table 1.

Table 1
Parameters used when testing algorithms

Algorithm	Parameter
FCM	m = 1.01, 1.05, 1.1, 1.2, 1.5, 2, 5, 10, 20, 50, 80, 100
KGFCM	p = 3, 4, 5, 6, 7, 8, 9, 10
	m = 2, 3, 4, 5, 6, 7, 8, 9, 10
POCS-based	–
SFCM	α = 0.1, 1, 10, 100, 1000, 10000, 100000, 100000, 1000000, 10000000, 100000000
	m = 2
MFM-SFCM	η ∈ {0, 0.01, …, 0.99}
	m = 2

4.1.3 Evaluation metrics

Two evaluation metrics [29] are used to assess the performance of all algorithms: the Dice coefficient (Dice) and Intersection over Union (IoU). Both measure the degree of overlap between the segmentation results and the ground truth labels.

Dice is defined as follows: $Dice = \frac{2 | A \cap B |}{| A | + | B |}$ (21)

IoU is defined as follows: $IoU = \frac{| A \cap B |}{| A \cup B |}$ (22)

where A and B represent the pixel sets of the segmentation result and the ground truth label, respectively, |A| and |B| represent the sizes of the sets, |A ∩ B| the size of their intersection, and |A ∪ B| the size of their union. Dice and IoU are effective measures of the consistency between the segmentation result and the ground truth region. Both take values in [0, 1], and higher values indicate better segmentation. They can therefore be used to evaluate segmentation performance on different categories and to compare different segmentation algorithms.
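Both metrics follow directly from set operations, as the following sketch shows. The function names `dice` and `iou`, the representation of masks as Python sets of pixel coordinates, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
def dice(a, b):
    """Dice coefficient of two pixel sets, Eq. (21)."""
    if not a and not b:
        return 1.0  # assumption: two empty masks count as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))

def iou(a, b):
    """Intersection over Union of two pixel sets, Eq. (22)."""
    if not a and not b:
        return 1.0  # same assumption as above
    return len(a & b) / len(a | b)

seg = {(0, 0), (0, 1), (1, 0), (1, 1)}  # hypothetical segmented pixels
gt = {(0, 1), (1, 1), (2, 1)}           # hypothetical ground-truth pixels
print(dice(seg, gt))  # 2 * 2 / (4 + 3) = 4/7, about 0.571
print(iou(seg, gt))   # 2 / 5 = 0.4
```

For a single pair of masks the two scores are tied by IoU = Dice / (2 - Dice), so they always rank segmentations of the same image in the same order; this identity need not hold exactly for scores averaged over several images or runs.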

4.1.4 Experimental environment

The experimental hardware platform used an Intel(R) Core(TM) i5-11300H CPU with a clock frequency of 3.10GHz and 16GB of memory. The programming environment was Python 3.8.

4.2 Performance comparison

Table 2 displays the clustering results of the five algorithms with the best parameters, represented by the average Dice and IoU scores together with the average running time. MFM-SFCM achieves the best Dice and IoU scores.

The analysis of the outcomes in Table 2, as well as Figs. 2, 3, 4, 5 and 6, is as follows:

Compared to the two unsupervised clustering algorithms (FCM and KGFCM), the two semi-supervised clustering algorithms (SFCM and MFM-SFCM) achieved better Dice and IoU scores. This is because FCM and KGFCM partition the data based on the image features alone, without considering label information. As a result, they cannot use the prior knowledge from labeled samples to guide the clustering process, leading to unsatisfactory clustering performance. In contrast, SFCM and MFM-SFCM leverage the prior knowledge from labeled samples, which improves the clustering accuracy. In terms of the equations, the objective function of an unsupervised algorithm such as FCM only considers the distance between sample points and cluster centers, whereas the objective function of MFM-SFCM adds a penalty term that measures the difference between the membership degree of sample points and the true membership degree of known labels. Additionally, MFM-SFCM introduces a parameter η to adjust the weight balance between the supervised and unsupervised terms. These changes allow MFM-SFCM to use known label information to improve clustering accuracy and robustness.

As shown in Fig. 2, the MFM-SFCM algorithm achieves good results in all image segmentation tasks in the dataset. However, the SFCM algorithm falls significantly behind the POCS-based algorithm on several images, as it fails to effectively emphasize the supervised information, resulting in weakened constraints and insufficient influence of labeled samples on the clustering process. Consequently, it fails to achieve effective segmentation. On the other hand, the results of the POCS-based algorithm are highly unstable, with completely different segmentation results each time, making it unsuitable for image segmentation tasks. In comparison, the proposed MFM-SFCM algorithm, which incorporates the membership fusion mechanism, achieves effective segmentation and the best performance among the compared algorithms. MFM-SFCM improves on the SFCM equations in two ways: 1) In the penalty term of the SFCM objective function, the coefficient of the squared distance between sample points and cluster centers is |u_ij - f_ijb_j|^m. In the MFM-SFCM algorithm, this is changed to $({| u_{ij} - b_{j} f_{ij} |}^{m} + u_{ij}^{m})$. Thus, when |u_ij - f_ijb_j|^m is very small, $({| u_{ij} - b_{j} f_{ij} |}^{m} + u_{ij}^{m})$ in the supervised term of the MFM-SFCM objective function still remains on the same order of magnitude as $u_{ij}^{m}$ in the unsupervised term, ensuring that cluster centers are always sufficiently guided by the features x of the labeled samples. 2) In the MFM-SFCM algorithm, the degree of influence of the supervised information is adjusted by the parameters α and η, where α is set from the ratio of the total number of samples to the number of labeled samples, so only an appropriate value for η needs to be found. The search interval for η is clear compared to that of α in the SFCM algorithm: it ranges from 0 to 1, is finite and known, and can be covered easily, making it easier to find a suitable value for η. In contrast, the search interval for α in the SFCM algorithm is unbounded, so a suitable value could be missed during parameter optimization. For example, on images 18 and 20, the SFCM algorithm nearly failed, possibly because no suitable value of the parameter α was found.

As shown in Table 2, although the MFM-SFCM algorithm performs better than the POCS-based algorithm on the Dice and IoU metrics, it has a longer average runtime. This is because the MFM-SFCM algorithm needs to find the optimal value for the parameter η, a process not required by the POCS-based algorithm. This is a shortcoming of the proposed algorithm and will be the focus of analysis in the next section. It also presents a potential research direction for our future studies.

Table 2
Performance of all comparison methods on DWI Image datasets

	FCM	KGFCM	POCS-based	SFCM	MFM-SFCM
Dice	0.0784	0.0153	0.5416	0.7637	0.8742
IoU	0.0575	0.0078	0.4593	0.6655	0.7808
Time	12.5491	164.2912	4.0804	7.9379	4.8073

Fig. 2

Clustering segmentations on images 1–5, (a) original image, (b) ground truth image, (c) FCM, (d) KGFCM, (e) POCS-based, (f) SFCM, (g) MFM-SFCM.

Fig. 3

Clustering segmentations on images 6–10, (a) original image, (b) ground truth image, (c) FCM, (d) KGFCM, (e) POCS-based, (f) SFCM, (g) MFM-SFCM.

Fig. 4

Clustering segmentations on images 11–15, (a) original image, (b) ground truth image, (c) FCM, (d) KGFCM, (e) POCS-based, (f) SFCM, (g) MFM-SFCM.

Fig. 5

Clustering segmentations on images 16–20, (a) original image, (b) ground truth image, (c) FCM, (d) KGFCM, (e) POCS-based, (f) SFCM, (g) MFM-SFCM.

Fig. 6

Comparison of performance of 5 algorithms in cluster segmentation.

4.3 Parameter sensitivity

In the experiment, the parameter η was determined within the given search grid {0, 0.01, …, 0.99}. In the subsequent discussion, we explore the performance of MFM-SFCM with different values of η. Table 3, Table 4, Fig. 7, and Fig. 8 display the average Dice and IoU scores obtained on part of the dataset when using different values of η, while keeping the parameter m fixed at 2.

Table 3
Mean Dice of MFM-SFCM on the images for different η, with m fixed at 2

image / η	0	0.3	0.6	0.8	0.9	0.95	0.96	0.97	0.98	0.99
image1	0.0373	0.0557	0.0649	0.8007	0.8448	0.8688	0.8750	0.8750	0.8845	0.8845
image2	0.8570	0.9593	0.9566	0.9552	0.9434	0.9434	0.9434	0.9382	0.9382	0.9382
image3	0.0103	0.0142	0.0173	0.0191	0.0240	0.8378	0.8611	0.8774	0.8774	0.9118
image4	0.0149	0.0194	0.0239	0.5896	0.7143	0.7764	0.7764	0.7962	0.7962	0.8013
image5	0.0112	0.0112	0.0112	0.0114	0.0120	0.0131	0.0145	0.5603	0.7980	0.8462

Table 4

Mean IoU of MFM-SFCM on the images for different η, with m fixed at 2

image / η	0	0.3	0.6	0.8	0.9	0.95	0.96	0.97	0.98	0.99
image1	0.0189	0.0285	0.0334	0.6667	0.7305	0.7673	0.7771	0.7771	0.7922	0.7922
image2	0.7495	0.9216	0.9167	0.9142	0.8927	0.8927	0.8927	0.8834	0.8834	0.8834
image3	0.0051	0.0071	0.0086	0.0096	0.0120	0.7188	0.7541	0.7797	0.7797	0.8364
image4	0.0075	0.0097	0.0120	0.4161	0.5536	0.6327	0.6327	0.6596	0.6596	0.6667
image5	0.0055	0.0055	0.0055	0.0057	0.0060	0.0065	0.0072	0.3861	0.6610	0.7308

Fig. 7

Dice Performance Comparison of Partial Datasets on Different η.

Fig. 8

IoU Performance Comparison of Partial Datasets on Different η.

Table 5

Common abbreviations and notations

Abbreviation/Symbol	Full name/Explanation	Source
DWI	Diffusion-weighted imaging
SFCM	semi-supervised fuzzy C-means clustering
FCM	fuzzy C-means clustering	[2]
FPCM	Possibilistic C-means	[3]
FART	Fuzzy Adaptive Resonance Theory
FRFCM	Feature Reduction Fuzzy Clustering	[4, 5]
SFPCM	Semi-supervised Fuzzy Possibilistic C-means	[23]
ADC	Apparent diffusion coefficient
Dice	Dice coefficient	[29]
IoU	Intersection over Union	[29]
KGFCM	Kernel-GFCM	[27]
POCS-based	POCS-based clustering	[28]
J_m	objective function of standard FCM
X	the set of objects to be classified
N	the number of labeled samples
M_fc	the fuzzy classification space
V	a matrix composed of c cluster center vectors
U	fuzzy classification matrix
m	fuzzifier or weighting exponent
d_ik	the distance between object x_k and the i-th cluster center vector v_i
F	a known matrix of the membership values of labeled samples
b	a binary vector to distinguish between unlabeled and labeled samples
α	scaling factor
η	supervision rate

MFM-SFCM is sensitive to the parameter η: different values of η yield markedly different Dice and IoU scores. In most cases, a high Dice score is accompanied by a high IoU score. Therefore, using both Dice and IoU as performance metrics is feasible for determining appropriate parameters.
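The agreement between the two metrics is expected, because for a single pair of binary masks they are tied by the identity IoU = Dice / (2 − Dice), which is monotonically increasing in Dice. The sketch below computes both scores from the standard overlap definitions and checks the identity on a toy example. The function name `dice_and_iou` and the toy masks are assumptions made for illustration. Scores averaged over several images or runs need not satisfy the identity exactly, although the entries of Tables 3 and 4 agree with it closely (for instance, a Dice of 0.8845 maps to about 0.793, against a tabulated IoU of 0.7922).

```python
import numpy as np

def dice_and_iou(pred, truth):
    """Dice and IoU between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treat as perfect agreement (a convention
        # chosen for this sketch, not taken from the paper).
        return 1.0, 1.0
    union = total - inter
    return float(2 * inter / total), float(inter / union)

# Toy masks: 3 predicted pixels, 3 true pixels, 2 of them shared.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
# dice = 4/6 and iou = 2/4, so iou equals dice / (2 - dice).
```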

With m fixed, MFM-SFCM obtains its worst Dice and IoU scores at η = 0, because in that case it degenerates into classical FCM. For η ≠ 0, the scores in Tables 3 and 4 are never lower than at η = 0 and generally improve.

It can be observed that once η reaches a relatively large value, MFM-SFCM shows a sudden and significant improvement in both Dice and IoU scores, which further illustrates the efficacy of the proposed membership fusion mechanism. Consequently, the range of η searched within the grid {0, 0.01, ..., 0.99} can be narrowed in subsequent experiments. However, the critical value of η at which this improvement occurs varies from image to image and does not follow a specific pattern.
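To make the image-to-image variation of the critical value concrete, the following sketch scans the Dice scores of Table 3 and reports, for each image, the first tabulated η at which the score rises by at least a chosen amount over the previous column. The helper `critical_eta` and the jump threshold of 0.5 are assumptions introduced for this illustration and are not part of the paper.

```python
# Dice scores copied from Table 3 (m = 2).
ETAS = [0, 0.3, 0.6, 0.8, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99]
DICE = {
    "image1": [0.0373, 0.0557, 0.0649, 0.8007, 0.8448, 0.8688, 0.8750, 0.8750, 0.8845, 0.8845],
    "image2": [0.8570, 0.9593, 0.9566, 0.9552, 0.9434, 0.9434, 0.9434, 0.9382, 0.9382, 0.9382],
    "image3": [0.0103, 0.0142, 0.0173, 0.0191, 0.0240, 0.8378, 0.8611, 0.8774, 0.8774, 0.9118],
    "image4": [0.0149, 0.0194, 0.0239, 0.5896, 0.7143, 0.7764, 0.7764, 0.7962, 0.7962, 0.8013],
    "image5": [0.0112, 0.0112, 0.0112, 0.0114, 0.0120, 0.0131, 0.0145, 0.5603, 0.7980, 0.8462],
}

def critical_eta(etas, scores, min_jump=0.5):
    """First eta whose score exceeds the previous column by min_jump.

    min_jump is a hypothetical threshold chosen for illustration.
    Returns None when no such jump occurs.
    """
    for prev, cur, eta in zip(scores, scores[1:], etas[1:]):
        if cur - prev >= min_jump:
            return eta
    return None

critical = {name: critical_eta(ETAS, s) for name, s in DICE.items()}
```

With this threshold, the jump is found at η = 0.8 for image1 and image4, at η = 0.95 for image3, and at η = 0.97 for image5, while image2 shows no such jump because its score is already high at η = 0. Since only the tabulated columns are inspected, each reported value is an upper bound on the actual critical η within the full grid.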

5 Conclusion

In this paper, a novel semi-supervised fuzzy clustering algorithm, MFM-SFCM, based on a membership fusion mechanism is proposed for DWI brain infarction lesion segmentation. It addresses the issue of weakened constraints and insufficient influence of labeled samples on the clustering process in SFCM, where supervised information is not effectively emphasized. MFM-SFCM overcomes this limitation by introducing a new membership fusion mechanism. This significantly improves the accuracy of clustering results and accelerates convergence speed, enabling fuzzy clustering to effectively utilize a small amount of labeled information for DWI brain infarction lesion segmentation. The effectiveness of the MFM-SFCM algorithm is demonstrated through experiments conducted on a real-world dataset of brain DWI images.

Several directions remain for future research on MFM-SFCM. One important aspect is the number of clusters and how to determine an optimal number automatically. FCM, the baseline algorithm for MFM-SFCM, is sensitive to noise, so applying MFM-SFCM to noisy images is also a meaningful research topic. Additionally, designing a fast learning strategy for MFM-SFCM that is suitable for large-scale DWI brain images is an important area of future research.

Acknowledgments

This work was supported in part by the Suzhou Key Supporting Subjects [Health Informatics (No. SZFCXK202147)], in part by the Changshu Science and Technology Program [Nos. CS202015, CS202246], in part by the Changshu City Health and Health Committee Science and Technology Program [No. csws201913], and in part by the “333 High level personnel training project of Jiangsu Province”.

References

1. Pitafi, Anwar and Sharif, A Taxonomy of Machine Learning Clustering Algorithms, Challenges, and Future Realms, Appl. Sci. 13(6) (2023), 3529. doi: 10.3390/app13063529.

2. Bezdek J.C., Ehrlich and Full, FCM: The fuzzy c-means clustering algorithm, Comput. Geosci. 10(2–3) (1984), 191–203. doi: 10.1016/0098-3004(84)90020-7.

3. Krishnapuram and Keller J.M., A possibilistic approach to clustering, IEEE Trans. Fuzzy Syst. 1(2) (1993), 98–110. doi: 10.1109/91.227387.

4. Yang M.-S. and Nataliani, A Feature-Reduction Fuzzy Clustering Algorithm Based on Feature-Weighted Entropy, IEEE Trans. Fuzzy Syst. 26(2) (2018), 817–835. doi: 10.1109/TFUZZ.2017.2692203.

5. Xing H.-J. and Ha M.-H., Further improvements in Feature-Weighted Fuzzy C-Means, Inf. Sci. 267 (2014), 1–15. doi: 10.1016/j.ins.2014.01.033.

6. Pedrycz and Waletzky, Fuzzy clustering with partial supervision, IEEE Trans. Syst. Man Cybern. Part B Cybern. 27(5) (1997), 787–795. doi: 10.1109/3477.623232.

7. Bensaid A.M., Hall L.O., Bezdek J.C. and Clarke L.P., Partially supervised clustering for image segmentation, Pattern Recognit. 29(5) (1996), 859–871. doi: 10.1016/0031-3203(95)00120-4.

8. Guo, Shi, Chen and Ding, Pixel and region level information fusion in membership regularized fuzzy clustering for image segmentation, Inf. Fusion 92 (2023), 479–497. doi: 10.1016/j.inffus.2022.12.008.

9. Wang, Pedrycz, Li and Zhou, Residual-driven Fuzzy C-Means Clustering for Image Segmentation, IEEE/CAA J. Autom. Sin. 8(4) (2021), 876–889. doi: 10.1109/JAS.2020.1003420.

10. Hua, Gu, Xue and Ni, A Novel Brain MRI Image Segmentation Method Using an Improved Multi-View Fuzzy c-Means Clustering Algorithm, Front. Neurosci. 15 (2021). Accessed: Aug. 30, 2023. [Online]. Available: https://www.frontiersin.org/articles/10.3389/fnins.2021.662674.

11. Tang, Ren and Pedrycz, Fuzzy C-Means clustering through SSIM and patch for image segmentation, Appl. Soft Comput. 87 (2020), 105928. doi: 10.1016/j.asoc.2019.105928.

12. Siva Raja P.M. and Rani A.V., Brain tumor classification using a hybrid deep autoencoder with Bayesian fuzzy clustering-based segmentation approach, Biocybern. Biomed. Eng. 40(1) (2020), 440–453. doi: 10.1016/j.bbe.2020.01.006.

13. Jiang et al., A Novel Negative-Transfer-Resistant Fuzzy Clustering Model with a Shared Cross-Domain Transfer Latent Space and its Application to Brain CT Image Segmentation, IEEE/ACM Trans. Comput. Biol. Bioinform. (2020), 1–1. doi: 10.1109/TCBB.2019.2963873.

14. Jiang et al., A Novel Negative-Transfer-Resistant Fuzzy Clustering Model with a Shared Cross-Domain Transfer Latent Space and its Application to Brain CT Image Segmentation, IEEE/ACM Trans. Comput. Biol. Bioinform. (2020), 1–1. doi: 10.1109/TCBB.2019.2963873.

15. Jia, Lei, Du, Liu, Meng and Nandi A.K., Robust Self-Sparse Fuzzy Clustering for Image Segmentation, IEEE Access 8 (2020), 146182–146195. doi: 10.1109/ACCESS.2020.3015270.

16. Alhassan A.M. and Zainon W.M.N.W., BAT Algorithm With fuzzy C-Ordered Means (BAFCOM) Clustering Segmentation and Enhanced Capsule Networks (ECN) for Brain Cancer MRI Images Classification, IEEE Access 8 (2017), 41–51. doi: 10.1109/ACCESS.2020.3035803.

17. Mohan, Priya L.T. and Nair L.S., Fuzzy c-means Segmentation on Enhanced Mammograms Using CLAHE and Fourth Order Complex Diffusion, in 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), Erode, India: IEEE, Mar. 2020, 647–651. doi: 10.1109/ICCMC48092.2020.ICCMC-000120.

18. Pedrycz, Algorithms of fuzzy clustering with partial supervision, Pattern Recognit. Lett. 3(1) (1985), 13–20. doi: 10.1016/0167-8655(85)90037-6.

19. Pedrycz and Waletzky, Neural-network front ends in unsupervised learning, IEEE Trans. Neural Netw. 8(2) (1997), 390–401. doi: 10.1109/72.557690.

20. Stutz and Runkler T.A., Classification and prediction of road traffic using application-specific fuzzy clustering, IEEE Trans. Fuzzy Syst. 10(3) (2002), 297–308. doi: 10.1109/TFUZZ.2002.1006433.

21. Bouchachia and Pedrycz, Data Clustering with Partial Supervision, Data Min. Knowl. Discov. 12(1) (2006), 47–78. doi: 10.1007/s10618-005-0019-1.

22. Cai, Hao, Yang, Zhao and Yang, A review on semi-supervised clustering, Inf. Sci. 632 (2023), 164–200. doi: 10.1016/j.ins.2023.02.088.

23. Antoine, Guerrero J.A. and Romero, Possibilistic fuzzy c-means with partial supervision, Fuzzy Sets Syst. 449 (2022), 162–186. doi: 10.1016/j.fss.2022.08.003.

24. Yasunori, Yukihiro, Makito and Sadaaki, On semi-supervised fuzzy c-means clustering, in 2009 IEEE International Conference on Fuzzy Systems, Jeju Island, South Korea: IEEE, Aug. 2009, 1119–1124. doi: 10.1109/FUZZY.2009.5277177.

25. Muir W.K., Buchan, Von Kummer, Rother and Baron J.-C., Imaging of acute stroke, Lancet Neurol. 5(9) (2006), 755–768. doi: 10.1016/S1474-4422(06)70545-2.

26. Al-Dmour and Al-Asni, MR Brain Image Segmentation Based on Unsupervised and Semi-Supervised Fuzzy Clustering Methods, in 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Gold Coast, Australia: IEEE, Nov. 2016, 1–7. doi: 10.1109/DICTA.2016.7797066.

27. Gupta and Das, On the Unification of k-Harmonic Means and Fuzzy c-Means Clustering Problems under Kernelization, in 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR), Bangalore: IEEE, Dec. 2017, 1–6. doi: 10.1109/ICAPR.2017.8593078.

28. Tran L.-A., Deeh M.H., Do T.-D., Nguyen T.-D., Le M.-H. and Park D.-C., POCS-based Clustering Algorithm, in 2022 International Workshop on Intelligent Systems (IWIS), Ulsan, Republic of Korea: IEEE, Aug. 2022, 1–6. doi: 10.1109/IWIS56333.2022.9920762.

29. Chang H.-H., Zhuang A.H., Valentino D.J. and Chu W.-C., Performance measure characterization for evaluating neuroimage segmentation algorithms, NeuroImage 47(1) (2009), 122–135. doi: 10.1016/j.neuroimage.2009.03.068.

Semi-supervised fuzzy C means based on membership integration mechanism and its application in brain infarction lesion segmentation in DWI images

Table 2

Performance of all comparison methods on DWI image datasets

Metric	FCM	KGFCM	POCS-based	SFCM	MFM-SFCM
Dice	0.0784	0.0153	0.5416	0.7637	0.8742
IoU	0.0575	0.0078	0.4593	0.6655	0.7808
Time	12.5491	164.2912	4.0804	7.9379	4.8073