Research on the subtractive clustering algorithm for mobile ad hoc network based on the Akaike information criterion

Abstract

Large, dense mobile ad hoc networks often face scalability problems, so hierarchical structures such as cluster control structures are needed to maintain network performance. Clustering in mobile ad hoc networks is an organization method that divides the nodes into groups, each managed by a node called a cluster-head. The main difficulty of a clustering algorithm lies in determining the number and positions of the cluster-heads. In this article, a subtractive clustering algorithm based on the Akaike information criterion is proposed. First, the Akaike information criterion is introduced to formulate the optimal number of cluster-heads. Then, the subtractive clustering algorithm is applied to the mobile ad hoc network to obtain several feasible clustering schemes. Finally, the candidate schemes are evaluated by the minimum of the largest within-cluster distance variance to determine the optimal scheme. Simulation results show that the proposed algorithm outperforms a widely referenced clustering approach in terms of average cluster-head lifetime.

Keywords

Mobile ad hoc networks, subtractive clustering algorithm, Akaike information criterion

Introduction

A mobile ad hoc network (MANET) is a self-organizing, self-configuring multi-hop wireless network consisting of a group of mobile nodes that move freely and cooperate to relay packets on behalf of one another.¹ The nodes of a MANET have to collaborate and organize themselves to offer basic network services such as routing and security management. Owing to the convenience of MANET deployment and its ongoing development, the scale of MANETs is growing rapidly, and network performance degrades as a result. Clustering is a key issue for MANETs and their applications, and its importance can be summarized in two aspects.² First, clustering is the most efficient method to manage hundreds of mobile nodes and to solve the scalability problems that a flat network infrastructure faces as the network scale increases. Second, clustering serves as the foundation for many other key issues in MANETs, such as routing, intrusion detection, topology control, and backbone construction,³ all of which perform well on an appropriately clustered network structure.

The objective of clustering in MANETs is to divide all nodes into mutually exclusive groups such that every node of the network belongs to one group.⁴ Cluster-heads are special nodes responsible for forming the clusters, including maintaining the network topology and allocating nodes to clusters.⁵ Because the nodes are mobile, the set of cluster-heads changes constantly, and to save resources it is necessary to minimize the number of cluster-heads. The main concept of clustering is to establish a hierarchy among nodes; two clusters are connected whenever at least one node in one cluster is directly connected to a node in the other. The difficulty of clustering lies in determining the number and positions of cluster-heads optimally,⁶ which is a non-deterministic polynomial-time (NP)-hard problem.⁷ In this article, the Akaike information criterion (AIC) is used to calculate the optimal number of clusters, and a theoretical analysis is made to resolve the trade-off between the compactness of the clustering and the increased number of cluster-heads.

This article is organized as follows. Related works are reviewed in section “Related work.” The subtractive clustering algorithm based on the Akaike information criterion (SCAA) for MANETs is proposed and described in detail in section “SCAA.” The simulations that compare the performance of SCAA and the lowest ID clustering method in terms of average cluster-head lifetime are presented in section “Simulation.” Finally, the conclusion and future work are outlined in section “Conclusion.”

Related work

Most centralized clustering algorithms proposed for MANETs aim to determine the number and positions of cluster-heads so as to prolong network lifetime.⁸ Clustering algorithms can be divided into the following categories: partitioning methods, hierarchical methods, graph-based methods, density-based methods, and grid-based methods. In previous research, the lowest ID algorithm and its variants are the most commonly used. A distinct ID is assigned to each node;⁹ if a node has the lowest ID among all its directly connected neighbors, it becomes a cluster-head, otherwise it is an ordinary node; and when two cluster-heads meet, the one with the higher ID gives up its cluster-head role. The main disadvantage of this algorithm is that it considers only the node ID and not whether a node is qualified to be elected cluster-head, so some nodes easily run out of power.¹⁰ The highest connection cluster (HCC) algorithm and its variants are also commonly used. Each node is assigned a degree equal to the number of neighbor nodes within its one-hop communication range, and the cluster-head is elected by node degree. The disadvantage is that it considers only the number of neighbor nodes and not the overall node layout. MAPLE is another clustering algorithm, which infers host mobility from the received signal strength;¹¹ its election of the cluster-head is mostly a random procedure that does not consider node distribution and mobility.¹
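The lowest ID election rule described above can be illustrated with a minimal sketch. It applies only the one-shot rule stated in the text (a node is a cluster-head if no directly connected neighbor has a smaller ID); the function name `lowest_id_heads` and the six-node topology are hypothetical and are not taken from the cited works.

```python
def lowest_id_heads(neighbors):
    """Return the IDs of nodes that hold the lowest ID in their one-hop neighborhood.

    neighbors maps each node ID to the set of IDs it is directly connected to.
    """
    heads = set()
    for node, nbrs in neighbors.items():
        # A node becomes a cluster-head only if every direct neighbor has a higher ID.
        if all(node < other for other in nbrs):
            heads.add(node)
    return heads

# Hypothetical undirected topology with edges 1-3, 1-4, 2-5, 2-6, 4-5.
topology = {
    1: {3, 4},
    2: {5, 6},
    3: {1},
    4: {1, 5},
    5: {2, 4},
    6: {2},
}
heads = lowest_id_heads(topology)
print(sorted(heads))
```

The rule uses nothing but the ID, which is the weakness noted above: a node with little remaining energy is elected as readily as any other.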

The subtractive clustering algorithm (SCA) is an unsupervised clustering method based on automatic rule extraction,¹¹ which takes the distribution and mobility of nodes into account when determining the rules of cluster-head selection: each node is regarded as a potential cluster-head, and the cluster-heads, and hence the clustering rules, are determined automatically by calculating the density of nodes. However, there is no theoretical guidance on how to choose the optimal subtractive clustering parameters in a fuzzy system. In Kim et al.’s¹² work, the input membership function of the fuzzy neural network is first obtained by fuzzy space partition, and then the kernel rules of clustering are obtained by the SCA. In Torun and Tohumolu’s¹³ work, a genetic algorithm (GA) is used to construct a compact fuzzy model and to determine the optimum clustering rules by selecting efficient inputs and finding the optimum subtractive clustering radius. In Bagheripour and Asoodeh’s¹⁴ work, the optimal parameters of fuzzy clustering (membership function) are extracted by a differential clustering method based on hybrid GA-PSO (particle swarm optimization) technology, and the optimal clustering rules are obtained. In Ghane’i-Ostad et al.’s¹⁵ work, the subtractive clustering method is used to determine the clustering center probabilistically according to the potential of each data point, and then the clustering rules are determined. In Ahmed et al.’s¹⁶ work, subtractive clustering is used as an optimization tool to determine the optimal number of fuzzy membership functions and obtain the kernel rules of clustering. The effects of subtractive clustering parameters such as the squash factor, cluster radius, and acceptance and rejection rates on fuzzy models have been discussed.¹⁷ The performance of the model is very sensitive to the cluster radius, while the acceptance and rejection rates have little effect.¹⁸ Most of the above-mentioned works study parameter optimization with prior knowledge of the number of cluster-heads. Parameter optimization with an unknown number of cluster-heads is rarely considered.

In the above cases, the number of clusters needs to be specified before the initial parameters of the clustering algorithm are optimized. There are two methods for determining the optimal number of clusters. The first uses heuristic approaches: the number of clusters is gradually increased from an initial value to an appropriate threshold by running the clustering algorithm repeatedly, which leads to poor flexibility and low computational efficiency. The second uses finite mixture models to formulate the number of clusters before clustering, which is more efficient and convenient. The AIC and its extended criteria belong to the second method and are widely used to optimize fitting order, model parameters, and sample dimension. In the works of Yunlong and Qi et al., the AIC is used to set the fitting order of a resistance-capacitance (RC) model at different states of charge (SOCs) to improve the accuracy and practicability of the model. In Ren Chao’s work, the AIC is used to optimize the parameters of a radial basis function (RBF) neural network for global positioning system (GPS) height fitting to improve fitting accuracy. Extending Akaike’s original work, Bengtsson and Cavanaugh¹⁹ propose AICc, derived for linear regression with normal errors, to correct the propensity of AIC to favor high-dimensional models when the sample size is small relative to the maximum order of the models in the candidate class. Song et al.’s²⁰ work shows that under small-sample regression conditions, AICc performs significantly better than AIC, and the criterion is further extended to univariate Gaussian autoregressive models. However, optimizing the number of clusters with the AIC is rarely researched, especially in combination with the SCA to divide the nodes of a MANET and prolong network lifetime.

SCAA

Akaike gave an information-theoretic interpretation of the likelihood function and extended it to define a criterion for testing the goodness of assumed models. The criterion, known as the AIC, is widely used for selecting linear models and in other statistical problems. In this article, the AIC is used to determine the number of cluster-heads for the clustering algorithm. The original formula is given by equation (1),²¹ where $f$ is the likelihood function of the initial nodes and $M$ is the number of parameters used in the statistical model

$$\mathrm{AIC} = -2 \ln(f) + 2M \tag{1}$$

Consider a network with $N$ initial nodes, whose coordinates are $(x_{i}, y_{i}), i = 1, 2, \dots, N$, divided into $K$ clusters. A free space propagation model is built for the mobile nodes, in which the received signal strength at each node depends entirely on its distance from the transmitter, and a node cannot communicate directly with all other nodes in the network because its transmitting and receiving capability is limited. In such a scenario, two nodes can communicate with each other only if the distance between them does not exceed $R$. The distance between node $i$ and node $j$ is given by equation (2)

$$l_{ij} = \begin{cases} \sqrt{(x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2}}, & \sqrt{(x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2}} \leq R \\ \infty, & \text{otherwise} \end{cases} \tag{2}$$
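Equation (2) can be sketched in a few lines of Python. This is only an illustration under the assumption that node positions are given as a list of $(x, y)$ tuples; the function name `link_distances` and the sample coordinates are hypothetical.

```python
import math

def link_distances(coords, R):
    """Pairwise distance matrix of equation (2): Euclidean distance within range R, infinity otherwise."""
    n = len(coords)
    l = [[math.inf] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            d = math.hypot(xi - xj, yi - yj)
            if d <= R:
                # The two nodes can communicate directly, so the true distance is kept.
                l[i][j] = d
    return l

# Hypothetical positions in meters and a hypothetical communication range R = 10 m.
coords = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]
l = link_distances(coords, R=10.0)
print(l[0][1], l[0][2])
```

Nodes 0 and 1 are 5 m apart and keep their distance, while node 2 is out of range of both and is assigned an infinite distance.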

The maximum within-cluster deviation $l_{\max}$ and minimum within-cluster deviation $l_{\min}$ are defined as follows

$$\begin{aligned} l_{\max} &= \max \{ l_{ij} \} \\ l_{\min} &= \min \{ l_{ij} \} \end{aligned} \tag{3}$$

According to the distribution theory of sampling statistics, the likelihood function of the within-cluster deviation with $K$ cluster-heads can be represented by equations (4) and (5)

$$f = \prod_{m=1}^{K} f_{m} \tag{4}$$

$$f_{m} = \frac{Q(m)/N}{(l_{\max} - l_{\min})/K} = \frac{K}{N} \frac{Q(m)}{l_{\max} - l_{\min}}, \quad m = 1, 2, \dots, K \tag{5}$$

where $f$ is the likelihood function, $f_{m}$ is the density function of the within-cluster deviation, $Q(m)$ is the number of cluster-members in the $m$th cluster, and $K$ can be seen as the number of parameters. The AIC with $K$ cluster-heads can therefore be calculated through equations (6)–(8), and the number of cluster-heads $K_{opt}$ with the minimal AIC value is optimal

$$\mathrm{AIC} = -N \ln \left( \prod_{m=1}^{K} f_{m} \right) + 2K \tag{6}$$

$$\mathrm{AIC} = -N \sum_{m=1}^{K} \ln f_{m} + 2K \tag{7}$$

$$\mathrm{AIC} = -N \sum_{m=1}^{K} \left( \ln \frac{Q(m)}{l_{\max} - l_{\min}} + \ln \frac{K}{N} \right) + 2K \tag{8}$$
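As an illustration of how equations (5) to (8) might be evaluated, the following sketch computes the AIC for a given set of cluster sizes and selects the $K$ with the smallest value. It follows the equations as printed, with the factor $N$ in front of the log-likelihood. The function name `aic_for_clustering` and the candidate values are hypothetical and do not come from the simulation in this article.

```python
import math

def aic_for_clustering(cluster_sizes, l_max, l_min):
    """AIC of a clustering with K = len(cluster_sizes) cluster-heads."""
    K = len(cluster_sizes)
    N = sum(cluster_sizes)
    spread = l_max - l_min  # assumed positive; every cluster size is assumed positive
    # Equation (5): f_m = (K / N) * Q(m) / (l_max - l_min)
    log_likelihood = sum(math.log((K / N) * (q / spread)) for q in cluster_sizes)
    # Equation (7): AIC = -N * sum(ln f_m) + 2K
    return -N * log_likelihood + 2 * K

# Hypothetical candidates for N = 12 nodes: K -> (cluster sizes Q(m), l_max, l_min).
candidates = {
    2: ([6, 6], 40.0, 1.0),
    3: ([4, 4, 4], 6.0, 1.0),
    4: ([3, 3, 3, 3], 5.0, 1.0),
}
aic_values = {k: aic_for_clustering(*args) for k, args in candidates.items()}
k_opt = min(aic_values, key=aic_values.get)
print(aic_values, k_opt)
```

In this made-up example the spread $l_{\max} - l_{\min}$ shrinks sharply from two to three clusters and only slightly from three to four, so the penalty for the extra cluster-head outweighs the gain and $K = 3$ has the lowest AIC.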

After obtaining the optimal number of clusters $K_{opt}$, SCAA is applied to determine the optimal clustering scheme, including the positions of the cluster-heads, the cluster-members of each cluster, and the network node density function. The main steps of SCAA are as follows:

Step 1. As above, there are $N$ nodes $(z_{1}, z_{2}, \dots, z_{N})$ in the MANET. Each node $z_{i} = (x_{i}, y_{i})$ corresponds to one point in the Cartesian coordinate system, and $\| z_{i} - z_{j} \|^{2}$ is the squared distance between node $z_{i}$ and node $z_{j}$. A higher value of the mountain function $f_{z_{i}}^{1}$ indicates that $z_{i}$ has more nodes $z_{j}$ in its neighborhood. Thus, it is reasonable to select a $z_{i}$ with a high value of the mountain function $f_{z_{i}}^{1}$ as the first cluster-head. The mountain function is expressed as follows

$$f_{z_{i}}^{1} = \sum_{j=1}^{N} e^{-\frac{\| z_{i} - z_{j} \|^{2}}{(r_{1}/2)^{2}}} \tag{9}$$

where $r_{1}$ is the neighborhood range of the node; nodes outside this neighborhood have very little influence on node $z_{i}$.

Step 2. After the mountain function has been calculated for each node, the node $z_{opt}^{1}$ with the maximum mountain function value $f_{opt}^{1} = \max \{ f_{z_{i}}^{1} \}$ is selected as the first cluster-head. Then, to find the other cluster-heads, the effect of the already selected cluster-heads has to be eliminated: a value inversely related to the distance of the node from the selected cluster-head is subtracted from the previous mountain function. This process is carried out using equation (10)

$$f_{z_{i}}^{k+1} = f_{z_{i}}^{k} - f_{opt}^{k} \sum_{j=1}^{N} e^{-\frac{\| z_{j} - z_{opt}^{k} \|^{2}}{(r_{2}/2)^{2}}} \tag{10}$$

where $r_{2}$ is the neighborhood range within which the node density is reduced significantly, in general $r_{2} = 1.5 r_{1}$, and $z_{opt}^{k}$ is the $k$th cluster-head.

Step 3. Repeat Step 2 to recalculate the density of each node in the network until $f_{opt}^{p} / f_{opt}^{1} \leq E, E \in (0, 1)$. Then, there are $p$ cluster-heads in the network. Their positions can be expressed as $Z = \{ z_{opt}^{1}, z_{opt}^{2}, \dots, z_{opt}^{p} \}$.

Step 4. Repeat Step 1 to Step 3 with different combinations of the parameters $\Gamma = \{ E, r_{1} \}$. Several different clustering schemes are then obtained. If the number of cluster-heads $p$ equals the optimal number of clusters $K_{opt}$, the clustering scheme is feasible.

Step 5. If there is more than one feasible clustering scheme, an evaluation index should be established to identify the optimal one. For each clustering scheme, every node is assigned to its nearest cluster-head. The within-cluster variance $v_{ij}$ of the $i$th cluster in the $j$th scheme, $i = 1, 2, 3, \dots, K_{opt}$, is calculated from the distances between the cluster-head and the cluster-members, and the maximum within-cluster variance $\max_{i} \{ v_{ij} \}$ is selected as the index to evaluate the $j$th clustering scheme. The clustering scheme with the minimum maximum within-cluster distance variance $\min_{j} \{ \max_{i} \{ v_{ij} \} \}$ is the optimal candidate. The framework of SCAA is shown in Figure 1.

Figure 1.

Framework of the subtractive clustering algorithm based on AIC.
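Steps 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the simulation. The density update uses the single-term form of conventional subtractive clustering, in which the value subtracted from node $z_{i}$ depends on its own distance to the newly selected cluster-head, rather than the summed form printed in equation (10). A candidate whose density ratio has fallen to $E$ or below is not counted as a cluster-head, which is one possible reading of Step 3. The function names and the sample nodes are hypothetical.

```python
import math

def subtractive_heads(nodes, r1, E, r2_factor=1.5):
    """Return the indices of the cluster-heads selected by subtractive clustering."""
    r2 = r2_factor * r1  # r2 = 1.5 * r1, as suggested in Step 2

    def sq_dist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    # Step 1: mountain (density) function of equation (9) for every node.
    density = [
        sum(math.exp(-sq_dist(zi, zj) / (r1 / 2) ** 2) for zj in nodes)
        for zi in nodes
    ]
    first_peak = max(density)
    heads = []
    while True:
        # Step 2: the node with the highest remaining density is the next candidate.
        k = max(range(len(nodes)), key=density.__getitem__)
        peak = density[k]
        # Step 3: stop once the peak has dropped to E times the first peak (assumed reading).
        if peak / first_peak <= E:
            break
        heads.append(k)
        # Assumed single-term update: reduce each node's density according to
        # its own distance from the newly selected cluster-head.
        density = [
            d - peak * math.exp(-sq_dist(z, nodes[k]) / (r2 / 2) ** 2)
            for d, z in zip(density, nodes)
        ]
    return heads

def feasible_schemes(nodes, k_opt, r1_values, E_values):
    """Step 4: keep the (r1, E) combinations that yield exactly k_opt cluster-heads."""
    schemes = []
    for r1 in r1_values:
        for E in E_values:
            heads = subtractive_heads(nodes, r1, E)
            if len(heads) == k_opt:
                schemes.append({"r1": r1, "E": E, "heads": heads})
    return schemes

# Hypothetical nodes: a group of five around (0, 0) and a group of three around (50, 50).
nodes = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2), (50, 50), (52, 50), (48, 50)]
heads = subtractive_heads(nodes, r1=10, E=0.5)
feasible = feasible_schemes(nodes, k_opt=2, r1_values=[10, 20], E_values=[0.5, 0.9])
print(heads, [(s["r1"], s["E"]) for s in feasible])
```

In this example the central node of each group is selected. With $E = 0.9$ the second peak is already below the threshold, so only one cluster-head is found and the scheme is not feasible for $K_{opt} = 2$.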

Simulation

To demonstrate the effectiveness of the proposed algorithm, SCAA and the lowest ID clustering scheme are applied to a $1000 \times 1000\ \mathrm{m}^{2}$ square network containing 200 randomly distributed nodes, whose positions are shown in Figure 2, and the performance of the two algorithms is compared.

Figure 2.

Distribution of network nodes.

Using equation (6), the AIC values for different numbers of cluster-heads from 1 to 8 are calculated. The result is shown in Figure 3; the optimal number of cluster-heads is $K_{opt} = 5$.

Figure 3.

AIC index of different numbers of cluster-heads.

Applying SCAA with $K_{opt} = 5$ as the clustering criterion and searching over different combinations of the parameters $\Gamma = \{ E, r_{1} \}$ yields the six clustering schemes shown in Table 1 and Figure 4.

Table 1.

Clustering schemes.

Clustering scheme	1	2	3	4	5	6
$r_{1}$	8	13	15	20	22	25
E	0.9	0.6	0.5	0.3	0.2	0.1
Maximum within-cluster variance	40,499	39,225	39,622	32,790	28,675	27,471

Figure 4.

SCA parameters of different numbers of cluster-heads.

From Table 1, the first clustering scheme has the largest maximum within-cluster variance, $\max \{ v_{ij} \} = 40{,}499$, and is the worst scheme, while the sixth clustering scheme has the smallest, $\max \{ v_{ij} \} = 27{,}471$, and is the optimal scheme. The resulting clusterings are shown in Figures 5 to 10.
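The evaluation index of Step 5 can be sketched as below. The text does not give a formula for the within-cluster variance, so this sketch assumes it is the mean squared distance between a cluster-head and the members assigned to it; the function name `max_within_cluster_variance` and the sample nodes are hypothetical. The last lines apply the same minimum-of-maximum rule to the values reported in Table 1.

```python
import math

def max_within_cluster_variance(nodes, heads):
    """Assign every node to its nearest head and return the largest within-cluster variance.

    Assumption: the variance of a cluster is the mean squared head-to-member distance.
    """
    sq_dists = {h: [] for h in heads}
    for z in nodes:
        nearest = min(heads, key=lambda h: math.dist(z, nodes[h]))
        sq_dists[nearest].append(math.dist(z, nodes[nearest]) ** 2)
    return max(sum(v) / len(v) for v in sq_dists.values() if v)

# Hypothetical nodes on a line and two hypothetical schemes (head indices).
nodes = [(0, 0), (2, 0), (10, 0), (11, 0), (14, 0)]
schemes = {"A": [0, 2], "B": [1, 3]}
scores = {name: max_within_cluster_variance(nodes, h) for name, h in schemes.items()}
best = min(scores, key=scores.get)

# The same rule applied to the maximum within-cluster variances listed in Table 1.
table1 = {1: 40499, 2: 39225, 3: 39622, 4: 32790, 5: 28675, 6: 27471}
best_scheme = min(table1, key=table1.get)
print(scores, best, best_scheme)
```

Under this assumed definition, scheme B is preferred in the toy example because its worst cluster is more compact, and the rule picks the sixth scheme from Table 1.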

Figure 5.

The first clustering scheme.

Figure 6.

The second clustering scheme.

Figure 7.

The third clustering scheme.

Figure 8.

The fourth clustering scheme.

Figure 9.

The fifth clustering scheme.

Figure 10.

The sixth clustering scheme.

Figures 5 to 10 can be analyzed from two aspects. The first is the distance between cluster-heads: if two cluster-heads are too close, it may cause significant differences in the cluster-members of each cluster. The second is the distance from a cluster-head to the boundary: if cluster-heads are too close to the boundary, data communication within the cluster and the network may come under great pressure. Therefore, the sixth clustering scheme is more reasonable than the other clustering schemes.

The optimal scheme obtained by SCAA is more stable and reliable. It reduces the time each node needs to send data to the base station and the number of nodes participating in channel competition, which saves network energy and prolongs the survival time of the network. The mean lifetime of cluster-heads, also known as the cluster lifetime, is calculated by accumulating the durations for which nodes act as cluster-heads and dividing by the total number of cluster-heads in one simulation run. A longer mean lifetime implies that a cluster may exist for a longer period of time. In the network simulation, the initial energy of each cluster-head node is set to 1 J, and the energy consumed by nodes for receiving and sending information follows the first-order energy consumption model. The mean cluster lifetime after 500 business simulations is shown in Figure 11 for four clustering schemes.
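The mean cluster-head lifetime defined above reduces to a simple average. The sketch below assumes that one simulation run is summarized as a list with one duration per cluster-head; the function name and the sample durations are hypothetical.

```python
def mean_cluster_head_lifetime(head_durations):
    """Accumulated time spent as cluster-head divided by the number of cluster-heads in one run."""
    if not head_durations:
        return 0.0
    return sum(head_durations) / len(head_durations)

# Hypothetical durations (in seconds) for three cluster-heads observed in one run.
durations = [120.0, 80.0, 100.0]
mean_lifetime = mean_cluster_head_lifetime(durations)
print(mean_lifetime)
```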

Figure 11.

The mean cluster lifetime of different clustering schemes.
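For readers unfamiliar with the first-order energy consumption model mentioned above, the following sketch shows its commonly used form, in which transmitting $k$ bits over distance $d$ costs $E_{elec} k + \varepsilon_{amp} k d^{2}$ and receiving $k$ bits costs $E_{elec} k$. The parameter values, the packet size, and the distance are assumptions chosen for illustration and are not reported in this article; only the 1 J initial energy is taken from the simulation setup.

```python
E_ELEC = 50e-9      # J/bit, electronics energy (assumed typical value)
EPS_AMP = 100e-12   # J/bit/m^2, amplifier energy (assumed typical value)

def tx_energy(bits, distance):
    """Energy to transmit `bits` over `distance` meters under the first-order model."""
    return E_ELEC * bits + EPS_AMP * bits * distance ** 2

def rx_energy(bits):
    """Energy to receive `bits` under the first-order model."""
    return E_ELEC * bits

# Hypothetical relay: a cluster-head receives a 2000-bit packet and forwards it over 100 m.
per_packet = rx_energy(2000) + tx_energy(2000, 100.0)
# Number of such packets a cluster-head could relay with the 1 J initial energy.
packets_until_depleted = int(1.0 // per_packet)
print(per_packet, packets_until_depleted)
```

Because the transmit cost grows with the square of the distance, schemes with compact clusters drain their cluster-heads more slowly, which is consistent with the use of within-cluster distance variance as the evaluation index.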

As shown in Figure 11, the energy of the network declines as business time increases. The clustering result influences the network energy consumption: the network divided with the best clustering scheme (SCAA) consumes energy most slowly, and the network with the fourth clustering scheme is superior to that with the first clustering scheme, which agrees with the evaluation of clustering schemes based on the criterion of minimizing the maximum within-cluster distance variance. All the networks divided with an SCA clustering scheme perform better than the conventional lowest ID clustering method. Therefore, the SCAA protocol can logically divide the network without prior knowledge, which protects the energy of bottleneck cluster-heads and effectively improves the network survival time.

Conclusion

An improved SCAA is proposed for MANETs in this article. First, the AIC is used to calculate the optimal number of cluster-heads. Then, the SCAA clustering model is established to select the optimal clustering scheme, using the minimum maximum within-cluster variance as the evaluation index. Finally, the simulation results show that a network divided by the proposed clustering algorithm consumes less energy and achieves a longer network lifetime than one divided by the classical clustering algorithm.

Footnotes

Handling Editor: Sergio Toral

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors acknowledge the support of the Zhejiang provincial Public Welfare Technology Application and Research Projects of China (LGF18F010005 and LQ18F030006).

ORCID iD

Liu Banteng

References

1. Konstantopoulos, Gavalas, Pantziou. Clustering in mobile ad hoc networks through neighborhood stability-based mobility prediction. Comput Netw 2008; 9(52): 1797–1824.

2. Cai, Rui, Liu, et al. Group mobility based clustering algorithm for mobile ad hoc networks. In: Proceedings of the 17th Asia-pacific network operations and management symposium (APNOMS), Busan, 19–21 August 2015. New York: IEEE.

3. Dror, Avin, Lotker. Fast randomized algorithm for 2-hops clustering in vehicular ad-hoc networks. Ad Hoc Netw 2013; 7(11): 2002–2015.

4. Kuila, Jana. Energy efficient load-balanced clustering algorithm for wireless sensor networks. Proc Technol 2012; 6: 771–777.

5. Zhang, Xiao, Tan. Clustering algorithms for maximizing the lifetime of wireless sensor networks with energy-harvesting sensors. Comput Netw 2013; 14(57): 2689–2704.

6. Gallagher. Clustering problems for more useful benchmarking of optimization algorithms. In: Proceedings of the 10th international conference, SEAL, Dunedin, 15–18 December 2014. New York: Springer.

7. Xibin, William, Yafei, et al. Optimizing communication in mobile ad hoc network clustering. Comput Ind 2013; 7(64): 849–853.

8. Jinke, Xiaoguang, Xin, et al. A cluster-based routing protocol for mobile ad hoc networks. J Beijing Univ Aeronaut Astronaut 2016; 42(11): 2332–2339.

9. Chengfeng, Tzuchiang, Tingwei. A virtual subnet scheme on clustering algorithms for mobile ad hoc networks. Expert Syst Appl 2011; 3(38): 2099–2109.

10. Javad, Reza. Clustering the wireless ad hoc networks: a distributed learning automata approach. J Parallel Distrib Comput 2010; 4(70): 394–405.

11. Zhu, Zhang, Yang, et al. Hierarchical clustering algorithm for merged optimal path based on subtractive clustering. J Comput Eng 2015; 41(6): 178–182.

12. Kim, Lee, et al. A kernel-based subtractive clustering method. Pattern Recogn Lett 2005; 7(26): 879–891.

13. Torun, Tohumolu. Designing simulated annealing and subtractive clustering based fuzzy classifier. Appl Soft Comput 2011; 2(11): 2193–2201.

14. Bagheripour, Asoodeh. Fuzzy ruling between core porosity and petrophysical logs: subtractive clustering vs. genetic algorithm–pattern search. J Appl Geophys 2013; 99: 35–41.

15. Ghane’i-Ostad, Vahdat-Nejad, Abdolrazzagh-Nezhad. Detecting overlapping communities in LBSNs by fuzzy subtractive clustering. Soc Networks 2018; 8(1): 23.

16. Ahmed, Loo, Obo. Neuro-fuzzy model with subtractive clustering optimization for arm gesture recognition by angular representation of kinect data. In: Proceedings of the 6th international conference on informatics, electronics and vision & 2017 7th international symposium in computational medical and health technology (ICIEV-ISCMHT), Himeji, Japan, 1–3 September 2017. New York: IEEE.

17. Barchinezhad, Eftekhari, Sanatnama. A new feature ranking criterion based on density function of subtractive clustering. In: Proceedings of the 13th Iranian conference on fuzzy systems (IFSC), Qazvin, 27–29 August 2013. New York: IEEE.

18. Alfarraj, Alkhalaf. Optimized automatic generation of fuzzy rules for nonlinear system based on subtractive clustering algorithm for medical image segmentation. J Med Imag Health Inform 2017; 7(2): 500–507.

19. Bengtsson, Cavanaugh. An improved Akaike information criterion for state-space model selection. Comput Stat Data Anal 2006; 10(50): 2635–2654.

20. Song, Dong, et al. Blockwise AICc for model selection in generalized linear models. Environ Model Assess 2017; 22(1): 523–533.

21. Bondarenko, Van Malderen, Treiger, et al. Hierarchical cluster analysis with stopping rules built on Akaike’s information criterion for aerosol particle classification based on electron probe X-ray microanalysis. Chemometrics Intell Lab Syst 1994; 1(22): 87–95.