Sage Journals: Discover world-class research

Abstract

Bottom-up and top-down are the two main computing models in granular computing, by which a granule set including granules with different granularities is obtained. A top-down hyperbox granular computing classification algorithm based on isolation, IHBGrC for short, is proposed in the framework of the top-down computing model. Algorithm IHBGrC defines a novel function to measure the distance between two hyperbox granules, which is used to judge the inclusion relation between two hyperbox granules; the meet operation is used to isolate the ith class data from the other class data; and the hyperbox granule is partitioned into hyperbox granules that include the ith class data. We compare the performance of IHBGrC with support vector machines and HBGrC on a number of two-class and multiclass problems. Our computational experiments show that IHBGrC can both speed up training and achieve comparable generalization performance.

Keywords

Hyperbox granule, granular computing, inclusion measure, isolation, classification problems

Introduction

There have been many researchers working in the granular computing field. Zadeh^1,2 has identified three basic concepts, namely granulation, organization, and causation, that underlie the process of human cognition. More specifically, granulation is a process that decomposes a universe into parts. Conversely, organization is the way in which parts are integrated into the universe by the operation between two granules. Causation involves the association of causes and effects.

In general, the organization process of objects, granules, and information granules is readily used to form granular computing algorithms. For example, Jamshidi and Kaburlasos,³ Kaburlasos and Pachidis,⁴ Papadakis et al.,⁵ and Kaburlasos and Kehagias⁶ represent a granule by a vector and obtain a granule set including granules with different granularities through the partial order relation between two granules.

An associative memory is a system such that given input x produces output y. That is, the memory associates x with y. An auto-associative memory is an associative memory such that $y = x$ . Furthermore, a binary associative memory is an associative memory containing strictly binary values. Using neural networks, associative memories are able to recall the desired information, given partial or incomplete inputs.⁷

Sossa and Guevara⁸ introduce an efficient training algorithm for a dendrite morphological neural network (DMNN). For a training set, they regard the set as a hypercube and partition the hypercube into 2^N smaller hypercubes in N-dimensional space, until each hypercube includes only data with the same class label.

In this paper, isolation-based hyperbox granular computing classification algorithms are presented. Firstly, a granule is a hyperbox induced by a beginning point and an end point. Secondly, the ith class data are isolated from the others and used to form a temporary hyperbox granule set. Thirdly, all the temporary hyperbox granule sets are united into the final hyperbox granule set.

The layout of the remainder of this paper is as follows. The second section introduces motivation and related work. The third section describes algorithm IHBGrC. The fourth section presents comparative experimental results. The final section summarizes our contribution and describes future work.

Motivation and related work

In this section, the motivation for this proposed research work is presented, and some related works are discussed.

Motivation

For granular computing classification (GrC) in view of set theory, a granule is represented as a subset of the training set S. In general, the distance between two non-empty sets is the minimum of the distances between any two of their respective points,⁹ i.e.

$$d(A, B) = \min_{x \in A,\, y \in B} d(x, y) \tag{1}$$

where d(x, y) is the Euclidean distance between two points. Formula (1) is suitable only when the intersection of set A and set B is empty. In Figure 1, sets A = {x₁, x₂, x₃, x₄, x₅} and B = {y₁, y₂, y₃, y₄, y₅, y₆} are denoted by balls A and B. In Figure 1(a), the distance between A and B is the distance between points x₅ and y₆, and obviously d(A, B) is greater than 0. In Figure 1(b), the distance between sets A and B is also the distance between x₅ and y₆. If the distance between x₅ and y₆ in Figure 1(a) is equal to the distance between x₅ and y₆ in Figure 1(b), then d(A, B) in Figure 1(a) is equal to d(A, B) in Figure 1(b) according to formula (1). Intuitively, however, the two sets in Figure 1(b) are closer to each other than those in Figure 1(a). Distance formula (1) therefore does not reflect the real distance between the two sets. We define a distance between two sets that are represented as hyperboxes, and form hyperbox granular computing based on the defined distance measure.

Figure 1.

Distances defined by formula (1) between two sets.
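The limitation of formula (1) can be illustrated with a minimal Python sketch. The point sets below are hypothetical and are not the data of Figure 1; they only show that formula (1) depends on the single closest pair of points and ignores where the remaining points lie.

```python
import math
from itertools import product

def set_distance(A, B):
    """Formula (1): minimum Euclidean distance over all pairs (x, y), x in A, y in B."""
    return min(math.dist(x, y) for x, y in product(A, B))

# Hypothetical point sets, chosen only for illustration.
A = [(0.0, 0.0), (1.0, 0.0)]
B_far = [(2.0, 0.0), (5.0, 0.0)]    # the second point of B lies far from A
B_near = [(2.0, 0.0), (2.0, 0.5)]   # the second point of B lies close to A

# Both calls return the same value, because only the closest pair matters.
print(set_distance(A, B_far), set_distance(A, B_near))
```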

On the other hand, the join and meet operations are used to generate another granule. In most cases, the join operation is used by granular computing algorithms, and the meet operation is used to measure the fuzzy inclusion relation; for example, the fuzzy inclusion measure σ(G₁, G₂) is defined by the positive valuation values of G₁ and of the meet of G₁ and G₂.

The decomposition method is the main strategy of divide-and-conquer methods,^10,11 which transform a complex task into simpler tasks. In view of the limitation of the traditional distance formula (1) and the advantage of the decomposition method, we form granular classification algorithms using the novel distance between two hyperbox granules and the meet operation between two hyperbox granules. The algorithms use decomposition to divide the data set into subsets, and these subsets are regarded as granules. The novel distance formula is used to determine whether a datum is located inside the meet hyperbox. The labels of the data lying inside the meet hyperbox determine the isolation process.

Related work

GrC has been proposed and studied in many fields, including machine learning and data analysis.^{3–6,12–18} In general, GrC is an emerging computing paradigm of information processing based on lattice computing theory.

GrC includes two kinds of computing models according to lattice computing theory. One is the granular structure in terms of the relation between object and attribute. The other is fuzzy lattice reasoning induced by the relation between two objects based on lattice theory, such as the fuzzy lattice, which defines the fuzzy inclusion measure between two granules, where a granule is related to an object. Pedrycz and his colleagues set up a conceptual framework composed of generic and conceptually meaningful entities, namely information granules; this framework is a granular structure and relates to problem formulation and problem solving. Within this framework they formulate generic concepts by adopting a certain level of abstraction, carry out further processing, and communicate the results to the external environment. A few examples from image processing and from the processing and interpretation of time series offer compelling evidence.^12–15 A triarchic structure of granular computing integrates three important perspectives, namely the philosophy of structured thinking, the methodology of structured problem solving, and the mechanism of structured information processing.^16,17 Fuzzy lattice reasoning was proposed by Kaburlasos and his colleagues and generates granules with different granularities by the meet operation and the join operation between two granules.^3–6,18

The difference between the granular structure and fuzzy lattice reasoning is the different fuzzy relations. The granular structure mainly discusses the relations between object and attribute, and fuzzy lattice reasoning mainly discusses the partial order relations between two objects. In the literature of granular computing, the distance between two granules is seldom discussed. So we discuss the distance between two granules and isolate the ith class data from the other data.

Granular computing classification algorithms based on isolation

For N-dimensional space, we form the top-down IHBGrC in terms of the following steps. Firstly, two points called the beginning point and the end point are used to represent the hyperbox granule, and each sample is regarded as an atomic hyperbox granule that cannot be divided. Secondly, the distance measure between two hyperbox granules is defined to measure the inclusion relation. Thirdly, the meet operation ∧ between two hyperbox granules is designed to isolate the ith class data from the other class data.

Representation of the hyperbox granule

For the training set S composed of ℓ N-dimensional input vectors, two points x = (x₁, x₂,…, x_N) and y = (y₁, y₂,…, y_N) are used to represent the hyperbox granule. The form of the granule is HB = (x, y, g_r), where $x \preceq y$. Here $\preceq$ is the partial order relation between two vectors, defined as follows.

$$x \preceq y \Leftrightarrow x_1 \leq y_1,\ x_2 \leq y_2,\ \dots,\ x_N \leq y_N$$

where ≤ is the less-than-or-equal relation between two scalars. Here, point x is called the beginning point, and y is called the end point. The granularity g_r is the size of the hyperbox granule and is defined as the distance between the beginning point and the end point.

For example, in two-dimensional space, HB₁ = [0.1, 0.2, 0.4, 0.6] represents the hyperbox granule shown in Figure 2, which has the beginning point (0.1, 0.2) and the end point (0.4, 0.6). The length of the hyperbox granule equals 0.4, and its width equals 0.3. The granularity of the hyperbox granule is 0.5, which is determined by the beginning point and the end point. Another example is the atomic hyperbox granule HB₂ = [0.5, 0.6, 0.5, 0.6] shown in Figure 2 with granularity 0, which represents the single point (0.5, 0.6).

Figure 2.

Hypergranules in two-dimensional space.
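The representation above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `Hyperbox`, `granularity`, and `precedes` are hypothetical, and the Euclidean distance is assumed for the granularity because it reproduces the value 0.5 given for HB₁.

```python
import math
from typing import NamedTuple, Tuple

class Hyperbox(NamedTuple):
    bp: Tuple[float, ...]  # beginning point
    ep: Tuple[float, ...]  # end point

    def granularity(self) -> float:
        """Distance between beginning and end point (Euclidean, by assumption)."""
        return math.dist(self.bp, self.ep)

def precedes(x, y) -> bool:
    """Partial order between vectors: every component of x is <= that of y."""
    return all(a <= b for a, b in zip(x, y))

hb1 = Hyperbox((0.1, 0.2), (0.4, 0.6))
hb2 = Hyperbox((0.5, 0.6), (0.5, 0.6))  # atomic granule: a single point

print(precedes(hb1.bp, hb1.ep))          # a valid granule needs bp to precede ep
print(round(hb1.granularity(), 6), hb2.granularity())
```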

Distance measure

Distance is a numerical description of how far apart objects are. In analytic geometry, the distance between two points of the xy-plane can be found using the distance formula. In the Euclidean space R^N, the distance between two points is usually given by the Euclidean distance. In mathematics, in particular geometry, a distance function on a given set M is a function d: M × M → R, where R denotes the set of real numbers. Similarly, in the granule space induced by hyperbox granules, we define the distance between two hyperbox granules HB₁ = (Bp₁, Ep₁) and HB₂ = (Bp₂, Ep₂) as follows.

Firstly, the distance between point P and hyperbox granule HB is defined as

$$D(P, HB) = d(P, Bp) + d(P, Ep) - d(Bp, Ep) \tag{2}$$

where Bp is the beginning point, denoted as Bp = (x₁, x₂,…, x_N), Ep is the end point, denoted as Ep = (y₁, y₂,…, y_N), and d(·,·) is the Manhattan distance between two points.
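Formula (2) can be sketched in a few lines of Python. The function names and the example hyperbox are hypothetical and serve only to illustrate the computation.

```python
def manhattan(p, q):
    """Manhattan distance d(p, q) between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def point_to_box(p, bp, ep):
    """Formula (2): D(P, HB) = d(P, Bp) + d(P, Ep) - d(Bp, Ep)."""
    return manhattan(p, bp) + manhattan(p, ep) - manhattan(bp, ep)

bp, ep = (0.0, 0.0), (1.0, 2.0)               # hypothetical hyperbox
inside = point_to_box((0.5, 1.0), bp, ep)     # point inside the hyperbox
outside = point_to_box((1.5, 1.0), bp, ep)    # point 0.5 beyond the end point in x
print(inside, outside)
```

With the Manhattan distance, each coordinate that falls outside the interval spanned by the hyperbox contributes twice its overshoot, and a point inside the hyperbox contributes nothing.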

Secondly, the distance between two hyperbox granules HB₁ = (Bp₁, Ep₁, g₁) and HB₂ = (Bp₂, Ep₂, g₂) is defined as

$$D(HB_1, HB_2) = \frac{D(Bp_1, HB_2) + D(Ep_1, HB_2)}{2} \tag{3}$$

The distance between two hyperbox granules has the following properties.

Theorem 3.1

$$D(HB_1, HB_2) \geq 0, \qquad D(HB_1, HB_2) = 0 \Leftrightarrow HB_1 \subseteq HB_2$$

Proof

Because D(Bp₁, HB₂) ≥ 0 and D(Ep₁, HB₂) ≥ 0,

D(HB₁, HB₂) = (D(Bp₁,HB₂) + D(Ep₁, HB₂))/2 ≥ 0.

If D(HB₁, HB₂) = 0, D(Bp₁,HB₂) = 0 and D(Ep₁, HB₂) = 0. Both Bp₁ and Ep₁ are included in hyperbox granule HB₂, namely HB₁ ⊆ HB₂.

If HB₁ ⊆ HB₂, both Bp₁ and Ep₁ are included in hyperbox granule HB₂. According to formula (2), D(Bp₁,HB₂) = 0 and D(Ep₁,HB₂) = 0, namely

$$D(HB_1, HB_2) = \frac{D(Bp_1, HB_2) + D(Ep_1, HB_2)}{2} = 0.$$

Especially, if HB₁ is an atomic hyperbox granule induced by a point P, we can judge whether the point P is located in the hyperbox granule HB₂ by D(HB₁,HB₂). If D(HB₁,HB₂) = 0, the point P is located in HB₂, otherwise the point P is out of HB₂.

Theorem 3.2

The distance (equation (3)) is not a metric distance and does not satisfy symmetry, namely

$$D(HB_1, HB_2) \neq D(HB_2, HB_1)$$

Proof

$$\begin{aligned} D(HB_1, HB_2) &= \frac{D(Bp_1, HB_2) + D(Ep_1, HB_2)}{2} \\ &= \frac{d(Bp_1, Bp_2) + d(Bp_1, Ep_2) - d(Bp_2, Ep_2) + d(Ep_1, Bp_2) + d(Ep_1, Ep_2) - d(Bp_2, Ep_2)}{2} \\ &= \frac{d(Bp_1, Bp_2) + d(Bp_1, Ep_2) + d(Ep_1, Bp_2) + d(Ep_1, Ep_2)}{2} - d(Bp_2, Ep_2) \end{aligned}$$

Similarly,

$$D(HB_2, HB_1) = \frac{d(Bp_1, Bp_2) + d(Bp_1, Ep_2) + d(Ep_1, Bp_2) + d(Ep_1, Ep_2)}{2} - d(Bp_1, Ep_1).$$

Generally, D(HB₁, HB₂) ≠ D(HB₂, HB₁). In particular, D(HB₁, HB₂) = D(HB₂, HB₁) when d(Bp₁, Ep₁) = d(Bp₂, Ep₂).

For example, in two-dimensional space, for the two hyperbox granules HB₁ = [0.2 0.1 0.3 0.4] and HB₂ = [0.25 0.15 0.4 0.5], the distances are computed with the Manhattan distance as follows: d(Bp₁, Bp₂) = 0.1, d(Bp₁, Ep₂) = 0.6, d(Ep₁, Bp₂) = 0.3, d(Ep₁, Ep₂) = 0.2, d(Bp₁, Ep₁) = 0.4, and d(Bp₂, Ep₂) = 0.5, so that D(HB₁, HB₂) = (0.1 + 0.6 + 0.3 + 0.2)/2 − 0.5 = 0.1 and D(HB₂, HB₁) = (0.1 + 0.6 + 0.3 + 0.2)/2 − 0.4 = 0.2.
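Both properties, zero distance under inclusion and the lack of symmetry, can be checked numerically. The following Python sketch implements formulas (2) and (3) under the assumption that a hyperbox is stored as a pair (beginning point, end point); the helper names and the two nested boxes are hypothetical.

```python
def manhattan(p, q):
    """Manhattan distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def point_to_box(p, box):
    """Formula (2): distance between point p and hyperbox box = (bp, ep)."""
    bp, ep = box
    return manhattan(p, bp) + manhattan(p, ep) - manhattan(bp, ep)

def box_distance(hb1, hb2):
    """Formula (3): mean of D(Bp1, HB2) and D(Ep1, HB2)."""
    bp1, ep1 = hb1
    return (point_to_box(bp1, hb2) + point_to_box(ep1, hb2)) / 2

inner = ((0.25, 0.25), (0.5, 0.5))   # hypothetical box contained in `outer`
outer = ((0.0, 0.0), (1.0, 1.0))

print(box_distance(inner, outer))    # inclusion: the distance vanishes
print(box_distance(outer, inner))    # reversed arguments: a different value
```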

Operation between two granules

In N-dimensional space, any two points x = (x₁, x₂,…, x_N) and y = (y₁, y₂,…, y_N) can form a hyperbox granule HB = (Bp, Ep), where

$$Bp = x \land y = (\min\{x_1, y_1\}, \min\{x_2, y_2\}, \dots, \min\{x_N, y_N\}) \tag{4a}$$

$$Ep = x \lor y = (\max\{x_1, y_1\}, \max\{x_2, y_2\}, \dots, \max\{x_N, y_N\}) \tag{4b}$$

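To avoid points lying on an edge or face (hyperplane) of the hyperbox, the relaxation factor ξ is used to form the hyperbox for the subset s.

$$HB = \begin{cases} [Bp - \xi,\ Ep + \xi], \quad Bp = \bigwedge_{x_i \in s} x_i,\ Ep = \bigvee_{x_i \in s} x_i, & s \neq \emptyset \\ \emptyset, & s = \emptyset \end{cases} \tag{5}$$

where ∧ and ∨ taken over s denote the componentwise minimum and maximum of all x ∈ s, respectively.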

The join operator ∨ between two hyperbox granules is designed to achieve the hyperbox granule with larger granularity compared with the original hyperbox granules. For two hyperbox granules HB₁ = (Bp₁, Ep₁) and HB₂ = (Bp₂, Ep₂), the join operation ∨ is designed as follows.

$$HB_1 \lor HB_2 = (Bp_1 \land Bp_2,\ Ep_1 \lor Ep_2) \tag{6}$$

Conversely, the meet operation ∧ between two hyperbox granules is designed to obtain the hyperbox granule with the smaller granularity compared with the original hyperbox granules. The meet operation ∧ is designed as follows.

$$HB_1 \land HB_2 = \begin{cases} (Bp_1 \lor Bp_2,\ Ep_1 \land Ep_2), & Bp_1 \lor Bp_2 \preceq Ep_1 \land Ep_2 \\ \emptyset, & \text{otherwise} \end{cases} \tag{7}$$

From formula (6), we can see Bp₁ ∧ Bp₂ ≼ Bp₁, Bp₁ ∧ Bp₂ ≼ Bp₂, Ep₁ ≼ Ep₁ ∨ Ep₂, Ep₂ ≼ Ep₁ ∨ Ep₂, ||Bp₁ ∧ Bp₂ − Ep₁ ∨ Ep₂||₂ ≥ ||Bp₁ − Ep₁||₂, and ||Bp₁ ∧ Bp₂ − Ep₁ ∨ Ep₂||₂ ≥ ||Bp₂ − Ep₂||₂; namely, the granularity of HB₁ ∨ HB₂ is greater than or equal to the granularities of HB₁ and HB₂, and the join operation ∨ induces a hyperbox granule with larger granularity than the original granules. From formula (7), we draw the opposite conclusion that the meet operation induces a hyperbox granule with smaller granularity than the original granules.
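The operations of this subsection can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: hyperboxes are assumed to be pairs (beginning point, end point), the empty granule is represented by `None`, and all function names are hypothetical. The two three-dimensional boxes are the ones used later in the description of step S5.

```python
def vmin(x, y):
    """Componentwise minimum of two vectors, formula (4a)."""
    return tuple(min(a, b) for a, b in zip(x, y))

def vmax(x, y):
    """Componentwise maximum of two vectors, formula (4b)."""
    return tuple(max(a, b) for a, b in zip(x, y))

def box_from_points(points, xi=0.0):
    """Formula (5): smallest hyperbox around `points`, relaxed by xi; None if empty."""
    if not points:
        return None
    bp = tuple(min(c) - xi for c in zip(*points))
    ep = tuple(max(c) + xi for c in zip(*points))
    return bp, ep

def join(hb1, hb2):
    """Formula (6): smallest hyperbox containing both granules."""
    (bp1, ep1), (bp2, ep2) = hb1, hb2
    return vmin(bp1, bp2), vmax(ep1, ep2)

def meet(hb1, hb2):
    """Formula (7): overlap of the two granules, or None when they do not overlap."""
    (bp1, ep1), (bp2, ep2) = hb1, hb2
    bp, ep = vmax(bp1, bp2), vmin(ep1, ep2)
    if all(a <= b for a, b in zip(bp, ep)):
        return bp, ep
    return None

hb1 = ((0.1, 0.2, 0.3), (0.2, 0.4, 0.7))
hb2 = ((0.15, 0.35, 0.55), (0.3, 0.5, 0.9))
print(join(hb1, hb2))
print(meet(hb1, hb2))
```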

Isolation algorithm

For the training set

$$S = \{(x, y) \mid x \in R^{N},\ y \in N^{+}\} \tag{8}$$

where x is the input vector, y is the class label, R is the set of real numbers, and N⁺ is the set of natural numbers greater than 0. We form the IHBGrC algorithm in two stages. The first stage corresponds to Algorithm 1, which isolates the ith class data from the other class data and obtains the temporary hyperbox granule set. The second stage corresponds to Algorithm 2, which obtains the hyperbox granule set by uniting all the temporary hyperbox granule sets.

Algorithm 1: the ith class isolation algorithm

Input: the training set S

Output: the hyperbox granule set GS_i and the corresponding class labels lab_i

S1. GS_i = Ø, lab_i = Ø

S2. Extract the ith class data from the training set S

S3. Compute the hyperbox HB₁ = [a₁ b₁] from the ith class data and HB₂ = [a₂ b₂] from the data of the other classes

S4. If HB₁ is empty, terminate the procedure and return GS_i

S5. If the meet hyperbox HB of HB₁ and HB₂ is empty, then GS_i = GS_i ∪ {HB₁} and HBt = HB₁, else

S6. If HB = HB₁, then HBt = HB₁, else

S7. Compute the vertex set P of HB₁ from its beginning point and end point

S8. If the beginning point x of HB and the beginning point x₁ of HB₁ are identical, then HBt is induced by x and the vertex set P, else

S9. HBt is induced by y and the vertex set P

S10. Find the labels of the data included in the hyperbox granule HBt

S11. If all the labels are equal to i, then GS_i = GS_i ∪ {HBt} and remove these data; otherwise update S, namely S includes the data with different class labels, and go to S2.

In S3, the hyperbox can be formed by formula (5).

In S7, there are 2^N vertices in N-dimensional space. For HB₁ = [0.1 0.2 0.3 0.2 0.4 0.7] in three-dimensional space, the beginning point is Bp = (0.1 0.2 0.3) and the end point is Ep = (0.2 0.4 0.7); the vertex set P induced by HB₁ is shown in Figure 3.

Figure 3.

Eight vertexes of hyperbox in three-dimensional space.

In S5, suppose HB₁ = [0.1 0.2 0.3 0.2 0.4 0.7] and HB₂ = [0.15 0.35 0.55 0.3 0.5 0.9], where HB₁ has the positive class and HB₂ has the negative class. The meet hyperbox is HB = [0.15 0.35 0.55 0.2 0.4 0.7], which is obtained by formula (7) and is nonempty; the beginning point of HB is a = (0.15 0.35 0.55), and the end point of HB is b = (0.2 0.4 0.7).

In S7, we compute eight hyperboxes. They are [0.1 0.2 0.3 0.15 0.35 0.55], [0.1 0.2 0.55 0.15 0.35 0.7], [0.1 0.35 0.3 0.15 0.4 0.55], [0.1 0.35 0.55 0.15 0.4 0.7], [0.15 0.2 0.3 0.2 0.35 0.55], [0.15 0.2 0.55 0.2 0.35 0.7], [0.15 0.35 0.3 0.2 0.4 0.55], and [0.15 0.35 0.55 0.2 0.4 0.7], where [0.15 0.35 0.55 0.2 0.4 0.7] may include data of other classes, and the other seven hyperboxes are added to the hyperbox granule set GS.

Finally, the data included in hyperbox [0.15 0.35 0.55 0.2 0.4 0.7] compose the updated training set S, and the procedure is performed again from S2.
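The eight hyperboxes of the three-dimensional example can be reproduced by cutting HB₁ along every axis at one corner of the meet hyperbox. The Python sketch below illustrates only this partition step; it assumes a single cut point inside the box and is not a full implementation of Algorithm 1. The function name `split_at` is hypothetical.

```python
from itertools import product

def split_at(bp, ep, cut):
    """Split hyperbox [bp, ep] at the point `cut` into 2^N axis-aligned sub-boxes."""
    # For every dimension there are two intervals: [lo, c] and [c, hi].
    per_dim = [((lo, c), (c, hi)) for lo, hi, c in zip(bp, ep, cut)]
    boxes = []
    for choice in product(*per_dim):          # one interval per dimension
        sub_bp = tuple(interval[0] for interval in choice)
        sub_ep = tuple(interval[1] for interval in choice)
        boxes.append((sub_bp, sub_ep))
    return boxes

# HB1 from the example, cut at the beginning point of the meet hyperbox.
boxes = split_at((0.1, 0.2, 0.3), (0.2, 0.4, 0.7), (0.15, 0.35, 0.55))
for sub_bp, sub_ep in boxes:
    print(sub_bp, sub_ep)
```

The last box produced is the meet hyperbox itself, which is the only one that may still contain data of other classes.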

A key issue in the design of GrC is its training. To explain the algorithm, a simple example of a two-class problem with two attributes is used. Figure 4 shows the example patterns; each input has two attributes, and each output is 1 or 2.

Figure 4.

Training data in two-dimensional space.

Given n-class classification problem S = {(x, y)|x ∈ R^N, y ∈ N⁺}, the IHBGrC is performed by the following steps:

Algorithm 2: IHBGrC algorithm

Input: the training set S

Output: the hyperbox granule set GS and the corresponding class labels lab

S1. GS = Ø, lab = Ø

S2. i = 1

S3. Obtain the hyperbox granule set GS_i and lab_i by Algorithm 1 for the ith class data

S4. GS = GS ∪ GS_i and lab = lab ∪ lab_i

S5. If i = n, then the procedure is terminated and GS is returned

S6. i = i + 1, go to S3
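The outer loop of Algorithm 2 can be sketched in Python as follows. The sketch takes the per-class isolation step as a parameter, and `atomic_boxes` is a deliberately naive, hypothetical stand-in for Algorithm 1 that returns one atomic hyperbox per sample; it is included only so that the loop can be run.

```python
def ihbgrc(samples, labels, isolate_class):
    """Algorithm 2: unite the per-class granule sets returned by `isolate_class`."""
    granules, granule_labels = [], []
    for i in sorted(set(labels)):                  # S2, S5, S6: loop over the classes
        gs_i = isolate_class(samples, labels, i)   # S3: granules of class i
        granules.extend(gs_i)                      # S4: GS = GS ∪ GS_i
        granule_labels.extend([i] * len(gs_i))     # S4: lab = lab ∪ lab_i
    return granules, granule_labels

def atomic_boxes(samples, labels, i):
    """Hypothetical stand-in for Algorithm 1: one atomic hyperbox per class-i sample."""
    return [(x, x) for x, y in zip(samples, labels) if y == i]

samples = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]     # hypothetical training data
labels = [1, 2, 1]
GS, lab = ihbgrc(samples, labels, atomic_boxes)
print(GS, lab)
```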

For the training data shown in Figure 4, if set ξ = 0.0001, 12 hyperboxes and the corresponding class labels are obtained by performing algorithm 2. They are

HB₁ = [−0.5901 −1.1801 0.9801 0.9801], lab₁ = 1
HB₂ = [−0.3401 1.2399 −0.3399 1.2401], lab₂ = 1
HB₃ = [1.2899 −0.6001 1.7101 0.9501], lab₃ = 1
HB₄ = [2.2499 1.2299 2.2501 1.2301], lab₄ = 1
HB₅ = [1.6899 1.3899 2.2901 2.7001], lab₅ = 1
HB₆ = [2.6199 2.4299 2.6201 2.4301], lab₆ = 1
HB₇ = [1.1299 0.9899 1.1301 0.9901], lab₇ = 2
HB₈ = [1.5199 2.4299 1.5201 2.4301], lab₈ = 2
HB₉ = [2.7699 0.9999 4.4401 1.1901], lab₉ = 2
HB₁₀ = [2.6199 1.3599 2.6201 1.3601], lab₁₀ = 2
HB₁₁ = [2.6199 1.2599 3.9001 2.4001], lab₁₁ = 2
HB₁₂ = [2.6799 2.5699 4.1001 3.0801], lab₁₂ = 2

For class label 1, all the data with class label 1 are included in one of the hyperbox granules of the set GS₁ with class labels lab₁, shown in Figure 5, by performing S3. For class label 2, all the data with class label 2 are included in one of the hyperbox granules of the set GS₂ with class labels lab₂, shown in Figure 6, by performing S3. The final hyperbox granule set is GS = GS₁ ∪ GS₂, and all the training data lie inside one of the hyperbox granules of GS with the corresponding class labels, as shown in Figure 7.

Figure 5.

The first class data and the formed hyperboxes.

Figure 6.

The second class data and the formed hyperboxes.

Figure 7.

The training data and their formed hyperboxes.

For testing, the unknown input datum x is represented by the atomic hyperbox granule [x x], whose beginning point and end point are identical. The distance between the atomic hyperbox granule [x x] and each hyperbox granule of GS is computed, and the label of the hyperbox in GS that is nearest to [x x] is taken as the class label of x.

Taking x = (0.5, 0.4) as an example, the atomic hyperbox granule induced by x is HB_x = [0.5 0.4 0.5 0.4], and the distances between HB_x and the hyperbox granules of GS are

$$D(HB_x, HB_1) = d(x, Bp_1) + d(x, Ep_1) - d(Bp_1, Ep_1) = 2.6702 + 1.0602 - 3.7304 = 0$$

Similarly,

D(HB_x, HB₂) = 3.3596, D(HB_x, HB₃) = 1.5798, D(HB_x, HB₄) = 5.1596, D(HB_x, HB₅) = 4.3596, D(HB_x, HB₆) = 8.2996, D(HB_x, HB₇) = 2.4396, D(HB_x, HB₈) = 6.0996, D(HB_x, HB₉) = 5.7396, D(HB_x, HB₁₀) = 6.1596, D(HB_x, HB₁₁) = 5.9596, D(HB_x, HB₁₂) = 8.6996.

So the class label of datum x is the corresponding class label of HB₁, namely the class label of datum x is 1.
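The testing step can be sketched in Python with the 12 hyperbox granules listed above. The function names are hypothetical; for an atomic granule [x x], formula (3) reduces to formula (2), so only the point-to-hyperbox distance is needed.

```python
def manhattan(p, q):
    """Manhattan distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def point_to_box(p, bp, ep):
    """Formula (2); equals formula (3) when the first granule is atomic."""
    return manhattan(p, bp) + manhattan(p, ep) - manhattan(bp, ep)

# (beginning point, end point, class label) of the 12 granules obtained for Figure 4.
GS = [
    ((-0.5901, -1.1801), (0.9801, 0.9801), 1),
    ((-0.3401, 1.2399), (-0.3399, 1.2401), 1),
    ((1.2899, -0.6001), (1.7101, 0.9501), 1),
    ((2.2499, 1.2299), (2.2501, 1.2301), 1),
    ((1.6899, 1.3899), (2.2901, 2.7001), 1),
    ((2.6199, 2.4299), (2.6201, 2.4301), 1),
    ((1.1299, 0.9899), (1.1301, 0.9901), 2),
    ((1.5199, 2.4299), (1.5201, 2.4301), 2),
    ((2.7699, 0.9999), (4.4401, 1.1901), 2),
    ((2.6199, 1.3599), (2.6201, 1.3601), 2),
    ((2.6199, 1.2599), (3.9001, 2.4001), 2),
    ((2.6799, 2.5699), (4.1001, 3.0801), 2),
]

def classify(x, granules):
    """Return the label of the granule nearest to the atomic granule [x x]."""
    bp, ep, label = min(granules, key=lambda g: point_to_box(x, g[0], g[1]))
    return label

print(classify((0.5, 0.4), GS))
```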

The achieved granule set has two significant properties: (1) there is no intersection between any two hyperbox granules, and (2) each training sample is included in exactly one hyperbox granule.

Experiments

In this section, experiments are performed to validate IHBGrC on two-class problems and n-class problems in two-dimensional space and N-dimensional space. We evaluated the effectiveness of IHBGrC on an Intel(R) Core(TM) i5 CPU at 3.2 GHz with 8 GB memory, running Microsoft Win7 and Matlab2008. We mainly compare IHBGrC with support vector machines (SVMs)¹⁹ in terms of the size of GS or of the support vector sets (SVs), testing accuracy (generalization ability) (%), and training time (s).

Classification problems in two-dimensional space

The first data set is the Ripley’s synthetic data set that consists of two classes in two-dimensional space.²⁰ The data are divided into a training data set and a test set consisting of 250 and 1000 samples, respectively, with the same number of samples belonging to each of the two classes. The training data set appears in Figure 8.

Figure 8.

Ripley’s synthetic training data set.

For IHBGrC, the obtained GS includes 24 hyperbox granules with class label 1 and 26 hyperbox granules with class label 2 and is shown in Figure 9. The testing data and the hyperboxes are shown in Figure 10. For Ripley’s data set, the training accuracy of IHBGrC is 100%, and its testing accuracy is 88.6% when the parameter epsilon is set to 0.00001; the testing accuracy of SLMP-R is 81.2%, and the testing accuracy of SLMP-P is 87.8%.²¹ Thus, for Ripley’s data set, the testing accuracy of IHBGrC is greater than those of SLMP-R and SLMP-P.

Figure 9.

GS by IHBGrC and 250 training data for Ripley.

Figure 10.

GS by IHBGrC and 1000 testing samples for Ripley.

For SVMs, the best testing accuracy is used as the index of the optimal SVM. The best testing accuracy is 90.3%, obtained when C = 5000 and the kernel function is a polynomial kernel of order 5. The corresponding training accuracy is 89.6%. The training data and the classification boundary are shown in Figure 11.

Figure 11.

250 Training data and the boundary by SVMs for Ripley, the training accuracy is 89.6%.

The second data set is the spiral synthetic data set, which consists of three classes and is used to verify the feasibility of IHBGrC for multi-class classification problems in two-dimensional space.²² All the training data are shown in Figure 12, and the achieved hyperbox granules are shown in Figure 13. The achieved hyperbox granule set includes 61 hyperbox granules, where 31 hyperbox granules have class label 1 and are marked by the red rectangles in Figure 13, 6 hyperbox granules have class label 2 and are marked by the green rectangles in Figure 13, and 17 hyperbox granules have class label 3 and are marked by the blue rectangles in Figure 13.

Figure 12.

Training data of spiral classification problem.

Figure 13.

The achieved hyperboxes of IHBGrC for the spiral classification problem.

For SVMs, the one-vs-rest strategy is adopted to the multi-class classification problems. The training accuracy is 100% when the radial basis function (RBF) kernel function with the width 5 is selected to perform the SVMs. Figure 14 shows the boundary of 1-vs-rest SVMs, Figure 15 shows the boundary of 2-vs-rest SVMs, and Figure 16 shows the boundary of 3-vs-rest SVMs.

Figure 14.

Boundary of 1-vs-rest SVMs for the spiral classification problem.

Figure 15.

Boundary of 2-vs-rest SVMs for the spiral classification problem.

Figure 16.

Boundary of 3-vs-rest SVMs for the spiral classification problem.

Classification problems in N-dimensional space

In order to evaluate the performance of IHBGrC in space R^N, five data sets listed in Table 1 from the UCI Machine Learning Repository²³ are selected to perform the algorithms by 10-fold cross validation.

Table 1.

Classification problems in space R^N.

Data sets	Sizes	Attributes (R^N)	Classes
banknote	1372	4	2
wilt	4839	5	2
iris	150	4	3
sensor4	5456	4	4
shuttle	58,000	9	7

Banknote data were extracted from images that were taken from genuine and forged banknote-like specimens. For digitization, an industrial camera usually used for print inspection was used. The final images have 400 × 400 pixels. Due to the object lens and the distance to the investigated object, gray-scale pictures with a resolution of about 660 dpi were obtained. A wavelet transform tool was used to extract features from the images.

The data set wilt contains some data from a remote sensing study by Johnson et al. (2013)²⁴ that involved detecting diseased trees in Quickbird imagery. There are few training samples for the diseased trees class and many for other land cover class.

The data set iris contains three classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are NOT linearly separable from each other.

The data set sensor4 contains four sensor readings named “simplified distances” and the corresponding class label. These simplified distances are referred to as the “front distance,” “left distance,” “right distance,” and “back distance.” They consist, respectively, of the minimum sensor readings among those within 60° arcs located at the front, left, right, and back parts of the robot.

For data set shuttle, approximately 80% of the data belongs to class 1. Therefore, the default accuracy is about 80%. The aim here is to obtain an accuracy of 99–99.9%.

We compared IHBGrC with HBGrC and SVMs; the running times and testing accuracies on the five data sets are shown in Table 2.

Table 2.

Performance of IHBGrC for classification problems in R^N.

Datasets	Algorithms	Running time min (s)	Running time max (s)	Running time mean (s)	Testing accuracy min (%)	Testing accuracy max (%)	Testing accuracy mean (%)
banknote	IHBGrC	0.1872	0.2184	0.2044	97.0803	100	98.9802
	HBGrC	0.3214	0.3463	0.3360	97.8102	99.2701	98.1022
	SVMs	0.2964	0.5304	0.4352	95.6204	100	98.6152
wilt	IHBGrC	0.4368	0.7176	0.6256	93.9024	99.1718	97.1335
	HBGrC	12.544	12.954	12.786	92.4797	98.7578	97.1776
	SVMs	19.0633	29.9678	26.5327	85.5072	98.5507	95.6719
iris	IHBGrC	0.0156	0.0780	0.0452	86.6667	100	96
	HBGrC	0.4992	0.6864	0.6068	86.6667	100	95.3333
	SVMs	0.0468	0.1404	0.0577	86.6667	100	98
sensor4	IHBGrC	0.0312	0.0624	0.0530	96.1326	100	98.8130
	HBGrC	1.9032	2.7456	2.4196	97.4217	100	99.4551
	SVMs	12.0121	13.8217	12.856	96.3168	100	99.1922
shuttle	IHBGrC	40.5915	43.4775	41.5197	99.8793	99.9828	99.9293
	HBGrC	44.0391	48.1887	46.6100	99.7930	99.9138	99.8655
	SVMs	N/A	N/A	N/A	N/A	N/A	N/A

The best performances are marked by the boldface ones.

For data set banknote, the minimal testing accuracy, the maximal testing accuracy, and the mean of testing accuracies by IHBGrC are greater than or equal to those of SVMs. For data set wilt, the testing accuracies of IHBGrC are better than those of SVMs, such as the minimal, maximal, and mean testing accuracies. For data set iris, the mean of IHBGrC is less than SVMs.

The experiments reveal that IHBGrC algorithm has the following features:

Convergence in a finite number of steps.

Perfect classification of the training data.

No overlap between hyperboxes with distinct class labels.

Once the value of ξ is set, if there are the same training patterns, the hyperboxes generated are always the same.

No dependency of class presentation order.

IHBGrC can be applied to classification problems of n classes and N attributes.

IHBGrC is built on the meet operation between two hyperboxes with different class labels and achieves classification accuracies comparable to those of HBGrC, which is a bottom-up granular computing classification algorithm designed with the join operation between two hyperboxes with the same class label. Two main drawbacks of the proposed algorithm are (1) the number of output hyperbox granules grows exponentially as the number of describing dimensions increases and (2) the testing accuracy for classification problems with numerical attributes is better than the testing accuracy for classification problems with symbolic attributes. Symbolic attributes must be transformed into numerical attributes before IHBGrC can be performed; for example, symbol a is transformed into 1 and symbol b is transformed into 2, and this affects the classification accuracy.

Regarding computational complexity, IHBGrC and HBGrC obtain the hyperbox granules through only one scan of the training set, so their time complexities are lower than that of SVMs. The aforementioned property (Theorem 3.1) of the distance between two hyperbox granules is mainly used to judge the inclusion relation between two hyperbox granules and to compute the distance between two hyperbox granules, especially the distance between an atomic hyperbox granule and a hyperbox granule during the testing process. The non-symmetry of distance formula (3) results in a higher storage requirement than a symmetric distance: for a hyperbox granule set including n hyperbox granules, the symmetric distances between any two hyperbox granules need n(n + 1)/2 storage units, whereas the distances by formula (3) need n² storage units.

Conclusions

A top-down hyperbox granular computing classification algorithm based on isolation is proposed in this paper. Firstly, a granule was represented as a hyperbox with a beginning point and an end point. Secondly, the meet operation was used to isolate the ith class data from the other data. Thirdly, the novel distance between two hyperbox granules was used to measure the inclusion relation between the two hyperbox granules. Finally, we evaluated the effectiveness of IHBGrC using benchmark data sets. On these data sets, IHBGrC trained faster than HBGrC and SVMs while achieving comparable testing accuracy. Two main drawbacks of the proposed algorithm are (1) the number of hyperbox granules grows exponentially as the number of describing dimensions increases and (2) the testing accuracy for classification problems with numerical attributes is better than the testing accuracy for classification problems with symbolic attributes.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the Natural Science Foundation of China (Grant no. 61170202, 61402393) and Natural Science Foundation of Henan (nos. 132300410421, 132300410422).

References

1. Zadeh. Advances in fuzzy set theory and applications, Amsterdam: North Holland Publishing, 1979.

2. Zadeh. Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets Syst 1997; 19: 111–127.

3. Jamshidi and Kaburlasos. gsaINknn: A GSA optimized, lattice computing knn classifier. Eng Appl Artific Intellig 2014; 35: 277–285.

4. Kaburlasos and Pachidis. A lattice-computing ensemble for reasoning based on formal fusion of disparate data types, and an industrial dispensing application. Informat Fusion 2014; 16: 68–83.

5. Papadakis, Kaburlasos and Papakostas. Two fuzzy lattice reasoning (FLR) classifiers and their application for human facial expression recognition. Mult Valued Logic Soft Comput 2014; 22: 561–579.

6. Kaburlasos and Kehagias. Fuzzy inference system (FIS) extensions based on the lattice theory. IEEE Trans Fuzzy Syst 2014; 22: 531–546.

7. Romani, Pinkoviezky and Rubin. Scaling laws of associative memory retrieval. Neural Computat 2013; 25: 2523–2544.

8. Sossa and Guevara. Efficient training for dendrite morphological neural networks. Neurocomputing 2014; 131: 132–142.

9. Deza. Dictionary of distances, Amsterdam: Elsevier, 2006.

10. Tan, Ali NHM and Lai C-H. Parallel block interface domain decomposition methods for the 2D convection-diffusion equation. Int J Comput Mathemat 2012; 89: 1704–1723.

11. Liu, Lai C-H and Zhou S-D. Two-level time-domain decomposition based distributed method for numerical solutions of pharmacokinetic models. Comput Biol Med 2011; 41: 221–227.

12. Pedrycz and Miao. Neighborhood rough sets based multi-label classification for automatic image annotation. Int J Approximat Reason 2013; 54: 1373–1387.

13. Zhong, Pedrycz and Wang. Granular data imputation: a framework of granular computing. Appl Soft Comput 2016; 46: 307–316.

14. Liu and Pedrycz. Covering-based multi-granulation fuzzy rough sets. J Intell Fuzzy Syst 2016; 30: 303–318.

15. Zhao, Han and Pedrycz. Granular model of long-term prediction for energy system in steel industry. IEEE Trans Cybernet 2016; 46: 388–400.

16. Yao, Zhang and Miao. Set-theoretic approaches to granular computing. Fundamenta Informaticae 2012; 115: 247–264.

17. Yao. A triarchic theory of granular computing. Granul Comput 2016; 1: 145–157.

18. Kaburlasos and Petridis. Fuzzy lattice neurocomputing (FLN) models. Neural Networks 2000; 13: 1145–1170.

19. Zhu, Chen and Xing. Bayesian inference with posterior regularization and applications to infinite latent SVMs. J Mach Learn Res 2014; 15: 1799–1847.

20. Ripley. Pattern recognition and neural networks, Cambridge: Cambridge University Press, 1996.

21. Ritter and Urcid. Lattice algebra approach to single-neuron computation. IEEE Trans Neural Networks 2003; 14: 282–295.

22. Chang and Yeung. Robust path-based spectral clustering. Pattern Recognit 2008; 41: 191–203.

23. http://archive.ics.uci.edu/ml/datasets.html.

24. Johnson B, Tateishi R and Hoan N. A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees. International Journal of Remote Sensing 2013; 34: 6969–6982.

Isolation-based hyperbox granular classification computing
