Low-power mapping algorithm for three-dimensional network-on-chip based on diversity-controlled quantum-behaved particle swarm optimization

Abstract

Three-dimensional network on chip (3D NoC) is developed based on three-dimensional integrated circuit, system on chip and two-dimensional network on chip. The 3D NoC is mainly used to solve the problems such as communication bottleneck of highly integrated chips. Mapping of 3D NoC is a key problem in the research area of 3D NoC. The authors proposed a low-power mapping algorithm based on quantum-behaved particle swarm optimization. Simulation results show that the proposed algorithm can significantly reduce power consumption, but with large-scale application characteristic graph, the efficiency of power optimization cannot be improved very much. To solve this problem, a low-power mapping algorithm for 3D NoC based on diversity-controlled quantum-behaved particle swarm optimization is proposed in this paper. Simulation results show that for large-scale application characteristic graph, this algorithm is able to maintain a stable power optimization efficiency (4.08–8.04%) and converges much faster.

Keywords

Three-dimensional network on chip mapping algorithm quantum-behaved particle swarm optimization diversity-controlled low-power

Introduction

Three-dimensional network-on-chip (3D NoC) is an effective solution to solve the problem of complex interconnection for large-scale system on chip (SoC). Since the adoption of integrated circuit stacking technology, 3D NoC breaks the limitations on performance and size of two-dimensional NoC (2D NoC). Mapping of 3D NoC is a key issue in 3D NoC research field. The results of mapping will directly affect the power consumption of communication in system, delay and other properties.¹ In terms of the optimal solution search method, 3D NoC mapping algorithm can be divided into three categories, i.e. search algorithms, construction algorithms, and enumeration algorithm. Search algorithms are the most commonly used and can be divided into two types of heuristic search and systematic search.² Heuristic search algorithm based on particle swarm optimization (PSO) is a kind of very typical algorithm to solve 3D NoC mapping problem. Zhang Zhen in Guangdong University of Technology improved the PSO to solve the problem of multi-objective and discrete space in Chip Multiprocessors (CMP) on-chip network mapping algorithm.^3,4 Pradip Kumar Sahu et al. in Indian Institute of Technology Kharagpur presented a discrete particle swarm optimization (DPSO)-based strategy to map applications on both 2D and 3D mesh-connected NoC.⁵ Bhardwaj and Mane in BITS Pilani-Goa Campus of India proposed centralized 3D mapping (C3Map) using a new octahedral traversal technology and realized the H-C3Map⁶ which is formed from C3Map and attractive and repulsive PSO (ARPSO).⁷ We have applied quantum-behaved particle swarm optimization (QPSO) to 3D NoC mapping problem for the first time, which reduced power consumption of mapping. But good power-optimized efficiency can only be maintained when the scale of the application characteristic graph (APCG) is within a certain range. To solve this problem, in this paper, a low-power mapping algorithm for 3D NoC based on diversity-controlled quantum-behaved particle swarm optimization (DCQPSO) is proposed. Simulation results show that for large-scale APCG, this algorithm is able to maintain a stable power optimization efficiency (4.08–8.04%), and converges faster.

Basic questions of 3D NoC mapping

Basic concept of 3D NoC mapping

Mapping of 3D NoC is used to decide which IP in a given APCG should be represented by which node of topology architecture graph of 3D NoC to achieve the goal of minimizing power consumption or packet latency described as objective functions.^8,9 Mapping problem belongs to quadratic assignment problem and is a NP-hard problem.¹⁰ Figure 1 shows a mapping instance of NoC.¹¹

Figure 1.

A mapping instance of NoC.

Power consumption model

Mapping of 3D NoC is to find the optimal mapping relationship between IP cores and resource nodes of 3D NoC through an algorithm, so that one or more performance metrics such as lowest power consumption, shortest execution time of application, the most balanced network load and so on can be optimized. A major concern with 3D NoC mapping is power consumption among all these metrics.¹¹

Using the power consumption model given in Yang et al.,¹² power consumption of the process of node t_i sending 1 flit to node t_j can be formulated as:

E_{bit}^{t_{i}, t_{j}} = μ E_{Rbit} + μ_{H} E_{LHbit} + μ_{V} E_{LVbit}

(1)

In equation (1), μ represents the number of routers between t_i to t_j. $μ_{H}$ is the number of edges between t_i to t_j in horizontal direction and $μ_{V}$ is the number of edges between t_i to t_j in vertical direction. $E_{Rbit}$ represents the power consumption of a flit through a router. $E_{LHbit}$ represents the power consumption of a flit through circuit in horizontal direction and $E_{LVbit}$ represents the power consumption of a flit through circuit in vertical direction.

Mapping of 3D NoC is to find a mapping function under the conditions of a given APCG and TAG so that the following conditions in equations (2) and (3) are met and the total power consumption is lowest, i.e. the objective function in equation (4) is minimum ( $w_{i, j}$ represents the communication volume between vertex v_i and vertex v_j):

\forall v_{i} \in V, map (v_{i}) \in R

(2)

\forall v_{i} \neq v_{j}, map (v_{i}) \neq map (v_{j})

(3)

min (\sum_{v_{i}, v_{j} \in V} (w_{i, j} \times E_{bit}^{map (v_{i}), map (v_{j})}))

(4)

For a determined topology, $E_{Rbit}$ , $E_{LHbit}$ , and $E_{LVbit}$ are determined. According to Matsutani et al.,¹³ the power consumption on the circuit is proportional to the length of the circuit. So the length of the circuit between two communicating nodes is a key influence affecting the power consumption of 3D NoC. Power consumption of 3D NoC in vertical direction is much less than that in horizontal direction when the same amount of data is transferred with Through Silicon Vias (TSV) technology adopted.

Review of mapping algorithm of 3D NoC based on QPSO

Description of the problem

QPSO algorithm is proposed by Sun Jun from Jiangnan university of China.¹⁴ QPSO is a global convergence algorithm and has many advantages such as global convergence, less control parameters, faster convergence, and stronger ability of optimization. The authors had applied QPSO algorithm to the mapping problem of 3D NoC for the first time to optimize the mapping position of IP cores of APCG on the resource nodes of 3D NoC.

Realization of mapping algorithm of 3D NoC based on QPSO

In this part, we show the realization of mapping algorithm based on QPSO. The steps are shown in Algorithm 1.

Algorithm 1. Mapping algorithm of 3D NoC based on QPSO

Input: Particle swarm randomly generated.

Output: Optimum position $best$ of particle swarm.

Step 1: Parameters initialization. Define the number of particles as $MAX_NP$ and the maximum number of iterations as $MAX_GENERS$ . Randomly generate $MAX_NP$ initial particles.

Step 2: Calculate the average optimal position of the particle swarm according to equation (10).

Step 3: Calculate the fitness value of each particle according to equation (5). If the particle adapts better than $indibest$ , then the value is assigned to $indibest$ and update the best position of the individual particle.

Step 4: If the fitness of $indibest$ is better than $best$ , then update the value of $best$ .

Step 5: For every dimension of the particle, calculate a random position according to equation (8).

Step 6: Update the position of particles according to equation (7).

Step 7: Correct the obtained position and velocity information of particles according to the determined range of value of position and velocity so that they would not exceed the available space.

Step 8: Decide whether the current iteration number reaches the maximum number $MAX_GENERS$ . If it is reached, then output best, else jump to step 2 and iteration number adds 1.

Step 9: Output best and end the algorithm.

Study of mapping algorithm of 3D NoC based on DCQPSO

DCQPSO algorithm

The diversity of QPSO algorithm is relatively high at the beginning of searching since the initialization of particle swarm. In the subsequence searching process, the diversity of the population continues to decline, resulting in stronger local search capability and much weakened global search capability. In the early and middle stages of the search, reducing the diversity of particle swarm is necessary for the search efficiency. However, the diversity of particle swarm is very low at the late stage of the search because the particles are gathered in a relatively small area. In this time, the capability of global searching gets very weak and the possibility of large-scale searching is very small. In this case, if the global best position is the locally optimal solution or suboptimal solutions, the algorithm prematurity will occur.

In order to effectively avoid prematurity and improve the performance of QPSO algorithm, Sun proposed diversity-controlled QPSO (DCQPSO).¹⁴ In DCQPSO algorithm, the search in the algorithm is guided by a diversity measurement $diversity (S_{x})$ . Particle swarm gets into convergence state after initialization. In this process, the contraction–expansion coefficients α decreases linearly from 1.0 to 0.5 and this parameter control method is the same as QPSO algorithm. In the convergence process, once the diversity $diversity (S_{x})$ is below a predetermined limit $d_{low}$ , adjust the contraction–expansion coefficients α as a preset α₀ (α₀ > 1.782, the particle swarm enters a divergent state), so that the particle swarm moves into a divergent state, then diversity increases temporarily, until more than $d_{low}$ .

Proposal and realization of mapping algorithm of 3D NoC based on DCQPSO

Theory of mapping algorithm of 3D NoC based on DCQPSO

The fitness function of mapping algorithm based on DCQPSO used is shown in equation (5),

Cost = MAXFit - (d_{x} \cdot w_{x} + d_{y} \cdot w_{y} + d_{z})

(5)

In equation (5), $MAXFit$ is biggest fitness value set for in advance. d_x, d_y, and d_x, respectively, represent the distance between communication nodes in the x-axis, y-axis, and z-axis. w_x and w_y, respectively, represent traffic between nodes in the x-axis and y-axis. Power consumption of 3D NoC in vertical direction is much less than that in horizontal direction when the same amount of data is transferred due to TSV technology involved. So in the z-axis direction it is no longer multiplied by the weight. From equation (5), we can know that the fitness value is inverse proportional to the traffic of 3D NoC. The greater the traffic, the lower fitness value is. We can know from the power consumption model above that the greater the traffic, the higher the power consumption is. So the power consumption brought by the algorithm can be judged by the size of the fitness value, i.e. the greater the fitness value, the lower power consumption is. Diversity metric formula of the algorithm is shown as equation (6):

diversity (S_{x}) = \frac{1}{M \cdot | A |} \cdot \sum_{i = 1}^{M} \sqrt{\sum_{j = 1}^{DIM} (x_{i, j} (t) - \bar{x_{j} (t)}) 2}, \bar{x_{j} (t)} = \frac{1}{M} \sum_{i = 1}^{M} x_{i, j} (t)

(6)

In equation (6), $S_{x} = (x_{1}, x_{2}, \dots, x_{M})$ is the population composed of the current position of the particle. M is the size of particle swarm. $| A |$ is the length of the longest diagonal of the search space. $DIM$ is the number of dimensions of the optimization problem and also the number of IP cores of APCG. $x_{i, j} (t)$ is the current position of the ith particle at the jth dimension. $\bar{x_{j} (t)}$ is the average position of all particles at the jth dimension. The evolutionary formula of position of the algorithm is shown in equation (7).

x_{i, j} (t + 1) = p_{i, j} (t) - α \cdot | c_{j} (t) - x_{i, j} (t) | \cdot \ln (1 / μ)

(7)

In equation (7), $p_{i, j} (t)$ represents the position of a random node and the calculated formula is shown as equation (8). ϕ is a number that randomly generated between 0 and 1. The parameter ${indibest}_{i, j} (t)$ represents the best position of the particle itself and ${best}_{i, j} (t)$ represents the optimum position of particle swarm. α is an expansion shrinkage factor and the calculated formula is defined as equation (9). The parameter $count$ represents the current number of iterations and N represents the maximum number of iterations. $c_{j} (t)$ is the average optimal position of the particle swarm and the calculated formula is shown as equation (10). μ is a number that randomly generated between 0 and 1.

p_{i, j} (t) = ϕ \cdot {indibest}_{i, j} (t) + (1 - ϕ) \cdot {best}_{j} (t)

(8)

α = 1.0 - \frac{(1.0 - 0.5) \cdot count}{N}

(9)

c (t) = (c_{1} (t), \dots, c_{j} (t), \dots, c_{DIM} (t)) = \frac{1}{M} \sum_{i = 1}^{M} {indibest}_{i} (t) = (\frac{1}{M} \sum_{i = 1}^{M} {indibest}_{i, 1} (t), \dots, \frac{1}{M} \sum_{i = 1}^{M} {indibest}_{i, j} (t), \dots, \frac{1}{M} \sum_{i = 1}^{M} {indibest}_{i, DIM} (t))

(10)

Design and realization of mapping algorithm of 3D NoC based on DCQPSO

In this part, we show the realization of mapping algorithm based on DCQPSO which is proposed in this paper. The steps are shown in Algorithm 2.

Algorithm 2. Mapping algorithm of 3D NoC based on DCQPSO

Input: Particle swarm randomly generated.

Output: Optimum position $best$ of particle swarm.

Step 1: Parameters initialization. Define the number of particles as $MAX_NP$ and the maximum number of iterations as $MAX_GENERS$ . Randomly generate $MAX_NP$ initial particles.

Step 2: Calculate the average optimal position of the particle swarm according to equation (10).

Step 4: If the fitness of $indibest$ is better than $best$ , then update the value of $best$ .

Step 5: Calculate the diversity $diversity (S_{x})$ of the particle swarm according to equation (6).

Step 6: Calculate the contraction–expansion coefficients α according to equation (9).

Step 7: If the diversity $diversity (S_{x})$ is below a predetermined limit $d_{low}$ , adjust the contraction–expansion coefficients α as a preset α₀.

Step 8: For every dimension of the particle, calculate a random positon according to equation (8).

Step 9: Update the position of particles according to equation (7).

Step 10: Decide whether the current iteration number reaches the maximum number MAX_GENERS. If it is reached, then output best, else jump to step 2 and iteration number adds 1.

Step 11: Output best and end the algorithm.

Experiments and performance analysis

Simulation platform and parameter design

Simulation platform

In this paper, Access Noxim 0.2 developed by Kai-Yuan Jheng in National Taiwan University is adopted as our simulation tool.¹⁵ The Access Noxim is a co-simulation platform for 3D NoC system that couples the network model, power model, and thermal model. They integrate Noxim (a cycle-accurate SystemC NoC simulator) and HotSpot (HotSpot provides the architecture-level thermal model), and adopt the power model of Intel’s 80-core processor.

Choice of topology and routing algorithm

The 3D mesh architecture is formed by extending 2D mesh directly to 3D space¹⁶ and has a regular structure. The position of network nodes of 3D mesh is homogeneous and the wiring is simple. TSV technology is used in the vertical direction which can reduce the overall wiring length. In our experiment, 3D mesh is used in our simulations. As for routing algorithm, we adopt XYZ routing algorithm. In XYZ algorithm, data packets are passed first in the x-axis, and then in the y-axis, finally in the z-axis. The algorithm is easy to understand and be realized and it is the most commonly used routing algorithm.

Parameter initialization

We initialized the parameters of the algorithms and the simulation tool as follows. Population size is set as 200 and the number of iterations is set 100. Data packets are injected in the way of memory-less Poisson distribution. Packet injection rate is set as 0.02. The total power consumption of 5000 cycles is summarized. The size of data packet is between 2 flits to 10 flits. The size of cache of each channel of the router is 8 flits.

Experiment results and performance analysis

In order to compare our proposed mapping algorithm of 3D NoC based on DCQPSO with the original mapping algorithm of 3D NoC based on QPSO, we conduct the experiments of these two algorithms on the same simulation platform. To reflect the universality of the algorithms, we use both classic APCG and random APCG in our experiments. Classic APCG includes PIP (8-core), MWD (12-core), VOPD (16-core), and DVOPD (32-core). Random APCG is generated by using task builder TGFF¹⁷ and different APCGs have different numbers of IP cores. For each APCG, simulation experiments were conducted separately using mapping algorithms based on DCQPSO and QPSO. Then the communication mode files with the best solutions were generated. Finally, the communication mode files are fed into Access Noxim to start the simulation. In this paper, both convergence speed and power consumption of the two algorithms are compared.

Comparison on convergence speed of mapping algorithms based on DCQPSO and QPSO

Comparison on convergence speed of the two algorithms for classic APCG

For classic APCG, comparison on convergence speed in 100 iterations of mapping algorithms based on DCQPSO and QPSO is shown in Figure 2, where the abscissa represents the number of iterations, the ordinate represents fitness value and the solid circle represents the convergence point of the algorithm. Take Figure 2(b) as an example, for MWD, mapping algorithm based on DCQPSO converges to the optimal fitness value 2220 at the 35th iteration while mapping algorithm based on QPSO converges to the optimal fitness value 2189 at the 42th iteration. From Figure 2, we can observe that for any classic APCG, mapping algorithm based on DCQPSO can get optimal solutions within a few iterations and an obvious improvement in convergence speed can be observed compared with the mapping algorithm based on QPSO (about 16.67–66.67%). Also, except for the PIP, optimal solution of mapping algorithm based on DCQPSO fits better than the optimal solution of mapping algorithm based on QPSO. Because the number of IP cores of PIP is less, QPSO algorithm is not easy to fall into prematurity and we can get the same optimal solution with DCQPSO algorithm. It also can be seen that for APCG with large number of IP cores, the advantages of mapping algorithm based on DCQPSO are more obvious, i.e. it can get better solutions within a short period of time.

Comparison on convergence speed of the two algorithms for random APCG

Figure 2.

Comparison on convergence speed for different classic APCG. (a) Comparison on conversation speed for PIP, (b) MWD, (c) VOPD, and (d) DVOPD.

In this paper, random APCGs with 26-core, 51-core, 75-core, 101-core, 127-core, and 151-core are generated using TGFF. For different random APCGs, comparison on convergence speed in 100 iterations of mapping algorithms based on DCQPSO and QPSO is shown in Figure 3. Take Figure 3(e) as an example, for random APCG with 127 cores, mapping algorithm based on DCQPSO converges to the optimal fitness value 164,330 at the 49th iteration while mapping algorithm based on QPSO converges to the optimal fitness value 154,395 at the 77th iteration. From Figure 3, we can observe that for any random APCG, convergence speed of mapping algorithm based on DCQPSO is improved obviously related to the mapping algorithm based on QPSO (about 12.05–50.52%). The final solution of mapping algorithm based on DCQPSO is also better than that of mapping algorithm based on QPSO.

Figure 3.

Comparison on conversation speed for APCGs with different numbers of IP cores. (a) Comparison on conversation speed for 26-core APCG, (b) 51-core APCG, (c) 75-core APCG, (d) 101-core APCG, (e) 127-core APCG, and (f) 151-core APCG.

Comparison on power consumption of mapping algorithms based on DCQPSO and QPSO

In our simulation experiments, for each APCG, the two algorithms are separately repeated for 10 times and the best individual is used to generate communication mode files.

Comparison on power consumption of the two algorithms for classic APCG

For different classic APCGs, mapping algorithm based on DCQPSO and QPSO are repeated for 10 times, respectively, and get the value of power consumption from the simulation platform. The experiment results are analyzed and the average power consumption, maximum power consumption, and minimum power consumption of the two algorithms for the same APCG in 10 simulation trials are compared separately.

The comparison on the average power consumption of mapping algorithms based on DCQPSO and QPSO is shown in Figure 4. From Figure 4 we can observe for all classic APCGs, the power consumption of mapping algorithm based on DCQPSO is lower than that of mapping algorithm based on QPSO with maximum case as 5.29% (for DVOPD) and the minimum case as 0.36% (for PIP).

Figure 4.

Comparison on the average power consumption for classic APCG.

The comparison on the maximum power consumption of mapping algorithms based on DCQPSO and QPSO is shown in Figure 5. From Figure 5 we can see that for all APCGs, the power consumption of mapping algorithm based on DCQPSO is lower than that of mapping algorithm based on QPSO with the maximum case as 6.85% (for DVOPD) and the minimum case as 0.25% (for PIP).

Figure 5.

Comparison on the maximum power consumption for classic APCG.

The comparison on the minimum power consumption of mapping algorithms based on DCQPSO and QPSO is shown in Figure 6. From Figure 6 we can see that for all APCGs, the power consumption of mapping algorithm based on DCQPSO is lower than that of mapping algorithm based on QPSO with the maximum case as 3.89% (for VOPD) and the minimum case as 0.70% (for PIP).

Figure 6.

Comparison on the minimum power consumption for classic APCG.

In summary, it can be observed that power consumption is reduced in varying degrees with mapping algorithm based on DCQPSO compared to the original mapping algorithm based on QPSO. Related to mapping algorithm based on QPSO, the reduction ratio of power consumption of mapping algorithm based on DCQPSO is shown in Figure 7, where the abscissa represents different classic APCGs, the ordinate represents the ratio of reduction of power consumption of mapping algorithm based on DCQPSO compared to mapping algorithm based on QPSO. When the number of IP cores of APCG is small, QPSO algorithm rarely falls in prematurity and the advantages of DCQPSO algorithm is not very obvious. When the number of IP cores of APCG is large, QPSO algorithm tends to fall in prematurity and the ability of optimization of DCQPSO algorithm is stronger.

Comparison on power consumption of the two algorithms for random APCG

Figure 7.

The reduction ratio of power consumption of mapping algorithm based on DCQPSO for classic APCG.

In this paper, random APCGs with 26-core, 51-core, 75-core, 101-core, 127-core, and 151-core are generated using TGFF. For different random APCGs, mapping algorithm based on DCQPSO and QPSO are repeated for 10 times, respectively, and get the value of power consumption from the simulation platform. The experiment results are analyzed and the average power consumption, maximum power consumption, and minimum power consumption of the two algorithms for the same APCG in 10 simulation trials are compared separately. The comparison on the average power consumption of mapping algorithms based on DCQPSO and QPSO is shown in Figure 8. From Figure 8 we can see that for all random APCGs the power consumption of mapping algorithm based on DCQPSO is lower than that of mapping algorithm based on QPSO with the maximal decrease could reach 5.51% (for 127-core) and the minimum case as 4.26% (for 101-core).

Figure 8.

Comparison on the average power consumption for random APCG.

The comparison on the maximum power consumption of mapping algorithms based on DCQPSO and QPSO is shown in Figure 9. From Figure 9 we can see that for all random APCGs, the power consumption of mapping algorithm based on DCQPSO is lower than that of mapping algorithm based on QPSO with the maximum case as 5.57% (for 75-core) and the minimum case as 3.56% (for 26-core).

Figure 9.

Comparison on the maximum power consumption for random APCG.

The comparison on the minimum power consumption of mapping algorithms based on DCQPSO and QPSO is shown in Figure 10. From Figure 10 we can see that for all random APCGs the power consumption of mapping algorithm based on DCQPSO is lower than that of mapping algorithm based on QPSO with the maximum case as 8.04% (for 75-core) and the minimum case as 4.08% (for 151-core).

Figure 10.

Comparison on the minimum power consumption for random APCG.

In summary, it can be observed that for random APCG, power consumption is reduced in varying degrees and the performance is more stable with mapping algorithm based on DCQPSO compared to the original mapping algorithm based on QPSO. This is because once the mapping algorithm based on QPSO falls into prematurity it will lead to poor results. In the comparison of mapping algorithms based on QPSO and PSO, mapping algorithm based on QPSO is most suitable for APCG with about 60 IP cores. But for APCG with large-scale, the efficiency of power-optimization cannot be improved very much. For APCG with large number of IP cores (100-core to 150-core), optimal results of the proposed mapping algorithm based on DCQPSO are still relatively stable and power consumption can decrease by 4.08–8.04%. Related to mapping algorithm based on QPSO, the reduction ratio of power consumption of mapping algorithm based on DCQPSO is shown in Figure 11, where the abscissa represents different random APCGs, the ordinate represents the ratio of reduction of power consumption of mapping algorithm based on DCQPSO related to mapping algorithm based on QPSO.

Figure 11.

The reduction ratio of power consumption of mapping algorithm based on DCQPSO for random APCG.

Conclusions

In this paper, both mapping algorithms based on DCQPSO and QPSO are used to solve the problem of 3D NoC mapping. Simulation results show that mapping algorithm based on DCQPSO can maintain diversity from start to finish and is less prone to fall into prematurity. Convergence speed is significantly improved (12.05–66.67%) and power consumption of mapping is effectively reduced (4.08–8.04%). For APCG with different number of IP cores, the optimal efficiency is very stable. In future studies, we will continue to optimize the objective function and improve the efficiency of 3D NoC mapping in different ways. In addition, we will focus on the study of 3D NoC mapping algorithm for different topologies and routing, such as heterogeneous topologies and personalization routing, so that the proposed low-power mapping algorithm based DCQPSO can be more generally applied.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Liu Y. Research of NoC mapping problem based on particle swarm optimization algorithm. Dalian University of Technology, 2010.

Wang T. Low energy NoC mapping algorithm based on modified electromagnetism-like mechanism. Xidian University, 2014.

Zhang Z. Improved particle swarm optimization algorithm based mapping algorithm for 3D-mesh CMP. Guangdong University of Technology, 2012.

Yang W, Zhang Z and Liu YJ. Improved particle swarm optimization algorithm based mapping algorithm for 3D-mesh CMP. Application Research of Computers 2013; 30: 1345–1348.

Sahu PK, Shah T, Manna K, et al. Application mapping onto mesh-based network-on-chip using discrete particle swarm optimization. In: IEEE transactions on very large scale integration (VLSI) systems, vol. 22, No. 2, 2014, pp.300–312.

Bhardwaj K and Mane PS. C3Map and ARPSO based mapping algorithms for energy-efficient regular 3-D NoC architectures. In: Technical papers of 2014 international symposium on VLSI design, automation and test, 2014.

Riget

Vesterstrom

. A diversity-guided particle swarm optimizer the ARPSO. Dept. Comput. Sci., Univ. of Aarhus, Aarhus, Denmark, Tech. Rep 2002.

Jiawen

Wei

. A dynamic ant colony optimization algorithm for 3D NoC mapping. J Comp Aid Des Comp Graphics 2011; 23: 1614–1620.

Wang J, Li L, Pan H, et al. Latency-aware mapping for 3D NoC using rank-based multi-objective genetic algorithm. In: 2011 IEEE 9th international conference on ASIC, 25–28 October 2011, Xiamen, China, pp.413–416.

10.

Sahni

Gonzales

. P-complete approximation problems. JACM 1976; 23: 555–565.

11.

Xingsheng LV. Study on mapping algorithm of 3D NoC. Qufu Normal University, 2014.

12.

Yang

Zhang

Liu

. Improved particle swarm optimization algorithm based mapping algorithm for 3D-mesh CMP. Appl Res Comp 2013; 5: 1345–1348.

13.

Matsutani H, Koibuchi M and Amano H. Tightly-coupled multi-layer topologies for 3-D NoCs. In: ICPP, 10–14 September 2007, XiAn, China, p.75.

14.

Sun J. Study on quantum-behaved particle swarm optimization. Jiangnan University, 2009.

15.

Jheng

K-Y

Chao

C-H

Wang

H-Y

. Traffic-thermal mutual-coupling co-simulation platform for three-dimensional network-on-chip. VLSI Design Autom Test (VLSI-DAT) 2010; 54: 135–138.

16.

Chen

Ling

. Study on 3D NoC topology. Telecommun Sci 2009; 25: 39–44.

17.

Dick RP, Rhodes DL and Wolf W. TGFF: task graphs for free. In: Proceedings of the 6th international workshop on hardware/software codesign, IEEE Computer Society, 15–18 March 1998, Seattle, WA, USA, pp.97–101.