Abstract
In the device-to-device communication network, there is the interference problem when device-to-device users share the same spectrum with cellular users, a distributed beamforming scheme based on non-cooperative game is proposed to maximize weighted sum rate under the rated transmit power and the users’ quality of service. Since the optimization problem is non-concave, we first obtain the solution of the Karush-Kuhn-Tucher (KKT) condition for the downlink beamforming problem of each sender by dual decomposition. Second, a distributed beamforming algorithm based on the non-cooperative game is proposed, which can quickly converge to the Nash equilibrium point with a lower information exchange overhead. Finally, the simulation results show that the proposed algorithm has better performance in terms of system sum rate and fairness than the existing algorithms.
Introduction
Device-to-device (D2D) communication is a communication method coexisting with a cellular network. When a user equipment is close, the D2D communication can directly communicate through a D2D connection under the control of a base station. Since the advantages that D2D communication increases the throughput of an entire cell, decreases the energy consumption, improves the instantaneous data rate, it has recently received more and more attention.1–3 Attractive applications for D2D communications include streaming video, online games, multimedia downloads, peer-to-peer file sharing, and so on. If some users download data through the base station, other users receive the data through D2D communication, and the load of the cellular base station can be greatly reduced, thereby saving a lot of resources compared to the traditional cellular base station communication. Due to the fact that device-to-device users (DUEs) reuse resources of cellular users (CUEs), this multiplexing method can also cause severe intra-frequency interference. Therefore, overcoming the interference between DUEs and CUEs is a major challenge. 4
On one hand, in order to utilize the advantages of D2D communication without causing too much interference to CUEs, most of the literature focuses on the proposed resource allocation and interference suppression strategies.5–7 On the other hand, there are also some studies on how to reduce the strong interference from CUEs to DUEs.8–13 In Min et al.,8–11 considering uplink resources of multiplexed cellular connections for communication. Min et al. 8 aimed at maximizing the number of D2D access pairs and proposed the best interference-aware channel allocation strategy using Hungarian algorithm. Yu et al. 9 proposed a transmission power control scheme for CUEs. In Tao et al., 10 a cognitive resource allocation scheme was proposed based on greedy algorithm and serial interference cancelation algorithm. In order to maximize the network utility function of D2D communication, Zhou et al. 11 proposed an iterative power and optimized interference cancelation algorithm. Game theory exploited for mitigating interference between DUEs and CUEs is described in Xu et al., 12 The paper uses a sequential second price auction to optimize the overall sum rate of the system. A graph theory-based scheme adopting interference-aware graph-based resource allocation is proposed in Zhang et al., 13 The objective of this article is to allocate radio resources to the DUEs and the CUEs in such a manner that the system sum rate is maximized. The interference relationships among the DUEs and cellular communication are formulated as an interference-aware graph.
In Chen et al., 14 the proposal tries to minimize signaling and computational overhead by utilizing a distributed resource allocation scheme. Consequently, the evolved node B (eNB) allocates resource blocks (RBs) to the DUEs in a centralized manner with a slow timescale while the DUEs decide on their transmission powers and modulation and coding scheme in a distributive manner with a fast timescale. The D2D links can use the same resources if the probability of interference among them is lower than a specific threshold. The proposal in Dong et al. 15 tries to further increase spatial reuse for the DUEs. The objective is maximally utilizing the same resources by the DUEs that do not interfere with each other (i.e. DUEs that are sufficiently spatially distant). In addition, Bagheri and Katz 16 and Pratas and Popovski 17 discussed the problem of downlink channel resources for D2D communication multiplexing cellular communication in single-antenna base station system. Bagheri and Katz 16 proposed a downlink resource allocation scheme for D2D communication to improve throughput and spectrum utilization for the entire system. Pratas and Popovski 17 considered network-assisted D2D communication systems that allow fixed-rate connections to reuse cellular downlink resources.
However, the above research mainly focuses on the single-antenna base station system. The multi-antenna base station and terminal in the next generation mobile communication system will be regarded as the basic system settings, which encourages people to study the multi-antenna scheme in D2D communication. Once D2D communication is equipped with multiple antennas, some conventional techniques, such as beamforming, precoding, interference alignment, can be used to eliminate interference.18–22 Jänis et al. 18 utilized MIMO schemes particularly heuristic precoders at cellular downlink (eNB) that avoid generating cross-tier interference from an eNB to a D2D receiver underlaying the same resources by aligning the transmitted signal from the NB to the null space of the NB-D2D interference channel. A time division duplexing (TDD) single cell with multi-user multiple-input multiple-output (MIMO) architecture is considered in Li et al., in which there are multiple CUEs and multiple DUEs and the base station (BS) is equipped with multi-antennas, the authors formulate an optimization problem to maximize the overall rate of the CUEs and D2D pairs, which is non-deterministic polynomial (NP)-hard and non-convex. Two steps are taken to solve the optimization problem heuristically. 19 In Spencer et al., 20 an effective zero forcing (ZF) algorithm was proposed to suppress each user’s signal to null space in the interfering channel to suppress multi-user interference. In Sadek et al., 21 signal to leakage and interference ratio (SLNR) precoding scheme was proposed, which can improve the performance of the ZF algorithm, especially in the case of low signal-to-noise ratio. Ni et al. 22 studied the beamforming and interference cancelation strategies in multi-antenna base station D2D communication system, but it only considered the case of only one D2D pair and one CUE in the system.
Most of the existing D2D communication networks focus on beamforming of CUEs at the base station and few consider the autonomous beamforming between D2D user pairs. In Tang et al., 23 they propose several new algorithms for cooperative MIMO precoder pair selection for D2D DL to mitigate intra-cell cross-tier interference(eNB-DUE and DUE-CUE) where each CUE shares resources with only one D2D link. The precoding is used on both eNB and D2D transmitter. One class of precoding schemes decouples the precoder pair selection through maximizing SLNR by considering signal and interference strengths from the perspective of the transmitter. The second scheme imposes a restriction on signal to interference plus noise ratio (SINR) at the other co-channel receiver. Recently, Zhu et al. 24 considered the design of transponders in multi-antenna systems and optimized the beamforming matrix of base stations and D2D transmitters with the mean square error and system capacity as the optimization objective. However, only one cell User and one D2D user pair in the system. Wang et al. 25 proposed an iterative second-order cone programming (SOCP) beamforming algorithm by optimizing the weighted sum rate of the system. Furthermore, Zhong et al. 26 studied the resource allocation and precoding scheme in heterogeneous networks of D2D communication, but ignored the fairness of DUEs in resource reuse.
Based on the above analysis and existing problems, this article considers that the base station and the D2D sender have multiple antennas. In order to solve the problem of system sum rate and fairness of D2D communication in the case of multi-cellular users and multiple D2D users, a distributed adaptive pricing beamforming (DAPB) algorithm based on price strategy for non-cooperative game is proposed. The main contributions of this article are listed as follows:
Most of the existing D2D communication networks focus on beamforming of CUEs at the base station, while few consider the autonomous beamforming between D2D user pairs . Assuming that DUEs can reuse channel resources of multiple cellular communications to autonomous beamforming between D2D user pairs. The problem of maximizing the weighted sum rate of the entire network is established under the constraints of the transmission power and the user’s SINR. However, this is a non-convex problem.
We consider the beam matrix sender (including the D2D sender and the sender of the base station) as a game participant and formulate an effective distributed price scheme to optimize the weighted sum rate of the system. Based on the dual decomposition method, the Karush-Kuhn-Tucher (KKT) condition of the downlink beamforming problem at each transmitting end is solved and the convergence of the proposed algorithm is proved. The proposed algorithm can quickly converge to the Nash equilibrium point with low information exchange overhead. It has the characteristics of complete cooperation game performance while taking into account the low information interaction overhead of non-cooperative game. Compared with the existing typical algorithms, proposed algorithm has a great improvement in terms of the weighted sum rate and fairness of the system.
The remainder of this article is organized as follows. The system model and problem formulation are described in details in section “System model and problem formulation.” Then, the distributed beamforming algorithm for non-cooperative games is derived in section “Distributed beamforming algorithm for non-cooperative games.” The update strategy for each transmit end beam matrix is discussed in section “Update strategy for each transmit end beam matrix.” Complexity and overhead of the proposed algorithm is analyzed in section “Algorithm analysis.” Finally, simulations are shown in section “Simulations and analyses” and conclusions are drawn in section “Conclusion.” A summary of major notations used in this article is shown in Table 1.
Notation table.
D2D: device-to-device; SINR: signal to interference plus noise ratio.
System model and problem formulation
As shown in Figure 1, we consider single-cell orthogonal frequency division multiplexing (OFDM) downlink multi-user cellular heterogeneous network with one BS, K CUEs (CUEi, i = 1, 2,…, K), D D2D user pairs (D2DTxj, D2DRxj, j = 1, 2,…, D). We assume that the base station and the D2D user have T antennas for the transmitting end (D2DTx), the single antenna for the CUE and the D2D receiving end (D2DRx). There are N subcarrier channels in the network, each channel can only be allocated to one CUE (the number of carrier channels and the number of CUEs are equal), and each D2D user pair can reuse all the carrier channels.

System model.
Thus, the
where
Furthermore, the spectral efficiency of CUE
Thus, we can get the signal to interference plus noise ratio of the first D2D user to the receiver as follows
where
The spectral efficiency of the second D2D user pair is expressed as
We consider that the goal of linear beamforming is to maximize the weighted sum rate of the system
subject to
In formula (6a), the weighted factor
Define the variable
Distributed beamforming algorithm for non-cooperative games
Game theory provides a mathematical tool to solve the problem of resource competition and cooperation.27–32 In general, cooperative games require frequent exchange of information between game participants, which results in huge information overhead and poor scalability in large networks; non-cooperative game is an effective tool to formulate a distributed algorithm, but the non-cooperative game’s Nash equilibrium point is generally worse than the optimization center algorithm. On the other hand, price strategy is another effective way to overcome the inefficiency of non-cooperative game. Due to the distributed nature of DUEs and the characteristics of D2D communication, a distributed resource allocation strategy is preferable. Therefore, the following discussion is based on the pricing of non-cooperative game distributed beamforming.
Pricing program
We set up the following model for the price-based non-cooperative beamforming game
The elements of the game model are as follows:
Participants in the game:
Strategy collection:
Utility function collection:
Among them,
Reasonable pricing factors are very important to price-based algorithms. The effective pricing schemes need to consider the cost of resources consumed by game participants in meeting their own resource needs. Inspired by the literature,
33
We introduce the user’s interference price
Since the
where
Therefore, the total payment fee of the
Based on the above analysis, in our non-cooperative beamforming game, each sending device needs to solve the following optimization problems
Observe the above equations (10)–(14) taking into account the differences between channels and Quality of Service (QoS), we set different weighted factor
We note that the objective function is still non-convex relative to the beam matrix
Distributed adaptive pricing beamforming algorithm
The main idea of the distributed algorithm is that each sender can automatically design its own beam matrix until convergence. Many update algorithms can be used to solve these problems, such as simultaneous updates, sequential updates, completely asynchronous updates. We use the sequential update scheme in Algorithm 1, which we call distributed adaptive pricing beamforming.
Distributed adaptive price beamforming (DAPB) algorithm.
Note the sequential update algorithm, only when
Algorithm convergence proof
Theorem 1: Algorithm 1 converges to Nash equilibrium
It is proved that when the transmitter updates the beam matrix, the weighted sum rate is non-decreasing,
So
where
Formula (16) is superimposed on all carrier channels to obtain the following formula
Assuming that given the current beamforming matrix
where
Substituting equations (9)–(13) into equation (18), we get equations (19) and (20)
Let
where
At the same time, from equation (9), we know that
From equation (14), we know that
where
Update strategy for each transmit end beam matrix
Dual decomposition
Here, we discuss Algorithm 1 in which each sender updates its beamforming strategy. Dual decomposition is a good way to decompose coupled constraints and is often used to solve distributed optimization problems.
First, we set the scalar variable
Note that optimization problem (27) has only one coupling constraint
where
We define the dual problem as
The objective function is
Since the dual function
Decoupling sub-problems
First, we want to calculate
The Lagrange function of sub-problem
where
Then we get KKT conditions as follows
where
Let
For a given
Theorem 2
For a given
where
The proof is as follows: there are two cases of KKT conditions in equation (31):
Case 1:
Since
From equations (35) and (42), we can find out
Case 2:
In this case, if equation (35) is to be established, one of the following conditions must be met:
If condition 1 holds,
Then, when
We can see that equation (44) is the form of equation (43) at
where
Substituting equation (45) into equation (35), we get equation (46)
Since
That is
Therefore, we can solve
Each sender’s beamforming algorithm
In Algorithm 2, each beam sender can obtain the KKT solution to Problem (14). From Theorem 1, we know that Algorithm 1 converges to a Nash equilibrium point. Furthermore, the eighth line in Algorithm 1 ensures that the beam update of each sender is not lower than the total weighted sum rate of the system, so the value of the system weighted sum of the Nash equilibrium points will not be lower than the value of the original problem (7), the weighted sum rate of which corresponding to the point of KKT solution.
Beam update algorithm for each sender.
Algorithm analysis
Algorithm complexity analysis
We use
Note that the problem of equation (32) is divided into two sub-problems: (1) finding the optimal beam set (including energy and direction) and (2) finding the best duality factor for the dual problem. In this article, the Lagrange dual method and the sub-gradient method are combined to solve the problem. Compared with the multi-dimensional search algorithm, the computational complexity of this algorithm is much smaller, and the optimization effect is much better.
Overhead analysis
In order to implement Algorithm 1, each sender
Channel matrix
Interfere with the price
In our assumed TDD system, the
When the
As each transmit end
For the sake of comparison, we analyze the traditional centralized algorithm, which generally requires a central processing unit (CPU) in the network center to collect all the matrices in the network. Global optimization has been achieved. Therefore,
Simulations and analyses
The performance of the proposed algorithm (DAPB algorithm) is verified by simulation. We choose24–26 to simulate the influence of different parameters on system weighted sum rate and system fairness, respectively. In order to be more comprehensive, the iterations of the DAPB algorithm using classical ZF, MSLNR (minimum signal leakage and noise ratio), and CM (channel matching) criteria are simulated.
In this section, we assume that the radius of the cell is 500 m. Users in the cell are randomly distributed. The distance between the CUE and the base station is within 500 m. The D2D user can form a D2D pair within 50 m. The rest of the simulation parameters are shown in the following table.
The well-known Jain’s et al.
35
fairness criterion is adopted to measure the fairness of the system, which is used to measure whether the D2D receiver and the CUE receiver fair to share system resources.
Effect of different parameters on system weighted sum rate
In this section, the effect of different parameters on system weighted sum rate is simulated. In Figure 2, the number of cell users is 4 and the number of D2D user is 8. In Figure 3, the number of cell users is 4 and the number of DUEs changes from 1 to 6. In Figures 4 and 5, the number of cell users is 8 and the number of D2D user is 16. The rest of the simulation parameters are shown in Table 2.

System weighted sum rate changes with the number of iterations.

System weighted sum rate changes with the number of D2D changes.

System weighted sum rate with the maximum transmit power of DUE.

System weighted sum rate with the
Main simulation parameters.
DUE: device-to-device user; SINR: signal to interference plus noise ratio; D2D: device-to-device.
We first analyze the iteration number and convergence of the DAPB algorithm. In Figure 2, the convergence of the system and the rate is studied. It can be seen from the figure that as the iteration progresses, the system sum rate increases monotonically and finally converges. And the algorithm can reach convergence after a few iterations. Our algorithm has better performance than traditional MSLNR algorithm, traditional ZF algorithm, and traditional CM algorithm. At the same time, we find that the DAPB algorithm has the same convergence sum rate under different initial beam conditions (step 1 in Algorithm 1). Using MSLNR to initialize Algorithm 1 accelerates the convergence of the algorithm compared to the ZF and CM beam procedures.
The computational time of different algorithms is given in Table 3, where N denotes the number of subcarrier channels and D denotes the number of DUEs. Our simulation configuration is Intel(R) Core(TM) i7-6700 CPU at 3.40 GHz computer with 16 GB of memory size. As the number of DUEs and subcarrier channels increases, the time for various algorithms to converge will increase. As expected, the centralized algorithm requires significantly more time than the distributed algorithms, especially in dense networks. Hence, the centralized algorithm has the highest computational complexity. Consistent with the above analysis results, it is better to use MSLNR algorithm to initialize Algorithm 1. As stated in the analysis in “Algorithm complexity analysis” section, the multi-dimensional search algorithm requires much more time than our proposed DAPB algorithm, and it becomes prohibitively high in the case N = 16, D = 24. This simulation time is the time required for the entire DAPB algorithm to converge, not the time required for iteration. In the simulation, we set
Average computational time (s) for various methods.
DAPB: distributed adaptive price beamforming; MSLNR: minimum signal leakage and noise ratio; ZF: zero forcing; CM: channel matching.
Average computational time (s) for DAPB-MSLNR.
DAPB: distributed adaptive price beamforming; MSLNR: minimum signal leakage and noise ratio.
As can be seen from Figure 3, the weighted sum rate of the system increases as the number of D2D user pairs and the number of multiplexed links increases. When the number of reusable channel links is constant, the system weighted sum rate will increase as the number of D2D pairs increases. However, an exception occurs when the number of channel links is 2 and 1. This is because when the channel resources are multiplexed, the competition between the D2Ds is excessively large, resulting in a sharp increase in co-channel interference. When the number of DUEs is fixed, as the number of reusable links increases, the system’s weighted sum rate also increases.
As shown in Figure 4, the figure shows the variation of the weighted sum rate of the system with the maximum transmit power of D2D. It can be seen from the figure that under the condition of a certain number of user antennas (T), the weighted sum rate of the system increases first with the increase of the maximum transmit power of D2D, then reach steady state. At the same time, we can also see that with a certain maximum D2D transmit power, increasing the number of antennas will increase the weight sum rate of the system.
As shown in Figure 5, the weighted sum rate of the system decreases with the increase of
As shown in Figure 6, the system weighted sum rate for various algorithms increases as the average signal-to-noise ratio of the system channel increases, eventually reaching a steady value. And the proposed algorithm has better performance in terms of system weighted sum rate than other existing algorithms. When the number of beam transmitters is relatively large, the system is limited in interference. The proposed price-gambling scheme can indirectly induce the beam transmitters to cooperate and maintain the non-cooperative beam-strategy characteristics, thus increasing the weighted sum rate of the system. In addition, the performance of the proposed algorithm is similar to that based on the fully cooperative beamforming algorithm. The reason is as follows: the optimal system weighted sum rate problem proposed in this article is a non-convex problem. The algorithm based on complete cooperation only solves the KKT condition, and the Nash equilibrium point performance of our proposed algorithm is not less than KKT conditions of the point.

System weighted sum rate with the channel average signal-to-noise ratio changes.
System fairness simulation
The system fairness is analyzed in this section. In Figure 7, the number of cell users changes from 3 to 10, and the number of DUEs is 8. In Figure 8, the number of cell users is 8, and the number of DUEs changes from 3 to 10. In Figure 9, the number of cell users is 4, and the number of DUEs is 8. The rest of the simulation parameters are shown in Table 2. As the analysis following formula (14) shows,

System fairness changes with the number of cellular subscribers.

System fairness changes with the number of D2D users.

System fairness with the average SNR of the channel changes.
Figures 7 and 8 show the change of system fairness with the number of CUEs and DUEs. It can be seen from the figure that with the increase of users, the fairness of various algorithms will be reduced. Proposed algorithm has good fairness in different numbers of users. This is because the algorithm in this article satisfies the QoS requirements of different users through an effective beam algorithm. The penalty factor set by the utility function can well ensure the fairness of users. The proposed algorithm is more suitable for large networks with large number of users. Besides, setting different weighted factors to different users which relative to the same weight of the weighted factor can further improve the fairness of the system.
Figure 9 reflects the system fairness with the average signal-to-noise ratio of the channel changes, we can see from the figure, as the channel average signal-to-noise ratio increases, the system fairness will first increase and then tend to steady state. Compared with other algorithms, this algorithm has better fairness under different channel signal-to-interference-plus-noise ratios. This is because the algorithm in this article satisfies the QoS requirements of different users through an effective beam algorithm. The penalty function set by the utility function can avoid some selfish behavior, which ensures the fairness of users further. At the same time, we can also see from the figure that setting different weighted factors for different priority users can further improve system fairness.
In Table 2, the simulation parameters are mainly based on the typical system (time division long term evolution [TD-LTE] system) and the comparative literature, and a setting scheme is selected, which is different from the actual application scenario. The simulation results here mainly illustrate the theoretical performance of the algorithm. This is an ideal performance index. It provides only a reference for the selection of the technical solution and system design, and it is not enough for practical applications. The actual application will leave a certain margin when selecting parameters, and there are several sets of parameter settings. In addition, a large number of on-site tests and tests are needed to modify and optimize the parameters. This will be the content that needs to be studied and implemented in the next step of the paper.
Let me discuss how realistic the set parameters and actual production applications in my paper are and then discuss the problems and new challenges that may arise due to differences in parameter settings and actual production and life:
The cell radius: We all know that the radius of macrocells is generally 1–25 km, and the radius of microcells is generally 30–300 m. Typically, the radius of long term evolution (LTE) cells in urban areas is generally 400–700 m. The radius of the cell in this article is chosen to be 500 m and actual. The LTE cell is in a reasonable range.
System bandwidth: In general, LTE systems theoretically support six types of bandwidth, this is 1.4M, 3M, 5M, 10M, 15M, and 20M. Therefore, our bandwidth setting 10M is reasonable.
User’s SINR threshold: The setting of this parameter is mainly to determine the user’s communication quality, which can be set according to the specific production and living needs.
We set the noise power spectral density to –174 dBm/Hz. It is a quantity that has nothing to do with the type of communication system. In a sense, it is derived from thermodynamics (so it is related to temperature) and is a well-known parameter in the industry.
Path loss model for the links between base station and users: The revised COST231-Hata city propagation model based on the communication between the base station and all users.
Path loss model for the links between CUEs and DUEs: The distance between D2D pairs is small. This chapter adopts the free space model.
Path loss model for the links between CUEs and DUEs: We use the loss model of Andrews et al. 36 for simulation.
The smaller we set the value, the greater the time required for system convergence and the longer the distributed adaptive pricing beamforming (DPBF) algorithm will take. If we assume that the number of antennas in our simulation is too large, the actual application may require high equipment, and thus, it is difficult to achieve.
Conclusion
This article presents a distributed beamforming algorithm based on price strategy for non-cooperative game. Considering the case where the base station and the D2D sender have multiple antennas, it is assumed that the D2D user can reuse the channel resources of multiple cellular communications. Under the relevant constraints, the problem of maximizing the weighted sum rate of the entire network is formulated. But this is a non-convex problem. Considering the beam matrix transmitters (including the D2D transmitters and base stations) as game participants, an efficient distributed price scheme is proposed to optimize the system’s weighted sum rate. At the same time, the convergence of the proposed algorithm is proved. In this article, KKT conditions are proposed to solve the downlink beamforming problem of each sender based on the dual decomposition method. The proposed algorithm can quickly converge to the Nash equilibrium point with low information exchange overhead. It has the characteristics of complete cooperation game performance while taking into account the low information interaction overhead of non-cooperative game. The next step is to consider the energy efficiency of the system as an optimization objective to discuss energy-efficient beamforming algorithms.
Footnotes
Handling Editor: Wenbing Zhao
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This article was supported by the National Nature Science Foundation of China under contract numbers 61271259, 61301123, 61471076, and 61601070, the Chongqing Basic and Frontier Research Project under contract number CSTC 2016jcyjA0455, the Research Project of Chongqing Education Commission under contract number KJ130536 and KJ1600411, Changjiang Scholars and Innovative Research Team in University (IRT1299), and the special fund of Chongqing Key Laboratory (CSTC2012zdsy006).
