Massive MIMO signal detection is a key technology of next-generation wireless communication (such as 5G) [1], and detecting the transmitted signal of a massive MIMO system efficiently and accurately is of vital importance. Many algorithms are available for massive MIMO signal detection. Generally, they can be divided into linear detection algorithms and nonlinear detection algorithms according to their calculation methods [2]. Although the linear detection algorithm is less accurate than the nonlinear detection algorithm, it is still a practical signal detection method for massive MIMO systems in some cases due to its low complexity. The main difficulty in linear detection is the inversion of a large-scale matrix: when the scale of a massive MIMO system is very large, the complexity of the inversion is very high and the corresponding hardware is difficult to implement. Therefore, this chapter introduces several typical linear iterative algorithms for massive MIMO signal detection. These algorithms use iterations between vectors or matrices to avoid the direct inversion of large-scale matrices and thus reduce the complexity of linear detection. In the following sections, we introduce the Neumann series approximation (NSA) algorithm, the Chebyshev iteration algorithm, the Jacobi iteration algorithm and the conjugate gradient (CG) algorithm respectively. Optimization methods for the Chebyshev iteration algorithm, the Jacobi iteration algorithm and the CG algorithm are also introduced to obtain better linear detection algorithms. In addition, this chapter compares the complexity and accuracy of these algorithms with those of other massive MIMO signal detection algorithms.
2.1 Analysis of Linear Detection Algorithm
Massive MIMO signal detection algorithms are usually divided into linear detection algorithms and nonlinear detection algorithms. The nonlinear detection algorithm, as the name implies, adopts a nonlinear method to recover the transmitted signal s from the received signal y. Such algorithms usually have high accuracy, but their computational complexity is also high. For example, the maximum likelihood (ML) detection is a typical nonlinear detection algorithm [3]. In theory, ML detection is ideal for massive MIMO signal detection because of its high accuracy. However, in ML detection, the number of cycles required for computation depends largely on the modulation order q and the number of user antennas Nt; the total number of cycles is q^Nt. This result is disastrous because the total number of cycles grows enormously even if the modulation order or the number of user antennas increases only slightly. Therefore, although ML detection is an ideal detection method in theory, it is not applicable in practice, especially in massive MIMO signal detection. In general, the nonlinear detection algorithm is more accurate than the linear detection algorithm in massive MIMO signal detection, but with higher complexity. We will introduce the nonlinear detection algorithm in detail in Chap. 4.
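To see how explosive this growth is, consider as an illustration (a configuration chosen here for concreteness, not singled out by the text) 64-QAM modulation with Nt = 16 user antennas:

```latex
q^{N_{\mathrm{t}}} = 64^{16} = 2^{96} \approx 7.9 \times 10^{28}
```

A single symbol-vector decision would thus require an astronomically large exhaustive search.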
![$$ {\varvec{Hs}} = {\varvec{y}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq2.png)
![$$ {\varvec{y}} = {\varvec{Hs}} + {\varvec{n}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ1.png)
![$$ {\varvec{H}}^{\text{H}} {\varvec{y}} = {\varvec{H}}^{\text{H}} {\varvec{Hs}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ2.png)
![$$ {\hat{\varvec{s}}} = {\varvec{Wy}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ5.png)
![$$ {\hat{\varvec{s}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq3.png)
![$$ N_{\text{r}} \times N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq4.png)
![$$ {\varvec{y}} = \sum\limits_{i = 1}^{{N_{\text{t}} }} {{\varvec{h}}_{i} {\varvec{s}}_{i} + {\varvec{n}},} \,i = 1,2, \ldots ,N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ6.png)
![$$ {\varvec{H}} = \left( {{\varvec{h}}_{ 1} ,{\varvec{h}}_{ 2} , \ldots ,{\varvec{h}}_{{N_{\text{t}} }} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq5.png)
![$$ s_{i} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq6.png)
![$$ i = 1, 2, \ldots ,N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq7.png)
![$$ {\varvec{w}}_{{i , 1\times N_{\text{r}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq8.png)
![$$ {\varvec{w}}_{i} {\varvec{h}}_{j} = \left\{ {\begin{array}{*{20}c} {1,} & {i = j} \\ {0,} & {i \ne j} \\ \end{array} } \right. $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ7.png)
![$$ i = 1, 2, \ldots ,N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq9.png)
![$$ {\varvec{w}}_{i} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq10.png)
![$$ {\varvec{W}}_{{N_{\text{t}} \times N_{\text{r}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq11.png)
![$$ {\varvec{WH}} = {\varvec{I}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq12.png)
![$$ {\varvec{W}} = ({\varvec{H}}^{\text{H}} {\varvec{H}})^{ - 1} {\varvec{H}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ8.png)
![$$ {\hat{\varvec{s}}} = {\varvec{W}}({\varvec{Hs}} + {\varvec{n}}) = {\varvec{s}} + {\varvec{Wn}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ9.png)
Obviously, when the additive noise n is zero, ŝ = s is strictly satisfied. Because the ZF detection algorithm meets the condition in Eq. (2.7), it can eliminate the interference between the data sent by different transmitting antennas and can get relatively accurate detection results when the signal-to-noise ratio is relatively high.
Although the ZF detection algorithm has some deficiencies in accuracy, its derivation suggests an idea: can the influence of the noise n be incorporated into the matrix W, so that the transmitted signal s can still be solved by the same linear method? On this basis, the MMSE detection algorithm is proposed.
![$$ {\hat{\varvec{s}}} = {\varvec{Wy}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq15.png)
![$$ {\hat{\varvec{s}}} = {\varvec{W}}_{\text{MMSE}} = \arg \mathop {\hbox{min} }\limits_{{\varvec{W}}} E\left\| {{\varvec{s}} - {\varvec{Wy}}} \right\|^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ10.png)
![$$ \begin{aligned} {\varvec{W}}_{\text{MMSE}} & = \arg \mathop {\hbox{min} }\limits_{{\varvec{W}}} E\left\| {{\varvec{s}} - {\varvec{Wy}}} \right\|^{2} \\ & = \arg \mathop {\hbox{min} }\limits_{{\varvec{W}}} E\left\{ {({\varvec{s}} - {\varvec{Wy}})^{\text{H}} ({\varvec{s}} - {\varvec{Wy}})} \right\} \\ & = \arg \mathop {\hbox{min} }\limits_{{\varvec{W}}} E\left\{ {{\text{tr}}\left[ {({\varvec{s}} - {\varvec{Wy}})({\varvec{s}} - {\varvec{Wy}})^{\text{H}} } \right]} \right\} \\ & = \arg \mathop {\hbox{min} }\limits_{{\varvec{W}}} E\left\{ {{\text{tr}}\left[ {{\varvec{ss}}^{\text{H}} - {\varvec{sy}}^{\text{H}} {\varvec{W}}^{\text{H}} - {\varvec{Wys}}^{\text{H}} + {\varvec{Wyy}}^{\text{H}} {\varvec{W}}^{\text{H}} } \right]} \right\} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ11.png)
![$$ \frac{{\partial {\text{tr}}\left[ {{\varvec{ss}}^{\text{H}} - {\varvec{sy}}^{\text{H}} {\varvec{W}}^{\text{H}} - {\varvec{Wys}}^{\text{H}} + {\varvec{Wyy}}^{\text{H}} {\varvec{W}}^{\text{H}} } \right]}}{{\partial {\varvec{W}}}} = 0 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ12.png)
![$$ {\varvec{W}} = \left( {{\varvec{H}}^{\text{H}} {\varvec{H}} + \frac{{N_{0} }}{{E_{\text{s}} }}{\varvec{I}}_{{N_{\text{t}} }} } \right)^{ - 1} {\varvec{H}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ13.png)
![$$ {\hat{\varvec{s}}} = {\varvec{Wy}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq16.png)
![$$ {\varvec{W}}_{\text{ZF}} = {\varvec{G}}^{ - 1} {\varvec{H}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq17.png)
![$$ {\varvec{W}}_{\text{MMSE}} = \left( {{\varvec{G}} + \frac{{N_{0} }}{{E_{\text{s}} }}{\varvec{I}}_{{N_{\text{t}} }} } \right)^{ - 1} {\varvec{H}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq18.png)
Whether in the ZF detection algorithm or in the MMSE detection algorithm, a matrix inverse is involved. The scale of the channel matrix H is usually very large in a massive MIMO system, so the matrix inversion is very complex and usually difficult to realize in an actual signal detection circuit. To avoid the huge complexity of direct matrix inversion, many algorithms have been put forward to reduce the burden caused by large-scale matrix inversion. Among them, the typical ones are the NSA algorithm, the Chebyshev iteration algorithm, the Jacobi iteration algorithm, the CG algorithm and so on. These four common algorithms are introduced in detail below.
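For reference, here is a minimal NumPy sketch of the two direct linear detectors of Eqs. (2.8) and (2.13); the function name and arguments are illustrative, and `numpy.linalg.solve` stands in for the explicit inverses that the remainder of this chapter seeks to avoid:

```python
import numpy as np

def linear_detect(H, y, N0_over_Es):
    """Sketch of the ZF and MMSE detectors of Eqs. (2.8) and (2.13).

    H          : (Nr, Nt) complex channel matrix
    y          : (Nr,) received vector
    N0_over_Es : noise-to-signal power ratio N0/Es (assumed known)
    """
    G = H.conj().T @ H                        # Gram matrix G = H^H H
    y_mf = H.conj().T @ y                     # matched-filter output y^MF = H^H y
    s_zf = np.linalg.solve(G, y_mf)           # ZF:   (H^H H)^{-1} H^H y
    A = G + N0_over_Es * np.eye(H.shape[1])   # MMSE matrix A = G + (N0/Es) I
    s_mmse = np.linalg.solve(A, y_mf)         # MMSE: A^{-1} H^H y
    return s_zf, s_mmse
```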
2.2 Neumann Series Approximation Algorithm
In the massive MIMO system, the number of receiving antennas is usually much greater than the number of user antennas [5], that is, Nr is much larger than Nt. Since the elements in the channel matrix H are independent and identically distributed, with real and imaginary parts obeying the Gaussian distribution N(0, 1), both the Gram matrix G and the MMSE matrix are diagonally dominant. The Gram matrix G tends to a scalar matrix (a multiple of the identity matrix) as Nr tends to infinity [5]. Using this property, we can simplify the matrix inversion in the linear massive MIMO detection algorithm.
Because of the diagonal dominance of the MMSE matrix A, if we take the diagonal elements of A out and denote them as the matrix D, it is very easy to find the inverse of D. As mentioned above, A tends to D only as the number of receiving antennas Nr tends to infinity, which is not actually possible and in practice is often far from being approached. Thus, simply using D to approximate the matrix A when finding the inverse of A will lead to relatively large errors.
2.2.1 Algorithm Design
![$$ {\varvec{A}}^{ - 1} = \sum\limits_{n = 0}^{\infty } {({\varvec{X}}^{ - 1} ({\varvec{X}} - {\varvec{A}}))^{n} } {\varvec{X}}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ14.png)
![$$ \mathop {\lim }\limits_{n \to \infty } ({\varvec{I}} - {\varvec{X}}^{ - 1} {\varvec{A}})^{n} = {\mathbf{0}}_{{N_{{\mathbf{t}}} \times N_{{\mathbf{t}}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ15.png)
![$$ {\varvec{A}} = {\varvec{D}} + {\varvec{E}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq22.png)
![$$ {\varvec{A}}^{ - 1} = \sum\limits_{n = 0}^{\infty } {({\varvec{D}}^{ - 1} {\varvec{E}})^{n} } {\varvec{D}}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ16.png)
According to Eq. (2.15), if the condition that (I − D⁻¹A)^n tends to the zero matrix as n tends to infinity is satisfied, the expansion in Eq. (2.16) will definitely converge.
![$$ {\tilde{\varvec{A}}}_{K}^{ - 1} = \sum\limits_{n = 0}^{K - 1} {({\varvec{D}}^{ - 1} {\varvec{E}})^{n} } {\varvec{D}}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ17.png)
![$$ {\tilde{\varvec{W}}}_{K}^{ - 1} = {\tilde{\varvec{A}}}_{K}^{ - 1} {\varvec{H}}^{H} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq24.png)
![$$ {\tilde{\varvec{A}}}_{1}^{ - 1} = {\varvec{D}}{}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq25.png)
![$$ {\tilde{\varvec{W}}}_{1}^{ - 1} = {\varvec{D}}^{ - 1} {\varvec{H}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq26.png)
![$$ {\tilde{\varvec{A}}}_{2}^{ - 1} = {\varvec{D}}{}^{ - 1} + {\varvec{D}}{}^{ - 1}{\varvec{ED}}{}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq27.png)
![$$ O(N_{\text{t}}^{2} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq28.png)
![$$ {\tilde{\varvec{A}}}_{3}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq29.png)
![$$ {\tilde{\varvec{A}}}_{3}^{ - 1} = {\varvec{D}}{}^{ - 1} + {\varvec{D}}{}^{ - 1}{\varvec{ED}}{}^{ - 1} + {\varvec{D}}{}^{ - 1}{\varvec{ED}}{}^{ - 1}{\varvec{ED}}{}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ18.png)
The computational complexity of Eq. (2.18) is O(Nt³), which is comparable to the complexity of an exact matrix inversion, although the approximate operation of Eq. (2.18) still requires fewer operations. When K > 4, the complexity of the exact matrix inversion may be lower than that of the approximate algorithm.
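The truncated series of Eq. (2.17) is straightforward to express in code. The following NumPy sketch is illustrative (the function name and arguments are not from the text) and uses the sign convention (−D⁻¹E)ⁿ that the error analysis of Eq. (2.19) employs:

```python
import numpy as np

def nsa_inverse(A, K):
    """K-term truncated Neumann series of Eq. (2.17), approximating A^{-1}.

    A is the (diagonally dominant) MMSE matrix, split as A = D + E.
    The sign (-D^{-1}E)^n follows the error analysis of Eq. (2.19).
    """
    d = np.diag(A)                            # diagonal part D (stored as a vector)
    E = A - np.diag(d)
    D_inv = np.diag(1.0 / d)                  # inverting the diagonal D is trivial
    M = -D_inv @ E                            # series ratio -D^{-1}E
    term = np.eye(A.shape[0], dtype=A.dtype)  # (-D^{-1}E)^0
    approx = np.zeros_like(A)
    for _ in range(K):                        # sum_{n=0}^{K-1} (-D^{-1}E)^n D^{-1}
        approx = approx + term @ D_inv
        term = term @ M
    return approx
```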
2.2.2 Error Analysis
![$$ \begin{aligned} {\varvec{\varDelta}}_{K} & = {\tilde{\varvec{A}}}^{ - 1} - {\tilde{\varvec{A}}}_{K}^{ - 1} \\ & = \sum\limits_{n = K}^{\infty } {( - {\varvec{D}}^{ - 1} {\varvec{E}})^{n} } {\varvec{D}}^{ - 1} \\ & = ( - {\varvec{D}}^{ - 1} {\varvec{E}})^{K} \sum\limits_{n = 0}^{\infty } {( - {\varvec{D}}^{ - 1} {\varvec{E}})^{n} } {\varvec{D}}^{ - 1} \\ & = ( - {\varvec{D}}^{ - 1} {\varvec{E}})^{K} {\varvec{A}}^{ - 1} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ19.png)
![$$ {\hat{\varvec{s}}}_{K} = {\tilde{\varvec{A}}}_{K}^{ - 1} {\varvec{H}}^{\text{H}} {\varvec{y}} = {\varvec{A}}^{ - 1} {\varvec{y}}^{\text{MF}} -\varvec{\varDelta}_{K} {\varvec{y}}^{\text{MF}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ20.png)
![$$ \begin{aligned} \left\| {{\varvec{\varDelta}}_{K} {\varvec{y}}^{\text{MF}} } \right\|_{2} & = \left\| {( - {\varvec{D}}^{ - 1} {\varvec{E}})^{K} {\varvec{A}}^{ - 1} {\varvec{y}}^{\text{MF}} } \right\|_{2} \\ & \quad \le \left\| {( - {\varvec{D}}^{ - 1} {\varvec{E}})^{K} } \right\|_{F} \left\| {{\varvec{A}}^{ - 1} {\varvec{y}}^{\text{MF}} } \right\|_{2} \\ & \quad \le \left\| { - {\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{K} \left\| {{\varvec{A}}^{ - 1} {\varvec{y}}^{\text{MF}} } \right\|_{2} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ21.png)
![$$ \left\| { - {\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F} < 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ22.png)
Now it is necessary to prove that Eq. (2.16) converges when the scale of the massive MIMO system satisfies the condition that Nr is much greater than Nt, the elements in the channel matrix H are independent and identically distributed, and all of them obey the complex Gaussian distribution whose real and imaginary parts follow N(0, 1). More specifically, it is necessary to prove that the convergence condition of the Neumann series and the condition for a minimal error in Eq. (2.21) are related only to Nt and Nr. The theorem and its proof are given below.
Theorem 2.2.1
![$$ N_{\text{r}} > 4 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq31.png)
![$$ P\left\{ {\left\| { - {\mathbf{D}}^{ - 1} {\mathbf{E}}} \right\|_{F}^{K} < \alpha } \right\} \ge 1 - \frac{{(N_{\text{t}}^{2} - N_{\text{t}} )}}{{\alpha^{{\frac{2}{K}}} }}\sqrt {\frac{{2N_{\text{r}} (N_{\text{r}} + 1)}}{{(N_{\text{r}} - 1)(N_{\text{r}} - 2)(N_{\text{r}} - 3)(N_{\text{r}} - 4)}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ23.png)
Proof
To prove Theorem 2.2.1, we need to give three other lemmas and their proofs.
Lemma 2.2.1
![$$ x^{(k)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq32.png)
![$$ y^{(k)} \left( {k = 1,2, \cdots ,N_{\text{r}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq33.png)
![$$ E\left[ {\left| {\sum\limits_{k = 1}^{{N_{\text{r}} }} {x^{(k)} y^{(k)} } } \right|^{4} } \right] = 2N_{\text{r}} \left( {N_{\text{r}} + 1} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ24.png)
Proof
![$$ \begin{aligned} E\left[ {\left| {\sum\limits_{k = 1}^{{N_{\text{r}} }} {x^{(k)} y^{(k)} } } \right|^{4} } \right] & = E\left[ {\left( {\sum\limits_{k = 1}^{{N_{\text{r}} }} {x^{(k)} y^{(k)} } \sum\limits_{k = 1}^{{N_{\text{r}} }} {\left( {x^{(k)} y^{(k)} } \right)}^{*} } \right)^{2} } \right] \\ & = \left( {_{2}^{{N_{\text{r}} }} } \right)E\left[ {\left| {x^{(k)} } \right|^{2} \left| {y^{(k)} } \right|^{2} } \right] + N_{\text{r}} E\left[ {\left| {x^{(k)} } \right|^{4} \left| {y^{(k)} } \right|^{4} } \right] \\ & = 2N_{\text{r}} \left( {N_{\text{r}} - 1} \right) + 4N_{\text{r}} \\ & = 2N_{\text{r}}^{2} + 2N_{\text{r}} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ25.png)
The operation process in Eq. (2.25) can be described as follows. After the quadratic term is expanded, the terms with nonzero expectation are of two kinds: products involving two different indices k ≠ l, of which there are binom(Nr, 2) combinations, and fourth-power terms |x^(k)|⁴|y^(k)|⁴ on a single index, of which there are Nr. Substituting the corresponding moments of the Gaussian entries into these two kinds of terms and summing, we can get the conclusion of Lemma 2.2.1.
Lemma 2.2.2
![$$ N_{\text{r}} > 4 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq42.png)
![$$ x^{(k)} \left( {k = 1,2, \cdots ,N_{\text{r}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq43.png)
![$$ g = \sum\limits_{k = 1}^{{N_{\text{r}} }} {\left| {x^{(k)} } \right|^{2} } $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq44.png)
![$$ E\left[ {\left| {g^{ - 1} } \right|^{4} } \right] = \left( {\left( {N_{\text{r}} - 1} \right)\left( {N_{\text{r}} - 2} \right)\left( {N_{\text{r}} - 3} \right)\left( {N_{\text{r}} - 4} \right)} \right)^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ26.png)
Proof
![$$ g = \frac{1}{2}\sum\limits_{k = 1}^{{2N_{\text{r}} }} {\left| {s^{(k)} } \right|^{2} } $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ27.png)
![$$ s^{(k)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq45.png)
![$$ 2g^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq46.png)
![$$ \chi^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq47.png)
![$$ 2N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq48.png)
![$$ \chi^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq49.png)
![$$ 2N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq50.png)
![$$ 2N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq51.png)
![$$ \chi^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq52.png)
![$$ E\left( {\left| {2g^{ - 1} } \right|^{4} } \right) = \frac{16}{{\left( {N_{\text{r}} - 1} \right)\left( {N_{\text{r}} - 2} \right)\left( {N_{\text{r}} - 3} \right)\left( {N_{\text{r}} - 4} \right)}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ28.png)
Then the conclusion in Eq. (2.26) follows from Eq. (2.28).
Lemma 2.2.3
![$$ N_{\text{t}} > 4 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq53.png)
![$$ E\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{2} } \right] \le \left( {N_{\text{t}}^{2} - N_{\text{t}} } \right)\sqrt {\frac{{2N_{\text{r}} \left( {N_{\text{r}} + 1} \right)}}{{\left( {N_{\text{r}} - 1} \right)\left( {N_{\text{r}} - 2} \right)\left( {N_{\text{r}} - 3} \right)\left( {N_{\text{r}} - 4} \right)}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ29.png)
Proof
![$$ {\varvec{A}} = {\varvec{D}} + {\varvec{E}} = {\varvec{G}} + \frac{{N_{0} }}{{E_{\text{s}} }}{\varvec{I}}_{{N_{\text{t}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq54.png)
![$$ a^{(i,i)} = \left\{ { \, \begin{array}{*{20}l} {g^{(i,i)} = \sum\limits_{k = 1}^{{N_{\text{r}} }} {(h^{(k,i)} )^{*} h^{(k,j)} ,} } \hfill & {i \ne j} \hfill \\ {g^{(i,i)} + \frac{{N_{0} }}{{E_{\text{s}} }} = \sum\limits_{k = 1}^{{N_{\text{r}} }} {\left| {h^{(k,i)} } \right|^{2} + \frac{{N_{0} }}{{E_{\text{s}} }}} ,} \hfill & {i = j} \hfill \\ \end{array} } \right. $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ30.png)
![$$ g^{(i,j)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq55.png)
![$$ E\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{2} } \right] = E\left[ {\sum\limits_{i = 1}^{{i = N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{j = N_{\text{t}} }} {\left| {\frac{{g^{(i,j)} }}{{a^{(i,i)} }}} \right|^{2} } } } \right] \le \sum\limits_{i = 1}^{{i = N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{j = N_{\text{t}} }} {E\left| {\frac{{g^{(i,j)} }}{{g^{(i,i)} }}} \right|^{2} } } $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ31.png)
![$$ E\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{2} } \right] \le \sum\limits_{i = 1}^{{i = N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{j = N_{\text{t}} }} {\sqrt {E\left[ {\left| {g^{(i,j)} } \right|^{4} } \right]E\left[ {\left| {\left( {g^{(i,i)} } \right)^{ - 1} } \right|^{4} } \right]} } } $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ32.png)
Using Lemma 2.2.1 for the fourth moment in the numerator and Lemma 2.2.2 for that in the denominator, we get the following expression:
![$$ \begin{aligned} E\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{2} } \right] & \le \sum\limits_{i = 1}^{{i = N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{j = N_{\text{t}} }} {\sqrt {\frac{{2N_{\text{r}} \left( {N_{\text{r}} + 1} \right)}}{{\left( {N_{\text{r}} - 1} \right)\left( {N_{\text{r}} - 2} \right)\left( {N_{\text{r}} - 3} \right)\left( {N_{\text{r}} - 4} \right)}}} } } \\ & \quad = \left( {N_{\text{t}}^{2} - N_{\text{t}} } \right)\sqrt {\frac{{2N_{\text{r}} \left( {N_{\text{r}} + 1} \right)}}{{\left( {N_{\text{r}} - 1} \right)\left( {N_{\text{r}} - 2} \right)\left( {N_{\text{r}} - 3} \right)\left( {N_{\text{r}} - 4} \right)}}} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ33.png)
![$$ P\left\{ {\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{K} \ge \alpha } \right]} \right\} = P\left\{ {\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{K} \ge \alpha^{{\frac{2}{K}}} } \right]} \right\} \le \alpha^{{ - \frac{2}{K}}} E\left[ {\left\| {{\varvec{D}}^{ - 1} {\varvec{E}}} \right\|_{F}^{2} } \right] $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ34.png)
Combining the upper bound of E[‖D⁻¹E‖F²] in Lemma 2.2.3 with the Markov-type inequality in Eq. (2.34), we can get the conclusion of Theorem 2.2.1.
As we can see from Eq. (2.23), when Nr is much larger than Nt, the probability that condition (2.22) is satisfied is greater. This theorem shows that the Neumann series converges with a certain probability: the greater the ratio of Nr to Nt, the greater the probability of convergence. In addition, the theorem provides a condition on α for minimizing the residual estimation error: when α is less than 1, the greater the number of terms K selected, the greater the probability of convergence.
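The theorem's condition can also be checked numerically. The following sketch is an illustration, not part of the original text (the noise ratio, trial count and seed are arbitrary choices); it estimates E[‖D⁻¹E‖F] for i.i.d. channels whose real and imaginary parts are N(0, 1), a quantity that should shrink as Nr/Nt grows:

```python
import numpy as np

def neumann_norm(Nr, Nt, N0_over_Es=0.1, trials=100, seed=0):
    """Monte Carlo estimate of E[||D^{-1}E||_F] for i.i.d. Gaussian channels.

    Real and imaginary parts of H are N(0, 1) as in the text; N0_over_Es,
    trials and seed are illustrative. Values well below 1 indicate that the
    truncated series of Eq. (2.17) converges quickly.
    """
    rng = np.random.default_rng(seed)
    norms = []
    for _ in range(trials):
        H = rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))
        A = H.conj().T @ H + N0_over_Es * np.eye(Nt)
        d = np.diag(A)
        D_inv_E = (A - np.diag(d)) / d[:, None]   # (D^{-1}E)_{ij} = e_{ij} / a_{ii}
        norms.append(np.linalg.norm(D_inv_E, 'fro'))
    return float(np.mean(norms))

# e.g. neumann_norm(128, 16) should be markedly smaller than neumann_norm(32, 16)
```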
2.2.3 Complexity and Block Error Rate
The NSA reduces the computational complexity of the matrix inversion. We now discuss the advantages and limitations of the NSA from the two aspects of computational complexity and block error rate. Here, we only consider the cases where Nr is 64, 128 and 256 respectively.
Among the exact algorithms for matrix inversion, the Cholesky decomposition (CHD) algorithm [8] has lower complexity than other exact solutions, such as direct matrix inversion, QR decomposition, LU decomposition [3, 9], etc. Therefore, the CHD algorithm is selected as the reference against which the NSA is compared. The complexity of the CHD algorithm for matrix inversion is in the O(Nt³) range, while the complexity of the NSA is in the O(Nt) and O(Nt²) ranges when K = 1 and K = 2 respectively. When K ≥ 3, the complexity of the NSA mainly comes from the multiplications between large matrices, and it increases linearly with K. For example, when K = 3 there is one multiplication between large matrices in the algorithm, and when K = 4 there are two; that is, for K ≥ 3 the NSA needs to calculate K − 2 multiplications between large matrices. So when K ≥ 3, the complexity of the NSA is in the O(Nt³) range. It can be seen that when K is greater than 3, the NSA has no advantage in complexity compared with the CHD algorithm.
![$$ N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq66.png)
![$$ K \le 3 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq67.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig1_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig1_HTML.png)
The relationship between the number of user antennas and the number of real multiplications in the algorithm.
© [2018] IEEE. Reprinted, with permission, from Ref. [7]
![$$ {\text{SNR}} = N_{\text{r}} \frac{{E_{\text{s}} }}{{N_{0} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq69.png)
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq70.png)
![$$ N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq71.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig2a_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig2a_HTML.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig2b_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig2b_HTML.png)
Block error rate curves at a Nr = 64, b Nr = 128, c Nr = 256.
© [2018] IEEE. Reprinted, with permission, from Ref. [7]
As we can see from Fig. 2.2, when K = 1 or K = 2 the NSA algorithm has a large block error rate, although a large number of antennas at the base station can make up for part of it. Considering the requirement of a 10% block error rate in LTE [10], the NSA is not suitable for practical applications when K = 1 or K = 2 with a modulation order of 64-QAM. The simulation results show that with K = 1 the block error rate falls below 10⁻² only when the antenna configuration is large enough, and fewer terms are required by the NSA when the modulation order is 16-QAM. When the modulation order is 64-QAM and K = 3, the result of the NSA is close to that of the CHD algorithm: at a block error rate of 10⁻², the SNR loss of the NSA algorithm is less than 0.25 dB. Therefore, when the ratio of Nr to Nt in the massive MIMO system is large, the NSA algorithm with K = 3 can be used to find the inverse of the matrix, so that a low block error rate can be achieved under the condition of low computational complexity.
In summary, in a massive MIMO system, when the ratio of Nr to Nt is small, the CHD algorithm or other exact algorithms are required to find the inverse of the matrix; when the ratio of Nr to Nt is large, the NSA can be used to approximate the inversion. By using the NSA, we can obtain relatively accurate inverse results with low computational complexity, which makes the NSA an efficient and accurate method for massive MIMO detection in some specific cases.
2.3 Chebyshev Iteration Algorithm
2.3.1 Algorithm Design
![$$ {\varvec{Ax}} = {\varvec{b}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq83.png)
![$$ {\varvec{x}}^{(K)} = {\varvec{x}}^{(K - 1)} + {\varvec{\sigma}}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ35.png)
![$$ {\varvec{\sigma}}^{(0)} = \frac{1}{\beta }{\varvec{r}}^{(0)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ36.png)
![$$ {\varvec{\sigma}}^{(K)} = \rho^{(K)} {\varvec{r}}^{(K)} + \varphi^{(K)} {\varvec{\sigma}}^{(K - 1)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ37.png)
![$$ {\varvec{r}}^{(K)} = {\varvec{b}} - {\varvec{Ax}}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq84.png)
![$$ {\varvec{A}} = {\varvec{H}}^{H} {\varvec{H}} + \frac{{N_{0} }}{{E_{\text{s}} }}{\varvec{I}}_{{N_{\text{t}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq85.png)
![$$ {\varvec{{A}\hat{s}}} = {\varvec{y}}^{\text{MF}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq86.png)
Although the Chebyshev iteration can be used in the MMSE detection of massive MIMO, it still faces some challenges. First, since the parameters β, ρ(K) and φ(K) are related to the eigenvalues of the matrix A, they are difficult to calculate. Second, at the beginning of the iteration it is necessary to form the matrix A, which involves a multiplication between large-scale matrices and consumes a lot of hardware resources. Third, different initial values affect the convergence rate of the algorithm, and determining a good initial value is also a challenge. To solve the above problems, the Chebyshev iteration is optimized in this section to make it more suitable for massive MIMO signal detection.
![$$ {\hat{\varvec{s}}}^{(K)} = {\hat{\varvec{s}}}^{(K - 1)} + {\varvec{\sigma}}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ38.png)
![$$ {\hat{\varvec{s}}}^{(0)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq87.png)
![$$ {\varvec{\sigma}}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq88.png)
![$$ {\varvec{\sigma}}^{(K)} = \frac{1}{\beta }\left( {{\varvec{y}}^{\text{MF}} - {\varvec{{A}\hat{s}}}^{(K)} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ39.png)
![$$ \rho^{(K)} = \frac{2\alpha }{\beta }\frac{{T_{K} (\alpha )}}{{T_{K + 1} (\alpha )}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ41.png)
![$$ \varphi^{(K)} = \frac{{T_{K - 1} (\alpha )}}{{T_{K + 1} (\alpha )}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ42.png)
![$$ T_{0} (\alpha ) = 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ43.png)
![$$ T_{ 1} (\alpha ) = \alpha $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ44.png)
![$$ T_{K} (\alpha ) = 2\alpha {\text{T}}_{K - 1} (\alpha ) - T_{K - 2} (\alpha ),\quad K \ge 2 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ45.png)
![$$ \rho^{(1)} = \frac{{2\alpha^{2} }}{{\left( {2\alpha^{2} - 1} \right)\beta }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ46.png)
![$$ \rho^{(K)} = \frac{{4\alpha^{2} }}{{4\alpha^{2} \beta - \beta^{2} \rho^{(K - 1)} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ47.png)
![$$ \varphi^{(1)} = \frac{1}{{2\alpha^{2} - 1}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ48.png)
![$$ \varphi^{(K)} = \frac{{\beta^{2} \rho^{(K - 1)} }}{{4\alpha^{2} \beta - \beta^{2} \rho^{(K - 1)} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ49.png)
![$$ \alpha $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq90.png)
![$$ \beta $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq91.png)
![$$ \alpha = \frac{{\lambda_{\hbox{max} } + \lambda_{\hbox{min} } }}{{\lambda_{\hbox{max} } - \lambda_{\hbox{min} } }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ50.png)
![$$ \beta = \frac{{\lambda_{\hbox{max} } + \lambda_{\hbox{min} } }}{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ51.png)
![$$ \lambda_{\hbox{max} } \approx N_{\text{r}} \left( {1 + \sqrt {\frac{{N_{\text{t}} }}{{N_{\text{r}} }}} } \right)^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ52.png)
![$$ \lambda_{\hbox{min} } \approx N_{\text{r}} \left( {1 - \sqrt {\frac{{N_{\text{t}} }}{{N_{\text{r}} }}} } \right)^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ53.png)
So far, all the parameters used in the Chebyshev iteration can be expressed in terms of the scale parameters of the channel matrix H in the massive MIMO system. According to Eq. (2.38), we still need an initial value to use the Chebyshev iteration to estimate the signal s. Theoretically, any initial value leads to the final estimate, but the convergence rates corresponding to different initial values are not the same. A good initial value makes the algorithm converge faster to the desired result, achieving twice the result with half the effort.
![$$ N_{\text{r}} \gg N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq92.png)
![$$ A_{i,j} = \left\{ {\begin{array}{*{20}l} {\frac{{\lambda_{\hbox{max} }\,+\,\lambda_{\hbox{min} } }}{2} = \beta ,} \hfill & {i = j} \hfill \\ {0,} \hfill & {i \ne j} \hfill \\ \end{array} } \right. $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ54.png)
![$$ {\hat{\varvec{s}}}^{(0)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq93.png)
![$$ {\hat{\varvec{s}}}^{(0)} \approx \frac{1}{\beta }{\varvec{H}}^{\text{H}} {\varvec{y}} = \frac{2}{{\lambda_{\hbox{max} } + \lambda_{\hbox{min} } }}{\varvec{H}}^{\text{H}} {\varvec{y}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ55.png)
This initial value enables the Chebyshev iteration to achieve a faster convergence rate, and the computational complexity of the initial value is very low. In addition, the computation of the initial value can be executed in parallel.
![$$ {\hat{\varvec{s}}} = {\varvec{A}}^{ - 1} {\varvec{H}}^{\text{H}} {\varvec{y}} = {\varvec{A}}^{ - 1} {\varvec{H}}^{\text{H}} {\varvec{Hs}} + {\varvec{A}}^{ - 1} {\varvec{H}}^{\text{H}} {\varvec{n}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ56.png)
![$$ {\varvec{X}} = {\varvec{A}}^{ - 1} {\varvec{H}}^{\text{H}} {\varvec{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq94.png)
![$$ {\varvec{Y}} = {\varvec{X{A}}}^{ - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq95.png)
![$$ {\hat{\varvec{s}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq96.png)
![$$ \begin{aligned} {\hat{\varvec{s}}} \approx {\hat{\varvec{s}}}^{(K)} & = {\hat{\varvec{s}}}^{{\left( {K - 1} \right)}} + \rho^{{\left( {K - 1} \right)}} {\varvec{r}}^{{\left( {K - 1} \right)}} + \varphi^{{\left( {K - 1} \right)}} {\varvec{\sigma}}^{{\left( {K - 2} \right)}} \\ & = \left[ {\left( {1 + \varphi^{{\left( {K - 1} \right)}} } \right){\varvec{I}}_{{N_{\text{t}} }} - \rho^{{\left( {K - 1} \right)}} {\varvec{A}}} \right]{\hat{\varvec{s}}}^{{\left( {K - 1} \right)}} - \varphi^{{\left( {K - 1} \right)}} {\hat{\varvec{s}}}^{{\left( {K - 2} \right)}} + \rho^{{\left( {K - 1} \right)}} {\varvec{y}}^{\text{MF}} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ57.png)
![$$ {\varvec{y}}^{\text{MF}} = {\varvec{e}}_{{(N_{\text{r}} , 1)}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq97.png)
![$$ \begin{aligned} {\tilde{\varvec{A}}}^{ - 1} & \approx \left( {{\tilde{\varvec{A}}}^{ - 1} } \right)^{\left( K \right)} \\ & = \left[ {\left( {1 + \varphi^{{\left( {K - 1} \right)}} } \right){\varvec{I}}_{{N_{t} }} - \rho^{{\left( {K - 1} \right)}} {\varvec{A}}} \right]\left( {{\tilde{\varvec{A}}}^{ - 1} } \right)^{{\left( {K - 1} \right)}} - \varphi^{{\left( {K - 1} \right)}} \left( {{\tilde{\varvec{A}}}^{ - 1} } \right)^{{\left( {K - 2} \right)}} + \rho^{{\left( {K - 1} \right)}} {\varvec{I}} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ58.png)
![$$ ({\tilde{\varvec{A}}}^{ - 1} )^{(0)} = \frac{1}{\beta }{\varvec{I}}_{{N_{\text{t}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq98.png)
![$$ ({\tilde{\varvec{A}}}^{ - 1} )^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq99.png)
![$$ \begin{aligned} {\hat{\varvec{X}}} & \approx {\hat{\varvec{X}}}^{(K)} \\ & = \left[ {\left( {1 + \varphi^{{\left( {K - 1} \right)}} } \right){\varvec{I}}_{{N_{\text{t}} }} - \rho^{{\left( {K - 1} \right)}} {\varvec{A}}} \right]{\hat{\varvec{X}}}^{{\left( {K - 1} \right)}} - \varphi^{{\left( {K - 1} \right)}} {\hat{\varvec{X}}}^{{\left( {K - 2} \right)}} + \rho^{{\left( {K - 1} \right)}} {\varvec{H}}^{\text{H}} {\varvec{H}} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ59.png)
![$$ {\hat{\varvec{Y}}} \approx {\hat{\varvec{Y}}}^{(K)} = {\hat{\varvec{X}}}({\hat{\varvec{W}}}^{ - 1} )^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ60.png)
![$$ \mu_{i} = \hat{X}_{i,i} \approx \hat{X}_{i,i}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ61.png)
![$$ \nu_{i}^{2} = \sum\limits_{j \ne i}^{{N_{\text{t}} }} {\left| {X_{i,j} } \right|E_{\text{s}}\,+\,} Y_{i,i} N_{0} \approx N_{0} \hat{X}_{i,i}^{(K)} \tilde{A}_{i,i}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ62.png)
![$$ L_{i,b} (\hat{s}_{i} ) = \gamma_{i} \left( {\mathop {\hbox{min} }\limits_{{s \in S_{b}^{0} }} \left| {\frac{{\hat{s}_{i} }}{{\mu_{i} }} - s} \right|^{2} - \mathop {\hbox{min} }\limits_{{s \in S_{b}^{1} }} \left| {\frac{{\hat{s}_{i} }}{{\mu_{i} }} - s} \right|^{2} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ63.png)
![$$ \gamma_{i} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq100.png)
![$$ \gamma_{i} = \frac{{\mu_{i}^{2} }}{{\nu_{i}^{2} }} = \frac{{(\hat{X}_{i,i}^{(K)} )^{2} }}{{N_{0} \hat{X}_{i,i}^{(K)} \tilde{A}_{i,i}^{(K)} }} = \frac{{\hat{X}_{i,i}^{(K)} }}{{N_{0} \tilde{A}_{i,i}^{(K)} }} \approx \frac{1}{\beta }\frac{1}{{N_{0} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ64.png)
![$$ \gamma_{i} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq101.png)
![$$ S_{b}^{0} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq102.png)
![$$ S_{b}^{1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq103.png)
![$$ Q\left( {\left| Q \right| = 2^{\vartheta } } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq104.png)
![$$ S_{b}^{0} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq105.png)
![$$ S_{b}^{1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq106.png)
![$$ L_{i,b} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq107.png)
![$$ L_{i,b} (\hat{s}_{i} ) = \gamma_{i} \xi_{b} (\hat{s}_{i} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq108.png)
![$$ \hat{X}_{i,i}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq109.png)
![$$ \tilde{A}_{i,i}^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq110.png)
Based on the above analysis, we obtain an optimized Chebyshev iteration algorithm that approximates the MMSE detector in a massive MIMO system, named the parallelizable Chebyshev iteration (PCI). The algorithm [13] is shown in Algorithm 2.1.
Algorithm 2.1
![../images/478198_1_En_2_Chapter/478198_1_En_2_Figa_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Figa_HTML.png)
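Since Algorithm 2.1 above is reproduced only as a figure, the following NumPy sketch shows one plausible realization of PCI based on Eqs. (2.35)–(2.38), (2.46)–(2.53) and (2.55); the exact update ordering and the function interface are assumptions, and Algorithm 2.1 remains authoritative:

```python
import numpy as np

def pci_detect(H, y, N0_over_Es, K=3):
    """Sketch of the parallelizable Chebyshev iteration (PCI) detector.

    Uses the parameter recursions of Eqs. (2.46)-(2.49) and the eigenvalue
    estimates of Eqs. (2.52)-(2.53); the update ordering is an assumption.
    """
    Nr, Nt = H.shape
    y_mf = H.conj().T @ y                          # matched filter y^MF = H^H y
    A = H.conj().T @ H + N0_over_Es * np.eye(Nt)   # MMSE matrix A

    lam_max = Nr * (1 + np.sqrt(Nt / Nr)) ** 2     # Eq. (2.52)
    lam_min = Nr * (1 - np.sqrt(Nt / Nr)) ** 2     # Eq. (2.53)
    alpha = (lam_max + lam_min) / (lam_max - lam_min)   # Eq. (2.50)
    beta = (lam_max + lam_min) / 2.0                    # Eq. (2.51)

    s = y_mf / beta                                # initial value, Eq. (2.55)
    sigma = (y_mf - A @ s) / beta                  # sigma^(0), Eq. (2.36)
    rho = 2 * alpha**2 / ((2 * alpha**2 - 1) * beta)    # rho^(1), Eq. (2.46)
    phi = 1.0 / (2 * alpha**2 - 1)                      # phi^(1), Eq. (2.48)
    for _ in range(K - 1):                         # K counts the initial value step
        s = s + sigma                              # Eq. (2.38)
        r = y_mf - A @ s                           # residual; elementwise parallel
        sigma = rho * r + phi * sigma              # Eq. (2.37)
        denom = 4 * alpha**2 * beta - beta**2 * rho
        rho, phi = 4 * alpha**2 / denom, beta**2 * rho / denom  # Eqs. (2.47), (2.49)
    return s
```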
2.3.2 Convergence
![$$ {\varvec{e}}^{(K)} = {\varvec{s}} - {\hat{\varvec{s}}}^{(K)} = P^{(K)} ({\varvec{A}}){\varvec{e}}^{(0)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ65.png)
![$$ P^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq111.png)
![$$ P^{(K)} (\lambda_{i} ) = \frac{{T^{(K)} \left( {\alpha - \frac{\alpha }{\beta }\lambda_{i} } \right)}}{{T^{(K)} (\alpha )}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ66.png)
![$$ \left\| {{\varvec{e}}^{(K)} } \right\| \le \left\| {P^{\left( K \right)} ({\varvec{A}})} \right\|\left\| {{\varvec{e}}^{( 0)} } \right\| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ67.png)
![$$ \begin{aligned} P^{(K)} ({\varvec{A}}) & = P^{(K)} ({\varvec{SJS}}^{{ - \text{1}}} ) = {\varvec{S}}P^{(K)} ({\varvec{J}}){\varvec{S}}^{ - 1} \\ & = {\varvec{S}}\left[ {\begin{array}{*{20}c} {P^{(K)} (\lambda_{1} )} & {} & {} \\ {} & \ddots & {} \\ {} & {} & {P^{(K)} (\lambda_{{N_{\text{t}} }} )} \\ \end{array} } \right]{\varvec{S}}^{ - 1} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ68.png)
![$$ N_{\text{t}} \times N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq112.png)
![$$ {\varvec{S}}^{ - 1} = {\varvec{S}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq113.png)
Lemma 2.3.1
![$$ \left| {P^{(K)} (\lambda_{i} )} \right| \approx \left| {\frac{{N_{\text{r}} + N_{\text{t}} - \lambda_{i} + \sqrt {\left( {N_{\text{r}} + N_{\text{t}} - \lambda_{i} } \right)^{ 2} - 1} }}{{ 2N_{\text{r}} }}} \right|^{K} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ69.png)
![$$ P^{(K)} (\lambda_{i} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq114.png)
Proof
![$$ \left| {Q^{(K)} (\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq121.png)
![$$ P^{(K)} (\lambda_{i} ) \approx (V(\lambda_{i} ))^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ76.png)
![$$ V(\lambda_{i} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq122.png)
![$$ V = V({\varvec{A}}) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq123.png)
![$$ P^{(K)} ({\varvec{A}}) \approx V^{(K)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ77.png)
![$$ \lambda_{\hbox{min} } < \lambda_{i} < \lambda_{\hbox{max} } $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq124.png)
![$$ \left| {V(\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq125.png)
![$$ \psi_{i} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq126.png)
![$$ \left| {\psi_{i} } \right| = \left| {\frac{{\alpha - \frac{\alpha }{\beta }\lambda_{i} + \sqrt {\left( {\alpha - \frac{\alpha }{\beta }\lambda_{i} } \right)^{2} - 1} }}{{\alpha + \sqrt {\alpha^{2} - 1} }}} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ78.png)
Therefore, the conclusion of Lemma 2.3.1 is obtained.
![$$ N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq127.png)
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq128.png)
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq129.png)
![$$ \left| {P^{(K)} (\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq130.png)
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq131.png)
![$$ N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq132.png)
![$$ \mathop {\lim }\limits_{K \to \infty } \left| {P^{(K)} (\lambda_{i} )} \right| = 0 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ80.png)
![$$ \mathop {\lim }\limits_{K \to \infty } \left| {P^{(K)} ({\varvec{A}})} \right| = 0 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ81.png)
In other words, when the Chebyshev iteration algorithm is used to estimate the transmitted signal s in massive MIMO detection, the calculation error becomes very small, even close to zero, as the number of iterations grows.
Lemma 2.3.2
In the massive MIMO system, |Vch(λi)| ≤ |Vcg(λi)| and |Vch(λi)| ≤ |Vns(λi)| are satisfied, where |Vch(λi)|, |Vcg(λi)| and |Vns(λi)| are the normalized Chebyshev polynomials corresponding to the Chebyshev iteration algorithm, the CG algorithm and the NSA algorithm respectively.
Proof
![$$ R({\varvec{A}}) = - \log_{2} \left( {\mathop {\lim }\limits_{K \to \infty } \left( {\left\| {P^{(K)} ({\varvec{A}})} \right\|^{{\frac{1}{K}}} } \right)} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ82.png)
![$$ \mathop {\lim }\limits_{K \to \infty } \left( {\left\| {P^{(K)} ({\varvec{A}})} \right\|^{{\frac{1}{K}}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq138.png)
![$$ \left| {V(\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq139.png)
![$$ \hbox{min} \hbox{max} \left| {V(\lambda_{i} )} \right| = \hbox{min} \hbox{max} \left| {\frac{{\alpha - \frac{\alpha }{\beta }\lambda_{i} + \sqrt {\left( {\alpha - \frac{\alpha }{\beta }\lambda_{i} } \right)^{2} - 1} }}{{\alpha + \sqrt {\alpha^{2} - 1} }}} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ83.png)
![$$ \lambda_{i} = \beta $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq140.png)
![$$ \left| {V(\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq141.png)
![$$ \left| {V_{\text{ch}} (\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq142.png)
![$$ \begin{aligned} \left| {V_{\text{ch}} (\lambda_{i} )} \right| & = \left| {\frac{1}{{\alpha + \left( {\alpha^{2} - 1} \right)^{{\frac{1}{2}}} }}} \right| = \left| {\frac{1}{{\frac{{\lambda_{\text{max} } \,+\, \lambda_{\text{min} } }}{{\lambda_{\textrm{max} }\, - \,\lambda_{\text{min} } }} + \sqrt {\frac{{\lambda_{\text{max} } \,+\, \lambda_{\text{min} } }}{{\lambda_{\text{max} }\, -\, \lambda_{\text{min} } }}}^{2} - 1}}} \right| \\ & = \left| {\frac{{\lambda_{\text{max} } - \lambda_{\text{min} } }}{{\lambda_{\text{max} } + \lambda_{\text{min} } + 2\sqrt {\lambda_{\text{max} } \cdot \lambda_{\text{min} } } }}} \right| \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ84.png)
![$$ \theta = \sqrt {\frac{{\lambda_{\text{min} } }}{{\lambda_{\textrm{max} } - \lambda_{\textrm{min} } }}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq143.png)
![$$ \left| {V_{\text{cg}} (\lambda_{i} )} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq144.png)
![$$ \begin{aligned} \left| {V_{{{\text{cg}}}} (\lambda _{i} )} \right| = & \left| {\left[ {\frac{2}{{\left( {1 + 2\theta + \sqrt {\left( {1 + 2\theta } \right)^{2} - 1} } \right)^{K} + \left( {1 + 2\theta + \sqrt {\left( {1 + 2\theta } \right)^{2} - 1} } \right)^{{ - K}} }}} \right]^{{\frac{1}{K}}} } \right| \\ \quad \ge & \left| {\frac{1}{{1 + 2\theta + \sqrt {\left( {1 + 2\theta } \right)^{2} - 1} }}} \right| = \left| {\frac{{\lambda _{{{\text{max}}}} - \lambda _{{{\text{min}}}} }}{{\lambda _{{{\text{max}}}} + \lambda _{{{\text{min}}}} + 2\sqrt {\lambda _{{{\text{max}}}} \cdot \lambda _{{{\text{min}}}} } }}} \right| \\ \quad = & \left| {V_{{{\text{ch}}}} (\lambda _{i} )} \right| \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ85.png)
From Eq. (2.85) we obtain |Vch(λi)| ≤ |Vcg(λi)|. This inequality indicates that, compared with the CG algorithm, |V(λi)| has a smaller maximum value under the Chebyshev iteration algorithm.
![$$ \left| {V_{\text{ch}} \left( {N_{\text{r}} ,N_{\text{t}} } \right)} \right| $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq147.png)
![$$ \left| {V_{\text{ch}} \left( {N_{\text{r}} ,N_{\text{t}} } \right)} \right| = \left| {\frac{{2\sqrt {N_{\text{r}} N_{\text{t}} } }}{{N_{\text{r}} + N_{\text{t}} + \left( {N_{\text{r}} + N_{\text{t}} } \right)^{2} - \left( {2\sqrt {N_{\text{r}} N_{\text{t}} } } \right)^{2} }}} \right| = \sqrt {\frac{{N_{\text{t}} }}{{N_{\text{r}} }}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ86.png)
Equation (2.87) indicates that, compared with the NSA algorithm, |V(λi)| also has a smaller maximum value under the Chebyshev iteration algorithm.
Combined with Eqs. (2.68), (2.82) and (2.83), a smaller maximum value of |V(λi)| results in a faster convergence rate. Therefore, as we know from Lemma 2.3.2, the convergence rate of the Chebyshev iteration algorithm is faster than those of the CG and NSA algorithms.
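As a worked instance of Eq. (2.86), take the configuration Nr = 128, Nt = 16 that also appears in the simulations later in this chapter; the error-contraction reading in the second line is an informal consequence of Eqs. (2.67)–(2.69):

```latex
\left| V_{\text{ch}} \right| = \sqrt{\frac{N_{\text{t}}}{N_{\text{r}}}}
  = \sqrt{\frac{16}{128}} \approx 0.354,
\qquad
\left\| \varvec{e}^{(K)} \right\| \lesssim 0.354^{K} \left\| \varvec{e}^{(0)} \right\|
```

That is, each Chebyshev iteration shrinks the estimation error by roughly a factor of three in this configuration.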
2.3.3 Complexity and Parallelism
After discussing the convergence and convergence rate of the Chebyshev iteration algorithm, we analyze its computational complexity. Since computational complexity mainly refers to the number of multiplications in the algorithm, we count the multiplications it requires. The first part of the calculation is the parameter computation based on Eqs. (2.46)–(2.53). Because the scale of the massive MIMO system is fixed, these parameters only need to be computed once and stored in memory as constants when processing multiple sets of data, so their actual number of multiplications is negligible. The second part computes the matched-filter vector and the approximate initial solution from the eigenvalue estimates: the Nt × Nr matrix H^H is multiplied by the Nr × 1 received vector y to get the Nt × 1 matched-filter vector, which is then scaled by 1/β. The third part consists of three steps per iteration: first, multiply the Nr × Nt matrix H by the Nt × 1 estimated vector; then solve the residual vector; and finally find the correction vector. The last part is the calculation of the approximate log-likelihood ratios, which requires additional multiplications when the modulation order is 64-QAM. The total number of real multiplications required by the Chebyshev iteration algorithm is summarized in the complexity table below.
![$$ O(N_{\text{t}}^{2} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq169.png)
![$$ O(N_{\text{t}}^{2} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq170.png)
![$$ O(N_{\text{t}}^{3} ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq171.png)
![$$ N_{\text{t}} \times N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq172.png)
![$$ {\varvec{H}}^{\text{H}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq173.png)
![$$ N_{\text{r}} \times N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq174.png)
![$$ O\left( {N_{\text{r}} N_{\text{t}}^{2} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq175.png)
Computational complexity analysis
Algorithm | K = 2 | K = 3 | K = 4 | K = 5 |
---|---|---|---|---|
NSA [7] | 2NrNt² + 6NtNr + 4Nt² + 2Nt | 2NrNt² + 6NtNr + 2Nt³ + 4Nt² | 2NrNt² + 6NtNr + 6Nt³ | 2NrNt² + 6NtNr + 10Nt³ − 6Nt² |
INS [17] | 2NrNt² + 6NtNr + 10Nt² + 2Nt | 2NrNt² + 6NtNr + 14Nt² + 2Nt | 2NrNt² + 6NtNr + 18Nt² + 2Nt | 2NrNt² + 6NtNr + 22Nt² + 2Nt |
GAS [12] | 2NrNt² + 6NtNr + 10Nt² − 2Nt | 2NrNt² + 6NtNr + 14Nt² − 6Nt | 2NrNt² + 6NtNr + 18Nt² − 10Nt | 2NrNt² + 6NtNr + 22Nt² − 14Nt |
CG [16] | 2NrNt² + 6NtNr + 8Nt² + 33Nt | 2NrNt² + 6NtNr + 12Nt² + 49Nt | 2NrNt² + 6NtNr + 16Nt² + 65Nt | 2NrNt² + 6NtNr + 20Nt² + 81Nt |
CGLS [18] | 16NtNr + 20Nt² + 32Nt | 20NtNr + 28Nt² + 48Nt | 24NtNr + 36Nt² + 64Nt | 28NtNr + 44Nt² + 80Nt |
OCD [19] | 16NtNr + 4Nt | 20NtNr + 6Nt | 32NtNr + 8Nt | 40NtNr + 10Nt |
PCI | 2NrNt² + 6NtNr + 8Nt² + 8Nt | 2NrNt² + 6NtNr + 12Nt² + 12Nt | 2NrNt² + 6NtNr + 16Nt² + 16Nt | 2NrNt² + 6NtNr + 20Nt² + 20Nt |
| 12NtNr + 7Nt | 20NtNr + 13Nt | 28NtNr + 19Nt | 36NtNr + 25Nt |
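As a quick numeric reading of the first PCI row (an illustrative configuration, not one singled out by the table): for Nr = 128, Nt = 16 and K = 3,

```latex
2N_{\text{r}}N_{\text{t}}^{2} + 6N_{\text{t}}N_{\text{r}} + 12N_{\text{t}}^{2} + 12N_{\text{t}}
  = 65{,}536 + 12{,}288 + 3{,}072 + 192 = 81{,}088
```

real multiplications, of which the Gram-matrix-sized term 2NrNt² accounts for roughly 80%.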
Now consider the parallelism of PCI. In the Gauss–Seidel (GAS) algorithm [12], there is a strong correlation between elements, which means the method has low parallelism: when calculating an element of the estimate at the Kth iteration, GAS requires the already-updated elements of the current iteration as well as the remaining elements of the previous iterations. Moreover, in the CHD algorithm as well as the NSA, GAS and CG methods, there is a strong dependence between the large-scale matrix inversion and the matrix multiplications, which requires the large-scale matrix multiplication to be computed first; this reduces their parallelism. The parallelism of PCI is an important consideration in algorithm design and hardware implementation. According to the algorithm, when calculating the matched-filter vector, the correction vector and the residual vector, each element can be computed in parallel. Besides, the implicit method reduces the correlation between the large-scale matrix multiplication and the inverse calculation, improving the parallelism of the algorithm.
2.3.4 Bit Error Rate
To evaluate the performance of PCI, its BER simulation results are compared below with those of the NSA, CG, GAS, OCD and Richardson (RI) algorithms. Furthermore, the comparison also includes the MMSE algorithm based on the CHD. The SNR is defined at the receiving antenna [7]. For PCI, an iteration count of K denotes the initial value calculation followed by K − 1 iterations.
![$$ N_{\text{t}} = 16 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq184.png)
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq185.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig3_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig3_HTML.png)
BER simulation results of various algorithms at Nt = 16, SNR = 14 dB.
© [2018] IEEE. Reprinted, with permission, from Ref. [13]
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig4_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig4_HTML.png)
BER performance comparison between PCI with the updated initial value and PCI with the conventional zero-vector initial value.
© [2018] IEEE. Reprinted, with permission, from Ref. [13]
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq186.png)
![$$ N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq187.png)
![$$ N_{\text{r}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq188.png)
![$$ N_{\text{t}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq189.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig5a_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig5a_HTML.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig5b_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig5b_HTML.png)
BER comparisons between PCI and other algorithms. a Nr = 64, Nt = 16, b Nr = 90, Nt = 16, c Nr = 128, Nt = 16, d Nr = 162, Nt = 16
© [2018] IEEE. Reprinted, with permission, from Ref. [13]
2.3.5 Analysis on Channel Model Impact
![$$ N\left( {0,d({\varvec{z}}){\varvec{I}}_{{N_{\text{r}} }} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq190.png)
![$$ {\varvec{H}} = {\varvec{R}}_{\text{r}}^{{^{{\frac{1}{2}}} }} {\varvec{H}}_{{{\text{i}} . {\text{i}} . {\text{d}} .}} \sqrt {d({\varvec{z}})} {\varvec{R}}_{\text{t}}^{{^{{\frac{ 1}{ 2}}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ88.png)
![$$ 10\lg C\sim\,N\left( {0,\sigma_{\text{sf}}^{2} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ89.png)
Considering the channel fading variance d(z) = C/‖z − b‖^κ, where b, κ and ‖·‖ are the base station location, path loss exponent and Euclidean norm respectively. The simulation adopts the following assumptions: κ = 3.7, with the shadow-fading variance σ²sf and the transmitted power fixed.
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig6_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig6_HTML.png)
BER comparisons of various algorithms under the Kronecker channel model.
© [2018] IEEE. Reprinted, with permission, from Ref. [13]
2.4 Jacobi Iteration Algorithm
2.4.1 Weighted Jacobi Iteration and Convergence
![$$ {\varvec{W}} = {\varvec{P}} + {\varvec{Q}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq195.png)
![$$ {\hat{\varvec{s}}}^{(K)} = {\varvec{B\hat{s}}}^{{\left( {K - 1} \right)}} + {\varvec{F}}{ = }\left( {\left( {1 - \omega } \right){\varvec{I}} - \omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right){\hat{\varvec{s}}}^{{\left( {K - 1} \right)}} + \omega {\varvec{P}}^{ - 1} {\varvec{y}}^{\text{MF}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ90.png)
![$$ {\varvec{B}} = \left( {\left( {1 - \omega } \right){\varvec{I}} - \omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq196.png)
![$$ {\varvec{F}} = \omega {\varvec{P}}^{ - 1} {\varvec{y}}^{\text{MF}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq197.png)
![$$ {\hat{\varvec{s}}}^{(0)} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq198.png)
![$$ 0 < \omega < 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq199.png)
![$$ 0 < \omega < \frac{2}{\rho }\left( {{\varvec{P}}^{ - 1} {\varvec{W}}} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq200.png)
![$$ {\hat{\varvec{s}}}^{(0)} = \left( {{\varvec{I}} - {\varvec{P}}^{ - 1} {\varvec{Q}}} \right){\varvec{P}}^{ - 1} {\varvec{y}}^{\text{MF}} = \left( {{\varvec{I}} - {\varvec{R}}} \right){\varvec{T}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ91.png)
In addition, because this initial value increases the convergence rate of the algorithm, it also reduces hardware resource consumption and increases the data throughput.
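As a concrete illustration of the update in Eq. (2.90) and the initial value in Eq. (2.91), here is a small NumPy sketch; the relaxation factor ω = 0.7 and the function interface are illustrative assumptions, not values from the chapter:

```python
import numpy as np

def weji_detect(H, y, N0_over_Es, omega=0.7, K=3):
    """Sketch of the weighted Jacobi (WeJi) detector, Eqs. (2.90)-(2.91).

    W = P + Q with P = diag(W); omega is the relaxation factor
    (0 < omega < 1). The value 0.7 is only an illustrative choice.
    """
    Nt = H.shape[1]
    y_mf = H.conj().T @ y                       # matched filter y^MF
    W = H.conj().T @ H + N0_over_Es * np.eye(Nt)
    p = np.diag(W)                              # diagonal part P (as a vector)
    Q = W - np.diag(p)
    T = y_mf / p                                # T = P^{-1} y^MF
    s = T - (Q @ T) / p                         # initial value, Eq. (2.91)
    for _ in range(K):
        # s^(K) = ((1 - omega)I - omega P^{-1}Q) s^(K-1) + omega P^{-1} y^MF
        s = (1 - omega) * s - omega * (Q @ s) / p + omega * T
    return s
```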
![$$ \Delta = {\varvec{s}} - {\hat{\varvec{s}}}^{(K)} \approx {\hat{\varvec{s}}}^{(\infty )} - {\hat{\varvec{s}}}^{(K)} = {\varvec{B}}^{K} \left( {{\varvec{s}} - {\hat{\varvec{s}}}^{(0)} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ92.png)
![$$ {\varvec{s}} = {\hat{\varvec{s}}}^{(\infty )} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq201.png)
![$$ R({\varvec{B}}) = - \ln \left( {\mathop {\lim }\limits_{K \to \infty } \left\| {{\varvec{B}}^{K} } \right\|^{{\frac{1}{K}}} } \right) = - \ln \left( {\rho ({\varvec{B}})} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ93.png)
![$$ \rho ({\varvec{B}}) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq202.png)
![$$ \rho ({\varvec{B}}) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq203.png)
Lemma 2.4.1
In massive MIMO systems, $\rho \left( {{\varvec{B}}_{\text{W}} } \right) \le \omega \rho \left( {{\varvec{B}}_{\text{N}} } \right)$, where ${\varvec{B}}_{\text{W}}$ and ${\varvec{B}}_{\text{N}}$ are the iterative matrices of the WeJi and the NSA respectively.
Proof
![$$ {\varvec{B}}_{\text{W}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq207.png)
![$$ \rho \left( {{\varvec{B}}_{\text{W}} } \right) = \rho \left( {\left( {1 - \omega } \right){\varvec{I}} - \omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ94.png)
![$$ \rho \left( {{\varvec{B}}_{\text{N}} } \right) = \rho \left( {{\varvec{P}}^{ - 1} {\varvec{Q}}} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ96.png)
![$$ \rho \left( {{\varvec{B}}_{\text{W}} } \right) \le \omega \rho \left( {{\varvec{B}}_{\text{N}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ97.png)
![$$ \left\| \Delta \right\|_{ 2} \le \left\| {{\varvec{B}}_{\text{W}}^{K} } \right\|_{\text{F}} \left\| {{\varvec{s}} - {\hat{\varvec{s}}}^{(0)} } \right\|_{ 2} \le \left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}}^{K} \left\| {{\varvec{s}} - {\hat{\varvec{s}}}^{(0)} } \right\|_{ 2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ98.png)
According to the above expression (2.98), if $\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1$ is satisfied, the approximation error of the WeJi approaches 0 exponentially as the number of iterations K increases.
Lemma 2.4.2
In massive MIMO systems, the probability of $\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1$ satisfies
![$$ {\text{P}}\left\{ {\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1} \right\} \ge 1 - \omega \sqrt[4]{{\frac{{(N_{\text{r}} + 17)(N_{\text{t}} - 1)N_{\text{r}}^{2} }}{{2N_{\text{r}}^{3} }}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ99.png)
Proof
![$$ P\left\{ {\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1} \right\} \ge 1 - P\left\{ {\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} \ge 1} \right\} \ge 1 - E\left( {\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ100.png)
![$$ 1 - \omega $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq213.png)
![$$ 0 < 1 - \omega < 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq214.png)
![$$ \left( {1 - \omega } \right){\varvec{I}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq215.png)
![$$ P\left\{ {\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1} \right\} \ge 1 - E\left( {\left\| {\omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right\|_{\text{F}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ101.png)
![$$ \left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq216.png)
![$$ E\left( {\left\| {\omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right\|_{\text{F}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq217.png)
![$$ \left\| {{\varvec{P}}^{ - 1} {\varvec{Q}}} \right\|_{\text{F}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq218.png)
![$$ a_{ij} \to \left\{ {\begin{array}{*{20}l} {\sum\limits_{t = 1}^{{N_{\text{r}} }} {h_{ti}^{*} h_{tj} } ,} \hfill & {i \ne j} \hfill \\ {\sum\limits_{m = 1}^{{N_{\text{r}} }} {\left| {h_{mi} } \right|^{2} + \frac{{N_{0} }}{{E_{\text{s}}^{{}} }},} } \hfill & {i = j} \hfill \\ \end{array} } \right. $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ102.png)
![$$ E\left( {\left\| {\omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right\|_{\text{F}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq219.png)
![$$ E\left( {\left\| {\omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right\|_{\text{F}} } \right) = E\left( {\sqrt {\sum\limits_{i = 1}^{{N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{N_{\text{t}} }} {\left| {\omega \frac{{a_{ij} }}{{a_{ii} }}} \right|^{2} } } } } \right) = E\left( {\sqrt[4]{{\left( {\sum\limits_{i = 1}^{{N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{N_{\text{t}} }} {\left( {\omega^{2} \frac{{\left| {a_{ij} } \right|^{2} }}{{\left| {a_{ii} } \right|^{2} }}} \right)} } } \right)^{2} }}} \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ103.png)
![$$ E\left( {\left\| {\omega {\varvec{P}}^{ - 1} {\varvec{Q}}} \right\|_{\text{F}} } \right) \le \omega \sqrt[4]{{\sum\limits_{i = 1}^{{N_{\text{t}} }} {\sum\limits_{j = 1,i \ne j}^{{N_{\text{t}} }} {E\left( {\left| {a_{ij} } \right|^{4} } \right) \cdot } } \sum\limits_{i = 1}^{{N_{\text{t}} }} {E\left( {\frac{1}{{\left| {a_{ii} } \right|^{4} }}} \right)} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ104.png)
Next, we evaluate $E\left( {\left| {a_{ij} } \right|^{4} } \right)$ and $\sum\nolimits_{i = 1}^{{N_{\text{t}} }} {E\left( {1/\left| {a_{ii} } \right|^{4} } \right)}$. The latter is a sum of $N_{\text{t}}$ identical terms $E\left( {1/\left| {a_{ii} } \right|^{4} } \right)$, and since $a_{ii}$ concentrates around $N_{\text{r}}$ when the number of receive antennas is large,
![$$ E\left( {\frac{1}{{\left| {a_{ii} } \right|^{4} }}} \right) = \frac{1}{{N_{\text{r}}^{4} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ105.png)
![$$ \varphi \left( {h_{1i}^{*} h_{1j} ,h_{2i}^{*} h_{2j} , \cdots ,h_{{N_{\text{r}} i}}^{*} h_{{N_{\text{r}} j}} } \right) = \frac{{{\text{e}}^{{ - \frac{{\left( {{\varvec{X}} - \mu } \right)^{\text{T}} {\varvec{C}}^{ - 1} \left( {{\varvec{X}} - \mu } \right)}}{2}}} }}{{\left( {2\,\uppi} \right)^{{\frac{{N_{\text{r}} }}{2}}} \sqrt {\det {\varvec{C}}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ109.png)
![$$ N(0, 1 ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq227.png)
![$$ m \ne p $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq228.png)
![$$ h_{mi}^{ * } h_{mj} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq229.png)
![$$ h_{pi}^{ * } h_{pj} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq230.png)
![$$ N(0, 1 ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq231.png)
![$$ \varphi \left( {h_{1i}^{*} h_{1j} ,h_{2i}^{*} h_{2j} , \cdots ,h_{{N_{\text{r}} i}}^{*} h_{{N_{\text{r}} j}} } \right) = \frac{1}{{\left( {2\,\uppi} \right)^{{\frac{{N_{\text{r}} }}{2}}} }}{\text{e}}^{{ - \frac{{{\varvec{X}}^{\text{T}} {\varvec{X}}}}{2}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ110.png)
![$$ m \ne p $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq232.png)
![$$ h_{mi}^{ * } h_{mj} h_{pi}^{ * } h_{pj} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq233.png)
![$$ N(0, 1 ) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq234.png)
![$$ \left( {h_{ii} } \right)^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq235.png)
![$$ \chi^{2} (1) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq236.png)
![$$ \left( {h_{ii} } \right)^{2} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq237.png)
![$$ f\left( {h_{mi} ;1} \right) = \left\{ {\begin{array}{*{20}c} {\frac{1}{{2\Gamma \left( {\frac{1}{2}} \right)}}\sqrt {\frac{2}{{h_{mi} }}} {\text{e}}^{{ - \frac{{h_{mi} }}{2}}} } & {h_{mi} > 0} \\ 0 & {h_{mi} < 0} \\ \end{array} } \right. $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ111.png)
![$$ E\left( {\left| {h_{ii} } \right|^{2} } \right) = 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq238.png)
![$$ D\left( {\left| {h_{ii} } \right|^{2} } \right) = 2 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq239.png)
![$$ i \ne j $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq240.png)
![$$ h_{mi}^{ * } $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq241.png)
![$$ h_{mj} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq242.png)
$$ E\left( {\left| {h_{mi} } \right|^{4} } \right) = D\left( {\left| {h_{mi} } \right|^{2} } \right) + \left[ {E\left( {\left| {h_{mi} } \right|^{2} } \right)} \right]^{2} = 3 $$
![$$ E\left( {\left| {h_{mi}^{*} h_{mj} } \right|^{2} } \right) = E\left( {\left| {h_{mi}^{*} } \right|^{2} } \right)E\left( {\left| {h_{mj} } \right|^{2} } \right) = 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq244.png)
![$$ E\left( {\left| {h_{mi}^{*} h_{mj} } \right|^{4} } \right) = E\left( {\left| {h_{mi}^{*} } \right|^{4} } \right)E\left( {\left| {h_{mj} } \right|^{4} } \right) = 9 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq245.png)
![$$ E\left( {\left| {a_{ii} } \right|^{4} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq246.png)
![$$ E\left( {\left| {a_{ij} } \right|^{4} } \right)\,=\,N_{\text{r}} E\left( {\left( {h_{mi}^{*} h_{mj} } \right)^{4} } \right) + \left( \begin{aligned} N_{\text{r}} \hfill \\ 2 \hfill \\ \end{aligned} \right)\left( {E\left( {h_{mi}^{*} h_{mj} } \right)^{2} } \right)^{2} = \frac{1}{2}\left( {N_{\text{r}}^{2} + 17N_{\text{r}} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ112.png)
By substituting Eqs. (2.105) and (2.112) into Eqs. (2.101) and (2.104), we can obtain the conclusion in Lemma 2.4.2.
Lemma 2.4.2 demonstrates that when $N_{\text{t}}$ is fixed, the probability of $\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1$ increases with $N_{\text{r}}$. Because $N_{\text{r}} \gg N_{\text{t}}$ in the massive MIMO system, this probability is close to 1.
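Lemma 2.4.2 can also be checked numerically. The sketch below estimates the probability of $\left\| {{\varvec{B}}_{\text{W}} } \right\|_{\text{F}} < 1$ by Monte Carlo over i.i.d. Rayleigh channels; the noise term of P is omitted for simplicity, and the values of Nr, Nt and omega are assumptions:

```python
import numpy as np

# Monte-Carlo check of Lemma 2.4.2 (illustrative parameter values)
Nr, Nt, omega, trials = 128, 8, 0.7, 200
hits = 0
for _ in range(trials):
    H = (np.random.randn(Nr, Nt) + 1j * np.random.randn(Nr, Nt)) / np.sqrt(2)
    W = H.conj().T @ H                                    # Gram matrix (noise term omitted)
    R = (W - np.diag(np.diag(W))) / np.diag(W)[:, None]   # P^{-1} Q
    B_W = (1 - omega) * np.eye(Nt) - omega * R            # iteration matrix
    hits += np.linalg.norm(B_W, 'fro') < 1
print(hits / trials)  # empirically close to 1 when Nr >> Nt
```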
2.4.2 Complexity and Frame Error Rate
In the WeJi, the matrix ${\varvec{R}}\left( {{\varvec{R = P}}^{ - 1} {\varvec{Q}}} \right)$ and the vector ${\varvec{T}}\left( {{\varvec{T = P}}^{ - 1} {\varvec{y}}^{\text{MF}} } \right)$ are computed before the iterations; the initial value ${\hat{\varvec{s}}}^{(0)}$ and the iterative solutions ${\hat{\varvec{s}}}^{(K)}$ are then obtained from ${\varvec{y}}^{\text{MF}}$. Here ${\varvec{P}}$ and ${\varvec{Q}}$ are $N_{\text{t}} \times N_{\text{t}}$ matrices and ${\varvec{y}}^{\text{MF}}$ is an $N_{\text{t}} \times 1$ vector. Since ${\varvec{P}}$ is diagonal, computing ${\varvec{R}}$ requires $2N_{\text{t}} \left( {N_{\text{t}} - 1} \right)$ real multiplications and computing ${\varvec{T}}$ requires $2N_{\text{t}}$, and a further $4N_{\text{t}}$ real multiplications are needed besides. In total, the WeJi requires $\left( {4K + 4} \right)2N_{\text{t}}^{2} - \left( {4K - 4} \right)2N_{\text{t}}$ real multiplications.
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig7_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig7_HTML.png)
Comparison of the numbers of the actual multiplications between the WeJi and other algorithms.
© [2018] IEEE. Reprinted, with permission, from Ref. [29]
$$ \hat{s}_{i}^{(K)} = \frac{\omega }{{A_{i,i} }}y_{i}^{\text{MF}} - \frac{\omega }{{A_{i,i} }}\sum\limits_{j \ne i} {A_{i,j} \hat{s}_{j}^{\left( {K - 1} \right)} } + \left( {1 - \omega } \right)\hat{s}_{i}^{\left( {K - 1} \right)} $$
The computation of $\hat{s}_{i}^{(K)}$ requires only elements from the previous iteration, so all the elements of ${\hat{\varvec{s}}}^{(K)}$ can be computed in parallel. However, in the GAS, the successive over-relaxation (SOR) method [26] and the symmetric successive over-relaxation (SSOR) method [27], the transmitted-signal estimates are strongly coupled across the iterative steps: when calculating $\hat{s}_{i}^{(K)}$, we need the elements $\hat{s}_{j}^{(K)}$ ($j < i$) of the Kth iteration and the elements $\hat{s}_{j}^{(K - 1)}$ ($j > i$) of the (K-1)th iteration. This means that the computation of the elements cannot be executed in parallel. Therefore, neither the GAS [25] nor the SOR [26] can achieve a high data throughput rate, and their throughput rates are far lower than that of the WeJi.
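The element-level independence is easy to see in code. In the sketch below (same assumptions as in the earlier WeJi sketch), the loop body for index i reads only the previous iterate s_prev, never s_new, so all Nt elements could be updated concurrently in hardware:

```python
import numpy as np

def weji_elementwise(W, y_mf, s_prev, omega):
    """One WeJi iteration, element by element, following Eq. (2.113)."""
    s_new = np.empty_like(s_prev)
    for i in range(len(s_prev)):                      # independent for each i
        off = W[i] @ s_prev - W[i, i] * s_prev[i]     # sum of A_ij * s_j over j != i
        s_new[i] = (1 - omega) * s_prev[i] + omega * (y_mf[i] - off) / W[i, i]
    return s_new
```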
It should be noted that a detection method based on the Jacobi iteration is also proposed in the literature [28]. Compared with that method, the WeJi described in this section achieves better performance in three aspects. First, the WeJi is designed with the hardware architecture in mind: hardware implementation is fully considered in the process of algorithm optimization and improvement, and the detection accuracy, computational complexity, parallelism, and hardware reusability are all taken into account. By contrast, hardware implementation issues such as parallelism and hardware reuse are not considered in the literature [28]. Second, the initial iterative solution of the WeJi differs from that in the literature [28]. The initial value in [28] is fixed, whereas the method described in this section exploits the characteristics of the massive MIMO system and includes a computational method for the initial solution. According to Eq. (2.91), the initial iterative solution is close to the final result, so the number of iterations can be reduced and the hardware consumption minimized. Furthermore, the computation of the initial solution is similar to that of the later iterative steps, so hardware resources can be reused; because the Gram matrix G computation occupies a large number of clock cycles before the iteration, this reuse does not affect the system throughput rate. Third, compared with the literature [28], the WeJi introduces a weighting factor, as shown in Eq. (2.90), which improves the accuracy of the solution and consequently reduces hardware resource consumption. In addition, the same unit can be reused during the pre-iteration and the iteration to increase unit utilization, and this reuse does not affect the data throughput of the hardware.
Next, we discuss the frame error rate (FER) of the WeJi and other state-of-the-art signal detection algorithms. The FER performance of the exact matrix inversion (the CHD) algorithm is used as a reference. In the comparison, 64-QAM modulation is considered, and the channel is assumed to be an independent and identically distributed Rayleigh fading matrix. The soft-output log-likelihood ratios (LLRs) are used as the soft input of the Viterbi decoder at the receiver. As for 4G and 5G, we have discussed a kind of parallel concatenated convolutional code, the Turbo code [1]; the Turbo scheme currently used in 4G is also an important encoding scheme for 5G and is widely used. Furthermore, these simulation settings are commonly used for massive MIMO detection algorithms and architectures in 5G communications.
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig8a_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig8a_HTML.png)
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig8b_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig8b_HTML.png)
Performance diagram of various algorithms. a Nr = 64, Nt = 8, 1/2 bit rate, b Nr = 128, Nt = 8, 1/2 bit rate, c Nr = 128, Nt = 8, 3/4 bit rate.
© [2018] IEEE. Reprinted, with permission, from Ref. [29]
2.4.3 Analyses on Channel Model Effects
Each user's channel vector is distributed as $N\left( {0,d({\varvec{z}}){\varvec{I}}_{B} } \right)$, where the large-scale fading factor is $d({\varvec{z}}) = \frac{C}{{\left\| {{\varvec{z}} - {\varvec{b}}} \right\|^{\kappa } }}$, ${\varvec{z}} \in \varvec{R}^{2}$ and ${\varvec{b}} \in \varvec{R}^{2}$ are the user and base station locations, $\kappa$ is the path loss exponent, $\left\| \cdot \right\|$ is the Euclidean norm, and the shadow fading satisfies $10\lg C\sim N\left( {0,\sigma_{\text{sf}}^{2} } \right)$.
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig9_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig9_HTML.png)
FER performance of Kronecker channel model.
© [2018] IEEE. Reprinted, with permission, from Ref. [29]
![$$ {\varvec{H}} = {\varvec{R}}_{\text{r}}^{{^{{\frac{1}{2}}} }} {\varvec{H}}_{{{\text{i}} . {\text{i}} . {\text{d}} .}} \sqrt {d\left( {\varvec{z}} \right)} {\varvec{R}}_{\text{t}}^{{^{{\frac{1}{2}}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ114.png)
where ${\varvec{H}}_{{{\text{i}} . {\text{i}} . {\text{d}} .}}$ denotes the channel matrix with independent and identically distributed entries, and the entries of the correlation matrices are

$$ r_{ij} = \left\{ {\begin{array}{*{20}c} {\xi^{j - i} ,} & {i \le j} \\ {\left( {\xi^{i - j} } \right)^{*} ,} & {i > j} \\ \end{array} } \right. $$
In the simulation, the shadow fading variance is $\sigma_{\text{sf}}^{2} = 5$ and the transmitted power is $\rho = \frac{{\gamma^{\kappa } }}{2}$.
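For reference, one realization of the Kronecker model of Eqs. (2.114) and (2.115) can be generated as in the following sketch; the correlation factor xi and the large-scale fading d(z) are passed in as assumed scalar values:

```python
import numpy as np

def kronecker_channel(Nr, Nt, xi=0.5, d=1.0):
    """Draw one channel matrix H = Rr^{1/2} Hiid sqrt(d) Rt^{1/2}."""
    def corr_sqrt(n):
        idx = np.arange(n)
        R = xi ** np.abs(idx[:, None] - idx[None, :])   # exponential correlation
        w, V = np.linalg.eigh(R)                        # matrix square root via EVD
        return V @ np.diag(np.sqrt(np.maximum(w, 0.0))) @ V.T
    H_iid = (np.random.randn(Nr, Nt) + 1j * np.random.randn(Nr, Nt)) / np.sqrt(2)
    return np.sqrt(d) * (corr_sqrt(Nr) @ H_iid @ corr_sqrt(Nt))
```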
2.5 Conjugate Gradient Algorithm
2.5.1 Algorithm Design
![$$ {\varvec{x}}_{K + 1} = {\varvec{x}}_{K} + \alpha_{K} {\varvec{p}}_{K} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ116.png)
![$$ \alpha_{K} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq280.png)
![$$ \alpha_{K} = \frac{{\left( {{\varvec{r}}_{K} ,{\varvec{r}}_{K} } \right)}}{{\left( {{\varvec{Ap}}_{K} ,{\varvec{r}}_{K} } \right)}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ117.png)
![$$ {\varvec{r}}_{K + 1} = {\varvec{r}}_{K} + \alpha_{K} {\varvec{Ap}}_{K} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ118.png)
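Putting Eqs. (2.116)-(2.118) together, the conventional CG detector can be sketched as follows. This is the textbook form with residual r = b − Ax, so the signs differ slightly from Eq. (2.118), and the Fletcher-Reeves update of the search direction, which the text does not show explicitly, is assumed:

```python
import numpy as np

def cg_detect(H, y, N0_over_Es, K=4):
    """Conventional CG detection sketch with zero initial vector."""
    Nt = H.shape[1]
    A = H.conj().T @ H + N0_over_Es * np.eye(Nt)  # A = G + (N0/Es) I
    b = H.conj().T @ y                            # y_MF
    x = np.zeros(Nt, dtype=complex)               # zero initial value (not optimal)
    r = b - A @ x                                 # residual
    p = r.copy()                                  # search direction
    for _ in range(K):
        Ap = A @ p
        alpha = (r.conj() @ r) / (p.conj() @ Ap)  # step size, cf. Eq. (2.117)
        x = x + alpha * p                         # Eq. (2.116)
        r_new = r - alpha * Ap                    # residual update
        beta = (r_new.conj() @ r_new) / (r.conj() @ r)
        p = r_new + beta * p                      # next search direction
        r = r_new
    return x
```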
By using the CG algorithm, we can solve the linear matrix equation, and hence the method can be applied to massive MIMO signal detection. However, the conventional CG algorithm still has some deficiencies. First, although every element of the vector can be multiplied and accumulated in parallel with each row of the matrix during the matrix-vector multiplication, there is a strong data dependence between the steps, so the computation must be carried out step by step in the order of the algorithm, and the degree of parallelism is low. Second, the conventional CG algorithm does not specify the initial iteration value, and the zero vector is usually used. Obviously, the zero vector is merely feasible, not optimal; a better initial iterative value makes the algorithm converge faster and reduces the number of iterations, thus indirectly reducing the computational complexity of the algorithm. The CG algorithm has been improved on these two points to obtain better parallelism and convergence. The improved algorithm is named the three-term-recursion conjugate gradient (TCG) algorithm [32].
![$$ {\varvec{r}}_{K + 1} = \rho_{K} \left( {{\varvec{r}}_{K} - \gamma_{K} {\varvec{Ar}}_{K} } \right) + \mu_{K} {\varvec{r}}_{K - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ119.png)
![$$ \left( {{\varvec{Ar}}_{K} ,{\varvec{r}}_{K - 1} } \right) = \left( {{\varvec{r}}_{K} ,{\varvec{Ar}}_{K - 1} } \right) $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq290.png)
![$$ {\varvec{Ar}}_{K - 1} = - \frac{{{\varvec{r}}_{K} }}{{\rho_{K - 1}^{{}} \gamma_{K - 1}^{{}} }} + \frac{{{\varvec{r}}_{K - 1} }}{{\gamma_{K - 1}^{{}} }} + \frac{{\left( {1 - \rho_{K - 1} } \right){\varvec{r}}_{K - 1} }}{{\rho_{K - 1}^{{}} \gamma_{K - 1}^{{}} }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq291.png)
![$$ \rho_{K} = \frac{1}{{1 - \frac{{\gamma_{K} }}{{\gamma_{K - 1} }}\frac{{\left( {{\varvec{r}}_{K} ,{\varvec{r}}_{K} } \right)}}{{\left( {{\varvec{r}}_{K - 1} ,{\varvec{r}}_{K - 1} } \right)}}\frac{1}{{\rho_{K - 1} }}}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ123.png)
![$$ {\varvec{x}}_{K + 1} = \rho_{K} \left( {{\varvec{x}}_{K} + \gamma_{K} {\varvec{r}}_{K} } \right) + \left( {1 - \rho_{K} } \right){\varvec{x}}_{K - 1} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ124.png)
Applying the above derivation to the massive MIMO system model, we obtain the TCG algorithm for massive MIMO signal detection.
Algorithm 2.2
![../images/478198_1_En_2_Chapter/478198_1_En_2_Figb_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Figb_HTML.png)
We can see from Algorithm 2.2 that no data dependence exists between Step 8 and Step 9, or between Step 14 and Step 15, so these pairs of steps can be computed at the same time, increasing the parallelism of the algorithm. Besides, since the algorithm also contains matrix-vector multiplication, each element of the vector can still be multiplied and accumulated with each row of the matrix in parallel in the hardware design.
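A minimal sketch of the TCG recursion of Eqs. (2.119)-(2.124) follows. It assumes $\gamma_{K} = \left( {\varvec{r}}_{K}, {\varvec{r}}_{K} \right)/\left( {\varvec{Ar}}_{K}, {\varvec{r}}_{K} \right)$, $\rho_{0} = 1$ and $\mu_{K} = 1 - \rho_{K}$, and the initial vector s0 can come from the initial value algorithms of Sect. 2.5.3; note that the two three-term updates inside the loop are exactly the pair of steps that can execute in parallel:

```python
import numpy as np

def tcg_detect(H, y, N0_over_Es, s0, K=4):
    """Three-term-recursion CG (TCG) detection sketch."""
    A = H.conj().T @ H + N0_over_Es * np.eye(H.shape[1])
    b = H.conj().T @ y
    x_prev, x = s0.copy(), s0.copy()
    r_prev = r = b - A @ x
    gamma_prev = rr_prev = rho = None
    for k in range(K):
        Ar = A @ r                            # the only matrix-vector product per iteration
        rr = (r.conj() @ r).real
        gamma = rr / (r.conj() @ Ar).real     # gamma_K = (r_K, r_K) / (A r_K, r_K)
        if k == 0:
            rho = 1.0                         # rho_0 = 1
        else:                                 # Eq. (2.123)
            rho = 1.0 / (1.0 - (gamma / gamma_prev) * (rr / rr_prev) / rho)
        # no data dependence between the two updates: they can run in parallel
        x_new = rho * (x + gamma * r) + (1.0 - rho) * x_prev   # Eq. (2.124)
        r_new = rho * (r - gamma * Ar) + (1.0 - rho) * r_prev  # Eq. (2.119)
        x_prev, x, r_prev, r = x, x_new, r, r_new
        gamma_prev, rr_prev = gamma, rr
    return x
```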
2.5.2 Convergence
$$ C_{K} (t) = \cos \left[ {K\arccos \left( t \right)} \right],\; - 1 \le t \le 1 $$
![$$ C_{K + 1} (t) = 2tC_{K} (t) - C_{K - 1} (t),C_{0} (t) = 1,C_{1} (t) = t $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ126.png)
![$$ \left| t \right| > 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq296.png)
![$$ C_{K} (t) = { \cosh }\left[ {K{\text{arcosh}}\left( t \right)} \right], \, \left| t \right| \ge 1 $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ127.png)
![$$ C_{K} (t) = \frac{1}{2}\left[ {\left( {t + \sqrt {t^{2} - 1} } \right)^{K} + \left( {t + \sqrt {t^{2} - 1} } \right)^{ - K} } \right] \ge \frac{1}{2}\left( {t + \sqrt {t^{2} - 1} } \right)^{K} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ128.png)
![$$ \eta $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq298.png)
![$$ \begin{aligned} 1 + 2\eta + 2\sqrt {\eta \left( {\eta + 1} \right)} & = \left( {\sqrt \eta + \sqrt {\eta + 1} } \right)^{2} = \frac{{\left( {\sqrt {\lambda_{\hbox{min} } } + \sqrt {\lambda_{\hbox{max} } } } \right)^{2} }}{{\lambda_{\hbox{max} } - \lambda_{\hbox{min} } }} \\ & = \frac{{\sqrt {\lambda_{\hbox{max} } } + \sqrt {\lambda_{\hbox{min} } } }}{{\sqrt {\lambda_{\hbox{max} } } - \sqrt {\lambda_{\hbox{min} } } }} = \frac{\sqrt \kappa + 1}{\sqrt \kappa - 1} \\ \end{aligned} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ130.png)
![$$ \kappa = \frac{{\lambda_{\textrm{max} } }}{{\lambda_{\text{min} } }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq299.png)
![$$ {\varvec{x}}_{*} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_IEq300.png)
![$$ \left\| {{\varvec{x}}_{*} - {\varvec{x}}_{K} } \right\|_{A} \le 2\left( {\frac{\sqrt \kappa - 1}{\sqrt \kappa + 1}} \right)^{K} \left\| {{\varvec{x}}_{*} - {\varvec{x}}_{0} } \right\|_{A} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ131.png)
In (2.131), ${\varvec{x}}_{*}$ denotes the exact solution vector, which is fixed. So once the initial iteration value is determined, $\left\| {{\varvec{x}}_{*} - {\varvec{x}}_{0} } \right\|_{A}$ on the right side of (2.131) can be regarded as a constant. As the number of iterations K increases, $\left( {\frac{\sqrt \kappa - 1}{\sqrt \kappa + 1}} \right)^{K}$ approaches 0 exponentially, so the error of the CG algorithm decreases exponentially with K. For example, when the condition number is $\kappa = 4$, the factor equals 1/3, and the error bound shrinks threefold per iteration.
2.5.3 Initial Iteration Value and Search
Iteration in the TCG requires a corresponding initial value vector. As mentioned earlier, finding a good initial value vector is crucial to the convergence rate of the algorithm. When introducing the PCI in Sect. 2.3, we mentioned an eigenvalue-based initial value algorithm, which can also be used in the TCG to make the algorithm converge faster. Besides this, a quadrant-based initial value algorithm is discussed in this section.
![$$ \left[ {\begin{array}{*{20}c} {\text{Re} \left\{ {\varvec{y}} \right\}} \\ {\text{Im} \left\{ {\varvec{y}} \right\}} \\ \end{array} } \right]_{{2N_{\text{r}} \times 1}} = \left[ {\begin{array}{*{20}c} {\text{Re} \left\{ {\varvec{H}} \right\}} & { - \text{Im} \left\{ {\varvec{H}} \right\}} \\ {\text{Im} \left\{ {\varvec{H}} \right\}} & {\text{Re} \left\{ {\varvec{H}} \right\}} \\ \end{array} } \right]_{{2N_{\text{r}} \times 2N_{\text{t}} }} \left[ {\begin{array}{*{20}c} {\text{Re} \left\{ {\varvec{s}} \right\}} \\ {\text{Im} \left\{ {\varvec{s}} \right\}} \\ \end{array} } \right]_{{2N_{\text{t}} \times 1}} + \left[ {\begin{array}{*{20}c} {\text{Re} \left\{ {\varvec{n}} \right\}} \\ {\text{Im} \left\{ {\varvec{n}} \right\}} \\ \end{array} } \right]_{{2N_{\text{r}} \times 1}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ132.png)
![$$ {\varvec{y}}_{\text{R}} = {\varvec{H}}_{\text{R}} {\varvec{s}}_{\text{R}} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ133.png)
$$ {\hat{\varvec{s}}}_{\text{R}} = \left( {{\varvec{H}}_{\text{R}}^{\text{H}} {\varvec{H}}_{\text{R}} } \right)^{ - 1} {\varvec{H}}_{\text{R}}^{\text{H}} {\varvec{y}}_{\text{R}} = \left( {{\varvec{H}}_{\text{R}}^{\text{H}} {\varvec{H}}_{\text{R}} } \right)^{ - 1} {\varvec{y}}_{\text{R}}^{\text{MF}} $$
Since ${\varvec{H}}_{\text{R}}^{\text{H}} {\varvec{H}}_{\text{R}}$ is diagonally dominant in the massive MIMO system, $\left( {{\varvec{H}}_{\text{R}}^{\text{H}} {\varvec{H}}_{\text{R}} } \right)^{ - 1}$ is approximately diagonal, so each element $\hat{s}_{{{\text{R}},i}}$ of ${\hat{\varvec{s}}}_{\text{R}}$ is approximately determined by the corresponding element $y_{{\text{R}},i}^{\text{MF}}$ of ${\varvec{y}}_{\text{R}}^{\text{MF}}$:
![$$ \hat{s}_{R,i} \approx \frac{{y_{R,i}^{\text{MF}} }}{{\sum\limits_{j = 1}^{{N_{\text{r}} }} {\left( {H_{j,i} } \right)^{2} } }} $$](../images/478198_1_En_2_Chapter/478198_1_En_2_Chapter_TeX_Equ135.png)
Since the factor $\frac{1}{{\sum\nolimits_{j = 1}^{{N_{\text{r}} }} {\left( {H_{j,i} } \right)^{2} } }}$ is positive, $\hat{s}_{{{\text{R}},i}}$ has the same sign as $y_{{\text{R}},i}^{\text{MF}}$. Back in the complex-valued model, this means that the quadrant of $\hat{s}_{i}$ is determined by $y_{i}^{\text{MF}}$: the real and imaginary parts of ${\hat{\varvec{s}}}$ take the signs of those of ${\varvec{y}}^{\text{MF}}$. Hence, once $y_{i}^{\text{MF}}$ indicates the quadrant, the initial value can be set to the centre of that quadrant; for example, $\tilde{s}_{i}^{(0)} = 4 + 4{\text{i}}$ when $\hat{s}_{i}$ is expected to fall in the first quadrant.
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig10_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig10_HTML.png)
Schematic diagram of quadrant-based initial value algorithm
Algorithm 2.3
![../images/478198_1_En_2_Chapter/478198_1_En_2_Figc_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Figc_HTML.png)
Taking the quadrant-based initial value as the initial vector of the TCG makes the TCG converge faster. For a given error requirement of the massive MIMO signal detection, the number of iterations can therefore be reduced, which lowers the computational complexity of the algorithm from another angle and makes the TCG even better.
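A sketch of the quadrant-based initial value is given below; the quadrant-centre magnitude 4 follows from $\tilde{s}_{i}^{(0)} = 4 + 4{\text{i}}$ in the text and is specific to 64-QAM:

```python
import numpy as np

def quadrant_initial(H, y):
    """Quadrant-based initial value (cf. Algorithm 2.3 and Eq. (2.135))."""
    y_mf = H.conj().T @ y   # the sign pattern of y_MF predicts the quadrant
    return 4.0 * np.sign(y_mf.real) + 4.0j * np.sign(y_mf.imag)
```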
After the iterations, the solution ${\tilde{\varvec{s}}}$ obtained by the TCG generally does not coincide with a constellation point, so each element of ${\tilde{\varvec{s}}}$ must be mapped to a nearby constellation point. This is done by the rounding-off-based point seeking method illustrated below.
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig11_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig11_HTML.png)
Schematic diagram of rounding-off-based point seeking method
Algorithm 2.4
![../images/478198_1_En_2_Chapter/478198_1_En_2_Figd_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Figd_HTML.png)
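Since Algorithm 2.4 is reproduced only as a figure, the sketch below shows one plausible reading of the rounding-off-based point seeking for the 64-QAM lattice {-7, -5, ..., 5, 7} assumed in this section: round each coordinate to the nearest odd level, then clip to the constellation boundary.

```python
import numpy as np

def round_to_64qam(s_tilde):
    """Map each coordinate of the iterative solution to a 64-QAM point."""
    def slice_axis(v):
        level = 2.0 * np.round((v - 1.0) / 2.0) + 1.0   # nearest odd integer
        return np.clip(level, -7.0, 7.0)                # stay inside the constellation
    return slice_axis(np.real(s_tilde)) + 1j * slice_axis(np.imag(s_tilde))
```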
2.5.4 Complexity and Parallelism
Now consider the computational complexity of the TCG, measured by the number of real multiplications. Computing the matched-filter output ${\varvec{y}}^{\text{MF}}$, the product of the $N_{\text{t}} \times N_{\text{r}}$ matrix ${\varvec{H}}^{\text{H}}$ and the $N_{\text{r}} \times 1$ received vector, requires $4N_{\text{t}} N_{\text{r}}$ real multiplications, and forming the $N_{\text{t}} \times N_{\text{t}}$ matrix ${\varvec{A}}$ from the $N_{\text{r}} \times N_{\text{t}}$ channel matrix requires $2N_{\text{r}} N_{\text{t}}^{2}$. The initial value ${\tilde{\varvec{s}}}_{0}$ costs another $4N_{\text{t}}^{2}$. In each of the K iterations, the matrix-vector product with the $N_{\text{t}} \times N_{\text{t}}$ matrix ${\varvec{A}}$ requires $4N_{\text{t}}^{2}$ real multiplications ($4KN_{\text{t}}^{2}$ in total), the inner products $\left( {{\varvec{\eta}}_{K} ,{\varvec{z}}_{K} } \right)$ and $\left( {{\varvec{z}}_{K} ,{\varvec{z}}_{K} } \right)$ require $4KN_{\text{t}}$ each, and the two three-term vector updates, including that of ${\tilde{\varvec{s}}}$, require $12KN_{\text{t}}$ each. The total is $2N_{\text{r}} N_{\text{t}}^{2} + 4N_{\text{t}} N_{\text{r}} + 4N_{\text{t}}^{ 2} + K\left( {4N_{\text{t}}^{ 2} + 32N_{\text{t}} } \right) - 16N_{\text{t}}$ real multiplications. For example, for a massive MIMO system with $N_{\text{r}} = 128$ and $N_{\text{t}} = 8$ and K = 3 iterations, this amounts to 22,144 real multiplications.
Complexity comparison

Algorithm | K = 2 | K = 3 | K = 4 | K = 5
---|---|---|---|---
NSA [7] | 2NrNt^2 + 6NtNr + 4Nt^2 + 2Nt | 2NrNt^2 + 6NtNr + 2Nt^3 + 4Nt^2 | 2NrNt^2 + 6NtNr + 6Nt^3 | 2NrNt^2 + 6NtNr + 10Nt^3 − 6Nt^2
INS [17] | 2NrNt^2 + 6NtNr + 10Nt^2 + 2Nt | 2NrNt^2 + 6NtNr + 14Nt^2 + 2Nt | 2NrNt^2 + 6NtNr + 18Nt^2 + 2Nt | 2NrNt^2 + 6NtNr + 22Nt^2 + 2Nt
GAS [12] | 2NrNt^2 + 6NtNr + 10Nt^2 − 2Nt | 2NrNt^2 + 6NtNr + 14Nt^2 − 6Nt | 2NrNt^2 + 6NtNr + 18Nt^2 − 10Nt | 2NrNt^2 + 6NtNr + 22Nt^2 − 14Nt
CG [16] | 2NrNt^2 + 6NtNr + 8Nt^2 + 33Nt | 2NrNt^2 + 6NtNr + 12Nt^2 + 49Nt | 2NrNt^2 + 6NtNr + 16Nt^2 + 65Nt | 2NrNt^2 + 6NtNr + 20Nt^2 + 81Nt
CGLS [18] | 24NtNr + 20Nr^2 + 8Nr + 44Nt | 32NtNr + 28Nr^2 + 12Nr + 66Nt | 40NtNr + 36Nr^2 + 16Nr + 88Nt | 48NtNr + 44Nr^2 + 20Nr + 110Nt
OCD [19] | 16NtNr + 4Nt | 24NtNr + 6Nt | 32NtNr + 8Nt | 40NtNr + 10Nt
TCG | 2NrNt^2 + 4NtNr + 12Nt^2 + 48Nt | 2NrNt^2 + 4NtNr + 16Nt^2 + 80Nt | 2NrNt^2 + 4NtNr + 20Nt^2 + 112Nt | 2NrNt^2 + 4NtNr + 24Nt^2 + 144Nt
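As a quick arithmetic check, the closed-form TCG count above reproduces the TCG row of the table for Nr = 128 and Nt = 8:

```python
Nr, Nt = 128, 8                 # system size used in the text's example
def tcg_real_muls(K):
    return 2*Nr*Nt**2 + 4*Nt*Nr + 4*Nt**2 + K*(4*Nt**2 + 32*Nt) - 16*Nt

print([tcg_real_muls(K) for K in (2, 3, 4, 5)])
# -> [21632, 22144, 22656, 23168], matching the TCG row evaluated at Nr = 128, Nt = 8
```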
The parallelism of the TCG algorithm is also very important. In Step 2, Step 3, Step 6, and Step 7 of the algorithm, each row of the matrix is multiplied by the vector during the matrix-vector multiplication, and these multiply-accumulate operations can be carried out simultaneously. Moreover, there is no data dependence between the eighth and ninth steps or between the fourteenth and fifteenth steps, so these pairs can also be computed simultaneously. The algorithm is thus parallel at two levels, and this parallelism between steps gives it a clear advantage over other algorithms.
As analyzed above, the TCG has significant advantages over other algorithms in complexity and parallelism. In addition, the TCG optimizes the initial iteration value and the final point-seeking method so that the algorithm brings the maximum benefit. In general, complexity and accuracy are in tension, and lower complexity often means lower accuracy. Therefore, although the results above show the excellent performance of the TCG in complexity and parallelism, they alone do not make it a comprehensive algorithm; its accuracy must also be considered.
2.5.5 Symbol Error Rate
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig12_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig12_HTML.png)
Effect of the initial value algorithm on the SER of the CG algorithm
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig13_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig13_HTML.png)
SER curves of various algorithms for different numbers of iterations in a 128 × 8 massive MIMO system
![../images/478198_1_En_2_Chapter/478198_1_En_2_Fig14_HTML.png](../images/478198_1_En_2_Chapter/478198_1_En_2_Fig14_HTML.png)
Influence of different algorithms on SER in MIMO systems with different scales
Therefore, the TCG has obvious advantages over the NSA, CG and CGLS algorithms in complexity and parallelism, although its SER performance is still slightly worse than that of the three methods under the same SNR. Compared with the PCI algorithm, the TCG has lower complexity and better parallelism; besides the parallelism of the matrix-vector multiplication that the PCI also has, the TCG offers parallelism between steps. Compared with the OCD algorithm, the complexity of the TCG is higher, and the OCD also has better SER performance. However, the OCD has an inherent shortcoming: its data dependence is so strong that the result of each calculation must be stored in a register and read back for the next one, so the parallelism is poor and the operations can only be executed sequentially, step by step.
Compared with the NSA, CG, CGLS, PCI, OCD, and other algorithms, the TCG algorithm is not the best in every performance parameter, but overall it achieves a better compromise. In actual massive MIMO signal detection, an appropriate algorithm can be selected according to the application requirements.