From a signal processing point of view, the novel aspect of hybrid precoding/combining in comparison with the conventional fully digital precoding/combining lies in the introduction of the precoding/combining stage in the RF domain as a result of driving a large-scale antenna array by a limited number of RF chains. By treating the cascade of the RF stage and multiple-input multiple-output (MIMO) channel as the effective channel, the system model of massive MIMO is analogous to the conventional counterpart, where various solution techniques have been proposed for the single-user and multi-user scenarios. On the other hand, problem formulation and optimization of the RF component depend on the choice of the baseband component in a joint RF-baseband design. Therefore, before we embark on the study of hybrid precoding/combining design, we give a brief review of the related background and recent developments in this chapter, which serve as the basis for our proposed research in the subsequent chapters.
2.1 Linear Precoding and Combining for Point-to-Point MIMO
2.1.1 Baseband Digital Solutions for Conventional MIMO
2.1.2 Progressive Baseband Digital Solutions for Conventional MIMO ARQ
2.1.3 Hybrid RF-Baseband Solutions for Massive MIMO
2.2 Multi-User MIMO Precoding
2.2.1 Precoding for Conventional MIMO: From Linear to Nonlinear
Since the entries of the perturbation vector are Gaussian integers, the search can be viewed as a closest-point search in a lattice. If s is drawn from a high-order square constellation such as 16-, 64-, or 256-QAM as defined in Long-Term Evolution (LTE), and the optimal perturbation vector is found, then by approximating the equiprobable discrete constellation points as continuous and uniformly distributed in a hyper-rectangle, the resulting errors from the search can accordingly be treated as uniformly distributed [29]. If the perturbation vector is obtained by power minimization, the error is simply the transmit power. In contrast to [30], where the transmit power is obtained by numerically solving a set of fixed-point equations, such a lattice-theoretic approximation yields a mathematically tractable lower bound. This insight was leveraged for channel vector quantization design in [31], and for greedy user selection to alleviate the concern of power enhancement in cooperative ZF beamforming [32]. While the per-BS power constraint is more practical for multi-cell downlink beamforming, the MMSE-VP precoder unfortunately has to be numerically optimized in that case [33].
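To make the closest-point interpretation concrete, the following is a minimal sketch of ZF-VP with power minimization, assuming single-antenna users and an exhaustive search over a small set of Gaussian-integer candidates. The function names, the `radius` parameter, and the brute-force search are illustrative assumptions; practical implementations would typically use a sphere encoder or lattice reduction instead.

```python
import itertools
import numpy as np

def zf_vp_precode(H, s, tau, radius=1):
    """Brute-force ZF vector perturbation (illustrative, small K only).

    H: K x Nt channel, s: length-K symbol vector, tau: modulo base.
    Each perturbation entry is searched over Gaussian integers a + jb
    with |a|, |b| <= radius (an assumed, truncated search set).
    """
    K = H.shape[0]
    # ZF precoder: right pseudo-inverse of the channel.
    P = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    offsets = range(-radius, radius + 1)
    candidates = [a + 1j * b for a in offsets for b in offsets]
    best_l, best_power = None, np.inf
    for combo in itertools.product(candidates, repeat=K):
        l = np.array(combo)
        # Unnormalized transmit power for this perturbation.
        power = np.linalg.norm(P @ (s + tau * l)) ** 2
        if power < best_power:
            best_l, best_power = l, power
    x = P @ (s + tau * best_l)
    return x, best_l, best_power

def modulo_tau(y, tau):
    """Fold real and imaginary parts into [-tau/2, tau/2) at the receiver."""
    def fold(v):
        return v - tau * np.floor(v / tau + 0.5)
    return fold(y.real) + 1j * fold(y.imag)
```

In this sketch, the minimized value `best_power` plays the role of the search error discussed above, and the receiver removes the perturbation with `modulo_tau` without knowing it. For 16-QAM with points at odd integers up to ±3, a common choice is `tau = 8`.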
When users are equipped with multiple antennas, an enhanced system performance is expected from joint design of nonlinear precoding and linear combining. So far, the focus has been exclusively on non-iterative methods for the benefit of low complexity. The basic idea is to first use block diagonalization (BD) to eliminate inter-user interference and hence create parallel SU-MIMO channels, and then perform VP across the spatially multiplexed data streams in conjunction with SU-MIMO precoding and combining for each user. In [34], over each SU-MIMO channel, ZF-VP is used for spatial multiplexing while treating each receive antenna as a virtual user. In doing so, users only need to know the power scaling factor β for detection. Such a design, motivated by the need for signaling overhead reduction, was shown to approach the performance of waterfilling-based solutions [1]. The work in [35] instead combined BD with MMSE-VP, and demonstrated the benefit of geometric mean decomposition-based joint design [5] in terms of improved BER. By exploiting uniform channel decomposition [6], the linear MMSE receiver could also be incorporated [36, 37], which further decreases the BER. Assuming matched filtering at the users, a non-iterative approach for cooperative ZF-VP beamforming was proposed in [38] and compared with the linear counterpart.
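The BD step that creates the parallel SU-MIMO channels can be sketched as follows. This is a minimal illustration under the assumption that the BS has more antennas than the total number of receive antennas of the other users; the function name and the rank tolerance are hypothetical and not taken from the cited works.

```python
import numpy as np

def bd_null_space_precoders(channels, tol=1e-10):
    """Return one precoder per user that nulls inter-user interference.

    channels: list of Nr_k x Nt matrices, one per user.
    The columns of precoder k span the null space of the stacked
    channels of all other users, so H_j @ F_k = 0 for j != k.
    """
    precoders = []
    for k in range(len(channels)):
        others = np.vstack([H for j, H in enumerate(channels) if j != k])
        # Full SVD: the rows of Vh beyond the rank span the null space.
        _, sing, Vh = np.linalg.svd(others)
        rank = int(np.sum(sing > tol * sing[0]))
        precoders.append(Vh[rank:].conj().T)
    return precoders
```

The product `channels[k] @ precoders[k]` is then the interference-free SU-MIMO channel of user k, over which VP and the per-user precoding and combining can be applied.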
2.2.2 Hybrid RF-Baseband Solutions for Massive MIMO
RF phase-shifting design with perfect CSI was addressed in [40, 41]. In [40], the phase-shifters in the RF domain are heuristically derived based on extracting the phases of the maximum ratio transmission (MRT) beamformer. In combination with the baseband ZF precoding, the hybrid solution was shown to perform close to the fully digital ZF counterpart in terms of sum rate. This idea was further pursued in a multi-antenna user setting in [41], where hybrid combining was also considered with the constant-modulus RF combiner reduced to DFT beam selection.
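The phase-extraction idea can be illustrated with a short sketch, assuming single-antenna users, one RF chain per user, and a total power constraint. The function name, the 1/sqrt(Nt) normalization, and the power scaling are assumptions made here for illustration and may differ from the exact formulation in [40].

```python
import numpy as np

def hybrid_zf_phase_extraction(H, total_power=1.0):
    """Two-stage precoder: MRT-phase RF stage followed by baseband ZF.

    H is the K x Nt downlink channel (one row per single-antenna user);
    one RF chain per user is assumed, so F_rf is Nt x K.
    """
    K, Nt = H.shape
    # RF stage: keep only the phases of the MRT beamformer H^H (constant modulus).
    F_rf = np.exp(1j * np.angle(H.conj().T)) / np.sqrt(Nt)
    # Effective K x K channel seen by the baseband.
    H_eff = H @ F_rf
    # Baseband ZF on the effective channel (pseudo-inverse).
    F_bb = np.linalg.pinv(H_eff)
    # Scale so that the overall precoder meets the total power constraint.
    F_bb *= np.sqrt(total_power) / np.linalg.norm(F_rf @ F_bb, "fro")
    return F_rf, F_bb
```

Here the baseband only needs the K x K effective channel `H @ F_rf`, which is the sense in which the cascade of the RF stage and the MIMO channel behaves like a conventional, low-dimensional MIMO channel.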
It is worth mentioning that one basic assumption in the aforementioned works [15–19, 40, 41] is that perfect knowledge of the high-dimensional MIMO channel is available at the BS. This assumption, however, could be problematic in closed-loop frequency-division duplexing (FDD) systems, in that an enormous number of channel estimates needs to be fed back frequently [42]. To reduce the channel estimation overhead, one interesting idea is to adjust RF processing solely based on the statistical CSI while updating baseband processing according to the instantaneous effective CSI [43–47]. Because statistical CSI varies slowly, it does not need to be updated frequently, which reduces the feedback overhead. In the presence of the RF stage, the dimension of the effective channel seen by the baseband is significantly smaller than that of the original MIMO channel, and thus timely update of instantaneous CSI becomes feasible.
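The two-timescale structure can be summarized by the following generic downlink model. The notation is an assumption introduced here for illustration: H is the K x N_t channel, R its transmit covariance, N_RF the number of RF chains, and f and g stand for unspecified design rules rather than any particular method from [43–47].

```latex
\begin{align}
\mathbf{y} &= \mathbf{H}\,\mathbf{F}_{\mathrm{RF}}\,\mathbf{F}_{\mathrm{BB}}\,\mathbf{s} + \mathbf{n}
            = \mathbf{H}_{\mathrm{eff}}\,\mathbf{F}_{\mathrm{BB}}\,\mathbf{s} + \mathbf{n}, \\
\mathbf{H}_{\mathrm{eff}} &= \mathbf{H}\,\mathbf{F}_{\mathrm{RF}} \in \mathbb{C}^{K \times N_{\mathrm{RF}}},
            \qquad N_{\mathrm{RF}} \ll N_t, \\
\mathbf{F}_{\mathrm{RF}} &= f(\mathbf{R}) \ \text{(slow timescale)}, \qquad
\mathbf{F}_{\mathrm{BB}} = g(\mathbf{H}_{\mathrm{eff}}) \ \text{(fast timescale)}.
\end{align}
```

Only the K x N_RF matrix H_eff has to be estimated at the fast timescale, instead of the full K x N_t channel.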
It is worth mentioning that the effectiveness of the hybrid precoding solutions with two-timescale CSI as in [43, 50, 52] relies on the assumption that users are naturally partitioned into groups, and that users in the same group experience identical channel spatial correlation. This is, however, too restrictive in practice. Thus, the works in [53–55] proposed various user grouping algorithms based on the distance between the subspaces of the transmit spatial correlation, and RF beamforming was adapted to the centroid of the correlation matrices for each user group. Since the mean correlation is only a rough approximation, the resulting RF beams are unlikely to create near-perfect spatial separation in this case, which casts doubt on the feasibility of group-wise spatial multiplexing at baseband.
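A subspace-distance-based grouping of this kind can be sketched as a K-means-style iteration. The sketch below is an assumption-laden illustration rather than any specific algorithm from [53–55]: it uses the Frobenius distance between projection matrices (a chordal-type distance, up to scaling), takes the group centroid as the dominant subspace of the averaged correlation matrices, and requires the caller to supply the initial group representatives through the hypothetical argument `init_users`.

```python
import numpy as np

def dominant_subspace(R, r):
    """Orthonormal basis of the r dominant eigenvectors of a Hermitian matrix."""
    _, vecs = np.linalg.eigh(R)  # eigenvalues are returned in ascending order
    return vecs[:, -r:]

def chordal_distance(U, V):
    """Frobenius distance between the projectors onto span(U) and span(V)."""
    return np.linalg.norm(U @ U.conj().T - V @ V.conj().T, "fro")

def group_users(covariances, init_users, r, n_iter=20):
    """Assign each user to the group whose centroid subspace is closest."""
    subspaces = [dominant_subspace(R, r) for R in covariances]
    centers = [subspaces[k] for k in init_users]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            int(np.argmin([chordal_distance(U, C) for C in centers]))
            for U in subspaces
        ]
        if new_labels == labels:  # assignments have converged
            break
        labels = new_labels
        for g in range(len(centers)):
            members = [covariances[k] for k, lab in enumerate(labels) if lab == g]
            if members:
                # Centroid: dominant subspace of the mean correlation matrix.
                centers[g] = dominant_subspace(np.mean(members, axis=0), r)
    return labels, centers
```

The returned `centers` are the subspaces to which the RF beams of each group would be adapted. The residual distance between a user's subspace and its group centroid gives a rough indication of how far the group is from the identical-correlation assumption.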
When it comes to multi-cell systems, additional design issues, such as limited inter-BS cooperation in terms of signal and local CSI exchange, the per-BS power constraint, and inter-cell interference, need to be properly addressed. For example, in [45], only statistical CSI was assumed to be globally available, and the cluster-wise RF precoding was constrained to the null space of the superimposed transmit correlation matrices for interference reduction. In conjunction with local CSI-based ZF solutions, the RF solutions were derived to maximize a general utility function of spectral efficiency. In [52], RF precoding was designed with the objective of minimizing interference leakage power with linear pricing. Under the assumption that the statistical CSI used for the RF precoder update is outdated, a subspace tracking and compensation algorithm on Grassmann manifolds was proposed. The concept of deterministic equivalents was exploited in [47] to approximate the SINR chance constraint by deterministic functions, from which the RF precoding solutions were obtained. In [56], MMSE-VP was employed for instantaneous CSI-based two-stage precoder design in cooperative multi-cell systems.
2.3 Summary
For point-to-point massive MIMO systems, previous work has shown that even with a reduced number of RF chains, instantaneous CSI-based two-stage precoding/combining solutions are capable of delivering a performance comparable to their fully digital counterparts. This is especially the case when the MIMO channel is correlated, as commonly found in a directional propagation environment such as mmWave channels. However, because of the tremendous overhead of estimating the high-dimensional MIMO channel, it is of interest whether this observation still holds when perfect CSI in the RF domain is relaxed to statistical CSI, e.g., channel covariance. Besides, it is noted that joint precoder-combiner optimization in conventional MIMO gives a substantial performance enhancement. Nonetheless, such an approach has not yet been applied to derive hybrid precoding and combining solutions. Finally, the design and evaluation of statistical CSI-based RF phase-shifting remain an open issue.
When packet retransmission is incorporated through hybrid ARQ mechanisms in conventional MIMO systems, previous research efforts have demonstrated that the system performance can be improved by exploiting the temporal diversity in the linear precoder optimization. Unfortunately, such solutions cannot be directly applied to massive MIMO. To begin with, it has been assumed that the received signals from the past rounds of retransmission are fully accessible to the baseband. However, such an assumption raises concerns about the storage requirement and processing complexity when the received signals are of high dimension. A potential remedy is to introduce hybrid RF-baseband combining, which reduces the dimension of the received signals to be combined through RF preprocessing. In doing so, the baseband has access only to the low-dimensional received signals at the output of the RF combiner. On the other hand, when the hybrid precoding structure with RF phase-shifting is employed, it is necessary for the optimization of baseband precoding to take into account the design of RF phase-shifting and hybrid combining at the receiver. Hence, a novel design of hybrid precoding and combining is required.
In a multi-user massive MIMO environment, the principle of hybrid RF-baseband precoding is to first create spatial separation of users in the RF beam domain and then perform spatial multiplexing at baseband. In particular, by assuming that users are geographically clustered in hotspots, it suffices to adjust the RF precoder based on the statistical CSI in consideration of the baseband precoder adaptive to the instantaneous effective CSI. As a result, only two-timescale CSI is required, which significantly reduces the channel estimation overhead. It is worth mentioning that the existing work has restricted its attention to linear precoding schemes such as ZF and RZF at baseband. Unfortunately, the linear schemes suffer a severe power loss when a maximum number of equal-rate users is spatially multiplexed. In contrast, by introducing a perturbation vector as additional DoFs for performance optimization, VP effectively addresses this issue. A direct consequence of hybrid precoding with a reduced number of RF chains is that the number of data streams that can be physically supported is limited. Thus, it is desirable for hybrid precoding to perform well in fully loaded systems for utility maximization. In view of the drawback of linear schemes in this case, it is natural to explore how nonlinear VP techniques can be combined with the two-timescale CSI-based hybrid precoding design.
When extended to a multi-cell massive MIMO environment, the hybrid precoder needs to take into account the presence of inter-cell interference as well. The effectiveness of inter-cell interference mitigation depends on the degree of inter-cell cooperation in terms of the information exchange allowed. For network MIMO processing, the technique of hybrid precoding design for a single cell does not directly carry over. In particular, because of the per-BS power constraint, closed-form expressions for the linear baseband front-end are no longer available. Given the difficulty of optimizing the statistical CSI-based RF precoder with respect to the traditional performance metrics, e.g., mutual information and MSE, the existing work has largely relied on heuristics. We remark that although the use of two-timescale CSI provides an effective way of addressing the issue of channel estimation overhead, it is somewhat restrictive from the perspectives of optimization and applicability. For example, in the absence of analytical baseband solutions, iterative procedures are generally required for joint RF-baseband optimization. It is not clear how such alternating optimization can be carried out when the design variables adapt on different timescales. Besides, the assumption that users are geographically clustered and that the user clusters are well separated can turn out to be too idealistic. Hence, novel design approaches are desired that overcome these disadvantages while incurring a channel estimation overhead comparable to that of two-timescale CSI.