Abstract
This paper focuses on mobile multiple-input multiple-output (MIMO) underwater acoustic communications (UAC) over double-selective channels subject to both intersymbol interference and Doppler scaling effects. Temporal resampling is implemented to effectively convert the Doppler scaling effects to Doppler frequency shifts. Under the assumption that the channels between all the transmitter and receiver pairs experience the same Doppler frequency, a variation of the recently proposed generalization of the sparse learning via iterative minimization (GoSLIM) algorithm, referred to as GoSLIM-V, is employed to estimate the frequency modulated acoustic channels. GoSLIM-V is user parameter free and is easy to use in practical applications. This paper also considers turbo equalization for retrieving the transmitted signal. In particular, this paper reviews the linear minimum mean-squared error (LMMSE) based soft-input soft-output equalizer involved in the turbo equalization scheme and adopts a fast implementation of the equalizer that achieves negligible detection performance degradation compared to its direct implementation counterpart. The effectiveness of the considered MIMO UAC scheme is demonstrated using both simulated data and measurements recently acquired during the MACE10 in-water experiment.
1. Introduction
Achieving reliable underwater acoustic communications (UAC) with high data rate is difficult owing to the unique challenges imposed by the underwater acoustic environment [1, 2]. In typical UAC, the difference in the propagation time between the earliest and latest arrivals could span tens to hundreds of symbol periods [3], which translates into long channel impulse response (CIR) and severe intersymbol interference (ISI) at the receiver side. Moreover, the presence of Doppler effects, owing to the relative motions between the transmitter and receiver platforms and the dynamic underwater acoustic medium, induces temporal scaling (stretching or compression) to the transmitted signals [4]. Doppler-induced scaling effects impair the reliability of UAC, especially in the case of a phase-coherent detection scheme [3]. Furthermore, the scarcely available bandwidth permitted by the acoustic channel imposes an upper bound on the attainable symbol rate [2]. Therefore, the pursuit of high data rate in UAC leverages the multiple-input multiple-output (MIMO) scheme, which offers enhanced reliability and/or increased data rates compared to its single-input counterpart [5–7]. The focus of the present paper is on effective mobile MIMO UAC over double-selective acoustic channels suffering from both ISI and Doppler scaling effects.
Converting the double-selective channel into an ISI channel via temporal resampling is an effective way to tackle mobile UAC difficulties [4]. Although the Doppler scaling effects can be largely mitigated via such a temporal resampling process, the residual Doppler still causes a frequency shift in the received measurements. Coherent UAC requires the receiver to acquire knowledge of the underlying channel after temporal resampling via channel estimation [7]. Channel estimation can be conducted either in the training-directed mode, using known training sequences, or in the decision-directed mode, using the detected payload symbols [5, 6]. A preferable tool to characterize a channel subject to both ISI and Doppler frequency shift is the scattering function (SF), which essentially decouples the acoustic channel into a bank of paths that experience different delays and Doppler frequencies [8]. The major concern in SF-based channel estimation is that the problem becomes overparameterized, with too many degrees of freedom. It is practically more beneficial to look for a channel model with the smallest number of parameters, but one that still sufficiently reflects the defining characteristics of the acoustic channel of interest. Along this line of thought, it is assumed in [9, 10] that, at each receiver, the channel taps for all the transmitters experience the same Doppler frequency, but different receivers experience different Doppler shifts. The number of unknowns in the frequency dimension, as a consequence, is significantly reduced. Under this assumption, CIRs and the underlying Doppler frequency can be estimated separately [9] or jointly by employing the generalization of the sparse learning via iterative minimization (GoSLIM) algorithm [10]. It is demonstrated in [10] that GoSLIM outperforms the separate estimation algorithm proposed in [9] in terms of estimation accuracy and robustness against suboptimal training sequences.
In [11], the aforementioned channel model is further simplified by assuming that the channel taps for all the transmitter and receiver pairs experience the same Doppler frequency. As a consequence, the impact of the Doppler frequency shift on the received measurements across all the receivers is taken into account through one unknown common frequency. Accordingly, a variation of GoSLIM, referred to as GoSLIM-V (V stands for variation), is proposed for channel estimation in [11]. Like GoSLIM, GoSLIM-V addresses sparsity through a hierarchical Bayesian model, and because GoSLIM-V is user parameter free, it is easy to use in practical applications. It is demonstrated in [11] that the employment of GoSLIM-V not only reduces the overall complexity in the channel estimation stage but also slightly improves the detection performance compared to its GoSLIM counterpart. For this reason, GoSLIM-V is used as the channel estimation algorithm in the present paper.
Following the channel estimation is the design of the detection scheme for extracting the transmitted signals. The channel-induced phase shift should first be compensated using the Doppler frequency estimate [9]. This phase compensation step, along with the aforementioned temporal resampling process, effectively converts a double-selective channel subject to both Doppler scaling effects and ISI to an ISI channel, which allows for the employment of various equalization techniques that can effectively combat ISI. We use a linear minimum mean-squared error (LMMSE) based filter for symbol detection. In a MIMO setup, on top of ISI, multiple simultaneously transmitted signals act as interference to one another. Therefore, the interference cancellation scheme also plays a critical role in the overall detection performance. A hard-decision-based interference cancellation scheme, such as vertical BLAST (V-BLAST) [12] or RELAX-BLAST [5], subtracts the hard decisions of detected signals from the received measurements to aid the detection of the remaining signals. By combining V-BLAST with the cyclic principle of the RELAX algorithm [13], RELAX-BLAST provides superior detection performance over V-BLAST at the cost of slightly increased complexity [5, 6].
The detection performance can be further enhanced by employing a soft interference cancellation scheme such as turbo equalization [14–16]. For a receiver employing turbo equalization, both the equalizer and decoder involved are configured as soft-input soft-output. The detection performance improves as the soft information cycles between the equalizer and decoder. The main drawback of the turbo equalization scheme is the increased computational complexity compared to its hard-decision-based counterparts. To address this problem, we consider a low complexity approximation of the soft-input soft-output equalizer [14]. We will show via numerical and experimental examples that the approximate equalizer has a computational complexity comparable to that of RELAX-BLAST and incurs only a slight degradation in detection performance compared to an exactly implemented turbo equalizer.
The rest of the paper is organized as follows. Section 2 presents a system outline. Section 3 describes a model for the acoustic channel subject to both ISI and Doppler scaling effects and reviews the temporal resampling procedure. Section 4 formulates the channel estimation problem in both training-directed and decision-directed modes and then introduces GoSLIM-V as the channel estimation algorithm. Section 5 first formulates the symbol detection problem and then details the LMMSE based soft-input soft-output equalizer and its low complexity approximation. Section 6 presents the simulation results of the turbo equalization scheme, followed by the experimental results obtained from analyzing the MACE10 in-water measured data. The paper is concluded in Section 7.
Notation
Vectors and matrices are denoted by boldface lowercase and uppercase letters, respectively,
2. System Outline
Consider an

An
The structure of a receiver employing a turbo equalization scheme is shown in Figure 1(b). The measurements acquired by the M receive hydrophones are first resampled, followed by channel estimation and phase compensation. After phase compensation, the double-selective channel is converted to an ISI channel, and the turbo equalization scheme is employed herein to retrieve the transmitted information. The superior detection performance promised by turbo equalization is mainly due to its mechanism of cycling soft information between the equalizer and the decoder [14]. Accordingly, turbo equalization consists of two key modules, namely, a soft-input soft-output equalizer and a soft-input soft-output decoder [17]. The soft information of a generic bit
3. Double-Selective Channel with Doppler Scaling Effects
In this section, we start with the modeling of the double-selective channel suffering from both ISI and Doppler scaling effects. Then we describe the temporal resampling procedure to mitigate the Doppler scaling effects. After that, we provide a practical approach to estimate the Doppler scaling factor.
3.1. Channel Model
By adopting a single-carrier communication scheme, at the nth transmitter, the continuous baseband signal
Due to multipath effects, the actual transmitted signals
We assume that the propagation paths for all the transmitter and receiver pairs experience a common Doppler scaling factor and the resolved paths are synchronized among all the transmitter and receiver pairs, that is,
3.2. Temporal Resampling
By resampling the received measurements
Therefore, the determination of the resampling factor β plays a crucial role in the effective mitigation of the Doppler scaling effects.
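The resampling step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the received samples are a time-scaled copy of the transmitted waveform and that resampling amounts to evaluating the received sequence at fractional positions `beta * n`; the function name `resample`, the use of linear interpolation, and this convention for `beta` are assumptions, not the paper's implementation.

```python
import numpy as np

def resample(y, beta):
    """Evaluate y at fractional sample positions beta*n (linear interpolation).

    The convention for beta is an assumption made for this sketch.
    """
    y = np.asarray(y, dtype=complex)
    n_out = int(np.floor((len(y) - 1) / beta)) + 1
    pos = beta * np.arange(n_out)
    idx = np.arange(len(y))
    # Interpolate real and imaginary parts separately.
    return np.interp(pos, idx, y.real) + 1j * np.interp(pos, idx, y.imag)

def tone(t):
    """Hypothetical test waveform: a slow complex exponential."""
    return np.exp(2j * np.pi * 0.01 * t)

# Toy check: a waveform compressed by (1 + a) is restored with beta = 1 / (1 + a).
a = 0.01
y = tone((1 + a) * np.arange(2000))
z = resample(y, 1 / (1 + a))
err = np.max(np.abs(z - tone(np.arange(len(z)))))
```

In this toy case the residual `err` is limited by the linear interpolation error; a practical receiver would likely use a higher-order or polyphase interpolator.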
3.3. Resampling Factor Estimation
We take advantage of the preamble and the postamble of a data packet to estimate β [19] (the structure of a data packet will be discussed in Section 6). By cross-correlating the received signal with the known preamble and postamble, the receiver estimates the time duration of a packet
A more accurate estimate of the Doppler scaling factor can be obtained via channel estimation instead of cross-correlation. Based on the CIRs estimated from the two measurement segments in response to the preamble and postamble, the change in the time duration
The
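The cross-correlation approach described in this subsection can be sketched as follows. This is a toy illustration under stated assumptions: the probe sequences are random binary sequences, the names `peak_index` and `duration_ratio` are hypothetical, and the returned quantity is simply the ratio of the measured to the nominal preamble-to-postamble spacing, which would still have to be mapped to β according to the paper's own definition.

```python
import numpy as np

def peak_index(received, probe):
    """Offset at which the known probe best matches the received samples."""
    corr = np.abs(np.correlate(received, probe, mode="valid"))
    return int(np.argmax(corr))

def duration_ratio(received, preamble, postamble, nominal_gap):
    """Measured preamble-to-postamble spacing divided by the nominal spacing."""
    start = peak_index(received, preamble)
    end = peak_index(received, postamble)
    return (end - start) / nominal_gap

# Toy data: the nominal spacing is 1000 samples, but the packet arrives compressed to 990.
rng = np.random.default_rng(0)
preamble = rng.choice([-1.0, 1.0], size=128)
postamble = rng.choice([-1.0, 1.0], size=128)
received = 0.1 * rng.standard_normal(1400)
received[100:228] += preamble
received[1090:1218] += postamble

ratio = duration_ratio(received, preamble, postamble, nominal_gap=1000)
```

The resolution of this estimate is limited to one sample of timing error over the packet duration, which is consistent with the text's remark that channel estimation can give a more accurate result.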
4. Channel Estimation
Since the
In what follows,
4.1. Training-Directed Mode
The initial task of the receiver is to acquire knowledge of the underlying channel between all transmitter and receiver pairs using the training sequences. By adopting the cyclic prefix scheme in [7], the training sequence at the nth transmitter (
For MIMO UAC over acoustic channels subject to both ISI and Doppler frequency shifts, the measurement vectors can be written as [8, 21]
The ISI and Doppler shift effects can be viewed separately in (14). More specifically, the term
We express (14) in a more compact form as
4.2. Decision-Directed Mode
The decision-directed channel estimation problem is only a slight twist of its training-directed counterpart. For the former, we use the hard decisions of the previously estimated payload symbols, instead of the training symbols, to estimate the channels; see Figure 1(b). Accordingly, (14) can still be used, where
4.3. Channel Estimation Algorithm: GoSLIM-V
The channel estimation algorithm, in either training- or decision-directed mode, has the generic form given by (20). We remark that the number of elements in
Consider the following hierarchical Bayesian model:
Furthermore, by considering a flat prior on f, η, and
The 5 steps of the GoSLIM-V algorithm at the tth iteration are outlined below.
Given Once Next, using the most recently obtained Using the Set
In the training-directed mode, the channel characteristics in general are not available a priori. In our examples,
5. Symbol Detection
In this section, we proceed to study the detection of the payload symbols given the estimates of CIRs and Doppler frequency f obtained by GoSLIM-V. The detection task is achieved via two steps: phase compensation followed by turbo equalization. As shown in Figure 1(b), turbo equalization consists of an equalizer and a decoder, both configured as soft-input soft-output. The decoder is conventionally implemented by the Max-Log-MAP algorithm [17], and our focus herein is on the soft-input soft-output equalizer. We first formulate the symbol detection problem and then describe the phase compensation procedure. After that, we elaborate on the LMMSE-based turbo equalization design and discuss its low complexity approximation.
5.1. Problem Formulation
Treating the transmitted symbols as the unknowns and the CIRs and Doppler frequency as known, the measurement vector in (14) can be expressed as [8, 21]
When detecting symbols, we use the estimates
5.2. Phase Compensation
Stacking up all the measurements, (32) can be written as
Phase compensation, along with the aforementioned temporal resampling process, effectively converts the original double-selective channel to an ISI channel. Given the phase-compensated measurement vector
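As a minimal sketch of the phase compensation idea, the snippet below removes a common Doppler-induced phase ramp from the measurements. It assumes that the Doppler frequency estimate `f_hat` is expressed in Hz, that samples are spaced by a symbol period `t_s`, and that the phase ramp starts at zero at the first sample; these conventions and the function name are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def phase_compensate(y, f_hat, t_s):
    """Remove the phase ramp exp(j*2*pi*f_hat*n*t_s) along the last axis of y."""
    y = np.asarray(y, dtype=complex)
    n = np.arange(y.shape[-1])
    return y * np.exp(-2j * np.pi * f_hat * n * t_s)

# Toy check: a constant channel gain rotated by a 2 Hz Doppler shift.
t_s = 1.0 / 3906.25            # symbol period for a 3.90625 kHz symbol rate
n = np.arange(500)
y = 0.5 * np.exp(2j * np.pi * 2.0 * n * t_s)
z = phase_compensate(y, f_hat=2.0, t_s=t_s)
```

Because a single common frequency is assumed for all transmitter and receiver pairs, the same ramp can be applied to every row of an M-by-N measurement matrix, which is why the sketch operates along the last axis.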
5.3. LMMSE-Based Soft-Input Soft-Output Equalizer
As shown in Figure 2, an LMMSE-based soft-input soft-output equalizer can be functionally divided into four modules. The

Figure 2: The structure of the LMMSE-based soft-input soft-output equalizer.
5.3.1. A Priori LLR Preprocessor
In this task, we calculate
Plugging (42) into (40) gives
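To make the role of the a priori LLR preprocessor concrete, the following sketch converts a priori bit LLRs into symbol means and variances. It assumes BPSK with bit 0 mapped to +1 and bit 1 mapped to -1, and an LLR defined as ln[P(b=0)/P(b=1)]; the constellation and mapping actually used in the paper may differ, so this is an illustration of the principle only.

```python
import numpy as np

def bpsk_prior_stats(llr):
    """A priori mean and variance of BPSK symbols from bit LLRs.

    Assumes bit 0 -> +1, bit 1 -> -1, and LLR = ln(P(b=0) / P(b=1)).
    """
    llr = np.asarray(llr, dtype=float)
    mean = np.tanh(llr / 2.0)      # P(b=0) - P(b=1)
    var = 1.0 - mean ** 2          # E|s|^2 = 1 for unit-energy symbols
    return mean, var

mean, var = bpsk_prior_stats([0.0, 2.0, -20.0])
```

An LLR of zero (no prior knowledge) gives zero mean and unit variance, whereas a large-magnitude LLR drives the variance toward zero, so confidently known symbols contribute little residual interference.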
5.3.2. LMMSE Filtering
Depending on whether the a priori LLR information is incorporated or not, two types of LMMSE filters are studied in the following.
In the Absence of A Priori Knowledge
At the very first iteration, before the decoder has been run, the equalizer operates in the absence of a priori knowledge. This scenario amounts to setting
In the Presence of A Priori Knowledge
In this case, the LMMSE estimate of
Equation (46) suggests that the estimation of
Define
5.3.3. A Posteriori LLR Generator
This task calculates the
We assume that, given
Let
Since
5.4. Low-Complexity Approximate LMMSE Filtering
Although the calculation of a posteriori LLR according to (56) is more efficient than (53), it still constitutes the major computational bottleneck in turbo equalization mainly because
Note that matrix inversion is an indispensable stage in calculating the LMMSE filter coefficients in (44), (47), and (57). To expedite the calculation, we can make use of the conjugate gradient (CG) method and fast Fourier transform (FFT) operations, as elaborated in [27]. Although [27] focuses on the efficient calculation of the LMMSE filter coefficients in the form of (44), the extension to a more general scenario in (47) or (57) is straightforward. In the present paper, both exact-LMMSE turbo and approximate-LMMSE turbo are implemented using the FFT-based CG method.
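The idea of replacing an explicit matrix inversion with the CG method can be sketched as below. This is a generic CG solver for a Hermitian positive definite system, not the FFT-based implementation of [27]; the dense matrix-vector product is a stand-in for the FFT-based product, and the regularized normal-equation system used in the toy example is an assumed form chosen only for illustration.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve A x = b for Hermitian positive definite A, given only x -> A x."""
    b = np.asarray(b, dtype=complex)
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = np.vdot(r, r).real
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        ap = matvec(p)
        alpha = rs / np.vdot(p, ap).real
        x = x + alpha * p
        r = r - alpha * ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy system (H^H H + sigma2 I) w = H^H y with a random channel matrix H.
rng = np.random.default_rng(1)
H = rng.standard_normal((20, 8)) + 1j * rng.standard_normal((20, 8))
y = rng.standard_normal(20) + 1j * rng.standard_normal(20)
sigma2 = 0.1
A = H.conj().T @ H + sigma2 * np.eye(8)
w = conjugate_gradient(lambda v: A @ v, H.conj().T @ y)
w_ref = np.linalg.solve(A, H.conj().T @ y)
```

Since the solver touches the matrix only through `matvec`, swapping in a fast structured product changes the cost per iteration without altering the algorithm.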
6. Numerical and Experimental Results
6.1. Numerical Results
Consider transmitting four payload blocks simultaneously over time-invariant ISI channels using a MIMO UAC system equipped with

(a) Coded BER performance by using exact-LMMSE turbo along with RELAX-BLAST performance. (b) Coded BER performance by using approximate-LMMSE turbo along with RELAX-BLAST performance. Each point is averaged over 500 Monte Carlo trials. In this simulation,
6.2. MACE10 In-Water Experimental Results
6.2.1. Experiment Specifics
The MACE10 in-water experiment was conducted by the Woods Hole Oceanographic Institution (WHOI) off the coast of Martha's Vineyard, MA, USA, in June 2010. A source array consisting of 4 transducers was vertically deployed at a depth of 80 m and towed by a vessel. At the receiver side, a 12-element hydrophone array was mounted on a buoy. The vessel moved from the minimum range of 500 m away from the receiving array outbound to the maximum range of 4000 m and then inbound back to the minimum range. The carrier frequency, sampling frequency, and symbol rate employed in the MACE10 experiment were 13 kHz, 39.0625 kHz, and 3.90625 kHz, respectively. By transmitting
The structure of a transmitted data package is shown in Figure 4. Each package consists of 4 packets. The first packet conveys a grayscale Gator mascot image, and the subsequent 3 packets together convey a colored mascot image. The RGB components of the colored image were transmitted in the 2nd, 3rd, and 4th packets, respectively. Each pixel of the grayscale Gator image is represented by 5 bits, corresponding to 32 different intensities (e.g., pure white and pure black pixels are represented by 11111 and 00000, resp.). The 64-pixel by 100-pixel grayscale mascot image, as a consequence, is represented by a total of 32 k source bits. Accordingly, a colored mascot image is represented by 96 k bits. The contrast of the grayscale image, as well as the hue of the colored image, has been carefully adjusted so that the image carries approximately equal numbers of 1s and 0s.
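The bit budget above can be checked with a short sketch that maps 5-bit pixel intensities to a source bit stream. The most-significant-bit-first ordering, the row-major pixel order, and the function name `pixels_to_bits` are assumptions made for illustration; the source does not specify them.

```python
import numpy as np

def pixels_to_bits(levels, bits_per_pixel=5):
    """Unpack integer intensity levels into bits, most significant bit first."""
    levels = np.asarray(levels, dtype=np.int64).ravel()
    shifts = np.arange(bits_per_pixel - 1, -1, -1)
    return ((levels[:, None] >> shifts) & 1).astype(np.uint8).ravel()

# A 64 x 100 image with 32 intensity levels (0 = black, 31 = white).
rng = np.random.default_rng(2)
image = rng.integers(0, 32, size=(64, 100))
bits = pixels_to_bits(image)

n_gray_bits = bits.size          # 64 * 100 * 5 source bits
n_color_bits = 3 * n_gray_bits   # one such stream per RGB component
```

With this mapping, a white pixel (level 31) yields 11111 and a black pixel (level 0) yields 00000, matching the representation described in the text.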

Figure 4: The structure of the package used in the MACE10 experiment.
As shown in Figure 4, each packet is constructed as follows: time-marking sequences are placed at the beginning of each packet to facilitate the temporal resampling procedure; two guard intervals, each containing 500 silent symbols, are placed, respectively, before and after the segments containing the payload symbols and training sequences. The payload symbols contain the information of the Gator mascot image. We herein elaborate how to generate the 1st packet from the grayscale Gator mascot image (the packet generation for each of the RGB components of the colored image follows the same procedure). Specifically, the 32 k source bits are first interleaved so that the bits fed into the convolutional encoder module have an equal chance of being 0 or 1; see Figure 5. The so-obtained 32 k interleaved source bits are then divided into 32 groups, each containing 1 k bits. The bits in the

The structure of the transmitted symbols for the
To estimate the Doppler scaling factor, we treat the time-marking sequences at the beginning of a packet as its preamble and those at the beginning of the subsequent packet as the postamble. Take the 2nd packet of epoch “E002,” for example. For the channel between the 1st transmitter and the 1st receiver, the superimposed modulus of the CIRs obtained by GoSLIM-V from the preamble and postamble is shown in Figure 6. The indexes of the principal arrivals for the preamble and postamble are 12 and 21, respectively. Hence, the time duration change imposed on the packet is

Figure 6: The superimposed modulus of the CIRs obtained from the preamble and postamble, respectively.
To assess the performance of the resampling process, the CIR and Doppler frequency evolutions obtained by GoSLIM-V before we resample the 2nd packet of epoch “E002” are shown in Figures 7(a) and 7(c), respectively. In comparison, Figures 7(b) and 7(d) demonstrate the corresponding CIR and Doppler frequency evolutions obtained after resampling the packet, respectively. We can see from Figure 7 that the temporal resampling procedure successfully reduces the Doppler scaling effects to Doppler frequency shifts. The relative speed between the transmitter and the receiver arrays can be estimated as

Figure 7: (a) CIR evolution of epoch “E002” before resampling. (b) CIR evolution of epoch “E002” after resampling. (c) Doppler frequency evolution of epoch “E002” before resampling. (d) Doppler frequency evolution of epoch “E002” after resampling.

The relative speed between the transmitter and receiver array given by GPS and estimated during the temporal resampling stage (courtesy of Milica Stojanovic's group).
6.2.2. Performance Evaluation
By fixing the tracking length at
We deem a packet to be successfully detected if the resulting coded BER is less than 0.1. After analyzing a total of 480 packets available, Table 1 summarizes the successfully detected packet percentage, the zero BER packet percentage, the coded BER averaged over the successful packets, and the time ratio of the time consumed to process a packet on the workstation specified in Section 6.1 to
Table 1: A summary of the performance of the three detection schemes (3 iterations applied).

(a) Grayscale mascot recovered from epoch “E054” using RELAX-BLAST. (b) Grayscale mascot recovered from epoch “E054” using turbo equalization. (c) Colored mascot recovered from epoch “E054” using RELAX-BLAST. (d) Colored mascot recovered from epoch “E054” using turbo equalization.
To further illustrate the detection performance of turbo equalization, Table 2 shows the coded BER averaged over all of the 480 packets at different iteration numbers obtained by exact-LMMSE turbo and approximate-LMMSE turbo. We can see from Table 2 that the coded BER improves with iteration. Empirical experience indicates that the detection performance for both types of turbo equalization converges after three iterations. Next, we choose one payload block and denote
Table 2: The average coded BER obtained by exact-LMMSE turbo and approximate-LMMSE turbo, respectively.

The LLR soft information about the source bits at the output of the decoder. ((a)–(d)) are obtained by exact-LMMSE turbo from no iteration to 3 iterations, respectively. ((e)–(h)) are obtained by approximate-LMMSE turbo from no iteration to 3 iterations, respectively.
7. Conclusions
For double-selective channels encountered in mobile MIMO UAC, we have demonstrated via the MACE10 in-water experimental data analysis that it is reasonable to assume a common Doppler scaling factor imposed on the propagation paths among all the transmitter and receiver pairs when the Doppler effects are mainly induced by the relative motions between the transmitter and receiver arrays. Temporal resampling has been used to effectively convert the Doppler scaling effects to Doppler frequency shifts. A data-adaptive sparse channel estimation algorithm, referred to as the GoSLIM-V algorithm, has been used to estimate the underlying CIRs and Doppler frequency jointly. For symbol detection, we have investigated the turbo equalization schemes implemented by the LMMSE-based soft-input soft-output equalizer as well as its low complexity approximation. The latter incurs only a slight degradation in detection performance at a significantly lower computational complexity than the former and is thus preferred. The effectiveness of the considered approaches has been verified using both numerical results and the MACE10 in-water experimental data.
Acknowledgments
This work was supported in part by the Office of Naval Research (ONR) under Grant no. N00014-10-1-0054. The authors gratefully acknowledge WHOI for the fruitful collaboration in conducting the in-water experiments and for sharing the data.
