Two-dimensional multi-frequency synchronous grid-free compressive beamforming

Abstract

Two-dimensional grid-free compressive beamforming with rectangular planar microphone arrays has attracted considerable attention due to its wide source identification region (the hemispherical space in front of arrays), high accuracy (overcoming basis mismatch), and robust performance. However, the uniform distribution of microphones tends to result in spatial aliasing, i.e., the existing single-frequency methods fail to achieve accurate localization for the sources with wavelengths shorter than twice the minimum spacing between microphones. This paper proposes a two-dimensional multi-frequency synchronous grid-free compressive beamforming method, aimed at mitigating spatial aliasing and extending its applicability beyond the sampling limit imposed by the array geometry. First, a two-dimensional multi-frequency synchronous atomic norm minimization problem is formulated and then converted into an equivalent dual problem. Subsequently, a spatial spectral function is constructed from the solution of the dual problem to extract source positions and strengths. Simulation and experiment results show that when the spatial Nyquist criterion is satisfied, both the proposed multi-frequency synchronous method and existing single-frequency methods can accurately identify sources, with the proposed method reducing strength quantification error by over 80% under strong interference. When the spatial Nyquist criterion is violated, the proposed method can effectively suppress spatial aliasing inherent in the existing single-frequency methods and accurately identify common engineering noise sources with high-frequency components, such as electric motor noise, gear whine, and bearing noise.

Keywords

acoustic source identification; compressive beamforming; grid-free; multi-frequency synchronous; spatial aliasing

1. Introduction

Compressive beamforming (CB)^1,2 enables clear and intuitive acoustic imaging, shows strong anti-interference capability, and is applicable to both coherent and incoherent sources. It therefore has great application value in engineering fields such as aerospace,^3–6 automotive engineering,^7,8 and wind power generation.^9–11

Based on their mathematical models, CB methods can be categorized into four types: on-fixed-grid methods, off-fixed-grid methods, dynamic-grid methods, and grid-free methods.¹² On-fixed-grid CB discretizes the imaging region into a set of fixed grid points, constructs a sensing matrix using the transfer functions between the grid points and microphones, and then formulates an underdetermined equation system relating the measured pressures to the sensing matrix and the source distribution. Sparse constraints and sparse recovery algorithms are applied to solve this equation system. When acoustic sources fall on the preset grid points, on-fixed-grid CB performs well. However, when sources deviate from the grid points, these methods localize sources to the nearby grid points, resulting in significant performance degradation. This is known as the basis mismatch problem.² The other three types of methods were established to overcome this problem. Off-fixed-grid CB also discretizes the imaging area but mitigates basis mismatch through initial on-grid localization and off-grid offset compensation.^13–15 Dynamic-grid CB discretizes the imaging region into dynamic grids and drives these grids toward the true source positions, which performs well with an appropriate initial grid setting.¹⁶ Unlike these two types of CB methods, grid-free CB treats the source distribution as a continuum, thereby avoiding basis mismatch at the root.¹⁷

Many researchers have worked on two-dimensional grid-free CB methods based on planar microphone arrays, which enable the identification of sources lying in the hemispherical space in front of the arrays. In 2017, Yang et al. proposed a two-dimensional grid-free CB method based on atomic norm minimization (ANM),¹⁷ and later introduced iterative reweighted ANM to improve spatial resolution.^18,19 In 2019, Yang et al. exploited multiple-snapshot measurements to enhance the performance²⁰ and developed an efficient solver based on the alternating direction method of multipliers (ADMM).²¹ Tang et al. solved the ANM problem in the dual domain by defining a spatial spectral function related to the dual polynomial.²² In 2020, Liu et al. proposed an iterative Vandermonde decomposition and shrinkage thresholding-based two-dimensional grid-free CB method,²³ which solves its model using accelerated proximal mapping and alternating projections. This method avoids performance degradation caused by inaccurate SNR estimation.

The core of the above two-dimensional grid-free CB methods is to formulate and solve the ANM model, for which the construction of a two-level Toeplitz matrix is essential. Therefore, the existing two-dimensional grid-free CB methods can only be applied to regular array configurations, such as uniform or sparse rectangular microphone arrays.²⁴ For a given number of microphones and equivalent array diameter, regular microphone arrays have a larger minimum spacing between microphones. According to the spatial Nyquist criterion,^25,26 when the minimum spacing between microphones exceeds half the wavelength, DOA estimation may become ambiguous, i.e., high-frequency spatial aliasing occurs. The existing two-dimensional grid-free CB methods process signals at a single frequency, and therefore their upper applicable frequency limit is constrained by the spatial Nyquist criterion.
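As a small worked illustration of the spatial Nyquist criterion, the aliasing-free upper frequency follows from setting the minimum microphone spacing equal to half a wavelength. The sketch below assumes a sound speed of 343 m/s, which is not stated in the text, and the function names are illustrative.

```python
def nyquist_frequency_limit(min_spacing_m, sound_speed=343.0):
    """Frequency (Hz) at which the spacing equals half a wavelength.

    sound_speed defaults to 343 m/s, an assumed value.
    """
    if min_spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return sound_speed / (2.0 * min_spacing_m)


def violates_nyquist(freq_hz, min_spacing_m, sound_speed=343.0):
    """True if the spacing exceeds half the wavelength at freq_hz."""
    return freq_hz > nyquist_frequency_limit(min_spacing_m, sound_speed)
```

Under this assumed sound speed, a spacing of 0.035 m gives a limit of about 4900 Hz and a spacing of 0.07 m gives about 2450 Hz.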

Spatial aliasing is frequency-dependent, meaning that aliasing artifacts appear at different locations across frequencies, while the true source locations remain frequency-independent. Therefore, by establishing a multi-frequency model in which the frequency separation (measured in wavelength) exceeds twice the minimum microphone spacing, the intersection of the source distribution sparsity profile across frequencies contains exclusively the true DOAs, enabling aliasing-free DOA estimation.²⁷ Based on that, some researchers developed a joint sparse multi-frequency model with one-dimensional linear arrays^28–30 to integrate acoustic information at different frequencies into a unified optimization framework, achieving spatial aliasing suppression in one-dimensional line spectral estimation. Inspired by these reports, this paper proposes a two-dimensional multi-frequency synchronous grid-free CB method. The proposed method aims to suppress high-frequency spatial aliasing caused by the use of regular microphone arrays and effectively identify sources whose wavelengths are smaller than twice the minimum microphone spacing, thereby extending its applicability to identifying engineering noise containing high-frequency components, such as electric motor noise, gear whine, and bearing noise. The proposed method first formulates a two-dimensional multi-frequency model and defines the atomic norm. Then, it constructs an infinite-dimensional ANM problem and converts it to a dual problem with finite-dimensional variables for solving. Finally, a spatial spectral function related to the dual polynomial is used to recover the primal solution, from which DOAs and source strengths are extracted.
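The frequency dependence of aliasing can be illustrated with a one-dimensional simplification. For a uniform line array with spacing d, a direction cosine u is indistinguishable at frequency f from u + k c / (f d) for any integer k. The sketch below is not the proposed two-dimensional method; it assumes a sound speed of 343 m/s and uses illustrative function names. It lists the ambiguous direction cosines at each frequency and keeps those shared by all frequencies.

```python
import numpy as np


def ambiguous_directions(u_true, freq_hz, spacing_m, sound_speed=343.0):
    """Direction cosines in [-1, 1] that a uniform line array cannot
    distinguish from u_true at one frequency: u_true + k * c / (f * d)."""
    period = sound_speed / (freq_hz * spacing_m)
    k_min = int(np.ceil((-1.0 - u_true) / period))
    k_max = int(np.floor((1.0 - u_true) / period))
    return [u_true + k * period for k in range(k_min, k_max + 1)]


def common_directions(u_true, freqs_hz, spacing_m, tol=1e-6):
    """Directions that remain ambiguous at every frequency in freqs_hz."""
    candidates = ambiguous_directions(u_true, freqs_hz[0], spacing_m)
    for f in freqs_hz[1:]:
        others = ambiguous_directions(u_true, f, spacing_m)
        candidates = [u for u in candidates
                      if any(abs(u - v) < tol for v in others)]
    return candidates


# Example: spacing 0.035 m, source at direction cosine 0.6.
single = ambiguous_directions(0.6, 7000.0, 0.035)
joint = common_directions(0.6, [7000.0, 7500.0, 8000.0], 0.035)
```

In this example the single-frequency list contains the true direction and one alias, whereas only the true direction survives the intersection over the three frequencies.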

The main contributions of this article are as follows:

● A two-dimensional multi-frequency synchronous grid-free CB model is established based on atomic norm minimization for source identification in two-dimensional space.

● A multi-frequency spatial spectral function is proposed to extract source information.

● Compared with the existing two-dimensional grid-free CB methods, the proposed method can effectively suppress spatial aliasing and accurately identify the sources with wavelengths shorter than twice the minimum spacing between microphones due to its multi-frequency synchronous model. Compared with the existing multi-frequency CB methods, the proposed method is the two-dimensional extension of these methods, and can achieve two-dimensional source localization in continuous space.

The remainder of this paper is organized as follows: Section 2 introduces the theory of the proposed two-dimensional multi-frequency synchronous grid-free CB, Section 3 evaluates the performance of the proposed method via simulations and Monte Carlo trials, Section 4 validates the correctness of simulations and the effectiveness of the proposed method via loudspeaker identification experiments, and Section 5 summarizes the paper.

2. Theory of two-dimensional multi-frequency synchronous grid-free compressive beamforming

2.1 Two-dimensional multi-frequency synchronous model

Figure 1 shows the measurement model. Sources are located in the far field and radiate plane waves. $Ω_{S i} = (θ_{S i}, ϕ_{S i})$ denotes the DOA of the $i$th source, where $θ_{S i} \in [0 °, 90 °]$ and $ϕ_{S i} \in [0 °, 360 °)$ represent the elevation and azimuth angles, respectively. A rectangular microphone array with $A$ rows and $B$ columns, i.e., a total of $A B$ microphones, is used. The symbol “●” denotes microphones. $(a, b) \in \{0, 1, \dots, A - 1\} \times \{0, 1, \dots, B - 1\}$ denotes the index of microphones. $Δ x$ and $Δ y$ represent the distances between adjacent microphones in the $x$ and $y$ directions, respectively. The $(0, 0)$th microphone is located at the origin. $i = 1, 2, \dots, I$ is the index of sources, and $I$ is the number of sources. $l = 1, 2, \dots, L$ denotes the index of frequencies, and $L$ is the number of frequencies. $C$ denotes the set of complex numbers. $s_{i} = [s_{i, 1}, s_{i, 2}, \dots, s_{i, L}] \in C^{1 \times L}$ is a row vector consisting of the pressures at all frequencies radiated by the $i$th source at the origin. Let $t_{1} \equiv \sin θ \cos ϕ Δ x$ and $t_{2} \equiv \sin θ \sin ϕ Δ y$. The continuous expression of the multi-frequency signal $x (t_{1}, t_{2}) \in C^{1 \times L}$ is given by

x (t_{1}, t_{2}) = \sum_{i = 1}^{I} s_{i} δ (t_{1} - t_{1 i}, t_{2} - t_{2 i}),

(1)

where $t_{1 i} \equiv \sin θ_{S i} \cos ϕ_{S i} Δ x$, $t_{2 i} \equiv \sin θ_{S i} \sin ϕ_{S i} Δ y$, and $δ (\cdot)$ is the Dirac delta function. $p_{a, b} = [p_{a, b, 1}, p_{a, b, 2}, \dots, p_{a, b, L}] \in C^{1 \times L}$ assembles the pressures at the $(a, b)$th microphone across all frequencies, and it can be written as

p_{a, b} = \iint_{T} x (t_{1}, t_{2}) \circ e^{j 2 π \frac{f}{c} (t_{1} a + t_{2} b)} d t_{1} d t_{2} = \sum_{i = 1}^{I} [e^{j 2 π \frac{f_{1}}{c} (t_{1 i} a + t_{2 i} b)}, e^{j 2 π \frac{f_{2}}{c} (t_{1 i} a + t_{2 i} b)}, \dots, e^{j 2 π \frac{f_{L}}{c} (t_{1 i} a + t_{2 i} b)}] S_{i},

(2)

where $T = \{(t_{1}, t_{2}) | {(t_{1} / Δ x)}^{2} + {(t_{2} / Δ y)}^{2} \leq 1\}$, the symbol “$\circ$” denotes the Hadamard product, $c$ is the sound speed, and $j = \sqrt{- 1}$ is the imaginary unit. The center frequencies of all subbands are collected in a row vector $f = [f_{1}, f_{2}, \dots, f_{L}] \in R^{1 \times L}$, where $R$ denotes the set of real numbers. $S_{i} = Diag (s_{i}) \in C^{L \times L}$, where $Diag (\cdot)$ forms a diagonal matrix with the vector in parentheses as the diagonal. We construct $D (t_{1 i}, t_{2 i}, f) \in C^{A B \times L}$ as

D (t_{1 i}, t_{2 i}, f) = D (t_{1 i}, f) ⊙ D (t_{2 i}, f) = \left[ \begin{array}{c} 1 \\ e^{j 2 π \frac{f}{c} t_{1 i}} \\ ⋮ \\ e^{j 2 π \frac{f}{c} t_{1 i} (A - 1)} \end{array} \right] ⊙ \left[ \begin{array}{c} 1 \\ e^{j 2 π \frac{f}{c} t_{2 i}} \\ ⋮ \\ e^{j 2 π \frac{f}{c} t_{2 i} (B - 1)} \end{array} \right],

(3)

where the symbol “$⊙$” represents the Khatri-Rao product. For brevity, $D (t_{1 i}, t_{2 i}, f)$ is expressed as $D_{i}$, with $D_{i} = [d_{i, 1}, d_{i, 2}, \dots, d_{i, L}]$, where $d_{i, l} \in C^{A B \times 1}$ is the $l$th column of $D_{i}$. Let $P = {[p_{0, 0}^{T}, p_{0, 1}^{T}, \dots, p_{A - 1, B - 1}^{T}]}^{T} \in C^{A B \times L}$, where ${(\cdot)}^{T}$ denotes the transpose. $P$ can be expressed as

P = \sum_{i = 1}^{I} D_{i} S_{i} = D S,

(4)

where $D = [D_{1}, D_{2}, \dots, D_{I}] \in C^{A B \times L I}$ and $S = {[S_{1}^{T}, S_{2}^{T}, \dots, S_{I}^{T}]}^{T} \in C^{L I \times L}$. When noise interference is considered, the measured signal is

P^{#} = P + N,

(5)

where $N \in C^{A B \times L}$ is the noise matrix. The signal-to-noise ratio (SNR) is defined as $SNR = 20 \log_{10} ({‖ P ‖}_{F} / {‖ N ‖}_{F})$, where ${‖ \cdot ‖}_{F}$ denotes the Frobenius norm.
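The measurement model of Eqs. (3) to (5) can be sketched numerically as follows. This is a minimal illustration rather than the authors' code: the function names, the sound speed of 343 m/s, and the complex Gaussian noise model are assumptions. The Khatri-Rao product is implemented as a column-wise Kronecker product, so that row a B + b of the result corresponds to the (a, b)th microphone.

```python
import numpy as np


def steering_matrix(theta_deg, phi_deg, freqs_hz, n_rows, n_cols, dx, dy, c=343.0):
    """D_i of Eq. (3): one column per frequency, row index a * n_cols + b."""
    theta, phi = np.deg2rad(theta_deg), np.deg2rad(phi_deg)
    t1 = np.sin(theta) * np.cos(phi) * dx
    t2 = np.sin(theta) * np.sin(phi) * dy
    k = 2j * np.pi * np.asarray(freqs_hz, dtype=float) / c   # shape (L,)
    d1 = np.exp(np.outer(np.arange(n_rows), k * t1))          # A x L
    d2 = np.exp(np.outer(np.arange(n_cols), k * t2))          # B x L
    # Khatri-Rao product: Kronecker product taken column by column.
    return np.einsum("al,bl->abl", d1, d2).reshape(n_rows * n_cols, -1)


def simulate_measurement(doas_deg, strengths, freqs_hz, n_rows, n_cols, dx, dy,
                         snr_db=None, rng=None):
    """P = sum_i D_i Diag(s_i) (Eq. 4), optionally plus noise N (Eq. 5)."""
    P = np.zeros((n_rows * n_cols, len(freqs_hz)), dtype=complex)
    for (theta, phi), s_i in zip(doas_deg, strengths):
        D_i = steering_matrix(theta, phi, freqs_hz, n_rows, n_cols, dx, dy)
        P += D_i * np.asarray(s_i, dtype=complex)[None, :]    # D_i @ Diag(s_i)
    if snr_db is None:
        return P
    rng = np.random.default_rng() if rng is None else rng
    N = rng.standard_normal(P.shape) + 1j * rng.standard_normal(P.shape)
    # Scale the noise so that 20 log10(||P||_F / ||N||_F) equals snr_db.
    N *= np.linalg.norm(P) / np.linalg.norm(N) * 10.0 ** (-snr_db / 20.0)
    return P + N
```

Each row of `strengths` holds the complex pressures of one source at all frequencies, matching the row vector $s_{i}$ in the text.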

Figure 1.

Measurement model.

2.2 Primal problem

In the continuous spatial domain, the atomic norm of $S$ is defined as

{‖ S ‖}_{A} = \sum_{i = 1}^{I} tr \left( \left[ \begin{array}{cccc} | s_{i, 1} | & 0 & \dots & 0 \\ 0 & | s_{i, 2} | & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & | s_{i, L} | \end{array} \right] \right),

(6)

where $tr (\cdot)$ denotes the trace of a matrix and $| \cdot |$ denotes the absolute value. ${‖ S ‖}_{A}$ measures the sparsity of the source distribution. The DOA estimation in the continuous spatial domain can be achieved by solving the following convex optimization problem:

\min_{x} {‖ S ‖}_{A} \quad \text{s.t.} \quad {‖ P^{#} - D S ‖}_{F} \leq ε,

(7)

where $ε$ is the upper bound of the Frobenius norm of the noise. Since $x (t_{1}, t_{2})$ is a variable in the continuous domain, the primal problem is an infinite-dimensional optimization problem and cannot be solved directly. In the next section, we solve Eq. (7) through its corresponding dual problem.

2.3 Dual problem

Firstly, the Lagrange function of Eq. (7) is constructed,

L (S, C, β) = {‖ S ‖}_{A} + Re (tr (C^{H} (P^{#} - D S - N))) + β ({‖ N ‖}_{F}^{2} - ε^{2}),

(8)

where $β \in R^{+}$ and $C = [c_{1}, c_{2}, \dots, c_{L}] \in C^{A B \times L}$ are Lagrangian multipliers, $R^{+}$ denotes the set of positive real numbers, ${(\cdot)}^{H}$ denotes the Hermitian transpose, and $Re (\cdot)$ takes the real part of a complex number. The dual function $g (C, β)$ is the greatest lower bound of Eq. (8) with respect to $S$,

\begin{aligned} g (C, β) &= \inf_{S} L (S, C, β) \\ &= \inf_{S} ({‖ S ‖}_{A} + Re (tr (C^{H} (P^{#} - D S - N))) + β ({‖ N ‖}_{F}^{2} - ε^{2})) \\ &= \inf_{S} ({‖ S ‖}_{A} + Re (tr (C^{H} P^{#} - C^{H} D S - C^{H} N)) + β (tr (N N^{H}) - ε^{2})) \\ &= \inf_{S} ({‖ S ‖}_{A} - Re (tr (C^{H} D S))) + Re (tr (C^{H} P^{#} - C^{H} N)) + β (tr (N N^{H}) - ε^{2}), \end{aligned}

(9)

where $\inf (\cdot)$ denotes the greatest lower bound.

Then, we minimize $g (C, β)$ by taking its partial derivative with respect to the noise matrix and setting the derivative to 0, i.e.,

\frac{\partial g (C, β)}{\partial N} = \frac{\partial (- Re (tr (C^{H} N)) + β tr (N N^{H}))}{\partial N} = - C + 2 β N = 0 .

(10)

Therefore, the optimal noise matrix is $N_{opt} = C / (2 β)$ . Now, we calculate the dual function at $N_{opt}$ and maximize it with respect to $β$ ,

\frac{{\partial g (C, β) |}_{N_{opt}}}{\partial β} = \frac{{\partial β (tr (N N^{H}) - ε^{2}) |}_{N_{opt}}}{\partial β} = {(tr (N N^{H}) - ε^{2}) |}_{N_{opt}} = \frac{tr (C C^{H})}{4 β^{2}} - ε^{2} = 0,

(11)

thus the optimal dual variable is $β_{opt} = \sqrt{tr (C C^{H}) / (4 ε^{2})} = {‖ C ‖}_{F} / (2 ε)$. The dual function at $N_{opt}$ and $β_{opt}$ is

\begin{aligned} {g (C) |}_{N_{opt}, β_{opt}} &= \left. \inf_{S} ({‖ S ‖}_{A} - Re (tr (C^{H} D S))) + Re (tr (C^{H} P^{#})) - Re (tr (C^{H} N)) + β (tr (N N^{H}) - ε^{2}) \right|_{N_{opt}, β_{opt}} \\ &= \left. \inf_{S} ({‖ S ‖}_{A} - Re (tr (C^{H} D S))) + Re (tr (C^{H} P^{#})) - Re (tr (\frac{C^{H} C}{2 β})) + β (tr (\frac{C^{H} C}{4 β^{2}}) - ε^{2}) \right|_{β_{opt}} \\ &= \left. \inf_{S} ({‖ S ‖}_{A} - Re (tr (C^{H} D S))) + Re (tr (C^{H} P^{#})) - \frac{{‖ C ‖}_{F}^{2}}{2 β} \right|_{β_{opt}} \\ &= \inf_{S} ({‖ S ‖}_{A} - Re (tr (C^{H} D S))) + Re (tr (C^{H} P^{#})) - ε {‖ C ‖}_{F} . \end{aligned}

(12)
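The closed-form optimizers in Eqs. (10) and (11) can be checked numerically. The sketch below evaluates only the terms of Eq. (9) that depend on $N$ and $β$, namely $- Re (tr (C^{H} N)) + β (tr (N N^{H}) - ε^{2})$, and compares the value at $N_{opt}$ and $β_{opt}$ with $- ε {‖ C ‖}_{F}$. The function names and the random test matrix are illustrative assumptions.

```python
import numpy as np


def noise_part(C, N, beta, eps):
    """Terms of Eq. (9) that depend on N and beta."""
    return (-np.real(np.trace(C.conj().T @ N))
            + beta * (np.linalg.norm(N, "fro") ** 2 - eps ** 2))


def optimal_noise_and_beta(C, eps):
    """N_opt = C / (2 beta_opt) and beta_opt = ||C||_F / (2 eps)."""
    beta_opt = np.linalg.norm(C, "fro") / (2.0 * eps)
    return C / (2.0 * beta_opt), beta_opt


# Illustrative check with a random multiplier matrix.
rng = np.random.default_rng(0)
C = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
eps = 0.3
N_opt, beta_opt = optimal_noise_and_beta(C, eps)
value = noise_part(C, N_opt, beta_opt, eps)
closed_form = -eps * np.linalg.norm(C, "fro")
```

Here `value` and `closed_form` agree to numerical precision, perturbing `N_opt` at fixed `beta_opt` increases the value, and moving away from `beta_opt` (with $N = C / (2 β)$) decreases it.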

With respect to the first part in Eq. (12), we have

\begin{aligned} {‖ S ‖}_{A} - Re (tr (C^{H} D S)) &= {‖ S ‖}_{A} - \sum_{i = 1}^{I} tr (Re ({(D_{i}^{H} C)}^{H} S_{i})) \\ &= {‖ S ‖}_{A} - \sum_{i = 1}^{I} tr (| D_{i}^{H} C | \circ | S_{i} | \circ \cos φ_{i}) \\ &= \sum_{i = 1}^{I} tr (| S_{i} | - | D_{i}^{H} C | \circ | S_{i} | \circ \cos φ_{i}) \\ &= \sum_{i = 1}^{I} tr (| S_{i} | \circ (I_{L} - | D_{i}^{H} C | \circ \cos φ_{i})) \\ &\geq \sum_{i = 1}^{I} tr (| S_{i} | \circ (I_{L} - | D_{i}^{H} C |)) \\ &= \sum_{i = 1}^{I} \sum_{l = 1}^{L} (| s_{i, l} | (1 - | d_{i, l}^{H} c_{l} |)), \end{aligned}

(13)

where $| X |$ takes the modulus of each element of the matrix $X$, $I_{L}$ denotes an $L \times L$ identity matrix, and $φ_{i}$ is the element-wise included angle between $D_{i}^{H} C$ and $S_{i}$. When $| d_{i, l}^{H} c_{l} | \leq 1$, Eq. (13) is nonnegative, implying that its lower bound is 0. When $| d_{i, l}^{H} c_{l} | > 1$, the lower bound is $- \infty$. Therefore, the dual function in Eq. (12) can be rewritten as

{g (C) |}_{N_{opt}, β_{opt}} = \begin{cases} Re (tr (C^{H} P^{#})) - ε {‖ C ‖}_{F}, & | d_{i, l}^{H} c_{l} | \leq 1 \\ - \infty, & \text{otherwise} \end{cases} .

(14)

It is known that $d_{i, l}^{H} (c_{1} c_{1}^{H} + c_{2} c_{2}^{H} + \dots + c_{L} c_{L}^{H}) d_{i, l}$ is the $l$th diagonal element of $D_{i}^{H} C C^{H} D_{i}$, and

{| d_{i, l}^{H} c_{l} |}^{2} = d_{i, l}^{H} c_{l} c_{l}^{H} d_{i, l} \leq d_{i, l}^{H} (c_{1} c_{1}^{H} + c_{2} c_{2}^{H} + \dots + c_{L} c_{L}^{H}) d_{i, l} .

(15)

Therefore, the constraint $| d_{i, l}^{H} c_{l} | \leq 1$ in Eq. (14) can be replaced by $d_{i, l}^{H} (c_{1} c_{1}^{H} + c_{2} c_{2}^{H} + \dots + c_{L} c_{L}^{H}) d_{i, l} \leq 1$. We construct a Hermitian matrix $H \in C^{A B \times A B}$, which satisfies $d {(t_{1}, t_{2}, f_{l})}^{H} H d (t_{1}, t_{2}, f_{l}) = 1$. Note that

d {(t_{1}, t_{2}, f_{l})}^{H} H d (t_{1}, t_{2}, f_{l}) = sum {(d {(t_{1}, t_{2}, f_{l})}^{*} d {(t_{1}, t_{2}, f_{l})}^{T}) \circ H} = sum {G \circ H},

(16)

where $sum \{\cdot\}$ takes the sum of all elements of a vector or matrix, ${(\cdot)}^{*}$ denotes the conjugate, and $G = d {(t_{1}, t_{2}, f_{l})}^{*} d {(t_{1}, t_{2}, f_{l})}^{T} \in C^{A B \times A B}$ is a two-fold Toeplitz matrix of dimension $A B \times A B$ whose elements are all complex exponentials,

G = \left[ \begin{array}{cccc} G_{0} & G_{1}^{H} & \dots & G_{A - 1}^{H} \\ G_{1} & G_{0} & \dots & G_{A - 2}^{H} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ G_{A - 1} & G_{A - 2} & \dots & G_{0} \end{array} \right] .

(17)

Each block $G_{m}, m = 0, 1, \dots, (A - 1)$ is a Toeplitz matrix with the dimension of $B \times B$ . All diagonal elements of $G_{0}$ are 1. Accordingly, we partition $H$ into blocks, i.e.,

H = \left[ \begin{array}{cccc} H_{0, 0} & H_{0, 1} & \dots & H_{0, A - 1} \\ H_{0, 1}^{H} & H_{1, 1} & \dots & H_{1, A - 1} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ H_{0, A - 1}^{H} & H_{1, A - 1}^{H} & \dots & H_{A - 1, A - 1} \end{array} \right] .

(18)

Let $J_{m} = \sum_{j = 0}^{A - 1 - m} H_{j, j + m}$, $m = 0, 1, \dots, A - 1$. Since $sum \{G \circ H\} = 1$ must hold for every $(t_{1}, t_{2})$, $J_{m}$ satisfies $sum \{diag (J_{0}, 0)\} = 1$, $sum \{diag (J_{0}, n)\} = 0$ for $n = 1, 2, \dots, B - 1$, and $sum \{diag (J_{m}, n)\} = 0$ for $m = 1, 2, \dots, A - 1$ and $n = - (B - 1), \dots, - 1, 0, 1, \dots, B - 1$. Here, $diag (\cdot, n)$ takes the $n$th diagonal of a matrix and forms a column vector, and $n$ represents the offset of the diagonal: $n = 0$ refers to the principal diagonal, $n > 0$ denotes the $n$th diagonal above the principal diagonal, and $n < 0$ denotes the $| n |$th diagonal below it.
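The role of the diagonal sums can be made concrete with a short numerical sketch. For a Hermitian $H$, the quadratic form $d^{H} H d$ depends on $H$ only through the quantities $sum \{diag (J_{m}, n)\}$, each multiplying a complex exponential, so fixing these sums as stated above forces the quadratic form to equal 1 for every direction. The function names below are illustrative, and `psi1`, `psi2` stand for the phase steps $2 π f_{l} t_{1} / c$ and $2 π f_{l} t_{2} / c$.

```python
import numpy as np


def diagonal_sums(H, n_rows, n_cols):
    """Map (m, n) -> sum{diag(J_m, n)} with J_m = sum_j H_{j, j+m}."""
    # blocks[j, :, k, :] is the (n_cols x n_cols) block H_{j, k}.
    blocks = H.reshape(n_rows, n_cols, n_rows, n_cols)
    sums = {}
    for m in range(n_rows):
        J_m = sum(blocks[j, :, j + m, :] for j in range(n_rows - m))
        for n in range(-(n_cols - 1), n_cols):
            sums[(m, n)] = np.trace(J_m, offset=n)
    return sums


def quadratic_form_from_sums(sums, psi1, psi2):
    """d^H H d for Hermitian H, evaluated from the diagonal sums only."""
    total = 0.0
    for (m, n), value in sums.items():
        term = value * np.exp(1j * (m * psi1 + n * psi2))
        # Blocks below the block diagonal contribute the complex conjugate.
        total += term.real if m == 0 else 2.0 * term.real
    return total


def steering_vector(psi1, psi2, n_rows, n_cols):
    """d with entry exp(j (psi1 a + psi2 b)) at index a * n_cols + b."""
    return np.kron(np.exp(1j * psi1 * np.arange(n_rows)),
                   np.exp(1j * psi2 * np.arange(n_cols)))
```

For example, $H = I_{A B} / (A B)$ satisfies all the constraints, and the quadratic form then equals 1 for any pair of phase steps.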

For an arbitrary $i$ and $l$ , the diagonal elements of $D_{i}^{H} C C^{H} D_{i}$ are constrained to be no larger than 1, i.e.,

d {(t_{1}, t_{2}, f_{l})}^{H} C C^{H} d (t_{1}, t_{2}, f_{l}) \leq d {(t_{1}, t_{2}, f_{l})}^{H} H d (t_{1}, t_{2}, f_{l}) .

(19)

Eq. (19) holds when $H - C C^{H} ≽ 0$. According to the Schur complement condition, this is equivalent to

\left[ \begin{array}{cc} H & C \\ C^{H} & I_{L} \end{array} \right] ≽ 0,

(20)

where $≽ 0$ denotes that the matrix is positive semidefinite.
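The Schur complement step can be verified on small random matrices. The sketch below, with illustrative function names, assembles the block matrix of Eq. (20) and compares its positive semidefiniteness with that of $H - C C^{H}$.

```python
import numpy as np


def is_psd(M, tol=1e-9):
    """True if the Hermitian part of M has no eigenvalue below -tol."""
    return np.linalg.eigvalsh((M + M.conj().T) / 2.0).min() >= -tol


def schur_block(H, C):
    """Assemble [[H, C], [C^H, I_L]] as in Eq. (20)."""
    n_freqs = C.shape[1]
    return np.block([[H, C], [C.conj().T, np.eye(n_freqs)]])


# Illustrative check: one H that dominates C C^H and one that does not.
rng = np.random.default_rng(2)
C = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))
H_ok = C @ C.conj().T + 0.1 * np.eye(5)
H_bad = C @ C.conj().T - 0.1 * np.eye(5)
```

With these matrices, the block built from `H_ok` is positive semidefinite and the one built from `H_bad` is not, matching the sign of $H - C C^{H}$ in each case.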

Overall, the primal minimization problem (Eq. (7)) is equivalent to the maximization of the dual function (Eq. (14)), which can be rewritten as

\begin{aligned} \{\hat{C}, \hat{H}\} = \arg \max_{C, H} \; & Re (tr (C^{H} P^{#})) - ε {‖ C ‖}_{F} \\ \text{s.t.} \; & \left[ \begin{array}{cc} H & C \\ C^{H} & I_{L} \end{array} \right] ≽ 0, \\ & H = \left[ \begin{array}{cccc} H_{0, 0} & H_{0, 1} & \dots & H_{0, A - 1} \\ H_{0, 1}^{H} & H_{1, 1} & \dots & H_{1, A - 1} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ H_{0, A - 1}^{H} & H_{1, A - 1}^{H} & \dots & H_{A - 1, A - 1} \end{array} \right], \\ & J_{m} = \sum_{j = 0}^{A - 1 - m} H_{j, j + m}, \; m = 0, 1, \dots, (A - 1), \\ & sum \{diag (J_{0}, 0)\} = 1, \; sum \{diag (J_{0}, n)\} = 0, \; n = 1, 2, \dots, (B - 1), \\ & sum \{diag (J_{m}, n)\} = 0, \; m = 1, 2, \dots, (A - 1), \; n = - (B - 1), \dots, - 1, 0, 1, \dots, (B - 1) . \end{aligned}

(21)

This semidefinite programming problem can be solved using the SDPT3 solver in the CVX toolbox.

2.4 DOA and strength estimation

Define the spatial spectral function as

y (θ, ϕ) = \sum_{l = 1}^{L} d {(t_{1}, t_{2}, f_{l})}^{H} {\hat{c}}_{l} {\hat{c}}_{l}^{H} d (t_{1}, t_{2}, f_{l}),

(22)

where ${\hat{c}}_{l}$ is the $l$th column of $\hat{C}$. According to the relationship between the primal problem and the dual problem, the peak positions of $y (θ, ϕ)$ correspond to the source DOAs. The number of peaks whose amplitudes are larger than $0.7 \max \{y (θ, ϕ)\}$ is denoted by $\hat{k}$. The locations of these $\hat{k}$ peaks are the estimated DOAs $\{{\hat{Ω}}_{S i} = ({\hat{θ}}_{S i}, {\hat{ϕ}}_{S i}) | i = 1, 2, \dots, \hat{k}\}$.
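How Eq. (22) separates true directions from aliases can be sketched without solving the dual problem. In the example below, $\hat{C}$ is replaced by a synthetic stand-in built from the steering vectors of a single source at $(50 °, 90 °)$; it is not a solution of Eq. (21), and it is used only to evaluate the spectrum. The function names, the sound speed of 343 m/s, and the 8 × 8 array with 0.035 m spacing are assumptions of this sketch.

```python
import numpy as np


def steering_vectors(theta_deg, phi_deg, freqs_hz, n_rows, n_cols, dx, dy, c=343.0):
    """Columns d(t1, t2, f_l), row index a * n_cols + b."""
    theta, phi = np.deg2rad(theta_deg), np.deg2rad(phi_deg)
    t1 = np.sin(theta) * np.cos(phi) * dx
    t2 = np.sin(theta) * np.sin(phi) * dy
    k = 2j * np.pi * np.asarray(freqs_hz, dtype=float) / c
    d1 = np.exp(np.outer(np.arange(n_rows), k * t1))
    d2 = np.exp(np.outer(np.arange(n_cols), k * t2))
    return np.einsum("al,bl->abl", d1, d2).reshape(n_rows * n_cols, -1)


def spatial_spectrum(C_hat, theta_deg, phi_deg, freqs_hz, n_rows, n_cols, dx, dy):
    """y(theta, phi) = sum_l |d_l^H c_hat_l|^2, as in Eq. (22)."""
    D = steering_vectors(theta_deg, phi_deg, freqs_hz, n_rows, n_cols, dx, dy)
    return float(np.sum(np.abs(np.sum(D.conj() * C_hat, axis=0)) ** 2))


freqs = [7000.0, 7500.0, 8000.0]
geom = dict(n_rows=8, n_cols=8, dx=0.035, dy=0.035)
D_true = steering_vectors(50.0, 90.0, freqs, **geom)

# Synthetic stand-ins for the dual solution (NOT obtained from Eq. (21)),
# scaled so that the spectrum equals 1 at the true direction.
C_multi = D_true / (64.0 * np.sqrt(len(freqs)))
C_single = D_true[:, :1] / 64.0

# Direction that aliases onto the source at 7000 Hz (shift of c / (f * dy)).
u_alias = np.sin(np.deg2rad(50.0)) - 343.0 / (7000.0 * 0.035)
theta_alias = float(np.rad2deg(np.arcsin(abs(u_alias))))
phi_alias = 270.0

y_single_alias = spatial_spectrum(C_single, theta_alias, phi_alias, freqs[:1], **geom)
y_multi_alias = spatial_spectrum(C_multi, theta_alias, phi_alias, freqs, **geom)
y_multi_true = spatial_spectrum(C_multi, 50.0, 90.0, freqs, **geom)
```

In this synthetic setup, the single-frequency spectrum is as high at the aliased direction as at the true one, whereas the three-frequency spectrum stays at 1 at the true direction and falls below the 0.7 threshold at the aliased direction.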

With the estimated DOAs, we obtain $t_{1 i} \equiv \sin {\hat{θ}}_{S i} \cos {\hat{ϕ}}_{S i} Δ x$ and $t_{2 i} \equiv \sin {\hat{θ}}_{S i} \sin {\hat{ϕ}}_{S i} Δ y$. Then, we construct the sensing matrix $A = [D (t_{11}, t_{21}, f), D (t_{12}, t_{22}, f), \dots, D (t_{1 \hat{k}}, t_{2 \hat{k}}, f)] \in C^{A B \times \hat{k} L}$ and quantify the strengths using the least-squares method,

\hat{S} = A^{+} P^{#},

(23)

where ${(\cdot)}^{+}$ denotes the pseudo-inverse.
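The least-squares step of Eq. (23) can be sketched as follows. The function names and the sound speed of 343 m/s are assumptions. Because the model expects $\hat{S}$ to consist of stacked diagonal blocks, the sketch keeps only the diagonal of each $L \times L$ block as the per-frequency strengths of one source.

```python
import numpy as np


def steering_matrix(theta_deg, phi_deg, freqs_hz, n_rows, n_cols, dx, dy, c=343.0):
    """D(t1, t2, f): one column per frequency, row index a * n_cols + b."""
    theta, phi = np.deg2rad(theta_deg), np.deg2rad(phi_deg)
    t1 = np.sin(theta) * np.cos(phi) * dx
    t2 = np.sin(theta) * np.sin(phi) * dy
    k = 2j * np.pi * np.asarray(freqs_hz, dtype=float) / c
    d1 = np.exp(np.outer(np.arange(n_rows), k * t1))
    d2 = np.exp(np.outer(np.arange(n_cols), k * t2))
    return np.einsum("al,bl->abl", d1, d2).reshape(n_rows * n_cols, -1)


def estimate_strengths(P_meas, doas_deg, freqs_hz, n_rows, n_cols, dx, dy):
    """Least-squares strengths (Eq. 23), returned as array[source, frequency]."""
    n_freqs = len(freqs_hz)
    A_sens = np.hstack([
        steering_matrix(th, ph, freqs_hz, n_rows, n_cols, dx, dy)
        for th, ph in doas_deg
    ])                                              # AB x (k * L)
    S_hat = np.linalg.pinv(A_sens) @ P_meas         # (k * L) x L
    # Keep the diagonal of each L x L block: strength of source i at f_l.
    return np.array([
        np.diag(S_hat[i * n_freqs:(i + 1) * n_freqs, :])
        for i in range(len(doas_deg))
    ])
```

With noise-free pressures and exact DOAs, the recovered strengths coincide with the true ones whenever the sensing matrix has full column rank.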

3. Simulation

A rectangular microphone array consisting of 8 rows and 8 columns (a total of 64 microphones) is used, with a spacing of $Δ x = Δ y = 0.035 m$ between adjacent microphones. The upper frequency limit for aliasing-free sampling is approximately 4900 Hz. With reference to the existing two-dimensional single-frequency grid-free compressive beamforming method in Ref.¹⁷, this section analyzes the performance of the proposed method when the source frequency is below 4900 Hz (satisfying the spatial Nyquist criterion and free of spatial aliasing) and when it exceeds 4900 Hz (spatial aliasing occurs). We conduct $D$ Monte Carlo trials, and $Δ Ω_{S i, d}$ denotes the DOA estimation error of the $i$th source in the $d$th trial,

Δ Ω_{S i, d} = \frac{180}{π} \arccos (\cos {\hat{θ}}_{S i, d} \cos θ_{S i, d} + \cos ({\hat{ϕ}}_{S i, d} - ϕ_{S i, d}) \sin {\hat{θ}}_{S i, d} \sin θ_{S i, d}) .

(24)

We also define the probability of accurate identification (P), the DOA estimation root mean square error (RMSE), and the strength quantification mean error (ME) as

P = \frac{size ([Δ Ω_{S i, d} | Δ Ω_{S i, d} \leq 5 °])}{I D},

(25)

RMSE = \frac{{‖ [Δ Ω_{S i, d} | Δ Ω_{S i, d} \leq 5 °] ‖}_{2}}{\sqrt{size ([Δ Ω_{S i, d} | Δ Ω_{S i, d} \leq 5 °])}},

(26)

ME = \frac{\sum | [20 \log_{10} (| {\hat{s}}_{i, d} | / | s_{i, d} |) | Δ Ω_{S i, d} \leq 5 °] |}{size ([Δ Ω_{S i, d} | Δ Ω_{S i, d} \leq 5 °])},

(27)

where ${‖ \cdot ‖}_{2}$ denotes the $l_{2}$ norm, $size (\cdot)$ represents the size of the vector in parentheses, and $[Δ Ω_{S i, d} | Δ Ω_{S i, d} \leq 5 °]$ denotes a column vector consisting of all $Δ Ω_{S i, d}$ that satisfy the condition after the vertical bar.
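The evaluation metrics of Eqs. (24) to (27) can be written compactly as below. This is a minimal sketch with illustrative function names; it assumes that every estimated source has already been paired with its true source, and it returns NaN for RMSE and ME when no estimate falls within the 5° tolerance.

```python
import numpy as np


def angular_error_deg(theta_hat, phi_hat, theta, phi):
    """Angle in degrees between estimated and true DOA (Eq. 24)."""
    th_h, ph_h, th, ph = np.deg2rad([theta_hat, phi_hat, theta, phi])
    cos_angle = (np.cos(th_h) * np.cos(th)
                 + np.cos(ph_h - ph) * np.sin(th_h) * np.sin(th))
    # Clip to guard against round-off slightly outside [-1, 1].
    return np.rad2deg(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def identification_metrics(errors_deg, s_hat, s_true, n_sources, n_trials,
                           max_error_deg=5.0):
    """Return (P, RMSE, ME) following Eqs. (25)-(27)."""
    errors = np.asarray(errors_deg, dtype=float).ravel()
    ok = errors <= max_error_deg
    count = int(ok.sum())
    p = count / (n_sources * n_trials)
    if count == 0:
        return p, float("nan"), float("nan")
    rmse = np.linalg.norm(errors[ok]) / np.sqrt(count)
    level_err = 20.0 * np.log10(np.abs(np.asarray(s_hat).ravel()[ok])
                                / np.abs(np.asarray(s_true).ravel()[ok]))
    me = np.abs(level_err).sum() / count
    return p, rmse, me
```

The three inputs `errors_deg`, `s_hat`, and `s_true` are expected to list the same source and trial pairs in the same order.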

3.1 Aliasing-free case

Assume three sources that radiate sound waves at a frequency of 3000 Hz. The DOAs ( $Ω_{S i} = (θ_{S i}, ϕ_{S i})$ ) are $(30 °, 200 °)$ , $(50 °, 90 °)$ , and $(80 °, 150 °)$ . The strengths are 100 dB, 94 dB, and 97 dB (referenced to $2.0 \times 1 0^{- 5} Pa$ ). When simulating the pressures measured by the microphone array, white Gaussian noise with an SNR of 20 dB is added to the clean signal. The proposed method and the method in Ref.¹⁷ are used to process the pressures at 3000 Hz. Figure 2(a) shows the spatial spectral function of the proposed method. Figures 2(b) and 2(c) present the source identification results. The spatial spectral function exhibits peaks with nearly unit amplitudes at locations corresponding to the true source DOAs. Both methods accurately localize the sources. For the proposed method and the method in Ref.¹⁷, the RMSEs are $0.59 °$ and $0.84 °$ and the MEs are 0.12 dB and 1.49 dB, respectively. The two methods achieve comparable performance in DOA estimation, but the proposed method performs slightly better in strength quantification.

Figure 2.

Identification results of the proposed method (single-frequency modelling) and the method in Ref.¹⁷ under aliasing-free condition.

We vary SNR and perform Monte Carlo trials. The frequency is set to 3000 Hz. The SNR ranges from 5 dB to 30 dB with a step of 5 dB. At each SNR, 1000 trials are considered. In each trial, four sources are assumed, whose DOAs are randomly generated while satisfying the minimum separation condition.¹⁸ The strengths are set to unit amplitude with random phases. Figure 3 shows P, RMSE, ME, and the average runtime for the two methods. As shown in Figure 3(a), the proposed method achieves a higher probability of accurate identification than the method in Ref.¹⁷ across different SNRs. From Figure 3(b) and 3(c), it can be observed that both methods yield comparable DOA estimation accuracy, whereas the proposed method significantly outperforms the method in Ref.¹⁷ in terms of strength quantification. The advantage becomes more pronounced as the SNR decreases. At SNRs of 10 dB and 5 dB, the strength quantification error is reduced by over 80% and 90%, respectively. In Figure 3(d), the runtimes of the two methods are nearly identical, since they operate on matrices of the same dimension and employ the same solver.

Figure 3.

Performance comparison between the proposed method (single-frequency modelling) and the method in Ref.¹⁷ under aliasing-free condition at different SNRs.

3.2 Spatial aliasing case

In the case described in Section 3.1, the source frequency is changed to 7000 Hz, while all other conditions remain unchanged. The results are shown in Figure 4. Figure 4(a) presents the spatial spectral function, which, due to spatial aliasing, produces high-amplitude peaks at locations where there is no source. The imaging maps are shown in Figures 4(b) and 4(c). The proposed method identifies some spurious sources, whereas the method in Ref.¹⁷ localizes only some of the sources. The presence of these spurious sources adversely affects the subsequent strength quantification.

Figure 4.

Identification results of the proposed method (single-frequency modelling) and the method in Ref.¹⁷ under spatial aliasing.

Assume that the three sources contain spectral components not only at 7000 Hz but also at 7500 Hz and 8000 Hz. The source strengths at these frequencies are [100, 94, 94] dB, [94, 100, 94] dB, and [97, 97, 94] dB (referenced to $2.0 \times 1 0^{- 5} Pa$ ). Other simulation conditions remain unchanged. The proposed method is used to process the pressures at the three frequencies synchronously. Figures 5(a) and 5(b) present the spatial spectral function and the imaging map, respectively. Compared with Figures 4(a) and 4(b), the spurious peaks are effectively suppressed, and the sources are accurately identified.

Figure 5.

Identification results of the proposed method under spatial aliasing with multi-frequency synchronous modeling.

An SNR estimate is required when solving the dual problem with the CVX toolbox. To investigate its influence on the performance of the proposed method, we set the estimated SNR to 10 dB (underestimation) and 30 dB (overestimation), while the true SNR is 20 dB, and examine the performance based on the case in Figure 5. The results under inaccurate SNR estimation are shown in Figure 6. Comparing Figure 5(b) and Figure 6, we observe that underestimating the SNR has a negligible effect on both DOA estimation and strength quantification, whereas overestimating it slightly increases the DOA estimation error, with minimal impact on strength quantification.

Figure 6.

Identification results of the proposed method under spatial aliasing with multi-frequency synchronous modeling and inaccurate SNR estimation.

We vary the frequency and perform Monte Carlo trials. The fundamental frequencies are set to 5000 Hz, 6000 Hz, 7000 Hz, 8000 Hz, and 9000 Hz, respectively. For each fundamental frequency, four additional frequency components are defined, each increasing by 100 Hz from the previous one (e.g., when the fundamental frequency is 5000 Hz, the frequency components are [5000, 5100, 5200, 5300, 5400] Hz). The method in Ref.¹⁷ only processes the pressures at the fundamental frequencies, whereas the proposed method utilizes both the fundamental and the additional frequency components. At each fundamental frequency, 1000 trials are conducted with four sources. DOAs are randomly generated while satisfying the minimum separation condition. Each source is assigned a unit amplitude with a random phase, and white Gaussian noise is added to achieve an SNR of 20 dB. Figure 7 shows P, RMSE, ME, and the average runtime for the two methods. As the frequency increases, the probability of accurate identification of the single-frequency method in Ref.¹⁷ drops sharply due to spatial aliasing. In contrast, the proposed method maintains a high probability and accuracy in source identification. A slight performance degradation is observed for the proposed method at higher frequencies, which may be attributed to the reduced relative frequency spacing that causes overlapping peaks at aliasing positions. Since multi-frequency synchronous processing increases the dimensionality of the matrices, the runtime rises correspondingly.

Figure 7.

Performance comparison between the proposed method (multi-frequency synchronous modelling) and the method in Ref.¹⁷ under spatial aliasing at different frequencies.

4. Experiments

To validate the effectiveness of the proposed method and the correctness of the simulation results, a loudspeaker source identification experiment is conducted in a semi-anechoic chamber using a $4 \times 4$ rectangular array. The experimental setup is shown in Figure 8. The array is placed at a height of 1.1 m above the ground, and Type 4958 microphones from Brüel & Kjær are used. The spacing between two adjacent microphones is $Δ x = Δ y = 0.07 m$ , so the upper frequency limit of this array is approximately 2450 Hz. A Cartesian coordinate system is established with the $(0, 0)$th microphone as the origin. A loudspeaker driven by white noise is placed at (−2.24, 0, 5) m. The DOAs of the loudspeaker source and its ground-reflected image source are about $(24.1 °, 180 °)$ and $(32.1 °, 224.5 °)$ , respectively. A Brüel & Kjær PULSE Type 3660C data acquisition system is used to simultaneously record the signals from all microphones at a sampling rate of 16384 Hz. When performing the FFT, a Hanning window is used, and the frequency resolution is set to 1 Hz.
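The quoted DOAs can be reproduced from the source position with a few lines of geometry. The sketch below assumes that the array lies in the x-y plane with the z axis as its normal, and that the ground is the plane y = −1.1 m; the latter is inferred from the stated array height and is not given explicitly in the text. The function names are illustrative.

```python
import math


def doa_from_position(x, y, z):
    """(elevation, azimuth) in degrees of a point seen from the origin.

    Elevation is measured from the z axis (assumed array normal) and
    azimuth from the x axis within the x-y plane.
    """
    theta = math.degrees(math.atan2(math.hypot(x, y), z))
    phi = math.degrees(math.atan2(y, x)) % 360.0
    return theta, phi


def ground_image(x, y, z, ground_y):
    """Mirror a source about the plane y = ground_y."""
    return x, 2.0 * ground_y - y, z


source = (-2.24, 0.0, 5.0)
image = ground_image(*source, ground_y=-1.1)   # assumed ground plane
source_doa = doa_from_position(*source)
image_doa = doa_from_position(*image)
```

Under these assumptions, the two results agree with the DOAs of about (24.1°, 180°) and (32.1°, 224.5°) stated above to within 0.1°.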

Figure 8.

Experimental layout.

Figure 9(a) shows the spatial spectral function of the proposed method when processing the pressures at 6500 Hz and 7000 Hz. It can be observed that many spurious peaks caused by spatial aliasing are effectively suppressed. Figure 9(b) and 9(c) present the imaging results of the proposed method and the method in Ref.¹⁷ at 6500 Hz. The method in Ref.¹⁷ regards the positions where spatial aliasing occurs as sources. The proposed method successfully locates the sources close to their true positions, though there is a weak spurious source within the dynamic range. The experiment draws a similar conclusion to the simulations. In summary, the proposed two-dimensional multi-frequency synchronous grid-free CB method effectively suppresses spatial aliasing and accurately identifies sources whose wavelengths are shorter than twice the minimum microphone spacing.

Figure 9.

Experimental results of the proposed method (multi-frequency synchronous modelling) and the method in Ref.¹⁷ under spatial aliasing.

5. Conclusion

To overcome the high-frequency spatial aliasing problem of the two-dimensional single-frequency grid-free CB method, this paper proposes a two-dimensional multi-frequency synchronous grid-free CB method. The approach formulates a two-dimensional multi-frequency synchronous atomic norm minimization problem, converts it into a dual problem for solving, and finally extracts DOAs and strengths by using a spatial spectral function related to the dual polynomial. Simulations and experiments demonstrate that when the spatial Nyquist criterion is satisfied, both the proposed method and the existing single-frequency method can achieve accurate source identification, while the proposed method reduces the strength quantification error by over 80% under strong noise interference. When the spatial Nyquist criterion is violated, the proposed method effectively suppresses spatial aliasing and accurately identifies sources whose wavelengths are shorter than twice the minimum microphone spacing. The dual problem is currently solved using the CVX toolbox, which suffers from low computational efficiency. Future work will focus on developing a more efficient solver to reduce the computational cost, as well as extending the method to non-uniform array geometries.

Footnotes

ORCID iDs

Yang Yang

Shijia Yin

Ruixue Ma

Tongrui Peng

Linbang Shen

Zhigang Chu

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Natural Science Foundation of China, grant number 12304519, the Natural Science Foundation of Chongqing, grant number CSTB2023NSCQ-MSX0548, and the New Chongqing Youth Innovation Talent Project, grant number CSTB2024NSCQ-QCXMX0068.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Edelmann, Gaumond. Beamforming using compressive sensing. Journal of the Acoustical Society of America 2011; 130(41): 232–237. https://doi.org/10.1121/1.3632046

2. Xenaki, Gerstoft, Mosegaard. Compressive beamforming. Journal of the Acoustical Society of America 2014; 136(1): 260–271. https://doi.org/10.1121/1.4883360

3. Zhong, Wei, Huang. Compressive sensing beamforming based on covariance for acoustic imaging with noisy measurements. Journal of the Acoustical Society of America 2013; 134(5): 445–451. https://doi.org/10.1121/1.4824630

4. Wei, Huang. Compressive sensing based beamforming and its application in aeroacoustic experiment. 20th AIAA/CEAS Aeroacoustics Conference. Atlanta, Georgia, USA, 20 June 2014. AIAA-2014-2918.

5. Huang. Reconstruction of aircraft engine noise source using beamforming and compressive sensing. IEEE Access 2018; 6: 11716–11726. https://doi.org/10.1109/access.2018.2801260

6. Han, Qun, Feng, et al. Application of microphone array using compressed sensing algorithm for abnormal sound localization in aircraft strength test. 2023 IEEE 6th International Conference on Information Communication and Signal Processing (ICICSP). Xi’an, China, 23–25 September 2023, pp. 1–6.

7. Meng, Masiero, et al. Signal reconstruction of fast moving sound sources using compressive beamforming. Applied Acoustics 2019; 150: 236–245. https://doi.org/10.1016/j.apacoust.2019.02.012

8. Wang, Yang, Wang, et al. Time-domain signal reconstruction of vehicle interior noise based on deep learning and compressed sensing techniques. Mechanical Systems and Signal Processing 2020; 139: 106635. https://doi.org/10.1016/j.ymssp.2020.106635

9. Huang, Zhang. High-resolution acoustical imaging for rotating acoustic source based on compressive sensing beamforming. 25th AIAA/CEAS Aeroacoustics Conference. Delft, Netherlands, 20–23 May 2019. AIAA-2019-2410.

10. Sun, Wang, Yang, et al. Damage identification of wind turbine blades using an adaptive method for compressive beamforming based on the generalized minimax-concave penalty function. Renewable Energy 2022; 181: 59–70. https://doi.org/10.1016/j.renene.2021.09.024

11. Sun, Wang, Yang, et al. Damage identification of wind turbine blades using the microphone array under different parametric and measuring conditions: A prototype study with laboratory-scale models. Structural Health Monitoring 2023; 22(1): 201–215. https://doi.org/10.1177/14759217221085655

12. Yang, Chu. Reviews on high-performance beamforming method. Journal of Mechanical Engineering 2021; 57(24): 166–183 [In Chinese].

13. Park, Seong, Gerstoft. Block-sparse two-dimensional off-grid beamforming with arbitrary planar array geometry. Journal of the Acoustical Society of America 2020; 147(4): 2184–2191. https://doi.org/10.1121/10.0000983

14. Huang, Zoubir. Off-grid direction-of-arrival estimation using second-order Taylor approximation. Signal Processing 2022; 196: 108513. https://doi.org/10.1016/j.sigpro.2022.108513

15. Wang, Zhuang, Zhang, et al. Weighted block l1 norm induced 2-D off-grid compressive beamforming for acoustic source localization: Methodology and applications. Applied Acoustics 2023; 214: 10967.

16. Yang, Chu, Yang, et al. Two-dimensional Newtonized orthogonal matching pursuit compressive beamforming. Journal of the Acoustical Society of America 2020; 148(3): 1337–1348. https://doi.org/10.1121/10.0001919

17. Yang, Chu, et al. Two-dimensional grid-free compressive beamforming. Journal of the Acoustical Society of America 2017; 142(2): 618–629. https://doi.org/10.1121/1.4996460

18. Yang, Chu, Ping, et al. Resolution enhancement of two-dimensional grid-free compressive beamforming. Journal of the Acoustical Society of America 2018; 143(6): 3860–3872. https://doi.org/10.1121/1.5042239

19. Yang, Chu, Ping. Alternating direction method of multipliers for weighted atomic norm minimization in two-dimensional grid-free compressive beamforming. Journal of the Acoustical Society of America 2018; 144(5): 361–366. https://doi.org/10.1121/1.5066345

20. Yang, Chu, Ping. Two-dimensional multiple-snapshot grid-free compressive beamforming. Mechanical Systems and Signal Processing 2019; 124: 524–540. https://doi.org/10.1016/j.ymssp.2019.02.011

21. Yang, Chu. Two-dimensional multiple-snapshot grid-free compressive beamforming using alternating direction method of multipliers. Shock and Vibration 2020; 2020: 1310805–1310811. https://doi.org/10.1155/2020/1310805

22. Tang, Jiang, Pang. Grid-free DOD and DOA estimation for MIMO radar via duality-based 2-D atomic norm minimization. IEEE Access 2019; 7: 60827–60836. https://doi.org/10.1109/access.2019.2915189

23. Liu, Chu, Yang. Iterative Vandermonde decomposition and shrinkage-thresholding based two-dimensional grid-free compressive beamforming. Journal of the Acoustical Society of America 2020; 148(3): 301–306. https://doi.org/10.1121/10.0002029

24. Yang, Xie, Stoica. Vandermonde decomposition of multilevel Toeplitz matrices with application to multidimensional super-resolution. IEEE Transactions on Information Theory 2016; 62(6): 3685–3701. https://doi.org/10.1109/tit.2016.2553041

25. Hyder, Mahata. Direction-of-Arrival estimation using a mixed ℓ2,0 norm approximation. IEEE Transactions on Signal Processing 2010; 58(9): 4646–4655. https://doi.org/10.1109/tsp.2010.2050477

26. Reddy, Khong AWH. Unambiguous speech DOA estimation under spatial aliasing conditions. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2014; 22(12): 2133–2145. https://doi.org/10.1109/taslp.2014.2344856

27. Tang, Blacquiere, Leus, et al. Aliasing-free wideband beamforming using sparse signal representation. IEEE Transactions on Signal Processing 2011; 59(7): 3464–3469. https://doi.org/10.1109/tsp.2011.2140108

28. Ang, Nguyen, Gan. Multiband grid-free compressive beamforming. Mechanical Systems and Signal Processing 2020; 135: 106425. https://doi.org/10.1016/j.ymssp.2019.106425

29. Wakin, Gerstoft. Gridless DOA estimation with multiple frequencies. IEEE Transactions on Signal Processing 2023; 71: 417–432. https://doi.org/10.1109/tsp.2023.3244091

30. Wakin, Gerstoft. Non-uniform array and frequency spacing for regularization-free gridless DOA. IEEE Transactions on Signal Processing 2024; 72: 2006–2020. https://doi.org/10.1109/tsp.2024.3386018