Tensor-based matched-field processing applied to the SWellEx-96 data
Fangwei Zhu1,2, Guangying Zheng1,2,3✉, Xiaowei Guo1,2,3, Fangyong Wang1,2,3, Shuanping Du1,2,3, and Linlang Bai1,2,3
1 Science and Technology on Sonar Laboratory, Hangzhou, 310023, China
2 Hangzhou Applied Acoustics Research Institute, Hangzhou, 310023, China
3 Hanjiang National Laboratory, Wuhan, 430000, China
Email: 276454158@qq.com.
This study proposes a matched-field source localization method based on tensor decomposition. Exploiting the advantages of tensors in multidimensional data processing, a third-order space-time-frequency tensor signal model is constructed, and the signal subspace is estimated using higher-order singular value decomposition (HOSVD). The source position is estimated by matching the signal subspace of the measured data tensor with that of the replica field tensor. The S5 event data of SWellEx-96 are processed with the proposed tensor-based matched-field processing (TMFP). Comparison with conventional matched-field processing (MFP) shows that TMFP suppresses ambient noise more effectively at low signal-to-noise ratio (SNR) and achieves better source localization performance.
Introduction: Acoustic source localization in an ocean waveguide is a subject of great interest, with applications ranging from marine biology to anti-submarine warfare [1]. Matched field processing (MFP) [2-3] is the earliest and best-known method for localizing an underwater source. MFP estimates the source position by matching the measured acoustic data against a dictionary of replica fields. Because MFP uses prior information about the acoustic field, it performs better than traditional geometric localization methods. However, MFP is sensitive to mismatch in the sound speed profile and geoacoustic parameters [4-6].
Recently, tensor signal processing has attracted considerable interest. Tensor algebra, also known as multilinear algebra, is a natural extension of classical linear algebra to higher dimensions. It characterizes the linear relationships among multiple variables in multidimensional data, and it has been studied in depth and widely used in fields such as image processing, chemistry, and artificial intelligence [7-10]. Beyond providing a representation for multidimensional data, tensors come with a rich theory, the most important part of which is the development of effective decomposition methods under multilinear theory, such as the Tucker decomposition and higher-order singular value decomposition (HOSVD) [9]. As the name suggests, HOSVD extends singular value decomposition (SVD) to multilinear algebra. Compared with SVD, HOSVD can better suppress the noise in multidimensional data samples, reducing the subspace estimation bias. Given this advantage, HOSVD has been widely used in array signal processing in the last decade [11-13], especially for polarization sensor arrays and acoustic vector sensor arrays.
In a broad sense, MFP can also be considered a beamformer. Thus, this study draws on tensor-decomposition-based beamforming and applies the HOSVD of the data tensor to matched field processing to improve the source localization performance.
Construction of the tensor signal model: A vertical line array with $N$ elements is used to receive the broadband sound field; the element, frequency, and snapshot dimensions can be arranged into a third-order tensor (a snapshot is a sampling of all the elements in the time domain). The third-order received data tensor $\mathcal{X}$ can be expressed as the sum of the 3-mode product of the array Green’s function tensor $\mathcal{G}$ with the multi-snapshot source matrix $\mathbf{S}$, and the ambient noise tensor $\mathcal{W}$, as shown in Fig. 1.
Fig. 1 The diagram of the tensor signal model.
$\mathcal{X} = \mathcal{G} \times_3 \mathbf{S} + \mathcal{W}$, (1)
where $\mathcal{X}, \mathcal{W} \in \mathbb{C}^{N \times M \times L}$, $\mathcal{G} \in \mathbb{C}^{N \times M \times K}$, and $\mathbf{S} \in \mathbb{C}^{L \times K}$; $N$ denotes the number of array elements, $M$ denotes the number of frequency points, $K$ denotes the number of sources, and $L$ denotes the number of snapshots. The element $g_{n,m,k}$ of the array Green’s function tensor denotes the sound field Green’s function from the $n$-th element to the $k$-th source at the $m$-th frequency point.
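The structure of the signal model can be illustrated numerically. The following is a minimal Python sketch, assuming NumPy and random synthetic values in place of real Green’s functions and source spectra. The function name `build_data_tensor`, the array names, and the complex Gaussian noise are illustrative assumptions, not part of the original method. The Green’s function tensor (elements × frequencies × sources) is contracted with the source matrix (snapshots × sources) over the source index, and noise is added.

```python
import numpy as np

def build_data_tensor(G, S, noise_std=0.0, rng=None):
    """Return X[n, m, l] = sum_k G[n, m, k] * S[l, k] + noise.

    G : (N, M, K) Green's function tensor (elements x frequencies x sources)
    S : (L, K) source amplitudes (snapshots x sources)
    """
    rng = np.random.default_rng() if rng is None else rng
    # 3-mode product: contract the source index k of G with S.
    X = np.einsum("nmk,lk->nml", G, S)
    # Assumption: circular complex Gaussian ambient noise.
    noise = rng.standard_normal(X.shape) + 1j * rng.standard_normal(X.shape)
    return X + noise_std * noise / np.sqrt(2)

# Synthetic example with the dimensions used later for SWellEx-96.
rng = np.random.default_rng(0)
N, M, K, L = 21, 3, 1, 9
G = rng.standard_normal((N, M, K)) + 1j * rng.standard_normal((N, M, K))
S = rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))
X = build_data_tensor(G, S, noise_std=0.1, rng=rng)
print(X.shape)  # (21, 3, 9)
```

With a single source, each snapshot slice of the noise-free tensor is the same element-by-frequency Green’s function matrix scaled by that snapshot’s source amplitude.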
Derivation for tensor-based matched field processing (TMFP): The multidimensional space-time-frequency signals received by the array are arranged into a tensor. HOSVD is applied to this tensor to obtain the tensor signal subspace, and the position of the sound source is estimated with a matching estimator. Because applying singular value decomposition (SVD) to the expansion of the tensor along each dimension further suppresses the noise and yields a more accurate signal subspace, the source localization accuracy is improved.
Applying HOSVD [9] to the third-order data tensor $\mathcal{X}$ gives
$\mathcal{X} = \mathcal{C} \times_1 \mathbf{U}_1 \times_2 \mathbf{U}_2 \times_3 \mathbf{U}_3$, (2)
where $\mathcal{C}$ is the kernel tensor of $\mathcal{X}$, and $\times_1$, $\times_2$, and $\times_3$ are the tensor products along the three dimensions, respectively.
The factor matrices are obtained by applying SVD to each of the three mode expansion matrices of the tensor:
$\mathbf{X}_{(1)} = \mathbf{U}_1 \boldsymbol{\Sigma}_1 \mathbf{V}_1^{H}$, (3)
$\mathbf{X}_{(2)} = \mathbf{U}_2 \boldsymbol{\Sigma}_2 \mathbf{V}_2^{H}$, (4)
$\mathbf{X}_{(3)} = \mathbf{U}_3 \boldsymbol{\Sigma}_3 \mathbf{V}_3^{H}$, (5)
where, for $n = 1, 2, 3$, $\mathbf{X}_{(n)}$ is the $n$-mode expansion of the tensor, and $\mathbf{U}_n$, $\boldsymbol{\Sigma}_n$, and $\mathbf{V}_n$ are its left singular matrix, singular value matrix, and right singular matrix, respectively.
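The mode expansions and their SVDs can be sketched in a few lines. The following Python example assumes NumPy; the helper names `unfold` and `mode_svds` are illustrative. Different texts order the columns of a mode expansion differently, but a column permutation does not change the left singular vectors or the singular values, so any consistent ordering serves here.

```python
import numpy as np

def unfold(T, mode):
    """Mode expansion: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_svds(T):
    """SVD of every mode expansion; returns a list of (U, s, Vh) triples."""
    return [np.linalg.svd(unfold(T, n), full_matrices=False)
            for n in range(T.ndim)]

# Synthetic 21 x 3 x 9 complex tensor (elements x frequencies x snapshots).
rng = np.random.default_rng(0)
X = rng.standard_normal((21, 3, 9)) + 1j * rng.standard_normal((21, 3, 9))
for n, (U, s, Vh) in enumerate(mode_svds(X), start=1):
    print(f"mode {n}: U {U.shape}, singular values {s.shape}")
```

For a 21×3×9 tensor the three expansions have sizes 21×27, 3×63, and 9×63, so the left singular matrices are 21×21, 3×3, and 9×9.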
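By truncating the left singular matrix $\mathbf{U}_n$ of the $n$-mode expansion of the data tensor $\mathcal{X}$ ($n = 1, 2, 3$), the signal correlation matrix is composed of the column vectors of the left singular matrices corresponding to the top M largest singular values, that is, $\mathbf{U}_{1,\mathrm{s}}$, $\mathbf{U}_{2,\mathrm{s}}$, and $\mathbf{U}_{3,\mathrm{s}}$. The noise correlation matrix is composed of the column vectors of the left singular matrices corresponding to the remaining smaller singular values, that is, $\mathbf{U}_{1,\mathrm{n}}$, $\mathbf{U}_{2,\mathrm{n}}$, and $\mathbf{U}_{3,\mathrm{n}}$.
The kernel tensor $\mathcal{C}$ can be written as follows:
$\mathcal{C} = \mathcal{X} \times_1 \mathbf{U}_1^{H} \times_2 \mathbf{U}_2^{H} \times_3 \mathbf{U}_3^{H}$. (6)
The truncated kernel tensor can be expressed as follows:
$\mathcal{C}_{\mathrm{s}} = \mathcal{X} \times_1 \mathbf{U}_{1,\mathrm{s}}^{H} \times_2 \mathbf{U}_{2,\mathrm{s}}^{H} \times_3 \mathbf{U}_{3,\mathrm{s}}^{H}$, (7)
where $\mathcal{C}_{\mathrm{s}}$ is the kernel tensor of the truncated signal.
The third-order tensor output data can be approximately expressed as follows:
$\mathcal{X} \approx \mathcal{C}_{\mathrm{s}} \times_1 \mathbf{U}_{1,\mathrm{s}} \times_2 \mathbf{U}_{2,\mathrm{s}} \times_3 \mathbf{U}_{3,\mathrm{s}}$. (8)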
Thus, the tensor signal subspace of the third-order tensor output data can be obtained, as follows:
. (9)
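The truncation and reconstruction steps can be sketched as follows. This is a minimal Python illustration of a generic truncated HOSVD, assuming NumPy; the function names and the `ranks` argument (how many leading singular vectors to keep per dimension) are illustrative choices, and the sketch is not claimed to reproduce the exact subspace definition of Eq. (9).

```python
import numpy as np

def unfold(T, mode):
    """Mode expansion: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, A, mode):
    """Mode product of tensor T with matrix A (A has T.shape[mode] columns)."""
    out = np.tensordot(A, T, axes=(1, mode))  # contracted axis lands at 0
    return np.moveaxis(out, 0, mode)

def truncated_hosvd(X, ranks):
    """Return (truncated kernel tensor, signal factor matrices, approximation)."""
    factors = []
    for n, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, n), full_matrices=False)
        factors.append(U[:, :r])              # leading r left singular vectors
    core = X
    for n, U in enumerate(factors):
        core = mode_product(core, U.conj().T, n)   # project onto signal part
    X_hat = core
    for n, U in enumerate(factors):
        X_hat = mode_product(X_hat, U, n)          # map back to data space
    return core, factors, X_hat

# Rank-one signal plus noise: truncation should reduce the noise level.
rng = np.random.default_rng(1)
def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

signal = np.einsum("i,j,k->ijk", crandn(21), crandn(3), crandn(9))
noisy = signal + 0.5 * crandn(21, 3, 9)
core, factors, denoised = truncated_hosvd(noisy, ranks=(1, 1, 1))
print(np.linalg.norm(noisy - signal), np.linalg.norm(denoised - signal))
```

Keeping all singular vectors in every dimension reproduces the input exactly, while keeping only the leading ones discards the components attributed to noise.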
The ambiguity surface for source localization can be defined based on higher-order singular value decomposition and inner tensor product. For the hypothesized source range and depth , the ambiguity surface for source localization is defined as follows:
, (10)
where represents the inner product of the tensor in the third dimension, represents the normalized Green’s function matrix, and represents the inner product of tensor and Green’s function matrix in the first and second dimensions.
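The matching step can be sketched as follows. This is one plausible reading of the description above, not necessarily the exact estimator of Eq. (10): each normalized replica matrix (elements × frequencies) is contracted with the signal-subspace tensor over the first and second dimensions, and the squared magnitudes of the result are summed over the third (snapshot) dimension. The function name `tmfp_ambiguity`, the array layout, and the normalization by the tensor energy are assumptions made for this Python (NumPy) example.

```python
import numpy as np

def tmfp_ambiguity(Xs, replicas):
    """Hypothetical tensor matching estimator.

    Xs       : (N, M, L) signal-subspace (denoised) data tensor
    replicas : (P, N, M) Green's function matrices for P candidate positions
    Returns a length-P array with values in [0, 1].
    """
    # Normalize every candidate replica matrix to unit Frobenius norm.
    norms = np.linalg.norm(replicas, axis=(1, 2), keepdims=True)
    G = replicas / norms
    # Inner product over the element and frequency dimensions.
    y = np.einsum("pnm,nml->pl", G.conj(), Xs)
    # Inner product over the snapshot dimension, normalized by total energy.
    return np.sum(np.abs(y) ** 2, axis=1) / np.linalg.norm(Xs) ** 2

# Synthetic check: the true replica should give the largest output.
rng = np.random.default_rng(2)
def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

G_true = crandn(21, 3)
Xs = np.einsum("nm,l->nml", G_true, crandn(9))   # noise-free rank-one tensor
candidates = np.stack([crandn(21, 3), G_true, crandn(21, 3)])
surface = tmfp_ambiguity(Xs, candidates)
print(int(np.argmax(surface)))  # 1
```

In practice the candidates would be indexed by a grid of hypothesized ranges and depths, and the location of the maximum would be taken as the position estimate.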
To compare the performance of TMFP and MFP, two conventional processors are used: the normalized broadband MFP I [15], which is based on summation of the Bartlett processor output, and the broadband MFP II [16], which results from maximum likelihood estimation. They are given by
, (11)
, (12)
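where $\mathbf{w}(f_m; r, z)$ is the Green’s function vector of the receiving array corresponding to the $m$-th frequency point under the hypothesized source range $r$ and depth $z$; $M$ is the number of frequency points; $\mathrm{tr}(\cdot)$ is the trace operation; and $\mathbf{R}(f_m)$ is the cross-spectral density matrix corresponding to the $m$-th frequency point, which is expressed as follows:
$\mathbf{R}(f_m) = \frac{1}{L} \sum_{l=1}^{L} \mathbf{x}_l(f_m) \mathbf{x}_l^{H}(f_m)$, (13)
where $\mathbf{x}_l(f_m)$ is the $l$-th snapshot data corresponding to the $m$-th frequency point, and $L$ is the number of snapshots.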
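For comparison, the conventional processing chain can be sketched in Python (NumPy assumed). The sample cross-spectral density matrix is the average of snapshot outer products, and the Bartlett output is the replica-weighted quadratic form. The particular normalization used here (unit-norm replica, division by the trace, and a plain average over frequency) is an assumption chosen so that the output lies in [0, 1]; it may differ in detail from Eqs. (11) and (12), and the maximum likelihood processor MFP II is not sketched.

```python
import numpy as np

def csdm(snapshots):
    """Sample cross-spectral density matrix from an (N, L) snapshot array."""
    L = snapshots.shape[1]
    return snapshots @ snapshots.conj().T / L

def bartlett_power(w, R):
    """Normalized Bartlett output w^H R w / (|w|^2 tr R), between 0 and 1."""
    w = w / np.linalg.norm(w)
    return np.real(w.conj() @ R @ w) / np.real(np.trace(R))

def broadband_bartlett(replicas, data):
    """Average the normalized Bartlett output over frequency.

    replicas : (M, N) replica vectors, one per frequency point
    data     : (M, N, L) snapshots, one (N, L) block per frequency point
    """
    return float(np.mean([bartlett_power(w, csdm(x))
                          for w, x in zip(replicas, data)]))

# Synthetic check with a perfectly matched, noise-free field.
rng = np.random.default_rng(3)
def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

replicas = crandn(3, 21)                      # 3 frequencies, 21 elements
amps = crandn(3, 9)                           # source amplitude per snapshot
data = np.einsum("mn,ml->mnl", replicas, amps)
print(broadband_bartlett(replicas, data))     # close to 1 for a perfect match
```

A mismatched replica gives a value below 1, so scanning the replicas over a range-depth grid yields an ambiguity surface of the conventional kind.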
SWellEx-96 data results: The SWellEx-96 experiment was conducted near San Diego, CA, in the spring of 1996. A vertical line array (VLA) was used to record the acoustic field. The VLA contained 21 hydrophones spanning depths from 94 m to 212 m in the water column, spaced at 5.6 m.
In this section, we analyze the first 55 min of data recorded by the VLA after the start of event S5, in which two towed sources travelled along an isobath of a mildly sloping environment toward the VLA, with the range decreasing from 8.6 km to 1.0 km. The shallow source was at a depth of 9 m, and the deep source was at a depth of 54 m. The shallow-source data analyzed here comprise three tonal signals at 109, 127, and 145 Hz. The deep-source data analyzed here comprise three tonal signals at 112, 130, and 148 Hz.
In event S5, the data sampling frequency is 1500 Hz. The analysis starts from the first minute, and the data are processed every 3 min. Each processing step uses a 54.6 s segment of data, which is divided into 9 snapshots. Adjacent snapshots overlap by 50%, and the duration of each snapshot is 10.9 s. Following the tensor construction method described above, the data tensors for the shallow and deep sources are each constructed with dimensions of 21×3×9.
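The segmentation figures are mutually consistent: nine snapshots with 50% overlap span five snapshot lengths, so a 54.6 s segment gives 10.92 s per snapshot, or 16380 samples at 1500 Hz. The following Python sketch (NumPy assumed) shows how a 21×3×9 tensor could be built from one such segment. The Hann window, the nearest-bin frequency selection, and the function name `build_tensor` are assumptions for illustration; the source does not specify these details.

```python
import numpy as np

FS = 1500.0                # sampling frequency in Hz
SEG_LEN = 16380            # 10.92 s per snapshot at 1500 Hz
HOP = SEG_LEN // 2         # 50% overlap between adjacent snapshots
N_SNAP = 9
NEEDED = SEG_LEN + (N_SNAP - 1) * HOP   # 81900 samples = 54.6 s

def build_tensor(block, freqs, fs=FS):
    """Build an (elements x frequencies x snapshots) complex tensor.

    block : (n_channels, n_samples) real time series, n_samples >= NEEDED
    freqs : tonal frequencies in Hz to extract from each snapshot
    """
    n_ch, n_samp = block.shape
    if n_samp < NEEDED:
        raise ValueError(f"need at least {NEEDED} samples, got {n_samp}")
    # Assumption: take the FFT bin nearest to each tonal frequency.
    bins = np.rint(np.asarray(freqs) * SEG_LEN / fs).astype(int)
    window = np.hanning(SEG_LEN)          # assumption: Hann window
    X = np.empty((n_ch, len(freqs), N_SNAP), dtype=complex)
    for l in range(N_SNAP):
        seg = block[:, l * HOP : l * HOP + SEG_LEN] * window
        X[:, :, l] = np.fft.rfft(seg, axis=1)[:, bins]
    return X

# Synthetic 21-channel segment containing a single 109 Hz tone.
t = np.arange(NEEDED) / FS
block = np.tile(np.cos(2 * np.pi * 109.0 * t), (21, 1))
X = build_tensor(block, [109.0, 127.0, 145.0])
print(X.shape)  # (21, 3, 9)
```

With a 10.92 s snapshot the frequency resolution is about 0.09 Hz, so each tonal lies within half a bin of the selected FFT bin.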
For the replica field generation, the input water depth is approximately 216 m, the sound speed profile is a typical downward-refracting profile, and the marine environment parameters are available online in Ref. [14]. The normal mode program KRAKEN [17] was used to generate the acoustic field (or Green’s functions) for the 21-element VLA at frequencies of 109, 112, 127, 130, 145, and 148 Hz. Then, TMFP and the conventional MFP I and MFP II are used to locate the shallow source and the deep source.
The localization results for the SWellEx-96 towed sources obtained with MFP I, MFP II, and TMFP are compared in Fig. 2.
Fig. 2a shows the variation in the range estimation errors of the shallow source versus time. The range estimation errors are small for TMFP, MFP I, and MFP II. Fig. 2b shows the variation in the depth estimation errors of the shallow source versus time. The depth estimates of TMFP are similar to those of MFP I and MFP II, and the depth estimation errors are less than 10 m.
Fig. 2c shows the variation in the range estimation errors of the deep source versus time. TMFP accurately estimates the source range throughout the tow period studied, whereas MFP I and MFP II have large range estimation errors at 1 and 13 min. Fig. 2d shows the variation in the depth estimation errors of the deep source versus time. As time advances and the range from the source to the VLA decreases, the depth estimates of TMFP, MFP I, and MFP II show a gradual deepening trend, and the maximum depth estimation error reaches 15 m.
The reason is that the replica field generation does not incorporate the bathymetry of the real marine environment, which leads to an offset in the depth estimate.
A comparison of Figs. 2b and 2d shows that TMFP, MFP I, and MFP II can all accurately distinguish whether the source is a surface source or a submerged source.
The results in Fig. 2(a–d) show that the processing performance of TMFP is slightly better than that of MFP I and MFP II for the VLA data in event S5.