Gao, Dazhi
Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction
Guo, Aoqi, Wu, Junnan, Gao, Peng, Zhu, Wenbo, Guo, Qinwen, Gao, Dazhi, Wang, Yujun
Recently, deep learning-based beamforming algorithms have shown promising performance in target speech extraction tasks. However, most systems do not fully utilize spatial information. In this paper, we propose a target speech extraction network that utilizes spatial information to enhance the performance of neural beamformer. To achieve this, we first use the UNet-TCN structure to model input features and improve the estimation accuracy of the speech pre-separation module by avoiding information loss caused by direct dimensionality reduction in other models. Furthermore, we introduce a multi-head cross-attention mechanism that enhances the neural beamformer's perception of spatial information by making full use of the spatial information received by the array. Experimental results demonstrate that our approach, which incorporates a more reasonable target mask estimation network and a spatial information-based cross-attention mechanism into the neural beamformer, effectively improves speech separation performance.
Dynamic Interactional And Cooperative Network For Shield Machine
Gao, Dazhi, Li, Rongyang, Wang, Hongbo, Mao, Lingfeng, Ning, Huansheng
The shield machine (SM) is a complex mechanical device used for tunneling. However, the monitoring and deciding were mainly done by artificial experience during traditional construction, which brought some limitations, such as hidden mechanical failures, human operator error, and sensor anomalies. To deal with these challenges, many scholars have studied SM intelligent methods. Most of these methods only take SM into account but do not consider the SM operating environment. So, this paper discussed the relationship among SM, geological information, and control terminals. Then, according to the relationship, models were established for the control terminal, including SM rate prediction and SM anomaly detection. The experimental results show that compared with baseline models, the proposed models in this paper perform better. In the proposed model, the R2 and MSE of rate prediction can reach 92.2\%, and 0.0064 respectively. The abnormal detection rate of anomaly detection is up to 98.2\%.
Sound source ranging using a feed-forward neural network with fitting-based early stopping
Chi, Jing, Li, Xiaolei, Wang, Haozhong, Gao, Dazhi, Gerstoft, Peter
When a feed-forward neural network (FNN) is trained for source ranging in an ocean waveguide, it is difficult evaluating the range accuracy of the FNN on unlabeled test data. A fitting-based early stopping (FEAST) method is introduced to evaluate the range error of the FNN on test data where the distance of source is unknown. Based on FEAST, when the evaluated range error of the FNN reaches the minimum on test data, stopping training, which will help to improve the ranging accuracy of the FNN on the test data. The FEAST is demonstrated on simulated and experimental data.