AITopics | Suo, Hongbin

Suo, Hongbin

Task-Agnostic Structured Pruning of Speech Representation Models

Wang, Haoyu, Wang, Siyuan, Zhang, Wei-Qiang, Suo, Hongbin, Wan, Yulong

arXiv.org Artificial Intelligence, Jul-9-2023

Self-supervised pre-trained models such as wav2vec 2.0, HuBERT, and WavLM have been shown to significantly improve many speech tasks. However, their large memory footprint and heavy computational requirements hinder their industrial applicability. Structured pruning is a hardware-friendly model compression technique but usually results in a larger loss of accuracy. In this paper, we propose a fine-grained attention head pruning method to compensate for the performance degradation. We also introduce the straight-through estimator into the L0 regularization to further accelerate the pruned model. Experiments on the SUPERB benchmark show that our model achieves performance comparable to the dense model on multiple tasks and outperforms the wav2vec 2.0 base model on average, with 72% fewer parameters and 2 times faster inference.
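The abstract does not say how the gates are parameterized, so the following is only a minimal sketch of the general idea, assuming hard-concrete gates as commonly used for L0 regularization. The constants `GAMMA`, `ZETA`, `BETA`, the 0.5 threshold, and all function names are illustrative assumptions, not the paper's implementation. It shows one gate per attention head, the expected-L0 penalty, and a straight-through binarization whose backward pass borrows the gradient of the soft gate.

```python
import math

# Stretch limits and temperature of the hard-concrete gate.
# These are commonly used defaults, assumed here for illustration.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def expected_l0(log_alpha):
    """Probability that a gate is non-zero; summing over gates gives the L0 penalty."""
    return sigmoid(log_alpha - BETA * math.log(-GAMMA / ZETA))


def soft_gate(log_alpha):
    """Deterministic gate: stretched sigmoid clipped to [0, 1]."""
    stretched = sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA
    return min(1.0, max(0.0, stretched))


def straight_through_gate(log_alpha, threshold=0.5):
    """Forward value is the binarized gate; the returned gradient is that of the soft gate.

    The step function has zero gradient almost everywhere, so the
    straight-through estimator substitutes the soft gate's derivative.
    """
    hard = 1.0 if soft_gate(log_alpha) > threshold else 0.0
    s = sigmoid(log_alpha)
    unclipped = 0.0 < s * (ZETA - GAMMA) + GAMMA < 1.0
    surrogate_grad = s * (1.0 - s) * (ZETA - GAMMA) if unclipped else 0.0
    return hard, surrogate_grad


def prune_heads(head_outputs, log_alphas):
    """Keep only the heads whose binarized gate is 1, with their original indices."""
    kept = []
    for idx, (out, log_alpha) in enumerate(zip(head_outputs, log_alphas)):
        hard, _ = straight_through_gate(log_alpha)
        if hard == 1.0:
            kept.append((idx, out))
    return kept


if __name__ == "__main__":
    outputs = ["head0", "head1", "head2"]   # placeholders for per-head tensors
    gates = [3.0, -3.0, 1.0]                # hypothetical learned log-alpha values
    print(prune_heads(outputs, gates))
    print(sum(expected_l0(g) for g in gates))
```

Because the forward pass sees exact zeros and ones, heads with a zero gate can be removed from the model outright, which is what makes this kind of pruning structured rather than a sparse mask over a dense computation.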

machine learning, natural language, pruning

arXiv: 2306.01385

Country: Asia > China (0.15)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.46)

BeamTransformer: Microphone Array-based Overlapping Speech Detection

Zheng, Siqi, Zhang, Shiliang, Huang, Weilong, Chen, Qian, Suo, Hongbin, Lei, Ming, Feng, Jinwei, Yan, Zhijie

arXiv.org Artificial Intelligence, Sep-9-2021

We propose BeamTransformer, an efficient architecture that leverages the beamformer's edge in spatial filtering and the transformer's capability in context sequence modeling. BeamTransformer seeks to optimize the modeling of sequential relationships among signals from different spatial directions. Overlapping speech detection is one of the tasks where such optimization is favorable. In this paper we apply BeamTransformer to detect overlapping segments. Compared with a single-channel approach, BeamTransformer excels at learning the relationships among different beam sequences and is hence able to make predictions not only from the acoustic signals but also from the localization of the source. The results indicate that a successful incorporation of microphone array signals can lead to remarkable gains. Moreover, BeamTransformer goes one step further, as speech from overlapped speakers has been internally separated into different beams.
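The abstract does not specify which beamformer is used, so as a rough illustration of what "beam sequences" from a microphone array can look like, here is a minimal delay-and-sum sketch with integer sample delays. The function names and the delay table are hypothetical; a real front end would derive delays from array geometry and would typically use fractional delays or frequency-domain weights.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples for one look direction.
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for samples, delay in zip(channels, delays):
            idx = t + delay  # advance the channel to undo its arrival lag
            if 0 <= idx < n:
                acc += samples[idx]
        out.append(acc / len(channels))
    return out


def beam_sequences(channels, delay_table):
    """Return one beamformed signal per look direction in delay_table."""
    return [delay_and_sum(channels, delays) for delays in delay_table]


if __name__ == "__main__":
    # A unit pulse reaches microphone 1 one sample later than microphone 0.
    mics = [
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
    ]
    # First direction matches the true delays; the second assumes no delay.
    beams = beam_sequences(mics, [[0, 1], [0, 0]])
    for beam in beams:
        print(beam)
```

The beam steered at the true direction adds the pulse coherently, while the mismatched beam smears it at half amplitude. A sequence model fed with several such beams can, in principle, use these differences as a cue about where each source is located.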

beamtransformer, deep learning, speech recognition

arXiv: 2109.04049

Country:

Europe (1.00)
North America > Canada (0.29)
North America > United States (0.28)
Asia > Middle East > Republic of Türkiye (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)