Speech Separation based on Contrastive Learning and Deep Modularization

Ochieng, Peter

arXiv.org Artificial Intelligence, Sep-12-2023

A previous study explored whether general audio pre-trained models can boost speech separation; its main finding was that they provide minimal benefit compared with features extracted without such models. It has been hypothesised that, because the general audio pre-trained models were trained on clean audio datasets, they cannot generalize to noisy and mixed speech and are therefore ineffective for speech separation. This paper investigates this hypothesis by comparing the performance of a pre-trained model trained on contaminated speech with that of one trained on clean speech. We evaluate whether contamination leads to better downstream performance. We also investigate whether the type of input used to train the pre-trained model affects the quality of the embeddings it generates. To separate the sources, we propose a fully unsupervised speech separation technique based on deep modularization. Our findings establish that, when noise and reverberation are injected into the training dataset, the pre-trained model generates significantly better embeddings than when a clean dataset is used. Further, for the model presented here, working in the short-time Fourier transform (STFT) domain yields better features than working with time-domain features. The proposed deep modularization technique improves SI-SNRi and SDRi by 1.3 and 2.7 respectively when mixtures contain fewer than four sources, and improves the results significantly for many-source mixtures.
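
The abstract reports gains in SI-SNRi, the improvement in scale-invariant signal-to-noise ratio of a separated estimate over the unprocessed mixture. As a minimal sketch of how that metric is commonly computed (the standard definition, not code from the paper), the helper names `si_snr` and `si_snr_improvement` and the toy sine signals below are assumptions introduced for illustration only:

```python
import math


def _zero_mean(signal):
    """Return a copy of the signal with its mean removed."""
    mean = sum(signal) / len(signal)
    return [v - mean for v in signal]


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB of `estimate` with respect to `reference`."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Everything that is not explained by the reference counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy example (hypothetical data): one target tone plus one interfering tone.
n = 800
reference = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
interferer = [math.sin(2 * math.pi * 13 * i / n) for i in range(n)]
mixture = [r + q for r, q in zip(reference, interferer)]
# Pretend a separator suppressed the interferer to 10% of its amplitude.
estimate = [r + 0.1 * q for r, q in zip(reference, interferer)]

improvement_db = si_snr_improvement(estimate, mixture, reference)
print(f"SI-SNRi: {improvement_db:.2f} dB")
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric suits separation systems whose output gain is arbitrary.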

condeepmod, pre-trained model, speech separation


2305.10652

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
