AITopics | Ding, Yiwei

Collaborating Authors

Ding, Yiwei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Uncertainty Estimation in the Real World: A Study on Music Emotion Recognition

Watcharasupat, Karn N., Ding, Yiwei, Ma, T. Aleksandra, Seshadri, Pavan, Lerch, Alexander

arXiv.org Artificial IntelligenceJan-20-2025

Any data annotation for subjective tasks shows potential variations between individuals. This is particularly true for annotations of emotional responses to musical stimuli. While older approaches to music emotion recognition systems frequently addressed this uncertainty problem through probabilistic modeling, modern systems based on neural networks tend to ignore the variability and focus only on predicting central tendencies of human subjective responses. In this work, we explore several methods for estimating not only the central tendencies of the subjective responses to a musical stimulus, but also for estimating the uncertainty associated with these responses. In particular, we investigate probabilistic loss functions and inference-time random sampling. Experimental results indicate that while the modeling of the central tendencies is achievable, modeling of the uncertainty in subjective responses proves significantly more challenging with currently available approaches even when empirical estimates of variations in the responses are available.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.1157

Country:

Asia (1.00)
Europe > Germany > Bavaria (0.14)
Oceania > Australia > Queensland (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science (0.86)

Add feedback

Parameter-Efficient Transfer Learning for Music Foundation Models

Ding, Yiwei, Lerch, Alexander

arXiv.org Artificial IntelligenceNov-28-2024

More music foundation models are recently being released, promising a general, mostly task independent encoding of musical information. Common ways of adapting music foundation models to downstream tasks are probing and fine-tuning. These common transfer learning approaches, however, face challenges. Probing might lead to suboptimal performance because the pre-trained weights are frozen, while fine-tuning is computationally expensive and is prone to overfitting. Our work investigates the use of parameter-efficient transfer learning (PETL) for music foundation models which integrates the advantage of probing and fine-tuning. We introduce three types of PETL methods: adapter-based methods, prompt-based methods, and reparameterization-based methods. These methods train only a small number of parameters, and therefore do not require significant computational resources. Results show that PETL methods outperform both probing and fine-tuning on music auto-tagging. On key detection and tempo estimation, they achieve similar results as fine-tuning with significantly less training cost. However, the usefulness of the current generation of foundation model on key and tempo tasks is questioned by the similar results achieved by training a small model from scratch. Code available at https://github.com/suncerock/peft-music/

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.19371

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.84)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Add feedback

Embedding Compression for Teacher-to-Student Knowledge Transfer

Ding, Yiwei, Lerch, Alexander

arXiv.org Artificial IntelligenceFeb-9-2024

Common knowledge distillation methods require the teacher model and the student model to be trained on the same task. However, the usage of embeddings as teachers has also been proposed for different source tasks and target tasks. Prior work that uses embeddings as teachers ignores the fact that the teacher embeddings are likely to contain irrelevant knowledge for the target task. To address this problem, we propose to use an embedding compression module with a trainable teacher transformation to obtain a compact teacher embedding. Results show that adding the embedding compression module improves the classification performance, especially for unsupervised teacher embeddings. Moreover, student models trained with the guidance of embeddings show stronger generalizability.

artificial intelligence, machine learning, student model, (17 more...)

arXiv.org Artificial Intelligence

2402.06761

Genre: Research Report > New Finding (0.88)

Industry: Education (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation

Watcharasupat, Karn N., Wu, Chih-Wei, Ding, Yiwei, Orife, Iroro, Hipple, Aaron J., Williams, Phillip A., Kramer, Scott, Lerch, Alexander, Wolcott, William

arXiv.org Artificial IntelligenceDec-1-2023

Cinematic audio source separation is a relatively new subtask of audio source separation, with the aim of extracting the dialogue, music, and effects stems from their mixture. In this work, we developed a model generalizing the Bandsplit RNN for any complete or overcomplete partitions of the frequency axis. Psychoacoustically motivated frequency scales were used to inform the band definitions which are now defined with redundancy for more reliable feature extraction. A loss function motivated by the signal-to-noise ratio and the sparsity-promoting property of the 1-norm was proposed. We additionally exploit the information-sharing property of a common-encoder setup to reduce computational complexity during both training and inference, improve separation performance for hard-to-generalize classes of sounds, and allow flexibility during inference time with detachable decoders. Our best model sets the state of the art on the Divide and Remaster dataset with performance above the ideal ratio mask for the dialogue stem.

data mining, machine learning, proc, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/OJSP.2023.3339428

2309.02539

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Media (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback