AITopics | Ji, Shulei

Collaborating Authors

Ji, Shulei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Comprehensive Survey on Generative AI for Video-to-Music Generation

Ji, Shulei, Wu, Songruoyao, Wang, Zihao, Li, Shuyu, Zhang, Kejun

arXiv.org Artificial IntelligenceFeb-17-2025

The burgeoning growth of video-to-music generation can be attributed to the ascendancy of multimodal generative models. However, there is a lack of literature that comprehensively combs through the work in this field. To fill this gap, this paper presents a comprehensive review of video-to-music generation using deep generative AI techniques, focusing on three key components: visual feature extraction, music generation frameworks, and conditioning mechanisms. We categorize existing approaches based on their designs for each component, clarifying the roles of different strategies. Preceding this, we provide a fine-grained classification of video and music modalities, illustrating how different categories influence the design of components within the generation pipelines. Furthermore, we summarize available multimodal datasets and evaluation metrics while highlighting ongoing challenges in the field.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.12489

Country: Asia (0.28)

Genre: Overview (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.70)

Add feedback

MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion

Ji, Shulei, Yang, Xinyu

arXiv.org Artificial IntelligenceJan-1-2024

Generating music with emotion is an important task in automatic music generation, in which emotion is evoked through a variety of musical elements (such as pitch and duration) that change over time and collaborate with each other. However, prior research on deep learning-based emotional music generation has rarely explored the contribution of different musical elements to emotions, let alone the deliberate manipulation of these elements to alter the emotion of music, which is not conducive to fine-grained element-level control over emotions. To address this gap, we present a novel approach employing musical element-based regularization in the latent space to disentangle distinct elements, investigate their roles in distinguishing emotions, and further manipulate elements to alter musical emotions. Specifically, we propose a novel VQ-VAE-based model named MusER. MusER incorporates a regularization loss to enforce the correspondence between the musical element sequences and the specific dimensions of latent variable sequences, providing a new solution for disentangling discrete sequences. Taking advantage of the disentangled latent vectors, a two-level decoding strategy that includes multiple decoders attending to latent vectors with different semantics is devised to better predict the elements. By visualizing latent space, we conclude that MusER yields a disentangled and interpretable latent space and gain insights into the contribution of distinct elements to the emotional dimensions (i.e., arousal and valence). Experimental results demonstrate that MusER outperforms the state-of-the-art models for generating emotional music in both objective and subjective evaluation. Besides, we rearrange music through element transfer and attempt to alter the emotion of music by transferring emotion-distinguishable elements.

artificial intelligence, emotion, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2312.10307

Country:

North America > Canada (0.14)
Europe > Netherlands (0.14)
North America > United States (0.14)
(2 more...)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.54)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Emotion-Conditioned Melody Harmonization with Hierarchical Variational Autoencoder

Ji, Shulei, Yang, Xinyu

arXiv.org Artificial IntelligenceJul-19-2023

Existing melody harmonization models have made great progress in improving the quality of generated harmonies, but most of them ignored the emotions beneath the music. Meanwhile, the variability of harmonies generated by previous methods is insufficient. To solve these problems, we propose a novel LSTM-based Hierarchical Variational Auto-Encoder (LHVAE) to investigate the influence of emotional conditions on melody harmonization, while improving the quality of generated harmonies and capturing the abundant variability of chord progressions. Specifically, LHVAE incorporates latent variables and emotional conditions at different levels (piece- and bar-level) to model the global and local music properties. Additionally, we introduce an attention-based melody context vector at each step to better learn the correspondence between melodies and harmonies. Objective experimental results show that our proposed model outperforms other LSTM-based models. Through subjective evaluation, we conclude that only altering the types of chords hardly changes the overall emotion of the music. The qualitative analysis demonstrates the ability of our model to generate variable harmonies.

artificial intelligence, emotion, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2306.03718

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback