AITopics | mvsd

Collaborating Authors

mvsd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion

Ma, Jian, Wang, Wenguan, Yang, Yi, Zheng, Feng

arXiv.org Artificial IntelligenceJul-14-2024

Visual acoustic matching (VAM) is pivotal for enhancing the immersive experience, and the task of dereverberation is effective in improving audio intelligibility. Existing methods treat each task independently, overlooking the inherent reciprocity between them. Moreover, these methods depend on paired training data, which is challenging to acquire, impeding the utilization of extensive unpaired data. In this paper, we introduce MVSD, a mutual learning framework based on diffusion models. MVSD considers the two tasks symmetrically, exploiting the reciprocal relationship to facilitate learning from inverse tasks and overcome data scarcity. Furthermore, we employ the diffusion model as foundational conditional converters to circumvent the training instability and over-smoothing drawbacks of conventional GAN architectures. Specifically, MVSD employs two converters: one for VAM called reverberator and one for dereverberation called dereverberator. The dereverberator judges whether the reverberation audio generated by reverberator sounds like being in the conditional visual scenario, and vice versa. By forming a closed loop, these two converters can generate informative feedback signals to optimize the inverse tasks, even with easily acquired one-way unpaired data. Extensive experiments on two standard benchmarks, i.e., SoundSpaces-Speech and Acoustic AVSpeech, exhibit that our framework can improve the performance of the reverberator and dereverberator and better match specified visual scenarios.

dereverberation, diffusion model, mvsd, (14 more...)

arXiv.org Artificial Intelligence

2407.10373

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)

Add feedback

Detecting Spoilers in Movie Reviews with External Movie Knowledge and User Networks

Wang, Heng, Zhang, Wenqian, Bai, Yuyang, Tan, Zhaoxuan, Feng, Shangbin, Zheng, Qinghua, Luo, Minnan

arXiv.org Artificial IntelligenceOct-26-2023

Online movie review platforms are providing crowdsourced feedback for the film industry and the general public, while spoiler reviews greatly compromise user experience. Although preliminary research efforts were made to automatically identify spoilers, they merely focus on the review content itself, while robust spoiler detection requires putting the review into the context of facts and knowledge regarding movies, user behavior on film review platforms, and more. In light of these challenges, we first curate a large-scale network-based spoiler detection dataset LCS and a comprehensive and up-to-date movie knowledge base UKM. We then propose MVSD, a novel Multi-View Spoiler Detection framework that takes into account the external knowledge about movies and user activities on movie review platforms. Specifically, MVSD constructs three interconnecting heterogeneous information networks to model diverse data sources and their multi-view attributes, while we design and employ a novel heterogeneous graph neural network architecture for spoiler detection as node-level classification. Extensive experiments demonstrate that MVSD advances the state-of-the-art on two spoiler detection datasets, while the introduction of external knowledge and user interactions help ground robust spoiler detection. Our data and code are available at https://github.com/Arthur-Heng/Spoiler-Detection

node, spoiler detection, subgraph, (14 more...)

arXiv.org Artificial Intelligence

2304.11411

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback