AITopics | ambisonic

ambisonic

Gen-A: Generalizing Ambisonics Neural Encoding to Unseen Microphone Arrays

Heikkinen, Mikko, Politis, Archontis, Drossos, Konstantinos, Virtanen, Tuomas

arXiv.org Artificial Intelligence, Jan-14-2025

Using deep neural networks (DNNs) for encoding of microphone array (MA) signals to the Ambisonics spatial audio format can surpass certain limitations of established conventional methods, but existing DNN-based methods need to be trained separately for each MA. This paper proposes a DNN-based method for Ambisonics encoding that can generalize to arbitrary MA geometries unseen during training. The method takes as inputs the MA geometry and MA signals and uses a multi-level encoder consisting of separate paths for geometry and signal data, where geometry features inform the signal encoder at each level. The method is validated in simulated anechoic and reverberant conditions with one and two sources. The results indicate improvement over conventional encoding across the whole frequency range for dry scenes, while for reverberant scenes the improvement is frequency-dependent.
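To make the target format concrete, the sketch below encodes a mono signal from a known direction into first-order Ambisonics. It is not the DNN method from the abstract, which estimates such signals from microphone array recordings. The channel order (ACN: W, Y, Z, X), the SN3D normalization, and the angle convention (azimuth counter-clockwise from the front, elevation upward) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def foa_encode_gains(azimuth_deg, elevation_deg):
    """Return first-order Ambisonics gains [W, Y, Z, X] (assumed ACN order, SN3D)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = 1.0                          # omnidirectional component
    y = math.sin(az) * math.cos(el)  # left-right dipole
    z = math.sin(el)                 # up-down dipole
    x = math.cos(az) * math.cos(el)  # front-back dipole
    return [w, y, z, x]


def encode_mono(samples, azimuth_deg, elevation_deg):
    """Scale a mono signal by the direction gains, giving four channel signals."""
    gains = foa_encode_gains(azimuth_deg, elevation_deg)
    return [[g * s for s in samples] for g in gains]
```

A source straight ahead (azimuth 0, elevation 0) ends up only in the W and X channels, and a source directly to the left (azimuth 90) ends up only in W and Y.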

artificial intelligence, machine learning, microphone, (18 more...)

arXiv:2501.08047

Country:

Europe > Finland > Pirkanmaa > Tampere (0.05)
North America > United States > New York (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Compression of Higher Order Ambisonics with Multichannel RVQGAN

Hirvonen, Toni, Namazi, Mahmoud

arXiv.org Artificial Intelligence, Dec-11-2024

A multichannel extension to the RVQGAN neural coding method is proposed and realized for data-driven compression of third-order Ambisonics audio. The input and output layers of the generator and discriminator models are modified to accept multiple (16) channels without increasing the model bitrate. We also propose a loss function that accounts for spatial perception in immersive reproduction, and transfer learning from single-channel models. Listening test results with 7.1.4 immersive playback show that the proposed extension is suitable for coding scene-based, 16-channel Ambisonics content with good quality at 16 kbps when trained and tested on the EigenScape database. The model has potential applications for learning other types of content and multichannel formats.
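The 16 channels follow from the Ambisonics order: a full-sphere signal of order N has (N + 1)^2 channels. The short sketch below works through that arithmetic. The helper names are hypothetical, and reading the quoted 16 kbps as the total for all channels together is an assumption based on the abstract's wording.

```python
def ambisonic_channels(order):
    """Number of channels in a full-sphere Ambisonics signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def average_kbps_per_channel(total_kbps, order):
    """Average bitrate per channel, assuming total_kbps covers all channels."""
    return total_kbps / ambisonic_channels(order)
```

For third order, `ambisonic_channels(3)` gives 16, so under the stated assumption the codec spends on average 1 kbps per channel.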

ambisonic, compression, higher order ambisonic, (11 more...)

arXiv:2411.12008

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > China > Jiangsu Province > Xuzhou (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

HARP: A Large-Scale Higher-Order Ambisonic Room Impulse Response Dataset

Saini, Shivam, Peissig, Jürgen

arXiv.org Artificial Intelligence, Nov-21-2024

This contribution introduces a dataset of 7th-order Ambisonic Room Impulse Responses (HOA-RIRs), created using the Image Source Method. By employing higher-order Ambisonics, our dataset enables precise spatial audio reproduction, a critical requirement for realistic immersive audio applications. Leveraging the virtual simulation, we present a unique microphone configuration, based on the superposition principle, designed to optimize sound field coverage while addressing the limitations of traditional microphone arrays. The presented 64-microphone configuration allows us to capture RIRs directly in the Spherical Harmonics domain. The dataset features a wide range of room configurations, encompassing variations in room geometry, acoustic absorption materials, and source-receiver distances. A detailed description of the simulation setup is provided so that the dataset can be reproduced accurately. The dataset serves as a vital resource for researchers working on spatial audio, particularly in applications involving machine learning to improve room acoustics modeling and sound field synthesis. It further provides a very high level of spatial resolution and realism crucial for tasks such as source localization, reverberation prediction, and immersive sound reproduction.
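As a rough illustration of the Image Source Method named above, the sketch below mirrors a source across the six walls of a rectangular room and converts the resulting path lengths into arrival delays. It is a minimal first-order sketch and not the dataset's simulation pipeline: the shoebox room with one corner at the origin, the speed of sound of 343 m/s, and all function names are assumptions, and wall absorption and higher-order reflections are left out.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed value


def first_order_images(source, room):
    """Mirror the source across each wall of a shoebox room.

    The room spans [0, room[axis]] along each of the three axes.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflection across the wall plane
            images.append(tuple(image))
    return images


def arrival_delays(source, receiver, room):
    """Delays in seconds of the direct path followed by the six first-order reflections."""
    paths = [tuple(source)] + first_order_images(source, room)
    return [math.dist(p, receiver) / SPEED_OF_SOUND for p in paths]
```

Each delay marks where an impulse would appear in the room impulse response. A full simulation would also scale each impulse by distance and by the absorption of the walls involved.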

acoustic processing, artificial intelligence, machine learning, (16 more...)

arXiv:2411.14207

Country: Europe > Germany (0.15)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.39)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.35)

Sound Event Detection and Localization with Distance Estimation

Krause, Daniel Aleksander, Politis, Archontis, Mesaros, Annamaria

arXiv.org Artificial Intelligence, Jun-12-2024

Sound Event Detection and Localization (SELD) is a combined task of identifying sound events and their corresponding direction-of-arrival (DOA). While this task has numerous applications and has been extensively researched in recent years, it fails to provide full information about the sound source position. In this paper, we overcome this problem by extending the task to Sound Event Detection, Localization with Distance Estimation (3D SELD). We study two ways of integrating distance estimation within the SELD core: a multi-task approach, in which the problem is tackled by a separate model output, and a single-task approach obtained by extending the multi-ACCDOA method to include distance information. We investigate both methods for the Ambisonic and binaural versions of STARSS23: Sony-TAU Realistic Spatial Soundscapes 2023. Moreover, our study involves experiments on the loss function related to the distance estimation part. Our results show that it is possible to perform 3D SELD without any degradation of performance in sound event detection and DOA estimation.
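To show why a distance estimate completes the source position, the sketch below decodes an ACCDOA-style output (a Cartesian vector whose length signals activity and whose direction gives the DOA) and combines it with a separately predicted distance. This is an illustrative sketch, not the paper's model: the 0.5 activity threshold, the separate `distance_m` input, and the function name are assumptions.

```python
import math


def decode_accdoa(vec, distance_m, threshold=0.5):
    """Turn an ACCDOA-style vector plus a distance estimate into a 3D position.

    Returns None when the vector norm does not exceed the activity threshold.
    """
    x, y, z = vec
    norm = math.sqrt(x * x + y * y + z * z)
    if norm <= threshold:
        return None  # event treated as inactive
    ux, uy, uz = x / norm, y / norm, z / norm  # unit direction of arrival
    azimuth = math.degrees(math.atan2(uy, ux))
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, uz))))
    position = (ux * distance_m, uy * distance_m, uz * distance_m)
    return {"azimuth_deg": azimuth, "elevation_deg": elevation, "position_m": position}
```

Without `distance_m`, only the unit direction is recoverable, which is the limitation of plain SELD that the abstract describes.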

dataset, distance estimation, loss function, (10 more...)

arXiv:2403.11827

Country: Europe > Finland > Pirkanmaa > Tampere (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Quaternion Convolutional Neural Networks for Detection and Localization of 3D Sound Events

Comminiello, Danilo, Lella, Marco, Scardapane, Simone, Uncini, Aurelio

arXiv.org Artificial Intelligence, Dec-17-2018

Learning from data in the quaternion domain enables us to exploit internal dependencies of 4D signals and treat them as a single entity. One kind of data that is well suited to quaternion-valued processing is 3D acoustic signals in their spherical harmonics decomposition. In this paper, we address the problem of localizing and detecting sound events in the spatial sound field by using quaternion-valued data processing. In particular, we consider the spherical harmonic components of the signals captured by a first-order ambisonic microphone and process them by using a quaternion convolutional neural network. Experimental results show that the proposed approach exploits the correlated nature of the ambisonic signals, thus improving accuracy in 3D sound event detection and localization.
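The core operation behind quaternion-valued layers is the Hamilton product, which mixes all four components of its operands in a fixed pattern. The sketch below shows it applied to a single first-order Ambisonic sample. Mapping the four channels (W, X, Y, Z) to the real, i, j, and k parts is an assumption made for illustration, the function names are hypothetical, and a real network would apply this inside convolutions with learned weights.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def quaternion_weighting(weight, foa_sample):
    """Apply one quaternion weight to one FOA sample packed as (W, X, Y, Z)."""
    return hamilton_product(weight, foa_sample)
```

Because one quaternion weight touches all four channels at once, the channels are processed jointly instead of as four unrelated real-valued inputs.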

artificial intelligence, machine learning, signal process, (20 more...)

doi: 10.1109/ICASSP.2019.8682711

arXiv:1812.06811

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.14)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.05)
Europe > Italy > Lazio > Rome (0.05)
(5 more...)

Genre: Research Report (0.70)

Industry: Information Technology (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
