DroneAudioset: An Audio Dataset for Drone-based Search and Rescue
Gupta, Chitralekha, Ramesh, Soundarya, Sasikumar, Praveen, Yeo, Kian Peen, Nanayakkara, Suranga
Unmanned Aerial Vehicles (UAVs), or drones, are increasingly used in search and rescue missions to detect human presence. Existing systems primarily leverage vision-based methods, which are prone to failure under low visibility or occlusion. Drone-based audio perception offers promise but suffers from extreme ego-noise that masks sounds indicating human presence. Existing datasets are either limited in diversity or synthetic, lacking real acoustic interactions, and there are no standardized setups for drone audition. To this end, we present DroneAudioset (publicly available at https://huggingface.co/datasets/ahlab-drone-project/DroneAudioSet/ under the MIT license), a comprehensive drone audition dataset featuring 23.5 hours of annotated recordings, covering a wide range of signal-to-noise ratios (SNRs) from -57.2 dB to -2.5 dB, across various drone types, throttles, microphone configurations, and environments. The dataset enables the development and systematic evaluation of noise-suppression and classification methods for human-presence detection under challenging conditions, while also informing practical design considerations for drone audition systems, such as microphone-placement trade-offs and drone-noise-aware audio processing. This dataset is an important step towards enabling the design and deployment of drone-audition systems.
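The SNR range quoted above can be made concrete with a short sketch. This is a minimal illustration, not part of the dataset's tooling; the signal and noise arrays below are hypothetical stand-ins for a faint human sound and drone ego-noise.

```python
import numpy as np

def snr_db(signal: np.ndarray, noise: np.ndarray) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

# Hypothetical 1-second signals at 16 kHz: a faint 440 Hz "human sound"
# against drone ego-noise that is 100x larger in amplitude (10^4x in power).
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
speech = 0.01 * np.sin(2 * np.pi * 440 * t)
ego_noise = 1.0 * np.sin(2 * np.pi * 180 * t)
print(round(snr_db(speech, ego_noise), 1))  # -40.0
```

At the dataset's hardest operating point, around -57 dB SNR, the ego-noise power exceeds the target-sound power by a factor of roughly 10^5.7, which is what makes noise suppression the central challenge here.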
The language of sound search: Examining User Queries in Audio Search Engines
This study examines textual, user-written search queries within the context of sound search engines, encompassing various applications such as foley, sound effects, and general audio retrieval. Current research inadequately addresses real-world user needs and behaviours in designing text-based audio retrieval systems. To bridge this gap, we analysed search queries from two sources: a custom survey and Freesound website query logs. The survey was designed to collect queries for an unrestricted, hypothetical sound search engine, resulting in a dataset that captures user intentions without the constraints of existing systems. This dataset is also made available for sharing with the research community. In contrast, the Freesound query logs encompass approximately 9 million search requests, providing a comprehensive view of real-world usage patterns. Our findings indicate that survey queries are generally longer than Freesound queries, suggesting users prefer detailed queries when not limited by system constraints. Both datasets predominantly feature keyword-based queries, with few survey participants using full sentences. Key factors influencing survey queries include the primary sound source, intended usage, perceived location, and the number of sound sources. These insights are crucial for developing user-centred, effective text-based audio retrieval systems, enhancing our understanding of user behaviour in sound search contexts.
Merz
In order to create algorithmic art using the wealth of documents available on the internet, artists must discover strategies for organizing those documents. In this paper I demonstrate techniques for organizing and collaging sounds from the user-contributed database at freesound.org. Sounds can be organized in a graph structure by exploiting aural similarity relationships provided by freesound.org,
General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline
Fonseca, Eduardo, Plakal, Manoj, Font, Frederic, Ellis, Daniel P. W., Favory, Xavier, Pons, Jordi, Serra, Xavier
Annotation response categories used in the labeling task:
- Present but not Predominant (PNP): The type of sound described is present, but the audio clip also contains other salient types of sound and/or strong background noise.
- Not Present (NP): The type of sound described is not present in the audio clip.
- Unsure (U): I am not sure whether the type of sound described is present or not.

Table 2: Categories composing FSDKaggle2018, along with the number of samples and time (in minutes, rounded) in the train set. Per-category AP@3 achieved by the baseline system is reported using all the test files for every category (i.e., not following the public/private splits of the Kaggle leaderboard).
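The AP@3 metric in the table caption can be sketched for a single clip. This follows the Kaggle convention for single-label clips with up to three ranked predictions (score = 1/rank of the correct label if it appears in the top 3, else 0); the label names below are hypothetical.

```python
def ap_at_3(true_label: str, top3: list[str]) -> float:
    """Average precision at 3 for a single-label clip:
    1/rank if the correct label appears among the top-3 predictions, else 0."""
    for rank, pred in enumerate(top3, start=1):
        if pred == true_label:
            return 1.0 / rank
    return 0.0

print(ap_at_3("Bark", ["Bark", "Meow", "Siren"]))   # 1.0 (correct at rank 1)
print(ap_at_3("Bark", ["Meow", "Bark", "Siren"]))   # 0.5 (correct at rank 2)
print(ap_at_3("Bark", ["Meow", "Siren", "Flute"]))  # 0.0 (missed)
```

The leaderboard figure is then the mean of this per-clip score over the evaluation set.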