
Collaborating Authors: Serra, Xavier


Audio tagging with noisy labels and minimal supervision

arXiv.org Machine Learning

This paper introduces Task 2 of the DCASE2019 Challenge, titled "Audio tagging with noisy labels and minimal supervision". This task was hosted on the Kaggle platform as "Freesound Audio Tagging 2019". The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data and a much smaller set of manually-labeled data, under a large-vocabulary setting of 80 everyday sound classes. In addition, the proposed dataset poses an acoustic mismatch problem between the noisy train set and the test set, as they come from different web audio sources. This reflects a realistic scenario arising from the difficulty of gathering large amounts of manually labeled data. We present the task setup, the FSDKaggle2019 dataset prepared for this scientific evaluation, and a baseline system consisting of a convolutional neural network. All these resources are freely available.
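
The abstract describes the baseline only as a convolutional neural network; the following is a minimal sketch, in PyTorch, of what a multi-label tagging baseline of this kind typically looks like. Layer sizes, names, and hyperparameters are illustrative assumptions, not the actual DCASE baseline.

import torch
import torch.nn as nn

class BaselineCNN(nn.Module):
    """Minimal multi-label tagger: log-mel patches in, 80 class scores out."""
    def __init__(self, n_classes=80):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):               # x: (batch, 1, mel_bins, frames)
        h = self.features(x)
        h = h.mean(dim=(2, 3))          # global average pooling over time-frequency
        return self.head(h)             # raw logits, one per class

# Multi-label tagging uses an independent sigmoid per class, i.e.
# binary cross-entropy on the logits rather than a softmax:
criterion = nn.BCEWithLogitsLoss()
model = BaselineCNN()
x = torch.randn(4, 1, 96, 128)          # dummy batch of log-mel patches
y = torch.randint(0, 2, (4, 80)).float()
loss = criterion(model(x), y)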


Learning Sound Event Classifiers from Web Audio with Noisy Labels

arXiv.org Machine Learning

As sound event classification moves towards larger datasets, issues of label noise become inevitable. Websites can supply large volumes of user-contributed audio and metadata, but inferring labels from this metadata introduces errors due to unreliable inputs and limitations in the mapping. There is, however, little research into the impact of these errors. To foster the investigation of label noise in sound event classification, we present FSDnoisy18k, a dataset containing 42.5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data. We characterize the label noise empirically and provide a CNN baseline system. Experiments suggest that training with large amounts of noisy data can outperform training with smaller amounts of carefully-labeled data. We also show that noise-robust loss functions can be effective in improving performance in the presence of corrupted labels.
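
The abstract does not spell out the noise-robust loss functions used. One common choice in this literature is the generalized cross-entropy (L_q) loss of Zhang & Sabuncu (2018), which interpolates between cross-entropy (q -> 0) and mean absolute error (q = 1); the sketch below shows the idea and is illustrative, not the paper's exact configuration.

import torch
import torch.nn.functional as F

def lq_loss(logits, targets, q=0.7):
    """Generalized cross-entropy (L_q) loss.

    Intermediate q values down-weight the gradient contribution of
    samples the model assigns low probability to, which limits the
    influence of mislabeled examples compared to plain cross-entropy.
    """
    probs = F.softmax(logits, dim=1)
    # probability assigned to the (possibly noisy) labeled class
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_true.clamp_min(1e-7) ** q) / q).mean()

logits = torch.randn(8, 20, requires_grad=True)   # 20 classes, as in FSDnoisy18k
targets = torch.randint(0, 20, (8,))
loss = lq_loss(logits, targets)
loss.backward()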


Training neural audio classifiers with few data

arXiv.org Artificial Intelligence

These studies are mostly based on publicly available datasets, where each class typically contains more than 100 audio examples [5, 6, 7, 8, 9]. In contrast, only a few works study the problem of training neural audio classifiers with few audio examples (for instance, fewer than 10 per class) [10, 11, 12, 13]. In this work, we study how a number of neural network architectures perform in this situation. Two primary reasons motivate our work: (i) given that humans are able to learn novel concepts from few examples, we aim to quantify to what extent such behavior is possible in current neural machine listening systems; and (ii) since data curation processes are tedious and expensive, it is unreasonable to assume that sizable amounts of annotated audio are always available for training neural network classifiers. The challenge of training neural networks with little audio data has been addressed before.
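
To make the few-data regime concrete, a study of this kind typically subsamples an existing dataset to a fixed number of examples per class. The helper below is a minimal sketch of such a protocol; the function and variable names are illustrative, not the authors' code.

import random
from collections import defaultdict

def subsample_per_class(examples, labels, n_per_class, seed=0):
    """Keep at most n_per_class training examples for each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept = []
    for label, idxs in by_class.items():
        rng.shuffle(idxs)                 # random choice of the few examples
        kept.extend(idxs[:n_per_class])
    return [examples[i] for i in kept], [labels[i] for i in kept]

# Example: reduce a toy dataset to 2 examples per class.
X = [f"clip_{i}.wav" for i in range(10)]
y = ["dog", "dog", "dog", "cat", "cat", "cat", "rain", "rain", "rain", "rain"]
X_small, y_small = subsample_per_class(X, y, n_per_class=2)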


General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

arXiv.org Machine Learning

This paper describes Task 2 of the DCASE 2018 Challenge, titled "General-purpose audio tagging of Freesound content with AudioSet labels". This task was hosted on the Kaggle platform as "Freesound General-Purpose Audio Tagging Challenge". The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 heterogeneous categories drawn from the AudioSet Ontology. We present the task, the dataset prepared for the competition, and a baseline system.
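
Baselines for this kind of tagging task typically operate on log-scaled mel spectrograms. The snippet below shows a standard way to compute them with librosa; the file name and parameter values are assumptions for illustration, not the competition baseline's settings.

import librosa
import numpy as np

# Load a clip and compute a log-scaled mel spectrogram, the usual
# input representation for CNN tagging baselines of this kind.
y, sr = librosa.load("example_clip.wav", sr=32000, mono=True)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                     hop_length=512, n_mels=64)
log_mel = librosa.power_to_db(mel, ref=np.max)   # shape: (n_mels, frames)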


A Simple Fusion of Deep and Shallow Learning for Acoustic Scene Classification

arXiv.org Machine Learning

In the past, acoustic scene classification systems have been based on hand-crafted audio features fed to a classifier. Nowadays, the common trend is to adopt data-driven techniques, e.g., deep learning, where audio representations are learned from data. In this paper, we propose a system consisting of a simple fusion of two methods of the aforementioned types: a deep learning approach, where log-scaled mel-spectrograms are input to a convolutional neural network, and a feature engineering approach, where a collection of hand-crafted features is input to a gradient boosting machine. We first show that the two methods provide complementary information to some extent. Then, we use a simple late fusion strategy to combine them. We report the classification accuracy of each method individually and of the combined system on the TUT Acoustic Scenes 2017 dataset. The proposed fused system outperforms each of the individual methods and attains a classification accuracy of 72.8% on the evaluation set, improving over the baseline system by 11.8%.
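
One minimal reading of "a simple late fusion strategy" is a weighted average of the two models' class-probability outputs, sketched below. The mixing weight w and the toy numbers are assumptions; the paper's exact fusion rule may differ.

import numpy as np

def late_fusion(p_cnn, p_gbm, w=0.5):
    """Weighted average of per-clip class-probability vectors.

    p_cnn, p_gbm: arrays of shape (n_clips, n_classes), rows summing to 1.
    """
    return w * p_cnn + (1.0 - w) * p_gbm

p_cnn = np.array([[0.7, 0.2, 0.1]])   # CNN on log-mel spectrograms
p_gbm = np.array([[0.4, 0.5, 0.1]])   # gradient boosting on hand-crafted features
fused = late_fusion(p_cnn, p_gbm)
predicted_scene = fused.argmax(axis=1)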


Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour

arXiv.org Artificial Intelligence

This document contains the outcome of the first Human Behaviour and Machine Intelligence (HUMAINT) workshop, which took place 5-6 March 2018 in Barcelona, Spain. The workshop was organized in the context of a new research programme at the Centre for Advanced Studies, Joint Research Centre of the European Commission, which focuses on studying the potential impact of artificial intelligence on human behaviour. The workshop gathered an interdisciplinary group of experts to establish the state of the art of research in the field and a list of future research challenges on the topics of human and machine intelligence, the potential impact of algorithms on human cognitive capabilities and decision making, and evaluation and regulation needs. The document consists of short position statements and identifications of challenges provided by each expert, and incorporates the results of the discussions carried out during the workshop. In the conclusion section, we provide a list of emerging research topics and strategies to be addressed in the near future.


Transfer Learning of Artist Group Factors to Musical Genre Classification

arXiv.org Machine Learning

The automated recognition of music genres from audio information is a challenging problem, as genre labels are subjective and noisy. Artist labels are less subjective and less noisy, and certain artists may relate more strongly to certain genres. At the same time, it is not guaranteed that artist labels are available for a given audio segment at prediction time. Therefore, in this work, we propose to apply a transfer learning framework, learning artist-related information that is used at inference time for genre classification. We consider different types of artist-related information, expressed through artist group factors, which allow for more efficient learning and stronger robustness to potential label noise. Furthermore, we investigate how to achieve the highest validation accuracy on the FMA dataset by experimenting with various kinds of transfer methods, including single-task transfer, multi-task transfer, and multi-task learning.
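
A common realization of the multi-task variant is a shared backbone with one head per task, so that the artist-related head shapes the shared representation during training but is not needed at prediction time. The sketch below illustrates this idea in PyTorch; layer sizes, the input feature dimensionality, and the number of artist group factors are assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    """Shared backbone with two heads: genre (main) and artist group factors (auxiliary)."""
    def __init__(self, n_genres=16, n_artist_factors=40, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.genre_head = nn.Linear(128, n_genres)           # main task
        self.artist_head = nn.Linear(128, n_artist_factors)  # auxiliary task

    def forward(self, x):
        h = self.backbone(x)
        return self.genre_head(h), self.artist_head(h)

model = MultiTaskTagger()
x = torch.randn(4, 128)                  # pooled audio features (assumed input)
genre_logits, artist_logits = model(x)
# At inference time only the genre head is used, so artist labels are
# not required for a new audio segment.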


Characterization and exploitation of community structure in cover song networks

arXiv.org Machine Learning

The use of community detection algorithms is explored within the framework of cover song identification, i.e., the automatic detection of different audio renditions of the same underlying musical piece. Until now, this task has been posed as a typical query-by-example task, where one submits a query song and the system retrieves a list of possible matches ranked by their similarity to the query. In this work, we propose a new approach that uses song communities to provide more relevant answers to a given query. Starting from the output of a state-of-the-art system, songs are embedded in a complex weighted network whose links represent similarity (related musical content). Communities inside the network are then recognized as groups of covers, and this information is used to enhance the results of the system. In particular, we show that this approach increases both the coherence and the accuracy of the system. Furthermore, we provide insight into the internal organization of individual cover song communities, showing that the original song tends to be central within its community. We postulate that the methods and results presented here could be relevant to other query-by-example tasks.
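
The pipeline described above, building a weighted similarity network over songs and grouping it into communities, can be sketched with networkx. Greedy modularity maximization is used here as a stand-in for whichever community detection algorithm the paper employs, and the edge weights are toy numbers for illustration.

import networkx as nx
from networkx.algorithms import community

# Nodes are songs; edge weights are pairwise similarity scores produced
# by a cover song identification system.
G = nx.Graph()
G.add_weighted_edges_from([
    ("song_A", "song_B", 0.9),   # likely covers of each other
    ("song_B", "song_C", 0.8),
    ("song_A", "song_C", 0.7),
    ("song_D", "song_E", 0.85),
    ("song_C", "song_D", 0.1),   # weak cross-group link
])

# Detect communities; each community is treated as a group of covers,
# which can then be used to re-rank or expand a query's answer list.
groups = community.greedy_modularity_communities(G, weight="weight")
for i, grp in enumerate(groups):
    print(f"cover group {i}: {sorted(grp)}")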