AITopics

2501.1372

Country:

Europe (0.69)
North America > United States (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Media > Music (0.48)
Leisure & Entertainment (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

arXiv.org Artificial IntelligenceMay-3-2024

Joint sentiment analysis of lyrics and audio in music

Schaab, Lea, Kruspe, Anna

Sentiment or mood can express themselves on various levels in music. In automatic analysis, the actual audio data is usually analyzed, but the lyrics can also play a crucial role in the perception of moods. We first evaluate various models for sentiment analysis based on lyrics and audio separately. The corresponding approaches already show satisfactory results, but they also exhibit weaknesses, the causes of which we examine in more detail. Furthermore, different approaches to combining the audio and lyrics results are proposed and evaluated. Considering both modalities generally leads to improved performance. We investigate misclassifications and (also intentional) contradictions between audio and lyrics sentiment more closely, and identify possible causes. Finally, we address fundamental problems in this research area, such as high subjectivity, lack of data, and inconsistency in emotion taxonomies.

artificial intelligence, emotion, natural language, (17 more...)

2405.01988

Country:

North America > United States (0.28)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.73)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.70)

arXiv.org Artificial IntelligenceApr-3-2024

Towards detecting unanticipated bias in Large Language Models

Kruspe, Anna

Over the last year, Large Language Models (LLMs) like ChatGPT have become widely available and have exhibited fairness issues similar to those in previous machine learning systems. Current research is primarily focused on analyzing and quantifying these biases in training data and their impact on the decisions of these models, alongside developing mitigation strategies. This research largely targets well-known biases related to gender, race, ethnicity, and language. However, it is clear that LLMs are also affected by other, less obvious implicit biases. The complex and often opaque nature of these models makes detecting such biases challenging, yet this is crucial due to their potential negative impact in various applications. In this paper, we explore new avenues for detecting these unanticipated biases in LLMs, focusing specifically on Uncertainty Quantification and Explainable AI methods. These approaches aim to assess the certainty of model decisions and to make the internal decision-making processes of LLMs more transparent, thereby identifying and understanding biases that are not immediately apparent. Through this research, we aim to contribute to the development of fairer and more transparent AI systems.

large language model, machine learning, natural language, (16 more...)

2404.0265

Country: North America > United States > California (0.14)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry:

Law (0.68)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

arXiv.org Artificial IntelligenceMar-14-2024

More than words: Advancements and challenges in speech recognition for singing

Kruspe, Anna

This paper addresses the challenges and advancements in speech recognition for singing, a domain distinctly different from standard speech recognition. Singing encompasses unique challenges, including extensive pitch variations, diverse vocal styles, and background music interference. We explore key areas such as phoneme recognition, language identification in songs, keyword spotting, and full lyrics transcription. I will describe some of my own experiences when performing research on these tasks just as they were starting to gain traction, but will also show how recent developments in deep learning and large-scale datasets have propelled progress in this field. My goal is to illuminate the complexities of applying speech recognition to singing, evaluate current capabilities, and outline future research directions.

artificial intelligence, machine learning, natural language, (16 more...)

2403.09298

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)

arXiv.org Artificial IntelligenceSep-14-2023

Towards Large-scale Building Attribute Mapping using Crowdsourced Images: Scene Text Recognition on Flickr and Problems to be Solved

Sun, Yao, Kruspe, Anna, Meng, Liqiu, Tian, Yifan, Hoffmann, Eike J, Auer, Stefan, Zhu, Xiao Xiang

Crowdsourced platforms provide huge amounts of street-view images that contain valuable building information. This work addresses the challenges in applying Scene Text Recognition (STR) in crowdsourced street-view images for building attribute mapping. We use Flickr images, particularly examining texts on building facades. A Berlin Flickr dataset is created, and pre-trained STR models are used for text detection and recognition. Manual checking on a subset of STR-recognized images demonstrates high accuracy. We examined the correlation between STR results and building functions, and analysed instances where texts were recognized on residential buildings but not on commercial ones. Further investigation revealed significant challenges associated with this task, including small text regions in street-view images, the absence of ground truth labels, and mismatches in buildings in Flickr images and building footprints in OpenStreetMap (OSM). To develop city-wide mapping beyond urban hotspot locations, we suggest differentiating the scenarios where STR proves effective while developing appropriate algorithms or bringing in additional data for handling other cases. Furthermore, interdisciplinary collaboration should be undertaken to understand the motivation behind building photography and labeling. The STR-on-Flickr results are publicly available at https://github.com/ya0-sun/STR-Berlin.

artificial intelligence, machine learning, pattern recognition, (15 more...)

2309.08042

Country: Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.62)

arXiv.org Machine LearningJul-7-2021

A Survey of Uncertainty in Deep Neural Networks

Gawlikowski, Jakob, Tassi, Cedrique Rovile Njieutcheu, Ali, Mohsin, Lee, Jongseok, Humt, Matthias, Feng, Jianxiang, Kruspe, Anna, Triebel, Rudolph, Jung, Peter, Roscher, Ribana, Shahzad, Muhammad, Yang, Wen, Bamler, Richard, Zhu, Xiao Xiang

Due to their increasing spread, confidence in neural network predictions became more and more important. However, basic neural networks do not deliver certainty estimates or suffer from over or under confidence. Many researchers have been working on understanding and quantifying uncertainty in a neural network's prediction. As a result, different types and sources of uncertainty have been identified and a variety of approaches to measure and quantify uncertainty in neural networks have been proposed. This work gives a comprehensive overview of uncertainty estimation in neural networks, reviews recent advances in the field, highlights current challenges, and identifies potential research opportunities. It is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction, without presupposing prior knowledge in this field. A comprehensive introduction to the most crucial sources of uncertainty is given and their separation into reducible model uncertainty and not reducible data uncertainty is presented. The modeling of these uncertainties based on deterministic neural networks, Bayesian neural networks, ensemble of neural networks, and test-time data augmentation approaches is introduced and different branches of these fields as well as the latest developments are discussed. For a practical application, we discuss different measures of uncertainty, approaches for the calibration of neural networks and give an overview of existing baselines and implementations. Different examples from the wide spectrum of challenges in different fields give an idea of the needs and challenges regarding uncertainties in practical applications. Additionally, the practical limitations of current methods for mission- and safety-critical real world applications are discussed and an outlook on the next steps towards a broader usage of such methods is given.

deep learning, neural network, survey article, (17 more...)

2107.03342

Country:

North America > United States > New York (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Overview (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Health & Medicine > Therapeutic Area (0.93)
Information Technology (0.67)
(3 more...)

arXiv.org Machine LearningAug-27-2020

Cross-language sentiment analysis of European Twitter messages duringthe COVID-19 pandemic

Kruspe, Anna, Häberle, Matthias, Kuhn, Iona, Zhu, Xiao Xiang

Social media data can be a very salient source of information during crises. User-generated messages provide a window into people's minds during such times, allowing us insights about their moods and opinions. Due to the vast amounts of such messages, a large-scale analysis of population-wide developments becomes possible. In this paper, we analyze Twitter messages (tweets) collected during the first months of the COVID-19 pandemic in Europe with regard to their sentiment. This is implemented with a neural network for sentiment analysis using multilingual sentence embeddings. We separate the results by country of origin, and correlate their temporal development with events in those countries. This allows us to study the effect of the situation on people's moods. We see, for example, that lockdown announcements correlate with a deterioration of mood in almost all surveyed countries, which recovers within a short time span.

immunology, neural network, tweet, (19 more...)

2008.12172

Country:

Europe > Germany (0.17)
Europe > United Kingdom (0.14)
North America > United States (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.92)
Health & Medicine > Therapeutic Area > Immunology (0.92)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

arXiv.org Machine LearningOct-5-2019

Few-shot tweet detection in emerging disaster events

Kruspe, Anna

Social media sources can provide crucial information in crisis situations, but discovering relevant messages is not trivial. Methods have so far focused on universal detection models for all kinds of crises or for certain crisis types (e.g. floods). Event-specific models could implement a more focused search area, but collecting data and training new models for a crisis that is already in progress is costly and may take too much time for a prompt response. As a compromise, manually collecting a small amount of example messages is feasible. Few-shot models can generalize to unseen classes with such a small handful of examples, and do not need be trained anew for each event. We compare how few-shot approaches (matching networks and prototypical networks) perform for this task. Since this is essentially a one-class problem, we also demonstrate how a modified one-class version of prototypical models can be used for this application.

neural network, social media, tweet, (21 more...)

1910.0229

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > Los Angeles County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.40)

Industry:

Transportation > Air (0.46)
Health & Medicine > Therapeutic Area (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

arXiv.org Machine LearningJun-3-2019

One-Way Prototypical Networks

Kruspe, Anna

Few-shot models have become a popular topic of research in the past years. They offer the possibility to determine class belongings for unseen examples using just a handful of examples for each class. Such models are trained on a wide range of classes and their respective examples, learning a decision metric in the process. Types of few-shot models include matching networks and prototypical networks. We show a new way of training prototypical few-shot models for just a single class. These models have the ability to predict the likelihood of an unseen query belonging to a group of examples without any given counterexamples. The difficulty here lies in the fact that no relative distance to other classes can be calculated via softmax. We solve this problem by introducing a "null class" centered around zero, and enforcing centering with batch normalization. Trained on the commonly used Omniglot data set, we obtain a classification accuracy of .98 on the matched test set, and of .8 on unmatched MNIST data. On the more complex MiniImageNet data set, test accuracy is .8. In addition, we propose a novel Gaussian layer for distance calculation in a prototypical network, which takes the support examples' distribution rather than just their centroid into account. This extension shows promising results when a higher number of support examples is available.

deep learning, neural network, support example, (20 more...)

1906.0082

Country:

Europe (0.93)
North America > United States (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)