AITopics | Frintrop, Simone

Collaborating Authors

Frintrop, Simone

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Select High-Level Features: Efficient Experts from a Hierarchical Classification Network

Kelm, André, Hannemann, Niels, Heberle, Bruno, Schmidt, Lucas, Rolff, Tim, Wilms, Christian, Yaghoubi, Ehsan, Frintrop, Simone

arXiv.org Artificial IntelligenceMar-7-2024

This study introduces a novel expert generation method that dynamically reduces task and computational complexity without compromising predictive performance. It is based on a new hierarchical classification network topology that combines sequential processing of generic low-level features with parallelism and nesting of high-level features. This structure allows for the innovative extraction technique: the ability to select only high-level features of task-relevant categories. In certain cases, it is possible to skip almost all unneeded high-level features, which can significantly reduce the inference cost and is highly beneficial in resource-constrained conditions. We believe this method paves the way for future network designs that are lightweight and adaptable, making them suitable for a wide range of applications, from compact edge devices to large-scale clouds. In terms of dynamic inference our methodology can achieve an exclusion of up to 88.7\,\% of parameters and 73.4\,\% fewer giga-multiply accumulate (GMAC) operations, analysis against comparative baselines showing an average reduction of 47.6\,\% in parameters and 5.8\,\% in GMACs across the cases we evaluated.

artificial intelligence, category, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.05601

Country: Europe > Germany (0.15)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

The MSR-Video to Text Dataset with Clean Annotations

Chen, Haoran, Li, Jianmin, Frintrop, Simone, Hu, Xiaolin

arXiv.org Artificial IntelligenceFeb-25-2024

Video captioning automatically generates short descriptions of the video content, usually in form of a single sentence. Many methods have been proposed for solving this task. A large dataset called MSR Video to Text (MSR-VTT) is often used as the benchmark dataset for testing the performance of the methods. However, we found that the human annotations, i.e., the descriptions of video contents in the dataset are quite noisy, e.g., there are many duplicate captions and many captions contain grammatical problems. These problems may pose difficulties to video captioning models for learning underlying patterns. We cleaned the MSR-VTT annotations by removing these problems, then tested several typical video captioning models on the cleaned dataset. Experimental results showed that data cleaning boosted the performances of the models measured by popular quantitative metrics. We recruited subjects to evaluate the results of a model trained on the original and cleaned datasets. The human behavior experiment demonstrated that trained on the cleaned dataset, the model generated captions that were more coherent and more relevant to the contents of the video clips.

annotation, artificial intelligence, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.cviu.2022.103581

2102.06448

Country:

Asia > China (0.15)
North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.98)
Information Technology > Data Science > Data Quality (0.69)

Add feedback

Audio-Visual Speech Enhancement with Score-Based Generative Models

Richter, Julius, Frintrop, Simone, Gerkmann, Timo

arXiv.org Artificial IntelligenceJun-2-2023

This paper introduces an audio-visual speech enhancement system that leverages score-based generative models, also known as diffusion models, conditioned on visual information. In particular, we exploit audio-visual embeddings obtained from a self-super\-vised learning model that has been fine-tuned on lipreading. The layer-wise features of its transformer-based encoder are aggregated, time-aligned, and incorporated into the noise conditional score network. Experimental evaluations show that the proposed audio-visual speech enhancement system yields improved speech quality and reduces generative artifacts such as phonetic confusions with respect to the audio-only equivalent. The latter is supported by the word error rate of a downstream automatic speech recognition model, which decreases noticeably, especially at low input signal-to-noise ratios.

artificial intelligence, machine learning, speech enhancement, (16 more...)

arXiv.org Artificial Intelligence

2306.01432

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Teacher Network Calibration Improves Cross-Quality Knowledge Distillation

Čuk, Pia, Senge, Robin, Lauri, Mikko, Frintrop, Simone

arXiv.org Artificial IntelligenceApr-15-2023

We investigate cross-quality knowledge distillation (CQKD), a knowledge distillation method where knowledge from a teacher network trained with full-resolution images is transferred to a student network that takes as input low-resolution images. As image size is a deciding factor for the computational load of computer vision applications, CQKD notably reduces the requirements by only using the student network at inference time. Our experimental results show that CQKD outperforms supervised learning in large-scale image classification problems. We also highlight the importance of calibrating neural networks: we show that with higher temperature smoothing of the teacher's output distribution, the student distribution exhibits a higher entropy, which leads to both, a lower calibration error and a higher network accuracy.

artificial intelligence, machine learning, student network, (16 more...)

arXiv.org Artificial Intelligence

2304.07593

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Immersive Neural Graphics Primitives

Li, Ke, Rolff, Tim, Schmidt, Susanne, Bacher, Reinhard, Frintrop, Simone, Leemans, Wim, Steinicke, Frank

arXiv.org Artificial IntelligenceNov-24-2022

Neural radiance field (NeRF), in particular its extension by instant neural graphics primitives, is a novel rendering method for view synthesis that uses real-world images to build photo-realistic immersive virtual scenes. Despite its potential, research on the combination of NeRF and virtual reality (VR) remains sparse. Currently, there is no integration into typical VR systems available, and the performance and suitability of NeRF implementations for VR have not been evaluated, for instance, for different scene complexities or screen resolutions. In this paper, we present and evaluate a NeRF-based framework that is capable of rendering scenes in immersive VR allowing users to freely move their heads to explore complex real-world scenes. We evaluate our framework by benchmarking three different NeRF scenes concerning their rendering performance at different scene complexities and resolutions. Utilizing super-resolution, our approach can yield a frame rate of 30 frames per second with a resolution of 1280x720 pixels per eye. We discuss potential applications of our framework and provide an open source implementation online.

frame rate, human computer interaction, resolution, (15 more...)

arXiv.org Artificial Intelligence

2211.13494

Genre: Research Report (0.64)

Industry:

Information Technology (0.67)
Leisure & Entertainment > Games > Computer Games (0.66)
Semiconductors & Electronics (0.54)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback