AITopics | Maier, Andreas

Collaborating Authors

Maier, Andreas

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Speaker- and Text-Independent Estimation of Articulatory Movements and Phoneme Alignments from Speech

Weise, Tobias, Klumpp, Philipp, Demir, Kubilay Can, Pérez-Toro, Paula Andrea, Schuster, Maria, Noeth, Elmar, Heismann, Bjoern, Maier, Andreas, Yang, Seung Hee

arXiv.org Artificial IntelligenceJul-3-2024

This paper introduces a novel combination of two tasks, previously treated separately: acoustic-to-articulatory speech inversion (AAI) and phoneme-to-articulatory (PTA) motion estimation. We refer to this joint task as acoustic phoneme-to-articulatory speech inversion (APTAI) and explore two different approaches, both working speaker- and text-independently during inference. We use a multi-task learning setup, with the end-to-end goal of taking raw speech as input and estimating the corresponding articulatory movements, phoneme sequence, and phoneme alignment. While both proposed approaches share these same requirements, they differ in their way of achieving phoneme-related predictions: one is based on frame classification, the other on a two-staged training procedure and forced alignment. We reach competitive performance of 0.73 mean correlation for the AAI task and achieve up to approximately 87% frame overlap compared to a state-of-the-art text-dependent phoneme force aligner.

alignment, artificial intelligence, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2407.03132

Country: Europe > Germany > Bavaria (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Data-driven Modeling in Metrology -- A Short Introduction, Current Developments and Future Perspectives

Schneider, Linda-Sophie, Krauss, Patrick, Schiering, Nadine, Syben, Christopher, Schielein, Richard, Maier, Andreas

arXiv.org Artificial IntelligenceJun-24-2024

Abstract: Mathematical models are vital to the field of metrology, playing a key role in the derivation of measurement results and the calculation of uncertainties from measurement data, informed by an understanding of the measurement process. These models generally represent the correlation between the quantity being measured and all other pertinent quantities. Such relationships are used to construct measurement systems that can interpret measurement data to generate conclusions and predictions about the measurement system itself. Classic models are typically analytical, built on fundamental physical principles. However, the rise of digital technology, expansive sensor networks, and high-performance computing hardware have led to a growing shift towards data-driven methodologies. This trend is especially prominent when dealing with large, intricate networked sensor systems in situations where there is limited expert understanding of the frequently changing real-world contexts. Here, we demonstrate the variety of opportunities that data-driven modeling presents, and how they have been already implemented in various real-world applications.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2406.16659

Country:

Europe > Germany > Saarland (0.14)
Europe > United Kingdom > England (0.14)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Education (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Health & Medicine > Therapeutic Area (0.93)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Modeling & Simulation (1.00)
Information Technology > Internet of Things (1.00)
(8 more...)

Add feedback

The Impact of Speech Anonymization on Pathology and Its Limits

Arasteh, Soroosh Tayebi, Arias-Vergara, Tomas, Perez-Toro, Paula Andrea, Weise, Tobias, Packhaeuser, Kai, Schuster, Maria, Noeth, Elmar, Maier, Andreas, Yang, Seung Hee

arXiv.org Artificial IntelligenceJun-22-2024

Integration of speech into healthcare has intensified privacy concerns due to its potential as a non-invasive biomarker containing individual biometric information. In response, speaker anonymization aims to conceal personally identifiable information while retaining crucial linguistic content. However, the application of anonymization techniques to pathological speech, a critical area where privacy is especially vital, has not been extensively examined. This study investigates anonymization's impact on pathological speech across over 2,700 speakers from multiple German institutions, focusing on privacy, pathological utility, and demographic fairness. We explore both deep-learning-based and signal processing-based anonymization methods, and document substantial privacy improvements across disorders-evidenced by equal error rate increases up to 1933%, with minimal overall impact on utility. Specific disorders such as Dysarthria, Dysphonia, and Cleft Lip and Palate experienced minimal utility changes, while Dysglossia showed slight improvements. Our findings underscore that the impact of anonymization varies substantially across different disorders. This necessitates disorder-specific anonymization strategies to optimally balance privacy with diagnostic utility. Additionally, our fairness analysis revealed consistent anonymization effects across most of the demographics. This study demonstrates the effectiveness of anonymization in pathological speech for enhancing privacy, while also highlighting the importance of customized and disorder-specific approaches to account for inversion attacks.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2404.08064

Country:

North America > United States (0.67)
Europe > Germany > Bavaria (0.14)
Oceania > Australia > Queensland (0.14)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
(2 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
(2 more...)

Add feedback

Attri-Net: A Globally and Locally Inherently Interpretable Model for Multi-Label Classification Using Class-Specific Counterfactuals

Sun, Susu, Woerner, Stefano, Maier, Andreas, Koch, Lisa M., Baumgartner, Christian F.

arXiv.org Artificial IntelligenceJun-8-2024

Interpretability is crucial for machine learning algorithms in high-stakes medical applications. However, high-performing neural networks typically cannot explain their predictions. Post-hoc explanation methods provide a way to understand neural networks but have been shown to suffer from conceptual problems. Moreover, current research largely focuses on providing local explanations for individual samples rather than global explanations for the model itself. In this paper, we propose Attri-Net, an inherently interpretable model for multi-label classification that provides local and global explanations. Attri-Net first counterfactually generates class-specific attribution maps to highlight the disease evidence, then performs classification with logistic regression classifiers based solely on the attribution maps. Local explanations for each prediction can be obtained by interpreting the attribution maps weighted by the classifiers' weights. Global explanation of whole model can be obtained by jointly considering learned average representations of the attribution maps for each class (called the class centers) and the weights of the linear classifiers. To ensure the model is ``right for the right reason", we further introduce a mechanism to guide the model's explanations to align with human knowledge. Our comprehensive evaluations show that Attri-Net can generate high-quality explanations consistent with clinical knowledge while not sacrificing classification performance.

artificial intelligence, explanation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2406.05477

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

Add feedback

SNOBERT: A Benchmark for clinical notes entity linking in the SNOMED CT clinical terminology

Kulyabin, Mikhail, Sokolov, Gleb, Galaida, Aleksandr, Maier, Andreas, Arias-Vergara, Tomas

arXiv.org Artificial IntelligenceMay-25-2024

The extraction and analysis of insights from medical data, primarily stored in free-text formats by healthcare workers, presents significant challenges due to its unstructured nature. Medical coding, a crucial process in healthcare, remains minimally automated due to the complexity of medical ontologies and restricted access to medical texts for training Natural Language Processing models. In this paper, we proposed a method, "SNOBERT," of linking text spans in clinical notes to specific concepts in the SNOMED CT using BERT-based models. The method consists of two stages: candidate selection and candidate matching. The models were trained on one of the largest publicly available dataset of labeled clinical notes. SNOBERT outperforms other classical methods based on deep learning, as confirmed by the results of a challenge in which it was applied.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.16115

Country:

Europe > Russia (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT

Krauss, Patrick, Hösch, Jannik, Metzner, Claus, Maier, Andreas, Uhrig, Peter, Schilling, Achim

arXiv.org Artificial IntelligenceMay-3-2024

The ability to transmit and receive complex information via language is unique to humans and is the basis of traditions, culture and versatile social interactions. Through the disruptive introduction of transformer based large language models (LLMs) humans are not the only entity to "understand" and produce language any more. In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. Thus, we have used ChatGPT to generate seven different stylistic variations of ten different narratives (Aesop's fables). We used these stories as input for the open source LLM BERT and have analyzed the activation patterns of the hidden units of BERT using multi-dimensional scaling and cluster analysis. We found that the activation vectors of the hidden units cluster according to stylistic variations in earlier layers of BERT (1) than narrative content (4-5). Despite the fact that BERT consists of 12 identical building blocks that are stacked and trained on large text corpora, the different layers perform different tasks. This is a very useful model of the human brain, where self-similar structures, i.e. different areas of the cerebral cortex, can have different functions and are therefore well suited to processing language in a very efficient way. The proposed approach has the potential to open the black box of LLMs on the one hand, and might be a further step to unravel the neural processes underlying human language processing and cognition in general.

large language model, machine learning, transformer block, (19 more...)

arXiv.org Artificial Intelligence

2405.02024

Country: Europe > Germany (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Self-Supervised Learning for Interventional Image Analytics: Towards Robust Device Trackers

Islam, Saahil, Murthy, Venkatesh N., Neumann, Dominik, Das, Badhan Kumar, Sharma, Puneet, Maier, Andreas, Comaniciu, Dorin, Ghesu, Florin C.

arXiv.org Artificial IntelligenceMay-2-2024

An accurate detection and tracking of devices such as guiding catheters in live X-ray image acquisitions is an essential prerequisite for endovascular cardiac interventions. This information is leveraged for procedural guidance, e.g., directing stent placements. To ensure procedural safety and efficacy, there is a need for high robustness no failures during tracking. To achieve that, one needs to efficiently tackle challenges, such as: device obscuration by contrast agent or other external devices or wires, changes in field-of-view or acquisition angle, as well as the continuous movement due to cardiac and respiratory motion. To overcome the aforementioned challenges, we propose a novel approach to learn spatio-temporal features from a very large data cohort of over 16 million interventional X-ray frames using self-supervision for image sequence data. Our approach is based on a masked image modeling technique that leverages frame interpolation based reconstruction to learn fine inter-frame temporal correspondences. The features encoded in the resulting model are fine-tuned downstream. Our approach achieves state-of-the-art performance and in particular robustness compared to ultra optimized reference solutions (that use multi-stage feature fusion, multi-task and flow regularization). The experiments show that our method achieves 66.31% reduction in maximum tracking error against reference solutions (23.20% when flow regularization is used); achieving a success score of 97.95% at a 3x faster inference speed of 42 frames-per-second (on GPU). The results encourage the use of our approach in various other tasks within interventional image analytics that require effective understanding of spatio-temporal semantics.

artificial intelligence, machine learning, pattern recognition, (15 more...)

arXiv.org Artificial Intelligence

2405.01156

Country: North America > United States (0.28)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.64)

Add feedback

Qiskit-Torch-Module: Fast Prototyping of Quantum Neural Networks

Meyer, Nico, Ufrecht, Christian, Periyasamy, Maniraman, Plinge, Axel, Mutschler, Christopher, Scherer, Daniel D., Maier, Andreas

arXiv.org Artificial IntelligenceApr-9-2024

Quantum computer simulation software is an integral tool for the research efforts in the quantum computing community. An important aspect is the efficiency of respective frameworks, especially for training variational quantum algorithms. Focusing on the widely used Qiskit software environment, we develop the qiskit-torch-module. It improves runtime performance by two orders of magnitude over comparable libraries, while facilitating low-overhead integration with existing codebases. Moreover, the framework provides advanced tools for integrating quantum neural networks with PyTorch. The pipeline is tailored for single-machine compute systems, which constitute a widely employed setup in day-to-day research efforts.

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2404.06314

Country: Europe > Germany > Bavaria (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

The Artificial Neural Twin -- Process Optimization and Continual Learning in Distributed Process Chains

Emmert, Johannes, Mendez, Ronald, Dastjerdi, Houman Mirzaalian, Syben, Christopher, Maier, Andreas

arXiv.org Artificial IntelligenceMar-27-2024

Industrial process optimization and control is crucial to increase economic and ecologic efficiency. However, data sovereignty, differing goals, or the required expert knowledge for implementation impede holistic implementation. Further, the increasing use of data-driven AI-methods in process models and industrial sensory often requires regular fine-tuning to accommodate distribution drifts. We propose the Artificial Neural Twin, which combines concepts from model predictive control, deep learning, and sensor networks to address these issues. Our approach introduces differentiable data fusion to estimate the state of distributed process steps and their dependence on input data. By treating the interconnected process steps as a quasi neural-network, we can backpropagate loss gradients for process optimization or model fine-tuning to process parameters or AI models respectively. The concept is demonstrated on a virtual machine park simulated in Unity, consisting of bulk material processes in plastic recycling.

artificial intelligence, machine learning, time step, (18 more...)

arXiv.org Artificial Intelligence

2403.18343

Country: Europe > Germany (0.28)

Genre:

Instructional Material (0.67)
Workflow (0.54)

Industry:

Materials (0.66)
Energy > Oil & Gas > Upstream (0.34)
Water & Waste Management > Solid Waste Management (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(3 more...)

Add feedback

Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation

Öttl, Mathias, Wilm, Frauke, Steenpass, Jana, Qiu, Jingna, Rübner, Matthias, Hartmann, Arndt, Beckmann, Matthias, Fasching, Peter, Maier, Andreas, Erber, Ramona, Kainz, Bernhard, Breininger, Katharina

arXiv.org Artificial IntelligenceMar-21-2024

Deep learning-based image generation has seen significant advancements with diffusion models, notably improving the quality of generated images. Despite these developments, generating images with unseen characteristics beneficial for downstream tasks has received limited attention. To bridge this gap, we propose Style-Extracting Diffusion Models, featuring two conditioning mechanisms. Specifically, we utilize 1) a style conditioning mechanism which allows to inject style information of previously unseen images during image generation and 2) a content conditioning which can be targeted to a downstream task, e.g., layout for segmentation. We introduce a trainable style encoder to extract style information from images, and an aggregation block that merges style information from multiple style inputs. This architecture enables the generation of images with unseen styles in a zero-shot manner, by leveraging styles from unseen images, resulting in more diverse generations. In this work, we use the image layout as target condition and first show the capability of our method on a natural image dataset as a proof-of-concept. We further demonstrate its versatility in histopathology, where we combine prior knowledge about tissue composition and unannotated data to create diverse synthetic images with known layouts. This allows us to generate additional synthetic data to train a segmentation network in a semi-supervised fashion. We verify the added value of the generated images by showing improved segmentation results and lower performance variability between patients when synthetic images are included during segmentation training. Our code will be made publicly available at [LINK].

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2403.14429

Country:

Europe > Germany (0.14)
South America > Peru (0.14)
Europe > France (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback