Collaborating Authors: Sutter, Thomas M.


RadVLM: A Multitask Conversational Vision-Language Model for Radiology

arXiv.org Artificial Intelligence

X-rays have played a fundamental role in medicine since their discovery in 1895 (Röntgen, 1895), and continue to be the most frequently used medical imaging modality worldwide due to their convenience and cost-effectiveness (Akhter et al., 2023). The chest X-ray (CXR) remains the most commonly performed radiological exam globally and is particularly important for diagnosing and monitoring thoracic conditions such as pneumonia, heart failure, and lung cancer (Çallı et al., 2021). Problematically, the growing volume of CXRs and other imaging studies in recent years has led to a reduction in the time available for radiologists to thoroughly evaluate each case (Peng et al., 2022). As a result, in many countries, the responsibility of interpreting CXRs is often transferred to non-radiology physicians, who typically possess less specialized training and experience. This shift increases the risk of diagnostic errors or misinterpretations (Shammari et al., 2021; Peng et al., 2022). The shortage of trained personnel for CXR interpretation has led to the exploration of automated agents to assist physicians in diagnostic tasks. In recent years, various deep learning models have shown promise in clinical applications, such as the detection of conditions like COVID-19 pneumonia (Nishio et al., 2020) or pulmonary nodules (Homayounieh et al., 2021). Another extensively studied task is the automated generation of free-text reports from CXR images using transformer-based architectures (Nooralahzadeh et al., 2021; Yang et al., 2023; Hyland et al., 2023; Chaves et al., 2024). These models can provide preliminary drafts summarizing key observations from the CXR, offering a potential enhancement to the diagnostic workflow.


Weakly-Supervised Multimodal Learning on MIMIC-CXR

arXiv.org Artificial Intelligence

Multimodal data integration and label scarcity pose significant challenges for machine learning in medical settings. To address these issues, we conduct an in-depth evaluation of the newly proposed Multimodal Variational Mixture-of-Experts (MMVM) VAE on the challenging MIMIC-CXR dataset. Our analysis demonstrates that the MMVM VAE consistently outperforms other multimodal VAEs and fully supervised approaches, highlighting its strong potential for real-world medical applications.


Unity by Diversity: Improved Representation Learning in Multimodal VAEs

arXiv.org Artificial Intelligence

Variational Autoencoders for multimodal data hold promise for many tasks in data analysis, such as representation learning, conditional generation, and imputation. Current architectures either share the encoder output, decoder input, or both across modalities to learn a shared representation. Such architectures impose hard constraints on the model. In this work, we show that a better latent representation can be obtained by replacing these hard constraints with a soft constraint. We propose a new mixture-of-experts prior, softly guiding each modality's latent representation towards a shared aggregate posterior. This approach results in a superior latent representation and allows each encoding to better preserve information from its uncompressed original features. In extensive experiments on multiple benchmark datasets and two challenging real-world datasets, we show improved learned latent representations and imputation of missing data modalities compared to existing methods.
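
To make the soft constraint concrete: the alignment term can be read as pulling each unimodal posterior toward the mixture of all unimodal posteriors. Below is a minimal PyTorch sketch of such a penalty, estimated by Monte Carlo; the function name and the uniform mixture weighting are illustrative assumptions, not the paper's exact implementation.

```python
import math

import torch
from torch.distributions import Normal

def soft_alignment_penalty(mus, logvars, n_samples=1):
    """Monte Carlo estimate of sum_m KL(q_m || (1/M) sum_m' q_m').

    mus, logvars: lists of [batch, latent_dim] tensors, one per modality.
    Instead of forcing all modalities through one shared posterior, the
    penalty softly pulls each unimodal posterior toward the mixture of
    all unimodal posteriors (uniform weighting assumed here).
    """
    M = len(mus)
    dists = [Normal(mu, (0.5 * lv).exp()) for mu, lv in zip(mus, logvars)]
    penalty = 0.0
    for q_m in dists:
        z = q_m.rsample((n_samples,))       # reparameterized samples
        log_q_m = q_m.log_prob(z).sum(-1)   # log q_m(z), summed over dims
        # log of the mixture (1/M) * sum_m' q_m'(z)
        log_mix = torch.logsumexp(
            torch.stack([d.log_prob(z).sum(-1) for d in dists]), dim=0
        ) - math.log(M)
        penalty = penalty + (log_q_m - log_mix).mean()
    return penalty
```

Adding such a penalty to per-modality reconstruction terms leaves each encoder free to keep modality-specific detail, which is the intended contrast with hard parameter sharing.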


Anomaly Detection by Context Contrasting

arXiv.org Artificial Intelligence

Anomaly detection focuses on identifying samples that deviate from the norm. When working with high-dimensional data such as images, a crucial requirement for detecting anomalous patterns is learning lower-dimensional representations that capture normal concepts seen during training. Recent advances in self-supervised learning have shown great promise in this regard. However, many of the most successful self-supervised anomaly detection methods assume prior knowledge about the structure of anomalies and leverage synthetic anomalies during training. Yet, in many real-world applications, we do not know what to expect from unseen data, and we can solely leverage knowledge about normal data. In this work, we propose Con2, which addresses this problem by setting normal training data into distinct contexts while preserving its normal properties, allowing us to observe the data from different perspectives. Unseen normal data consequently adheres to the learned context representations while anomalies fail to do so, letting us detect them without any knowledge about anomalies during training. Our experiments demonstrate that our approach achieves state-of-the-art performance on various benchmarks while exhibiting superior performance in a more realistic healthcare setting, where knowledge about potential anomalies is often scarce.
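
One way to picture the resulting detector: embed a test sample under each context transformation and compare it against context representations learned from normal data. The sketch below scores samples by cosine similarity to per-context prototype embeddings; the prototype construction and scoring rule are assumptions for illustration, not the paper's exact criterion.

```python
import torch
import torch.nn.functional as F

def anomaly_score(encoder, contexts, x, prototypes):
    """Score samples by how well their context views match learned prototypes.

    encoder:    maps an image batch to embeddings.
    contexts:   list of deterministic, normality-preserving transforms
                (e.g. color shifts or rotations) defining the contexts.
    prototypes: [n_contexts, dim] mean embedding of normal training data
                per context (an assumed, simple choice of representation).
    Normal samples should land close to every context prototype;
    anomalies drift away in at least some contexts.
    """
    sims = []
    for c, t in enumerate(contexts):
        z = F.normalize(encoder(t(x)), dim=-1)      # embed the context view
        sims.append((z * prototypes[c]).sum(-1))    # cosine similarity
    # lower similarity across contexts -> higher anomaly score
    return -torch.stack(sims, dim=0).mean(dim=0)
```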


Differentiable Random Partition Models

arXiv.org Artificial Intelligence

Partitioning a set of elements into an unknown number of mutually exclusive subsets is essential in many machine learning problems. However, assigning elements, such as samples in a dataset or neurons in a network layer, to an unknown and discrete number of subsets is inherently non-differentiable, prohibiting end-to-end gradient-based optimization of parameters. We overcome this limitation by proposing a novel two-step method for inferring partitions, which allows its usage in variational inference tasks. This new approach enables reparameterized gradients with respect to the parameters of the new random partition model. Our method works by first inferring the number of elements per subset and, second, filling these subsets in a learned order. We highlight the versatility of our general-purpose approach on three different challenging experiments: variational clustering, inference of shared and independent generative factors under weak supervision, and multitask learning.
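
The two-step scheme can be illustrated with a hard, non-differentiable version: draw subset sizes, then fill the subsets with elements ranked by Gumbel-perturbed scores. The sketch below shows only these mechanics; the paper's contribution is replacing both steps with reparameterizable relaxations, which this sketch deliberately omits, and the size-sampling step here is a simplified stand-in.

```python
import torch

def two_step_partition(sizes_logits, element_scores):
    """Illustrative (hard, non-differentiable) version of the two-step scheme.

    Step 1: draw the number of elements per subset.
    Step 2: order the elements by perturbed scores and fill the subsets
            in that order.
    """
    n_elements = element_scores.shape[0]
    # Step 1: subset sizes, here from repeated categorical draws over
    # subsets (a simplifying assumption of this sketch)
    probs = torch.softmax(sizes_logits, dim=0)
    draws = torch.multinomial(probs, n_elements, replacement=True)
    sizes = torch.bincount(draws, minlength=sizes_logits.shape[0])
    # Step 2: a learned order via Gumbel-perturbed scores (Gumbel-top-k)
    gumbel = -torch.log(-torch.log(torch.rand_like(element_scores)))
    order = torch.argsort(element_scores + gumbel, descending=True)
    # Fill the subsets with the ordered elements
    assignment = torch.empty(n_elements, dtype=torch.long)
    start = 0
    for k, n_k in enumerate(sizes.tolist()):
        assignment[order[start:start + n_k]] = k
        start += n_k
    return assignment  # assignment[i] = subset index of element i
```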


M(otion)-mode Based Prediction of Ejection Fraction using Echocardiograms

arXiv.org Artificial Intelligence

Early detection of cardiac dysfunction through routine screening is vital for diagnosing cardiovascular diseases. An important metric of cardiac function is the left ventricular ejection fraction (EF), where lower EF is associated with cardiomyopathy. Echocardiography is a popular diagnostic tool in cardiology, with ultrasound being a low-cost, real-time, and non-ionizing technology. However, human assessment of echocardiograms for calculating EF is time-consuming and expertise-demanding, raising the need for an automated approach. In this work, we propose using the M(otion)-mode of echocardiograms for estimating the EF and classifying cardiomyopathy. We generate multiple artificial M-mode images from a single echocardiogram and combine them using off-the-shelf model architectures. Additionally, we extend contrastive learning (CL) to cardiac imaging to learn meaningful representations by exploiting structure in unlabeled data, allowing the model to achieve high accuracy even with limited annotations. Our experiments show that the supervised setting converges with only ten modes and is comparable to the baseline method while bypassing its cumbersome training process and being computationally much more efficient. Furthermore, CL using M-mode images is helpful for limited data scenarios, such as having labels for only 200 patients, which is common in medical applications.
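
The construction of artificial M-mode images is straightforward to sketch: fix a scan line in the B-mode video and stack the intensities along that line over time. A minimal NumPy version is below; the line-sampling scheme (endpoints, number of points) is an assumption, as the paper samples multiple such lines per echocardiogram.

```python
import numpy as np

def mmode_from_video(video, x0, y0, x1, y1, n_points=224):
    """Build one artificial M-mode image from a B-mode echo video.

    video: array of shape (T, H, W), grayscale frames over time.
    (x0, y0)-(x1, y1): endpoints of the scan line, e.g. through the
    left ventricle (hypothetical placement for illustration).
    Returns an (n_points, T) image: space along the line vs. time.
    """
    T, H, W = video.shape
    # sample n_points coordinates along the line, nearest-neighbor lookup
    yi = np.clip(np.round(np.linspace(y0, y1, n_points)).astype(int), 0, H - 1)
    xi = np.clip(np.round(np.linspace(x0, x1, n_points)).astype(int), 0, W - 1)
    # for each frame, read intensities along the line; stack over time
    return video[:, yi, xi].T  # shape (n_points, T)
```

Several such images (different lines through the same video) can then be fed to standard 2D architectures and aggregated, which is what makes off-the-shelf models applicable here.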


Learning Group Importance using the Differentiable Hypergeometric Distribution

arXiv.org Artificial Intelligence

Partitioning a set of elements into subsets of a priori unknown sizes is essential in many applications. These subset sizes are rarely explicitly learned - be it the cluster sizes in clustering applications or the number of shared versus independent generative latent factors in weakly-supervised learning. Probability distributions over correct combinations of subset sizes are non-differentiable due to hard constraints, which prohibit gradient-based optimization. In this work, we propose the differentiable hypergeometric distribution. The hypergeometric distribution models the probability of different group sizes based on their relative importance. We introduce reparameterizable gradients to learn the importance between groups and highlight the advantage of explicitly learning the size of subsets in two typical applications: weakly-supervised learning and clustering. In both applications, we outperform previous approaches, which rely on suboptimal heuristics to model the unknown size of groups.

Many machine learning approaches rely on differentiable sampling procedures, of which the reparameterization trick for Gaussian distributions is the best known (Kingma & Welling, 2014; Rezende et al., 2014). The non-differentiable nature of discrete distributions has long hindered their use in machine learning pipelines with end-to-end gradient-based optimization. Only the concrete distribution (Maddison et al., 2017) or Gumbel-Softmax trick (Jang et al., 2016) boosted the use of categorical distributions in stochastic networks. Unlike the high-variance gradients of score-based methods such as REINFORCE (Williams, 1992), these works enable reparameterized and low-variance gradients with respect to the categorical weights. Despite enormous progress in recent years, the extension to more complex probability distributions is still missing or comes with a trade-off regarding differentiability or computational speed (Huijben et al., 2021). The hypergeometric distribution plays a vital role in various areas of science, such as the social sciences, computer science, and biology. The range of applications goes from modeling gene mutations and recommender systems to analyzing social networks (Becchetti et al., 2011; Casiraghi et al., 2016; Lodato et al., 2015). The hypergeometric distribution describes sampling without replacement and, therefore, models the number of samples per group given a limited number of total samples. Hence, it is essential wherever the choice of a single group element influences the probability of the remaining elements being drawn. Previous work mainly uses the hypergeometric distribution implicitly to model assumptions or as a tool to prove theorems.
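
The building block behind such relaxations is easiest to see in a Gumbel-top-k style sketch: perturb log-weights with Gumbel noise and take relaxed one-hot draws while masking out items already taken, mimicking sampling without replacement. The paper constructs its hypergeometric relaxation differently (by chaining conditional group-level draws), so the code below is a related illustration rather than the proposed method.

```python
import torch
import torch.nn.functional as F

def relaxed_sample_without_replacement(log_w, k, tau=0.5):
    """Differentiable draw of k items without replacement (Gumbel-top-k style).

    log_w: [n] log-importance weights of the items (or groups).
    Returns a [k, n] tensor of relaxed one-hot rows; each step masks out
    the item already (softly) chosen, so later draws behave like
    sampling without replacement.
    """
    gumbel = -torch.log(-torch.log(torch.rand_like(log_w)))
    scores = log_w + gumbel
    picks, mask = [], torch.zeros_like(log_w)
    for _ in range(k):
        p = F.softmax((scores + mask) / tau, dim=-1)  # relaxed one-hot draw
        picks.append(p)
        # suppress the chosen item in later draws (hard mask, soft picks)
        hard = F.one_hot(p.argmax(dim=-1), log_w.shape[-1]).float()
        mask = mask - 1e9 * hard
    return torch.stack(picks)
```

Gradients flow through the relaxed picks with respect to `log_w`, which is exactly the property needed to learn group importance end-to-end.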


Generalized Multimodal ELBO

arXiv.org Machine Learning

Multiple data types naturally co-occur when describing real-world phenomena and learning from them is a long-standing goal in machine learning research. However, existing self-supervised generative models approximating an ELBO are not able to fulfill all desired requirements of multimodal models: their posterior approximation functions lead to a trade-off between the semantic coherence and the ability to learn the joint data distribution. We propose a new, generalized ELBO formulation for multimodal data that overcomes these limitations. The new objective encompasses two previous methods as special cases and combines their benefits without compromises. In extensive experiments, we demonstrate the advantage of the proposed method compared to state-of-the-art models in self-supervised, generative learning tasks.
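
The two earlier methods subsumed as special cases aggregate unimodal posteriors either as a product of experts or as a mixture of experts; a mixture taken over products across modality subsets recovers both. The sketch below shows the Gaussian product-of-experts fusion, which has a closed form, and the enumeration of subset posteriors; the uniform subset weighting and the omission of a prior expert are simplifying assumptions of this sketch.

```python
from itertools import chain, combinations

import torch

def poe(mus, logvars):
    """Closed-form product of diagonal Gaussian experts.

    mus, logvars: [n_experts, batch, dim]. The product of Gaussians is
    Gaussian with summed precisions and precision-weighted mean.
    """
    precision = torch.exp(-logvars)            # 1 / sigma^2
    var = 1.0 / precision.sum(0)
    mu = var * (mus * precision).sum(0)
    return mu, torch.log(var)

def subset_posteriors(mus, logvars):
    """One PoE posterior per non-empty modality subset.

    A generalized objective can then mix these subset posteriors
    (weighted uniformly here, an assumption of this sketch).
    """
    n = mus.shape[0]
    subsets = chain.from_iterable(
        combinations(range(n), r) for r in range(1, n + 1))
    return [poe(mus[list(s)], logvars[list(s)]) for s in subsets]
```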


Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence

arXiv.org Machine Learning

Learning from different data types is a long-standing goal in machine learning research, as multiple information sources co-occur when describing natural phenomena. However, existing generative models that approximate a multimodal ELBO rely on difficult or inefficient training schemes to learn a joint distribution and the dependencies between modalities. In this work, we propose a novel, efficient objective function that utilizes the Jensen-Shannon divergence for multiple distributions. It simultaneously approximates the unimodal and joint multimodal posteriors directly via a dynamic prior. In addition, we theoretically prove that the new multimodal JS-divergence (mmJSD) objective optimizes an ELBO. In extensive experiments, we demonstrate the advantage of the proposed mmJSD model compared to previous work in unsupervised, generative learning tasks.
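
The divergence at the heart of the objective generalizes the two-distribution Jensen-Shannon divergence to K distributions via a weighted mixture. A Monte Carlo sketch for diagonal Gaussian posteriors is below; note that mmJSD replaces the plain mixture with a dynamic prior, which this sketch does not reproduce.

```python
import torch

def mc_multi_js(dists, weights, n_samples=16):
    """Monte Carlo estimate of the generalized Jensen-Shannon divergence
    JS(p_1, ..., p_K) = sum_k w_k KL(p_k || M),  M = sum_k w_k p_k.

    dists:   list of torch.distributions with rsample (e.g. diagonal
             Normals over [batch, dim]), one per modality.
    weights: [K] tensor of mixture weights summing to one.
    """
    js = 0.0
    for k, p_k in enumerate(dists):
        z = p_k.rsample((n_samples,))        # reparameterized samples
        log_p_k = p_k.log_prob(z).sum(-1)    # log p_k(z) over latent dims
        # log M(z) = logsumexp_j [ log w_j + log p_j(z) ]
        log_mix = torch.logsumexp(
            torch.stack([torch.log(weights[j]) + d.log_prob(z).sum(-1)
                         for j, d in enumerate(dists)]), dim=0)
        js = js + weights[k] * (log_p_k - log_mix).mean()
    return js
```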