AITopics

2502.05458

Country:

Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Machine LearningFeb-7-2025

Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries

Kolb, Chris, Weber, Tobias, Bischl, Bernd, Rügamer, David

Sparse regularization techniques are well-established in machine learning, yet their application in neural networks remains challenging due to the non-differentiability of penalties like the $L_1$ norm, which is incompatible with stochastic gradient descent. A promising alternative is shallow weight factorization, where weights are decomposed into two factors, allowing for smooth optimization of $L_1$-penalized neural networks by adding differentiable $L_2$ regularization to the factors. In this work, we introduce deep weight factorization, extending previous shallow approaches to more than two factors. We theoretically establish equivalence of our deep factorization with non-convex sparse regularization and analyze its impact on training dynamics and optimization. Due to the limitations posed by standard training practices, we propose a tailored initialization scheme and identify important learning rate requirements necessary for training factorized networks. We demonstrate the effectiveness of our deep weight factorization through experiments on various architectures and datasets, consistently outperforming its shallow counterpart and widely used pruning methods.

artificial intelligence, deep learning, machine learning, (18 more...)

2502.02496

Country:

Europe (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

arXiv.org Machine LearningJun-7-2024

Linearization Turns Neural Operators into Function-Valued Gaussian Processes

Magnani, Emilia, Pförtner, Marvin, Weber, Tobias, Hennig, Philipp

Modeling dynamical systems, e.g. in climate and engineering sciences, often necessitates solving partial differential equations. Neural operators are deep neural networks designed to learn nontrivial solution operators of such differential equations from data. As for all statistical models, the predictions of these models are imperfect and exhibit errors. Such errors are particularly difficult to spot in the complex nonlinear behaviour of dynamical systems. We introduce a new framework for approximate Bayesian uncertainty quantification in neural operators using function-valued Gaussian processes. Our approach can be interpreted as a probabilistic analogue of the concept of currying from functional programming and provides a practical yet theoretically sound way to apply the linearized Laplace approximation to neural operators. In a case study on Fourier neural operators, we show that, even for a discretized input, our method yields a Gaussian closure--a structured Gaussian process posterior capturing the uncertainty in the output function of the neural operator, which can be evaluated at an arbitrary set of points. The method adds minimal prediction overhead, can be applied post-hoc without retraining the neural operator, and scales to large models and datasets. We showcase the efficacy of our approach through applications to different types of partial differential equations.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2406.05072

Country: Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Artificial IntelligenceJun-2-2024

Generalizing Orthogonalization for Models with Non-Linearities

Rügamer, David, Kolb, Chris, Weber, Tobias, Kook, Lucas, Nagler, Thomas

The complexity of black-box algorithms can lead to various challenges, including the introduction of biases. These biases present immediate risks in the algorithms' application. It was, for instance, shown that neural networks can deduce racial information solely from a patient's X-ray scan, a task beyond the capability of medical experts. If this fact is not known to the medical expert, automatic decision-making based on this algorithm could lead to prescribing a treatment (purely) based on racial information. While current methodologies allow for the "orthogonalization" or "normalization" of neural networks with respect to such information, existing approaches are grounded in linear models. Our paper advances the discourse by introducing corrections for non-linearities such as ReLU activations. Our approach also encompasses scalar and tensor-valued predictions, facilitating its integration into neural network architectures. Through extensive experiments, we validate our method's effectiveness in safeguarding sensitive data in generalized linear models, normalizing convolutional neural networks for metadata, and rectifying pre-existing embeddings for undesired attributes.

artificial intelligence, machine learning, orthogonalization, (19 more...)

2405.02475

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial IntelligenceApr-18-2024

Post-Training Network Compression for 3D Medical Image Segmentation: Reducing Computational Efforts via Tucker Decomposition

Weber, Tobias, Dexl, Jakob, Rügamer, David, Ingrisch, Michael

We address the computational barrier of deploying advanced deep learning segmentation models in clinical settings by studying the efficacy of network compression through tensor decomposition. We propose a post-training Tucker factorization that enables the decomposition of pre-existing models to reduce computational requirements without impeding segmentation accuracy. We applied Tucker decomposition to the convolutional kernels of the TotalSegmentator (TS) model, an nnU-Net model trained on a comprehensive dataset for automatic segmentation of 117 anatomical structures. Our approach reduced the floating-point operations (FLOPs) and memory required during inference, offering an adjustable trade-off between computational efficiency and segmentation quality. This study utilized the publicly available TS dataset, employing various downsampling factors to explore the relationship between model size, inference speed, and segmentation performance. The application of Tucker decomposition to the TS model substantially reduced the model parameters and FLOPs across various compression rates, with limited loss in segmentation accuracy. We removed up to 88% of the model's parameters with no significant performance changes in the majority of classes after fine-tuning. Practical benefits varied across different graphics processing unit (GPU) architectures, with more distinct speed-ups on less powerful hardware. Post-hoc network compression via Tucker decomposition presents a viable strategy for reducing the computational demand of medical image segmentation models without substantially sacrificing accuracy. This approach enables the broader adoption of advanced deep learning technologies in clinical practice, offering a way to navigate the constraints of hardware capabilities.

artificial intelligence, deep learning, machine learning, (17 more...)

2404.09683

Country: Europe (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningNov-2-2023

Unreading Race: Purging Protected Features from Chest X-ray Embeddings

Weber, Tobias, Ingrisch, Michael, Bischl, Bernd, Rügamer, David

Purpose: To analyze and remove protected feature effects in chest radiograph embeddings of deep learning models. Materials and Methods: An orthogonalization is utilized to remove the influence of protected features (e.g., age, sex, race) in chest radiograph embeddings, ensuring feature-independent results. To validate the efficacy of the approach, we retrospectively study the MIMIC and CheXpert datasets using three pre-trained models, namely a supervised contrastive, a self-supervised contrastive, and a baseline classifier model. Our statistical analysis involves comparing the original versus the orthogonalized embeddings by estimating protected feature influences and evaluating the ability to predict race, age, or sex using the two types of embeddings. Results: Our experiments reveal a significant influence of protected features on predictions of pathologies. Applying orthogonalization removes these feature effects. Apart from removing any influence on pathology classification, while maintaining competitive predictive performance, orthogonalized embeddings further make it infeasible to directly predict protected attributes and mitigate subgroup disparities. Conclusion: The presented work demonstrates the successful application and evaluation of the orthogonalization technique in the domain of chest X-ray classification.

artificial intelligence, machine learning, pleural effusion, (15 more...)

2311.01349

Country: Asia (0.46)

Genre: Research Report > Experimental Study (0.95)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningOct-27-2023

Adversarial Anomaly Detection using Gaussian Priors and Nonlinear Anomaly Scores

Lüer, Fiete, Weber, Tobias, Dolgich, Maxim, Böhm, Christian

Anomaly detection in imbalanced datasets is a frequent and crucial problem, especially in the medical domain where retrieving and labeling irregularities is often expensive. By combining the generative stability of a $\beta$-variational autoencoder (VAE) with the discriminative strengths of generative adversarial networks (GANs), we propose a novel model, $\beta$-VAEGAN. We investigate methods for composing anomaly scores based on the discriminative and reconstructive capabilities of our model. Existing work focuses on linear combinations of these components to determine if data is anomalous. We advance existing work by training a kernelized support vector machine (SVM) on the respective error components to also consider nonlinear relationships. This improves anomaly detection performance, while allowing faster optimization. Lastly, we use the deviations from the Gaussian prior of $\beta$-VAEGAN to form a novel anomaly score component. In comparison to state-of-the-art work, we improve the $F_1$ score during anomaly detection from 0.85 to 0.92 on the widely used MITBIH Arrhythmia Database.

artificial intelligence, data mining, machine learning, (16 more...)

2310.18091

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > Promising Solution (0.66)

Industry:

Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.88)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceAug-22-2023

Constrained Probabilistic Mask Learning for Task-specific Undersampled MRI Reconstruction

Weber, Tobias, Ingrisch, Michael, Bischl, Bernd, Rügamer, David

Undersampling is a common method in Magnetic Resonance Imaging (MRI) to subsample the number of data points in k-space, reducing acquisition times at the cost of decreased image quality. A popular approach is to employ undersampling patterns following various strategies, e.g., variable density sampling or radial trajectories. In this work, we propose a method that directly learns the undersampling masks from data points, thereby also providing task- and domain-specific patterns. To solve the resulting discrete optimization problem, we propose a general optimization routine called ProM: A fully probabilistic, differentiable, versatile, and model-free framework for mask optimization that enforces acceleration factors through a convex constraint. Analyzing knee, brain, and cardiac MRI datasets with our method, we discover that different anatomic regions reveal distinct optimal undersampling masks, demonstrating the benefits of using custom masks, tailored for a downstream task. For example, ProM can create undersampling masks that maximize performance in downstream tasks like segmentation with networks trained on fully-sampled MRIs. Even with extreme acceleration factors, ProM yields reasonable performance while being more versatile than existing methods, paving the way for data-driven all-purpose mask generation.

artificial intelligence, machine learning, optimization problem, (18 more...)

2305.16376

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceMar-20-2023

Cascaded Latent Diffusion Models for High-Resolution Chest X-ray Synthesis

Weber, Tobias, Ingrisch, Michael, Bischl, Bernd, Rügamer, David

While recent advances in large-scale foundational models show promising results, their application to the medical domain has not yet been explored in detail. In this paper, we progress into the realms of large-scale modeling in medical synthesis by proposing Cheff - a foundational cascaded latent diffusion model, which generates highly-realistic chest radiographs providing state-of-the-art quality on a 1-megapixel scale. We further propose MaCheX, which is a unified interface for public chest datasets and forms the largest open collection of chest X-rays up to date. With Cheff conditioned on radiological reports, we further guide the synthesis process over text prompts and unveil the research area of report-to-chest-X-ray generation.

artificial intelligence, diffusion model, machine learning, (14 more...)

2303.11224

Country:

North America > United States (0.47)
Europe (0.28)
Asia (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

arXiv.org Artificial IntelligenceDec-30-2022

ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports

Jeblick, Katharina, Schachtner, Balthasar, Dexl, Jakob, Mittermeier, Andreas, Stüber, Anna Theresa, Topalis, Johanna, Weber, Tobias, Wesp, Philipp, Sabel, Bastian, Ricke, Jens, Ingrisch, Michael

The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.

artificial intelligence, natural language, simplified radiology report, (2 more...)

2212.14882

Genre: Research Report (0.40)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)