AITopics

2502.03341

Country: Europe > Austria (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

arXiv.org Artificial Intelligence, Dec-20-2024

Function Space Diversity for Uncertainty Prediction via Repulsive Last-Layer Ensembles

Steger, Sophie, Knoll, Christian, Klein, Bernhard, Fröning, Holger, Pernkopf, Franz

Bayesian inference in function space has gained attention due to its robustness against overparameterization in neural networks. However, approximating the infinite-dimensional function space introduces several challenges. In this work, we discuss function space inference via particle optimization and present practical modifications that improve uncertainty estimation and, most importantly, make it applicable to large and pretrained networks. First, we demonstrate that the input samples, where particle predictions are enforced to be diverse, are detrimental to the model performance. While diversity on training data itself can lead to underfitting, the use of label-destroying data augmentation or unlabeled out-of-distribution data can improve prediction diversity and uncertainty estimates. Furthermore, we take advantage of the function space formulation, which imposes no restrictions on network parameterization other than sufficient flexibility. Instead of using full deep ensembles to represent particles, we propose a single multi-headed network that introduces a minimal increase in parameters and computation. This allows seamless integration into pretrained networks, where this repulsive last-layer ensemble can be used for uncertainty-aware fine-tuning at minimal additional cost. We achieve competitive results in disentangling aleatoric and epistemic uncertainty for active learning, detecting out-of-domain data, and providing calibrated uncertainty estimates under distribution shifts with minimal compute and memory.

artificial intelligence, epistemic uncertainty, machine learning

2412.15758

Country: Europe > Austria (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence, Jul-12-2024

Robustness of Explainable Artificial Intelligence in Industrial Process Modelling

Kantz, Benedikt, Staudinger, Clemens, Feilmayr, Christoph, Wachlmayr, Johannes, Haberl, Alexander, Schuster, Stefan, Pernkopf, Franz

eXplainable Artificial Intelligence (XAI) aims at providing understandable explanations of black box models. In this paper, we evaluate current XAI methods by scoring them based on ground truth simulations and sensitivity analysis. To this end, we used an Electric Arc Furnace (EAF) model to better understand the limits and robustness characteristics of XAI methods such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), as well as Averaged Local Effects (ALE) or Smooth Gradients (SG) in a highly topical setting. These XAI methods were applied to various types of black-box models and then scored based on their correctness compared to the ground-truth sensitivity of the data-generating processes using a novel ...

In the last years, there has been an effort to provide explanations to the ML model predictions using XAI (Lundberg & Lee, 2017; Ribeiro et al., 2018; Alvarez-Melis & Jaakkola, 2018; Shrikumar et al., 2017). Most of these works, even if they focus on the robustness and trustworthiness of the XAI method, have the shortcoming that they can only be evaluated through surrogate measures (Crabbé & van der Schaar, 2023), and the ground truth sensitivity of the evaluated datasets cannot be properly calculated (Alvarez-Melis & Jaakkola, 2018). Some existing approaches rather use data augmentation (Sun et al., 2020) or create measures estimating the importance of the features (Yeh et al., 2019); further related work is provided in Section A.3. None of these systems, to the best of our knowledge, consider the ground truth sensitivity, or gradient, of the data-generating process that created the dataset.

machine learning, natural language, xai method

2407.09127

Country: Europe > Austria > Upper Austria (0.14)

Genre: Research Report (0.50)

Industry: Materials > Metals & Mining (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

arXiv.org Machine Learning, May-24-2024

On the Convexity and Reliability of the Bethe Free Energy Approximation

Leisenberger, Harald, Knoll, Christian, Pernkopf, Franz

The Bethe free energy approximation provides an effective way for relaxing NP-hard problems of probabilistic inference. However, its accuracy depends on the model parameters and particularly degrades if a phase transition in the model occurs. In this work, we analyze when the Bethe approximation is reliable and how this can be verified. We argue and show by experiment that it is mostly accurate if it is convex on a submanifold of its domain, the 'Bethe box'. For verifying its convexity, we derive two sufficient conditions that are based on the definiteness properties of the Bethe Hessian matrix: the first uses the concept of diagonal dominance, and the second decomposes the Bethe Hessian matrix into a sum of sparse matrices and characterizes the definiteness properties of the individual matrices in that sum. These theoretical results provide a simple way to estimate the critical phase transition temperature of a model. As a practical contribution we propose $\texttt{BETHE-MIN}$, a projected quasi-Newton method to efficiently find a minimum of the Bethe free energy.
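The first sufficient condition in the abstract above builds on diagonal dominance. As a minimal sketch of the underlying linear-algebra fact only (not the paper's specific criterion for the Bethe Hessian): a symmetric matrix whose diagonal entries strictly exceed the sum of the absolute off-diagonal entries in each row is positive definite, so a function whose Hessian passes this test is locally convex. The function name and the example matrices are illustrative assumptions.

```python
def is_strictly_diagonally_dominant(matrix):
    """Return True if, in every row, the diagonal entry is strictly larger
    than the sum of the absolute values of the off-diagonal entries."""
    for i, row in enumerate(matrix):
        off_diagonal = sum(abs(value) for j, value in enumerate(row) if j != i)
        if row[i] <= off_diagonal:
            return False
    return True


# Hypothetical symmetric "Hessians" used only for illustration.
hessian_ok = [[2.0, -1.0, 0.0],
              [-1.0, 3.0, -1.0],
              [0.0, -1.0, 2.0]]
hessian_bad = [[1.0, 2.0],
               [2.0, 1.0]]

print(is_strictly_diagonally_dominant(hessian_ok))
print(is_strictly_diagonally_dominant(hessian_bad))
```

The test is sufficient but not necessary: a matrix that fails it may still be positive definite, which is presumably why the abstract pairs it with a second, decomposition-based condition.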

artificial intelligence, bethe free energy, machine learning

2405.15514

Country:

Europe > Austria (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Machine Learning, Feb-22-2024

Rao-Blackwellising Bayesian Causal Inference

Toth, Christian, Knoll, Christian, Pernkopf, Franz, Peharz, Robert

Bayesian causal inference, i.e., inferring a posterior over causal models for use in downstream causal reasoning tasks, poses a hard computational inference problem that is little explored in the literature. In this work, we combine techniques from order-based MCMC structure learning with recent advances in gradient-based graph learning into an effective Bayesian causal inference framework. Specifically, we decompose the problem of inferring the causal structure into (i) inferring a topological order over variables and (ii) inferring the parent sets for each variable. When limiting the number of parents per variable, we can exactly marginalise over the parent sets in polynomial time. We further use Gaussian processes to model the unknown causal mechanisms, which also allows their exact marginalisation. This introduces a Rao-Blackwellization scheme, where all components are eliminated from the model, except for the causal order, for which we learn a distribution via gradient-based optimisation. The combination of Rao-Blackwellization with our sequential inference procedure for causal orders yields state-of-the-art results on linear and non-linear additive noise benchmarks with scale-free and Erdős-Rényi graph structures.

artificial intelligence, causal order, machine learning

2402.14781

Country: Europe > Austria (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Artificial Intelligence, Dec-18-2023

Angle-Equivariant Convolutional Neural Networks for Interference Mitigation in Automotive Radar

Oswald, Christian, Toth, Mate, Meissner, Paul, Pernkopf, Franz

In automotive applications, frequency modulated continuous wave (FMCW) radar is an established technology to determine the distance, velocity and angle of objects in the vicinity of the vehicle. The quality of predictions might be seriously impaired if mutual interference between radar sensors occurs. Previous work processes data from the entire receiver array in parallel to increase interference mitigation quality using neural networks (NNs). However, these architectures do not generalize well across different angles of arrival (AoAs) of interferences and objects. In this paper we introduce a fully convolutional neural network (CNN) with rank-three convolutions, which is able to transfer learned patterns between different AoAs. Our proposed architecture outperforms previous work while having higher robustness and a lower number of trainable parameters. We evaluate our network on a diverse data set and demonstrate its angle equivariance.

artificial intelligence, interference, machine learning

doi: 10.23919/EuRAD58043.2023.10289631

2401.05385

Country: Europe > Austria (0.15)

Genre: Research Report (0.40)

Industry: Automobiles & Trucks (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Dec-15-2023

End-to-End Training of Neural Networks for Automotive Radar Interference Mitigation

Oswald, Christian, Toth, Mate, Meissner, Paul, Pernkopf, Franz

In this paper we propose a new method for training neural networks (NNs) for frequency modulated continuous wave (FMCW) radar mutual interference mitigation. Instead of training NNs to regress from interfered to clean radar signals as in previous work, we train NNs directly on object detection maps. We do so by performing a continuous relaxation of the cell-averaging constant false alarm rate (CA-CFAR) peak detector, which is a well-established algorithm for object detection using radar. With this new training objective we are able to increase object detection performance by a large margin. Furthermore, we introduce separable convolution kernels to strongly reduce the number of parameters and computational complexity of convolutional NN architectures for radar applications. We validate our contributions with experiments on real-world measurement data and compare them against signal processing interference mitigation methods.
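To make the idea of a continuous relaxation of CA-CFAR concrete, the sketch below shows one possible form on a 1-D power profile. It is an assumption for illustration, not the relaxation used in the paper: the hard comparison "cell power > scale x average of training cells" is replaced by a sigmoid with a temperature, so the detection map becomes differentiable in the input. The function names, window sizes, scale factor and temperature are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_ca_cfar(power, num_train=4, num_guard=1, scale=3.0, temperature=0.1):
    """Soft cell-averaging CFAR on a 1-D list of power values.

    For each cell, the noise level is the mean of up to `num_train` training
    cells on each side, skipping `num_guard` guard cells next to the cell
    under test. The hard threshold test is replaced by a sigmoid, giving a
    detection score in (0, 1) instead of a binary decision.
    """
    n = len(power)
    scores = []
    for i in range(n):
        train = []
        for offset in range(num_guard + 1, num_guard + num_train + 1):
            for j in (i - offset, i + offset):
                if 0 <= j < n:  # drop training cells outside the profile
                    train.append(power[j])
        if not train:
            raise ValueError("profile too short for the chosen window")
        threshold = scale * sum(train) / len(train)
        scores.append(sigmoid((power[i] - threshold) / temperature))
    return scores


# Flat noise floor with a single strong target at index 10.
power = [1.0] * 20
power[10] = 20.0
scores = soft_ca_cfar(power)
print(round(scores[10], 3), round(scores[0], 3))
```

Lowering `temperature` moves the scores toward the hard 0/1 decisions of the classical detector, while a larger value gives smoother gradients for training.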

artificial intelligence, interference mitigation, machine learning

2312.0979

Country:

Europe > Austria (0.16)
Europe > Netherlands (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Dec-8-2022

Explainable Machine Learning for Breakdown Prediction in High Gradient RF Cavities

Obermair, Christoph, Cartier-Michaud, Thomas, Apollonio, Andrea, Millar, William, Felsberger, Lukas, Fischl, Lorenz, Bovbjerg, Holger Severin, Wollmann, Daniel, Wuensch, Walter, Catalan-Lasheras, Nuria, Boronat, Marçà, Pernkopf, Franz, Burt, Graeme

The occurrence of vacuum arcs or radio frequency (rf) breakdowns is one of the most prevalent factors limiting the high-gradient performance of normal conducting rf cavities in particle accelerators. In this paper, we search for the existence of previously unrecognized features related to the incidence of rf breakdowns by applying a machine learning strategy to high-gradient cavity data from CERN's test stand for the Compact Linear Collider (CLIC). By interpreting the parameters of the learned models with explainable artificial intelligence (AI), we reverse-engineer physical properties for deriving fast, reliable, and simple rule-based models. Based on 6 months of historical data and dedicated experiments, our models show fractions of data with a high influence on the occurrence of breakdowns. Specifically, it is shown that the field emitted current following an initial breakdown is closely related to the probability of another breakdown occurring shortly thereafter. Results also indicate that the cavity pressure should be monitored with increased temporal resolution in future experiments, to further explore the vacuum activity associated with breakdowns.

breakdown, data mining, machine learning

doi: 10.1103/PhysRevAccelBeams.25.104601

2202.0561

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.67)
Energy > Oil & Gas (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Machine Learning, Apr-14-2021

End-to-end Keyword Spotting using Neural Architecture Search and Quantization

Peter, David, Roth, Wolfgang, Pernkopf, Franz

This paper introduces neural architecture search (NAS) for the automatic discovery of end-to-end keyword spotting (KWS) models in limited resource environments. We employ a differentiable NAS approach to optimize the structure of convolutional neural networks (CNNs) operating on raw audio waveforms. After a suitable KWS model is found with NAS, we conduct quantization of weights and activations to reduce the memory footprint. We conduct extensive experiments on the Google speech commands dataset. In particular, we compare our end-to-end approach to mel-frequency cepstral coefficient (MFCC) based systems. For quantization, we compare fixed bit-width quantization and trained bit-width quantization. Using NAS only, we were able to obtain a highly efficient model with an accuracy of 95.55% using 75.7k parameters and 13.6M operations. Using trained bit-width quantization, the same model achieves a test accuracy of 93.76% while using on average only 2.91 bits per activation and 2.51 bits per weight.

deep learning, neural network, quantization

2104.06666

Country: Europe > Austria (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial Intelligence, Oct-22-2020

On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks

Roth, Wolfgang, Schindler, Günther, Fröning, Holger, Pernkopf, Franz

We present two methods to reduce the complexity of Bayesian network (BN) classifiers. First, we introduce quantization-aware training using the straight-through gradient estimator to quantize the parameters of BNs to few bits. Second, we extend a recently proposed differentiable tree-augmented naive Bayes (TAN) structure learning approach by also considering the model size. Both methods are motivated by recent developments in the deep learning community, and they provide effective means to trade off between model size and prediction accuracy, which is demonstrated in extensive experiments. Furthermore, we contrast quantized BN classifiers with quantized deep neural networks (DNNs) for small-scale scenarios which have hardly been investigated in the literature. We show Pareto optimal models with respect to model size, number of operations, and test error and find that both model classes are viable options.
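The straight-through gradient estimator mentioned above can be illustrated on a toy problem. The sketch below is an assumption for illustration and does not reproduce the paper's setup for Bayesian network parameters: a single scalar weight is quantized to 2 bits in the forward pass, while the backward pass treats the rounding as the identity so that the real-valued weight can still be updated. The quantizer, loss, learning rate and starting value are all hypothetical.

```python
def quantize(weight, bits):
    """Uniformly quantize a weight, clipped to [-1, 1], onto 2**bits levels."""
    clipped = max(-1.0, min(1.0, weight))
    step = 2.0 / (2 ** bits - 1)
    return round((clipped + 1.0) / step) * step - 1.0


def ste_gradient(weight, upstream_grad):
    """Straight-through estimator: pass the gradient through the rounding
    unchanged, and zero it where the weight lies outside the clipping range."""
    return upstream_grad if -1.0 <= weight <= 1.0 else 0.0


def train(target, bits=2, lr=0.1, steps=50, weight=-0.75):
    """Minimize (quantize(weight) - target)**2 by gradient descent with STE."""
    for _ in range(steps):
        q = quantize(weight, bits)       # forward pass uses the quantized value
        grad_q = 2.0 * (q - target)      # gradient of the loss w.r.t. q
        weight -= lr * ste_gradient(weight, grad_q)  # update the latent weight
    return weight


latent = train(target=1.0 / 3.0)
print(latent, quantize(latent, 2))
```

The real-valued `latent` weight is kept only during training; at deployment only the quantized value would be stored, which is where the memory saving comes from.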

artificial intelligence, classifier, neural network

2010.11773

Country:

Europe > Austria (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)