AITopics

2406.01494

Country: Europe (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceMar-20-2024

ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations

Chakraborty, Rwiddhi, Sletten, Adrian, Kampffmeyer, Michael

Group robustness strategies aim to mitigate learned biases in deep learning models that arise from spurious correlations present in their training datasets. However, most existing methods rely on the access to the label distribution of the groups, which is time-consuming and expensive to obtain. As a result, unsupervised group robustness strategies are sought. Based on the insight that a trained model's classification strategies can be inferred accurately based on explainability heatmaps, we introduce ExMap, an unsupervised two stage mechanism designed to enhance group robustness in traditional classifiers. ExMap utilizes a clustering module to infer pseudo-labels based on a model's explainability heatmaps, which are then used during training in lieu of actual labels. Our empirical studies validate the efficacy of ExMap - We demonstrate that it bridges the performance gap with its supervised counterparts and outperforms existing partially supervised and unsupervised methods. Additionally, ExMap can be seamlessly integrated with existing group robustness learning strategies. Finally, we demonstrate its potential in tackling the emerging issue of multiple shortcut mitigation\footnote{Code available at \url{https://github.com/rwchakra/exmap}}.

artificial intelligence, deep learning, machine learning, (19 more...)

2403.1387

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

arXiv.org Artificial IntelligenceDec-8-2023

Data-Centric Machine Learning for Geospatial Remote Sensing Data

Roscher, Ribana, Rußwurm, Marc, Gevaert, Caroline, Kampffmeyer, Michael, Santos, Jefersson A. dos, Vakalopoulou, Maria, Hänsch, Ronny, Hansen, Stine, Nogueira, Keiller, Prexl, Jonathan, Tuia, Devis

Recent developments and research in modern machine learning have led to substantial improvements in the geospatial field. Although numerous deep learning models have been proposed, the majority of them have been developed on benchmark datasets that lack strong real-world relevance. Furthermore, the performance of many methods has already saturated on these datasets. We argue that shifting the focus towards a complementary data-centric perspective is necessary to achieve further improvements in accuracy, generalization ability, and real impact in end-user applications. This work presents a definition and precise categorization of automated data-centric learning approaches for geospatial data. It highlights the complementary role of data-centric learning with respect to model-centric in the larger machine learning deployment cycle. We review papers across the entire geospatial field and categorize them into different groups. A set of representative experiments shows concrete implementation examples. These examples provide concrete steps to act on geospatial data with data-centric machine learning approaches.

artificial intelligence, machine learning, survey article, (16 more...)

2312.05327

Country:

North America > United States (0.46)
Europe > Netherlands (0.28)
Europe > Germany > Bavaria (0.14)

Genre: Overview (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-2-2023

Defending Against Malicious Behaviors in Federated Learning with Blockchain

Dong, Nanqing, Wang, Zhipeng, Sun, Jiahao, Kampffmeyer, Michael, Wen, Yizhe, Zhang, Shuoying, Knottenbelt, William, Xing, Eric

In the era of deep learning, federated learning (FL) presents a promising approach that allows multi-institutional data owners, or clients, to collaboratively train machine learning models without compromising data privacy. However, most existing FL approaches rely on a centralized server for global model aggregation, leading to a single point of failure. This makes the system vulnerable to malicious attacks when dealing with dishonest clients. In this work, we address this problem by proposing a secure and reliable FL system based on blockchain and distributed ledger technology. Our system incorporates a peer-to-peer voting mechanism and a reward-and-slash mechanism, which are powered by on-chain smart contracts, to detect and deter malicious behaviors. Both theoretical and empirical analyses are presented to demonstrate the effectiveness of the proposed approach, showing that our framework is robust against malicious client-side behaviors.

artificial intelligence, blockchain, machine learning, (15 more...)

2307.00543

Country:

North America > United States (1.00)
Europe (0.68)

Genre: Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
Government > Regional Government > North America Government > United States Government (0.46)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial IntelligenceApr-27-2023

Automatic Identification of Chemical Moieties

Lederer, Jonas, Gastegger, Michael, Schütt, Kristof T., Kampffmeyer, Michael, Müller, Klaus-Robert, Unke, Oliver T.

In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.

artificial intelligence, machine learning, molecule, (20 more...)

2203.16205

Country:

North America > United States (0.46)
Europe > Germany (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Materials > Chemicals (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceMar-16-2023

Self-Supervised Few-Shot Learning for Ischemic Stroke Lesion Segmentation

Tomasetti, Luca, Hansen, Stine, Khanmohammadi, Mahdieh, Engan, Kjersti, Høllesli, Liv Jorunn, Kurz, Kathinka Dæhli, Kampffmeyer, Michael

Precise ischemic lesion segmentation plays an essential role in improving diagnosis and treatment planning for ischemic stroke, one of the prevalent diseases with the highest mortality rate. While numerous deep neural network approaches have recently been proposed to tackle this problem, these methods require large amounts of annotated regions during training, which can be impractical in the medical domain where annotated data is scarce. As a remedy, we present a prototypical few-shot segmentation approach for ischemic lesion segmentation using only one annotated sample during training. The proposed approach leverages a novel self-supervised training mechanism that is tailored to the task of ischemic stroke lesion segmentation by exploiting color-coded parametric maps generated from Computed Tomography Perfusion scans. We illustrate the benefits of our proposed training mechanism, leading to considerable improvements in performance in the few-shot setting. Given a single annotated patient, an average Dice score of 0.58 is achieved for the segmentation of ischemic lesions.

artificial intelligence, machine learning, segmentation, (19 more...)

doi: 10.1109/ISBI53787.2023.10230655

2303.01332

Country: Europe > Norway (0.15)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Machine LearningOct-9-2021

Discriminative Multimodal Learning via Conditional Priors in Generative Models

Mancisidor, Rogelio A., Kampffmeyer, Michael, Aas, Kjersti, Jenssen, Robert

Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other and representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model training, but where some modalities and labels required for downstream tasks are missing. We show, in this scenario, that the variational lower bound limits mutual information between joint representations and missing modalities. We, to counteract these problems, introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing modalities. Extensive experimentation shows the benefits of the model we propose, the empirical results showing that our model achieves state-of-the-art results in representative problems such as downstream classification, acoustic inversion and annotation generation.

artificial intelligence, machine learning, neural network, (20 more...)

2110.04616

Country: North America > United States > Wisconsin > Dane County > Madison (0.14)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.84)

arXiv.org Machine LearningMay-20-2021

Negational Symmetry of Quantum Neural Networks for Binary Pattern Classification

Dong, Nanqing, Kampffmeyer, Michael, Voiculescu, Irina, Xing, Eric

Entanglement is a physical phenomenon, which has fueled recent successes of quantum algorithms. Although quantum neural networks (QNNs) have shown promising results in solving simple machine learning tasks recently, for the time being, the effect of entanglement in QNNs and the behavior of QNNs in binary pattern classification are still underexplored. In this work, we provide some theoretical insight into the properties of QNNs by presenting and analyzing a new form of invariance embedded in QNNs for both quantum binary classification and quantum representation learning, which we term negational symmetry. Given a quantum binary signal and its negational counterpart where a bitwise NOT operation is applied to each quantum bit of the binary signal, a QNN outputs the same logits. That is to say, QNNs cannot differentiate a quantum binary signal and its negational counterpart in a binary classification task. We further empirically evaluate the negational symmetry of QNNs in binary pattern classification tasks using Google's quantum computing framework. The theoretical and experimental results suggest that negational symmetry is a fundamental property of QNNs, which is not shared by classical models. Our findings also imply that negational symmetry is a double-edged sword in practical quantum applications.

deep learning, negational symmetry, neural network, (14 more...)

2105.0958

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningDec-7-2020

Joint Optimization of an Autoencoder for Clustering and Embedding

Boubekki, Ahcène, Kampffmeyer, Michael, Brefeld, Ulf, Jenssen, Robert

Incorporating k-means-like clustering techniques into (deep) autoencoders constitutes an interesting idea as the clustering may exploit the learned similarities in the embedding to compute a non-linear grouping of data at-hand. Unfortunately, the resulting contributions are often limited by ad-hoc choices, decoupled optimization problems and other issues. We present a theoretically-driven deep clustering approach that does not suffer from these limitations and allows for joint optimization of clustering and embedding. The network in its simplest form is derived from a Gaussian mixture model and can be incorporated seamlessly into deep autoencoders for state-of-the-art performance.

artificial intelligence, centroid, neural network, (17 more...)

2012.0374

Country:

North America > United States (0.47)
Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

arXiv.org Machine LearningSep-25-2019

Information Plane Analysis of Deep Neural Networks via Matrix-Based Renyi's Entropy and Tensor Kernels

Wickstrøm, Kristoffer, Løkse, Sigurd, Kampffmeyer, Michael, Yu, Shujian, Principe, Jose, Jenssen, Robert

Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently as a tool to gain insight into, among others, their generalization ability. However, it is by no means obvious how to estimate mutual information (MI) between each hidden layer and the input/desired output, to construct the IP . For instance, hidden layers with many neurons require MI estimators with robustness towards the high dimensionality associated with such layers. MI estimators should also be able to naturally handle convolutional layers, while at the same time being computationally tractable to scale to large networks. None of the existing IP methods to date have been able to study truly deep Con-volutional Neural Networks (CNNs), such as the e.g. In this paper, we propose an IP analysis using the new matrix-based R enyi's entropy coupled with tensor kernels over convolutional layers, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. The obtained results shed new light on the previous literature concerning small-scale DNNs, however using a completely new approach. Importantly, the new framework enables us to provide the first comprehensive IP analysis of contemporary large-scale DNNs and CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks. Although Deep Neural Networks (DNNs) are at the core of most state-of-the art systems in computer vision, the theoretical understanding of such networks is still not at a satisfactory level (Shwartz-Ziv & Tishby, 2017).

activation function, deep learning, neural network, (19 more...)

1909.11396

Country:

Europe (0.46)
North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)