AITopics | interpretable direction

interpretable direction

DiffEx: Explaining a Classifier with Diffusion Models to Identify Microscopic Cellular Variations

Bourou, Anis, Mahanta, Saranga Kingkor, Boyer, Thomas, Mezger, Valérie, Genovesio, Auguste

arXiv.org Artificial Intelligence, Feb-12-2025

In recent years, deep learning models have been extensively applied to biological data across various modalities. Discriminative deep learning models have excelled at classifying images into categories (e.g., healthy versus diseased, treated versus untreated). However, these models are often perceived as black boxes due to their complexity and lack of interpretability, limiting their application in real-world biological contexts. In biological research, explainability is essential: understanding classifier decisions and identifying subtle differences between conditions are critical for elucidating the effects of treatments, disease progression, and biological processes. To address this challenge, we propose DiffEx, a method for generating visually interpretable attributes to explain classifiers and identify microscopic cellular variations between different conditions. We demonstrate the effectiveness of DiffEx in explaining classifiers trained on natural and biological images. Furthermore, we use DiffEx to uncover phenotypic differences within microscopy datasets. By offering insights into cellular variations through classifier explanations, DiffEx has the potential to advance the understanding of diseases and aid drug discovery by identifying novel biomarkers.

artificial intelligence, classifier, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2502.09663

Country: Europe > Switzerland (0.04)

Genre: Research Report (0.65)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Unsupervised Panoptic Interpretation of Latent Spaces in GANs Using Space-Filling Vector Quantization

Vali, Mohammad Hassan, Bäckström, Tom

arXiv.org Artificial Intelligence, Oct-27-2024

Generative adversarial networks (GANs) learn a latent space whose samples can be mapped to real-world images. Such latent spaces are difficult to interpret. Some earlier supervised methods aim to create an interpretable latent space or discover interpretable directions, but they require data labels or annotated synthesized samples for training. In contrast, we propose using a modification of vector quantization called space-filling vector quantization (SFVQ), which quantizes the data on a piece-wise linear curve. SFVQ can capture the underlying morphological structure of the latent space and thus make it interpretable. We apply this technique to model the latent space of pretrained StyleGAN2 and BigGAN networks on various datasets. Our experiments show that the SFVQ curve yields a general interpretable model of the latent space that determines which part of the latent space corresponds to which specific generative factors. Furthermore, we demonstrate that each line of SFVQ's curve can potentially refer to an interpretable direction for applying intelligible image transformations. We also show that the points located on an SFVQ line can be used for controllable data augmentation.
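The phrase "quantizes the data on a piece-wise linear curve" can be pictured with a small sketch. The code below is an illustration only, not the authors' SFVQ implementation: it assumes the curve is given as an ordered list of codebook points (a hypothetical `codebook` argument) and snaps a latent vector to the closest point on any of the line segments joining consecutive codebook points. How the codebook itself is trained is not covered here.

```python
def project_to_segment(p, a, b):
    # Closest point to p on the line segment from a to b.
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(x * x for x in ab)
    if denom == 0.0:
        return list(a)  # degenerate segment: both ends coincide
    t = sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))  # clamp so the result stays on the segment
    return [ai + t * x for ai, x in zip(a, ab)]


def quantize_on_curve(p, codebook):
    # Assumes codebook holds at least two points, ordered along the curve.
    best_dist, best_point = None, None
    for a, b in zip(codebook, codebook[1:]):
        q = project_to_segment(p, a, b)
        dist = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
        if best_dist is None or dist < best_dist:
            best_dist, best_point = dist, q
    return best_point
```

Unlike ordinary vector quantization, which maps a point to one of a finite set of codebook vectors, this projection can land anywhere along a segment, which is why each segment can be read as a candidate direction.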

artificial intelligence, codebook vector, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.20573

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models

Zhang, Zijian, Liu, Luping, Lin, Zhijie, Zhu, Yichen, Zhao, Zhou

arXiv.org Artificial Intelligence, Nov-30-2023

We propose the first unsupervised and learning-based method to identify interpretable directions in the h-space of pre-trained diffusion models. Our method is derived from an existing technique that operates on the GAN latent space. Specifically, we employ a shift control module that works on the h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself, followed by a reconstructor that reproduces both the type and the strength of the manipulation. By jointly optimizing them, the model spontaneously discovers disentangled and interpretable directions. To prevent the discovery of meaningless and destructive directions, we employ a discriminator to maintain the fidelity of the shifted sample. Due to the iterative generative process of diffusion models, our training requires a substantial amount of GPU VRAM to store the numerous intermediate tensors needed for back-propagating gradients. To address this issue, we propose a general VRAM-efficient training algorithm based on gradient checkpointing that back-propagates any gradient through the whole generative process, with acceptable VRAM occupancy at some cost in training efficiency. Compared with existing related works on diffusion models, our method inherently identifies global and scalable directions, without necessitating any other complicated procedures. Extensive experiments on various datasets demonstrate the effectiveness of our method.
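The memory and compute trade-off behind gradient checkpointing can be shown on a toy scalar chain. This is a minimal sketch under stated assumptions, not the paper's training algorithm: `step` is a made-up stand-in for one generative update, and the segment length is a hypothetical parameter. The forward pass stores only one state per segment; the backward pass recomputes the states inside each segment before applying the chain rule.

```python
import math


def step(x):
    # Toy stand-in for a single update of an iterative generative process.
    return x - 0.1 * math.sin(x)


def step_grad(x):
    # Derivative of step with respect to its input.
    return 1.0 - 0.1 * math.cos(x)


def grad_full_storage(x0, num_steps):
    # Baseline: keep every intermediate state, then apply the chain rule.
    states = [x0]
    for _ in range(num_steps):
        states.append(step(states[-1]))
    grad = 1.0
    for x in reversed(states[:-1]):
        grad *= step_grad(x)
    return grad


def grad_checkpointed(x0, num_steps, segment):
    # Forward pass: store only the state at the start of each segment.
    checkpoints = []
    x = x0
    for t in range(num_steps):
        if t % segment == 0:
            checkpoints.append(x)
        x = step(x)
    # Backward pass: rebuild one segment at a time, last segment first.
    grad = 1.0
    for i in reversed(range(len(checkpoints))):
        start = i * segment
        length = min(segment, num_steps - start)
        states = [checkpoints[i]]
        for _ in range(length - 1):
            states.append(step(states[-1]))
        for x in reversed(states):
            grad *= step_grad(x)
    return grad
```

With `num_steps = 50` and `segment = 8`, the checkpointed version holds at most 7 checkpoints plus 8 recomputed states at a time instead of 51 states, at the cost of running most forward steps twice.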

ddim reverse process, latent space, reverse process, (13 more...)

arXiv.org Artificial Intelligence

2310.09912

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Identifying Interpretable Visual Features in Artificial and Biological Neural Systems

Klindt, David, Sanborn, Sophia, Acosta, Francisco, Poitevin, Frédéric, Miolane, Nina

arXiv.org Machine Learning, Oct-17-2023

Single neurons in neural networks are often interpretable in that they represent individual, intuitively meaningful features. However, many neurons exhibit mixed selectivity, i.e., they represent multiple unrelated features. A recent hypothesis proposes that features in deep networks may be represented in superposition, i.e., on non-orthogonal axes by multiple neurons, since the number of possible interpretable features in natural data is generally larger than the number of neurons in a given network. Accordingly, we should be able to find meaningful directions in activation space that are not aligned with individual neurons. Here, we propose (1) an automated method for quantifying visual interpretability that is validated against a large database of human psychophysics judgments of neuron interpretability, and (2) an approach for finding meaningful directions in network activation space. We leverage these methods to discover directions in convolutional neural networks that are more intuitively meaningful than individual neurons, as we confirm and investigate in a series of analyses. Moreover, we apply the same method to three recent datasets of visual neural responses in the brain and find that our conclusions largely transfer to real neural data, suggesting that superposition might be deployed by the brain. This also provides a link with disentanglement and raises fundamental questions about robust, efficient and factorized representations in both artificial and biological neural systems.

One of the oldest ideas in neuroscience is Cajal's single neuron doctrine (Finger, 2001) and its application to perception (Barlow, 1972), i.e., the hypothesis that individual sensory neurons encode individually meaningful features. The idea dates back to the early 1950s, when researchers began to find evidence of neurons that reliably and selectively fire in response to particular stimuli, such as dots on a contrasting background (Barlow, 1953) and lines of particular orientation and width (Hubel & Wiesel, 1959). These findings gave rise to the standard model of the ventral visual stream as a process of hierarchical feature extraction and pooling (Hubel & Wiesel, 1968; Gross et al., 1972). Neurons in the early stages extract simple features, such as oriented lines, while neurons at later stages combine simple features to construct more complex composite features. In the highest stages, complex features are combined to yield representations of entire objects encoded by single neurons: the shape of a hand, or the face of a friend.

In this work, we adopt a pragmatic definition of feature based on human discernability, measured through psychophysics experiments (see below). For an attempt at a more formal definition see Elhage et al. (2022).
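The superposition hypothesis described in the abstract, where more features than neurons are stored on non-orthogonal axes, can be made concrete with a toy example. The sketch below is an illustration under assumed names (`feature_directions`, `encode`, `readout` are hypothetical), not the authors' method: it places three feature directions in a two-neuron activation space and reads each feature out by projecting onto its direction.

```python
import math


def feature_directions(num_features):
    # Unit vectors spread evenly around a circle in a 2-neuron activation space.
    directions = []
    for k in range(num_features):
        angle = 2.0 * math.pi * k / num_features
        directions.append((math.cos(angle), math.sin(angle)))
    return directions


def encode(feature_values, directions):
    # Activation is the sum of each feature value times its direction.
    x = sum(v * d[0] for v, d in zip(feature_values, directions))
    y = sum(v * d[1] for v, d in zip(feature_values, directions))
    return (x, y)


def readout(activation, direction):
    # Project the activation onto one feature direction.
    return activation[0] * direction[0] + activation[1] * direction[1]
```

With three features in two dimensions the directions sit 120 degrees apart, so activating one feature alone also produces a nonzero readout along the other two. That interference is the price of packing more features than neurons, and it is why no single neuron axis lines up cleanly with every feature.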

artificial intelligence, machine learning, neuron, (18 more...)

arXiv.org Machine Learning

2310.11431

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Orthogonal SVD Covariance Conditioning and Latent Disentanglement

Song, Yue, Sebe, Nicu, Wang, Wei

arXiv.org Artificial Intelligence, Dec-11-2022

Inserting an SVD meta-layer into neural networks tends to make the covariance ill-conditioned, which can harm training stability and generalization. In this paper, we systematically study how to improve the covariance conditioning by enforcing orthogonality on the Pre-SVD layer. Existing orthogonal treatments on the weights are first investigated. These techniques improve the conditioning but hurt performance. To avoid this side effect, we propose the Nearest Orthogonal Gradient (NOG) and Optimal Learning Rate (OLR). The effectiveness of our methods is validated in two applications: decorrelated Batch Normalization (BN) and Global Covariance Pooling (GCP). Extensive experiments on visual recognition demonstrate that our methods can simultaneously improve covariance conditioning and generalization. Combining them with orthogonal weights can further boost performance. Moreover, we show that our orthogonality techniques can benefit generative models for better latent disentanglement through a series of experiments on various benchmarks. Code is available at: https://github.com/KingJamesSong/OrthoImproveCond.

artificial intelligence, machine learning, matrix, (19 more...)

arXiv.org Artificial Intelligence

2212.05599

Country:

Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
Asia > China > Beijing > Beijing (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Interpreting Latent Spaces of Generative Models for Medical Images using Unsupervised Methods

Schön, Julian, Selvan, Raghavendra, Petersen, Jens

arXiv.org Artificial Intelligence, Jul-20-2022

Generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) play an increasingly important role in medical image analysis. The latent spaces of these models often show semantically meaningful directions corresponding to human-interpretable image transformations. However, until now, their exploration for medical images has been limited due to the requirement of supervised data. Several methods for unsupervised discovery of interpretable directions in GAN latent spaces have shown interesting results on natural images. This work explores the potential of applying these techniques on medical images by training a GAN and a VAE on thoracic CT scans and using an unsupervised method to discover interpretable directions in the resulting latent space. We find several directions corresponding to non-trivial image transformations, such as rotation or breast size. Furthermore, the directions show that the generative models capture 3D structure despite being presented only with 2D data. The results show that unsupervised methods to discover interpretable directions in GANs generalize to VAEs and can be applied to medical images. This opens a wide array of future work using these methods in medical image analysis.

generative model, interpretable direction, latent space, (12 more...)

arXiv.org Artificial Intelligence

2207.0974

Country:

North America > United States (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report > New Finding (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.87)

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

Voynov, Andrey, Babenko, Artem

arXiv.org Machine Learning, Feb-18-2020

The latent spaces of typical GAN models often have semantically meaningful directions. Moving in these directions corresponds to human-interpretable image transformations, such as zooming or recoloring, enabling a more controllable generation process. However, the discovery of such directions is currently performed in a supervised manner, requiring human labels, pretrained models, or some form of self-supervision. These requirements can severely limit the range of directions that existing approaches can discover. In this paper, we introduce an unsupervised method to identify interpretable directions in the latent space of a pretrained GAN model. By a simple model-agnostic procedure, we find directions corresponding to sensible semantic manipulations without any form of (self-)supervision. Furthermore, we reveal several non-trivial findings, which would be difficult to obtain by existing methods, e.g., a direction corresponding to background removal. As an immediate practical benefit of our work, we show how to exploit this finding to achieve a new state-of-the-art for the problem of saliency detection.
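"Moving in these directions" amounts to adding a scaled direction vector to a latent code before it is passed to the generator. The sketch below shows only that arithmetic, with a hypothetical helper name (`shift_latent`) and plain Python lists standing in for latent tensors. It does not cover how the directions are discovered, and normalizing the direction to unit length is an assumption made here so that `strength` has a consistent scale.

```python
import math


def shift_latent(z, direction, strength):
    # Normalize the direction so that strength measures distance moved.
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [zi + strength * d / norm for zi, d in zip(z, direction)]


def sweep(z, direction, strengths):
    # Latent codes for a row of edited images, one per strength value.
    return [shift_latent(z, direction, s) for s in strengths]
```

Feeding each code from `sweep(z, d, [-2.0, 0.0, 2.0])` to a pretrained generator would yield the kind of image row typically used to inspect a direction; a direction counts as interpretable if the change along the row has a consistent, human-nameable meaning.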

interpretable direction, latent space, transformation, (14 more...)

arXiv.org Machine Learning

2002.03754

Country:

Asia > Russia (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
