AITopics | latent direction

Collaborating Authors

latent direction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

4bfcebedf7a2967c410b64670f27f904-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 12:36:26 GMT

artificial intelligence, editing, machine learning, (15 more...)

Neural Information Processing Systems

Industry: Media (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

Add feedback

Unveiling the Latent Directions of Reflection in Large Language Models

Chang, Fu-Chieh, Lee, Yu-Ting, Wu, Pei-Yuan

arXiv.org Artificial IntelligenceDec-12-2025

Reflection, the ability of large language models (LLMs) to evaluate and revise their own reasoning, has been widely used to improve performance on complex reasoning tasks. Yet, most prior works emphasizes designing reflective prompting strategies or reinforcement learning objectives, leaving the inner mechanisms of reflection underexplored. In this paper, we investigate reflection through the lens of latent directions in model activations. We propose a methodology based on activation steering to characterize how instructions with different reflective intentions: no reflection, intrinsic reflection, and triggered reflection. By constructing steering vectors between these reflection levels, we demonstrate that (1) new reflection-inducing instructions can be systematically identified, (2) reflective behavior can be directly enhanced or suppressed through activation interventions, and (3) suppressing reflection is considerably easier than stimulating it. Experiments on GSM8k-adv and Cruxeval-o-adv with Qwen2.5-3B and Gemma3-4B-IT reveal clear stratification across reflection levels, and steering interventions confirm the controllability of reflection. Our findings highlight both opportunities (e.g., reflection-enhancing defenses) and risks (e.g., adversarial inhibition of reflection in jailbreak attacks). This work opens a path toward mechanistic understanding of reflective reasoning in LLMs.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.16989

Country: Asia > Middle East (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Latent Feature Alignment: Discovering Biased and Interpretable Subpopulations in Face Recognition Models

Serna, Ignacio

arXiv.org Artificial IntelligenceOct-20-2025

Modern face recognition models achieve high overall accuracy but continue to exhibit systematic biases that disproportionately affect certain subpopulations. Conventional bias evaluation frameworks rely on labeled attributes to form subpopulations, which are expensive to obtain and limited to predefined categories. We introduce Latent Feature Alignment (LFA), an attribute-label-free algorithm that uses latent directions to identify subpopulations. This yields two main benefits over standard clustering: (i) semantically coherent grouping, where faces sharing common attributes are grouped together more reliably than by proximity-based methods, and (ii) discovery of interpretable directions, which correspond to semantic attributes such as age, ethnicity, or attire. Across four state-of-the-art recognition models (ArcFace, CosFace, ElasticFace, PartialFC) and two benchmarks (RFW, CelebA), LFA consistently outperforms k-means and nearest-neighbor search in intra-group semantic coherence, while uncovering interpretable latent directions aligned with demographic and contextual attributes. These results position LFA as a practical method for representation auditing of face recognition models, enabling practitioners to identify and interpret biased subpopulations without predefined attribute annotations.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2510.1552

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Table A1: Hyper-parameter settings. model t

Neural Information Processing SystemsOct-8-2025, 15:34:49 GMT

Regarding the reasons for these failure cases, we emphasize two factors.

artificial intelligence, editing, machine learning, (15 more...)

Neural Information Processing Systems

Industry: Media (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

Add feedback

Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation

Hua, Zhenglin, He, Jinghan, Yao, Zijun, Han, Tianxu, Guo, Haiyun, Jia, Yuheng, Fang, Junfeng

arXiv.org Artificial IntelligenceSep-16-2025

Large vision-language models (LVLMs) have achieved remarkable performance on multimodal tasks. However, they still suffer from hallucinations, generating text inconsistent with visual input, posing significant risks in real-world applications. Existing approaches to address this issue focus on incorporating external knowledge bases, alignment training, or decoding strategies, all of which require substantial computational cost and time. Recent works try to explore more efficient alternatives by adjusting LVLMs' internal representations. Although promising, these methods may cause hallucinations to be insufficiently suppressed or lead to excessive interventions that negatively affect normal semantics. In this work, we leverage sparse autoencoders (SAEs) to identify semantic directions closely associated with faithfulness or hallucination, extracting more precise and disentangled hallucination-related representations. Our analysis demonstrates that interventions along the identified faithful direction can mitigate hallucinations, while those along the hallucinatory direction can exacerbate them. Building on these insights, we propose Steering LVLMs via SAE Latent Directions (SSL), a plug-and-play method based on SAE-derived latent directions to mitigate hallucinations in LVLMs. Extensive experiments demonstrate that SSL significantly outperforms existing decoding approaches in mitigating hallucinations, while maintaining transferability across different model architectures with negligible additional time overhead. The code is available at https://github.com/huazhenglin2003/SSL.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.16146

Country:

Asia (0.28)
Europe (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.34)
Transportation > Ground > Road (0.34)
Automobiles & Trucks (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Local Disentanglement in V ariational Auto-Encoders Using Jacobian L1 Regularization

Neural Information Processing SystemsAug-17-2025, 04:28:35 GMT

Inferring a good latent representation from a dataset is a difficult problem and is generally under-specified in algorithms. This underspecification is called the "model identification" problem, and

artificial intelligence, disentanglement, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)

Add feedback

Disentanglement Analysis in Deep Latent Variable Models Matching Aggregate Posterior Distributions

Saha, Surojit, Joshi, Sarang, Whitaker, Ross

arXiv.org Machine LearningJan-26-2025

Deep latent variable models (DLVMs) are designed to learn meaningful representations in an unsupervised manner, such that the hidden explanatory factors are interpretable by independent latent variables (aka disentanglement). The variational autoencoder (VAE) is a popular DLVM widely studied in disentanglement analysis due to the modeling of the posterior distribution using a factorized Gaussian distribution that encourages the alignment of the latent factors with the latent axes. Several metrics have been proposed recently, assuming that the latent variables explaining the variation in data are aligned with the latent axes (cardinal directions). However, there are other DLVMs, such as the AAE and WAE-MMD (matching the aggregate posterior to the prior), where the latent variables might not be aligned with the latent axes. In this work, we propose a statistical method to evaluate disentanglement for any DLVMs in general. The proposed technique discovers the latent vectors representing the generative factors of a dataset that can be different from the cardinal latent axes. We empirically demonstrate the advantage of the method on two datasets.

artificial intelligence, latent direction, machine learning, (15 more...)

arXiv.org Machine Learning

2501.15705

Country: North America > United States > Utah > Salt Lake County > Salt Lake City (0.05)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts

Zeng, E. Zhixuan, Chen, Yuhao, Wong, Alexander

arXiv.org Artificial IntelligenceNov-4-2024

Recent advances in image generation have made diffusion models powerful tools for creating high-quality images. However, their iterative denoising process makes understanding and interpreting their semantic latent spaces more challenging than other generative models, such as GANs. Recent methods have attempted to address this issue by identifying semantically meaningful directions within the latent space. However, they often need manual interpretation or are limited in the number of vectors that can be trained, restricting their scope and utility. This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces. We directly leverage natural language prompts and image captions to map latent directions. This method allows for the automatic understanding of hidden features and supports a broader range of analysis without the need to train specific vectors. Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models, facilitating comprehensive analysis of latent biases and the nuanced representations these models learn. Experimental results show that our framework can uncover hidden patterns and associations in various domains, offering new insights into the interpretability of diffusion model latent spaces.

latent space, representation, vector, (14 more...)

arXiv.org Artificial Intelligence

2410.21314

Country: Asia (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Controlling Face's Frame generation in StyleGAN's latent space operations: Modifying faces to deceive our memory

Roca, Agustín, Britos, Nicolás Ignacio

arXiv.org Artificial IntelligenceJun-30-2024

Innocence Project is a non-profitable organization that works in reducing wrongful convictions. In collaboration with Laboratorio de Sue\~no y Memoria from Instituto Tecnol\'ogico de Buenos Aires (ITBA), they are studying human memory in the context of face identification. They have a strong hypothesis stating that human memory heavily relies in face's frame to recognize faces. If this is proved, it could mean that face recognition in police lineups couldn't be trusted, as they may lead to wrongful convictions. This study uses experiments in order to try to prove this using faces with different properties, such as eyes size, but maintaining its frame as much as possible. In this project, we continue the work from a previous project that provided the basic tool to generate realistic faces using StyleGAN2. We take a deep dive into the internals of this tool to make full use of StyleGAN2 functionalities, while also adding more features, such as modifying certain of its attributes, including mouth-opening or eye-opening. As the usage of this tool heavily relies on maintaining the face-frame, we develop a way to identify the face-frame of each image and a function to compare it to the output of the neural network after applying some operations. We conclude that the face-frame is maintained when modifying eye-opening or mouth opening. When modifying vertical face orientation, gender, age and smile, have a considerable impact on its frame variation. And finally, the horizontal face orientation shows a major impact on the face-frame. This way, the Lab may apply some operations being confident that the face-frame won't significantly change, making them viable to be used to deceive subjects' memories.

face-frame variation, latent space, variation, (16 more...)

arXiv.org Artificial Intelligence

2407.00803

Country:

South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.24)
North America > United States > Ohio (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Latent Space Perspicacity and Interpretation Enhancement (LS-PIE) Framework

Stevens, Jesse, Wilke, Daniel N., Setshedi, Itumeleng

arXiv.org Artificial IntelligenceJul-10-2023

Linear latent variable models such as principal component analysis (PCA), independent component analysis (ICA), canonical correlation analysis (CCA), and factor analysis (FA) identify latent directions (or loadings) either ordered or unordered. The data is then projected onto the latent directions to obtain their projected representations (or scores). For example, PCA solvers usually rank the principal directions by explaining the most to least variance, while ICA solvers usually return independent directions unordered and often with single sources spread across multiple directions as multiple sub-sources, which is of severe detriment to their usability and interpretability. This paper proposes a general framework to enhance latent space representations for improving the interpretability of linear latent spaces. Although the concepts in this paper are language agnostic, the framework is written in Python. This framework automates the clustering and ranking of latent vectors to enhance the latent information per latent vector, as well as, the interpretation of latent vectors. Several innovative enhancements are incorporated including latent ranking (LR), latent scaling (LS), latent clustering (LC), and latent condensing (LCON). For a specified linear latent variable model, LR ranks latent directions according to a specified metric, LS scales latent directions according to a specified metric, LC automatically clusters latent directions into a specified number of clusters, while, LCON automatically determines an appropriate number of clusters into which to condense the latent directions for a given metric. Additional functionality of the framework includes single-channel and multi-channel data sources, data preprocessing strategies such as Hankelisation to seamlessly expand the applicability of linear latent variable models (LLVMs) to a wider variety of data. The effectiveness of LR, LS, and LCON are showcased on two crafted foundational problems with two applied latent variable models, namely, PCA and ICA.

artificial intelligence, latent direction, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2307.0562

Country:

Africa > South Africa > Gauteng > Pretoria (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Add feedback