AITopics | Large Language Model

Large Language Model

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

Neural Information Processing Systems | Jun-23-2026, 09:35:41 GMT

Sparse Autoencoders (SAEs) have recently gained attention as a means to improve the interpretability and steerability of Large Language Models (LLMs), both of which are essential for AI safety. In this work, we extend the application of SAEs to Vision-Language Models (VLMs), such as CLIP, and introduce a comprehensive framework for evaluating monosemanticity at the neuron level in visual representations. To ensure that our evaluation aligns with human perception, we propose a benchmark derived from a large-scale user study. Our experimental results reveal that SAEs trained on VLMs significantly enhance the monosemanticity of individual neurons, with sparsity and wide latents being the most influential factors. Further, we demonstrate that applying SAE interventions on CLIP's vision encoder directly steers the outputs of multimodal LLMs (e.g., LLaVA), without any modifications to the underlying language model. These findings emphasize the practicality and efficacy of SAEs as an unsupervised tool for enhancing both interpretability and control of VLMs.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Aggregation Hides Out-of-Distribution Generalization Failures from Spurious Correlations

Neural Information Processing Systems | Jun-23-2026, 09:25:31 GMT

Benchmarks for out-of-distribution (OOD) generalization frequently show a strong positive correlation between in-distribution (ID) and OOD accuracy across models, termed "accuracy-on-the-line." This pattern is often taken to imply that spurious correlations--correlations that improve ID but reduce OOD performance--are rare in practice. We find that this positive correlation is often an artifact of aggregating heterogeneous OOD examples. Using a simple gradient-based method, OODSelect, we identify semantically coherent OOD subsets where accuracy-on-the-line does not hold. Across widely used distribution shift benchmarks, OODSelect uncovers subsets, sometimes comprising more than half of the standard OOD set, where higher ID accuracy predicts lower OOD accuracy. Our findings indicate that aggregate metrics can obscure important failure modes of OOD robustness. We release code and the identified subsets to facilitate further research.
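The aggregation effect described in this abstract can be illustrated with a toy calculation. The sketch below is not OODSelect (which the abstract describes as gradient-based); it only shows, on made-up accuracy numbers, how a pooled OOD metric can correlate positively with ID accuracy while one subset correlates negatively. All values, the 60/40 subset split, and the helper name `pearson` are assumptions chosen for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical accuracies for five models (illustrative numbers only).
id_acc = [0.70, 0.75, 0.80, 0.85, 0.90]   # in-distribution accuracy
ood_a = [0.50, 0.58, 0.66, 0.74, 0.82]    # OOD subset A: improves with ID accuracy
ood_b = [0.60, 0.57, 0.54, 0.51, 0.48]    # OOD subset B: degrades with ID accuracy

# Assume subset A holds 60% of the OOD examples and subset B holds 40%.
ood_all = [0.6 * a + 0.4 * b for a, b in zip(ood_a, ood_b)]

print("aggregate OOD vs ID:", pearson(id_acc, ood_all))  # positive
print("subset B OOD vs ID:", pearson(id_acc, ood_b))     # negative
```

In this toy setup the pooled metric looks like accuracy-on-the-line even though 40% of the OOD examples show the opposite trend, which is the kind of failure mode the abstract says aggregate metrics can hide.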

correlation, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

LLM Safety Alignment is Divergence Estimation in Disguise

Neural Information Processing Systems | Jun-23-2026, 09:25:20 GMT

We present a theoretical framework showing that popular LLM alignment methods--including RLHF and its variants--can be understood as divergence estimators between aligned (safe or preferred) and unaligned (harmful or less-preferred) distributions. This perspective explains the emergence of separation in the latent space between safe and harmful prompts after alignment. As an application of our general divergence framework, we propose KLDO, a novel KL divergence-based alignment method, and empirically validate its effectiveness. We further show that using compliance-refusal datasets, rather than standard preference-based datasets, leads to stronger separation and improved safety alignment. Finally, to quantify the separation effect, we propose a distance-based metric in the prompt representation space, which also acts as a statistically significant indicator for model safety.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Consumer Health (0.93)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

OpenAI's new Daybreak initiative will help open-source projects fend off bugs

Engadget | Jun-23-2026, 09:14:13 GMT

Patch the Planet will pair security researchers with open-source projects. OpenAI has launched Patch the Planet, a new initiative within its Daybreak cybersecurity program, which was designed to serve the open-source community. The company is working with cybersecurity firm Trail of Bits, which has committed its entire security research organization to the project. In its own announcement, Trail of Bits said that while models like GPT-5.5-Cyber can produce a firehose of security findings for users, project maintainers, who are already stretched thin, have to sift through all of them to separate real vulnerabilities from false positives. Patch the Planet is meant to reduce project maintainers' burden by putting them in contact with security researchers, who use OpenAI's top models and Codex Security to identify vulnerabilities and review findings before they even reach the maintainers.

large language model, machine learning, natural language, (12 more...)

Engadget

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games > Computer Games (0.76)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Neural Information Processing Systems | Jun-23-2026, 09:12:59 GMT

We introduce Eagle 2.5, a frontier vision-language model (VLM) for long-context multimodal learning. Our work addresses the challenges in long video comprehension and high-resolution image understanding, introducing a generalist framework for both tasks. The proposed training framework incorporates Automatic Degrade Sampling and Image Area Preservation, two techniques that preserve contextual integrity and visual details. The framework also includes numerous efficiency optimizations in the pipeline for long-context data training. Finally, we propose Eagle-Video-110K, a novel dataset that integrates both story-level and clip-level annotations, facilitating long-video understanding. Eagle 2.5 demonstrates substantial improvements on long-context multimodal benchmarks, providing a robust solution to the limitations of existing VLMs.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank

Tianhe Wu, Jian Zou, Jie Liang, Lei Zhang, and Kede Ma

Neural Information Processing Systems | Jun-23-2026, 09:06:56 GMT

Image quality assessment (IQA) aims to quantify the visual quality of digital images in a manner consistent with human perceptual judgments. Commonly, IQA models are classified into full-reference (FR) and no-reference (NR) approaches [47], depending on the availability of pristine-quality reference images. In this paper, we focus on NR-IQA due to its practical relevance in real-world scenarios where reference images are unavailable. Over the decades, NR-IQA has evolved from knowledge-driven [33, 12] to data-driven approaches [30, 19, 54], and shifted from regression-based to ranking-based [58, 59] techniques. Nevertheless, achieving strong model generalization (e.g., generalization to unseen image distortions) remains a significant, unresolved challenge, driving recent research toward multi-dataset training [6], active fine-tuning [44], and continual model adaptation [57]. The rapid advancement of vision-language models (VLMs) offers promising avenues for enhancing NR-IQA generalization by contextualizing it into broader vision tasks [51]. VLMs can effectively integrate multi-modal information, enabling understanding of both low-level image distortions (e.g., noise and blur) and high-level perceptual attributes (e.g., aesthetics and content semantics). This multi-modal semantic contextualization allows VLMs to articulate nuanced quality descriptions with stronger generalization. However, current NR-IQA methods mainly leverage VLMs through supervised fine-tuning (SFT), which faces several critical limitations [49, 56].

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Media > Photography (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

The $400 million machine powering the future of chipmaking

MIT Technology Review | Jun-23-2026, 09:00:00 GMT

The AI era needs ever faster chips. ASML has a monopoly on the expensive contraptions needed to pattern them. Jos Benschop is climbing a ladder to get to the top of his newest machine. The contraption is the size of a double-decker bus--more than 150 tons of gleaming precision-milled aluminum covered in thousands of snaking tubes, colored cables, and pressurized tanks. From the ground, it looks like a futuristic V8 engine. When I reach the top with Benschop we're looking down from about 15 feet in the air, with bunny-suited technicians scurrying around below. It's more than 200 cubic meters of tech--"mechatronic devices that hold a few mirrors in a position with atomic precision," he says, gesturing at the gargantuan apparatus. Benschop, a tall and grizzled 66-year-old, has spent over a decade working with his engineers to design this thing, but even so, he'll sometimes look at it and go: Benschop is the executive vice president of technology for ASML, a Dutch company that is the linchpin of the microchip industry. If you want to make powerful chips to power phones or AI, a lithography machine like the one we're standing on is what you need to create increasingly tiny circuitry. Lithography is the art and science of shining light on a silicon wafer to pattern out the transistors, wiring, and other components of the microchips that will be cut from it. The chipmaking field is essentially controlled by only two big players: ASML, which creates the lithography machines, and TSMC, the chipmaking giant. Nine years ago, ASML began selling machines that use a daring new way of patterning chip features.

large language model, machine learning, natural language, (21 more...)

MIT Technology Review

Country: North America > United States (1.00)

Industry:

Semiconductors & Electronics (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Communications > Social Media (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Neural Information Processing Systems | Jun-23-2026, 08:57:22 GMT

Compositional training has been the de facto paradigm in existing Multimodal Large Language Models (MLLMs), where pre-trained visual encoders are connected with pre-trained LLMs through continuous multimodal pre-training. However, the multimodal scaling property of this paradigm remains difficult to explore because the components are trained separately. In this paper, we focus on the native training of MLLMs in an end-to-end manner and systematically study its design space and scaling property under a practical setting, i.e., under data constraints. Through careful study of various design choices in MLLMs, we obtain the optimal meta-architecture that best balances performance and training cost. We then explore the scaling properties of the native MLLM and identify a positively correlated scaling relationship between visual encoders and LLMs. Based on these findings, we propose a native MLLM called NaViL, combined with a simple and cost-effective recipe. Experimental results on 14 multimodal benchmarks confirm the competitive performance of NaViL against existing MLLMs. In addition, our findings and results provide in-depth insights for the future study of native MLLMs.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia > China (0.46)
North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry:

Transportation > Infrastructure & Services (0.92)
Education (0.67)
Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

On the Entropy Calibration of Language Models

Neural Information Processing Systems | Jun-23-2026, 08:56:40 GMT

We study the problem of entropy calibration, which asks whether a language model's entropy over generations matches its log loss on human text. Past work found that models are miscalibrated, with entropy per step increasing as generations grow longer, due to error accumulation. To calibrate the model and improve text quality, it has become standard practice to truncate the distribution, but this approach reduces output diversity, which we would like to avoid. Therefore, in this paper, we ask: does miscalibration improve automatically with scale, and if not, is it theoretically possible to calibrate without tradeoffs? To build intuition, we first study a simplified theoretical setting to characterize the scaling behavior of miscalibration with respect to dataset size. We find that the rate of scaling depends on the power law exponent of the data distribution -- in particular, for a power law exponent close to 1, the scaling exponent is close to 0, meaning that miscalibration improves very slowly with scale.
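The two quantities compared in this abstract can be made concrete with a single-step toy example. The sketch below assumes a four-token vocabulary with made-up distributions: `p` stands in for human text and `q` for the model. It computes the model's entropy, its log loss (cross-entropy) on `p`, and shows how top-k truncation lowers entropy by discarding low-probability tokens. The abstract's setting concerns multi-step generations with error accumulation, which this toy does not model; the distributions and function names are hypothetical.

```python
import math

def entropy(q):
    """Shannon entropy (in nats) of the model's own distribution q."""
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

def log_loss(p, q):
    """Cross-entropy of q evaluated on the reference distribution p (nats)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def truncate_top_k(q, k):
    """Keep the k most probable tokens and renormalize; the rest get zero mass."""
    keep = set(sorted(range(len(q)), key=lambda i: q[i], reverse=True)[:k])
    total = sum(q[i] for i in keep)
    return [q[i] / total if i in keep else 0.0 for i in range(len(q))]

# Hypothetical distributions over a four-token vocabulary.
p = [0.7, 0.2, 0.1, 0.0]   # stand-in for human text
q = [0.4, 0.3, 0.2, 0.1]   # stand-in for the model

gap = entropy(q) - log_loss(p, q)      # zero would mean calibrated in this toy
q_trunc = truncate_top_k(q, 2)

print("entropy - log loss:", gap)
print("entropy before truncation:", entropy(q))
print("entropy after top-2 truncation:", entropy(q_trunc))
print("tokens with nonzero mass after truncation:", sum(1 for x in q_trunc if x > 0))
```

Here the model's entropy exceeds its log loss, and truncation reduces entropy at the cost of leaving only two tokens with nonzero probability, which mirrors the diversity tradeoff the abstract wants to avoid.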

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia (0.92)
North America > United States > Pennsylvania (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Media (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Evaluating OCR-Based Capabilities of LLMs in Video Scenarios

Neural Information Processing Systems | Jun-23-2026, 08:51:41 GMT

Multimodal Large Language Models (MLLMs) have achieved considerable accuracy in Optical Character Recognition (OCR) from static images. However, their efficacy in video OCR is significantly diminished due to factors such as motion blur, temporal variations, and visual effects inherent in video content. To provide clearer guidance for training practical MLLMs, we introduce the MMEVideoOCR benchmark, which encompasses a comprehensive range of video OCR application scenarios.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia (0.68)

Genre: