AITopics | Large Language Model

Collaborating Authors

Large Language Model

News Overviews Instructional Materials AI-Alerts Classics

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Neural Information Processing SystemsJun-17-2026, 16:09:54 GMT

Vision-Language Models (VLMs) excel at visual understanding but often suffer from visual hallucinations, where they generate descriptions of nonexistent objects, actions, or concepts, posing significant risks in safety-critical applications. Existing hallucination mitigation methods typically follow one of two paradigms: generation adjustment, which modifies decoding behavior to align text with visual inputs, and post-hoc verification, where external models assess and correct outputs. While effective, generation adjustment methods often rely on heuristics and lack correction mechanisms, while post-hoc verification is complicated, typically requiring multiple models and tending to reject outputs rather than refine them. In this work, we introduce REVERSE, a unified framework that integrates hallucination-aware training with on-the-fly self-verification. By leveraging a new hallucination-verification dataset containing over 1.3M semi-synthetic samples, along with a novel inference-time retrospective resampling technique, our approach enables VLMs to both detect hallucinations during generation and dynamically revise those hallucinations. Our evaluations show that REVERSE achieves state-of-the-art hallucination reduction, outperforming the best existing methods by up to 12% on CHAIR-MSCOCO and 34% on HaloQuest.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Asia (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (0.67)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

5eef668e502203f257488efe742e53f1-Paper-Conference.pdf

Neural Information Processing SystemsJun-17-2026, 16:09:38 GMT

Recent struggle te with xt-to-image maintaining models concept have re consistenc volutionized y across image generated generation, images.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Asia (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry:

Media (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)

Add feedback

ViSPLA: Visual Iterative Self-Prompting for Language-Guided 3DAffordance Learning

Neural Information Processing SystemsJun-17-2026, 15:59:52 GMT

We address the problem of language-guided 3D affordance prediction, a core capability for embodied agents interacting with unstructured environments. Existing methods often rely on fixed affordance categories or require external expert prompts, limiting their ability to generalize across different objects and interpret multi-step instructions. In this work, we introduce ViSPLA, a novel iterative selfprompting framework that leverages the intrinsic geometry of predicted masks for continual refinement. We redefine affordance detection as a language-conditioned segmentation task: given a 3D point cloud and language instruction, our model predicts a sequence of refined affordance masks, each guided by differential geometric feedback including Laplacians, normal derivatives, and curvature fields. This feedback is encoded into visual prompts that drive a multi-stage refinement decoder, enabling the model to self-correct and adapt to complex spatial structures. To further enhance precision and coherence, we introduce Implicit Neural Affordance Fields, which define continuous probabilistic regions over the 3D surface without additional supervision. Additionally, our Spectral Convolutional Self-Prompting module operates in the frequency domain of the point cloud, enabling multi-scale refinement that captures both coarse and fine affordance structures. Extensive experiments demonstrate that ViSPLA achieves state-of-the-art results on both seen and unseen objects on two benchmark datasets. Our framework establishes a new paradigm for open-world 3D affordance reasoning by unifying language comprehension with low-level geometric perception through iterative refinement.

justification, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)

Add feedback

Loquetier: AVirtualized Multi-LoRA Framework for Unified LLMFine-tuning and Serving

Neural Information Processing SystemsJun-17-2026, 15:59:07 GMT

Low-Rank Adaptation (LoRA) has become a widely adopted parameter-efficient fine-tuning (PEFT) technique for adapting large language models (LLMs) to downstream tasks. While prior work has explored strategies for integrating LLM training and serving, there still remains a gap in unifying fine-tuning and inference for LoRA-based models.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Concept-Guided Interpretability via Neural Chunking

Neural Information Processing SystemsJun-17-2026, 15:54:18 GMT

Neural networks are often described as black boxes, reflecting the significant challenge of understanding their internal workings and interactions. We propose a different perspective that challenges the prevailing view: rather than being inscrutable, neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We refer to this as the Reflection Hypothesis and provide evidence for this phenomenon in both simple recurrent neural networks (RNNs) and complex large language models (LLMs). Building on this insight, we propose to leverage our cognitive tendency of chunking to segment high-dimensional neural population dynamics into interpretable units that reflect underlying concepts. We propose three methods to extract recurring chunks on a neural population level, complementing each other based on label availability and neural data dimensionality.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Neural Information Processing SystemsJun-17-2026, 15:53:14 GMT

Can we build accurate world models out of large language models (LLMs)? How can world models benefit LLM agents? The gap between the prior knowledge of LLMs and the specified environment's dynamics usually bottlenecks LLMs' performance as world models. To bridge the gap, we propose a training-free world alignment that learns an environment's symbolic knowledge complementary to LLMs. The symbolic knowledge covers action rules, knowledge graphs, and scene graphs, which are extracted by LLMs from exploration trajectories and encoded into executable codes to regulate LLM agents' policies.

artificial intelligence, large language model, natural language, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts

Neural Information Processing SystemsJun-17-2026, 15:53:02 GMT

Chain-of-thought (CoT) reasoning enables large language models (LLMs) to move beyond fast System-1 responses and engage in deliberative System-2 reasoning. However, this comes at the cost of significant inefficiency due to verbose intermediate output. Recent latent-space reasoning methods improve efficiency by operating on hidden states without decoding into language, yet they treat all steps uniformly, failing to distinguish critical deductions from auxiliary steps and resulting in suboptimal use of computational resources. In this paper, we propose System-1.5

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

APIGen-MT: Agentic PIpeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Neural Information Processing SystemsJun-17-2026, 15:32:52 GMT

Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. These blueprints are then transformed into complete interaction trajectories through simulated humanagent interplay. We train a family of models--the xLAM-2-fc-rseries with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on τ-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields highquality training data, enabling the development of more reliable, efficient, and capable agents. We open-source both the synthetic data collected and the trained xLAM-2-fc-rmodels to advance research in AI agents.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Add feedback

ShapeCraft: LLMAgents for Structured, Textured and Interactive 3DModeling

Neural Information Processing SystemsJun-17-2026, 15:31:34 GMT

However, existing methods often yield unstructured meshes and exhibit poor interactivity, making them impractical for artistic workflows. To address these limitations, we represent 3D assets as shape programs and introduce ShapeCraft, a novel multi-agent framework for text-to-3D generation. At its core, we propose a Graph-based Procedural Shape (GPS) representation that decomposes complex natural language into a structured graph of sub-tasks, thereby facilitating accurate LLM comprehension and interpretation of spatial relationships and semantic shape details. Specifically, LLM agents hierarchically parse user input to initialize GPS, then iteratively refine procedural modeling and painting to produce structured, textured, and interactive 3D assets. Qualitative and quantitative experiments demonstrate ShapeCraft's superior performance in generating geometrically accurate and semantically rich 3D assets compared to existing LLM-based agents. We further show the versatility of ShapeCraft through examples of animated and user-customized editing, highlighting its potential for broader interactive applications.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Workflow (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CAS-Spec: Cascade Adaptive Self-Speculative Decoding for On-the-Fly Lossless Inference Acceleration of LLMs

Neural Information Processing SystemsJun-17-2026, 15:31:25 GMT

Speculative decoding has become a widely adopted as an effective technique for lossless inference acceleration when deploying large language models (LLMs). While on-the-fly self-speculative methods offer seamless integration and broad utility, they often fall short of the speed gains achieved by methods relying on specialized training. Cascading a hierarchy of draft models promises further acceleration and flexibility, but the high cost of training multiple models has limited its practical application. In this paper, we propose a novel Cascade Adaptive Self-Speculative Decoding (CAS-Spec) method which constructs speculative draft models by leveraging dynamically switchable inference acceleration (DSIA) strategies, including layer sparsity and activation quantization. Furthermore, traditional vertical and horizontal cascade algorithms are inefficient when applied to selfspeculative decoding methods. We introduce a Dynamic Tree Cascade (DyTC) algorithm that adaptively routes the multi-level draft models and assigns the draft lengths, based on the heuristics of acceptance rates and latency prediction. Our CAS-Spec method achieves state-of-the-art acceleration compared to existing on-the-fly speculative decoding methods, with an average speedup from 1.1 to 2.3 over autoregressive decoding across various LLMs and datasets. DyTC improves the average speedup by 47% and 48% over cascade-based baseline and tree-based baseline algorithms, respectively. CAS-Spec can be easily integrated into most existing LLMs and holds promising potential for further acceleration as self-speculative decoding techniques continue to evolve.

draft model, large language model, natural language, (15 more...)

Neural Information Processing Systems

Country:

Asia (0.93)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback