AITopics | Large Language Model

Collaborating Authors

Large Language Model

News Overviews Instructional Materials AI-Alerts Classics

TAPEREDOFF-POLICYREINFORCE Stable and efficient reinforcement learning for LLMs

Neural Information Processing SystemsJun-17-2026, 19:41:36 GMT

We propose a new algorithm for fine-tuning large language models using reinforcement learning. Tapered Off-Policy REINFORCE (TOPR) uses an asymmetric, tapered variant of importance sampling to speed up learning while maintaining stable learning dynamics, even without the use of KL regularization. TOPR can be applied in a fully offline fashion, allows the handling of positive and negative examples in a unified framework, and benefits from the implementational simplicity that is typical of Monte Carlo algorithms. We demonstrate the effectiveness of our approach with a series of experiments on the GSM8K and MATH reasoning benchmarks, finding performance gains for training both a model for solution generation as a generative verifier, and on a learning to search task, using the model as a query expander. We show that properly leveraging positive and negative examples alike in the off-policy regime simultaneously increases test-time accuracy and training data efficiency, all the while avoiding the "wasted inference" that comes with discarding negative examples. We find that this advantage persists over multiple iterations of training and can be amplified by dataset curation techniques, enabling us to match 70B-parameter model performance with 8B language models. As a corollary to this work, we find that REINFORCE's baseline parameter plays an important and unexpected role in defining dataset composition in the presence of negative examples, and is consequently critical in driving off-policy performance.

large language model, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Evaluating LLM-Contaminated Crowdsourcing Data Without Ground Truth

Neural Information Processing SystemsJun-17-2026, 19:31:40 GMT

The recent success of generative AI highlights the crucial role of high-quality human feedback in building trustworthy AI systems. However, the increasing use of large language models (LLMs) by crowdsourcing workers poses a significant challenge: datasets intended to reflect human input may be compromised by LLM-generated responses. Existing LLM detection approaches often rely on high-dimensional training data such as text, making them unsuitable for structured annotation tasks like multiple-choice labeling. In this work, we investigate the potential of peer prediction--a mechanism that evaluates the information within workers' responses--to mitigate LLM-assisted cheating in crowdsourcing with a focus on annotation tasks.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (0.66)
Information Technology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Need 3D Aware Representation Supervision for Scene Understanding

Neural Information Processing SystemsJun-17-2026, 19:20:27 GMT

Recent advances in scene understanding have leveraged multimodal large language models (MLLMs) for 3D reasoning by capitalizing on their strong 2D pretraining.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment

Neural Information Processing SystemsJun-17-2026, 19:09:39 GMT

Despite Contrastive Language-Image Pre-training (CLIP)'s remarkable capability to retrieve content across modalities, a substantial modality gap persists in its feature space. Intriguingly, we discover that off-the-shelf MLLMs (Multimodal Large Language Models) demonstrate powerful inherent modality alignment properties. While recent MLLM-based retrievers with unified architectures partially mitigate this gap, their reliance on coarse modality alignment mechanisms fundamentally limits their potential. In this work, We introduce MAPLE (Modality-Aligned Preference Learning for Embeddings), a novel framework that leverages the finegrained alignment priors inherent in MLLM to guide cross-modal representation learning. MAPLE formulates the learning process as reinforcement learning with two key components: (1) Automatic preference data construction using off-theshelf MLLM, and (2) a new Relative Preference Alignment (RPA) loss, which adapts Direct Preference Optimization (DPO) to the embedding learning setting. Experimental results show that our preference-guided alignment achieves substantial gains in fine-grained cross-modal retrieval, underscoring its effectiveness in handling nuanced semantic distinctions.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3DLarge Language Model

Neural Information Processing SystemsJun-17-2026, 19:09:17 GMT

Humans excel at performing complex tasks by leveraging long-term memory across temporal and spatial experiences. In contrast, current Large Language room box is the Modelsenvironments.(LLMs)Wstrugglee

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Consumer Health (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

Add feedback

Less is More: Local Intrinsic Dimensions of Contextual Language Models

Benjamin Matthias Ruppik, Julius von Rohrscheidt, Carel van Niekerk, Michael Heck, Renato Vukovic, Shutong Feng, Hsien-chin Lin, Nurul Lubis, Bastian Rieck, Marcus Zibrowius, Milica Gasic

Neural Information Processing SystemsJun-17-2026, 19:07:41 GMT

Understanding the internal mechanisms of large language models (LLMs) remains a challenging and complex endeavor. Even fundamental questions, such as how fine-tuning affects model behavior, often require extensive empirical evaluation. In this paper, we introduce a novel perspective based on the geometric properties of contextual latent embeddings to study the effects of training and fine-tuning. To that end, we measure the local dimensions of a contextual language model's latent space and analyze their shifts during training and fine-tuning. We show that the local dimensions provide insights into the model's training dynamics and generalization ability. Specifically, the mean of the local dimensions predicts when the model's training capabilities are exhausted, as exemplified in a dialogue state tracking task, overfitting, as demonstrated in an emotion recognition task, and grokking, as illustrated with an arithmetic task. Furthermore, our experiments suggest a practical heuristic: reductions in the mean local dimension tend to accompany and predict subsequent performance gains. Through this exploration, we aim to provide practitioners with a deeper understanding of the implications of fine-tuning on embedding spaces, facilitating informed decisions when configuring models for specific applications. The results of this work contribute to the ongoing discourse on the interpretability, adaptability, and generalizability of LLMs by bridging the gap between intrinsic model mechanisms and geometric properties in embeddings.

dimension, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country:

Asia (0.92)
Europe > Germany (0.46)
North America > United States > Minnesota (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Anthropic's design assistant now works better with its coding agent

EngadgetJun-17-2026, 19:00:00 GMT

Anthropic's design assistant now works better with its coding agent Anthropic's design assistant now works better with its coding agent Exactly two months after releasing a preview of Claude Design to subscribers, Anthropic has begun rolling out a major update for its design assistant that brings better integration with its other apps. To start, Claude Design can now begin working from a local codebase, meaning any assets it generates will contain elements that already exist in your front-facing products. From there, the app can hand off a design to Claude Code, allowing the coding agent to program an interface without the need to start from scratch. You also don't need to provide it with screenshots to give it an idea of your intent. And if you want to skip Claude Design, you can do that too, with Anthropic adding the option to create and edit designs directly from Claude Code. Outside of more robust integration with Anthropic's other apps, today's update brings with it quality of life improvements, starting with a more flexible import tool that can build entire design systems from GitHub and raw files.

artificial intelligence, large language model, natural language, (10 more...)

Engadget

Industry: Leisure & Entertainment > Games > Computer Games (0.74)

Technology:

Information Technology > Communications > Mobile (0.55)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Communications > Social Media (0.44)

Add feedback

Automatic Auxiliary Task Selection and Adaptive Weighting Boost Molecular Property Prediction

Neural Information Processing SystemsJun-17-2026, 18:58:34 GMT

Recent studies in Machine Learning (ML) for biological research focus on investigating molecular properties to accelerate drug discovery. However, limited labeled molecular data often hampers the performance of ML models. A common strategy to mitigate data scarcity is leveraging auxiliary learning tasks to provide additional supervision, but selecting effective auxiliary tasks requires substantial domain expertise and manual effort, and their inclusion does not always guarantee performance gains. To overcome these challenges, we introduce Automatic Auxiliary Task Selection (AUTAUT), a fully automated framework that seamlessly retrieves auxiliary tasks using large language models and adaptively integrates them through a novel gradient alignment weighting mechanism. By automatically emphasizing auxiliary tasks aligned with the primary objective, AUTAUT significantly enhances predictive accuracy while reducing negative impacts from irrelevant tasks. Extensive evaluations demonstrate that AUTAUT outperforms 10 auxiliary task-based approaches and 18 advanced molecular property prediction models.

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.92)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.67)
Government > Regional Government > North America Government > United States Government > FDA (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Neural Information Processing SystemsJun-17-2026, 18:58:11 GMT

Knowledge Graph Question Answering (KGQA) systems rely on high-quality benchmarks to evaluate complex multi-hop reasoning. However, despite their widespread use, popular datasets such as WebQSP and CWQ suffer from critical quality issues, including inaccurate or incomplete ground-truth annotations, poorly constructed questions that are ambiguous, trivial, or unanswerable, and outdated or inconsistent knowledge. Through a manual audit of 16 popular KGQA datasets--including WebQSPand CWQ--we find that the average factual correctness rate is only 57%. To address these issues, we introduce KGQAGen, an LLM-inthe-loop framework that systematically resolves these pitfalls. KGQAGencombines structured knowledge grounding, LLM-guided generation, and symbolic verification to produce challenging and verifiable QA instances. Using KGQAGen, we construct KGQAGen-10k, a 10K-scale benchmark grounded in Wikidata, and evaluate a diverse set of KG-RAG models. Experimental results demonstrate that even state-of-the-art systems struggle on this benchmark, highlighting its ability to expose limitations of existing models. Our findings advocate for more rigorous benchmark construction and position KGQAGen as a scalable framework for advancing KGQA evaluation 1.

large language model, machine learning, question answering, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Alabama (0.28)

Genre:

Research Report > New Finding (1.00)
Personal (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Government > Regional Government (0.92)
Media (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Add feedback

Text-to-Code Generation for Modular Building Layouts in Building Information Modeling

Neural Information Processing SystemsJun-17-2026, 18:56:46 GMT

We present Text2MBL, a text-to-code generation framework that generates executable Building Information Modeling (BIM) code directly from textual descriptions of modular building layout (MBL) design. Unlike conventional layout generation approaches that operate in 2D space, Text2MBL produces fully parametric, semantically rich BIM layouts through on-the-fly code instantiation. To address MBLs' unique challenges due to their hierarchical three-tier structure: modules (physical building blocks), units (self-contained dwellings), and rooms (functional spaces), we developed an object-oriented code architecture and fine-tuned large language models to output structured action sequences in code format. To train and evaluate the framework, we curated a dataset of paired descriptions and ground truth layouts drawn from real-world modular housing projects. Performance was assessed using metrics for executable validity, semantic fidelity, and geometric consistency. By tightly unifying natural language understanding with BIM code generation, Text2MBL establishes a scalable pipeline from high-level conceptual design to automation-ready modular construction workflows.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre: