AITopics | Large Language Model

b8b93c48f5bfa385d071342089d70422-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsApr-30-2026, 01:20:31 GMT

caption, large language model, machine learning, (22 more...)

Neural Information Processing Systems

Country: Europe (0.93)

Genre:

Overview (0.68)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.68)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Towards Revealing the Mystery behind Chain of Thought: ATheoretical Perspective

Neural Information Processing SystemsApr-30-2026, 01:19:36 GMT

Recent studies have discovered that Chain-of-Thought prompting (CoT) can dramatically improve the performance of Large Language Models (LLMs), particularly when dealing with complex tasks involving mathematics or reasoning. Despite the enormous empirical success, the underlying mechanisms behind CoT and how it unlocks the potential of LLMs remain elusive. In this paper, we take a first step towards theoretically answering these questions. Specifically, we examine the expressivity of LLMs with CoT in solving fundamental mathematical and decisionmaking problems. By using circuit complexity theory, we first give impossibility results showing that bounded-depth Transformers are unable to directly produce correct answers for basic arithmetic/equation tasks unless the model size grows super-polynomially with respect to the input length. In contrast, we then prove by construction that autoregressive Transformers of constant size suffice to solve both tasks by generating CoT derivations using a commonly used math language format. Moreover, we show LLMs with CoT can handle a general class of decision-making problems known as Dynamic Programming, thus justifying their power in tackling complex real-world tasks. Finally, an extensive set of experiments show that, while Transformers always fail to directly predict the answers, they can consistently learn to generate correct solutions step-by-step given sufficient CoT demonstrations.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Overview (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

Transfer Q: Principled Decoding for LLMAlignment

Neural Information Processing SystemsApr-30-2026, 01:19:06 GMT

Aligning foundation models is essential for their safe and trustworthy deployment. However, traditional fine-tuning methods are computationally intensive and require updating billions of model parameters. A promising alternative, alignment via decoding, adjusts the response distribution directly without model updates to maximize a target reward r, thus providing a lightweight and adaptable framework for alignment. However, principled decoding methods rely on oracle access to an optimal Q-function (Q), which is often unavailable in practice. Hence, prior SoTA methods either approximate this Q using Qπsft (derived from the reference SFTmodel) or rely on short-term rewards, resulting in sub-optimal decoding performance. In this work, we propose Transfer Q, which implicitly estimates the optimal value function for a target reward r through a baseline model ρBL aligned with a baseline reward rBL (which can be different from the target reward r). Theoretical analyses of Transfer Q provide a rigorous characterization of its optimality, deriving an upper bound on the sub-optimality gap and identifying a hyperparameter to control the deviation from the pre-trained reference SFTmodel based on user needs. Our approach significantly reduces the sub-optimality gap observed in prior SoTA methods and demonstrates superior empirical performance across key metrics such as coherence, diversity, and quality in extensive tests on several synthetic and real datasets.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.67)
Asia (0.46)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Communications (0.92)

Add feedback

df4f6e43446b1ee29c5a33d32c279f83-Paper-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 01:06:01 GMT

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

OpenDlign: Open-World Point Cloud Understanding with Depth-Aligned Images

Neural Information Processing SystemsApr-30-2026, 00:51:01 GMT

Recent open-world 3D representation learning methods using Vision-Language Models (VLMs) to align 3D point clouds with image-text information have shown superior 3D zero-shot performance.

large language model, machine learning, natural language, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.59)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.33)

Add feedback

deb3c28192f979302c157cb653c15e90-Paper-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 00:50:57 GMT

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)

Add feedback

dd83eada2c3c74db3c7fe1c087513756-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsApr-30-2026, 00:22:47 GMT

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > United States > California > Santa Clara County > Palo Alto (0.15)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.92)
Law > Intellectual Property & Technology Law (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
(2 more...)

Add feedback

dc9d5dcf3e86b83e137bad367227c8ca-Paper-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 00:08:10 GMT

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

The Geometric Canary: Predicting Steerability and Detecting Drift via Representational Stability

Raju, Prashant C.

arXiv.org Machine LearningApr-30-2026

Reliable deployment of language models requires two capabilities that appear distinct but share a common geometric foundation: predicting whether a model will accept targeted behavioral control, and detecting when its internal structure degrades. We show that geometric stability, the consistency of a representation's pairwise distance structure, addresses both. Supervised Shesha variants that measure task-aligned geometric stability predict linear steerability with near-perfect accuracy ($ρ= 0.89$-$0.97$) across 35-69 embedding models and three NLP tasks, capturing unique variance beyond class separability (partial $ρ= 0.62$-$0.76$). A critical dissociation emerges: unsupervised stability fails entirely for steering on real-world tasks ($ρ\approx 0.10$), revealing that task alignment is essential for controllability prediction. However, unsupervised stability excels at drift detection, measuring nearly $2\times$ greater geometric change than CKA during post-training alignment (up to $5.23\times$ in Llama) while providing earlier warning in 73\% of models and maintaining a $6\times$ lower false alarm rate than Procrustes. Together, supervised and unsupervised stability form complementary diagnostics for the LLM deployment lifecycle: one for pre-deployment controllability assessment, the other for post-deployment monitoring.

large language model, machine learning, shesha, (18 more...)

arXiv.org Machine Learning

2604.17698

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Inverting Foundation Models of Brain Function with Simulation-Based Inference

Bracher, Niels, Intes, Xavier, Radev, Stefan T.

arXiv.org Machine LearningApr-30-2026

Foundation models of brain activity promise a new frontier for in silico neuroscience by emulating neural responses to complex stimuli across tasks and modalities. A natural next step is to ask whether these models can also be used in reverse. Can we recover a stimulus or its properties from synthetic brain activity? We study this question in a proof-of-concept setting using TRIBEv2. We pair the brain emulator with large language models (LLMs) that generate news headlines from linguistic parameters such as valence, arousal, and dominance. We then use simulation-based inference to learn a probabilistic mapping from brain maps to latent stimulus parameters. Our results show that these parameters can be recovered from predicted brain maps, validating the quality of neural encodings. They also show that LLMs can serve as controllable stimulus generators for simulated experiments. Together, these findings provide a step toward decoding and inverse design with foundation brain models.

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2604.23865

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: