AITopics | Genre

VideoMAR: Autoregressive Video Generation with Continuous Tokens

Neural Information Processing Systems, Jun-17-2026, 07:16:58 GMT

Mask-based autoregressive models have demonstrated promising image generation capability in continuous space. However, their potential for video generation remains under-explored. In this paper, we propose VideoMAR, a concise and efficient decoder-only autoregressive image-to-video model with continuous tokens, which combines temporal frame-by-frame generation with spatial masked generation. We first identify temporal causality and spatial bi-directionality as the first principles of video AR models, and propose the next-frame diffusion loss to integrate masked generation with video generation. In addition, the high cost and difficulty of long-sequence autoregressive modeling is a basic but crucial issue. To address it, we propose temporal short-to-long curriculum learning and spatial progressive-resolution training, and employ a progressive temperature strategy at inference time to mitigate error accumulation.

large language model, machine learning, natural language, (18 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

TAPVid-360: Tracking Any Point in 360 from Narrow Field of View Video

Neural Information Processing Systems, Jun-17-2026, 07:13:23 GMT

Humans excel at constructing panoramic mental models of their surroundings, maintaining object permanence and inferring scene structure beyond visible regions.

artificial intelligence, machine learning, object-oriented architecture, (20 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.93)
Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Graphics (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.46)

GTOmnire Ours

Neural Information Processing Systems, Jun-17-2026, 07:12:23 GMT

Recently, Gaussian Splatting (GS) has shown great potential for urban scene reconstruction in the field of autonomous driving. However, current urban scene reconstruction methods often depend on multimodal sensors as inputs, i.e.

artificial intelligence, gaussian, machine learning, (16 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Position: Require Frontier AI Labs To Release Small "Analog" Models

Shriyash Upadhyay, Chaithanya Bandi, Narmeen Oozeer, Philip Quirke (Martian)

Neural Information Processing Systems, Jun-17-2026, 07:12:02 GMT

Recent proposals for regulating frontier AI models have sparked concerns about the cost of safety regulation, and most such regulations have been shelved due to the safety-innovation tradeoff. This paper argues for an alternative regulatory approach that ensures AI safety while actively promoting innovation: mandating that large AI laboratories release small, openly accessible "analog models"--scaled-down versions trained similarly to and distilled from their largest proprietary models. Analog models serve as public proxies, allowing broad participation in safety verification, interpretability research, and algorithmic transparency without forcing labs to disclose their full-scale models. Recent research demonstrates that safety and interpretability methods developed using these smaller models generalize effectively to frontier-scale systems. By enabling the wider research community to directly investigate and innovate upon accessible analogs, our policy substantially reduces the regulatory burden and accelerates safety advancements. This mandate promises minimal additional costs, leveraging reusable resources like data and infrastructure, while significantly contributing to the public good. Our hope is not only that this policy be adopted, but that it illustrates a broader principle supporting fundamental research in machine learning: deeper understanding of models relaxes the safety-innovation tradeoff and lets us have more of both.

large language model, machine learning, natural language, (18 more...)

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (0.95)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

524730fd7bbf292675f5011fac1923b9-Paper-Conference.pdf

Neural Information Processing Systems, Jun-17-2026, 07:11:01 GMT

They define near access-freeness (NAF) and propose it as sufficient for protection.

large language model, machine learning, natural language, (22 more...)

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Security & Privacy (0.67)
Information Technology > Artificial Intelligence > Vision (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Neural Information Processing Systems, Jun-17-2026, 07:03:27 GMT

Reinforcement Learning with Verifiable Rewards (RLVR) has proven effective for training large language models (LLMs) on complex reasoning tasks, such as mathematical problem solving. A prerequisite for the scalability of RLVR is a high-quality problem set with precise and verifiable answers.

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Country: North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)
Research Report > New Finding (0.67)

Industry: Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Fast attention mechanisms: a tale of parallelism

Neural Information Processing Systems, Jun-17-2026, 07:03:05 GMT

Transformers have the representational capacity to simulate Massively Parallel Computation (MPC) algorithms, but they suffer from quadratic time complexity, which severely limits their scalability. We introduce an efficient attention mechanism called Approximate Nearest Neighbor Attention (ANNA) with sub-quadratic time complexity. We prove that ANNA-transformers (1) retain the expressive power previously established for standard attention in terms of matching the capabilities of MPC algorithms, and (2) can solve key reasoning tasks such as Match2 and k-hop with near-optimal depth. Using the MPC framework, we further prove that constant-depth ANNA-transformers can simulate constant-depth low-rank transformers, thereby providing a unified way to reason about a broad class of efficient attention approximations.

anna-transformer, large language model, machine learning, (19 more...)

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

PSMBENCH: A Benchmark and Dataset for Evaluating LLMs Extraction of Protocol State Machines from RFC Specifications

Neural Information Processing Systems, Jun-17-2026, 07:02:05 GMT

Accurately extracting protocol-state machines (PSMs) from the long, densely written Request-for-Comments (RFC) standards that govern Internet-scale communication remains a bottleneck for automated security analysis and protocol testing. In this paper, we introduce RFC2PSM, the first large-scale dataset that pairs 1,580 pages of cleaned RFC text with 108 manually validated states and 297 transitions covering 14 widely deployed protocols spanning the data-link, transport, session, and application layers. Built on this corpus, we propose PSMBENCH, a benchmark that (i) feeds chunked RFC text to an LLM, (ii) prompts the model to emit a machine-readable PSM, and (iii) scores the output with structure-aware, semantic fuzzy-matching metrics that reward partially correct graphs. A comprehensive baseline study of nine state-of-the-art open and commercial LLMs reveals a persistent state-transition gap: models identify many individual states (up to 0.82 F1) but struggle to assemble coherent transition graphs (0.38 F1), highlighting challenges in long-context reasoning, alias resolution, and action/event disambiguation. We release the dataset, evaluation code, and all model outputs as open source, providing a fully reproducible starting point for future work on reasoning over technical prose and generating executable graph structures. RFC2PSM and PSMBENCH aim to catalyze cross-disciplinary progress toward LLMs that can interpret and verify the protocols that keep the Internet safe.
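
To make the scoring idea in step (iii) concrete, the following is a minimal sketch of a fuzzy-matching F1 over states and transitions. It is not the benchmark's actual metric: the function names, the greedy one-to-one matching, the `difflib` string similarity, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_f1(predicted: list[str], reference: list[str], threshold: float = 0.8) -> float:
    """F1 under greedy one-to-one matching.

    A predicted item counts as correct when its best remaining reference
    item has similarity >= threshold (hypothetical threshold).
    """
    unmatched = list(reference)
    true_positives = 0
    for item in predicted:
        best = max(unmatched, key=lambda ref: similarity(item, ref), default=None)
        if best is not None and similarity(item, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)  # each reference item can match only once
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def transition_key(transition: tuple[str, str, str]) -> str:
    """Flatten a (source, event, target) transition into one comparable string."""
    source, event, target = transition
    return f"{source} --{event}--> {target}"


if __name__ == "__main__":
    ref_states = ["CLOSED", "LISTEN", "SYN_SENT", "ESTABLISHED"]
    pred_states = ["closed", "listen", "SYN-SENT", "ESTABLISHED"]
    ref_edges = [("CLOSED", "passive OPEN", "LISTEN"), ("LISTEN", "rcv SYN", "SYN_RCVD")]
    pred_edges = [("CLOSED", "passive open", "LISTEN"), ("LISTEN", "send data", "CLOSED")]

    print("state F1:", fuzzy_f1(pred_states, ref_states))
    print(
        "transition F1:",
        fuzzy_f1([transition_key(t) for t in pred_edges],
                 [transition_key(t) for t in ref_edges]),
    )
```

Scoring states and transitions separately mirrors how the abstract reports the gap: a model can name most states correctly while still wiring them together incorrectly.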

large language model, machine learning, natural language, (22 more...)

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning

Neural Information Processing Systems, Jun-17-2026, 07:01:47 GMT

Recent large language models (LLMs) have demonstrated strong reasoning capabilities that benefit from online reinforcement learning (RL). These capabilities have primarily been demonstrated within the left-to-right autoregressive (AR) generation paradigm. In contrast, non-autoregressive paradigms based on diffusion generate text in a coarse-to-fine manner. Although recent diffusion-based large language models (dLLMs) have achieved competitive language modeling performance compared to their AR counterparts, it remains unclear if dLLMs can also leverage recent advances in LLM reasoning. To this end, we propose d1, a framework to adapt pre-trained masked dLLMs into reasoning models via a combination of supervised finetuning (SFT) and RL. Specifically, we develop and extend techniques to improve reasoning in pretrained dLLMs: (a) we utilize a masked SFT technique to distill knowledge and instill self-improvement behavior directly from existing datasets, and (b) we introduce a novel critic-free, policy-gradient-based RL algorithm called diffu-GRPO, the first integration of policy gradient methods into masked dLLMs. Through empirical studies, we investigate the performance of different post-training recipes on multiple mathematical and planning benchmarks. We find that d1 yields the best performance and significantly improves the performance of a state-of-the-art dLLM. Our code is released at https://dllm-reasoning.github.io/.
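
As background for the "critic-free" description, GRPO-style methods replace a learned value function with a baseline computed from a group of completions sampled for the same prompt. The sketch below shows only that group-relative advantage step. It does not cover how diffu-GRPO estimates log-probabilities for a masked dLLM, and the function name, the population standard deviation, and the `eps` constant are assumptions made for illustration.

```python
import statistics


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own sampling group.

    rewards: scalar rewards for several completions of ONE prompt.
    Returns (r - mean) / (std + eps) per completion; no critic network is used.
    Population std and eps are illustrative choices, not taken from the paper.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled answers to one math problem, rewarded 1.0 if verified correct.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

When every completion in a group receives the same reward, all advantages are zero, so that prompt contributes no gradient signal.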

large language model, machine learning, natural language, (19 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

What Moves the Eyes: Doubling Mechanistic Model Performance Using Deep Networks to Discover and Test Cognitive Hypotheses

Neural Information Processing Systems, Jun-17-2026, 06:52:46 GMT

Understanding how humans move their eyes to gather visual information is a central question in neuroscience, cognitive science, and vision research. While recent deep learning (DL) models achieve state-of-the-art performance in predicting human scanpaths, their underlying decision processes remain opaque. At the opposite end of the modeling spectrum, cognitively inspired mechanistic models aim to explain scanpath behavior through interpretable cognitive mechanisms but lag far behind in predictive accuracy. In this work, we bridge this gap by using a high-performing deep model, DeepGaze III, to discover and test mechanisms that improve a leading mechanistic model, SceneWalk. By identifying individual fixations where DeepGaze III succeeds and SceneWalk fails, we isolate behaviorally meaningful discrepancies and use them to motivate targeted extensions of the mechanistic framework. These include time-dependent temperature scaling, saccadic momentum, and an adaptive cardinal attention bias: simple, interpretable additions that substantially boost predictive performance. With these extensions, SceneWalk's explained variance on the MIT1003 dataset doubles from 35% to 70%, setting a new state of the art in mechanistic scanpath prediction. Our findings show how performance-optimized neural networks can serve as tools for cognitive model discovery, offering a new path toward interpretable and high-performing models of visual behavior.

artificial intelligence, bit fix, machine learning, (20 more...)

Country: