AITopics | Large Language Model

Collaborating Authors

Large Language Model

News Overviews Instructional Materials AI-Alerts Classics

Value-Guided KVCompression for LLMs via Approximated CURDecomposition

Neural Information Processing SystemsJun-18-2026, 10:46:15 GMT

Key-value (KV) cache compression has emerged as a critical technique for reducing the memory and latency overhead of autoregressive language models during inference. Prior approaches predominantly rely on query-key attention scores to rank and evict cached tokens, assuming that attention intensity correlates with semantic importance. However, this heuristic overlooks the contribution of value vectors, which directly influence the attention output. In this paper, we propose CurDKV, a novel, value-centric KV compression method that selects keys and values based on leverage scores computed from CUR matrix decomposition. Our approach approximates the dominant subspace of the attention output softmax(QK)V, ensuring that the retained tokens best preserve the model's predictive behavior. Theoretically, we show that attention score approximation does not guarantee output preservation, and demonstrate that CUR-based selection minimizes end-to-end attention reconstruction loss.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.45)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

SpaceServe: Spatial Multiplexing of Complementary Encoders and Decoders for Multimodal LLMs

Neural Information Processing SystemsJun-18-2026, 10:34:49 GMT

Recent multimodal large language models (MLLMs) marry modality-specific vision or audio encoders with a shared text decoder. While the encoder is computeintensive but memory-light, the decoder is the opposite, yet state-of-the-art serving stacks still time-multiplex these complementary kernels, idling SMs or HBM in turn. We introduce SpaceServe, a serving system that space-multiplexes MLLMs: it decouples all modality encoders from the decoder, and co-locates them on the same GPU using fine-grained SM partitioning available in modern runtimes. A cost-model-guided Space-Inference Scheduler (SIS) dynamically assigns SM slices, while a Time-Windowed Shortest-Remaining-First (TWSRFT) policy batches encoder requests to minimise completion latency and smooth decoder arrivals. Evaluation shows that SpaceServe reduces time-per-output-token by 4.81 on average and up to 28.9 on Nvidia A100 GPUs.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

ASustainable AIEconomy Needs Data Deals That Work for Generators

Neural Information Processing SystemsJun-18-2026, 10:33:07 GMT

We argue that the machine learning value chain is structurally unsustainable due to an economic data processing inequality: each state in the data cycle from inputs to model weights to synthetic outputs refines technical signal but strips economic equity from data generators. We show, by analyzing seventy-three public data deals, that the majority of value accrues to aggregators, with documented creator royalties rounding to zero and widespread opacity of deal terms. This is not just an economic welfare concern: as data and its derivatives become economic assets, the feedback loop that sustains current learning algorithms is at risk. We identify three structural faults--missing provenance, asymmetric bargaining power, and nondynamic pricing--as the operational machinery of this inequality. In our analysis, we trace these problems along the machine learning value chain and propose an Equitable Data-Value Exchange (EDVEX) Framework to enable a minimal market that benefits all participants. Finally, we outline research directions where our community can make concrete contributions to data deals and contextualize our position with related and orthogonal viewpoints.

arxiv preprint arxiv, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country:

Europe (0.93)
North America > United States (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Media > News (0.94)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.75)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.52)

Add feedback

Linear Attention for Efficient Bidirectional Sequence Modeling

Neural Information Processing SystemsJun-18-2026, 10:17:14 GMT

Linear Transformers and State Space Models have emerged as efficient alternatives to softmax Transformers for causal sequence modeling, enabling parallel training via matrix multiplication and efficient RNN-style inference. However, despite their success in causal tasks, no unified framework exists for applying Linear Transformers to bidirectional sequence modeling. We introduce LION, the first framework to systematically extend Linear Transformers to the bidirectional setting. LION generalizes three core representations commonly used in the causal case--full Linear Attention, bidirectional RNN, and chunkwise parallel form--to the bidirectional setting. These forms are theoretically equivalent and enable models to exploit the strengths of each during training and inference. We prove that a broad class of Linear Transformers can be extended using LION and validate our framework via three core examples based on the choice of decay type: LION-LIT, the bidirectional extension of [25]; LION-D, based on [44]; and LION-S, a variant using selective decay [34, 13]. Across standard bidirectional tasks, LION enables models to match or exceed the performance of softmax Transformers, while offering significantly faster training and more efficient inference than existing State Space Models.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Add feedback

WEAVER: Shrinking the Generation-Verification Gap with Weak Verifiers

Neural Information Processing SystemsJun-18-2026, 10:06:12 GMT

Verifiers can improve language model (LM) capabilities by providing feedback or selecting the best response from a pool of generated candidates. Currently, high-quality verifiers are either unscalable (e.g., humans) or limited in utility (e.g., tools like Lean for formal proofs). While LM judges and reward models have become broadly useful as general-purpose verifiers, a significant performance gap remains between them and oracle verifiers. To help close this gap, we introduce WEAVER, a framework for designing a strong verifier by combining multiple weak, imperfect verifiers. First we find that weighted ensembles of verifiers, which typically require learning from labeled data, significantly outperform unweighted combinations due to differences in the verifiers. To reduce the dependency on labeled data, WEAVER leverages weak supervision to estimate each verifier's accuracy and combines their outputs into a unified score that better reflects true response quality.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(5 more...)

Add feedback

IF-GUIDE: Influence Function-Guided Detoxification of LLMs

Neural Information Processing SystemsJun-18-2026, 09:58:21 GMT

We study how training data contributes to the emergence of toxic behaviors in large language models. Most prior work on reducing model toxicity adopts reactive approaches, such as fine-tuning pre-trained (and potentially toxic) models to align them with human values. In contrast, we propose a proactive approach-- IF-GUIDE--that leverages influence functions to identify and suppress harmful tokens in the training data. To this end, we first show that standard influence functions are ineffective at discovering harmful training records. We then present a novel adaptation that measures token-level attributions from training data to model toxicity, along with techniques for selecting toxic training documents and a learning objective that can be integrated into both pre-training and fine-tuning. Moreover, IF-GUIDE does not rely on human-preference data, which is typically required by existing alignment methods. In our evaluation, we demonstrate that IF-GUIDE substantially reduces both explicit and implicit toxicity--by up to 10 compared to uncensored models, and up to 3 compared to baseline alignment methods such as DPO and RAD--across both pre-training and fine-tuning scenarios. IF-GUIDE is computationally efficient: a billion-parameter model is not necessary for computing influence scores; a million-parameter model--with 7.5 fewer parameters--can effectively serve as a proxy for identifying harmful data.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.67)
Europe (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Panoptic Captioning: An Equivalence Bridge for Image and Text

Neural Information Processing SystemsJun-18-2026, 09:44:17 GMT

This work introduces panoptic captioning, a novel task striving to seek the minimum text equivalent of images, which has broad potential applications. We take the first step towards panoptic captioning by formulating it as a task of generating a comprehensive textual description for an image, which encapsulates all entities, their respective locations and attributes, relationships among entities, as well as global image state. Through an extensive evaluation, our work reveals that state-of-the-art Multi-modal Large Language Models (MLLMs) have limited performance in solving panoptic captioning. To address this, we propose an effective data engine named PancapEngine to produce high-quality data and a novel method named PancapChain to improve panoptic captioning. Specifically, our PancapEngine first detects diverse categories of entities in images by an elaborate detection suite, and then generates required panoptic captions using entity-aware prompts. Additionally, our PancapChain explicitly decouples the challenging panoptic captioning task into multiple stages and generates panoptic captions step by step. More importantly, we contribute a comprehensive metric named PancapScore and a human-curated test set for reliable model evaluation. Experiments show that our PancapChain-13B model can beat state-of-the-art opensource MLLMs like InternVL-2.5-78B and even surpass proprietary models like GPT-4o and Gemini-2.0-Pro,

caption, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country: Europe (0.45)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

710f3f8473b93394505a082f9a8f3ba2-Paper-Conference.pdf

Neural Information Processing SystemsJun-18-2026, 09:44:06 GMT

Xing identifying chronizing illustrated audio et (V2A) al., in the multiple 2024; Figure generation subtle

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Media (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Add feedback

SWE-RL: Advancing LLMReasoning via Reinforcement Learning on Open Software Evolution

Neural Information Processing SystemsJun-18-2026, 09:42:40 GMT

The recent DeepSeek-R1 release has demonstrated the immense potential of reinforcement learning (RL) in enhancing the general reasoning capabilities of large language models (LLMs). While DeepSeek-R1 and other follow-up work primarily focus on applying RL to competitive coding and math problems, this paper introduces SWE-RL, the first approach to scale RL-based LLM reasoning for real-world software engineering. Leveraging a lightweight rule-based reward (e.g., the similarity score between ground-truth and LLM-generated solutions), SWE-RL enables LLMs to autonomously recover a developer's reasoning processes and solutions by learning from extensive open-source software evolution data--the record of entire software development cycles, including code snapshots, code changes, and events such as issues and pull requests. Trained on top of Llama 3, our resulting reasoning model, Llama3-SWE-RL-70B, achieves a 41.0% solve rate on SWEbench Verified--a human-verified collection of real-world GitHub issues. To our knowledge, this is the best performance reported for medium-sized (<100B) LLMs to date, even comparable to leading proprietary LLMs like GPT-4o. Surprisingly, despite performing RL solely on software evolution data, Llama3-SWE-RL has even emerged with generalized reasoning skills. For example, it shows improved results on five out-of-domain tasks, namely, function coding, library use, code reasoning, mathematics, and general language understanding, whereas a supervisedfinetuning baseline even leads to performance degradation on average. Overall, SWE-RL opens up a new direction to improve the reasoning capabilities of LLMs through reinforcement learning on massive software engineering data.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

7103cd82de95a7b30983fcf74ba499ac-Paper-Conference.pdf

Neural Information Processing SystemsJun-18-2026, 09:42:23 GMT

Self-e beha Despite vior volution, curre, is essential nt the adv ability ancements for the of embodied agents in reinforcement to autonomously domain with fine-tuning long-horizon, improv (RFT) e their real-w sho reasoning wing orld strong tasks.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Add feedback