AITopics

2605.11865

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Vaidya, Omatharv Bharat, Jerzak, Connor T., Ho, Nhat, Bajaj, Chandrajit

Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms

arXiv.org Machine Learning May-12-2026

We present a data-adaptive method for parameter-efficient fine-tuning of large neural networks. Standard low-rank adaptation methods improve efficiency by restricting each layer update to a fixed low-rank form, but this static parameterization can be too rigid when the appropriate correction depends on the input and on the evolving depth-wise computation of the network. Our approach replaces a purely layer-local adapter with a shared queryable memory of low-rank update atoms. For each block of layers, the model forms a query from the current low-rank state and a running summary of previous blocks, uses this query to retrieve a content-dependent combination of shared update components via attention, and applies the resulting routed operator within the low-rank bottleneck. In this way, the method retains the efficiency and scalability of low-rank adaptation while allowing the effective update to vary across inputs and to share reusable structure across layers. The resulting architecture provides a principled middle ground between static LoRA-style updates and fully generated parameter updates: it remains compact and parameter-efficient while supporting dynamic, context-sensitive adaptation. Further, we incorporate instruction-regularization by augmenting routing logits with a language-induced prior over update atoms, thereby biasing the selection of low-rank transformations toward semantically relevant directions without generating unconstrained parameter updates. Experiments on noisy non-linear regression tasks and LLM fine-tuning suggest that this queryable update-memory formulation can improve final test performance and training stability compared to standard low-rank adaptation, while using a comparable number of trainable parameters.
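The routing step described in this abstract can be illustrated with a minimal sketch. The abstract gives no equations, so everything concrete here is an assumption: the function name `routed_lora_update`, the additive query (low-rank state plus running summary), the scaled dot-product scoring against per-atom keys, and the representation of each update atom as an r x r matrix applied inside the bottleneck. It is one plausible reading, not the authors' implementation.

```python
import math

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def routed_lora_update(x, down, up, keys, atoms, summary, prior_logits):
    """Return (delta, weights) for one block (hypothetical interface).

    down: r x d_in projection, up: d_out x r projection,
    keys: K vectors of length r, atoms: K matrices of shape r x r,
    summary: r-vector summarising earlier blocks (assumed form),
    prior_logits: K-vector standing in for the instruction prior.
    """
    h = matvec(down, x)                            # current low-rank state
    query = [a + b for a, b in zip(h, summary)]    # assumed query construction
    scale = math.sqrt(len(h))
    # Attention logits over atoms, augmented with the prior.
    logits = [sum(q * k for q, k in zip(query, key)) / scale + p
              for key, p in zip(keys, prior_logits)]
    weights = softmax(logits)
    r = len(h)
    # Routed r x r operator: attention-weighted sum of the shared atoms.
    routed = [[sum(w * atom[i][j] for w, atom in zip(weights, atoms))
               for j in range(r)] for i in range(r)]
    delta = matvec(up, matvec(routed, h))          # apply inside the bottleneck
    return delta, weights
```

In this sketch the down- and up-projections stay fixed per layer while the small r x r operator varies with the input, which is one way to keep the trainable parameter count close to that of a static low-rank adapter.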

artificial intelligence, machine learning, natural language, (16 more...)

2605.08423

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Neural Information Processing Systems Apr-30-2026, 08:24:34 GMT

Appendix A Prompt Retrieval

The PubMedQA task is to answer research questions with yes, no, or maybe, given the corresponding abstracts.

gpt-3, large language model, machine learning, (16 more...)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Neural Information Processing Systems Apr-30-2026, 01:34:45 GMT

Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity (Supplemental Material)

Arboretum is a 134.6M sample dataset designed to advance AI for biodiversity applications by providing a large-scale, accurately annotated multimodal dataset that includes images and corresponding textual descriptions for a diverse set of species. Arboretum aims to facilitate the development of AI models for species identification, ecological monitoring, and agricultural research. Additionally, we introduce three new benchmark datasets: Arboretum-Unseen, Arboretum-LifeStages, and Arboretum-Balanced. As the authors of this submission, we affirm that we bear all responsibility in case of any rights violations or ethical issues associated with this work. We confirm that the submitted work is original, and if it includes third-party content, it is used with proper permissions and attributions.

a large multimodal dataset enabling ai, artificial intelligence, supplemental material, (12 more...)

Country: North America > United States > Arizona > Pima County > Tucson (0.18)

Technology: Information Technology > Artificial Intelligence (0.58)

Neural Information Processing Systems Apr-30-2026, 01:20:35 GMT

b8b93c48f5bfa385d071342089d70422-Supplemental-Datasets_and_Benchmarks_Track.pdf

artificial intelligence, caption, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Hao, Yongchang, Mou, Lili

Cactus: Accelerating Auto-Regressive Decoding with Constrained Acceptance Speculative Sampling

arXiv.org Machine Learning Apr-8-2026

Speculative sampling (SpS) has been successful in accelerating the decoding throughput of auto-regressive large language models by leveraging smaller draft models. SpS strictly enforces the generated distribution to match that of the verifier LLM. This is unnecessarily restrictive, as slight variations of the verifier's distribution, such as sampling with top-$k$ or temperature, would also be acceptable. Typical acceptance sampling (TAS) alleviates this issue by accepting more tokens using entropy-based heuristics. However, this approach distorts the verifier distribution, potentially degrading output quality when the verifier encodes critical information. In this work, we formalize the speculative sampling algorithm through the lens of constrained optimization. Based on this formulation, we propose Cactus (constrained acceptance speculative sampling), a method that guarantees controlled divergence from the verifier distribution while increasing acceptance rates. Empirical results across a wide range of benchmarks confirm the effectiveness of our approach.
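For context, the strict baseline that this abstract contrasts against can be sketched as follows. This is the standard speculative-sampling acceptance rule (accept a drafted token with probability min(1, p/q), otherwise resample from the renormalized residual), not the Cactus method, whose constrained acceptance rule is not given in the abstract. The function name `speculative_accept` and the list-based probability vectors are illustrative assumptions.

```python
import random

def speculative_accept(token, p_verifier, q_draft, rng):
    """One standard speculative-sampling step for a single drafted token.

    p_verifier, q_draft: probability lists over the same vocabulary.
    Returns (accepted, token_out).
    """
    p, q = p_verifier[token], q_draft[token]
    # Accept the drafted token with probability min(1, p / q).
    if q > 0 and rng.random() < min(1.0, p / q):
        return True, token
    # On rejection, resample from max(p - q, 0), renormalized.
    residual = [max(pv - qv, 0.0) for pv, qv in zip(p_verifier, q_draft)]
    total = sum(residual)
    if total == 0.0:
        # Degenerate case (p equals q everywhere): sample from the verifier.
        residual, total = list(p_verifier), sum(p_verifier)
    weights = [r / total for r in residual]
    new_token = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return False, new_token
```

Under this rule the output token is distributed exactly according to the verifier, which is the strictness the abstract describes as unnecessarily restrictive.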

large language model, machine learning, (20 more...)

2604.04987

Country:

North America > Canada (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems Mar-21-2026, 11:30:37 GMT

Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series

Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow performance, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the light-weight TSMixer architecture, incorporates innovations like adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on varied dataset resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by 4-40%, while reducing computational requirements significantly. Moreover, TTMs are lightweight and can be executed even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments.

large language model, machine learning, natural language, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.58)

arXiv.org Machine Learning Mar-16-2026

Finite Difference Flow Optimization for RL Post-Training of Text-to-Image Models

McAllister, David, Aittala, Miika, Karras, Tero, Hellsten, Janne, Kanazawa, Angjoo, Aila, Timo, Laine, Samuli

Reinforcement learning (RL) has become a standard technique for post-training diffusion-based image synthesis models, as it enables learning from reward signals to explicitly improve desirable aspects such as image quality and prompt alignment. In this paper, we propose an online RL variant that reduces the variance in the model updates by sampling paired trajectories and pulling the flow velocity in the direction of the more favorable image. Unlike existing methods that treat each sampling step as a separate policy action, we consider the entire sampling process as a single action. We experiment with both high-quality vision language models and off-the-shelf quality metrics for rewards, and evaluate the outputs using a broad set of metrics. Our method converges faster and yields higher output quality and prompt alignment than previous approaches.

machine learning, natural language, reinforcement learning, (18 more...)

2603.12893

Country:

Europe > United Kingdom > England (0.04)
Europe > Hungary > Budapest > Budapest (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Jiangsu Province > Changzhou (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems Feb-19-2026, 10:53:07 GMT

bc218a0c656e49d4b086975a9c785f47-Supplemental-Datasets_and_Benchmarks.pdf

Emerging ethical approaches have attempted to filter pretraining material, but such approaches have been ad hoc and failed to take context into account. We offer an approach to filtering grounded in law, which has directly addressed the tradeoffs in filtering material.

information, machine learning, natural language, (20 more...)

Country:

Europe > Germany (0.14)
Asia > China (0.14)
North America > Canada > British Columbia (0.04)
(12 more...)

Genre: Research Report (0.67)

Industry:

Law > Statutes (1.00)
Law > Litigation (1.00)
Law > Civil Rights & Constitutional Law (1.00)
(5 more...)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.67)