DynaSpec: Context-aware Dynamic Speculative Sampling for Large-Vocabulary Language Models

Zhang, Jinbin, Ullah, Nasib, Schultheis, Erik, Babbar, Rohit

arXiv.org Artificial Intelligence

Speculative decoding has become a standard way to accelerate LLM inference: a small drafter proposes multiple tokens and the large target model verifies them once per speculation length. Recently, LLM vocabularies have grown substantially. While verification over the full vocabulary leaves the target model largely unaffected, the O(|V|d) parameters in the drafter's output head become a latency bottleneck that slows the entire pipeline. Contemporary methods (e.g., FR-Spec, VocabTrim) restrict the drafter's vocabulary to a fixed subset of the target model's most frequent tokens. Although this reduces draft-time compute, it is brittle, since: (i) frequency lists are corpus-dependent and require retuning to generalize, and (ii) static shortlists suppress rare or domain-specific tokens, lowering the expected number of accepted tokens per verification step. We propose DynaSpec, a context-dependent dynamic shortlisting mechanism that is robust, speeds up drafting, and generalizes across diverse tasks. Concretely, we introduce lightweight, coarse-grained meta-classifiers that route contexts to a small number of token clusters; the union of the top-k selected clusters forms the drafter's shortlist, while verification retains the full vocabulary and remains exact. The meta-classifier finishes before the drafter's hidden-state generation by running draft encoding and meta-shortlisting in parallel on separate streams. Across standard speculative decoding benchmarks, DynaSpec delivers consistent improvements in mean accepted length, for Llama-3-8B reaching up to 98.2% of full-vocabulary performance, while fixed-shortlist baselines attain only 84.4%. By leveraging context-dependent selection, DynaSpec achieves up to a 2.18x increase in generated tokens, compared with 1.91x for fixed-vocabulary approaches.
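The shortlist construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the vocabulary sizes, the random token-to-cluster assignment, and the single linear meta-classifier layer are all hypothetical stand-ins (the actual clustering and meta-classifier training are not specified here).

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, N_CLUSTERS, TOP_K = 1000, 32, 16, 4

# Hypothetical fixed partition of the vocabulary into coarse token clusters.
cluster_of = rng.integers(0, N_CLUSTERS, size=VOCAB)

# Lightweight meta-classifier: one linear layer scoring clusters from the context.
W_meta = rng.normal(size=(N_CLUSTERS, DIM))

def dynamic_shortlist(context_emb: np.ndarray) -> np.ndarray:
    """Route the context to clusters; the union of the top-k clusters is the shortlist."""
    scores = W_meta @ context_emb                   # one score per cluster
    top_clusters = np.argsort(scores)[-TOP_K:]      # coarse-grained routing
    return np.flatnonzero(np.isin(cluster_of, top_clusters))

def draft_logits(hidden: np.ndarray, W_head: np.ndarray,
                 shortlist: np.ndarray) -> np.ndarray:
    """Drafter output head evaluated only on the shortlist: O(|S|d) instead of O(|V|d)."""
    return W_head[shortlist] @ hidden

context = rng.normal(size=DIM)
shortlist = dynamic_shortlist(context)
W_head = rng.normal(size=(VOCAB, DIM))
logits = draft_logits(rng.normal(size=DIM), W_head, shortlist)
print(len(shortlist), logits.shape)
```

The latency saving comes from the last function: the matrix-vector product touches only the shortlisted rows of the output head, while verification by the target model still scores the full vocabulary.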


'Vibe coding' named word of the year by Collins Dictionary

BBC News

If you've ever wanted to create your own computer program but never learnt how to code, you might try vibe coding. Collins Dictionary's word of the year - which is confusingly made up of two words - is the art of making an app or website by describing it to artificial intelligence (AI) rather than by writing programming code manually. The term was coined in February by OpenAI co-founder Andrej Karpathy, who came up with the name to represent how AI can let some programmers forget that the code even exists and give in to the vibes while making a computer program. It was one of 10 words on a shortlist to reflect the mood, language and preoccupations of 2025. By giving an AI tool a simple description such as "make me a program that schedules my weekly meals", people can use vibe coding to make basic apps without any previous programming knowledge.


PartnerMAS: An LLM Hierarchical Multi-Agent Framework for Business Partner Selection on High-Dimensional Features

Li, Lingyao, Wu, Haolun, Li, Zhenkun, Hu, Jiabei, Wang, Yu, Huang, Xiaoshan, Hua, Wenyue, Wang, Wenqian

arXiv.org Artificial Intelligence

High-dimensional decision-making tasks, such as business partner selection, involve evaluating large candidate pools with heterogeneous numerical, categorical, and textual features. We present PartnerMAS, a hierarchical multi-agent framework that decomposes evaluation into three layers: a Planner Agent that designs strategies, Specialized Agents that perform role-specific assessments, and a Supervisor Agent that integrates their outputs. To support systematic evaluation, we also introduce a curated benchmark dataset of venture capital co-investments, featuring diverse firm attributes and ground-truth syndicates. PartnerMAS consistently outperforms single-agent and debate-based multi-agent baselines, achieving up to 10-15% higher match rates. Analysis of agent reasoning shows that planners are most responsive to domain-informed prompts, specialists produce complementary feature coverage, and supervisors play an important role in aggregation. Our implementation is available at this anonymous link.
In real-world decision-making, practitioners often navigate high-dimensional data including extensive option sets and numerous evaluative features (Sandanayake et al., 2018; Sigle et al., 2023). Business partner selection, which includes partner shortlisting and strategic alliance formation, exemplifies this challenge (Mindruta et al., 2016): firms often face a vast pool of potential candidates, each described by diverse attributes ranging from quantitative indicators (e.g., financial metrics, geographic presence) to text-rich information (e.g., strategic fit, investment preferences) (Shah & Swaminathan, 2008). The scale and complexity of such data can easily overwhelm human decision-makers, incurring significant costs (Li et al., 2008). This underscores the need for intelligent systems capable of analyzing large candidate sets and diverse features.
Large language models (LLMs) have emerged as promising tools for addressing reasoning tasks in data-rich domains (Lee et al., 2025; Mischler et al., 2024). With appropriate prompting (e.g., few-shot learning) or information retrieval techniques (e.g., RAG), these models can identify salient features using only feature and task descriptions, achieving performance comparable to established methods (Li et al., 2025a; Jeong et al., 2024).
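The three-layer decomposition above can be sketched in plain Python, with deterministic scoring functions standing in for the LLM agent calls. All names, features, weights, and the mean-aggregation rule here are hypothetical illustrations of the planner/specialist/supervisor flow, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Candidate:
    name: str
    financial: float       # quantitative indicator (e.g., financial metrics)
    strategic_fit: float   # derived from text-rich features

def planner(candidates: List[Candidate]) -> List[str]:
    """Planner Agent: decide which specialist assessments to run.
    A real planner would be an LLM responding to domain-informed prompts."""
    return ["financial", "strategic"]

# Specialized Agents: role-specific assessments (stand-ins for LLM calls).
SPECIALISTS: Dict[str, Callable[[Candidate], float]] = {
    "financial": lambda c: c.financial,
    "strategic": lambda c: c.strategic_fit,
}

def supervisor(scores: Dict[str, float]) -> float:
    """Supervisor Agent: integrate role-specific assessments (simple mean here)."""
    return sum(scores.values()) / len(scores)

def shortlist(candidates: List[Candidate], k: int = 2) -> List[str]:
    plan = planner(candidates)
    ranked = sorted(
        candidates,
        key=lambda c: supervisor({role: SPECIALISTS[role](c) for role in plan}),
        reverse=True,
    )
    return [c.name for c in ranked[:k]]

pool = [Candidate("A", 0.9, 0.4), Candidate("B", 0.6, 0.8), Candidate("C", 0.2, 0.3)]
print(shortlist(pool))
```

The design point the hierarchy captures is separation of concerns: the planner chooses which assessments matter, each specialist sees only its own feature slice, and only the supervisor combines them.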


Challenger-Based Combinatorial Bandits for Subcarrier Selection in OFDM Systems

Amiri, Mohsen, Venktesh, V, Magnússon, Sindri

arXiv.org Artificial Intelligence

This paper investigates the identification of the top-m user-scheduling sets in multi-user MIMO downlink, which is cast as a combinatorial pure-exploration problem in stochastic linear bandits. Because the action space grows exponentially, exhaustive search is infeasible. We therefore adopt a linear utility model to enable efficient exploration and reliable selection of promising user subsets. We introduce a gap-index framework that maintains a shortlist of current estimates of champion arms (top-m sets) and a rotating shortlist of challenger arms that pose the greatest threat to the champions. This design focuses on measurements that yield the most informative gap-index-based comparisons, resulting in significant reductions in runtime and computation compared to state-of-the-art linear bandit methods, with high identification accuracy. The method also exposes a tunable trade-off between speed and accuracy. Simulations on a realistic OFDM downlink show that shortlist-driven pure exploration makes online, measurement-efficient subcarrier selection practical for AI-enabled communication systems.
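The champion/challenger gap-index loop can be sketched as follows. This simplification drops the paper's linear utility model and treats each candidate set as an independent arm with a hypothetical mean and noise scale; it keeps the core idea of spending measurements on the weakest champion and the most threatening challenger.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-subset utilities; the top-m sets to identify are the last three.
true_means = np.array([0.10, 0.15, 0.20, 0.25, 0.30, 0.35,
                       0.40, 0.45, 0.50, 0.85, 0.90, 0.95])
M, NOISE = 3, 0.2
n_arms = len(true_means)

counts = np.ones(n_arms)
sums = np.array([rng.normal(mu, NOISE) for mu in true_means])  # one pull each

def pull(i: int) -> None:
    counts[i] += 1
    sums[i] += rng.normal(true_means[i], NOISE)

for t in range(3000):
    means = sums / counts
    width = np.sqrt(2 * np.log(2 + t) / counts)    # confidence radius
    champions = np.argsort(means)[-M:]             # current top-m estimates
    others = np.setdiff1d(np.arange(n_arms), champions)
    # Weakest champion: smallest pessimistic value among the current top-m.
    weakest = champions[np.argmin(means[champions] - width[champions])]
    # Challenger: non-champion whose optimistic value most threatens the champions.
    challenger = others[np.argmax(means[others] + width[others])]
    pull(weakest)
    pull(challenger)

top_m = sorted(np.argsort(sums / counts)[-M:])
print(top_m)
```

Because only the weakest-champion/best-challenger pair is measured each round, sampling concentrates on the boundary between the top-m set and the rest, which is where the gap-index comparisons are most informative; arms far from the boundary are pulled rarely.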