DynaSpec: Context-aware Dynamic Speculative Sampling for Large-Vocabulary Language Models

Zhang, Jinbin, Ullah, Nasib, Schultheis, Erik, Babbar, Rohit

arXiv.org Artificial Intelligence

Speculative decoding has become a standard way to accelerate LLM inference: a small drafter proposes multiple tokens and the large target model verifies them once per speculation length. Recently, LLM vocabularies have grown substantially. While verification over the full vocabulary leaves the target model largely unaffected, the O(|V|d) parameters in the drafter's output head become a latency bottleneck that slows the entire pipeline. Contemporary methods (e.g., FR-Spec, VocabTrim) restrict the drafter's vocabulary to a fixed subset of the target model's most frequent tokens. Although this reduces draft-time compute, it is brittle, since: (i) frequency lists are corpus-dependent and require retuning to generalize, and (ii) static shortlists suppress rare or domain-specific tokens, lowering the expected number of accepted tokens per verification step. We propose DynaSpec, a context-dependent dynamic shortlisting mechanism that is robust, speeds up drafting, and generalizes across diverse tasks. Concretely, we introduce lightweight, coarse-grained meta-classifiers that route contexts to a small number of token clusters; the union of the top-k selected clusters forms the drafter's shortlist, while verification retains the full vocabulary and remains exact. The meta-classifier finishes before the drafter's hidden-state generation by running draft encoding and meta-shortlisting in parallel on separate streams. Across standard speculative decoding benchmarks, DynaSpec delivers consistent improvements in mean accepted length, for Llama-3-8B reaching up to 98.2% of full-vocabulary performance, while fixed-shortlist baselines attain only 84.4%. By leveraging context-dependent selection, DynaSpec achieves up to a 2.18x increase in generated tokens, compared with 1.91x for fixed-vocabulary approaches.
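The shortlist construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the vocabulary sizes, the random token-to-cluster assignment, and the single linear meta-classifier layer are all hypothetical stand-ins (the actual clustering and meta-classifier training are not specified here).

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, N_CLUSTERS, TOP_K = 1000, 32, 16, 4

# Hypothetical fixed partition of the vocabulary into coarse token clusters.
cluster_of = rng.integers(0, N_CLUSTERS, size=VOCAB)

# Lightweight meta-classifier: one linear layer scoring clusters from the context.
W_meta = rng.normal(size=(N_CLUSTERS, DIM))

def dynamic_shortlist(context_emb: np.ndarray) -> np.ndarray:
    """Route the context to clusters; the union of the top-k clusters is the shortlist."""
    scores = W_meta @ context_emb                   # one score per cluster
    top_clusters = np.argsort(scores)[-TOP_K:]      # coarse-grained routing
    return np.flatnonzero(np.isin(cluster_of, top_clusters))

def draft_logits(hidden: np.ndarray, W_head: np.ndarray,
                 shortlist: np.ndarray) -> np.ndarray:
    """Drafter output head evaluated only on the shortlist: O(|S|d) instead of O(|V|d)."""
    return W_head[shortlist] @ hidden

context = rng.normal(size=DIM)
shortlist = dynamic_shortlist(context)
W_head = rng.normal(size=(VOCAB, DIM))
logits = draft_logits(rng.normal(size=DIM), W_head, shortlist)
print(len(shortlist), logits.shape)
```

The latency saving comes from the last function: the matrix-vector product touches only the shortlisted rows of the output head, while verification by the target model still scores the full vocabulary.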


'Vibe coding' named word of the year by Collins Dictionary

BBC News

If you've ever wanted to create your own computer program but never learnt how to code, you might try vibe coding. Collins Dictionary's word of the year - which is confusingly made up of two words - is the art of making an app or website by describing it to artificial intelligence (AI) rather than by writing programming code manually. The term was coined in February by OpenAI co-founder Andrej Karpathy, who came up with the name to represent how AI can let some programmers forget that the code even exists and give in to the vibes while making a computer program. It was one of 10 words on a shortlist to reflect the mood, language and preoccupations of 2025. By giving an AI tool a simple description such as "make me a program that schedules my weekly meals", people can use vibe coding to make basic apps without any previous programming knowledge.


PartnerMAS: An LLM Hierarchical Multi-Agent Framework for Business Partner Selection on High-Dimensional Features

Li, Lingyao, Wu, Haolun, Li, Zhenkun, Hu, Jiabei, Wang, Yu, Huang, Xiaoshan, Hua, Wenyue, Wang, Wenqian

arXiv.org Artificial Intelligence

High-dimensional decision-making tasks, such as business partner selection, involve evaluating large candidate pools with heterogeneous numerical, categorical, and textual features. We present PartnerMAS, a hierarchical multi-agent framework that decomposes evaluation into three layers: a Planner Agent that designs strategies, Specialized Agents that perform role-specific assessments, and a Supervisor Agent that integrates their outputs. To support systematic evaluation, we also introduce a curated benchmark dataset of venture capital co-investments, featuring diverse firm attributes and ground-truth syndicates. PartnerMAS consistently outperforms single-agent and debate-based multi-agent baselines, achieving up to 10-15% higher match rates. Analysis of agent reasoning shows that planners are most responsive to domain-informed prompts, specialists produce complementary feature coverage, and supervisors play an important role in aggregation. Our implementation is available at this anonymous link.
In real-world decision-making, practitioners often navigate high-dimensional data including extensive option sets and numerous evaluative features (Sandanayake et al., 2018; Sigle et al., 2023). Business partner selection, which includes partner shortlisting and strategic alliance formation, exemplifies this challenge (Mindruta et al., 2016): firms often face a vast pool of potential candidates, each described by diverse attributes ranging from quantitative indicators (e.g., financial metrics, geographic presence) to text-rich information (e.g., strategic fit, investment preferences) (Shah & Swaminathan, 2008). The scale and complexity of such data can easily overwhelm human decision-makers, incurring significant costs (Li et al., 2008). This underscores the need for intelligent systems capable of analyzing large candidate sets and diverse features.
Large language models (LLMs) have emerged as promising tools for addressing reasoning tasks in data-rich domains (Lee et al., 2025; Mischler et al., 2024). With appropriate prompting (e.g., few-shot learning) or information retrieval techniques (e.g., RAG), these models can identify salient features using only feature and task descriptions, achieving performance comparable to established methods (Li et al., 2025a; Jeong et al., 2024).
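The three-layer decomposition above can be sketched in plain Python, with deterministic scoring functions standing in for the LLM agent calls. All names, features, weights, and the mean-aggregation rule here are hypothetical illustrations of the planner/specialist/supervisor flow, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Candidate:
    name: str
    financial: float       # quantitative indicator (e.g., financial metrics)
    strategic_fit: float   # derived from text-rich features

def planner(candidates: List[Candidate]) -> List[str]:
    """Planner Agent: decide which specialist assessments to run.
    A real planner would be an LLM responding to domain-informed prompts."""
    return ["financial", "strategic"]

# Specialized Agents: role-specific assessments (stand-ins for LLM calls).
SPECIALISTS: Dict[str, Callable[[Candidate], float]] = {
    "financial": lambda c: c.financial,
    "strategic": lambda c: c.strategic_fit,
}

def supervisor(scores: Dict[str, float]) -> float:
    """Supervisor Agent: integrate role-specific assessments (simple mean here)."""
    return sum(scores.values()) / len(scores)

def shortlist(candidates: List[Candidate], k: int = 2) -> List[str]:
    plan = planner(candidates)
    ranked = sorted(
        candidates,
        key=lambda c: supervisor({role: SPECIALISTS[role](c) for role in plan}),
        reverse=True,
    )
    return [c.name for c in ranked[:k]]

pool = [Candidate("A", 0.9, 0.4), Candidate("B", 0.6, 0.8), Candidate("C", 0.2, 0.3)]
print(shortlist(pool))
```

The design point the hierarchy captures is separation of concerns: the planner chooses which assessments matter, each specialist sees only its own feature slice, and only the supervisor combines them.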


Challenger-Based Combinatorial Bandits for Subcarrier Selection in OFDM Systems

Amiri, Mohsen, Venktesh, V, Magnússon, Sindri

arXiv.org Artificial Intelligence

This paper investigates the identification of the top-m user-scheduling sets in multi-user MIMO downlink, which is cast as a combinatorial pure-exploration problem in stochastic linear bandits. Because the action space grows exponentially, exhaustive search is infeasible. We therefore adopt a linear utility model to enable efficient exploration and reliable selection of promising user subsets. We introduce a gap-index framework that maintains a shortlist of current estimates of champion arms (top-m sets) and a rotating shortlist of challenger arms that pose the greatest threat to the champions. This design focuses on measurements that yield the most informative gap-index-based comparisons, resulting in significant reductions in runtime and computation compared to state-of-the-art linear bandit methods, with high identification accuracy. The method also exposes a tunable trade-off between speed and accuracy. Simulations on a realistic OFDM downlink show that shortlist-driven pure exploration makes online, measurement-efficient subcarrier selection practical for AI-enabled communication systems.
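The champion/challenger gap-index loop can be sketched as follows. This simplification drops the paper's linear utility model and treats each candidate set as an independent arm with a hypothetical mean and noise scale; it keeps the core idea of spending measurements on the weakest champion and the most threatening challenger.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-subset utilities; the top-m sets to identify are the last three.
true_means = np.array([0.10, 0.15, 0.20, 0.25, 0.30, 0.35,
                       0.40, 0.45, 0.50, 0.85, 0.90, 0.95])
M, NOISE = 3, 0.2
n_arms = len(true_means)

counts = np.ones(n_arms)
sums = np.array([rng.normal(mu, NOISE) for mu in true_means])  # one pull each

def pull(i: int) -> None:
    counts[i] += 1
    sums[i] += rng.normal(true_means[i], NOISE)

for t in range(3000):
    means = sums / counts
    width = np.sqrt(2 * np.log(2 + t) / counts)    # confidence radius
    champions = np.argsort(means)[-M:]             # current top-m estimates
    others = np.setdiff1d(np.arange(n_arms), champions)
    # Weakest champion: smallest pessimistic value among the current top-m.
    weakest = champions[np.argmin(means[champions] - width[champions])]
    # Challenger: non-champion whose optimistic value most threatens the champions.
    challenger = others[np.argmax(means[others] + width[others])]
    pull(weakest)
    pull(challenger)

top_m = sorted(np.argsort(sums / counts)[-M:])
print(top_m)
```

Because only the weakest-champion/best-challenger pair is measured each round, sampling concentrates on the boundary between the top-m set and the rest, which is where the gap-index comparisons are most informative; arms far from the boundary are pulled rarely.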