AITopics | allocation

Collaborating Authors

allocation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Matching Rates and Optimal Allocation for Federated Probe-Logit Distillation under Heterogeneous Bandwidth Budgets

Dubey, Prasanjit, Huo, Xiaoming

arXiv.org Machine LearningMay-29-2026

In federated language modeling, $K$ nodes each hold $n$ samples but cannot pool data or exchange full-precision gradients or weights. We study the minimax rate at which a conditional distribution over $V$ tokens can be estimated when each node may upload at most $B$ bits per query in a public probe set. In federated probe-logit distillation (FPLD), each node transmits a scalar-quantized logit vector on the probe set, and an aggregator distills a global parametric student. Prior work (Dubey and Huo, 2026) establishes a high-probability KL rate $O(d/(Kn) + ρ\sqrt{V \log V / m} + K^{-1} \cdot 2^{-2B/V})$ plus optimization slack, with the bandwidth term in its trace-sharpened form. Whether this bandwidth-term rate is tight, and how the upper bound generalizes to heterogeneous per-node bandwidths, are left open. We close both gaps. First, the dithered FPLD construction has a matching single-round lower bound $Ω(K^{-1} \cdot 2^{-2B/V})$ under non-degeneracy, pinning the bandwidth-axis rate at $Θ(K^{-1} \cdot 2^{-2B/V})$. $T$-round sequential refinement with nested/scaled residual quantizers achieves $O(K^{-1} \cdot 2^{-2TB/V})$; vanilla FPLD's $T$-independent bandwidth term is suboptimal for every $T > 1$. Second, we establish a heterogeneous-bandwidth upper bound for per-node budgets $B_i$, paired with a closed-form optimal allocation $B_i^* = B_{\mathrm{tot}}/K + (V/2) \log_2(w_i / \bar{w}_g)$, a log-tilted water-filling rule that is the per-node analogue of reverse water-filling for distortion-rate optimization. A plug-in adaptive variant estimates the weights from a short warm-up phase and attains $1 + O(\sqrt{\log(K/δ)/(m T_0)})$ relative suboptimality. Synthetic n-gram simulations confirm that empirical KL is bracketed by the upper and lower bounds and that the optimal allocation strictly dominates uniform and inverse-weighted baselines under heterogeneous clipping.

artificial intelligence, assumption, machine learning, (19 more...)

arXiv.org Machine Learning

2605.29642

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.82)

Add feedback

CART Random Forests as Sequential Allocation over Random Opportunity Sets: A Stochastic-Control Theory of Ensemble Risk

Mei, Tianxing, Fan, Yingying, Leng, Mingming, Lv, Jinchi

arXiv.org Machine LearningMay-27-2026

CART random forests are among the most widely used modern predictive methods, with well-documented empirical success. Yet, at the mechanistic level, the algorithm is often treated as a black box because of its complexity. In this paper, we develop a stochastic-control perspective on feature-subsampled CART random forests, named CART random opportunity-set allocation (CART-ROSA). At each node, the random subset of features is interpreted as a random feasible action set, and the CART split rule as a masked-action allocation policy. This policy induces a controlled stochastic process over informative split-count states, whose terminal law determines both single-tree error and cross-tree interaction terms in the forest mean squared error (MSE). Such representation opens the black box of CART-forests by separating two design levers: the informative-opportunity rate induced by feature subsampling, and the contraction strength from the within-mask split policy. We establish that the CART policy is locally stabilizing: it contracts imbalances in informative split allocations and concentrates terminal tree geometry. At the system level, however, it can be globally suboptimal for the forest objective. Specializing to the linear model, we derive the MSE risk expansion explicitly. Our results show how an operations-research perspective makes tractable a theoretical gap difficult to access from the standard algorithmic description of CART forests.

artificial intelligence, informative coordinate, machine learning, (17 more...)

arXiv.org Machine Learning

2605.26675

Country:

North America > United States (0.28)
Europe (0.27)

Genre: Research Report > New Finding (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Instance-Optimal Estimation with Multiple LLM Judges on a Budget

Lee, Junghyun, Kim, Sanghwa, Jedra, Yassir, Proutière, Alexandre, Yun, Se-Young

arXiv.org Machine LearningMay-25-2026

Evaluating large language models increasingly relies on LLM-as-a-judge protocols, but such evaluations remain costly: different judges have different prices and reliabilities, and the difficulty of each prompt-response pair can vary substantially. This raises a basic allocation question: under a fixed budget, how should one distribute evaluation queries across heterogeneous judges and instances to obtain the most accurate score estimates? We formalize this question as *budgeted heteroskedastic multi-judge estimation*. Given $K$ prompt-response pairs, $J$ judges with known costs, and unknown query-judge variances, the goal is to estimate a bounded score vector while minimizing an $\ell_p$-error. Our first contribution is to analyze the inverse-variance weighted estimator (IVWE) and to derive the oracle allocation that minimizes its error rate. Since this allocation depends on the unknown variances, we then address the practical unknown-variance setting by proposing EST-IVWE, an adaptive algorithm that constructs and leverages *optimistically biased* variance estimates to stabilize the empirical allocation. We prove that EST-IVWE matches the oracle IVWE rate up to lower-order terms in the budget. Our second and central theoretical contribution is a matching *local* minimax lower bound, which establishes the instance-optimality of the proposed algorithms. A key technical insight is that Fano-type high-probability arguments are too coarse for this problem: their packing construction loses the local variance structure that governs the optimal allocation. We instead use an Assouad-type in-expectation argument, based on local perturbations, which preserves this structure and yields the sharp allocation-dependent lower bound. Finally, we numerically validate the superiority of our approach over naïve uniform allocation on synthetic and HelpSteer2 datasets.

large language model, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2605.23362

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Policy Learning with Observational Data: The Case of Hepatitis C Treatment for HIV/HCV Co-Infected Patients

Langevin, Raphaël

arXiv.org Machine LearningMay-19-2026

Decision-makers frequently must choose a single action from a finite set of alternatives -- for example, physicians selecting a treatment, investors choosing a portfolio risk level, or judges determining sentences. To improve outcomes, policymakers often issue policy rules or guidelines to inform such choices. In this paper, I show how to generally derive policy rules from observational data in a multi-action framework under relatively weak assumptions about the underlying structure of the heterogeneous sampled population. Conditional average treatment effects (CATEs) are consistently estimated via a weighted K-means algorithm, assuming the outcome model is correctly specified within each homogeneous subgroup. Feasible policy rules are then implemented via a standard decision tree, allowing for both perfect and imperfect adherence to treatment. The methodology is applied to treatment options for Hepatitis C (HCV) among patients co-infected with human immunodeficiency virus (HIV), a setting in which no uniform guideline exists for modern pharmaceutical therapies. The results identify a subgroup of patients with approximately an 80% probability of spontaneous HCV clearance without treatment. Estimation results also show that reallocating treatments among treated individuals could have reduced total treatment costs by CAN$3.6-4.9 million while still increasing aggregate health benefits relative to the status quo. These findings demonstrate that the proposed approach can generate improved, data-driven treatment guidelines for the management of HIV/HCV co-infected patients.

adherence, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2605.16593

Country:

North America > United States (1.00)
North America > Canada (1.00)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Add feedback

Adaptive Experimentation for Censored Survival Outcomes

Wang, Yuxin, Frauen, Dennis, Schweisthal, Jonas, Schröder, Maresa, Javurek, Emil, Feuerriegel, Stefan

arXiv.org Machine LearningMay-19-2026

Adaptive experimentation enables efficient estimation of causal effects, but existing methods are not designed for survival data with censoring, where event times are only partially observed (e.g., overall survival in cancer trials but with dropout). In this paper, we develop a novel framework for adaptive experimentation to estimate causal effects under right censoring. For this, we derive the semiparametric efficiency bound for the average survival effect curve as a function of the treatment allocation policy and thereby obtain a closed-form efficiency-optimal allocation policy. The policy generalizes classical Neyman allocation to survival settings by prioritizing patient strata where both event and censoring dynamics induce high uncertainty. Building on this, we propose the Adaptive Survival Estimator (ASE), an adaptive framework that learns the allocation policy and estimates the average survival effect curve sequentially. Our framework has three main benefits: (i) it accommodates arbitrary machine learning models for nuisance estimation; (ii) it is guided by a closed-form efficiency-optimal allocation policy; and (iii) it admits strong theoretical guarantees, including asymptotic normality via a martingale central limit theorem. We demonstrate our framework across various numerical experiments to show consistent efficiency gains over uniform randomization and censoring-agnostic baselines.

allocation, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2605.18459

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback

Robust Sequential Experimental Design for A/B Testing

Wen, Qianglin, Wu, Xiangkun, Shi, Chengchun, Li, Ting, Tang, Niansheng, Zhang, Yingying, Zhu, Hongtu

arXiv.org Machine LearningMay-14-2026

Experimental design has emerged as a powerful approach for improving the sample efficiency of A/B testing, yet existing designs rely critically on correctly specified models. We study robust sequential experimental design under model misspecification and develop a unified framework that covers both contextual bandit and dynamic settings. Theoretically, we prove that our design bounds the worst-case mean squared error of the estimated treatment effect. Empirically, we demonstrate the effectiveness of the proposed approach using synthetic and real-world datasets from a leading technology company.

artificial intelligence, machine learning, robust sequential experimental design, (16 more...)

arXiv.org Machine Learning

2605.12899

Country:

Asia > China (0.46)
North America > United States (0.45)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.68)
Transportation (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
(2 more...)

Add feedback

Adaptive Policy Learning Under Unknown Network Interference

Gleich, Aidan, Laber, Eric, Volfovsky, Alexander

arXiv.org Machine LearningMay-13-2026

Adaptive experimentation under unknown network interference requires solving two coupled problems: (i) learning the underlying dynamics of interference among units and (ii) using these dynamics to inform treatment allocation in order to maximize a cumulative outcome of interest (e.g. revenue). Existing adaptive experimentation methods either assume the interference network is fully known or bypass the network by operating on coarse cluster-level randomizations. We develop a Thompson sampling algorithm that jointly learns the interference network and adaptively optimizes individual-level treatment allocations via a Gibbs sampler. The algorithm returns both an optimized treatment policy and an estimate of the interference network; the latter supports downstream causal analyses such as estimation of direct, indirect, and total treatment effects. For additive spillover models, we show that total reward is linear in the treatment vector with coefficients given by an $n$-dimensional latent score. We prove a Bayesian regret bound of order $\sqrt{nT \cdot B \log(en/B)}$ for exact posterior sampling; empirically, our Gibbs-based approximate sampler achieves regret consistent with this rate and remains sublinear when the additive spillovers assumption is violated. For general Neighborhood Interference, where this reduction is unavailable, we analyze an explore-then-commit variant with $O(n^2 \log T)$ graph-discovery cost. An information-theoretic $Ω(n \log T)$ lower bound complements both results. Empirically, our method achieves more than an order-of-magnitude reduction in regret in head-to-head comparisons. On two real-world networks, the algorithm achieves sublinear regret and yields downstream effect estimates with small RMSE relative to the truth.

artificial intelligence, interference, machine learning, (19 more...)

arXiv.org Machine Learning

2605.11191

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Optimal Policy Learning under Budget and Coverage Constraints

Cerulli, Giovanni

arXiv.org Machine LearningMay-13-2026

We study optimal policy learning under combined budget and minimum coverage constraints. We show that the problem admits a knapsack-type structure and that the optimal policy can be characterized by an affine threshold rule involving both budget and coverage shadow prices. We establish that the linear programming relaxation of the combinatorial solution has an O(1) integrality gap, implying asymptotic equivalence with the optimal discrete allocation. Building on this result, we analyze two implementable approaches: a Greedy-Lagrangian (GLC) and a rank-and-cut (RC) algorithm. We show that the GLC closely approximates the optimal solution and achieves near-optimal performance in finite samples. By contrast, RC is approximately optimal whenever the coverage constraint is slack or costs are homogeneous, while misallocation arises only when cost heterogeneity interacts with a binding coverage constraint. Monte Carlo evidence supports these findings.

artificial intelligence, constraint, machine learning, (16 more...)

arXiv.org Machine Learning

2605.12235

Country: Europe > Italy (0.40)

Genre: Research Report (1.00)

Industry:

Government (0.46)
Banking & Finance > Economy (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making

Pillai, Abhirami

arXiv.org Machine LearningApr-30-2026

Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard approach follows a two-stage offline pipeline - first collect historical data to estimate heterogeneous treatment effects (HTE), then solve a constrained optimization to allocate the budget. This works well with abundant data, but fails in cold-start settings such as new campaigns, new markets, or new customer segments where little historical data exists. We propose Budget-Constrained Causal Bandits (BCCB), an online framework that learns which users respond to ads while simultaneously spending the budget, making treatment decisions one user at a time. BCCB unifies three components into a single sequential process: learning individual-level ad effectiveness, exploring users whose response is uncertain, and pacing the budget over time. We evaluated on the Criteo Uplift dataset, a large-scale advertising dataset from a real randomized controlled trial. Our key finding is a data-efficiency crossover: offline methods require approximately 10,000 historical observations to produce reliable results, while BCCB operates effectively from the very first user. Furthermore, BCCB exhibits 3-5x lower performance variance between runs, making it more practical for real campaign planning. Among purely online methods, BCCB consistently outperforms standard Thompson Sampling, budgeted Thompson Sampling, and greedy HTE estimation across all budget levels tested.

budget, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

2604.26169

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry: Marketing (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.70)

Add feedback

Filters

Collaborating Authors

allocation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Matching Rates and Optimal Allocation for Federated Probe-Logit Distillation under Heterogeneous Bandwidth Budgets

CART Random Forests as Sequential Allocation over Random Opportunity Sets: A Stochastic-Control Theory of Ensemble Risk

Instance-Optimal Estimation with Multiple LLM Judges on a Budget

Policy Learning with Observational Data: The Case of Hepatitis C Treatment for HIV/HCV Co-Infected Patients

Adaptive Experimentation for Censored Survival Outcomes

Robust Sequential Experimental Design for A/B Testing

Adaptive Policy Learning Under Unknown Network Interference

Optimal Policy Learning under Budget and Coverage Constraints

Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making

d9b564716709357b4bccec9fc9ad04d2-Paper-Conference.pdf