AITopics | npl

Collaborating Authors

npl

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Fisher Random Walk: Automatic Debiasing Contextual Preference Inference for Large Language Model Evaluation

Zhang, Yichi, Belloni, Alexander, Fang, Ethan X., Lu, Junwei, Xu, Xiaoan

arXiv.org Machine LearningSep-9-2025

Motivated by the need for rigorous and scalable evaluation of large language models, we study contextual preference inference for pairwise comparison functionals of context-dependent preference score functions across domains. Focusing on the contextual Bradley-Terry-Luce model, we develop a semiparametric efficient estimator that automates the debiased estimation through aggregating weighted residual balancing terms across the comparison graph. We show that the efficiency is achieved when the weights are derived from a novel strategy called Fisher random walk. We also propose a computationally feasible method to compute the weights by a potential representation of nuisance weight functions. We show our inference procedure is valid for general score function estimators accommodating the practitioners' need to implement flexible deep learning methods. We extend the procedure to multiple hypothesis testing using a Gaussian multiplier bootstrap that controls familywise error and to distributional shift via a cross-fitted importance-sampling adjustment for target-domain inference. Numerical studies, including language model evaluations under diverse contexts, corroborate the accuracy, efficiency, and practical utility of our method.

estimator, graph, inequality, (13 more...)

arXiv.org Machine Learning

2509.05852

Country:

Asia > Middle East > UAE (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.45)

Industry:

Health & Medicine (0.45)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Decision Making under Model Misspecification: DRO with Robust Bayesian Ambiguity Sets

Dellaporta, Charita, O'Hara, Patrick, Damoulas, Theodoros

arXiv.org Machine LearningMay-7-2025

Distributionally Robust Optimisation (DRO) protects risk-averse decision-makers by considering the worst-case risk within an ambiguity set of distributions based on the empirical distribution or a model. To further guard against finite, noisy data, model-based approaches admit Bayesian formulations that propagate uncertainty from the posterior to the decision-making problem. However, when the model is misspecified, the decision maker must stretch the ambiguity set to contain the data-generating process (DGP), leading to overly conservative decisions. We address this challenge by introducing DRO with Robust, to model misspecification, Bayesian Ambiguity Sets (DRO-RoBAS). These are Maximum Mean Discrepancy ambiguity sets centred at a robust posterior predictive distribution that incorporates beliefs about the DGP. We show that the resulting optimisation problem obtains a dual formulation in the Reproducing Kernel Hilbert Space and we give probabilistic guarantees on the tolerance level of the ambiguity set. Our method outperforms other Bayesian and empirical DRO approaches in out-of-sample performance on the Newsvendor and Portfolio problems with various cases of model misspecification.

artificial intelligence, machine learning, npl, (18 more...)

arXiv.org Machine Learning

2505.03585

Country:

Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Covariate Assisted Entity Ranking with Sparse Intrinsic Scores

Fan, Jianqing, Hou, Jikai, Yu, Mengxin

arXiv.org Machine LearningJul-9-2024

This paper addresses the item ranking problem with associate covariates, focusing on scenarios where the preference scores can not be fully explained by covariates, and the remaining intrinsic scores, are sparse. Specifically, we extend the pioneering Bradley-Terry-Luce (BTL) model by incorporating covariate information and considering sparse individual intrinsic scores. Our work introduces novel model identification conditions and examines the regularized penalized Maximum Likelihood Estimator (MLE) statistical rates. We then construct a debiased estimator for the penalized MLE and analyze its distributional properties. Additionally, we apply our method to the goodness-of-fit test for models with no latent intrinsic scores, namely, the covariates fully explaining the preference scores of individual items. We also offer confidence intervals for ranks. Our numerical studies lend further support to our theoretical findings, demonstrating validation for our proposed method

confidence interval, covariate, probability, (17 more...)

arXiv.org Machine Learning

2407.08814

Country: Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Uncertainty Quantification of MLE for Entity Ranking with Covariates

Fan, Jianqing, Hou, Jikai, Yu, Mengxin

arXiv.org Artificial IntelligenceDec-19-2022

This paper concerns with statistical estimation and inference for the ranking problems based on pairwise comparisons with additional covariate information such as the attributes of the compared items. Despite extensive studies, few prior literatures investigate this problem under the more realistic setting where covariate information exists. To tackle this issue, we propose a novel model, Covariate-Assisted Ranking Estimation (CARE) model, that extends the well-known Bradley-Terry-Luce (BTL) model, by incorporating the covariate information. Specifically, instead of assuming every compared item has a fixed latent score $\{\theta_i^*\}_{i=1}^n$, we assume the underlying scores are given by $\{\alpha_i^*+{x}_i^\top\beta^*\}_{i=1}^n$, where $\alpha_i^*$ and ${x}_i^\top\beta^*$ represent latent baseline and covariate score of the $i$-th item, respectively. We impose natural identifiability conditions and derive the $\ell_{\infty}$- and $\ell_2$-optimal rates for the maximum likelihood estimator of $\{\alpha_i^*\}_{i=1}^{n}$ and $\beta^*$ under a sparse comparison graph, using a novel `leave-one-out' technique (Chen et al., 2019) . To conduct statistical inferences, we further derive asymptotic distributions for the MLE of $\{\alpha_i^*\}_{i=1}^n$ and $\beta^*$ with minimal sample complexity. This allows us to answer the question whether some covariates have any explanation power for latent scores and to threshold some sparse parameters to improve the ranking performance. We improve the approximation method used in (Gao et al., 2021) for the BLT model and generalize it to the CARE model. Moreover, we validate our theoretical results through large-scale numerical studies and an application to the mutual fund stock holding dataset.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2212.09961

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.45)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Uncertainty quantification in the Bradley-Terry-Luce model

Gao, Chao, Shen, Yandi, Zhang, Anderson Y.

arXiv.org Machine LearningOct-7-2021

The Bradley-Terry-Luce (BTL) model is a benchmark model for pairwise comparisons between individuals. Despite recent progress on the first-order asymptotics of several popular procedures, the understanding of uncertainty quantification in the BTL model remains largely incomplete, especially when the underlying comparison graph is sparse. In this paper, we fill this gap by focusing on two estimators that have received much recent attention: the maximum likelihood estimator (MLE) and the spectral estimator. Using a unified proof strategy, we derive sharp and uniform non-asymptotic expansions for both estimators in the sparsest possible regime (up to some poly-logarithmic factors) of the underlying comparison graph. These expansions allow us to obtain: (i) finite-dimensional central limit theorems for both estimators; (ii) construction of confidence intervals for individual ranks; (iii) optimal constant of $\ell_2$ estimation, which is achieved by the MLE but not by the spectral estimator. Our proof is based on a self-consistent equation of the second-order remainder vector and a novel leave-two-out analysis.

lemma, logn, probability, (17 more...)

arXiv.org Machine Learning

2110.03874

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.34)

Add feedback

Lagrangian Inference for Ranking Problems

Liu, Yue, Fang, Ethan X., Lu, Junwei

arXiv.org Machine LearningSep-30-2021

We propose a novel combinatorial inference framework to conduct general uncertainty quantification in ranking problems. We consider the widely adopted Bradley-Terry-Luce (BTL) model, where each item is assigned a positive preference score that determines the Bernoulli distributions of pairwise comparisons' outcomes. Our proposed method aims to infer general ranking properties of the BTL model. The general ranking properties include the "local" properties such as if an item is preferred over another and the "global" properties such as if an item is among the top $K$-ranked items. We further generalize our inferential framework to multiple testing problems where we control the false discovery rate (FDR), and apply the method to infer the top-$K$ ranked items. We also derive the information-theoretic lower bound to justify the minimax optimality of the proposed method. We conduct extensive numerical studies using both synthetic and real datasets to back up our theory.

fang and lu, lagrangian inference, procedure, (13 more...)

arXiv.org Machine Learning

2110.00151

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > Michigan (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.67)
Leisure & Entertainment > Sports (0.45)

Technology:

Information Technology > Communications (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

Add feedback

Partial Recovery for Top-$k$ Ranking: Optimality of MLE and Sub-Optimality of Spectral Method

Chen, Pinhan, Gao, Chao, Zhang, Anderson Y.

arXiv.org Machine LearningJun-29-2020

Given partially observed pairwise comparison data generated by the Bradley-Terry-Luce (BTL) model, we study the problem of top-$k$ ranking. That is, to optimally identify the set of top-$k$ players. We derive the minimax rate with respect to a normalized Hamming loss. This provides the first result in the literature that characterizes the partial recovery error in terms of the proportion of mistakes for top-$k$ ranking. We also derive the optimal signal to noise ratio condition for the exact recovery of the top-$k$ set. The maximum likelihood estimator (MLE) is shown to achieve both optimal partial recovery and optimal exact recovery. On the other hand, we show another popular algorithm, the spectral method, is in general sub-optimal. Our results complement the recent work by Chen et al. (2019) that shows both the MLE and the spectral method achieve the optimal sample complexity for exact recovery. It turns out the leading constants of the sample complexity are different for the two algorithms. Another contribution that may be of independent interest is the analysis of the MLE without any penalty or regularization for the BTL model. This closes an important gap between theory and practice in the literature of ranking.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2006.16485

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Here, there and everywhere

#artificialintelligenceMar-11-2017, 03:59:15 GMT

After decades as laboratory curiosities, some of quantum physics' oddest effects are beginning to be put to use, says Jason Palmer PATRICK GILL, a director of the new Quantum Metrology Institute at Britain's National Physical Laboratory (NPL) in south-west London and an expert in atomic clocks, points to a large table full of lenses and mirrors, vacuum chambers and electronics. "And there's a smaller one over there," he says. NPL is part of a consortium of the planet's official timekeepers. In all its atomic-clock laboratories, each of the flagship devices--some of which are huge--is flanked by a smaller one under construction. Miniaturisation is the name of the game.

artificial intelligence, atomic clock, quantum technology, (12 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Greater London > London (0.25)

Industry: Information Technology (0.49)

Technology:

Information Technology > Hardware (0.57)
Information Technology > Artificial Intelligence (0.49)

Add feedback

Adversarial Top-$K$ Ranking

Suh, Changho, Tan, Vincent Y. F., Zhao, Renbo

arXiv.org Machine LearningFeb-15-2016

We study the top-$K$ ranking problem where the goal is to recover the set of top-$K$ ranked items out of a large collection of items based on partially revealed preferences. We consider an adversarial crowdsourced setting where there are two population sets, and pairwise comparison samples drawn from one of the populations follow the standard Bradley-Terry-Luce model (i.e., the chance of item $i$ beating item $j$ is proportional to the relative score of item $i$ to item $j$), while in the other population, the corresponding chance is inversely proportional to the relative score. When the relative size of the two populations is known, we characterize the minimax limit on the sample size required (up to a constant) for reliably identifying the top-$K$ items, and demonstrate how it scales with the relative size. Moreover, by leveraging a tensor decomposition method for disambiguating mixture distributions, we extend our result to the more realistic scenario in which the relative population size is unknown, thus establishing an upper bound on the fundamental limit of the sample size for recovering the top-$K$ set.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1602.04567

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Communications > Social Media (0.88)

Add feedback