AITopics | pairwise

pairwise

Scorio.jl: A Julia package for ranking stochastic responses

Hariri, Mohsen, Hinczewski, Michael, Chaudhary, Vipin

arXiv.org Machine Learning | Mar-17-2026

Scorio.jl is a Julia package for evaluating and ranking systems from repeated responses to shared tasks. It provides a common tensor-based interface for direct score-based, pairwise, psychometric, voting, graph, and listwise methods, so the same benchmark can be analyzed under multiple ranking assumptions. We describe the package design, position it relative to existing Julia tools, and report pilot experiments on synthetic rank recovery, stability under limited trials, and runtime scaling.
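As a rough illustration of how one benchmark can be read under different ranking assumptions, the sketch below ranks systems from a nested `scores[system][task][trial]` list in two ways: by mean score and by pairwise wins per task. It is plain Python and does not use or mirror Scorio.jl's API; the function names, the data layout, and the toy scores are all assumptions made for this example.

```python
from itertools import combinations


def task_mean(trials):
    """Average score of one system over repeated trials of a single task."""
    return sum(trials) / len(trials)


def rank_by_mean_score(scores):
    """Direct score-based ranking: order systems by mean score over all tasks."""
    means = [sum(task_mean(t) for t in system) / len(system) for system in scores]
    order = sorted(range(len(scores)), key=lambda i: -means[i])
    return order, means


def rank_by_pairwise_wins(scores):
    """Pairwise ranking: a system wins a task against another system if its
    per-task mean is higher; ties give half a win to each side."""
    wins = [0.0] * len(scores)
    for a, b in combinations(range(len(scores)), 2):
        for trials_a, trials_b in zip(scores[a], scores[b]):
            mean_a, mean_b = task_mean(trials_a), task_mean(trials_b)
            if mean_a > mean_b:
                wins[a] += 1.0
            elif mean_b > mean_a:
                wins[b] += 1.0
            else:
                wins[a] += 0.5
                wins[b] += 0.5
    order = sorted(range(len(scores)), key=lambda i: -wins[i])
    return order, wins


# Hypothetical data: 3 systems x 3 tasks x 3 trials, binary correctness.
scores = [
    [[1, 1, 1], [0, 0, 0], [0, 0, 0]],  # system 0: perfect on one task only
    [[0, 0, 0], [1, 0, 0], [1, 0, 0]],  # system 1: occasional success on two tasks
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # system 2: never succeeds
]

mean_order, means = rank_by_mean_score(scores)
win_order, wins = rank_by_pairwise_wins(scores)
print("mean-score order:", mean_order)
print("pairwise-win order:", win_order)
```

On this toy data the two methods disagree about the top system, which is the kind of sensitivity to ranking assumptions that a shared interface makes easy to inspect.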

artificial intelligence, machine learning, scorio, (18 more...)

arXiv.org Machine Learning

2603.14103

Country:

North America > United States > Ohio > Cuyahoga County > Cleveland (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

c4546b4f9e1a44ed15c253dd43307dd5-Paper-Conference.pdf

Neural Information Processing Systems | Feb-18-2026, 00:39:55 GMT

algorithm, evaluation, experiment, (16 more...)

Neural Information Processing Systems

Country:

Oceania > Australia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Sharper Generalization Bounds for Pairwise Learning

Neural Information Processing Systems | Feb-11-2026, 02:27:28 GMT

We also introduce a new on-average stability measure to develop optimistic bounds in a low-noise setting.

artificial intelligence, machine learning, stability, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Germany (0.04)
Asia > China (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

e6384711491713d29bc63fc5eeb5ba4f-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 21:33:40 GMT

algorithm, graph, matching, (13 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Generalization Guarantee of SGD for Pairwise Learning

Neural Information Processing Systems | Feb-10-2026, 18:14:44 GMT

Representative problems include AUC maximization [14, 25, 42, 63, 66], metric learning [8, 31], ranking [1, 13], and learning with minimum error entropy loss functions [29]. For example, in supervised metric learning we wish to find a distance function between pairs of examples so that examples within the same class are relatively close while examples from different classes are far apart from each other.

artificial intelligence, machine learning, pairwise, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

a87d27f712df362cd22c7a8ef823e987-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 12:46:30 GMT

algorithm, algorithm 1, generalization, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > Albany County > Albany (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
(2 more...)

Industry: Education > Educational Setting > Online (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)

Generalization Guarantee of SGD for Pairwise Learning

Neural Information Processing Systems | Dec-24-2025, 18:32:46 GMT

Recently, there has been growing interest in pairwise learning, since it includes many important machine learning tasks as special cases, e.g., metric learning, AUC maximization, and ranking. While stochastic gradient descent (SGD) is an efficient method, its generalization behavior for pairwise learning has received little study. In this paper, we present a systematic study of the generalization analysis of SGD for pairwise learning to understand the balance between generalization and optimization. We develop a novel high-probability generalization bound for uniformly stable algorithms that incorporates variance information for better generalization, based on which we establish the first nonsmooth learning algorithm to achieve almost optimal high-probability and dimension-independent generalization bounds in linear time. We consider both convex and nonconvex pairwise learning problems. Our stability analysis for convex problems shows how interpolation can help generalization. We establish a uniform convergence of gradients and apply it to derive the first generalization bounds on population gradients for nonconvex problems. Finally, we develop better generalization bounds for gradient-dominated problems.
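To make the setting concrete, here is a minimal sketch of SGD for one pairwise learning task: AUC maximization with a linear scorer and a pairwise hinge loss. The sampling scheme, step size, loss, function names, and toy data are illustrative assumptions, not the algorithm or analysis from the paper.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pairwise_sgd(xs, ys, steps=200, lr=0.1, seed=0):
    """SGD on the pairwise hinge loss max(0, 1 - w.(x_pos - x_neg)).

    Each step draws one pair of distinct examples; pairs with equal labels
    contribute zero loss and are skipped.
    """
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        i, j = rng.sample(range(len(xs)), 2)
        if ys[i] == ys[j]:
            continue
        pos, neg = (i, j) if ys[i] > ys[j] else (j, i)
        diff = [a - b for a, b in zip(xs[pos], xs[neg])]
        if dot(w, diff) < 1.0:
            # Subgradient of the hinge loss is -diff, so step along +diff.
            w = [wk + lr * dk for wk, dk in zip(w, diff)]
    return w


def auc(w, xs, ys):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [dot(w, x) for x, y in zip(xs, ys) if y == 1]
    neg = [dot(w, x) for x, y in zip(xs, ys) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: positives have a larger first feature.
xs = [(2.0, 0.1), (1.5, -0.2), (2.5, 0.3), (-1.0, 0.2), (-0.5, -0.1), (0.0, 0.0)]
ys = [1, 1, 1, 0, 0, 0]

w = pairwise_sgd(xs, ys)
print("weights:", w, "AUC:", auc(w, xs, ys))
```

The loss depends on pairs of examples rather than single examples, which is what distinguishes this update from ordinary pointwise SGD.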

artificial intelligence, generalization, machine learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning

Xu, Ran, Chen, Jingjing, Ye, Jiayu, Wu, Yu, Yan, Jun, Yang, Carl, Yu, Hongkun

arXiv.org Artificial Intelligence | Oct-28-2025

Large Language Models (LLMs) are widely used as judges to evaluate response quality, providing a scalable alternative to human evaluation. However, most LLM judges operate solely on intrinsic text-based reasoning, limiting their ability to verify complex constraints or perform accurate computation. Motivated by the success of tool-integrated reasoning (TIR) in numerous tasks, we propose TIR-Judge, an end-to-end RL framework for training LLM judges that integrates a code executor for precise evaluation. TIR-Judge is built on three principles: (i) diverse training across verifiable and non-verifiable domains, (ii) flexible judgment formats (pointwise, pairwise, listwise), and (iii) iterative RL that bootstraps directly from the initial model without distillation. On seven public benchmarks, TIR-Judge surpasses strong reasoning-based judges by up to 6.4% (pointwise) and 7.7% (pairwise), and achieves listwise performance comparable to Claude-Opus-4 despite having only 8B parameters. Remarkably, TIR-Judge-Zero, trained entirely without distilled judge trajectories, matches the performance of distilled variants, demonstrating that tool-augmented judges can self-evolve through iterative reinforcement learning.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.23038

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Active Set Ordering

Neural Information Processing Systems | Oct-10-2025, 15:56:16 GMT

In this paper, we formalize the active set ordering problem, which involves actively discovering a set of inputs based on their orderings determined by expensive evaluations of a blackbox function.

algorithm, evaluation, experiment, (16 more...)

Neural Information Processing Systems

Country:

Oceania > Australia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

Peng, Zhiyuan, Wei, Ting-ruen, Song, Tingyu, Zhao, Yilun

arXiv.org Artificial Intelligence | Oct-10-2025

Large Language Models (LLMs) have recently been applied to reranking tasks in information retrieval, achieving strong performance. However, their high computational demands often hinder practical deployment. Existing studies evaluate the efficiency of LLM-based rerankers using proxy metrics such as latency, the number of forward passes, input tokens, and output tokens. However, these metrics depend on hardware and run-time choices (e.g., parallel or not, batch size, etc.) and often fail to account for model size, making them difficult to interpret and obscuring the evaluation of the efficiency-effectiveness tradeoff. To address this issue, we propose Efficiency-Effectiveness Reranking FLOPs (https://github.com/zhiyuanpeng/EER-FLOPs) for LLM-based rerankers, with two metrics: RPP (ranking metrics per PetaFLOP), measuring how much ranking quality (e.g., NDCG or MRR) a method achieves per PetaFLOP, and QPP (queries per PetaFLOP), measuring how many queries can be processed per PetaFLOP. Alongside the new metrics, we develop an interpretable FLOPs estimator that estimates the FLOPs of an LLM-based reranker even without running any experiments. Based on the proposed metrics, we conduct comprehensive experiments to evaluate a wide range of LLM-based rerankers with different architectures, studying the efficiency-effectiveness trade-off and bringing this issue to the attention of the research community.
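The two metrics described above can be illustrated with a small calculation. The sketch below assumes per-query FLOP counts are already available and aggregates them in the simplest way (mean metric divided by mean PetaFLOPs per query for RPP, query count divided by total PetaFLOPs for QPP). The exact aggregation used by the authors is not given in the abstract, so the function names, the aggregation, and the numbers are assumptions.

```python
PETA = 1e15  # FLOPs in one PetaFLOP


def qpp(flops_per_query):
    """Queries per PetaFLOP: number of queries divided by total PetaFLOPs spent."""
    total_petaflops = sum(flops_per_query) / PETA
    return len(flops_per_query) / total_petaflops


def rpp(metric_per_query, flops_per_query):
    """Ranking metric per PetaFLOP: mean metric (e.g., NDCG) divided by
    the mean PetaFLOPs spent per query."""
    mean_metric = sum(metric_per_query) / len(metric_per_query)
    mean_petaflops = sum(flops_per_query) / len(flops_per_query) / PETA
    return mean_metric / mean_petaflops


# Hypothetical reranker: three queries with made-up NDCG values and FLOP counts.
ndcg = [0.6, 0.8, 0.7]
flops = [2e14, 5e14, 3e14]

print("QPP:", qpp(flops))
print("RPP:", rpp(ndcg, flops))
```

Because both quantities are normalized by FLOPs rather than wall-clock time, they do not change with batch size or hardware, which is the motivation given in the abstract.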

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.06223

Country:

Europe (0.67)
North America > United States (0.28)
North America > Mexico (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
