AITopics | preference function

preference function

Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models

Farnam Mansouri, Yuxin Chen, Ara Vartanian, Jerry Zhu, Adish Singla

Neural Information Processing Systems | Feb-12-2026, 03:10:47 GMT

Algorithmic machine teaching studies the interaction between a teacher and a learner where the teacher selects labeled examples aiming at teaching a target hypothesis.

artificial intelligence, learner, machine learning, (18 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

69eba34671b3ef1ef38ee85caae6b2a1-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 18:32:54 GMT

acquisition function, augmented observation, hyperparameter, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Preference-based Conditional Treatment Effects and Policy Learning

Dovid Parnas, Mathieu Even, Julie Josse, Uri Shalit

arXiv.org Machine Learning | Feb-4-2026

We introduce a new preference-based framework for conditional treatment effect estimation and policy learning, built on the Conditional Preference-based Treatment Effect (CPTE). CPTE requires only that outcomes be ranked under a preference rule, unlocking flexible modeling of heterogeneous effects with multivariate, ordinal, or preference-driven outcomes. This unifies applications such as conditional probability of necessity and sufficiency, conditional Win Ratio, and Generalized Pairwise Comparisons. Despite the intrinsic non-identifiability of comparison-based estimands, CPTE provides interpretable targets and delivers new identifiability conditions for previously unidentifiable estimands. We present estimation strategies via matching, quantile, and distributional regression, and further design efficient influence-function estimators to correct plug-in bias and maximize policy value. Synthetic and semi-synthetic experiments demonstrate clear performance gains and practical impact.
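
As a rough illustration of the comparison-based estimands named in the abstract above, the sketch below computes the classical all-pairs Win Ratio and the net benefit used in Generalized Pairwise Comparisons. It is not the paper's CPTE estimator: it ignores covariates entirely, and the function name `pairwise_summary`, the lexicographic preference rule, and the toy outcomes are assumptions made up for this example.

```python
from itertools import product


def pairwise_summary(treated, control, prefer):
    """Compare every treated outcome with every control outcome.

    `prefer(a, b)` is a hypothetical preference rule returning a positive
    number if a is preferred to b, a negative number if b is preferred,
    and 0 for a tie.
    """
    wins = losses = ties = 0
    for t, c in product(treated, control):
        result = prefer(t, c)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    total = wins + losses + ties
    if total == 0:
        raise ValueError("need at least one treated and one control outcome")
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        # Win Ratio: pairs won by treatment divided by pairs won by control.
        "win_ratio": wins / losses if losses else float("inf"),
        # Net benefit: difference in win proportions over all pairs.
        "net_benefit": (wins - losses) / total,
    }


def lexicographic(a, b):
    """Assumed rule: compare (survival, quality score) tuples in order."""
    return (a > b) - (a < b)


# Toy multivariate outcomes: (survived, quality-of-life score).
treated = [(1, 3), (1, 2), (0, 1)]
control = [(1, 2), (0, 2), (0, 1)]
summary = pairwise_summary(treated, control, lexicographic)
print(summary)
```

The only thing the computation needs from the outcomes is a ranking under the preference rule, which is why the same routine applies to multivariate or ordinal outcomes without a numeric scale.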

artificial intelligence, machine learning, potential outcome, (16 more...)

2602.03823

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Tennessee (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)
(2 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models

Neural Information Processing Systems | Dec-25-2025, 08:44:22 GMT

Algorithmic machine teaching studies the interaction between a teacher and a learner where the teacher selects labeled examples aiming at teaching a target hypothesis. In a quest to lower teaching complexity and to achieve more natural teacher-learner interactions, several teaching models and complexity measures have been proposed for both batch settings (e.g., worst-case, recursive, preference-based, and non-clashing models) and sequential settings (e.g., the local preference-based model). To better understand the connections between these different batch and sequential models, we develop a novel framework which captures the teaching process via preference functions $\Sigma$. In our framework, each function $\sigma \in \Sigma$ induces a teacher-learner pair with teaching complexity $\mathrm{TD}(\sigma)$. We show that the above-mentioned teaching models are equivalent to specific types/families of preference functions in our framework. This equivalence, in turn, allows us to study the differences between two important teaching models, namely $\sigma$ functions inducing the strongest batch (i.e., non-clashing) model and $\sigma$ functions inducing a weak sequential (i.e., local preference-based) model. Finally, we identify preference functions inducing a novel family of sequential models with teaching complexity linear in the VC dimension of the hypothesis class: this is in contrast to the best known complexity result for the batch models, which is quadratic in the VC dimension.
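
To make the notion of teaching complexity in the abstract above concrete, here is a minimal brute-force sketch of the classical worst-case batch model only: the teacher must give each target hypothesis a smallest set of labeled examples that rules out every other hypothesis. It does not implement the paper's preference-function framework. The representation of hypotheses as label tuples, the function names, and the toy hypothesis class are assumptions chosen for illustration.

```python
from itertools import combinations


def teaching_set_size(target, hypotheses, n_instances):
    """Size of a smallest set of instances whose labels under `target`
    are consistent with no other hypothesis in `hypotheses`.

    Each hypothesis is a tuple of labels, one per instance (assumed
    representation). Returns None if `target` cannot be singled out.
    """
    distinct = set(hypotheses)
    for k in range(n_instances + 1):
        for subset in combinations(range(n_instances), k):
            consistent = [
                h for h in distinct
                if all(h[x] == target[x] for x in subset)
            ]
            # The target is always consistent with its own labels, so a
            # single survivor means the examples identify the target.
            if len(consistent) == 1:
                return k
    return None


def worst_case_teaching_dimension(hypotheses, n_instances):
    """Maximum teaching-set size over all targets in the class."""
    return max(teaching_set_size(h, hypotheses, n_instances) for h in hypotheses)


# Toy class over 3 instances: the empty concept plus the three singletons.
H = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(teaching_set_size((1, 0, 0), H, 3))   # one positive example suffices
print(worst_case_teaching_dimension(H, 3))  # the empty concept needs all 3
```

In this toy class the empty concept is expensive to teach in the worst-case model because each negative example eliminates only one competitor. Gaps of this kind are what the alternative teaching models listed in the abstract aim to reduce.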

name change, preference-based batch and sequential teaching, unified view, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework

Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, Pablo A. Parrilo

arXiv.org Artificial Intelligence | Oct-7-2025

Conventional preference learning methods often prioritize opinions held more widely when aggregating preferences from multiple evaluators. This may result in policies that are biased in favor of some types of opinions or groups and susceptible to strategic manipulation. To address this issue, we develop a novel preference learning framework capable of aligning aggregate opinions and policies proportionally with the true population distribution of evaluator preferences. Grounded in social choice theory, our approach infers the feasible set of evaluator population distributions directly from pairwise comparison data. Using these estimates, the algorithm constructs a policy that satisfies foundational axioms from social choice theory, namely monotonicity and Pareto efficiency, as well as our newly introduced axioms of population-proportional alignment and population-bounded manipulability. Moreover, we propose a soft-max relaxation method that smoothly trades off population-proportional alignment against the selection of the Condorcet winner (which beats all other options in pairwise comparisons). Finally, we validate the effectiveness and scalability of our approach through experiments on both tabular recommendation tasks and large language model alignment.
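
The Condorcet winner mentioned in the abstract above can be checked directly from pairwise comparison counts. The sketch below is a plain illustration of that definition, not the paper's algorithm or its soft-max relaxation. The function name `condorcet_winner`, the matrix layout (`wins[i][j]` is the number of evaluators preferring option `i` to option `j`), and the toy counts are assumptions for this example.

```python
def condorcet_winner(wins):
    """Return the index of the option that beats every other option in
    head-to-head comparisons, or None if no such option exists.

    `wins[i][j]` is assumed to hold the number of evaluators who prefer
    option i to option j; diagonal entries are ignored.
    """
    n = len(wins)
    for i in range(n):
        if all(wins[i][j] > wins[j][i] for j in range(n) if j != i):
            return i
    return None


# Five evaluators, three options: option 0 wins both of its matchups.
clear_case = [
    [0, 3, 3],
    [2, 0, 4],
    [2, 1, 0],
]

# A preference cycle (0 beats 1, 1 beats 2, 2 beats 0): no Condorcet winner.
cycle_case = [
    [0, 3, 1],
    [1, 0, 3],
    [3, 1, 0],
]

print(condorcet_winner(clear_case))  # 0
print(condorcet_winner(cycle_case))  # None
```

The cyclic case shows why a Condorcet winner is not guaranteed to exist, so any method that targets it needs a fallback when pairwise majorities are inconsistent.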

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2506.05619

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(3 more...)

69eba34671b3ef1ef38ee85caae6b2a1-Supplemental.pdf

Neural Information Processing Systems | Oct-3-2025, 03:52:27 GMT

artificial intelligence, hyperparameter, machine learning, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models

Farnam Mansouri, Yuxin Chen, Ara Vartanian, Jerry Zhu, Adish Singla

Neural Information Processing Systems | Oct-2-2025, 16:51:27 GMT

Algorithmic machine teaching studies the interaction between a teacher and a learner where the teacher selects labeled examples aiming at teaching a target hypothesis.

artificial intelligence, machine learning, preference function, (15 more...)

Country: North America > United States (0.28)

Industry: Education (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)

Multi-Turn Puzzles: Evaluating Interactive Reasoning and Strategic Dialogue in LLMs

Kartikeya Badola, Jonathan Simon, Arian Hosseini, Sara Marie Mc Carthy, Tsendsuren Munkhdalai, Abhimanyu Goyal, Tomáš Kočiský, Shyam Upadhyay, Bahare Fatemi, Mehran Kazemi

arXiv.org Artificial Intelligence | Aug-26-2025

Large language models (LLMs) excel at solving problems with clear and complete statements, but often struggle with nuanced environments or interactive tasks which are common in most real-world scenarios. This highlights the critical need for developing LLMs that can effectively engage in logically consistent multi-turn dialogue, seek information and reason with incomplete data. To this end, we introduce a novel benchmark comprising a suite of multi-turn tasks each designed to test specific reasoning, interactive dialogue, and information-seeking abilities. These tasks have deterministic scoring mechanisms, thus eliminating the need for human intervention. Evaluating frontier models on our benchmark reveals significant headroom. Our analysis shows that most errors emerge from poor instruction following, reasoning failures, and poor planning. This benchmark provides valuable insights into the strengths and weaknesses of current LLMs in handling complex, interactive scenarios and offers a robust platform for future research aimed at improving these critical capabilities.

large language model, machine learning, natural language, (19 more...)

2508.10142

Country:

North America (0.28)
Asia (0.28)

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models

Neural Information Processing Systems | May-27-2025, 10:42:06 GMT

preference-based batch and sequential teaching, teaching complexity, teaching model, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)

Mitigating Preference Hacking in Policy Optimization with Pessimism

Dhawal Gupta, Adam Fisch, Christoph Dann, Alekh Agarwal

arXiv.org Artificial Intelligence | Mar-9-2025

This work tackles the problem of overoptimization in reinforcement learning from human feedback (RLHF), a prevalent technique for aligning models with human preferences. RLHF relies on reward or preference models trained on fixed preference datasets, and these models are unreliable when evaluated outside the support of this preference data, leading to the common reward or preference hacking phenomenon. We propose novel, pessimistic objectives for RLHF which are provably robust to overoptimization through the use of pessimism in the face of uncertainty, and design practical algorithms, P3O and PRPO, to optimize these objectives. Our approach is derived for the general preference optimization setting, but can be used with reward models as well. We evaluate P3O and PRPO on the tasks of fine-tuning language models for document summarization and creating helpful assistants, demonstrating remarkable resilience to overoptimization.

mitigating preference hacking, objective, policy optimization, (13 more...)

2503.0681

Country:

Caspian Sea (0.04)
North America > United States > California > Riverside County (0.04)
North America > United States > California > Imperial County (0.04)
(9 more...)

Genre: Research Report (0.64)

Industry:

Education > Health & Safety > School Nutrition (1.00)
Health & Medicine > Consumer Health (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
