 preference statement


Towards Fast Algorithms for the Preference Consistency Problem Based on Hierarchical Models

George, Anne-Marie, Wilson, Nic, O'Sullivan, Barry

arXiv.org Artificial Intelligence

In this paper, we construct and compare algorithmic approaches to solve the Preference Consistency Problem for preference statements based on hierarchical models. Instances of this problem contain a set of preference statements that are direct comparisons (strict and non-strict) between some alternatives, and a set of evaluation functions by which all alternatives can be rated. An instance is consistent based on hierarchical preference models, if there exists a hierarchical model on the evaluation functions that induces an order relation on the alternatives by which all given preference statements are satisfied. Such order relations can be, e.g., comparing alternatives by the values of the evaluation functions lexicographically [15], by Pareto order, weighted sums [6], based on hierarchical models [16], or by conditional preference structures such as CP-nets [2] and partial lexicographic preference trees [11]. Here, the choice of the order relation can lead to stronger or weaker inferences and can make solving PDP computationally more or less challenging. In a recommender system or in a multi-objective decision-making scenario, the user should only be presented with a relatively small number of solutions; hence, a strong order relation is required.


Efficient Inference and Computation of Optimal Alternatives for Preference Languages Based On Lexicographic Models

Wilson, Nic, George, Anne-Marie

arXiv.org Artificial Intelligence

We analyse preference inference, through consistency, for general preference languages based on lexicographic models. We identify a property, which we call strong compositionality, that applies for many natural kinds of preference statement, and that allows a greedy algorithm for determining consistency of a set of preference statements. We also consider different natural definitions of optimality, and their relations to each other, for general preference languages based on lexicographic models. Based on our framework, we show that testing consistency, and thus inference, is polynomial for a specific preference language LpqT, which allows strict and non-strict statements, comparisons between outcomes and between partial tuples, both ceteris paribus and strong statements, and their combination. Computing different kinds of optimal sets is also shown to be polynomial; this is backed up by our experimental results.
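A lexicographic model of the kind analysed here orders outcomes by the first variable, in the model's importance sequence, on which they differ. A minimal sketch of that comparison (the variable names and value orderings below are illustrative, not taken from the paper):

```python
# Comparison under a lexicographic model: outcomes are dicts, the model is
# an importance-ordered list of (variable, value_order) pairs, where earlier
# values in value_order are more preferred.

def lex_compare(model, a, b):
    """Return -1 if a is preferred, 1 if b is preferred, 0 if equivalent."""
    for var, value_order in model:
        ra, rb = value_order.index(a[var]), value_order.index(b[var])
        if ra != rb:
            return -1 if ra < rb else 1
    return 0  # equal on every variable the model uses

model = [("price", ["low", "medium", "high"]),
         ("brand", ["A", "B"])]
x = {"price": "low", "brand": "B"}
y = {"price": "low", "brand": "A"}
print(lex_compare(model, x, y))  # → 1: prices tie, so brand decides for y
```

A consistency check along the lines of the paper's greedy algorithm would then search for a model under which every elicited statement comes out as above; the sketch only fixes the comparison semantics such a search relies on.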


HelpSteer2-Preference: Complementing Ratings with Preferences

Wang, Zhilin, Bukharin, Alexander, Delalleau, Olivier, Egert, Daniel, Shen, Gerald, Zeng, Jiaqi, Kuchaiev, Oleksii, Dong, Yi

arXiv.org Artificial Intelligence

Reward models are critical for aligning models to follow instructions, and are typically trained following one of two popular paradigms: Bradley-Terry style or Regression style. However, there is a lack of evidence that either approach is better than the other, when adequately matched for data. This is primarily because these approaches require data collected in different (but incompatible) formats, meaning that adequately matched data is not available in existing public datasets. To tackle this problem, we release preference annotations (designed for Bradley-Terry training) to complement existing ratings (designed for Regression style training) in the HelpSteer2 dataset. To improve data interpretability, preference annotations are accompanied with human-written justifications. Using this data, we conduct the first head-to-head comparison of Bradley-Terry and Regression models when adequately matched for data. Based on insights derived from such a comparison, we propose a novel approach to combine Bradley-Terry and Regression reward modeling. A Llama-3.1-70B-Instruct model tuned with this approach scores 94.1 on RewardBench, emerging top of more than 140 reward models as of 1 Oct 2024. We also demonstrate the effectiveness of this reward model at aligning models to follow instructions in RLHF. We open-source this dataset (CC-BY-4.0 license) at https://huggingface.co/datasets/nvidia/HelpSteer2 and openly release the trained Reward Model at https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Reward
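Bradley-Terry style reward modeling trains a scalar reward so that the preferred response scores above the rejected one; the standard training loss is the negative log-sigmoid of the reward margin. A toy sketch of just that loss, on scalar rewards rather than a real model:

```python
import math

def bradley_terry_loss(r_chosen, r_rejected):
    """Negative log-likelihood that the chosen response wins under the
    Bradley-Terry model: -log sigmoid(r_chosen - r_rejected)."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# The loss shrinks as the reward margin grows, and blows up when the
# model ranks the rejected response above the chosen one:
print(round(bradley_terry_loss(2.0, 0.0), 4))  # → 0.1269
print(round(bradley_terry_loss(0.0, 2.0), 4))  # → 2.1269
```

Regression-style training, by contrast, fits each response's rating directly (e.g. with a squared-error loss), which is why the two paradigms need data collected in different formats.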


A CP-Net based Qualitative Composition Approach for an IaaS Provider

Fattah, Sheik Mohammad Mostakim, Bouguettaya, Athman, Mistry, Sajib

arXiv.org Artificial Intelligence

We propose a novel CP-Net based composition approach to qualitatively select an optimal set of consumers for an IaaS provider. The IaaS provider's and consumers' qualitative preferences are captured using CP-Nets. We propose a CP-Net composability model using the semantic congruence property of a qualitative composition. Greedy-based and heuristic-based consumer selection approaches are proposed that effectively reduce the search space of candidate consumers in the composition. Experimental results demonstrate the feasibility of the proposed composition approach.


A Knowledge Compilation Map for Conditional Preference Statements-based Languages

Fargier, Hélène, Mengin, Jérôme

arXiv.org Artificial Intelligence

Conditional preference statements have been used to compactly represent preferences over combinatorial domains. They are at the core of CP-nets and their generalizations, and of lexicographic preference trees. Several works have addressed the complexity of some queries (optimization and dominance in particular). In this paper, we extend some of these results and study queries which have not been addressed so far, such as equivalence, thereby contributing to a knowledge compilation map for languages based on conditional preference statements. We also introduce a new parameterised family of languages, which makes it possible to balance expressiveness against the complexity of some queries.


Learning User Preferences for Trajectories from Brain Signals

Kolkhorst, Henrich, Burgard, Wolfram, Tangermann, Michael

arXiv.org Machine Learning

Robot motions in the presence of humans should not only be feasible and safe, but also conform to human preferences. This, however, requires user feedback on the robot's behavior. In this work, we propose a novel approach to leverage the user's brain signals as a feedback modality in order to decode the judgment of robot trajectories and rank them according to the user's preferences. We show that brain signals measured using electroencephalography during observation of a robotic arm's trajectory as well as in response to preference statements are informative regarding the user's preference. Furthermore, we demonstrate that user feedback from brain signals can be used to reliably infer pairwise trajectory preferences as well as to retrieve the preferred observed trajectories of the user with a performance comparable to explicit behavioral feedback.


Probabilistic Models over Weighted Orderings: Fixed-Parameter Tractable Variable Elimination

Lukasiewicz, Thomas (University of Oxford) | Martinez, Maria Vanina (Universidad Nacional del Sur) | Poole, David (University of British Columbia) | Simari, Gerardo Ignacio (Universidad Nacional del Sur)

AAAI Conferences

Probabilistic models with weighted formulas, known as Markov models or log-linear models, are used in many domains. Recent models of weighted orderings between elements, proposed as flexible tools to express preferences under uncertainty, are also potentially useful in applications like planning, temporal reasoning, and user modeling. Their computational properties are very different from those of conventional Markov models: because of the transitivity of the “less than” relation, standard methods that exploit the structure of the models, such as variable elimination, are not directly applicable, as there are no conditional independencies between the orderings within connected components. The best known algorithms for general inference in these models are exponential in the number of statements. Here, we present the first algorithms that exploit the available structure. We begin with the special case of models in the form of chains, presenting an exact O(n^3) algorithm, where n is the total number of elements. Next, we generalize this technique to models in which the statements comprise arbitrary sets of atomic weighted preference formulas (while the query and evidence are conjunctions of atomic preference formulas); the resulting exact algorithm runs in time O(m * n^2 * n^c), where m is the number of preference formulas, n is the number of elements, and c is the maximum number of elements in a linear cut (which depends both on the structure of the model and on the order in which the elements are processed). This algorithm is therefore tractable for cases in which c can be bounded to a low value. Finally, we report on the results of an empirical evaluation of both algorithms, showing how they scale with reasonably sized models.
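In such a model, each weighted formula "x < y" contributes its weight whenever x precedes y in an ordering, and an ordering's probability is proportional to the exponential of the summed satisfied weights. The brute-force sketch below enumerates all n! orderings, which is exactly the blow-up the paper's algorithms avoid; it serves only as an executable specification for small n, with illustrative element names:

```python
from itertools import permutations
from math import exp

def query_prob(elements, formulas, x, y):
    """P(x precedes y) under the log-linear weighted-ordering distribution,
    where formulas is a list of (a, b, weight) triples meaning 'a < b'."""
    num = den = 0.0
    for order in permutations(elements):
        pos = {e: i for i, e in enumerate(order)}
        w = exp(sum(wt for a, b, wt in formulas if pos[a] < pos[b]))
        den += w  # partition function
        if pos[x] < pos[y]:
            num += w
    return num / den

formulas = [("a", "b", 2.0), ("b", "c", 2.0)]  # a soft chain a < b < c
p = query_prob(["a", "b", "c"], formulas, "a", "c")
print(p > 0.5)  # → True: the chain makes a-before-c likely
```

The chain-structured special case above is where the paper's exact O(n^3) algorithm applies in place of this factorial enumeration.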


Modular Systems with Preferences

Ensan, Alireza (Simon Fraser University) | Ternovska, Eugenia (Simon Fraser University)

AAAI Conferences

We propose a versatile framework for combining knowledge bases in modular systems with preferences. In our formalism, each module (knowledge base) can be specified in a different language. We define the notion of a preference-based modular system that includes a formalization of meta-preferences. We prove that our formalism is robust in the sense that the operations for combining modules preserve the notion of a preference-based modular system. Finally, we formally demonstrate correspondences between our framework and the related preference formalisms of CP-nets and preference-based planning. Our framework allows one to use these preference formalisms (and others) in combination, in the same modular system.


Computation and Complexity of Preference Inference Based on Hierarchical Models

Wilson, Nic (University College Cork) | George, Anne-Marie (University College Cork) | O'Sullivan, Barry (University College Cork)

AAAI Conferences

Preference Inference involves inferring additional user preferences from elicited or observed preferences, based on assumptions regarding the form of the user's preference relation. In this paper we consider a situation in which alternatives have an associated vector of costs, each component corresponding to a different criterion, and are compared using a kind of lexicographic order, similar to the way alternatives are compared in a Hierarchical Constraint Logic Programming model. It is assumed that the user has some (unknown) importance ordering on criteria, and that to compare two alternatives, the combined costs of each alternative with respect to the most important criteria are compared first; only if these combined costs are equal are the next most important criteria considered. The preference inference problem then consists of determining whether a preference statement can be inferred from a set of input preferences. We show that this problem is co-NP-complete, even if one restricts the cardinality of the equal-importance sets to at most two elements, and one only considers non-strict preferences. However, it is polynomial if it is assumed that the user's ordering of criteria is a total ordering; it is also polynomial if the sets of equally important criteria are all equivalence classes of a given fixed equivalence relation. We give an efficient polynomial algorithm for these cases, which also throws light on the structure of the inference.
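The hierarchical comparison described above can be sketched directly: the model is an importance-ordered partition of the criteria into equal-importance sets, and combined (summed) costs are compared set by set, lexicographically. The criteria names and cost values below are illustrative:

```python
# Hierarchical (lexicographic-on-sums) comparison of cost vectors.
# `model` is a list of equal-importance sets, most important first;
# lower combined cost is preferred.

def hierarchical_compare(model, cost_a, cost_b):
    """Return -1 if a is preferred, 1 if b is preferred, 0 if equivalent."""
    for level in model:
        sa = sum(cost_a[c] for c in level)
        sb = sum(cost_b[c] for c in level)
        if sa != sb:
            return -1 if sa < sb else 1
    return 0  # equal combined cost at every importance level

model = [{"price", "delivery"}, {"quality"}]  # price+delivery outrank quality
a = {"price": 3, "delivery": 1, "quality": 5}
b = {"price": 2, "delivery": 2, "quality": 1}
print(hierarchical_compare(model, a, b))  # → 1: level-one sums tie (4 vs 4),
                                          #   so quality decides in b's favour
```

The inference problem is harder than this comparison suggests because the partition itself is unknown: one must decide whether a statement holds under every hierarchical model consistent with the inputs, which is where the co-NP-completeness arises.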


asprin: Customizing Answer Set Preferences without a Headache

Brewka, Gerhard (University of Leipzig) | Delgrande, James (Simon Fraser University) | Romero, Javier (University of Potsdam) | Schaub, Torsten (University of Potsdam)

AAAI Conferences

In this paper we describe asprin, a general, flexible, and extensible framework for handling preferences among the stable models of a logic program. We show how complex preference relations can be specified through user-defined preference types and their arguments. We describe how preference specifications are handled internally by so-called preference programs, which are used for dominance testing. We also give algorithms for computing one, or all, optimal stable models of a logic program. Notably, our algorithms depend on the complexity of the dominance tests and make use of multi-shot answer set solving technology.
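Abstracted away from answer set solving, the optimal-model computation described above follows an iterated dominance-testing pattern: find any model, then repeatedly ask for one that strictly dominates it. The sketch below, with a toy subset-minimality preference standing in for a user-defined preference type, illustrates that loop; it is an assumption-laden simplification, not asprin's actual implementation:

```python
# Iterated dominance testing over an explicit list of candidate models.
# `dominates(m1, m2)` must be a strict partial order (acyclic), which
# guarantees the improvement loop terminates.

def find_optimal(models, dominates):
    """Return a model not strictly dominated by any listed model, or None."""
    models = list(models)
    if not models:
        return None
    current = models[0]
    improved = True
    while improved:
        improved = False
        for m in models:
            if dominates(m, current):  # the dominance test
                current, improved = m, True
                break
    return current

# Toy preference type: subset-minimality over sets of atoms.
models = [{"a", "b"}, {"a"}, {"b", "c"}]
subset_min = lambda m1, m2: m1 < m2  # proper subset means m1 dominates m2
print(find_optimal(models, subset_min))  # → {'a'}
```

In asprin itself the candidate models are stable models produced on demand by a multi-shot ASP solver, and the dominance test is compiled into a preference program rather than evaluated by a Python predicate.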