Goto

Collaborating Authors

 ordinal data



Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula

Neural Information Processing Systems

Modern large scale datasets are often plagued with missing entries. For tabular data with missing values, a flurry of imputation algorithms solve for a complete matrix which minimizes some penalized reconstruction error. However, almost none of them can estimate the uncertainty of its imputations. This paper proposes a probabilistic and scalable framework for missing value imputation with quantified uncertainty.


Approximately Unimodal Likelihood Models for Ordinal Regression

Yamasaki, Ryoya

arXiv.org Machine Learning

Ordinal regression (OR, also called ordinal classification) is classification of ordinal data, in which the underlying target variable is categorical and considered to have a natural ordinal relation for the underlying explanatory variable. A key to successful OR models is to find a data structure `natural ordinal relation' common to many ordinal data and reflect that structure into the design of those models. A recent OR study found that many real-world ordinal data show a tendency that the conditional probability distribution (CPD) of the target variable given a value of the explanatory variable will often be unimodal. Several previous studies thus developed unimodal likelihood models, in which a predicted CPD is guaranteed to become unimodal. However, it was also observed experimentally that many real-world ordinal data partly have values of the explanatory variable where the underlying CPD will be non-unimodal, and hence unimodal likelihood models may suffer from a bias for such a CPD. Therefore, motivated to mitigate such a bias, we propose approximately unimodal likelihood models, which can represent up to a unimodal CPD and a CPD that is close to be unimodal. We also verify experimentally that a proposed model can be effective for statistical modeling of ordinal data and OR tasks.


Learning Mixed Multinomial Logit Model from Ordinal Data

Neural Information Processing Systems

Motivated by generating personalized recommendations using ordinal (or preference) data, we study the question of learning a mixture of MultiNomial Logit (MNL) model, a parameterized class of distributions over permutations, from partial ordinal or preference data (e.g.




Beyond Ordinal Preferences: Why Alignment Needs Cardinal Human Feedback

Whitfill, Parker, Slocum, Stewy

arXiv.org Artificial Intelligence

Alignment techniques for LLMs rely on optimizing preference-based objectives -- where these preferences are typically elicited as ordinal, binary choices between responses. Recent work has focused on improving label quality or mitigating particular biases, but we identify a more fundamental limitation: these methods collect the wrong kind of data. We prove an impossibility result: no algorithm relying solely on ordinal comparisons can systematically recover the most preferred model. Intuitively, ordinal data lacks the information needed to resolve tradeoffs -- e.g., fixing a factual error on one prompt versus improving style on another. We show that selecting the optimal model requires recovering preferences over \emph{models} (rather than just responses), which can only be identified given cardinal feedback about response quality. To address this, we collect and publicly release a dataset of 25,000 cardinal judgments using willingness-to-pay elicitations, a well-established tool from experimental economics. Empirically, we find that incorporating cardinal feedback into preference fine-tuning allows models to prioritize high-impact improvements and outperform ordinal-only methods on downstream benchmarks, such as Arena-Hard.


Collaboratively Learning Preferences from Ordinal Data

Neural Information Processing Systems

In personalized recommendation systems, it is important to predict preferences of a user on items that have not been seen by that user yet. Similarly, in revenue management, it is important to predict outcomes of comparisons among those items that have never been compared so far. The MultiNomial Logit model, a popular discrete choice model, captures the structure of the hidden preferences with a low-rank matrix. In order to predict the preferences, we want to learn the underlying model from noisy observations of the low-rank matrix, collected as revealed preferences in various forms of ordinal data. A natural approach to learn such a model is to solve a convex relaxation of nuclear norm minimization. We present the convex relaxation approach in two contexts of interest: collaborative ranking and bundled choice modeling. In both cases, we show that the convex relaxation is minimax optimal. We prove an upper bound on the resulting error with finite samples, and provide a matching information-theoretic lower bound.


A New Approach for Multicriteria Assessment in the Ranking of Alternatives Using Cardinal and Ordinal Data

Liu, Fuh-Hwa Franklin, Shih, Su-Chuan

arXiv.org Artificial Intelligence

Modern methods for multi-criteria assessment (MCA), such as Data Envelopment Analysis (DEA), Stochastic Frontier Analysis (SFA), and Multiple Criteria Decision-Making (MCDM), are utilized to appraise a collection of Decision-Making Units (DMUs), also known as alternatives, based on several criteria. These methodologies inherently rely on assumptions and can be influenced by subjective judgment to effectively tackle the complex evaluation challenges in various fields. In real-world scenarios, it is essential to incorporate both quantitative and qualitative criteria as they consist of cardinal and ordinal data. Despite the inherent variability in the criterion values of different alternatives, the homogeneity assumption is often employed, significantly affecting evaluations. To tackle these challenges and determine the most appropriate alternative, we propose a novel MCA approach that combines two Virtual Gap Analysis (VGA) models. The VGA framework, rooted in linear programming, is pivotal in the MCA methodology. This approach improves efficiency and fairness, ensuring that evaluations are both comprehensive and dependable, thus offering a strong and adaptive solution. Two comprehensive numerical examples demonstrate the accuracy and transparency of our proposed method. The goal is to encourage continued advancement and stimulate progress in automated decision systems and decision support systems.


Modeling Human Responses by Ordinal Archetypal Analysis

Wedenborg, Anna Emilie J., Harborg, Michael Alexander, Bigom, Andreas, Elmgreen, Oliver, Presutti, Marcus, Råskov, Andreas, Glückstad, Fumiko Kano, Schmidt, Mikkel, Mørup, Morten

arXiv.org Artificial Intelligence

This paper introduces a novel framework for Archetypal Analysis (AA) tailored to ordinal data, particularly from questionnaires. Unlike existing methods, the proposed method, Ordinal Archetypal Analysis (OAA), bypasses the two-step process of transforming ordinal data into continuous scales and operates directly on the ordinal data. We extend traditional AA methods to handle the subjective nature of questionnaire-based data, acknowledging individual differences in scale perception. We introduce the Response Bias Ordinal Archetypal Analysis (RBOAA), which learns individualized scales for each subject during optimization. The effectiveness of these methods is demonstrated on synthetic data and the European Social Survey dataset, highlighting their potential to provide deeper insights into human behavior and perception. The study underscores the importance of considering response bias in cross-national research and offers a principled approach to analyzing ordinal data through Archetypal Analysis.