AITopics | partial feedback

partial feedback

Markov Persuasion Processes: Learning to Persuade From Scratch

Neural Information Processing Systems | Jun-17-2026, 22:31:41 GMT

In Bayesian persuasion, an informed sender strategically discloses information to a receiver so as to persuade them to undertake desirable actions. Recently, Markov persuasion processes (MPPs) have been introduced to capture sequential scenarios where a sender faces a stream of myopic receivers in a Markovian environment. The MPPs studied so far in the literature suffer from issues that prevent them from being fully operational in practice, e.g., they assume that the sender knows receivers' rewards. We fix such issues by addressing MPPs where the sender has no knowledge about the environment.

artificial intelligence, machine learning, probability, (17 more...)

Neural Information Processing Systems

Country: Europe > Austria (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Online Conformal Prediction with Adversarial Semi-bandit Feedback via Regret Minimization

Yang, Junyoung, Kim, Kyungmin, Park, Sangdon

arXiv.org Machine Learning | Apr-21-2026

Uncertainty quantification is crucial in safety-critical systems, where decisions must be made under uncertainty. In particular, we consider the problem of online uncertainty quantification, where data points arrive sequentially. Online conformal prediction is a principled online uncertainty quantification method that dynamically constructs a prediction set at each time step. While existing methods for online conformal prediction provide long-run coverage guarantees without any distributional assumptions, they typically assume a full feedback setting in which the true label is always observed. In this paper, we propose a novel learning method for online conformal prediction with partial feedback from an adaptive adversary, a more challenging setup in which the true label is revealed only when it lies inside the constructed prediction set. Specifically, we formulate online conformal prediction as an adversarial bandit problem by treating each candidate prediction set as an arm. Building on an existing algorithm for adversarial bandits, our method achieves a long-run coverage guarantee by explicitly establishing its connection to the regret of the learner. Finally, we empirically demonstrate the effectiveness of our method in both independent and identically distributed (i.i.d.) and non-i.i.d. settings, showing that it successfully controls the miscoverage rate while maintaining a reasonable prediction set size.
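The feedback model in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only shows what a learner can and cannot observe in one round. The score-thresholding rule and the names `prediction_set`, `semi_bandit_feedback`, and `known_miscoverage` are assumptions made for illustration.

```python
def prediction_set(scores, tau):
    """Labels whose score is at least tau; a smaller tau gives a larger set."""
    return {label for label, s in enumerate(scores) if s >= tau}


def semi_bandit_feedback(pred_set, true_label):
    """Reveal the true label only when it lies inside the prediction set."""
    return true_label if true_label in pred_set else None


def known_miscoverage(scores, taus, played, true_label):
    """Miscoverage per arm after one round: 1 = miss, 0 = cover, None = unknown.

    Each threshold in `taus` is one arm (one candidate prediction set).
    """
    revealed = semi_bandit_feedback(prediction_set(scores, taus[played]), true_label)
    if revealed is not None:
        # The label is known, so every candidate set can be checked.
        return [0 if scores[revealed] >= t else 1 for t in taus]
    # The label is hidden: the played set missed it, and so does every set
    # nested inside it (tau at least as large). Larger sets stay unknown.
    return [1 if t >= taus[played] else None for t in taus]
```

With thresholded sets the arms are nested, so a single round can inform several arms at once; whether and how the paper exploits this is not stated in the abstract.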

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning | arXiv:2604.17984

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)


Equal Opportunity in Online Classification with Partial Feedback

Neural Information Processing Systems | Dec-24-2025, 23:56:50 GMT

We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates, introduced as "equal opportunity" in Hardt et al. (2016)) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle-efficient algorithm that achieves the upper bound.

equal opportunity, name change, online classification, (4 more...)

Neural Information Processing Systems

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Online Selective Generation with Adversarial Bandit Feedback

Lee, Minjae, Jung, Yoonjae, Park, Sangdon

arXiv.org Artificial Intelligence | Oct-14-2025

Large language generative models increasingly interact with humans, but their falsified responses raise concerns. To mitigate this hallucination effect, selectively abstaining from answering, called selective generation, provides an effective way for generators to control hallucination when uncertain about their answers. However, as selective generators interact under adversarial environments and receive partial feedback from users on selected generation (e.g., thumbs up or down on the selected answer), learning methods for selective generation under such practical setups are crucial but currently missing. To address this limitation, we propose an online learning algorithm for selective generation with partial feedback under an adaptive adversary. In particular, we re-purpose an adversarial bandit algorithm to design an online selective generation method with a controllable false discovery rate (FDR), which measures the rate of hallucination. The key building blocks include a novel conversion lemma from the regret of any bandit algorithm to the FDR, and the exploitation of a unique structure of selective generation to reuse partial feedback, which we call feedback unlocking. We empirically evaluate the efficacy of the proposed online selective generation algorithm with partial feedback over diverse learning environments, demonstrating its ability to control the FDR while maintaining reasonable selection efficiency, i.e., the ratio of non-abstaining answers, compared to baselines.
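The two quantities named in this abstract, the FDR and the selection efficiency, can be made concrete with a small sketch. It assumes a simple confidence-threshold selector and the usual empirical reading of FDR as the share of wrong answers among those that were not abstained on; the paper's exact definitions may differ, and `selective_metrics` is a hypothetical helper.

```python
def selective_metrics(confidences, correct, tau):
    """Empirical FDR and selection efficiency of a threshold selector.

    An answer is selected (not abstained on) when its confidence is >= tau.
    FDR: wrong answers among the selected ones.
    Efficiency: selected answers among all answers.
    """
    selected = [c >= tau for c in confidences]
    n_selected = sum(selected)
    n_wrong = sum(1 for s, ok in zip(selected, correct) if s and not ok)
    fdr = n_wrong / n_selected if n_selected else 0.0
    efficiency = n_selected / len(confidences) if confidences else 0.0
    return fdr, efficiency
```

Raising `tau` typically lowers the FDR at the price of abstaining more often, which is the trade-off the abstract describes.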

large language model, machine learning, partial feedback, (23 more...)

arXiv.org Artificial Intelligence | arXiv:2506.14067

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Data Science > Data Mining > Big Data (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)


Cost Efficient Fairness Audit Under Partial Feedback

Das, Nirjhar, Sharma, Mohit, Nanavati, Praharsh, Shiragur, Kirankumar, Deshpande, Amit

arXiv.org Machine Learning | Oct-7-2025

We study the problem of auditing the fairness of a given classifier under partial feedback, where true labels are available only for positively classified individuals (e.g., loan repayment outcomes are observed only for approved applicants). We introduce a novel cost model for acquiring additional labeled data, designed to more accurately reflect real-world costs such as credit assessment, loan processing, and potential defaults. Our goal is to find optimal fairness audit algorithms that are more cost-effective than random exploration and natural baselines. We consider two audit settings: a black-box model with no assumptions on the data distribution, and a mixture model, where features and true labels follow a mixture of exponential family distributions. In the black-box setting, we propose a near-optimal auditing algorithm under mild assumptions and show that a natural baseline can be strictly suboptimal. In the mixture model setting, we design a novel algorithm that achieves significantly lower audit cost than the black-box case. Our approach leverages prior work on learning from truncated samples and maximum-a-posteriori oracles, and extends known results on spherical Gaussian mixtures to handle exponential family mixtures, which may be of independent interest. Moreover, our algorithms apply to popular fairness metrics including demographic parity, equal opportunity, and equalized odds. Empirically, we demonstrate strong performance of our algorithms on real-world fair classification datasets such as Adult Income and Law School, consistently outperforming natural baselines by around 50% in terms of audit cost.

algorithm, classifier, true label, (15 more...)

arXiv.org Machine Learning | arXiv:2510.03734

Country:

Asia > India (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Virginia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Banking & Finance > Loans (0.86)
Education > Educational Setting > Higher Education (0.34)
Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Reviews: Equal Opportunity in Online Classification with Partial Feedback

Neural Information Processing Systems | Jan-21-2025, 09:48:38 GMT

This paper studies the problem of online classification with partial feedback under the new constraint that the policy satisfies a fairness (equality of false positives) constraint at each round. The paper leverages careful modifications of a number of technical tools to prove O(sqrt(T)) regret with gamma O(T^(-1/4)) fairness rate. In particular, the authors reduce the partial feedback setting to a contextual bandits problem, construct an approximate "fair oracle" using a modification of the reductions approach to fair classification, and then modify ILOVETOCONBANDITS to use this approximate oracle. The relevant inspiration is clearly cited, and the main contribution is combining these tools to effectively handle the fairness constraint in the online learning problem. The proposed algorithm is intuitive: accept everyone in the early rounds to gather data, then use this data to determine which classifiers satisfy the constraint.
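The intuition in the last sentence of this review (accept everyone early, then keep only classifiers that satisfy the constraint) can be sketched as follows. This is a simplified illustration, not the paper's oracle-based algorithm: it assumes two groups, a finite list of candidate classifiers, and exploration data in which every label was observed because everyone was accepted. The names `false_positive_rate`, `fair_candidates`, and `threshold` are hypothetical.

```python
def false_positive_rate(h, data, group):
    """Share of true negatives in `group` that classifier h labels positive.

    `data` is a list of (feature, label, group) triples with labels in {0, 1}.
    """
    negatives = [x for x, y, g in data if g == group and y == 0]
    if not negatives:
        return 0.0
    return sum(h(x) for x in negatives) / len(negatives)


def fair_candidates(classifiers, data, gamma):
    """Keep classifiers whose empirical false positive rate gap is at most gamma."""
    kept = []
    for h in classifiers:
        gap = abs(false_positive_rate(h, data, 0) - false_positive_rate(h, data, 1))
        if gap <= gamma:
            kept.append(h)
    return kept


def threshold(t):
    """Toy classifier on a scalar feature: positive when the feature is >= t."""
    return lambda x: int(x >= t)
```

After the exploration phase, a learner would restrict itself to the classifiers returned by `fair_candidates`; how the tolerance `gamma` should shrink with the amount of data is part of the paper's analysis and is not reproduced here.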

equal opportunity, online classification, partial feedback, (6 more...)

Neural Information Processing Systems

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.59)
Information Technology > Data Science > Data Mining > Big Data (0.39)


Equal Opportunity in Online Classification with Partial Feedback

Neural Information Processing Systems | Oct-9-2024, 11:53:23 GMT

We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates, introduced as "equal opportunity" in Hardt et al. (2016)) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle-efficient algorithm that achieves the upper bound.

equal opportunity, online classification, partial feedback, (2 more...)

Neural Information Processing Systems

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


On Multilabel Classification and Ranking with Partial Feedback

Neural Information Processing Systems | Mar-14-2024, 03:10:00 GMT

artificial intelligence, information management, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


On Multilabel Classification and Ranking with Partial Feedback

Neural Information Processing Systems | Feb-16-2024, 06:30:29 GMT

We present a novel multilabel/ranking algorithm working in partial information settings. The algorithm is based on 2nd-order descent methods, and relies on upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partial adversarial setting, where covariates can be adversarial, but multilabel probabilities are ruled by (generalized) linear models. We show O(T^{1/2} \log T) regret bounds, which improve in several ways on the existing results. We test the effectiveness of our upper-confidence scheme by contrasting against full-information baselines on real-world multilabel datasets, often obtaining comparable performance.
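As a rough illustration of how upper-confidence bounds trade off exploration and exploitation when only the chosen labels receive feedback, here is a context-free, UCB1-style sketch. It is a stand-in technique, not the paper's algorithm, which uses second-order updates on (generalized) linear models; the functions `ucb_scores` and `top_k` and the parameter `alpha` are assumptions for illustration.

```python
import math


def ucb_scores(reward_sums, counts, t, alpha=2.0):
    """Optimistic score per label: empirical mean plus a confidence width.

    Labels that were never tried get an infinite score so they are tried first.
    """
    scores = []
    for total, n in zip(reward_sums, counts):
        if n == 0:
            scores.append(math.inf)
        else:
            scores.append(total / n + math.sqrt(alpha * math.log(t) / n))
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
```

Rarely tried labels get a wide confidence term and can outrank labels with a higher empirical mean, which is how the rule keeps exploring under partial feedback.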

algorithm, multilabel classification and ranking, partial feedback

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)


Markov Persuasion Processes: Learning to Persuade from Scratch

Bacchiocchi, Francesco, Stradi, Francesco Emanuele, Castiglioni, Matteo, Marchesi, Alberto, Gatti, Nicola

arXiv.org Artificial Intelligence | Feb-5-2024

In Bayesian persuasion, an informed sender strategically discloses information to a receiver so as to persuade them to undertake desirable actions. Growing attention has recently been devoted to settings in which the sender and receivers interact sequentially. In particular, Markov persuasion processes (MPPs) have been introduced to capture sequential scenarios where a sender faces a stream of myopic receivers in a Markovian environment. The MPPs studied so far in the literature suffer from issues that prevent them from being fully operational in practice, e.g., they assume that the sender knows receivers' rewards. We fix such issues by addressing MPPs where the sender has no knowledge about the environment. We design a learning algorithm for the sender, working with partial feedback. We prove that its regret with respect to an optimal information-disclosure policy grows sublinearly in the number of episodes, as does the loss in persuasiveness accumulated while learning. Moreover, we provide a lower bound for our setting matching the guarantees of our algorithm.

algorithm, probability, receiver, (15 more...)

arXiv.org Artificial Intelligence | arXiv:2402.03077

Country:

Europe > Austria > Vienna (0.14)
Europe > Kosovo > District of Gjilan > Kamenica (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
