Neyman-Pearson multiclass classification under label noise via empirical likelihood
Qiong Zhang, Qinglong Tian, Pengfei Li
In many classification problems, the costs of misclassifying observations from different classes can be highly unequal. The Neyman-Pearson multiclass classification (NPMC) framework addresses this issue by minimizing a weighted misclassification risk while imposing upper bounds on class-specific error probabilities. Existing NPMC methods typically assume that training labels are correctly observed. In practice, however, labels are often corrupted by measurement or annotation error, and the effect of such label noise on NPMC procedures remains largely unexplored. We study the NPMC problem when only noisy labels are available in the training data. We propose an empirical likelihood (EL)-based method that relates the distributions of the noisy and true labels through an exponential tilting density ratio model. The resulting maximum EL estimators recover the class proportions and posterior probabilities of the clean labels that are required for error control. We establish consistency, asymptotic normality, and optimal convergence rates for these estimators. Under mild conditions, the resulting classifier asymptotically satisfies NP oracle inequalities with respect to the true labels. An expectation-maximization algorithm computes the maximum EL estimators. Simulations show that the proposed method performs comparably to the oracle classifier trained on clean labels and substantially improves over procedures that ignore label noise.
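The abstract's key technical ingredients are an exponential tilting density ratio model linking the class-conditional distributions and an EM algorithm for computing the maximum EL estimators. As a much-simplified, hypothetical illustration (not the paper's estimator), the Python sketch below runs EM on a two-component mixture whose component densities differ by an exponential tilt, recovering the mixing proportion, an analogue of a clean-label class proportion, together with the tilt parameter. The standard-normal base density and all identifiers are assumptions of this toy example.

import numpy as np
from scipy.stats import norm

# Toy sketch (not the paper's method): EM for a two-component mixture whose
# class-conditional densities are linked by an exponential tilt,
#   f1(x) = exp(beta * x - beta**2 / 2) * f0(x),  with f0 = N(0, 1),
# so that f1 = N(beta, 1) and the M-step reduces to weighted means.

def em_tilted_mixture(x, n_iter=200, tol=1e-8):
    pi, beta = 0.5, np.mean(x)          # crude initial values
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from class 1
        f0 = norm.pdf(x, loc=0.0)
        f1 = norm.pdf(x, loc=beta)
        r = pi * f1 / ((1 - pi) * f0 + pi * f1)
        # M-step: update the class proportion and the tilt parameter
        pi_new, beta_new = r.mean(), (r * x).sum() / r.sum()
        if abs(pi_new - pi) + abs(beta_new - beta) < tol:
            pi, beta = pi_new, beta_new
            break
        pi, beta = pi_new, beta_new
    return pi, beta

rng = np.random.default_rng(0)
z = rng.random(5000) < 0.3                    # true (unobserved) labels
x = rng.normal(loc=np.where(z, 2.0, 0.0))     # sample from the tilted mixture
print(em_tilted_mixture(x))                   # roughly (0.3, 2.0)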
Entropic Neural Optimal Transport via Diffusion Processes
We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between continuous probability distributions that are accessible only through samples. Our algorithm is based on a saddle-point reformulation of the dynamic version of EOT, known as the Schrödinger Bridge problem. In contrast to prior methods for large-scale EOT, our algorithm is end-to-end, consists of a single learning step, has a fast inference procedure, and can handle small values of the entropy regularization coefficient, which is of particular importance in some applied problems. Empirically, we demonstrate the performance of the method on several large-scale EOT tasks.
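The paper's solver is neural and trained from samples; it is not reproduced here. For orientation, the sketch below is a generic log-domain Sinkhorn iteration for the discrete EOT problem, the classical baseline, which also makes concrete why small entropy coefficients are numerically delicate: the kernel exp(-C/eps) underflows unless the dual updates are carried out in log space. All names and parameter values are illustrative assumptions, not the paper's method.

import numpy as np
from scipy.special import logsumexp

# Generic log-domain Sinkhorn for discrete entropic OT (textbook baseline).
# Naive Sinkhorn scales exp(-C/eps) directly and underflows for small eps;
# the log-domain form below stays stable.

def sinkhorn_log(C, a, b, eps=0.05, n_iter=500):
    """C: (n, m) cost matrix; a, b: source/target histograms."""
    f, g = np.zeros(len(a)), np.zeros(len(b))   # dual potentials
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(n_iter):
        # alternating dual updates: soft-min over rows, then columns, of C
        f = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
    # recover the (approximately feasible) transport plan
    return np.exp((f[:, None] + g[None, :] - C) / eps)

rng = np.random.default_rng(0)
x, y = rng.normal(size=(50, 2)), rng.normal(size=(60, 2)) + 1.0
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # squared distances
n, m = C.shape
P = sinkhorn_log(C, np.full(n, 1 / n), np.full(m, 1 / m))
print(P.sum(), P.shape)                               # ~1.0, (50, 60)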
A Proofs
Let Cost(π) denote the value of the weak OT functional for a plan π, i.e., Cost(π) = … . We are going to use Theorem 3.1. As a result, every plan is optimal.

Proof of Proposition 3.3. According to Theorem 3.2, one only has to ensure that … . In any case, this is immaterial for us. It remains to upper bound the first term in (23).

Formula (12) for the optimal drift follows from [38, Proposition 4.1]. From Proposition 3.3 it follows that … ; for other ε > 0, the analogous equivalence holds.