unbiasedness
Demystifying the Optimal Performance of Multi-Class Classification
Classification is a fundamental task in science and engineering on which machine learning methods have shown outstanding performance. However, it is challenging to determine whether such methods have achieved the Bayes error rate, that is, the lowest error rate attainable by any classifier. This is mainly because the Bayes error rate is not known in general, so estimating it effectively is paramount. Inspired by the work of Ishida et al. (2023), we propose an estimator for the Bayes error rate of supervised multi-class classification problems. We analyze several theoretical aspects of this estimator, including its consistency, unbiasedness, convergence rate, variance, and robustness. We also propose a denoising method that reduces the noise that potentially corrupts the data labels, and we improve the robustness of the proposed estimator to outliers by incorporating the median-of-means estimator. Our analysis demonstrates the consistency, asymptotic unbiasedness, convergence rate, and robustness of the proposed estimators.
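The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical setup in which each instance comes with soft labels (estimated class-posterior probabilities) and uses the standard identity that the Bayes error equals the expected value of 1 - max_k P(Y = k | X). The function names `bayes_error_estimate`, `median_of_means`, and `robust_bayes_error_estimate`, and the parameters `num_blocks` and `seed`, are illustrative choices.

```python
import random
import statistics


def bayes_error_estimate(soft_labels):
    """Plug-in estimate: average of 1 - max_k p_k over all instances.

    soft_labels is assumed to be a list of per-instance probability
    vectors (hypothetical input, not specified in the abstract).
    """
    return statistics.fmean(1.0 - max(p) for p in soft_labels)


def median_of_means(values, num_blocks, seed=0):
    """Median-of-means: shuffle, split into blocks, take the median of block means."""
    values = list(values)
    if num_blocks < 1 or num_blocks > len(values):
        raise ValueError("num_blocks must be between 1 and len(values)")
    # Shuffle so that block membership does not depend on input order.
    random.Random(seed).shuffle(values)
    # Strided slicing yields num_blocks non-empty, near-equal-size blocks.
    blocks = [values[i::num_blocks] for i in range(num_blocks)]
    return statistics.median(statistics.fmean(b) for b in blocks)


def robust_bayes_error_estimate(soft_labels, num_blocks=5, seed=0):
    """Same per-instance terms as the plug-in estimate, aggregated by median-of-means."""
    per_instance = [1.0 - max(p) for p in soft_labels]
    return median_of_means(per_instance, num_blocks, seed)
```

Replacing the plain mean with a median of block means limits the influence of a few corrupted instances: a single outlier can distort at most one block mean, and the median ignores it as long as most blocks are clean.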
TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems
Yu, Jiahao, Liu, Haozhuang, Yang, Yeqiu, Chen, Lu, Wu, Jian, Jiang, Yuning, Zheng, Bo
Regression models are crucial in recommender systems. However, the retransformation bias problem has been conspicuously neglected within the community. While many works in other fields have devised effective bias correction methods, all of them are post-hoc cures applied externally to the model, and they face practical challenges when applied to real-world recommender systems. Hence, we propose a preemptive paradigm to eradicate the bias intrinsically from the models via minor model refinement. Specifically, we propose a novel method, TranSUN, which learns the bias jointly with the model to offer theoretically guaranteed unbiasedness with empirically superior convergence. It is further generalized into a novel generic regression model family, termed Generalized TranSUN (GTS), which not only offers more theoretical insights but also serves as a generic framework for flexibly developing various bias-free models. Comprehensive experimental results demonstrate the superiority of our methods across data from various domains. The methods have been successfully deployed in two real-world industrial recommendation scenarios, i.e., product and short video recommendation in the Guess What You Like business domain on the homepage of the Taobao App (a leading e-commerce platform with DAU > 300M), to serve the major online traffic.
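To make the term concrete, the sketch below illustrates retransformation bias in the simplest possible case, an intercept-only model fitted on log-transformed targets, along with Duan's smearing estimator as an example of the post-hoc corrections the abstract contrasts with. It is not TranSUN or GTS, whose details the abstract does not give. The log-normal data, the function name `retransformation_demo`, and its parameters are assumptions chosen for illustration.

```python
import math
import random
import statistics


def retransformation_demo(n=20000, mu=0.0, sigma=1.0, seed=0):
    """Compare the sample mean of y with naive and smearing-corrected back-transforms."""
    rng = random.Random(seed)
    # Hypothetical targets: log(y) is Gaussian, so y is log-normal.
    log_y = [rng.gauss(mu, sigma) for _ in range(n)]
    y = [math.exp(v) for v in log_y]

    # Quantity we actually want to predict: the mean of y.
    sample_mean = statistics.fmean(y)

    # Intercept-only "model" fitted in log space.
    log_pred = statistics.fmean(log_y)

    # Naive back-transform: exp of the log-space prediction.
    # By Jensen's inequality this underestimates the mean of y.
    naive = math.exp(log_pred)

    # Duan's smearing estimator: rescale by the mean of exp(residual).
    residuals = [v - log_pred for v in log_y]
    smeared = naive * statistics.fmean(math.exp(r) for r in residuals)

    return sample_mean, naive, smeared


sample_mean, naive, smeared = retransformation_demo()
print(f"sample mean of y:       {sample_mean:.4f}")
print(f"naive back-transform:   {naive:.4f}")
print(f"smearing-corrected:     {smeared:.4f}")
```

The naive value is the geometric mean of the targets, which never exceeds their arithmetic mean, so the gap is systematic rather than sampling noise. The smearing factor is computed after fitting and sits outside the model, which is the property the abstract describes as a post-hoc cure.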
clarity recommendations the reviewers suggest, turning now to the main concerns of each reviewer
We thank the reviewers for their valuable feedback, which will improve the paper. Regarding the reviewer's comments about applications, we chose to limit the number of applications to three because [...] Cauchy, which has unbounded variance), in contrast to our mechanisms. As requested, we will add a discussion about related work on lower bounds for private mechanisms. For the reviewer's main comment on the contributions of this paper with regard to Asi & Duchi (2020), we believe [...] Such general (vector-valued) functions are the main focus of this submission. We thank the reviewer for bringing our attention to Reimherr & Awan's K-norm mechanism (2019), which certainly [...] We will discuss this work more carefully in the final version.