Towards Accurate and Calibrated Classification: Regularizing Cross-Entropy From A Generative Perspective
Qipeng Zhan, Zhuoping Zhou, Li Shen
Accurate classification requires not only high predictive accuracy but also well-calibrated confidence estimates. Yet, modern deep neural networks (DNNs) are often overconfident, primarily due to overfitting on the negative log-likelihood (NLL). While focal loss variants alleviate this issue, they typically reduce accuracy, revealing a persistent trade-off between calibration and predictive performance. Motivated by the complementary strengths of generative and discriminative classifiers, we propose Generative Cross-Entropy (GCE), which maximizes $p(x|y)$ and is equivalent to cross-entropy augmented with a class-level confidence regularizer. Under mild conditions, GCE is strictly proper. Across CIFAR-10/100, Tiny-ImageNet, and a medical imaging benchmark, GCE improves both accuracy and calibration over cross-entropy, especially in the long-tailed scenario. Combined with adaptive piecewise temperature scaling (ATS), GCE attains calibration competitive with focal-loss variants without sacrificing accuracy.
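The abstract describes GCE as cross-entropy augmented with a class-level confidence regularizer, but does not give its exact form. As a minimal sketch of the general idea — cross-entropy plus a confidence (here, entropy-based) penalty — assuming NumPy; the `lam` weight and the entropy term are illustrative assumptions, not the paper's GCE:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def regularized_ce(logits, labels, lam=0.1):
    # Standard cross-entropy plus a generic confidence penalty
    # (an entropy bonus). Illustrative only: GCE's class-level
    # regularizer is not specified in the abstract.
    p = softmax(logits)
    n = logits.shape[0]
    ce = -np.log(p[np.arange(n), labels] + 1e-12).mean()
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1).mean()
    return ce - lam * entropy
```

With `lam > 0`, confident (low-entropy) predictions are penalized relative to plain cross-entropy, which is the same overconfidence-reducing mechanism the abstract attributes to the regularizer.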
- Health & Medicine > Therapeutic Area > Neurology (0.68)
- Health & Medicine > Diagnostic Medicine > Imaging (0.48)
- North America > United States (0.04)
- Asia > Middle East > Israel (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Asia > China (0.04)
- South America > Brazil > São Paulo (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology (0.67)
- Government > Military (0.46)
- Government > Regional Government (0.46)
- Research Report (0.46)
- Overview (0.40)
Deep active learning aims to reduce the annotation cost for the training of deep models, which is notoriously data-hungry. Until recently, deep active learning methods were ineffectual in the low-budget regime, where only a small number of examples are annotated. The situation has been alleviated by recent advances in representation and self-supervised learning, which impart the geometry of the data representation with rich information about the points.
- Asia > Middle East > Israel (0.04)
- Africa > Ethiopia (0.04)
Appendix A Related Work
For the latter, PT-based methods adaptively extract a matching width-based slimmed-down sub-model from the global model as a local model according to each client's budget, thus averting the requirement for public data. As with FedAvg, PT-based methods require the server to communicate periodically with the clients. Existing PT-based methods focus on how to extract width-based sub-models from the global model. DFKD methods, which transfer knowledge from a teacher model to a student model without any real data, are promising. Existing DFKD methods can be broadly classified into non-adversarial and adversarial training methods; both take the quality and/or diversity of the synthetic data as important objectives.
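The width-based extraction described above can be illustrated minimally. The slicing rule below (keep the first fraction of input/output units of a dense layer) is a common convention but is an assumption for illustration, not any specific method's rule:

```python
import numpy as np

def slim_linear(W, b, in_ratio=1.0, out_ratio=0.5):
    # Extract a width-slimmed sub-layer from a global dense layer
    # by keeping the first fraction of its units. A hypothetical
    # sketch of PT-based sub-model extraction; real methods choose
    # ratios per client budget and slice every layer consistently.
    out_k = max(1, int(W.shape[0] * out_ratio))
    in_k = max(1, int(W.shape[1] * in_ratio))
    return W[:out_k, :in_k], b[:out_k]
```

A client with half the budget would receive the `out_ratio=0.5` slice of each layer, train it locally, and the server would average the overlapping prefix of parameters across clients.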
- Information Technology > Security & Privacy (1.00)
- Education (0.66)
- Asia > Taiwan (0.06)
- North America > United States (0.05)
- North America > Canada (0.05)