

Appendix A Proofs A.1 Proof of Proposition

Neural Information Processing Systems

For the general backward correction based on Eq. 11, we conduct adversarial training (AT) on the corrected loss. Based on Eqs. 13, 14, and 15, the inequality holds between their empirical formulations. We adversarially train a model with several complementary losses separately on Kuzushiji; the results show the same observation as in Section 4.2. Note that we only optimize the model using the adversarial examples generated by the oracle. Figure 6 shows the results on four randomly sampled instances from Kuzushiji. For AT with CLs, the two-stage method consists of a complementary learning phase and an AT phase, following the complementary learning setups and the AT setups in Section 5, respectively. For CIFAR10 and SVHN, the learning rates are set to 0.01.
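To make the two-stage baseline mentioned above concrete, here is a minimal runnable sketch on synthetic data. The tiny linear model, the synthetic inputs, the uniform-assumption complementary loss, and the single-step FGSM attack are illustrative assumptions rather than the authors' actual setup; only the 0.01 learning rate follows the text.

import torch
import torch.nn.functional as F

# Sketch of the two-stage baseline: a complementary-learning phase followed by
# an adversarial-training (AT) phase on pseudo-labels. Model, data, and attack
# are simplifying assumptions for illustration.
torch.manual_seed(0)
num_classes, dim = 10, 784
x = torch.rand(256, dim)                                   # stand-in "images" in [0, 1]
true_y = torch.randint(num_classes, (256,))
comp_y = (true_y + torch.randint(1, num_classes, (256,))) % num_classes  # a class each sample is NOT

model = torch.nn.Linear(dim, num_classes)
opt = torch.optim.SGD(model.parameters(), lr=0.01)         # lr = 0.01 as noted above

# Stage 1: complementary learning (no attacks): push probability away from comp_y.
for _ in range(50):
    p_comp = F.softmax(model(x), dim=1).gather(1, comp_y.view(-1, 1)).squeeze(1)
    loss = -torch.log(1.0 - p_comp + 1e-12).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: AT on pseudo-labels predicted by the stage-1 model (FGSM for brevity).
eps = 8 / 255
for _ in range(50):
    pseudo = model(x).argmax(dim=1).detach()
    x_req = x.clone().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x_req), pseudo), x_req)[0]
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()
    loss = F.cross_entropy(model(x_adv), pseudo)
    opt.zero_grad(); loss.backward(); opt.step()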


Adversarial Training with Complementary Labels: On the Benefit of Gradually Informative Attacks

Neural Information Processing Systems

To push AT towards more practical scenarios, we explore a brand new yet challenging setting, i.e., AT with complementary labels (CLs), which specify a class that a data sample does not belong to.




Learning from Stochastic Labels

Wei, Meng, Li, Zhongnian, Zhou, Yong, Guo, Qiaoyu, Xu, Xinzheng

arXiv.org Artificial Intelligence

Annotating multi-class instances is a crucial task in the field of machine learning. Unfortunately, identifying the correct class label from a long sequence of candidate labels is time-consuming and laborious. To alleviate this problem, we design a novel labeling mechanism called the stochastic label. In this setting, a stochastic label covers two cases: 1) the annotator identifies the correct class label from a small number of randomly given labels; 2) the annotator marks the instance with a None label when the given labels do not contain the correct class. In this paper, we propose a novel approach suited to learning from these stochastic labels. We derive an unbiased estimator that trains a multi-class classifier using the weaker supervision contained in stochastic labels. Additionally, we theoretically justify the proposed method by deriving its estimation error bound. Finally, we conduct extensive experiments on widely used benchmark datasets to validate the superiority of our method against existing state-of-the-art methods.
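As a concrete illustration of the labeling mechanism described in this abstract (not of the paper's estimator), the two cases can be simulated in a few lines; the candidate-set size k and the uniform sampling of candidates are assumptions made for this sketch.

import random

def stochastic_label(true_label, num_classes, k=3, rng=random):
    """Simulate stochastic-label annotation: the annotator sees k randomly
    chosen candidate classes and either identifies the correct one (case 1)
    or answers None when it is not among them (case 2). Uniform candidate
    sampling and the value of k are illustrative assumptions."""
    candidates = rng.sample(range(num_classes), k)
    return (true_label if true_label in candidates else None), candidates

# Example: a 10-class problem whose true class is 4.
label, shown = stochastic_label(true_label=4, num_classes=10, k=3)
print(shown, "->", label)   # e.g. [3, 4, 9] -> 4, or [1, 7, 2] -> None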


Adversarial Training with Complementary Labels: On the Benefit of Gradually Informative Attacks

Zhou, Jianan, Zhu, Jianing, Zhang, Jingfeng, Liu, Tongliang, Niu, Gang, Han, Bo, Sugiyama, Masashi

arXiv.org Artificial Intelligence

Adversarial training (AT) with imperfect supervision is significant but receives limited attention. To push AT towards more practical scenarios, we explore a brand new yet challenging setting, i.e., AT with complementary labels (CLs), which specify a class that a data sample does not belong to. However, the direct combination of AT with existing methods for CLs results in consistent failure, whereas a simple two-stage training baseline does not. In this paper, we further explore this phenomenon and identify the underlying challenges of AT with CLs as intractable adversarial optimization and low-quality adversarial examples. To address these problems, we propose a new learning strategy using gradually informative attacks, which consists of two critical components: 1) Warm-up Attack (Warm-up) gently raises the adversarial perturbation budgets to ease the adversarial optimization with CLs; 2) Pseudo-Label Attack (PLA) incorporates the progressively informative model predictions into a corrected complementary loss. Extensive experiments demonstrate the effectiveness of our method on a range of benchmark datasets. The code is publicly available at: https://github.com/RoyalSkye/ATCL.
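The two components above lend themselves to a compact sketch: a PGD-style inner loop whose perturbation budget grows over training (Warm-up Attack) and an attack objective that mixes a complementary-label term with a confidence-weighted cross-entropy on the model's own pseudo-labels (Pseudo-Label Attack). The linear budget schedule, the uniform-assumption complementary loss, and the confidence weighting below are illustrative assumptions rather than the released implementation; see the linked repository for the authors' code.

import torch
import torch.nn.functional as F

def warmup_epsilon(epoch, total_epochs, eps_max=8/255, warmup_frac=0.5):
    """Warm-up Attack (sketch): grow the budget linearly, then hold it fixed."""
    progress = min(1.0, epoch / (warmup_frac * total_epochs))
    return eps_max * progress

def complementary_loss(logits, comp_labels):
    """Uniform-assumption complementary loss (an assumption for this sketch):
    push probability mass away from the class a sample does NOT belong to."""
    probs = F.softmax(logits, dim=1)
    p_comp = probs.gather(1, comp_labels.view(-1, 1)).squeeze(1)
    return -torch.log(1.0 - p_comp + 1e-12).mean()

def pla_attack(model, x, comp_labels, eps, steps=10, alpha=None):
    """Pseudo-Label Attack (sketch): craft perturbations against a mixture of
    the complementary loss and a confidence-weighted cross-entropy on the
    model's own pseudo-labels. Inputs are assumed to lie in [0, 1]."""
    alpha = alpha or eps / 4
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
        conf, pseudo = probs.max(dim=1)        # progressively informative predictions
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        logits = model(x_adv)
        loss = complementary_loss(logits, comp_labels) \
             + (conf * F.cross_entropy(logits, pseudo, reduction="none")).mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

In this sketch the model's softmax confidence stands in for whatever weighting schedule the full method uses to phase in the pseudo-label term as predictions become more reliable.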


How Machine Learning Can Help Unlock the World of Ancient Japan

#artificialintelligence

Humanity's rich history has left behind an enormous number of historical documents and artifacts. However, virtually none of these documents, containing stories and recorded experiences essential to our cultural heritage, can be understood by non-experts, because languages and writing systems have changed over time. For instance, archaeologists have unearthed tens of thousands of clay tablets from ancient Babylon [1], yet only a few hundred specially trained scholars can translate them. The vast majority of these documents have never been read, even though some were uncovered as far back as the 1800s. To give a further illustration of the challenge posed by this scale, a tablet from the Tale of Gilgamesh was collected in an expedition in 1851, but its significance was not brought to light until 1872.


AI Making Ancient Japanese Texts More Accessible | NVIDIA Blog

#artificialintelligence

Natural disasters aren't just threats to people and buildings; they can also erase history by destroying rare archival documents. As a safeguard, scholars in Japan are digitizing the country's centuries-old paper records, typically by taking a scan or photo of each page. But while this method preserves the content in digital form, it doesn't mean researchers will be able to read it. Millions of physical books and documents were written in an obsolete script called Kuzushiji, legible to fewer than 10 percent of Japanese humanities professors. "We end up with billions of images which will take researchers hundreds of years to look through," said Tarin Clanuwat, a researcher at Japan's ROIS-DS Center for Open Data in the Humanities.