Charoenphakdee, Nontawat
Virtual Human Generative Model: Masked Modeling Approach for Learning Human Characteristics
Oono, Kenta, Charoenphakdee, Nontawat, Bito, Kotatsu, Gao, Zhengyan, Ota, Yoshiaki, Yamaguchi, Shoichiro, Sugawara, Yohei, Maeda, Shin-ichi, Miyoshi, Kunihiko, Saito, Yuki, Tsuda, Koki, Maruyama, Hiroshi, Hayashi, Kohei
Identifying the relationship between healthcare attributes, lifestyles, and personality is vital for understanding and improving physical and mental conditions. Machine learning approaches are promising for modeling their relationships and offering actionable suggestions. In this paper, we propose the Virtual Human Generative Model (VHGM), a machine learning model for estimating attributes related to healthcare, lifestyle, and personality. VHGM is a deep generative model trained with masked modeling to learn the joint distribution of attributes conditioned on known ones. Using heterogeneous tabular datasets, VHGM learns more than 1,800 attributes efficiently. We numerically evaluate the performance of VHGM and its training techniques. As a proof of concept, we present several applications demonstrating user scenarios, such as virtual measurements of healthcare attributes and hypothesis verification of lifestyles.
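As a rough illustration of the masked-modeling training signal described above, the sketch below randomly hides a subset of tabular attributes and trains a small network to reconstruct them from the observed ones; the attribute count, network, and masking rate are hypothetical stand-ins, not the actual VHGM architecture.

# Minimal masked-modeling sketch for tabular attributes (illustrative only).
# A random subset of attributes is hidden and the model is trained to
# reconstruct them from the observed ones, mimicking "predict unknown
# attributes conditioned on known ones".
import torch
import torch.nn as nn

n_attrs, hidden = 32, 128            # toy attribute count; VHGM handles >1,800
model = nn.Sequential(
    nn.Linear(2 * n_attrs, hidden),  # observed values + mask indicator
    nn.ReLU(),
    nn.Linear(hidden, n_attrs),      # reconstruction of all attributes
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(64, n_attrs)                 # synthetic "records"
    mask = (torch.rand(64, n_attrs) < 0.3)       # hide 30% of the attributes
    x_obs = x.masked_fill(mask, 0.0)             # zero out the hidden entries
    x_hat = model(torch.cat([x_obs, mask.float()], dim=1))
    loss = ((x_hat - x)[mask] ** 2).mean()       # loss only on the hidden entries
    opt.zero_grad(); loss.backward(); opt.step()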
Diffusion models for missing value imputation in tabular data
Zheng, Shuhan, Charoenphakdee, Nontawat
Missing value imputation in machine learning is the task of accurately estimating the missing values in a dataset using the available information. For this task, several deep generative modeling methods have been proposed and demonstrated their usefulness, e.g., generative adversarial imputation networks. Recently, diffusion models have gained popularity because of their effectiveness in generative modeling of images, text, audio, etc. To our knowledge, the effectiveness of diffusion models for missing value imputation in tabular data has received little attention. Building on recent developments in diffusion models for time-series data imputation, we propose a diffusion model approach called "Conditional Score-based Diffusion Models for Tabular data" (TabCSDI). To effectively handle categorical and numerical variables simultaneously, we investigate three techniques: one-hot encoding, analog bits encoding, and feature tokenization. Experimental results on benchmark datasets demonstrate the effectiveness of TabCSDI compared with well-known existing methods and highlight the importance of the categorical embedding techniques.
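The categorical-handling techniques can be contrasted with a small sketch. Shown below are one-hot and analog bits encodings for a single toy categorical column; the exact encodings used in TabCSDI may differ in detail, and feature tokenization (mapping each category to a learned embedding vector) is omitted here.

import numpy as np

def one_hot(codes, n_categories):
    """Each category becomes an n_categories-dimensional indicator vector."""
    return np.eye(n_categories)[codes]

def analog_bits(codes, n_categories):
    """Each category index is written in binary, then mapped to {-1, +1}
    so it can be treated as a low-dimensional continuous signal."""
    n_bits = int(np.ceil(np.log2(max(n_categories, 2))))
    bits = ((codes[:, None] >> np.arange(n_bits)) & 1).astype(np.float32)
    return 2.0 * bits - 1.0

codes = np.array([0, 3, 5])        # toy categorical column with 6 categories
print(one_hot(codes, 6))           # shape (3, 6)
print(analog_bits(codes, 6))       # shape (3, 3): 3 bits suffice for 6 categories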
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Ishida, Takashi, Yamane, Ikko, Charoenphakdee, Nontawat, Niu, Gang, Sugiyama, Masashi
There is a fundamental limit on the prediction performance that a machine learning model can achieve due to the inevitable uncertainty of the prediction target. In classification problems, this can be characterized by the Bayes error, which is the best achievable error with any classifier. The Bayes error can be used as a criterion to evaluate classifiers with state-of-the-art performance and can be used to detect test set overfitting. We propose a simple and direct Bayes error estimator, where we simply take the mean of the labels that show the uncertainty of the classes. Our flexible approach enables us to perform Bayes error estimation even for weakly supervised data. In contrast to existing methods, our method is model-free and even instance-free. Moreover, it has no hyperparameters and gives a more accurate estimate of the Bayes error than classifier-based baselines. Experiments using our method suggest that a recently proposed classifier, the Vision Transformer, may have already reached the Bayes error for certain benchmark datasets.
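For binary classification with soft labels, the estimator described above reduces to averaging the per-instance uncertainty min(p, 1 - p). The snippet below is a minimal sketch of that idea, not the paper's full procedure for weakly supervised data.

import numpy as np

def bayes_error_estimate(soft_labels):
    """Direct estimate of the binary Bayes error from soft labels.

    soft_labels[i] is an estimate of P(y = 1 | x_i); the per-instance
    irreducible error is min(p, 1 - p), and the estimator is its mean.
    """
    p = np.asarray(soft_labels, dtype=float)
    return np.mean(np.minimum(p, 1.0 - p))

# Toy example: mostly confident labels with a few ambiguous instances.
print(bayes_error_estimate([0.99, 0.02, 0.95, 0.5, 0.6]))  # approximately 0.196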
Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph
Chairatanakul, Nuttapong, Sriwatanasakdi, Noppayut, Charoenphakdee, Nontawat, Liu, Xin, Murata, Tsuyoshi
Cross-lingual text classification requires task-specific training data in a high-resource source language for the same task as that of the low-resource target language. However, collecting such training data can be infeasible because of the labeling cost, task characteristics, and privacy concerns. This paper proposes an alternative solution that uses only task-independent word embeddings of high-resource languages and bilingual dictionaries. First, we construct a dictionary-based heterogeneous graph (DHG) from bilingual dictionaries. This opens the possibility of using graph neural networks for cross-lingual transfer. The remaining challenge is the heterogeneity of DHG because multiple languages are considered. To address this challenge, we propose the dictionary-based heterogeneous graph neural network (DHGNet), which effectively handles the heterogeneity of DHG by two-step aggregation: word-level and language-level aggregation. Experimental results demonstrate that our method outperforms pretrained models even though it does not have access to large corpora. Furthermore, it performs well even when dictionaries contain many incorrect translations. This robustness allows the use of a wider range of dictionaries, such as automatically constructed and crowdsourced dictionaries, which are convenient for real-world applications.
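The two-step aggregation can be pictured with a toy example: embeddings of dictionary translations are first averaged within each language, and the per-language summaries are then aggregated across languages. The mean aggregation and the tiny dictionary below are illustrative stand-ins; the actual DHGNet uses learned graph neural network layers over the full DHG.

import numpy as np

# Toy bilingual-dictionary neighborhood: a target word links to translation
# words, each belonging to some high-resource language.
embeddings = {                        # task-independent word embeddings
    ("en", "dog"): np.array([1.0, 0.0]),
    ("en", "hound"): np.array([0.9, 0.1]),
    ("de", "Hund"): np.array([0.8, 0.2]),
}
translations = [("en", "dog"), ("en", "hound"), ("de", "Hund")]

# Step 1: word-level aggregation, separately within each language.
per_language = {}
for lang, word in translations:
    per_language.setdefault(lang, []).append(embeddings[(lang, word)])
language_vectors = {lang: np.mean(vecs, axis=0) for lang, vecs in per_language.items()}

# Step 2: language-level aggregation across the per-language summaries.
node_representation = np.mean(list(language_vectors.values()), axis=0)
print(node_representation)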
A Symmetric Loss Perspective of Reliable Machine Learning
Charoenphakdee, Nontawat, Lee, Jongyeong, Sugiyama, Masashi
When minimizing the empirical risk in binary classification, it is common practice to replace the zero-one loss with a surrogate loss to make the learning objective feasible to optimize. Examples of well-known surrogate losses for binary classification include the logistic loss, hinge loss, and sigmoid loss. It is known that the choice of a surrogate loss can strongly influence the performance of the trained classifier, and therefore it should be chosen carefully. Recently, surrogate losses that satisfy a certain symmetric condition (also known as symmetric losses) have demonstrated their usefulness in learning from corrupted labels. In this article, we provide an overview of symmetric losses and their applications. First, we review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization and area under the receiver operating characteristic curve (AUC) maximization. Then, we demonstrate how the robust AUC maximization method can benefit natural language processing in a problem where we want to learn only from relevant keywords and unlabeled documents. Finally, we conclude this article by discussing future directions, including potential applications of symmetric losses for reliable machine learning and the design of non-symmetric losses that can benefit from the symmetric condition.
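The symmetric condition itself is simple: a margin loss l is symmetric if l(z) + l(-z) is a constant for every z. The snippet below checks this numerically for the sigmoid loss, which satisfies it with constant 1, and the logistic loss, which does not.

import numpy as np

def sigmoid_loss(z):
    return 1.0 / (1.0 + np.exp(z))       # symmetric: l(z) + l(-z) = 1

def logistic_loss(z):
    return np.log1p(np.exp(-z))          # not symmetric

z = np.linspace(-5, 5, 11)
print(sigmoid_loss(z) + sigmoid_loss(-z))    # constant 1 everywhere
print(logistic_loss(z) + logistic_loss(-z))  # varies with z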
On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective
Charoenphakdee, Nontawat, Vongkulbhisal, Jayakorn, Chairatanakul, Nuttapong, Sugiyama, Masashi
The focal loss has demonstrated its effectiveness in many real-world applications, such as object detection and image classification, but its theoretical understanding has so far been limited. In this paper, we first prove that the focal loss is classification-calibrated, i.e., its minimizer is guaranteed to yield the Bayes-optimal classifier, and thus the use of the focal loss in classification can be theoretically justified. However, we also prove the negative result that the focal loss is not strictly proper: the confidence score of the classifier obtained by focal loss minimization does not match the true class-posterior probability, and thus it is not reliable as a class-posterior probability estimator. To mitigate this problem, we next prove that a particular closed-form transformation of the confidence score allows us to recover the true class-posterior probability. Through experiments on benchmark datasets, we demonstrate that our proposed transformation significantly improves the accuracy of class-posterior probability estimation.
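Both results can be seen concretely in the binary case by computing the pointwise minimizer of the focal risk: the minimizing confidence q*(eta) differs from the true posterior eta (the loss is not strictly proper), but the map eta -> q* is invertible, so eta can be recovered from it. The numerical inversion below, using the standard binary focal loss with focusing parameter gamma = 2, is only an illustration of this mechanism, not the paper's exact closed-form transformation.

import numpy as np

GAMMA = 2.0                              # common choice of the focusing parameter
GRID = np.linspace(1e-4, 1 - 1e-4, 2001)

def focal_risk(q, eta):
    """Pointwise conditional risk of the binary focal loss at confidence q."""
    return eta * (1 - q) ** GAMMA * (-np.log(q)) + \
           (1 - eta) * q ** GAMMA * (-np.log(1 - q))

def optimal_confidence(eta):
    """q*(eta): the focal-risk-minimizing confidence; differs from eta itself."""
    return GRID[np.argmin(focal_risk(GRID, eta))]

# The map eta -> q*(eta) can be inverted numerically to transform a trained
# model's confidence back into a class-posterior probability estimate.
q_of_eta = np.array([optimal_confidence(eta) for eta in GRID])

def recover_posterior(q_star):
    return GRID[np.argmin(np.abs(q_of_eta - q_star))]

q_star = optimal_confidence(0.8)
print(q_star)                     # clearly not 0.8: the focal loss is not strictly proper
print(recover_posterior(q_star))  # approximately 0.8 after the inverse map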
Robust Imitation Learning from Noisy Demonstrations
Tangkaratt, Voot, Charoenphakdee, Nontawat, Sugiyama, Masashi
The goal of sequential decision making is to learn a good policy that makes good decisions (Puterman, 1994). Imitation learning (IL) is an approach that learns a policy from demonstrations, i.e., sequences of demonstrators' decisions (Schaal, 1999). Researchers have shown that a good policy can be learned efficiently from high-quality demonstrations collected from experts (Ng and Russell, 2000; Syed et al., 2008; Ziebart et al., 2010; Ho and Ermon, 2016; Sun et al., 2019). However, real-world demonstrations often have lower quality due to noise or the insufficient expertise of demonstrators, especially when humans are involved in the data collection process (Mandlekar et al., 2018). This is problematic because low-quality demonstrations can reduce the efficiency of IL both in theory and in practice (Tangkaratt et al., 2020). In this paper, we theoretically and experimentally show that IL can perform well even in the presence of noise.
Classification with Rejection Based on Cost-sensitive Classification
Charoenphakdee, Nontawat, Cui, Zhenghang, Zhang, Yivan, Sugiyama, Masashi
The goal of classification with rejection is to avoid risky misclassification in error-critical applications such as medical diagnosis and product inspection. In this paper, based on the relationship between classification with rejection and cost-sensitive classification, we propose a novel method of classification with rejection by learning an ensemble of cost-sensitive classifiers, which satisfies all the following properties for the first time: (i) it can avoid estimating class-posterior probabilities, resulting in improved classification accuracy, (ii) it allows a flexible choice of losses, including non-convex ones, (iii) it does not require complicated modifications when using different losses, (iv) it is applicable to both binary and multiclass cases, and (v) it is theoretically justifiable for any classification-calibrated loss. Experimental results demonstrate the usefulness of our proposed approach in clean-labeled, noisy-labeled, and positive-unlabeled classification.
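In the binary case, the connection to cost-sensitive classification can be illustrated as follows: train two class-weighted classifiers corresponding to thresholds c and 1 - c for rejection cost c, and reject whenever they disagree. The sketch below, using sklearn's class weighting on a synthetic dataset, is one plausible instantiation of this cost-sensitive route, not the paper's ensemble method itself.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2))
y = (X[:, 0] + 0.8 * rng.normal(size=2000) > 0).astype(int)  # noisy labels

c = 0.2  # rejection cost (meaningful rejection requires c < 0.5)

# A class-weighted classifier approximates thresholding the posterior at t,
# with weights (t for class 0, 1 - t for class 1) encoding the asymmetric costs.
def cost_sensitive_classifier(t):
    return LogisticRegression(class_weight={0: t, 1: 1.0 - t}).fit(X, y)

low, high = cost_sensitive_classifier(c), cost_sensitive_classifier(1.0 - c)

def predict_with_rejection(X_new):
    p_low, p_high = low.predict(X_new), high.predict(X_new)
    return np.where(p_low == p_high, p_low, -1)   # -1 means "reject"

preds = predict_with_rejection(X)
print("rejection rate:", np.mean(preds == -1))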
On the Calibration of Multiclass Classification with Rejection
Ni, Chenri, Charoenphakdee, Nontawat, Honda, Junya, Sugiyama, Masashi
We investigate the problem of multiclass classification with rejection, where a classifier can choose not to make a prediction in order to avoid critical misclassification. First, we consider an approach based on the simultaneous training of a classifier and a rejector, which achieves state-of-the-art performance in the binary case. We analyze this approach for the multiclass case and derive a general condition for calibration to the Bayes-optimal solution, which suggests that, unlike in the binary case, calibration is hard to achieve with general loss functions. Next, we consider another traditional approach based on confidence scores, for which existing work focuses on a specific class of losses. For this approach, we propose rejection criteria for more general losses and guarantee calibration to the Bayes-optimal solution.
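As background for the confidence-based approach, the classical plug-in rejection rule for the multiclass case can be written in a few lines: predict the argmax class and reject when the largest class-posterior estimate falls below 1 - c for rejection cost c. The snippet below only illustrates this baseline rule; the paper's contribution is rejection criteria and calibration guarantees for more general losses.

import numpy as np

def reject_option_predict(posteriors, c):
    """Chow's rule: predict the argmax class, reject (-1) if max posterior < 1 - c."""
    posteriors = np.asarray(posteriors)
    preds = posteriors.argmax(axis=1)
    confident = posteriors.max(axis=1) >= 1.0 - c
    return np.where(confident, preds, -1)

# Toy class-posterior estimates for 3 instances and 3 classes.
p = [[0.70, 0.20, 0.10],   # confident enough when c = 0.4
     [0.40, 0.35, 0.25],   # ambiguous -> rejected
     [0.05, 0.05, 0.90]]
print(reject_option_predict(p, c=0.4))   # [ 0 -1  2]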
Learning Only from Relevant Keywords and Unlabeled Documents
Charoenphakdee, Nontawat, Lee, Jongyeong, Jin, Yiping, Wanvarie, Dittaya, Sugiyama, Masashi
We consider a document classification problem in which document labels are absent and only relevant keywords of a target class and unlabeled documents are given. Although heuristic methods based on pseudo-labeling have been considered, theoretical understanding of this problem remains limited. Moreover, previous methods cannot easily incorporate well-developed techniques from supervised text classification. In this paper, we propose a theoretically guaranteed learning framework that is simple to implement and allows flexible choices of models, e.g., linear models or neural networks. We demonstrate how to effectively optimize the area under the receiver operating characteristic curve (AUC) and also discuss how to adjust the framework to optimize other well-known evaluation metrics such as accuracy and the F1-measure. Finally, we show the effectiveness of our framework on benchmark datasets.
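A simple instantiation of the AUC-maximization step looks as follows: treat keyword-matched documents as one sample, unlabeled documents as the other, and minimize a pairwise surrogate of the AUC risk. The random features, linear scorer, and logistic surrogate below are illustrative assumptions, not necessarily the choices made in the paper.

import torch

torch.manual_seed(0)
d = 50
X_keyword = torch.randn(100, d) + 0.5     # documents matched by relevant keywords
X_unlabeled = torch.randn(400, d)         # unlabeled documents (mixed relevance)

w = torch.zeros(d, requires_grad=True)    # linear document scorer
opt = torch.optim.Adam([w], lr=0.05)

for step in range(300):
    s_pos = X_keyword @ w                 # scores of keyword-matched docs
    s_unl = X_unlabeled @ w               # scores of unlabeled docs
    # Pairwise logistic surrogate of the AUC risk: every keyword-matched
    # document should be scored above every unlabeled document.
    margins = s_pos[:, None] - s_unl[None, :]
    loss = torch.nn.functional.softplus(-margins).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("surrogate AUC risk:", loss.item())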