AITopics | Shimada, Takuya

Collaborating Authors

Shimada, Takuya


Similarity-based Classification: Connecting Similarity Learning to Binary Classification

Bao, Han, Shimada, Takuya, Xu, Liyuan, Sato, Issei, Sugiyama, Masashi

arXiv.org Machine Learning, Jun-11-2020

In real-world classification problems, pairwise supervision (i.e., a pair of patterns with a binary label indicating whether they belong to the same class or not) can often be obtained at a lower cost than ordinary class labels. Similarity learning is a general framework that uses such pairwise supervision to elicit useful representations by inferring the relationship between two data points. It encompasses various important preprocessing tasks such as metric learning, kernel learning, graph embedding, and contrastive representation learning. Although the elicited representations are expected to perform well in downstream tasks such as classification, the literature has so far offered little theoretical insight. In this paper, we reveal that a specific formulation of similarity learning is strongly related to the objective of binary classification, which spurs us to learn a binary classifier without ordinary class labels by fitting the product of the real-valued prediction functions of the two patterns in a pair to their similarity. Our formulation of similarity learning not only generalizes many existing ones, but also admits an excess risk bound showing an explicit connection to classification. Finally, we empirically demonstrate the practical usefulness of the proposed method on benchmark datasets.
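The idea of fitting the product of two prediction values to a pairwise similarity can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear scorer, the logistic loss, the similarity encoding s in {+1, -1}, and the names `linear_score` and `pairwise_risk` are all hypothetical.

```python
import math

def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)), computed in a numerically stable way."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def linear_score(w, x):
    """Hypothetical real-valued prediction function f(x) = <w, x>."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_risk(w, pairs):
    """Average loss of fitting f(x) * f(x') to the similarity s in {+1, -1}.

    Each pair is (x, x_prime, s); s = +1 means "same class",
    s = -1 means "different classes". No class labels are used.
    """
    total = 0.0
    for x, x_prime, s in pairs:
        margin = s * linear_score(w, x) * linear_score(w, x_prime)
        total += logistic_loss(margin)
    return total / len(pairs)

# Toy pairs (illustrative data, not from the paper).
pairs = [
    ((2.0, 0.0), (1.5, 0.5), +1),
    ((-2.0, 0.0), (-1.0, -0.5), +1),
    ((2.0, 0.0), (-1.5, 0.0), -1),
]
good_w = (1.0, 0.0)   # separates the toy points along the first axis
bad_w = (0.0, 1.0)    # ignores the informative axis

print(pairwise_risk(good_w, pairs) < pairwise_risk(bad_w, pairs))
```

Because the loss depends only on the product f(x) f(x'), replacing f with -f leaves the risk unchanged, so a predictor learned this way is determined only up to a global sign flip of the class assignment.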

artificial intelligence, classification, machine learning, (14 more...)

arXiv.org Machine Learning

2006.06207

Country:

Asia > Japan (0.14)
North America > United States (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)


Data Interpolating Prediction: Alternative Interpretation of Mixup

Shimada, Takuya, Yamaguchi, Shoichiro, Hayashi, Kohei, Kobayashi, Sosuke

arXiv.org Machine Learning, Jun-19-2019

Data augmentation by mixing samples, such as Mixup, has been widely used, typically for classification tasks. However, this strategy is not always effective because of the gap between the augmented samples used for training and the original samples used for testing. This gap may prevent a classifier from learning the optimal decision boundary and may increase the generalization error. To overcome this problem, we propose an alternative framework called Data Interpolating Prediction (DIP). Unlike common data augmentations, we encapsulate the sample-mixing process in the hypothesis class of a classifier so that training and test samples are treated equally. We derive a generalization bound and show that DIP helps to reduce the original Rademacher complexity. We also empirically demonstrate that DIP can outperform the existing Mixup.
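A rough sketch may help make "encapsulating the sample-mixing process in the hypothesis class" concrete. The abstract does not give the exact construction, so the code below rests on stated assumptions: the wrapped predictor averages a base model over the input mixed with a fixed set of reference samples, using a fixed mixing weight `lam`. The names `mix`, `interpolating_predict`, and `base_model` are hypothetical. Only the convex combination in `mix` is the standard Mixup operation on inputs.

```python
def mix(x, x_ref, lam):
    """Mixup-style convex combination lam * x + (1 - lam) * x_ref."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x, x_ref)]

def interpolating_predict(base_model, x, references, lam):
    """Average the base model over x mixed with each reference sample.

    Assumption: the same mixing is applied to any input, so training and
    test samples pass through the identical prediction path.
    """
    outputs = [base_model(mix(x, r, lam)) for r in references]
    return sum(outputs) / len(outputs)

def base_model(x):
    # Hypothetical linear scorer standing in for a trained classifier.
    return 2.0 * x[0] - 1.0 * x[1]

references = [[0.0, 0.0], [2.0, 2.0]]  # illustrative reference samples
score = interpolating_predict(base_model, [1.0, 0.0], references, lam=0.5)
print(score)
```

In this sketch the mixing lives inside the predictor rather than in a training-only augmentation step, which is the design point the abstract emphasizes. How the reference samples and mixing weights are actually chosen is left to the paper.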

alternative interpretation, mixup, prediction

arXiv.org Machine Learning

1906.08412

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)


Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk Minimization

Shimada, Takuya, Bao, Han, Sato, Issei, Sugiyama, Masashi

arXiv.org Machine Learning, Apr-26-2019

In supervised classification, we need a vast amount of labeled training data to train our classifiers. However, labels are often hard to obtain because of high labeling costs [Chapelle et al., 2010], privacy concerns [Warner, 1965], social bias [Nederhof, 1985], and the inherent difficulty of labeling some data. For these reasons, in some real-world classification problems, pairwise similarities (i.e., pairs of samples in the same class) and pairwise dissimilarities (i.e., pairs of samples in different classes) might be easier to collect than fully labeled data. For example, in the task of protein function prediction [Klein et al., 2002], knowledge about similarities/dissimilarities can be obtained by experimental means and used as additional supervision. To handle such pairwise information, similar-unlabeled (SU) classification [Bao et al., 2018] has been proposed, where the classification risk is estimated in an unbiased fashion from only similar pairs and unlabeled data. Although that work assumed that only similar pairs and unlabeled data are available, we may also obtain dissimilar pairs in practice. In this case, a method that can handle similarities, dissimilarities, and unlabeled data together is desirable. Semi-supervised clustering [Wagstaff et al., 2001] is one method that can handle both similar and dissimilar pairs, where must-link pairs (i.e., similar pairs) and cannot-link pairs (i.e., dissimilar pairs) are used to obtain meaningful clusters.
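To make the data format concrete, the following sketch shows one way similar (must-link) and dissimilar (cannot-link) pairs relate to hidden class labels. It is an illustration only: the function name `build_pairs`, the toy samples, and the use of all sample pairs are assumptions, and in practice the pairs would be collected directly without anyone observing the labels.

```python
from itertools import combinations

def build_pairs(samples, hidden_labels):
    """Split all sample pairs into similar and dissimilar pairs.

    A pair is "similar" when both samples share a class and "dissimilar"
    otherwise. The learner would see only the pairs, never the labels.
    """
    similar, dissimilar = [], []
    for i, j in combinations(range(len(samples)), 2):
        pair = (samples[i], samples[j])
        if hidden_labels[i] == hidden_labels[j]:
            similar.append(pair)
        else:
            dissimilar.append(pair)
    return similar, dissimilar

samples = ["a", "b", "c", "d"]      # placeholder sample identifiers
hidden_labels = [+1, +1, -1, -1]    # used only to simulate pair collection
similar, dissimilar = build_pairs(samples, hidden_labels)
print(similar)
print(dissimilar)
```

Samples that appear in no pair play the role of the unlabeled data mentioned in the text.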

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Machine Learning

1904.11717

Country: Asia > Japan (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
