AITopics | plugin classifier

Collaborating Authors

plugin classifier

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Binary Classification with Bounded Abstention Rate

Shekhar, Shubhanshu, Ghavamzadeh, Mohammad, Javidi, Tara

arXiv.org Machine LearningMay-23-2019

We consider the problem of binary classification with abstention in the relatively less studied \emph{bounded-rate} setting. We begin by obtaining a characterization of the Bayes optimal classifier for an arbitrary input-label distribution $P_{XY}$. Our result generalizes and provides an alternative proof for the result first obtained by \cite{chow1957optimum}, and then re-derived by \citet{denis2015consistency}, under a continuity assumption on $P_{XY}$. We then propose a plug-in classifier that employs unlabeled samples to decide the region of abstention and derive an upper-bound on the excess risk of our classifier under standard \emph{H\"older smoothness} and \emph{margin} assumptions. Unlike the plug-in rule of \citet{denis2015consistency}, our constructed classifier satisfies the abstention constraint with high probability and can also deal with discontinuities in the empirical cdf. We also derive lower-bounds that demonstrate the minimax near-optimality of our proposed algorithm. To address the excessive complexity of the plug-in classifier in high dimensions, we propose a computationally efficient algorithm that builds upon prior work on convex loss surrogates, and obtain bounds on its excess risk in the \emph{realizable} case. We empirically compare the performance of the proposed algorithm with a baseline on a number of UCI benchmark datasets.

artificial intelligence, classifier, machine learning, (17 more...)

arXiv.org Machine Learning

1905.09561

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)

Add feedback

A plug-in approach to maximising precision at the top and recall at the top

Tasche, Dirk

arXiv.org Machine LearningApr-9-2018

Information retrieval and binary classification can be considered equivalent problems in principle. Information retrieval means to mark documents in a set of candidate documents as relevant or non-relevant for some question, on the basis of the properties of the documents. For binary classification, the problem is to distinguish between the'positive' and'negative' instances from a dataset, based on the features of the instances. Hence, from an abstract point of view, information retrieval is a special case of binary classification, with the documents being instances, the document properties being features and'relevant' being translated as'positive'. In practice, however, the general concepts from binary classification are not always helpful for information retrieval applications. The fact that often the proportion of relevant documents in a set of documents subject to a search is small or even very small is only one of the reasons for information retrieval to be considered a field of research for its own. As a consequence, some performance measures for information retrieval methods differ from those in use for binary classifiers or are called by different names. Precision and recall are possibly the most popular performance measures(see Chapter 8 of Manning et al., 2008, for a list of performance measures) for information retrieval methods: - Precision is the proportion of documents (instances) that are truly relevant (positive) among those documents which have been predicted relevant (positive). The term precision is also commonly used (with the same meaning) in binary classification.

classifier, information retrieval, machine learning, (19 more...)

arXiv.org Machine Learning

1804.03077

Country: Asia > Japan (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Pointwise adaptation via stagewise aggregation of local estimates for multiclass classification

Puchkin, Nikita, Spokoiny, Vladimir

arXiv.org Machine LearningApr-8-2018

We consider a problem of multiclass classification, where the training sample $S_n = \{(X_i, Y_i)\}_{i=1}^n$ is generated from the model $\mathbb p(Y = m | X = x) = \theta_m(x)$, $1 \leq m \leq M$, and $\theta_1(x), \dots, \theta_M(x)$ are unknown Lipschitz functions. Given a test point $X$, our goal is to estimate $\theta_1(X), \dots, \theta_M(X)$. An approach based on nonparametric smoothing uses a localization technique, i.e. the weight of observation $(X_i, Y_i)$ depends on the distance between $X_i$ and $X$. However, local estimates strongly depend on localizing scheme. In our solution we fix several schemes $W_1, \dots, W_K$, compute corresponding local estimates $\widetilde\theta^{(1)}, \dots, \widetilde\theta^{(K)}$ for each of them and apply an aggregation procedure. We propose an algorithm, which constructs a convex combination of the estimates $\widetilde\theta^{(1)}, \dots, \widetilde\theta^{(K)}$ such that the aggregated estimate behaves approximately as well as the best one from the collection $\widetilde\theta^{(1)}, \dots, \widetilde\theta^{(K)}$. We also study theoretical properties of the procedure, prove oracle results and establish rates of convergence under mild assumptions.

artificial intelligence, machine learning, spokoiny pointwise adaptation, (16 more...)

arXiv.org Machine Learning

1804.02756

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

Scalable Optimization of Multivariate Performance Measures in Multi-instance Multi-label Learning

Aggarwal, Apoorv (Indian Institute of Technology Bombay) | Ghoshal, Sandip (Indian Institute of Technology Bombay) | Shetty, Ankith M. S. (Indian Institute of Technology Bombay) | Sinha, Suhit (Indian Institute of Technology Bombay) | Ramakrishnan, Ganesh (Indian Institute of Technology Bombay) | Kar, Purushottam (Indian Institute of Technology Kanpur) | Jain, Prateek (Microsoft Research )

AAAI ConferencesFeb-14-2017

The problem of multi-instance multi-label learning (MIML) requires a bag of instances to be assigned a set of labels most relevant to the bag as a whole. The problem finds numerous applications in machine learning, computer vision, and natural language processing settings where only partial or distant supervision is available. We present a novel method for optimizing multivariate performance measures in the MIML setting. Our approach MIML-perf uses a novel plug-in technique and offers a seamless way to optimize a vast variety of performance measures such as macro and micro-F measure, average precision, which are performance measures of choice in multi-label learning domains. MIML-perf offers two key benefits over the state of the art. Firstly, across a diverse range of benchmark tasks, ranging from relation extraction to text categorization and scene classification, MIML-perf offers superior performance as compared to state of the art methods designed specifically for these tasks. Secondly, MIML-perf operates with significantly reduced running times as compared to other methods, often by an order of magnitude or more.

artificial intelligence, machine learning, natural language, (16 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Genre: Research Report > Promising Solution (0.68)

Industry: Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Learning From Non-iid Data: Fast Rates for the One-vs-All Multiclass Plug-in Classifiers

Dinh, Vu, Ho, Lam Si Tung, Cuong, Nguyen Viet, Nguyen, Duy, Nguyen, Binh T.

arXiv.org Machine LearningJan-24-2015

We prove new fast learning rates for the one-vs-all multiclass plug-in classifiers trained either from exponentially strongly mixing data or from data generated by a converging drifting distribution. These are two typical scenarios where training data are not iid. The learning rates are obtained under a multiclass version of Tsybakov's margin assumption, a type of low-noise assumption, and do not depend on the number of classes. Our results are general and include a previous result for binary-class plug-in classifiers with iid data as a special case. In contrast to previous works for least squares SVMs under the binary-class setting, our results retain the optimal learning rate in the iid case.

artificial intelligence, machine learning, plugin classifier, (14 more...)

arXiv.org Machine Learning

1408.2714

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Yang, Yingzhen, Liang, Feng, Yan, Shuicheng, Wang, Zhangyang, Huang, Thomas S.

Neural Information Processing SystemsDec-31-2014

Pairwise clustering methods partition the data space into clusters by the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class classification. This pairwise clustering framework learns an unsupervised nonparametric classifier from each data partition, and search for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions. We consider two nonparametric classifiers in this framework, i.e. the nearest neighbor classifier and the plug-in classifier. Modeling the underlying data distribution by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are the sum of nonparametric pairwise similarity terms between the data points for the purpose of clustering. Under uniform distribution, the nonparametric similarity terms induced by both unsupervised classifiers exhibit a well known form of kernel similarity. We also prove that the generalization error bound for the unsupervised plug-in classifier is asymptotically equal to the weighted volume of cluster boundary for Low Density Separation, a widely used criteria for semi-supervised learning and clustering. Based on the derived nonparametric pairwise similarity using the plug-in classifier, we propose a new nonparametric exemplar-based clustering method with enhanced discriminative capability, whose superiority is evidenced by the experimental results.

artificial intelligence, classifier, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback