AITopics

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.25)

Genre: Personal > Interview (0.61)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.71)

#artificialintelligence Aug-14-2016, 19:15:39 GMT

1.6. Nearest Neighbors -- scikit-learn 0.17.1 documentation

Unsupervised nearest neighbors is the foundation of many other learning methods, notably manifold learning and spectral clustering. Supervised neighbors-based learning comes in two flavors: classification for data with discrete labels, and regression for data with continuous labels. The principle behind nearest neighbor methods is to find a predefined number of training samples closest in distance to the new point, and predict the label from these. The number of samples can be a user-defined constant (k-nearest neighbor learning), or vary based on the local density of points (radius-based neighbor learning). The distance can, in general, be any metric measure: standard Euclidean distance is the most common choice. Neighbors-based methods are known as non-generalizing machine learning methods, since they simply "remember" all of their training data (possibly transformed into a fast indexing structure such as a Ball Tree or KD Tree). Despite its simplicity, nearest neighbors has been successful in a large number of classification and regression problems, including handwritten digits and satellite image scenes.
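As a rough illustration of the principle described in this snippet, here is a minimal brute-force sketch in plain Python. It is not the scikit-learn implementation; the function names `euclidean` and `knn_predict` and the toy data are assumptions made for this example, and no Ball Tree or KD Tree index is used.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Standard Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Brute force: rank every stored training sample by its distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: euclidean(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small, well-separated groups.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0), k=3))
print(knn_predict(train_points, train_labels, (5.1, 5.0), k=3))
```

The model here is just the stored training set, which is what "non-generalizing" refers to; an indexing structure would only speed up the neighbor search, not change the prediction.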

artificial intelligence, machine learning, neighbor, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Frossard, Davi E. N., Nunes, Igor O., Krohling, Renato A.

An approach to dealing with missing values in heterogeneous data using k-nearest neighbors

arXiv.org Machine Learning Aug-13-2016

Techniques such as clustering, neural networks and decision making usually rely on algorithms that are not well suited to deal with missing values. However, real-world data frequently contains such cases. The simplest solution is either to substitute the missing values with a best-guess value or to disregard them completely. Unfortunately, both approaches can lead to biased results. In this paper, we propose a technique for dealing with missing values in heterogeneous data using imputation based on the k-nearest neighbors algorithm. It can handle real (which we refer to as crisp henceforward), interval and fuzzy data. The effectiveness of the algorithm is tested on several datasets and the numerical results are promising.
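To give a feel for k-nearest-neighbor imputation, the sketch below fills missing entries in crisp (real-valued) data only. It is not the authors' algorithm: the interval and fuzzy cases are not covered, and the function name `knn_impute`, the distance convention (Euclidean over columns observed in both rows), and the toy data are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows."""

    def distance(a, b):
        # Compare only the columns that are observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        # Normalize by the number of shared columns so that rows with
        # fewer overlapping values are not favored.
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = [r for m, r in enumerate(rows) if m != i and r[j] is not None]
            donors.sort(key=lambda r: distance(row, r))
            nearest = donors[:k]
            if nearest:
                completed[i][j] = sum(r[j] for r in nearest) / len(nearest)
    return completed


# Hypothetical toy data with one missing value (None).
data = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [0.9, 2.2, 2.9],
    [8.0, 9.0, 10.0],
]
filled = knn_impute(data, k=2)
print(filled[1])
```

Here the missing entry is taken from the two rows closest to the incomplete one, so the distant fourth row does not influence the imputed value.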

algorithm, artificial intelligence, machine learning, (18 more...)

1608.04037

Country:

South America > Brazil (0.15)
North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

arXiv.org Machine Learning Aug-12-2016

Content-based image retrieval tutorial

Mitro, Joani

This paper functions as a tutorial for individuals who are interested in entering the field of information retrieval but do not know where to begin. It describes two fundamental yet efficient image retrieval techniques, the first being k-nearest neighbors (kNN) and the second support vector machines (SVM). The goal is to provide the reader with both the theoretical and practical aspects in order to acquire a better understanding. Along with this tutorial we have also developed the equivalent software using the MATLAB environment in order to illustrate the techniques, so that the reader can have a hands-on experience.

algorithm, artificial intelligence, machine learning, (17 more...)

1608.03811

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Gao, Weihao, Oh, Sewoong, Viswanath, Pramod

Demystifying Fixed k-Nearest Neighbor Information Estimators

arXiv.org Machine Learning Aug-10-2016

Estimating mutual information from i.i.d. samples drawn from an unknown joint density function is a basic statistical problem of broad interest with multitudinous applications. The most popular estimator is one proposed by Kraskov, Stögbauer, and Grassberger (KSG) in 2004, and is nonparametric and based on the distances of each sample to its $k^{\rm th}$ nearest neighboring sample, where $k$ is a fixed small integer. Despite its widespread use (part of scientific software packages), theoretical properties of this estimator have been largely unexplored. In this paper we demonstrate that the estimator is consistent and also identify an upper bound on the rate of convergence of the bias as a function of number of samples. We argue that the superior performance benefits of the KSG estimator stem from a curious "correlation boosting" effect and build on this intuition to modify the KSG estimator in novel ways to construct a superior estimator. As a byproduct of our investigations, we obtain nearly tight rates of convergence of the $\ell_2$ error of the well known fixed $k$ nearest neighbor estimator of differential entropy by Kozachenko and Leonenko.
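For orientation, the KSG estimator mentioned in this abstract is commonly written in the following form (the first of the two variants in the 2004 paper, as it is usually stated). The symbols $N$, $\psi$, $\epsilon_i$, $n_{x,i}$ and $n_{y,i}$ are notation chosen for this note and do not appear in the abstract itself.

```latex
% N samples (x_i, y_i); \psi is the digamma function.
% \epsilon_i / 2 is the max-norm distance from (x_i, y_i) to its k-th nearest neighbor
% in the joint space; n_{x,i} (resp. n_{y,i}) counts the samples whose x-coordinate
% (resp. y-coordinate) lies strictly within \epsilon_i / 2 of x_i (resp. y_i).
\hat{I}_{\mathrm{KSG}}(X;Y)
  = \psi(k) + \psi(N)
  - \frac{1}{N} \sum_{i=1}^{N} \bigl[ \psi(n_{x,i} + 1) + \psi(n_{y,i} + 1) \bigr]
```

The fixed small $k$ sets the neighborhood size in the joint space, and the marginal counts $n_{x,i}$ and $n_{y,i}$ adapt to it sample by sample.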

artificial intelligence, estimator, machine learning, (17 more...)

1604.03006

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.86)

#artificialintelligence Jul-25-2016, 00:21:01 GMT

How To Use Classification Machine Learning Algorithms in Weka - Machine Learning Mastery

Weka makes a large number of classification algorithms available. The large number of machine learning algorithms available is one of the benefits of using the Weka platform to work through your machine learning problems. In this post you will discover how to use 5 top machine learning algorithms in Weka. We are going to take a tour of 5 top classification algorithms in Weka.

algorithm, artificial intelligence, machine learning, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

Singh, Shashank, Póczos, Barnabás

Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation

arXiv.org Machine Learning Jul-21-2016

Estimating entropy and mutual information consistently is important for many machine learning applications. The Kozachenko-Leonenko (KL) estimator (Kozachenko & Leonenko, 1987) is a widely used nonparametric estimator for the entropy of multivariate continuous random variables, as well as the basis of the mutual information estimator of Kraskov et al. (2004), perhaps the most widely used estimator of mutual information in this setting. Despite the practical importance of these estimators, major theoretical questions regarding their finite-sample behavior remain open. This paper proves finite-sample bounds on the bias and variance of the KL estimator, showing that it achieves the minimax convergence rate for certain classes of smooth functions. In proving these bounds, we analyze finite-sample behavior of k-nearest neighbors (k-NN) distance statistics (on which the KL estimator is based). We derive concentration inequalities for k-NN distances and a general expectation bound for statistics of k-NN distances, which may be useful for other analyses of k-NN methods.
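The k-NN distance statistic behind the KL estimator can be made concrete with a small sketch. It uses the commonly stated form of the estimator, H ≈ ψ(N) − ψ(k) + log V_d + (d/N) Σ log r_i, where r_i is the Euclidean distance from sample i to its k-th nearest neighbor and V_d is the volume of the d-dimensional unit ball. The function names, the brute-force neighbor search, and the Gaussian test data are assumptions made for this illustration, not code from the paper.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy(samples, k=3):
    """Kozachenko-Leonenko entropy estimate (in nats) for a list of d-dimensional points."""
    n = len(samples)
    d = len(samples[0])
    # Volume of the d-dimensional Euclidean unit ball.
    unit_ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    log_dist_sum = 0.0
    for i, x in enumerate(samples):
        # Brute-force distances from x to every other sample, sorted ascending.
        dists = sorted(
            math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
            for j, y in enumerate(samples) if j != i
        )
        log_dist_sum += math.log(dists[k - 1])  # distance to the k-th nearest neighbor
    return (digamma_int(n) - digamma_int(k)
            + math.log(unit_ball_volume) + d * log_dist_sum / n)


# Sanity check on hypothetical data: 1-D standard normal samples.
random.seed(0)
gaussian = [(random.gauss(0.0, 1.0),) for _ in range(500)]
estimate = kl_entropy(gaussian, k=3)
true_value = 0.5 * math.log(2 * math.pi * math.e)  # entropy of N(0, 1) in nats
print(estimate, true_value)
```

The estimate depends on the data only through the k-th neighbor distances, which is why concentration results for those distances translate into bias and variance bounds for the estimator.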

artificial intelligence, estimator, machine learning, (15 more...)

1603.08578

Country: North America > United States (0.46)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

#artificialintelligence Jun-26-2016, 19:30:12 GMT

K-Nearest Neighbor Machine Learning algorithm

The German credit dataset can be downloaded from the UC Irvine Machine Learning Repository. The predicted outcome indicates whether or not the loan applicant defaulted.

artificial intelligence, installment, k-nearest neighbor machine learning algorithm, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

@machinelearnbot Jun-11-2016, 20:21:45 GMT

I am looking for Supervised Learning project to work upon? • /r/MachineLearning

Hey, thank you for the input, but I am new to the field. I started this summer. So far I have only worked on linear and logistic regression and k-nearest neighbors. I'm looking forward to working on a project with these algorithms before I go deeper into the field.

artificial intelligence, machine learning, supervised learning project, (1 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.93)

#artificialintelligence Jun-7-2016, 21:55:57 GMT

Bayesian Optimization of Machine Learning Models

Many predictive and machine learning models have structural or tuning parameters that cannot be directly estimated from the data. For example, when using a K-nearest neighbor model, there is no analytical estimator for K (the number of neighbors). Typically, resampling is used to get good performance estimates of the model for each of a given set of values for K, and the value associated with the best results is used. This is basically a grid search procedure. However, there are other approaches that can be used.
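The grid search procedure described here can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the article's code: leave-one-out is used as the resampling scheme, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(points, labels, query, k):
    """Majority vote among the k training points nearest to `query`."""
    order = sorted(
        range(len(points)),
        key=lambda i: math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], query))),
    )
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


def loo_accuracy(points, labels, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_points, rest_labels, points[i], k) == labels[i]:
            hits += 1
    return hits / len(points)


def grid_search_k(points, labels, candidates):
    """Score every candidate K by resampling and return the best one with all scores."""
    scores = {k: loo_accuracy(points, labels, k) for k in candidates}
    best_k = max(scores, key=scores.get)  # first candidate with the top score
    return best_k, scores


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a"] * 4 + ["b"] * 4

best_k, scores = grid_search_k(points, labels, candidates=[1, 3, 7])
print(best_k, scores)
```

With K = 7 every held-out point is outvoted by the opposite group, so the resampled score collapses; that is the kind of behavior a search over K is meant to expose.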

artificial intelligence, bayesian optimization, machine learning model, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.56)