 Hatko, Stan


k-Nearest Neighbour Classification of Datasets with a Family of Distances

arXiv.org Machine Learning

The $k$-nearest neighbour ($k$-NN) classifier is one of the oldest and most important supervised learning algorithms for classifying datasets. Traditionally, the Euclidean norm is used as the distance for the $k$-NN classifier. In this thesis we investigate the use of alternative distances for the $k$-NN classifier. We begin by introducing some background notions from statistical machine learning. We define the $k$-NN classifier and discuss Stone's theorem and the proof that $k$-NN is universally consistent on the normed space $\mathbb{R}^d$. We then prove that $k$-NN remains universally consistent if we take a sequence of random norms (independent of the sample and the query) from a family of norms satisfying a particular boundedness condition. We extend this result by replacing norms with distances based on uniformly locally Lipschitz functions that satisfy certain conditions. We discuss the limitations of Stone's lemma and Stone's theorem, particularly with respect to quasinorms and to adaptively choosing a distance for $k$-NN based on the labelled sample. We show the universal consistency of a two-stage $k$-NN-type classifier in which the distance is selected adaptively based on a split labelled sample and the query. We conclude with examples in which these techniques improve the accuracy of classifying various datasets.
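
As a concrete illustration of the split-sample idea, here is a minimal NumPy sketch of a two-stage scheme: a $p$-norm (with $p \geq 1$, so each candidate is a genuine norm rather than a quasinorm) is selected by validation accuracy on one half of the labelled sample, and $k$-NN is then run with that norm using the other half. The names (`knn_predict`, `two_stage_knn`) and the candidate family `ps` are illustrative assumptions, and the selection here is global rather than query-dependent, a simplification of the two-stage classifier described in the abstract.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k, dist):
    """Classify each query point by majority vote among its k nearest
    neighbours under the supplied distance function."""
    preds = []
    for q in X_query:
        d = dist(X_train, q)          # distances from q to every sample point
        nn = np.argsort(d)[:k]        # indices of the k nearest neighbours
        labels, counts = np.unique(y_train[nn], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

def p_norm_dist(p):
    """Distance induced by the p-norm (a norm for p >= 1)."""
    return lambda X, q: np.linalg.norm(X - q, ord=p, axis=1)

def two_stage_knn(X, y, X_query, k=5, ps=(1, 2, 3, np.inf), seed=0):
    """Two-stage scheme: choose a distance on one half of the split
    labelled sample, then run k-NN with it using the other half."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    sel, fit = idx[: len(X) // 2], idx[len(X) // 2:]
    best_p = max(
        ps,
        key=lambda p: np.mean(
            knn_predict(X[fit], y[fit], X[sel], k, p_norm_dist(p)) == y[sel]
        ),
    )
    return knn_predict(X[fit], y[fit], X_query, k, p_norm_dist(best_p))
```

Keeping the selection half and the classification half disjoint is what makes the consistency argument go through: the distance, once chosen, is independent of the sample that the final $k$-NN classifier actually uses.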


Borel Isomorphic Dimensionality Reduction of Data and Supervised Learning

arXiv.org Machine Learning

In this project we further investigate the idea of reducing the dimensionality of datasets using a Borel isomorphism, with the purpose of subsequently applying supervised learning algorithms, as originally suggested by my supervisor V. Pestov (in a 2011 Dagstuhl preprint). Any consistent learning algorithm, for example $k$-NN, retains universal consistency after a Borel isomorphism is applied. We provide a series of concrete examples of Borel isomorphisms that reduce the number of dimensions in a dataset, based on multiplying the data by orthogonal matrices before the dimensionality-reducing Borel isomorphism is applied. We test the accuracy of the resulting classifier in the lower-dimensional space on various datasets. Working with a phoneme voice recognition dataset of dimension 256 with 5 classes (phonemes), we show that a Borel isomorphic reduction to dimension 16 leads to a minimal drop in accuracy. In conclusion, we discuss further prospects of the method.
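
The pipeline the abstract describes can be sketched as follows: multiply the data by an orthogonal matrix, rescale each coordinate into $[0, 1)$, and compress groups of coordinates with the classical digit-interleaving Borel isomorphism. The sketch below is a minimal NumPy version under stated assumptions: the function names are hypothetical, the orthogonal matrix is drawn at random via a QR decomposition (one possible choice, not necessarily the project's), and the interleaving is truncated to finitely many binary digits, so it is only an approximation of the true Borel isomorphism.

```python
import numpy as np

def interleave_bits(block, n_bits=16):
    """Compress g features in [0, 1) into one number in [0, 1) by
    interleaving their binary digits.  With infinitely many digits this
    is a Borel isomorphism [0,1)^g -> [0,1); truncating to n_bits per
    feature gives a practical approximation."""
    x = np.array(block, dtype=float, copy=True)   # shape (n_samples, g)
    out = np.zeros(x.shape[0])
    scale = 0.5
    for _ in range(n_bits):
        for j in range(x.shape[1]):
            x[:, j] *= 2.0
            bit = np.floor(x[:, j])   # next binary digit of feature j
            x[:, j] -= bit
            out += bit * scale
            scale *= 0.5
    return out

def borel_reduce(X, target_dim, seed=0, n_bits=16):
    """Rotate by a random orthogonal matrix, rescale features into [0, 1),
    then interleave digits within consecutive groups of coordinates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
    Z = X @ Q
    lo, hi = Z.min(axis=0), Z.max(axis=0)
    Z = (Z - lo) / (hi - lo + 1e-12)                  # each feature now in [0, 1)
    g = d // target_dim                               # assumes target_dim divides d
    Z = Z[:, : g * target_dim].reshape(n, target_dim, g)
    return np.column_stack(
        [interleave_bits(Z[:, i, :], n_bits) for i in range(target_dim)]
    )
```

In the phoneme setting described above, `borel_reduce(X, 16)` would map the 256 coordinates into 16 (groups of $g = 16$), after which a $k$-NN classifier can be trained on the reduced data.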