AITopics | nearest neighbor method

Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection

Neural Information Processing Systems, Dec-25-2025, 15:36:22 GMT

Nearest-neighbor (NN) procedures are well studied and widely used in both supervised and unsupervised learning problems. In this paper we are concerned with investigating the performance of NN-based methods for anomaly detection. We first show through extensive simulations that NN methods compare favorably to some of the other state-of-the-art algorithms for anomaly detection based on a set of benchmark synthetic datasets. We further consider the performance of NN methods on real datasets, and relate it to the dimensionality of the problem. Next, we analyze the theoretical properties of NN-methods for anomaly detection by studying a more general quantity called distance-to-measure (DTM), originally developed in the literature on robust geometric and topological inference. We provide finite-sample uniform guarantees for the empirical DTM and use them to derive misclassification rates for anomalous observations under various settings. In our analysis we rely on Huber's contamination model and formulate mild geometric regularity assumptions on the underlying distribution of the data.
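To make the two scores in this abstract concrete, here is a minimal sketch in pure Python. It assumes Euclidean distance and the commonly used power-2 form of the empirical DTM (the root mean square of the distances to the k nearest sample points); the paper may use a more general definition. The function names and the toy data are illustrative assumptions, not taken from the paper.

```python
import math


def knn_distances(x, sample, k):
    """Return the k smallest Euclidean distances from x to points in sample."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and len(sample)")
    return sorted(math.dist(x, p) for p in sample)[:k]


def knn_score(x, sample, k):
    """Classical kNN anomaly score: distance to the k-th nearest sample point."""
    return knn_distances(x, sample, k)[-1]


def empirical_dtm(x, sample, k):
    """Empirical distance-to-measure (power 2, an assumption of this sketch):
    root mean square of the distances to the k nearest sample points."""
    d = knn_distances(x, sample, k)
    return math.sqrt(sum(v * v for v in d) / len(d))


# Toy data (hypothetical): a tight cluster plus one contaminating point.
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
inlier_query = (0.05, 0.05)
outlier_query = (4.0, 4.0)

for name, q in [("inlier", inlier_query), ("outlier", outlier_query)]:
    print(name, knn_score(q, sample, 3), empirical_dtm(q, sample, 3))
```

Because the DTM averages over all k neighbors rather than reading off only the k-th distance, it never exceeds the kNN score for the same k. The query points here are not members of the sample; scoring a sample point against itself would need the zero self-distance excluded.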

name change, nearest neighbor method, statistical analysis, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

N$^2$: A Unified Python Package and Test Bench for Nearest Neighbor-Based Matrix Completion

Chin, Caleb, Khubchandani, Aashish, Maskara, Harshvardhan, Choi, Kyuseong, Feitelberg, Jacob, Gong, Albert, Paul, Manit, Sadhukhan, Tathagata, Agarwal, Anish, Dwivedi, Raaz

arXiv.org Machine Learning, Jun-5-2025

Nearest neighbor (NN) methods have re-emerged as competitive tools for matrix completion, offering strong empirical performance and recent theoretical guarantees, including entry-wise error bounds, confidence intervals, and minimax optimality. Despite their simplicity, recent work has shown that NN approaches are robust to a range of missingness patterns and effective across diverse applications. This paper introduces N$^2$, a unified Python package and testbed that consolidates a broad class of NN-based methods through a modular, extensible interface. Built for both researchers and practitioners, N$^2$ supports rapid experimentation and benchmarking. Using this framework, we introduce a new NN variant that achieves state-of-the-art results in several settings. We also release a benchmark suite of real-world datasets, from healthcare and recommender systems to causal inference and LLM evaluation, designed to stress-test matrix completion methods beyond synthetic scenarios. Our experiments demonstrate that while classical methods excel on idealized data, NN-based techniques consistently outperform them in real-world settings.
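As a rough illustration of the idea behind NN-based matrix completion, the sketch below imputes a missing entry by averaging that column's values over the k most similar rows. It does not use the N$^2$ package or its API; the function names, the row-wise variant, the mean-squared-difference row distance, and the toy matrix are all assumptions made for this example.

```python
def row_distance(a, b, skip_col):
    """Mean squared difference over columns observed in both rows,
    ignoring skip_col. Returns None when the rows share no observed column."""
    common = [
        (u, v)
        for c, (u, v) in enumerate(zip(a, b))
        if c != skip_col and u is not None and v is not None
    ]
    if not common:
        return None
    return sum((u - v) ** 2 for u, v in common) / len(common)


def nn_impute(matrix, i, j, k=2):
    """Estimate entry (i, j) as the average of column j over the k rows
    closest to row i, among rows that observe column j."""
    candidates = []
    for r, row in enumerate(matrix):
        if r == i or row[j] is None:
            continue
        d = row_distance(matrix[i], row, j)
        if d is not None:
            candidates.append((d, row[j]))
    if not candidates:
        return None  # no comparable row observes column j
    candidates.sort(key=lambda t: t[0])
    nearest = candidates[:k]
    return sum(v for _, v in nearest) / len(nearest)


# Toy ratings matrix (hypothetical); None marks a missing entry.
matrix = [
    [5.0, 4.0, None],
    [5.0, 4.0, 1.0],
    [4.8, 4.2, 1.4],
    [1.0, 2.0, 5.0],
]
print(nn_impute(matrix, 0, 2, k=2))
```

Here rows 1 and 2 are close to row 0 on the shared columns, so their values in the last column drive the estimate, while the dissimilar row 3 is ignored. A fixed distance threshold could replace the fixed k; which choice works better depends on the missingness pattern.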

large language model, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2506.04166

Country:

North America > United States > California (0.05)
North America > United States > Utah (0.04)
North America > United States > Pennsylvania (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Public Health (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A comparative analysis of a neural network with calculated weights and a neural network with random generation of weights based on the training dataset size

Geidarov, Polad

arXiv.org Artificial Intelligence, Jun-2-2025

The paper discusses the capabilities of multilayer perceptron neural networks implementing metric recognition methods, for which the values of the weights are calculated analytically by formulas. Comparative experiments in training a neural network with pre-calculated weights and with random initialization of weights on different sizes of the MNIST training dataset are carried out. The results of the experiments show that a multilayer perceptron with pre-calculated weights can be trained much faster and is much more robust to the reduction of the training dataset.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2505.23876

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.97)

Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection

Neural Information Processing Systems, Oct-10-2024, 09:39:21 GMT

Nearest-neighbor (NN) procedures are well studied and widely used in both supervised and unsupervised learning problems. In this paper we are concerned with investigating the performance of NN-based methods for anomaly detection. We first show through extensive simulations that NN methods compare favorably to some of the other state-of-the-art algorithms for anomaly detection based on a set of benchmark synthetic datasets. We further consider the performance of NN methods on real datasets, and relate it to the dimensionality of the problem. Next, we analyze the theoretical properties of NN-methods for anomaly detection by studying a more general quantity called distance-to-measure (DTM), originally developed in the literature on robust geometric and topological inference.

anomaly detection, nearest neighbor method, statistical analysis, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Weighting vectors for machine learning: numerical harmonic analysis applied to boundary detection

Bunch, Eric, Kline, Jeffery, Dickinson, Daniel, Bhat, Suhaas, Fung, Glenn

arXiv.org Machine Learning, Jun-1-2021

Metric space magnitude, an active field of research in algebraic topology, is a scalar quantity that summarizes the effective number of distinct points that live in a general metric space. The weighting vector is a closely related concept that captures, in a nontrivial way, much of the underlying geometry of the original metric space. Recent work has demonstrated that when the metric space is Euclidean, the weighting vector serves as an effective tool for boundary detection. We recast this result and show the weighting vector may be viewed as a solution to a kernelized SVM. As one consequence, we apply this new insight to the task of outlier detection, and we demonstrate performance that is competitive with or exceeds that of state-of-the-art techniques on benchmark data sets. Under mild assumptions, we show the weighting vector, which has the computational cost of matrix inversion, can be efficiently approximated in linear time. We show how nearest neighbor methods can approximate solutions to the minimization problems defined by SVMs.
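For readers unfamiliar with the weighting vector, the following sketch computes it for a small point set using the standard definition from the magnitude literature: with similarity matrix Z, where Z[i][j] = exp(-d(x_i, x_j)), the weighting vector w solves Z w = 1, and the magnitude is the sum of its entries. This is the direct linear-solve route, not the paper's linear-time approximation; the solver, function names, and example points are assumptions made for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def weighting_vector(points):
    """Weighting vector w with Z w = 1, where Z[i][j] = exp(-distance(i, j))."""
    z = [[math.exp(-math.dist(p, q)) for q in points] for p in points]
    return solve_linear(z, [1.0] * len(points))


# Three equally spaced points on a line (hypothetical example).
points = [(0.0,), (1.0,), (2.0,)]
w = weighting_vector(points)
print(w, sum(w))  # sum(w) is the magnitude of this point set
```

In this example the two end points receive larger weights than the middle point, which is the behavior that makes the weighting vector usable as a boundary indicator.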

magnitude, vector, weighting vector, (16 more...)

arXiv.org Machine Learning

2106.00827

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.05)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.96)

Community detection, pattern recognition, and hypergraph-based learning: approaches using metric geometry and persistent homology

Nguyen, Dong Quan Ngoc, Xing, Lin, Lin, Lizhen

arXiv.org Machine Learning, Sep-29-2020

Hypergraph data appear, often hidden, in many places in the modern age. They are data structures that can be used to model many real data examples, since their structures contain information about higher-order relations among data points. One of the main contributions of our paper is to introduce a new topological structure on hypergraph data which bears a resemblance to a usual metric space structure. Using this new topological space structure of hypergraph data, we propose several approaches to study the community detection problem, detecting persistent features arising from the homological structure of hypergraph data. Also based on the topological space structure of hypergraph data introduced in our paper, we introduce modified nearest neighbors methods, which generalize the classical nearest neighbors methods from machine learning. Our modified nearest neighbors methods have the advantage of being very flexible and applicable even to discrete structures such as hypergraphs. We then apply our modified nearest neighbors methods to study the sign prediction problem in hypergraph data constructed using our method.

artificial intelligence, machine learning, pattern recognition, (17 more...)

arXiv.org Machine Learning

2010.00435

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.40)

Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection

Gu, Xiaoyi, Akoglu, Leman, Rinaldo, Alessandro

Neural Information Processing Systems, Mar-19-2020, 01:02:53 GMT

Nearest-neighbor (NN) procedures are well studied and widely used in both supervised and unsupervised learning problems. In this paper we are concerned with investigating the performance of NN-based methods for anomaly detection. We first show through extensive simulations that NN methods compare favorably to some of the other state-of-the-art algorithms for anomaly detection based on a set of benchmark synthetic datasets. We further consider the performance of NN methods on real datasets, and relate it to the dimensionality of the problem. Next, we analyze the theoretical properties of NN-methods for anomaly detection by studying a more general quantity called distance-to-measure (DTM), originally developed in the literature on robust geometric and topological inference.

anomaly detection, nearest neighbor method, statistical analysis, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Consistent Classification, Firm and Soft

Baram, Yoram

Neural Information Processing Systems, Dec-31-1997

A classifier is called consistent with respect to a given set of class-labeled points if it correctly classifies the set. We consider classifiers defined by unions of local separators and propose algorithms for consistent classifier reduction. The expected complexities of the proposed algorithms are derived along with the expected classifier sizes. In particular, the proposed approach yields a consistent reduction of the nearest neighbor classifier, which performs "firm" classification, assigning each new object to a class, regardless of the data structure. The proposed reduction method suggests a notion of "soft" classification, allowing for indecision with respect to objects which are insufficiently or ambiguously supported by the data. The performances of the proposed classifiers in predicting stock behavior are compared to those achieved by the nearest neighbor method.

classification, classifier, separator, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > Middle East > Israel > Haifa District > Haifa (0.05)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)

Industry: Banking & Finance > Trading (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Consistent Classification, Firm and Soft

Baram, Yoram

Neural Information Processing Systems, Dec-31-1997

A classifier is called consistent with respect to a given set of class-labeled points if it correctly classifies the set. We consider classifiers defined by unions of local separators and propose algorithms for consistent classifier reduction. The expected complexities of the proposed algorithms are derived along with the expected classifier sizes. In particular, the proposed approach yields a consistent reduction of the nearest neighbor classifier, which performs "firm" classification, assigning each new object to a class, regardless of the data structure. The proposed reduction method suggests a notion of "soft" classification, allowing for indecision with respect to objects which are insufficiently or ambiguously supported by the data. The performances of the proposed classifiers in predicting stock behavior are compared to those achieved by the nearest neighbor method.

1 Introduction

Certain classification problems, such as recognizing the digits of a handwritten zip code, require the assignment of each object to a class. Others, involving relatively small amounts of data and high risk, call for indecision until more data become available. Examples in such areas as medical diagnosis, stock trading and radar detection are well known. The training data for the classifier in both cases will correspond to firmly labeled members of the competing classes.

artificial intelligence, classifier, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.15)
North America > United States (0.14)

Industry:

Banking & Finance > Trading (0.50)
Health & Medicine (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)
