AITopics | Nearest Neighbor Methods

Collaborating Authors

Nearest Neighbor Methods

News Overviews Instructional Materials AI-Alerts Classics

Non-parametric, Nearest-neighbor-assisted Fine-tuning for Neural Machine Translation

Wang, Jiayi, Wang, Ke, Zhang, Yuqi, Zhao, Yu, Stenetorp, Pontus

arXiv.org Artificial IntelligenceMay-22-2023

Non-parametric, k-nearest-neighbor algorithms have recently made inroads to assist generative models such as language models and machine translation decoders. We explore whether such non-parametric models can improve machine translation models at the fine-tuning stage by incorporating statistics from the kNN predictions to inform the gradient updates for a baseline translation model. There are multiple methods which could be used to incorporate kNN statistics and we investigate gradient scaling by a gating mechanism, the kNN's ground truth probability, and reinforcement learning. For four standard in-domain machine translation datasets, compared with classic fine-tuning, we report consistent improvements of all of the three methods by as much as 1.45 BLEU and 1.28 BLEU for German-English and English-German translations respectively. Through qualitative analysis, we found particular improvements when it comes to translating grammatical relations or function words, which results in increased fluency of our model.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2305.13648

Country:

Europe (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.89)

Add feedback

Understanding the Effect of Data Augmentation on Knowledge Distillation

Wang, Ziqi, Han, Chi, Bao, Wenxuan, Ji, Heng

arXiv.org Artificial IntelligenceMay-21-2023

Knowledge distillation (KD) requires sufficient data to transfer knowledge from large-scale teacher models to small-scale student models. Therefore, data augmentation has been widely used to mitigate the shortage of data under specific scenarios. Classic data augmentation techniques, such as synonym replacement and k-nearest-neighbors, are initially designed for fine-tuning. To avoid severe semantic shifts and preserve task-specific labels, those methods prefer to change only a small proportion of tokens (e.g., changing 10% tokens is generally the best option for fine-tuning). However, such data augmentation methods are sub-optimal for knowledge distillation since the teacher model could provide label distributions and is more tolerant to semantic shifts. We first observe that KD prefers as much data as possible, which is different from fine-tuning that too much data will not gain more performance. Since changing more tokens leads to more semantic shifts, we use the proportion of changed tokens to reflect semantic shift degrees. Then we find that KD prefers augmented data with a larger semantic shift degree (e.g., changing 30% tokens is generally the best option for KD) than fine-tuning (changing 10% tokens). Besides, our findings show that smaller datasets prefer larger degrees until the out-of-distribution problem occurs (e.g., datasets with less than 10k inputs may prefer the 50% degree, and datasets with more than 100k inputs may prefer the 10% degree). Our work sheds light on the preference difference in data augmentation between fine-tuning and knowledge distillation and encourages the community to explore KD-specific data augmentation methods.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2305.12565

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Add feedback

Mitigating Catastrophic Forgetting in Task-Incremental Continual Learning with Adaptive Classification Criterion

Luo, Yun, Lin, Xiaotian, Yang, Zhen, Meng, Fandong, Zhou, Jie, Zhang, Yue

arXiv.org Artificial IntelligenceMay-20-2023

Task-incremental continual learning refers to continually training a model in a sequence of tasks while overcoming the problem of catastrophic forgetting (CF). The issue arrives for the reason that the learned representations are forgotten for learning new tasks, and the decision boundary is destructed. Previous studies mostly consider how to recover the representations of learned tasks. It is seldom considered to adapt the decision boundary for new representations and in this paper we propose a Supervised Contrastive learning framework with adaptive classification criterion for Continual Learning (SCCL), In our method, a contrastive loss is used to directly learn representations for different tasks and a limited number of data samples are saved as the classification criterion. During inference, the saved data samples are fed into the current model to obtain updated representations, and a k Nearest Neighbour module is used for classification. In this way, the extensible model can solve the learned tasks with adaptive criteria of saved samples. To mitigate CF, we further use an instance-wise relation distillation regularization term and a memory replay module to maintain the information of previous tasks. Experiments show that SCCL achieves state-of-the-art performance and has a stronger ability to overcome CF compared with the classification baselines.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2305.1227

Country:

Asia > China (0.46)
North America (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.34)

Add feedback

PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search

Zhang, Mozhi, Yan, Hang, Zhou, Yaqian, Qiu, Xipeng

arXiv.org Artificial IntelligenceMay-20-2023

Few-shot Named Entity Recognition (NER) is a task aiming to identify named entities via limited annotated samples. Recently, prototypical networks have shown promising performance in few-shot NER. Most of prototypical networks will utilize the entities from the support set to construct label prototypes and use the query set to compute span-level similarities and optimize these label prototype representations. However, these methods are usually unsuitable for fine-tuning in the target domain, where only the support set is available. In this paper, we propose PromptNER: a novel prompting method for few-shot NER via k nearest neighbor search. We use prompts that contains entity category information to construct label prototypes, which enables our model to fine-tune with only the support set. Our approach achieves excellent transfer learning ability, and extensive experiments on the Few-NERD and CrossNER datasets demonstrate that our model achieves superior performance over state-of-the-art methods.

computational linguistic, information retrieval, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2305.12217

Country:

Europe (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

gLaSDI: Parametric Physics-informed Greedy Latent Space Dynamics Identification

He, Xiaolong, Choi, Youngsoo, Fries, William D., Belof, Jon, Chen, Jiun-Shyan

arXiv.org Artificial IntelligenceMay-18-2023

A parametric adaptive physics-informed greedy Latent Space Dynamics Identification (gLaSDI) method is proposed for accurate, efficient, and robust data-driven reduced-order modeling of high-dimensional nonlinear dynamical systems. In the proposed gLaSDI framework, an autoencoder discovers intrinsic nonlinear latent representations of high-dimensional data, while dynamics identification (DI) models capture local latent-space dynamics. An interactive training algorithm is adopted for the autoencoder and local DI models, which enables identification of simple latent-space dynamics and enhances accuracy and efficiency of data-driven reduced-order modeling. To maximize and accelerate the exploration of the parameter space for the optimal model performance, an adaptive greedy sampling algorithm integrated with a physics-informed residual-based error indicator and random-subset evaluation is introduced to search for the optimal training samples on the fly. Further, to exploit local latent-space dynamics captured by the local DI models for an improved modeling accuracy with a minimum number of local DI models in the parameter space, a k-nearest neighbor convex interpolation scheme is employed. The effectiveness of the proposed framework is demonstrated by modeling various nonlinear dynamical problems, including Burgers equations, nonlinear heat conduction, and radial advection. The proposed adaptive greedy sampling outperforms the conventional predefined uniform sampling in terms of accuracy. Compared with the high-fidelity models, gLaSDI achieves 17 to 2,658x speed-up with 1 to 5% relative errors.

artificial intelligence, machine learning, maximum relative error, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.jcp.2023.112267

2204.12005

Country:

North America > United States > California > San Diego County (0.14)
North America > United States > Arizona > Pima County > Tucson (0.14)

Genre: Research Report (0.64)

Industry:

Energy > Oil & Gas > Upstream (0.46)
Government > Regional Government > North America Government > United States Government (0.46)
Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Add feedback

NLP-based Cross-Layer 5G Vulnerabilities Detection via Fuzzing Generated Run-Time Profiling

Wang, Zhuzhu, Wang, Ying

arXiv.org Artificial IntelligenceMay-14-2023

The effectiveness and efficiency of 5G software stack vulnerability and unintended behavior detection are essential for 5G assurance, especially for its applications in critical infrastructures. Scalability and automation are the main challenges in testing approaches and cybersecurity research. In this paper, we propose an innovative approach for automatically detecting vulnerabilities, unintended emergent behaviors, and performance degradation in 5G stacks via run-time profiling documents corresponding to fuzz testing in code repositories. Piloting on srsRAN, we map the run-time profiling via Logging Information (LogInfo) generated by fuzzing test to a high dimensional metric space first and then construct feature spaces based on their timestamp information. Lastly, we further leverage machine learning-based classification algorithms, including Logistic Regression, K-Nearest Neighbors, and Random Forest to categorize the impacts on performance and security attributes. The performance of the proposed approach has high accuracy, ranging from $ 93.4 \% $ to $ 95.9 \% $, in detecting the fuzzing impacts. In addition, the proof of concept could identify and prioritize real-time vulnerabilities on 5G infrastructures and critical applications in various verticals.

artificial intelligence, machine learning, protocol, (14 more...)

arXiv.org Artificial Intelligence

2305.08226

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Industry:

Telecommunications (1.00)
Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.53)

Add feedback

Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis

Zhang, Renrui, Wang, Liuhui, Guo, Ziyu, Wang, Yali, Gao, Peng, Li, Hongsheng, Shi, Jianbo

arXiv.org Artificial IntelligenceMay-10-2023

We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions. Surprisingly, it performs well on various 3D tasks, requiring no parameters or training, and even surpasses existing fully trained models. Starting from this basic non-parametric model, we propose two extensions. First, Point-NN can serve as a base architectural framework to construct Parametric Networks by simply inserting linear layers on top. Given the superior non-parametric foundation, the derived Point-PN exhibits a high performance-efficiency trade-off with only a few learnable parameters. Second, Point-NN can be regarded as a plug-and-play module for the already trained 3D models during inference. Point-NN captures the complementary geometric knowledge and enhances existing methods for different 3D benchmarks without re-training. We hope our work may cast a light on the community for understanding 3D point clouds with non-parametric methods. Code is available at https://github.com/ZrrSkywalker/Point-NN.

artificial intelligence, machine learning, point-nn, (17 more...)

arXiv.org Artificial Intelligence

2303.08134

Country: Asia > China (0.46)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Add feedback

From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping

Wang, Junyang, Yan, Ming, Zhang, Yi, Sang, Jitao

arXiv.org Artificial IntelligenceMay-7-2023

With the development of Vision-Language Pre-training Models (VLPMs) represented by CLIP and ALIGN, significant breakthroughs have been achieved for association-based visual tasks such as image classification and image-text retrieval by the zero-shot capability of CLIP without fine-tuning. However, CLIP is hard to apply to generation-based tasks. This is due to the lack of decoder architecture and pre-training tasks for generation. Although previous works have created generation capacity for CLIP through additional language models, a modality gap between the CLIP representations of different modalities and the inability of CLIP to model the offset of this gap, which fails the concept to transfer across modalities. To solve the problem, we try to map images/videos to the language modality and generate captions from the language modality. In this paper, we propose the K-nearest-neighbor Cross-modality Mapping (Knight), a zero-shot method from association to generation. With text-only unsupervised training, Knight achieves State-of-the-Art performance in zero-shot methods for image captioning and video captioning. Our code is available at https://github.com/junyangwang0410/Knight.

artificial intelligence, caption, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2304.13273

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.57)

Add feedback

Unsupervised anomaly detection algorithms on real-world data: how many do we need?

Bouman, Roel, Bukhsh, Zaharah, Heskes, Tom

arXiv.org Artificial IntelligenceMay-1-2023

In this study we evaluate 32 unsupervised anomaly detection algorithms on 52 real-world multivariate tabular datasets, performing the largest comparison of unsupervised anomaly detection algorithms to date. On this collection of datasets, the $k$-thNN (distance to the $k$-nearest neighbor) algorithm significantly outperforms the most other algorithms. Visualizing and then clustering the relative performance of the considered algorithms on all datasets, we identify two clear clusters: one with ``local'' datasets, and another with ``global'' datasets. ``Local'' anomalies occupy a region with low density when compared to nearby samples, while ``global'' occupy an overall low density region in the feature space. On the local datasets the $k$NN ($k$-nearest neighbor) algorithm comes out on top. On the global datasets, the EIF (extended isolation forest) algorithm performs the best. Also taking into consideration the algorithms' computational complexity, a toolbox with these three unsupervised anomaly detection algorithms suffices for finding anomalies in this representative collection of multivariate datasets. By providing access to code and datasets, our study can be easily reproduced and extended with more algorithms and/or datasets.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.00735

Country: Europe > Netherlands (0.28)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.88)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Scalable Data Point Valuation in Decentralized Learning

Pandl, Konstantin D., Huang, Chun-Yin, Beschastnikh, Ivan, Li, Xiaoxiao, Thiebes, Scott, Sunyaev, Ali

arXiv.org Artificial IntelligenceMay-1-2023

Existing research on data valuation in federated and swarm learning focuses on valuing client contributions and works best when data across clients is independent and identically distributed (IID). In practice, data is rarely distributed IID. We develop an approach called DDVal for decentralized data valuation, capable of valuing individual data points in federated and swarm learning. DDVal is based on sharing deep features and approximating Shapley values through a k-nearest neighbor approximation method. This allows for novel applications, for example, to simultaneously reward institutions and individuals for providing data to a decentralized machine learning task. The valuation of data points through DDVal allows to also draw hierarchical conclusions on the contribution of institutions, and we empirically show that the accuracy of DDVal in estimating institutional contributions is higher than existing Shapley value approximation methods for federated learning. Specifically, it reaches a cosine similarity in approximating Shapley values of 99.969 % in both, IID and non-IID data distributions across institutions, compared with 99.301 % and 97.250 % for the best state of the art methods. DDVal scales with the number of data points instead of the number of clients, and has a loglinear complexity. This scales more favorably than existing approaches with an exponential complexity. We show that DDVal is especially efficient in data distribution scenarios with many clients that have few data points - for example, more than 16 clients with 8,000 data points each. By integrating DDVal into a decentralized system, we show that it is not only suitable for centralized federated learning, but also decentralized swarm learning, which aligns well with the research on emerging internet technologies such as web3 to reward users for providing data to algorithms.

artificial intelligence, machine learning, valuation, (17 more...)

arXiv.org Artificial Intelligence

2305.01657

Country:

Europe > Germany (0.28)
North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance > Trading (0.97)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.68)

Add feedback