AITopics | Nearest Neighbor Methods

Nearest Neighbor Methods

Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation

Zhang, Hao, Si, Nianwen, Chen, Yaqi, Zhang, Wenlin, Yang, Xukui, Qu, Dan, Li, Zhen

arXiv.org Artificial Intelligence, Apr-20-2023

Existing techniques often transfer knowledge from a powerful machine translation (MT) model to a speech translation (ST) model, which usually requires transcriptions as extra input during training. However, transcriptions are not always available, and how to improve ST performance without them, i.e., data efficiency, has rarely been studied in the literature. In this paper, we propose Decoupled Non-parametric Knowledge Distillation (DNKD), which improves data efficiency from a data perspective. Our method follows the knowledge distillation paradigm. However, instead of obtaining the teacher distribution from a sophisticated MT model, we construct it from a non-parametric datastore via k-Nearest-Neighbor (kNN) retrieval, which removes the dependence on transcriptions and on an MT model. We then decouple the classic knowledge distillation loss into target and non-target distillation to strengthen the knowledge carried by the non-target logits, the prominent "dark knowledge". Experiments on the MuST-C corpus show that the proposed method achieves consistent improvements over a strong baseline without requiring any transcription.
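The two ideas in this abstract, a kNN-built teacher distribution and a target/non-target split of the distillation loss, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the datastore layout (a list of key-vector and token-id pairs), the squared Euclidean distance, the temperature, and the `alpha`/`beta` weights. The split uses the general identity KL(teacher || student) = target term + (1 - p_target) * non-target term.

```python
import math

def knn_teacher(query, datastore, vocab_size, k=4, temperature=10.0):
    """Teacher distribution over tokens from the k nearest datastore entries.

    datastore: list of (key_vector, token_id) pairs (hypothetical layout).
    """
    # Squared Euclidean distance from the query to every stored key.
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    # Softmax over negative distances (shifted by the minimum for stability),
    # with the weight of each neighbor added to its token's probability.
    d_min = scored[0][0]
    weights = [math.exp(-(d - d_min) / temperature) for d, _ in scored]
    total = sum(weights)
    probs = [0.0] * vocab_size
    for w, (_, token) in zip(weights, scored):
        probs[token] += w / total
    return probs

def _kl(p, q, eps):
    """KL(p || q) for two discrete distributions, skipping zero-mass entries of p."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def decoupled_kd_loss(teacher, student, target, alpha=1.0, beta=1.0, eps=1e-12):
    """Weighted sum of a target term and a non-target term of the KD loss."""
    pt, ps = teacher[target], student[target]
    # Target term: binary KL over (target token, all other tokens).
    target_term = _kl([pt, 1 - pt], [ps, 1 - ps], eps)
    # Non-target term: KL between distributions renormalised over non-target tokens.
    t_rest = [p / max(1 - pt, eps) for i, p in enumerate(teacher) if i != target]
    s_rest = [p / max(1 - ps, eps) for i, p in enumerate(student) if i != target]
    non_target_term = _kl(t_rest, s_rest, eps)
    return alpha * target_term + beta * non_target_term
```

Setting `alpha=1` and `beta=1 - teacher[target]` recovers the ordinary KL distillation loss. Choosing `beta` independently lets the non-target term keep its weight even when the teacher is very confident about the target token.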

machine learning, natural language, translation, (19 more...)

arXiv.org Artificial Intelligence

2304.10295

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Belgium (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)

How Sentiment Classification works part1(Machine Learning)

#artificialintelligence, Apr-8-2023, 20:00:15 GMT

Abstract: With the rapid growth of social media use, automatically collecting users' feedback has become a crucial task for evaluating their tendencies and behaviors online. Despite the wide availability of information and the increasing number of Arabic-speaking users, only a few studies have addressed Arabic dialects. The purpose of this paper is to study the opinions and emotions expressed in real Moroccan texts, specifically YouTube comments, using well-known and commonly used methods for sentiment analysis. We present our work on classifying Moroccan dialect comments using Machine Learning (ML) models, based on a YouTube Moroccan dialect dataset that we collected and manually annotated. Employing several text preprocessing and data representation techniques, we compare the classification results of the most commonly used supervised classifiers: k-nearest neighbors (KNN), Support Vector Machine (SVM), Naive Bayes (NB), and deep learning (DL) classifiers such as Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM).

machine learning, pre-trained model, sentiment classification work part1, (2 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.58)

Clustering-based Imputation for Dropout Buyers in Large-scale Online Experimentation

Shen, Sumin, Mao, Huiying, Zhang, Zezhong, Chen, Zili, Nie, Keyu, Deng, Xinwei

arXiv.org Artificial Intelligence, Apr-7-2023

In online experimentation, appropriate metrics (e.g., purchase) provide strong evidence to support hypotheses and enhance the decision-making process. However, incomplete metrics frequently occur in online experimentation, leaving far less usable data than the online experiments (e.g., A/B tests) were planned to collect. In this work, we introduce the concept of dropout buyers and categorize users with incomplete metric values into two groups: visitors and dropout buyers. For the analysis of incomplete metrics, we propose a clustering-based imputation method using k-nearest neighbors. Our proposed imputation method considers both the experiment-specific features and users' activities along their shopping paths, allowing different imputation values for different users. To facilitate efficient imputation of large-scale data sets in online experimentation, the proposed method uses a combination of stratification and clustering. The performance of the proposed method is compared to several conventional methods in both simulation studies and a real online experiment at eBay.
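As a rough illustration of kNN imputation restricted to a subgroup, the sketch below fills a missing metric with the mean metric of the k nearest complete users in the same stratum. The record layout (`stratum`, `features`, `metric`), the squared Euclidean distance, and the plain mean are assumptions chosen for illustration. The paper's clustering step is not reproduced here. A precomputed stratum label stands in for it.

```python
from collections import defaultdict

def knn_impute(users, k=2):
    """Return one metric per user, imputing missing values by stratum-local kNN.

    users: list of dicts with keys 'stratum', 'features' (list of floats),
    and 'metric' (float, or None when missing). Hypothetical layout.
    """
    # Group users with an observed metric ("donors") by stratum, so each
    # search only scans one stratum instead of the full data set.
    donors = defaultdict(list)
    for u in users:
        if u["metric"] is not None:
            donors[u["stratum"]].append(u)

    imputed = []
    for u in users:
        if u["metric"] is not None:
            imputed.append(u["metric"])
            continue
        pool = donors.get(u["stratum"], [])
        if not pool:
            imputed.append(None)  # no donor in this stratum: leave missing
            continue
        nearest = sorted(
            pool,
            key=lambda d: sum((a - b) ** 2 for a, b in zip(u["features"], d["features"])),
        )[:k]
        imputed.append(sum(d["metric"] for d in nearest) / len(nearest))
    return imputed
```

Because the donors are chosen per user, two users with missing values in the same stratum can receive different imputed values, which is the property the abstract highlights.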

artificial intelligence, information management, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2209.06125

Country:

North America > United States > California > Santa Clara County > San Jose (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
North America > United States > California > Alameda County > Oakland (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Information Technology > Services (0.35)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)
(2 more...)

Neural Net and Traditional Classifiers

Neural Information Processing Systems, Apr-6-2023, 20:04:48 GMT

Previous work on nets with continuous-valued inputs led to generative procedures to construct convex decision regions with two-layer perceptrons (one hidden layer) and arbitrary decision regions with three-layer perceptrons (two hidden layers). Here we demonstrate that two-layer perceptron classifiers trained with back propagation can form both convex and disjoint decision regions. Such classifiers are robust, train rapidly, and provide good performance with simple decision regions. When complex decision regions are required, however, convergence time can be excessively long and performance is often no better than that of k-nearest neighbor classifiers. Three neural net classifiers are presented that provide more rapid training under such situations. Two use fixed weights in the first one or two layers and are similar to classifiers that estimate probability density functions using histograms. A third "feature map classifier" uses both unsupervised and supervised training. It provides good performance with little supervised training in situations such as speech recognition where much unlabeled training data is available. The architecture of this classifier can be used to implement a neural net k-nearest neighbor classifier.

classifier, decision region, neural net and traditional classifier, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Practical Characteristics of Neural Network and Conventional Pattern Classifiers on Artificial and Speech Problems

Neural Information Processing Systems, Apr-6-2023, 19:51:52 GMT

Eight neural net and conventional pattern classifiers (Bayesian-unimodal Gaussian, k-nearest neighbor, standard back-propagation, adaptive-stepsize back-propagation, hypersphere, feature-map, learning vector quantizer, and binary decision tree) were implemented on a serial computer and compared using two speech recognition and two artificial tasks. Error rates were statistically equivalent on almost all tasks, but classifiers differed by orders of magnitude in memory requirements, training time, classification time, and ease of adaptivity. Nearest-neighbor classifiers trained rapidly but required the most memory. Tree classifiers provided rapid classification but were complex to adapt. Back-propagation classifiers typically required long training times and had intermediate memory requirements.

network and conventional pattern classifier, neural network, practical characteristic, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Asymptotic slowing down of the nearest-neighbor classifier

Neural Information Processing Systems, Apr-6-2023, 19:38:53 GMT

Although the value of the coefficient a depends upon the underlying probability distributions, the exponent of M is largely distribution free. We thus obtain a concise relation between a classifier's ability to generalize from a finite reference sample and the dimensionality of the feature space, as well as an analytic validation of Bellman's well known "curse of dimensionality."

asymptotic, nearest-neighbor classifier, probability distribution, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)

A Comparative Study of the Practical Characteristics of Neural Network and Conventional Pattern Classifiers

Neural Information Processing Systems, Apr-6-2023, 19:33:52 GMT

Seven different pattern classifiers were implemented on a serial computer and compared using artificial and speech recognition tasks. Two neural network (radial basis function and high order polynomial GMDH network) and five conventional classifiers (Gaussian mixture, linear tree, K nearest neighbor, KD-tree, and condensed K nearest neighbor) were evaluated. Classifiers were chosen to be representative of different approaches to pattern classification and to complement and extend those evaluated in a previous study (Lee and Lippmann, 1989). This and the previous study both demonstrate that classification error rates can be equivalent across different classifiers when they are powerful enough to form minimum error decision regions, when they are properly tuned, and when sufficient training data is available. Practical characteristics such as training time, classification time, and memory requirements, however, can differ by orders of magnitude.
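To make the memory trade-off behind "condensed" nearest neighbor concrete, here is a minimal sketch of Hart-style condensing for a 1-nearest-neighbor classifier. It is a textbook variant offered as an assumption, not the implementation evaluated in the study. Samples are assumed to be `(feature_tuple, label)` pairs and the distance is squared Euclidean.

```python
def _dist2(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_label(x, store):
    """Label of the stored sample closest to x (1-NN rule)."""
    return min(store, key=lambda item: _dist2(x, item[0]))[1]

def condense(samples):
    """Keep only the samples needed for 1-NN to classify the training set.

    Starts from the first sample and repeatedly adds any sample that the
    current store misclassifies, until a full pass adds nothing.
    """
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for x, y in samples:
            # The membership check keeps identical points with conflicting
            # labels from being appended on every pass.
            if (x, y) not in store and nearest_label(x, store) != y:
                store.append((x, y))
                changed = True
    return store
```

On well-separated classes the store can shrink to roughly one sample per class, which reduces memory and classification time at the cost of an extra pass over the training data.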

comparative study, network and conventional pattern classifier, practical characteristic, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Using Genetic Algorithms to Improve Pattern Classification Performance

Neural Information Processing Systems, Apr-6-2023, 19:28:25 GMT

Genetic algorithms were used to select and create features and to select reference exemplar patterns for machine vision and speech pattern classification tasks. For a complex speech recognition task, genetic algorithms required no more computation time than traditional approaches to feature selection but reduced the number of input features required by a factor of five (from 153 to 33 features). On a difficult artificial machine-vision task, genetic algorithms were able to create new features (polynomial functions of the original features) which reduced classification error rates from 19% to almost 0%. Neural net and k nearest neighbor (KNN) classifiers were unable to provide such low error rates using only the original features. Genetic algorithms were also used to reduce the number of reference exemplar patterns for a KNN classifier.

genetic algorithm, pattern classification performance, reference exemplar pattern, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.62)

Efficient Pattern Recognition Using a New Transformation Distance

Neural Information Processing Systems, Apr-6-2023, 19:03:51 GMT

Memory-based classification algorithms such as radial basis functions or K-nearest neighbors typically rely on simple distances (Euclidean, dot product ...), which are not particularly meaningful on pattern vectors. More complex, better suited distance measures are often expensive and rather ad-hoc (elastic matching, deformable templates). We propose a new distance measure which (a) can be made locally invariant to any set of transformations of the input and (b) can be computed efficiently. We tested the method on large handwritten character databases provided by the Post Office and the NIST. Using invariances with respect to translation, rotation, scaling, shearing and line thickness, the method consistently outperformed all other systems tested on the same databases.
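One way to picture a locally invariant distance is to linearise a transformation around a prototype and measure how far the input lies from the resulting tangent line. The sketch below does this for a single transformation (a one-step shift of a 1-D signal) on one side only. It is a simplified illustration under those assumptions, not the paper's algorithm. The finite-difference tangent and both function names are hypothetical.

```python
def translation_tangent(p):
    """Finite-difference tangent for shifting a 1-D signal right by one step.

    Computed as shift(p) - p, with zero padding at the left edge.
    """
    return [(p[i - 1] if i > 0 else 0.0) - p[i] for i in range(len(p))]

def one_sided_tangent_distance(x, p, tangent):
    """Squared distance from x to the line {p + a * tangent : a real}."""
    diff = [xi - pi for xi, pi in zip(x, p)]
    tt = sum(t * t for t in tangent)
    if tt == 0.0:
        # Degenerate tangent: fall back to the squared Euclidean distance.
        return sum(d * d for d in diff)
    # Closed-form least-squares coefficient along the tangent direction.
    a = sum(d * t for d, t in zip(diff, tangent)) / tt
    return sum((d - a * t) ** 2 for d, t in zip(diff, tangent))
```

An input that sits exactly on the tangent line, such as a prototype blended halfway toward its shifted copy, gets a distance of zero here, while its squared Euclidean distance to the prototype stays positive. Handling several transformations at once would replace the scalar coefficient with a small least-squares solve.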

distance measure, efficient pattern recognition, new transformation distance, (2 more...)

Neural Information Processing Systems

Industry: Government > Post Office (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.68)

Locally Adaptive Nearest Neighbor Algorithms

Neural Information Processing Systems, Apr-6-2023, 18:52:15 GMT

Four versions of a k-nearest neighbor algorithm with locally adaptive k are introduced and compared to the basic k-nearest neighbor algorithm (kNN). Locally adaptive kNN algorithms choose the value of k that should be used to classify a query by consulting the results of cross-validation computations in the local neighborhood of the query. Local kNN methods are shown to perform similar to kNN in experiments with twelve commonly used data sets. Encouraging results in three constructed tasks show that local methods can significantly outperform kNN in specific applications. Local methods can be recommended for on-line learning and for applications where different regions of the input space are covered by patterns solving different sub-tasks.
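The idea of picking k per query from local cross-validation can be sketched as follows. This is one hypothetical variant, not any of the paper's four versions. The candidate values of k, the neighborhood size, the leave-one-out scoring, and the tie-breaking rule are all assumptions. Data points are `(feature_tuple, label)` pairs.

```python
from collections import Counter

def _dist2(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(x, data, k):
    """Majority label among the k training points nearest to x."""
    nearest = sorted(data, key=lambda item: _dist2(x, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def local_knn_predict(x, data, candidate_ks=(1, 3, 5), neighborhood=7):
    """Classify x with the k that scores best near x; returns (label, chosen_k)."""
    # Indices of the training points that form the query's local neighborhood.
    local = sorted(range(len(data)), key=lambda i: _dist2(x, data[i][0]))[:neighborhood]

    def loo_correct(k):
        # Leave-one-out: classify each local point using all other points.
        hits = 0
        for i in local:
            rest = data[:i] + data[i + 1:]
            hits += knn_predict(data[i][0], rest, k) == data[i][1]
        return hits

    # max() keeps the first best candidate, so ties go to the earliest listed k.
    best_k = max(candidate_ks, key=loo_correct)
    return knn_predict(x, data, best_k), best_k
```

The leave-one-out loop rescans the training set for every local point and every candidate k, so a practical version would cache neighbor lists. The sketch favors readability over speed.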

adaptive nearest neighbor algorithm, knn, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
