Nearest Neighbor Methods
Applications of K Nearest Neighbor Algorithm, Part 2 (Artificial Intelligence)
Abstract: Candidate generation is the first stage in recommendation systems, where a light-weight system is used to retrieve potentially relevant items for an input user. These candidate items are then ranked and pruned in later stages of the recommender system using a more complex ranking model. Since candidate generation sits at the top of the recommendation funnel, it is important to retrieve a high-recall candidate set to feed into downstream ranking models. A common approach for candidate generation is to leverage approximate nearest neighbor (ANN) search from a single dense query embedding; however, this approach can yield a low-diversity result set with many near duplicates. As users often have multiple interests, candidate retrieval should ideally return a diverse set of candidates reflective of those interests.
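As a concrete illustration of this retrieval step, here is a minimal sketch of top-k candidate retrieval from a single dense user embedding. It uses scikit-learn's exact NearestNeighbors as a stand-in for a production ANN index such as FAISS or ScaNN, and the embedding matrices are random placeholders, not real data.

```python
# Minimal sketch of ANN-style candidate retrieval from a single dense
# user embedding; names (item_embeddings, user_embedding) are illustrative.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
item_embeddings = rng.normal(size=(10_000, 64))   # one row per catalog item
user_embedding = rng.normal(size=(1, 64))         # dense query embedding

# Exact kNN stands in here for a production ANN index (e.g. FAISS, ScaNN).
index = NearestNeighbors(n_neighbors=100, metric="cosine")
index.fit(item_embeddings)
distances, candidate_ids = index.kneighbors(user_embedding)
print(candidate_ids[0][:10])  # top candidates passed to the ranking stage
```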
K-Nearest Neighbours - GeeksforGeeks
K-Nearest Neighbours is one of the most basic yet essential classification algorithms in Machine Learning. It belongs to the supervised learning domain and finds intense application in pattern recognition, data mining and intrusion detection. It is widely applicable in real-life scenarios since it is non-parametric, meaning it makes no underlying assumptions about the distribution of the data (as opposed to other algorithms such as GMM, which assume a Gaussian distribution of the given data). We are given some prior data (also called training data), which classifies coordinates into groups identified by an attribute. Now, given another set of data points (also called testing data), the task is to allocate each of these points to a group by analyzing the training set.
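To make that training/testing setup concrete, here is a minimal from-scratch sketch; the toy coordinates and group labels are invented for illustration.

```python
# From-scratch kNN classification on toy 2-D data; all data here is made up.
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    # Euclidean distance from the query to every training point.
    dists = np.linalg.norm(train_X - query, axis=1)
    # Labels of the k closest training points, then a majority vote.
    nearest = train_y[np.argsort(dists)[:k]]
    return Counter(nearest).most_common(1)[0][0]

train_X = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
train_y = np.array(["A", "A", "A", "B", "B", "B"])
print(knn_predict(train_X, train_y, np.array([2, 2])))  # -> "A"
```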
K-Nearest Neighbors Algorithm for ML
The k-nearest neighbors (kNN) algorithm is a simple tool that can be used for a number of real-world problems in finance, healthcare, recommendation systems, and more. This blog post will cover what kNN is, how it works, and how to implement it in machine learning projects. The k-nearest neighbors classifier (kNN) is a non-parametric supervised machine learning algorithm. It's distance-based: it classifies objects based on the classes of their nearest neighbors. What is a supervised machine learning model?
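For readers who prefer to see it in code, here is a brief sketch of the same idea using scikit-learn's KNeighborsClassifier on the bundled iris dataset; the dataset and hyperparameter choice are illustrative, not prescribed by the post.

```python
# kNN classification with scikit-learn on the bundled iris dataset,
# as a stand-in for the real-world problems mentioned above.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5)  # distance-based, non-parametric
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```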
Generating Synthetic Data with The Nearest Neighbors Algorithm
The $k$ nearest neighbor algorithm ($k$NN) is one of the most popular nonparametric methods used for various purposes, such as treatment effect estimation, missing value imputation, classification, and clustering. The main advantage of $k$NN is the simplicity of its hyperparameter optimization; it often produces favorable results with minimal effort. This paper proposes a generic semiparametric (or, if required, nonparametric) approach named the Local Resampler (LR). LR utilizes $k$NN to create subsamples from the original sample and then generates synthetic values drawn from locally estimated distributions. LR can accurately create synthetic samples even when the original sample has a non-convex distribution. Moreover, LR performs better than or similarly to other popular synthetic data methods while requiring minimal model optimization, even with parametric distributional assumptions.
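The following is one plausible reading of the LR procedure sketched in the abstract, not the authors' reference implementation: draw a kNN subsample around a random anchor, fit a local Gaussian, and sample from it. The function name and the ring-shaped example data are illustrative.

```python
# A hedged sketch of the Local Resampler idea as described above: draw a
# subsample via kNN, fit a local distribution, and sample synthetic points.
# This is an illustrative reading, not the authors' reference implementation.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_resampler(X, n_synthetic=100, k=10, seed=0):
    rng = np.random.default_rng(seed)
    index = NearestNeighbors(n_neighbors=k).fit(X)
    synthetic = []
    for _ in range(n_synthetic):
        anchor = X[rng.integers(len(X))]
        _, idx = index.kneighbors(anchor.reshape(1, -1))
        local = X[idx[0]]                       # kNN subsample around anchor
        mu, cov = local.mean(axis=0), np.cov(local, rowvar=False)
        synthetic.append(rng.multivariate_normal(mu, cov))  # locally Gaussian
    return np.array(synthetic)

# Non-convex example: points on a ring are resampled without filling the hole.
theta = np.random.default_rng(1).uniform(0, 2 * np.pi, 500)
ring = np.c_[np.cos(theta), np.sin(theta)]
ring += 0.05 * np.random.default_rng(2).normal(size=(500, 2))
print(local_resampler(ring).shape)  # (100, 2)
```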
Local Distance Preserving Auto-encoders using Continuous k-Nearest Neighbours Graphs
Chen, Nutan, van der Smagt, Patrick, Cseke, Botond
Auto-encoder models that preserve similarities in the data are a popular tool in representation learning. In this paper we introduce several auto-encoder models that preserve local distances when mapping from the data space to the latent space. We use a local distance-preserving loss that is based on the continuous k-nearest neighbours graph, which is known to capture topological features at all scales simultaneously. To improve training performance, we formulate learning as a constrained optimisation problem with local distance preservation as the main objective and reconstruction accuracy as a constraint. Our method provides state-of-the-art or comparable performance across several standard datasets and evaluation metrics.

Auto-encoders and variational auto-encoders (Kingma & Welling, 2014; Rezende et al., 2014) are often used in machine learning to find meaningful latent representations of the data. What constitutes meaningful usually depends on the application and on the downstream tasks: for example, finding representations that capture important factors of variation in the data (disentanglement) (Higgins et al., 2017; Chen et al., 2018), have high mutual information with the data (Chen et al., 2016), or show clustering behaviour. These representations are usually incentivised by regularisers or architectural/structural choices. One criterion for finding a meaningful latent representation is geometric faithfulness to the data. This is important for data visualisation and for downstream tasks that involve geometric algorithms such as clustering or kNN classification. The data often lies on a sparse, low-dimensional manifold within the space it inhabits, and finding a lower-dimensional projection that is geometrically faithful to it can help not only in visualisation and interpretability but also in predictive performance and robustness.
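A rough sketch of the core ingredient may help: a penalty that compares data-space and latent-space distances over pairs connected in a kNN graph. This simplifies the paper's continuous kNN graph formulation to an ordinary discrete graph, and the data and the stand-in "latent" map below are placeholders.

```python
# A hedged sketch of a local distance-preserving penalty: for pairs that are
# k-nearest neighbours in data space, penalise the squared difference between
# data-space and latent-space distances. Simplified relative to the paper's
# continuous kNN graph formulation.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def local_distance_loss(X, Z, k=10):
    # Sparse adjacency of the kNN graph built in data space.
    graph = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
    rows, cols = graph.nonzero()
    d_data = np.linalg.norm(X[rows] - X[cols], axis=1)
    d_latent = np.linalg.norm(Z[rows] - Z[cols], axis=1)
    return np.mean((d_data - d_latent) ** 2)

X = np.random.default_rng(0).normal(size=(200, 20))   # data space
Z = X[:, :2]                                          # stand-in "latent" map
print(local_distance_loss(X, Z))
```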
K-Nearest Neighbors Algorithm - A simple overview
K-Nearest Neighbors (KNN) is one of the simplest machine learning algorithms to understand.
Metric Effects based on Fluctuations in values of k in Nearest Neighbor Regressor
Gupta, Abhishek, Joshi, Raunak, Kanvinde, Nandan, Gerela, Pinky, Laban, Ronald Melwin
The regression branch of Machine Learning focuses purely on the prediction of continuous values. The supervised learning branch has many regression-based methods with parametric and non-parametric learning models. In this paper we target a subtle point related to distance-based regression models. The distance-based model used is the K-Nearest Neighbors Regressor, a supervised non-parametric method. The point we want to demonstrate is the effect of the model's k parameter, and of fluctuations in its value, on the evaluation metrics. The metrics we use are Root Mean Squared Error and R-Squared goodness of fit, with a visual representation of their values with respect to k.
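Here is a minimal sketch of the experiment the abstract describes, on synthetic data rather than the paper's: sweep k in a kNN regressor and record RMSE and R-squared at each value.

```python
# Sketch of the experiment described above: sweep k in a kNN regressor and
# track RMSE and R-squared; data here is synthetic, not the paper's.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in range(1, 21):
    pred = KNeighborsRegressor(n_neighbors=k).fit(X_tr, y_tr).predict(X_te)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    print(f"k={k:2d}  RMSE={rmse:7.2f}  R2={r2_score(y_te, pred):.3f}")
```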
Feasibility Layer Aided Machine Learning Approach for Day-Ahead Operations
Ramesh, Arun Venkatesh, Li, Xingpeng
Day-ahead operations involve a complex and computationally intensive optimization process to determine the generator commitment schedule and dispatch. The optimization process is a mixed-integer linear program (MILP) known as security-constrained unit commitment (SCUC). Independent system operators (ISOs) run SCUC daily and require state-of-the-art algorithms to speed up the process. Existing patterns in historical information can be leveraged for model reduction of SCUC, which can provide significant time savings. In this paper, machine learning (ML) based classification approaches, namely logistic regression, neural networks, random forest and K-nearest neighbor, were studied for model reduction of SCUC. The ML models were then aided by a feasibility layer (FL) and a post-processing technique to ensure high-quality solutions. The proposed approach is validated on several test systems, namely the IEEE 24-Bus system, IEEE 73-Bus system, IEEE 118-Bus system, 500-Bus system, and Polish 2383-Bus system. Moreover, model reduction of a stochastic SCUC (SSCUC) was demonstrated utilizing a modified IEEE 24-Bus system with renewable generation. Simulation results demonstrate high training accuracy in identifying the commitment schedule, while the FL and post-processing ensure that ML predictions do not lead to infeasible solutions, with minimal loss in solution quality.
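As an illustration only of how kNN slots into this pipeline, the sketch below classifies a generator's on/off commitment from daily load profiles; the features, labels, and threshold are hypothetical, and the feasibility layer is only hinted at in a comment.

```python
# Illustrative-only sketch of kNN as a commitment-status classifier: predict
# a generator's on/off decision from load features in historical SCUC
# solutions. Data and feature choices here are hypothetical.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
loads = rng.uniform(0.5, 1.5, size=(365, 24))       # daily 24-hour load profiles
committed = (loads.mean(axis=1) > 1.0).astype(int)  # toy on/off label per day

clf = KNeighborsClassifier(n_neighbors=5).fit(loads[:300], committed[:300])
print(clf.score(loads[300:], committed[300:]))      # held-out accuracy
# A feasibility layer would then repair any predictions that violate
# operating constraints before they enter the reduced SCUC model.
```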
Forecasting COVID-19 spreading through an ensemble of classical and machine learning models: Spain's case study
Cacha, Ignacio Heredia, Díaz, Judith Sainz-Pardo, Melguizo, María Castrillo, García, Álvaro López
In this work we evaluate the applicability of an ensemble of population models and machine learning models to predict the near-future evolution of the COVID-19 pandemic, with a particular use case in Spain. We rely solely on open and public datasets, fusing incidence, vaccination, human mobility and weather data to feed our machine learning models (Random Forest, Gradient Boosting, k-Nearest Neighbours and Kernel Ridge Regression). We use the incidence data to adjust classic population models (Gompertz, Logistic, Richards, Bertalanffy) so that they better capture the trend of the data. We then ensemble these two families of models to obtain a more robust and accurate prediction. Furthermore, we observe an improvement in the predictions obtained with machine learning models as we add new features (vaccines, mobility, climatic conditions), and we analyze the importance of each of them using Shapley Additive Explanation values. As in any other modelling work, the quality of the data and of the predictions has several limitations, and they must therefore be viewed from a critical standpoint, as we discuss in the text. Our work concludes that the ensemble use of these models improves on the individual predictions (using only machine learning models or only population models) and can be applied, with caution, in cases where compartmental models cannot be utilized due to the lack of relevant data.
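A minimal sketch of the ensembling idea, assuming synthetic features in place of the real incidence/mobility/weather data: average the forecasts of two of the ML models named above (the population-model half of the ensemble is omitted here).

```python
# Minimal sketch of the ensembling idea: average forecasts from two of the
# ML models named above. Data is synthetic; the paper's pipeline also blends
# fitted population models (Gompertz, Logistic, ...), omitted here.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))              # stand-in feature columns
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=200)

knn = KNeighborsRegressor(n_neighbors=7).fit(X[:150], y[:150])
krr = KernelRidge(kernel="rbf", alpha=1.0).fit(X[:150], y[:150])
ensemble = 0.5 * knn.predict(X[150:]) + 0.5 * krr.predict(X[150:])
print(ensemble[:5])
```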
The application of adaptive minimum match k-nearest neighbors to identify at-risk students in health professions education
Kumar, Anshul, DiJohnson, Taylor, Edwards, Roger, Walker, Lisa
Purpose: When a learner fails to reach a milestone, educators often wonder if there had been any warning signs that could have allowed them to intervene sooner. Machine learning can predict which students are at risk of failing a high-stakes certification exam. If predictions can be made well in advance of the exam, then educators can meaningfully intervene before students take the exam to reduce the chances of a failing score. Methods: Using already-collected, first-year student assessment data from five cohorts in a Master of Physician Assistant Studies program, the authors implement an "adaptive minimum match" version of the k-nearest neighbors algorithm (AMMKNN), using a changing number of neighbors to predict each student's future exam scores on the Physician Assistant National Certifying Examination (PANCE). Validation occurred in two ways: leave-one-out cross-validation (LOOCV) and evaluation of the predictions in a new cohort. Results: AMMKNN achieved an accuracy of 93% in LOOCV. AMMKNN generates a predicted PANCE score for each student, one year before they are scheduled to take the exam. Students can then be classified into extra support, optional extra support, or no extra support groups. The educator then has one year to provide the appropriate customized support to each category of student. Conclusions: Predictive analytics can identify at-risk students so they can receive additional support or remediation when preparing for high-stakes certification exams. Educators can use the included methods and code to generate predicted test outcomes for students. The authors recommend that educators use this or similar predictive methods responsibly and transparently, as one of many tools used to support students.
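The abstract does not spell out the AMMKNN rule, so the sketch below is a hypothetical reading of "adaptive minimum match": grow k per query until a minimum number of sufficiently close neighbours is found, then average their scores. Names such as radius and the toy score data are invented, not the authors' published algorithm.

```python
# Hypothetical sketch of an adaptive-k prediction in the spirit of AMMKNN:
# for each query, grow k until a minimum number of neighbours falls within a
# distance threshold, then average their scores. This illustrates the general
# idea only, not the authors' published algorithm.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def adaptive_knn_predict(X, y, query, min_match=3, max_k=25, radius=1.0):
    index = NearestNeighbors(n_neighbors=max_k).fit(X)
    dists, idx = index.kneighbors(query.reshape(1, -1))
    dists, idx = dists[0], idx[0]                      # sorted ascending
    for k in range(min_match, max_k + 1):
        if np.sum(dists[:k] <= radius) >= min_match:   # enough close matches
            return y[idx[:k]].mean()
    return y[idx].mean()                               # fall back to max_k

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                # e.g. first-year assessment scores
y = X.sum(axis=1) * 10 + 400                 # toy stand-in for PANCE scores
print(adaptive_knn_predict(X, y, X[0]))
```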