 Nearest Neighbor Methods


GraphPrompter: Multi-stage Adaptive Prompt Optimization for Graph In-Context Learning

arXiv.org Artificial Intelligence

Graph In-Context Learning, with the ability to adapt pre-trained graph models to novel and diverse downstream graphs without updating any parameters, has gained much attention in the community. The key to graph in-context learning is to make predictions on downstream graphs conditioned on chosen prompt examples. Existing methods randomly select subgraphs or edges as prompts, leading to noisy graph prompts and inferior model performance. Additionally, due to the gap between pre-training and testing graphs, when the number of classes in the testing graphs is much greater than that in the training graphs, the in-context learning ability also deteriorates significantly. To tackle these challenges, we develop a multi-stage adaptive prompt optimization method, GraphPrompter, which optimizes the entire process of generating, selecting, and using graph prompts for better in-context learning capabilities. First, the Prompt Generator introduces a reconstruction layer to highlight the most informative edges and reduce irrelevant noise for graph prompt construction. Second, in the selection stage, the Prompt Selector employs the k-nearest neighbors algorithm and pre-trained selection layers to dynamically choose appropriate samples and minimize the influence of irrelevant prompts. Finally, we leverage a Prompt Augmenter with a cache replacement strategy to enhance the generalization capability of the pre-trained model on new datasets. Extensive experiments show that GraphPrompter effectively enhances the in-context learning ability of graph models.

One of the most fascinating properties of Large Language Models (LLMs) is their In-Context Learning capability [1], [2]. It refers to the ability of a pre-trained LLM to achieve competitive results on downstream tasks given only a few prompt examples during the prediction phase, without updating the model weights through fine-tuning approaches.
Recently, there have been efforts to transfer this in-context learning capability from large language models to graph models [3]-[5]. Of these methods, Prodigy [3] and One For All (OFA) [5] stand out as the most effective frameworks that unify diverse levels of graph-related tasks and achieve competitive in-context learning performance. Generally, the graph in-context learning architecture can be divided into two main parts: data/prompt graph construction and task graph prediction (see Figure 1 for an edge classification example).

Figure 1: Graph In-Context Learning (edge classification as an example) with random prompt selection.
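
The selection stage described in the abstract relies on k-nearest neighbors to pick prompt examples that resemble the query instead of sampling them at random. The sketch below illustrates only that retrieval idea. It assumes each candidate prompt and the query are already represented by an embedding vector, and it uses cosine similarity as the ranking score; the function names `cosine` and `select_prompts` are hypothetical, and the pre-trained selection layers mentioned in the abstract are not modeled here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_prompts(query, candidates, k=2):
    """Return the indices of the k candidates most similar to the query.

    Assumption: embeddings are plain lists of floats produced elsewhere,
    for example by a pre-trained graph encoder.
    """
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    query = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [1.0, 0.0], [-1.0, 0.0]]
    print(select_prompts(query, candidates, k=2))
```

Random selection would return any two of the four candidates with equal probability; ranking by similarity keeps the two that point in nearly the same direction as the query.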


Boosting KNNClassifier Performance with Opposition-Based Data Transformation

arXiv.org Artificial Intelligence

In this paper, we introduce a novel data transformation framework based on Opposition-Based Learning (OBL) to boost the performance of traditional classification algorithms. Originally developed to accelerate convergence in optimization tasks, OBL is leveraged here to generate synthetic opposite samples that enrich the training data and improve decision boundary formation. We explore three OBL variants (Global OBL, Class-Wise OBL, and Localized Class-Wise OBL) and integrate them with K-Nearest Neighbors (KNN). Extensive experiments conducted on 26 heterogeneous and high-dimensional datasets demonstrate that OBL-enhanced classifiers consistently outperform the basic KNN. These findings underscore the potential of OBL as a lightweight yet powerful data transformation strategy for enhancing classification performance, especially in complex or sparse learning environments.
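
In the optimization literature, the opposite of a value x in the interval [a, b] is commonly defined as a + b - x. The sketch below applies that definition per feature to produce opposite samples, once with bounds taken over the whole training set (a plausible reading of Global OBL) and once with bounds taken per class (a plausible reading of Class-Wise OBL). The function names, the choice of min/max as bounds, and the rule that an opposite sample keeps the label of its source are assumptions for illustration; the abstract does not specify these details, and the Localized variant is not shown.

```python
def opposite_points(rows):
    """Reflect each row inside the per-feature bounding box: lo + hi - x."""
    columns = list(zip(*rows))
    lo = [min(col) for col in columns]
    hi = [max(col) for col in columns]
    return [[l + h - x for x, l, h in zip(row, lo, hi)] for row in rows]


def class_wise_opposites(rows, labels):
    """Build opposites separately per class, so bounds come from that class only.

    Assumption: each opposite sample inherits the label of its class.
    """
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for opposite in opposite_points(members):
            out_rows.append(opposite)
            out_labels.append(label)
    return out_rows, out_labels


if __name__ == "__main__":
    X = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
    y = ["a", "a", "b"]
    print(opposite_points(X))          # global bounds
    print(class_wise_opposites(X, y))  # per-class bounds
```

The augmented training set is then the original samples plus their opposites, which can be passed to any KNN implementation unchanged.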


Capturing Symmetry and Antisymmetry in Language Models through Symmetry-Aware Training Objectives

arXiv.org Artificial Intelligence

Capturing symmetric (e.g., country borders another country) and antisymmetric (e.g., parent_of) relations is crucial for a variety of applications. This paper tackles this challenge by introducing a novel Wikidata-derived natural language inference dataset designed to evaluate large language models (LLMs). Our findings reveal that LLMs perform comparably to random chance on this benchmark, highlighting a gap in relational understanding. To address this, we explore encoder retraining via contrastive learning with k-nearest neighbors. The retrained encoder matches the performance of fine-tuned classification heads while offering additional benefits, including greater efficiency in few-shot learning and improved mitigation of catastrophic forgetting.


LayerFlow: Layer-wise Exploration of LLM Embeddings using Uncertainty-aware Interlinked Projections

arXiv.org Artificial Intelligence

Figure 1: LayerFlow supports the analysis of contextual word embedding properties. To increase the awareness of the potential uncertainty within the transformation, representation, and interpretation steps of the used processing pipeline, we utilize multiple visual components such as cluster convex-hulls, pairwise distances, cluster summaries, projection quality metrics, and connections of k-nearest neighbors.

Abstract: Large language models (LLMs) represent words through contextual word embeddings encoding different language properties like semantics and syntax. Understanding these properties is crucial, especially for researchers investigating language model capabilities, employing embeddings for tasks related to text similarity, or evaluating the reasons behind token importance as measured through attribution methods. Applications for embedding exploration frequently involve dimensionality reduction techniques, which reduce high-dimensional vectors to two dimensions used as coordinates in a scatterplot. This data transformation step introduces uncertainty that can be propagated to the visual representation and influence users' interpretation of the data. To communicate such uncertainties, we present LayerFlow, a visual analytics workspace that displays embeddings in an interlinked projection design and communicates the transformation, representation, and interpretation uncertainty. In particular, to hint at potential data distortions and uncertainties, the workspace includes several visual components, such as convex hulls showing 2D and HD clusters, data point pairwise distances, cluster summaries, and projection quality metrics. We show the usability of the presented workspace through replication and expert case studies that highlight the need to communicate uncertainty through multiple visual components and different data perspectives.

CCS Concepts: Human-centered computing → Visual analytics; Mathematics of computing → Dimensionality reduction

1 Introduction

In recent years, a large number of deep-learning-based language models (e.g., BERT [DCLT19]) have emerged, demonstrating remarkable performance in natural language processing (NLP) and understanding tasks. These models learn from large text datasets, acquiring language structures in an unsupervised manner. Thereby, they produce contextual word embeddings, representing words through vectors encoding different language properties. Extensive research has been conducted to understand the linguistic properties embedded in these vectors. For instance, research indicates that BERT's middle layers capture syntactic features like dependency trees while early layers encode lexical features [RKR20]. Analyzing these properties helps researchers better understand how language models process data and aids in developing models that generalize well, reducing biases and improving inclusivity.


Adaptive Locally Linear Embedding

arXiv.org Artificial Intelligence

Ali Goli (1), Mahdieh Alizadeh (1), and Hadi Sadoghi Yazdi (1, 2)
(1) Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
(2) Center of Excellence in Soft Computing and Intelligent Information Processing, Ferdowsi University of Mashhad, Mashhad, Iran
April 10, 2025

Abstract: Manifold learning techniques, such as Locally Linear Embedding (LLE), are designed to preserve the local neighborhood structures of high-dimensional data during dimensionality reduction. Traditional LLE employs Euclidean distance to define neighborhoods, which can struggle to capture the intrinsic geometric relationships within complex data. A novel approach, Adaptive Locally Linear Embedding (ALLE), is introduced to address this limitation by incorporating a dynamic, data-driven metric that enhances topological preservation. This method redefines the concept of proximity by focusing on topological neighborhood inclusion rather than fixed distances. By adapting the metric based on the local structure of the data, it achieves superior neighborhood preservation, particularly for datasets with complex geometries and high-dimensional structures. Experimental results demonstrate that ALLE significantly improves the alignment between neighborhoods in the input and feature spaces, resulting in more accurate and topologically faithful embeddings.

Keywords: Manifold Learning, Adaptive Locally Linear Embedding, Dimensionality Reduction, Topological Preservation, Complex Geometries, High-Dimensional Data, Topological Neighborhood Inclusion, Intrinsic Geometric Relationships

1 Introduction

Locally Linear Embedding (LLE) is a prominent manifold learning technique designed to reduce the dimensionality of high-dimensional datasets while preserving their intrinsic geometric structure. Proposed by Roweis and Saul, LLE operates through a systematic process that includes identifying the K-nearest neighbors for each data point, calculating reconstruction weights to express each point as a linear combination of its neighbors, and ultimately generating a low-dimensional representation that retains local relationships [14]. However, LLE traditionally relies on fixed distance metrics, such as Euclidean distance, which may inadequately represent complex data distributions and fail to capture nuanced topological relationships. In response to these limitations, we introduce a novel approach termed Adaptive LLE (ALLE), which integrates a flexible, data-driven metric into the LLE framework.
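
The second step of standard LLE, computing reconstruction weights, can be sketched as follows. For one point and its already-chosen neighbors, the weights minimize the squared reconstruction error subject to summing to one, which reduces to solving a small linear system on the local Gram matrix. This is the classical Euclidean formulation, not the adaptive metric proposed in the paper; the regularization constant `reg` and the helper names are assumptions chosen for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def lle_weights(point, neighbors, reg=1e-3):
    """Weights w (summing to 1) that best reconstruct `point` from `neighbors`."""
    k = len(neighbors)
    diffs = [[p - q for p, q in zip(point, nb)] for nb in neighbors]
    # Local Gram matrix: G[i][j] = (x - n_i) . (x - n_j)
    gram = [[sum(u * v for u, v in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    # Regularize the diagonal so the system stays solvable when G is singular.
    bump = reg * trace if trace > 0 else reg
    for i in range(k):
        gram[i][i] += bump
    w = solve(gram, [1.0] * k)
    total = sum(w)
    return [wi / total for wi in w]


if __name__ == "__main__":
    # A point midway between its two neighbors gets equal weights.
    print(lle_weights([1.0, 0.0], [[0.0, 0.0], [2.0, 0.0]]))
```

An adaptive variant would change how the neighbors are chosen before this step; the weight computation itself takes the neighbor set as given.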


Novel sparse PCA method via Runge Kutta numerical method(s) for face recognition

arXiv.org Artificial Intelligence

Face recognition is a crucial topic in data science and biometric security, with applications spanning military, finance, and retail industries. This paper explores the implementation of sparse Principal Component Analysis (PCA) using the Proximal Gradient method (also known as ISTA) and the Runge-Kutta numerical methods. To address the face recognition problem, we integrate sparse PCA with either the k-nearest neighbor method or the kernel ridge regression method. Experimental results demonstrate that combining sparse PCA -- solved via the Proximal Gradient method or the Runge-Kutta numerical approach -- with a classification system yields higher accuracy compared to standard PCA. Additionally, we observe that the Runge-Kutta-based sparse PCA computation consistently outperforms the Proximal Gradient method in terms of speed.
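
The proximal gradient (ISTA) idea can be illustrated on a single sparse component. The sketch below takes a gradient ascent step on the variance term, applies soft-thresholding (the proximal operator of the L1 penalty), and renormalizes to unit length. The objective, step size, penalty weight `lam`, and the renormalization step are assumptions made for this illustration; the paper's exact formulation is not given in the abstract, and the Runge-Kutta variant is not shown.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def sparse_component(cov, lam=0.1, step=0.1, iters=200):
    """One sparse principal direction via thresholded gradient steps.

    Assumed objective: maximize 0.5 * v' C v - lam * ||v||_1 with ||v|| = 1.
    """
    n = len(cov)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        grad = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = soft_threshold([vi + step * gi for vi, gi in zip(v, grad)], step * lam)
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break  # penalty too strong: every loading was thresholded away
        v = [x / norm for x in v]
    return v


if __name__ == "__main__":
    cov = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.01]]
    print(sparse_component(cov))
```

On this diagonal covariance the loadings of the two low-variance features are driven exactly to zero, which is the interpretability benefit sparse PCA aims for.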


Solve sparse PCA problem by employing Hamiltonian system and leapfrog method

arXiv.org Artificial Intelligence

Principal Component Analysis (PCA) is a widely utilized technique for dimensionality reduction; however, its inherent lack of interpretability, stemming from dense linear combinations of all features, limits its applicability in many domains. In this paper, we propose a novel sparse PCA algorithm that imposes sparsity through a smooth L1 penalty and leverages a Hamiltonian formulation solved via geometric integration techniques. Specifically, we implement two distinct numerical methods, one based on the Proximal Gradient (ISTA) approach and another employing a leapfrog (fourth-order Runge-Kutta) scheme, to minimize the energy function that balances variance maximization with sparsity enforcement. To extract a subset of sparse principal components, we further incorporate a deflation technique and subsequently transform the original high-dimensional face data into a lower-dimensional feature space. Experimental evaluations on a face recognition dataset, using both k-nearest neighbor and kernel ridge regression classifiers, demonstrate that the proposed sparse PCA methods consistently achieve higher classification accuracy than conventional PCA. Future research will extend this framework to integrate sparse PCA with modern deep learning architectures for multimodal recognition tasks.


k-NN as a Simple and Effective Estimator of Transferability

arXiv.org Artificial Intelligence

How well can one expect transfer learning to work in a new setting where the domain is shifted, the task is different, and the architecture changes? Many transfer learning metrics have been proposed to answer this question. But how accurate are their predictions in a realistic new setting? We conducted an extensive evaluation involving over 42,000 experiments comparing 23 transferability metrics across 16 different datasets to assess their ability to predict transfer performance. Our findings reveal that none of the existing metrics perform well across the board. However, we find that a simple k-nearest neighbor evaluation -- as is commonly used to evaluate feature quality for self-supervision -- not only surpasses existing metrics, but also offers better computational efficiency and ease of implementation.
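
A k-nearest neighbor evaluation of feature quality can be as small as the sketch below: extract features from the pre-trained model for the target data, then measure how often a sample's label agrees with the majority label of its k nearest neighbors. The leave-one-out protocol, Euclidean distance, and the function name `knn_loo_accuracy` are assumptions for illustration; the abstract does not state which protocol the paper uses.

```python
import math
from collections import Counter


def knn_loo_accuracy(features, labels, k=1):
    """Leave-one-out k-NN accuracy of labels given fixed feature vectors.

    Assumption: features were extracted beforehand by the frozen
    pre-trained model under evaluation.
    """
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Distance to every other sample, paired with that sample's label.
        dists = sorted(
            (math.dist(x, other), labels[j])
            for j, other in enumerate(features)
            if j != i
        )
        votes = Counter(label for _, label in dists[:k])
        if votes.most_common(1)[0][0] == y:
            correct += 1
    return correct / len(features)


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    labels = [0, 0, 1, 1]
    print(knn_loo_accuracy(feats, labels, k=1))
```

A higher score suggests that the pre-trained features already separate the target classes, with no training or hyperparameter search beyond the choice of k.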


ML-Based Bidding Price Prediction for Pay-As-Bid Ancillary Services Markets: A Use Case in the German Control Reserve Market

arXiv.org Machine Learning

The increasing integration of renewable energy sources has led to greater volatility and unpredictability in electricity generation, posing challenges to grid stability. Ancillary service markets, such as the German control reserve market, allow industrial consumers and producers to offer flexibility in their power consumption or generation, contributing to grid stability while earning additional income. However, many participants use simple bidding strategies that may not maximize their revenues. This paper presents a methodology for forecasting bidding prices in pay-as-bid ancillary service markets, focusing on the German control reserve market. We evaluate various machine learning models, including Support Vector Regression, Decision Trees, and k-Nearest Neighbors, and compare their performance against benchmark models. To address the asymmetry in the revenue function of pay-as-bid markets, we introduce an offset adjustment technique that enhances the practical applicability of the forecasting models. Our analysis demonstrates that the proposed approach improves potential revenues by 27.43% to 37.31% compared to baseline models. When analyzing the relationship between the model forecasting errors and the revenue, a negative correlation is measured for three markets; according to the results, reducing the model's price forecasting error (MAE) by 1 EUR/MW statistically leads to a yearly revenue increase of between 483 EUR/MW and 3,631 EUR/MW. The proposed methodology enables industrial participants to optimize their bidding strategies, leading to increased earnings and contributing to the efficiency and stability of the electrical grid.
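
The asymmetry mentioned above can be made concrete with a simplified revenue model: in a pay-as-bid auction, an accepted bid earns its own price, while a bid above the highest accepted price earns nothing. Overestimating the price is therefore far more costly than underestimating it. The sketch below assumes this simplified acceptance rule and picks a constant offset to subtract from the forecasts by maximizing revenue on historical data; the function names and the selection procedure are hypothetical and are not taken from the paper.

```python
def revenue(bid, marginal_price):
    """Simplified pay-as-bid payoff: accepted bids earn the bid price, others earn 0."""
    return bid if bid <= marginal_price else 0.0


def best_offset(forecasts, marginal_prices, offsets):
    """Choose the offset whose shifted bids (forecast - offset) earn the most in total."""
    def total(offset):
        return sum(
            revenue(f - offset, m) for f, m in zip(forecasts, marginal_prices)
        )
    return max(offsets, key=total)


if __name__ == "__main__":
    forecasts = [10.0, 12.0, 11.0]        # model price forecasts, EUR/MW
    marginal_prices = [9.5, 12.5, 10.8]   # highest accepted prices, EUR/MW
    print(best_offset(forecasts, marginal_prices, [0.0, 0.5, 1.0]))
```

In this toy data, bidding the raw forecasts wins only one of three auctions, while a small downward offset wins all three at a slightly lower price each.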


Effective Feature Selection for Predicting Spreading Factor with ML in Large LoRaWAN-based Mobile IoT Networks

arXiv.org Artificial Intelligence

LoRaWAN is a low-power long-range protocol that enables reliable and robust communication. This paper addresses the challenge of predicting the spreading factor (SF) in LoRaWAN networks using machine learning (ML) techniques. Optimal SF allocation is crucial for optimizing data transmission in IoT-enabled mobile devices, yet it remains a challenging task due to fluctuations in environment and network conditions. We evaluated ML model performance on a large publicly available dataset to identify the best feature set among key LoRaWAN features: RSSI, SNR, frequency, distance between end devices and gateways, and antenna height of the end device. We experimented with all 31 possible combinations of these 5 features. We trained and evaluated the models using the k-nearest neighbors (k-NN), Decision Tree Classifier (DTC), Random Forest (RF), and Multinomial Logistic Regression (MLR) algorithms. The combination of RSSI and SNR was identified as the best feature set. The findings of this paper provide valuable information for reducing the overall cost of dataset collection for ML model training and extending the battery life of LoRaWAN devices. This work contributes to a more reliable LoRaWAN system by clarifying the importance of specific feature sets for optimized SF allocation.
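
The count of 31 follows from enumerating every non-empty subset of 5 features (2^5 - 1 = 31). A minimal sketch of that enumeration is shown below; the feature identifiers are shortened names chosen for illustration, and the step of training and scoring a model on each subset is left out.

```python
from itertools import combinations

# Hypothetical column names for the five features listed in the abstract.
FEATURES = ["RSSI", "SNR", "frequency", "distance", "antenna_height"]


def feature_subsets(features):
    """All non-empty subsets of the given features, smallest subsets first."""
    return [
        subset
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]


if __name__ == "__main__":
    subsets = feature_subsets(FEATURES)
    print(len(subsets))
    print(subsets[5])
```

Each subset can then be used to select columns before fitting a classifier, so that the same train and test split is scored once per feature combination.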