Perceptrons
Morphological Perceptron with Competitive Layer: Training Using Convex-Concave Procedure
Cunha, Iara, Valle, Marcos Eduardo
A morphological perceptron is a multilayer feedforward neural network in which neurons perform elementary operations from mathematical morphology. For multiclass classification tasks, a morphological perceptron with a competitive layer (MPCL) is obtained by integrating a winner-take-all output layer into the standard morphological architecture. The non-differentiability of morphological operators renders gradient-based optimization methods unsuitable for training such networks. Consequently, alternative strategies that do not depend on gradient information are commonly adopted. This paper proposes the use of the convex-concave procedure (CCP) for training MPCL networks. The training problem is formulated as a difference of convex (DC) functions and solved iteratively using CCP, resulting in a sequence of linear programming subproblems. Computational experiments demonstrate the effectiveness of the proposed training method in addressing classification tasks with MPCL networks.
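As a minimal illustration of the convex-concave procedure named above, the sketch below minimizes a toy one-dimensional DC function f(x) = g(x) - h(x), with g(x) = x^4 and h(x) = 2x^2. The toy objective and the function names are assumptions chosen for illustration. They are not the MPCL training problem, whose subproblems are linear programs according to the abstract. Each CCP step replaces h by its tangent at the current iterate and minimizes the resulting convex surrogate, which here has a closed-form solution.

```python
import math


def f(x):
    """Toy DC objective (hypothetical, not the paper's loss): x**4 - 2*x**2."""
    return x ** 4 - 2 * x ** 2


def ccp_minimize(x0, iterations=60):
    """Run CCP on f = g - h with g(x) = x**4 and h(x) = 2*x**2 (both convex).

    Linearizing h at x_k gives the convex surrogate
        g(x) - h'(x_k) * x + const = x**4 - 4*x_k*x + const,
    whose minimizer satisfies 4*x**3 = 4*x_k, i.e. x = cbrt(x_k).
    """
    x = x0
    history = [x]
    for _ in range(iterations):
        # Closed-form solution of the convex subproblem (sign-preserving cube root).
        x = math.copysign(abs(x) ** (1.0 / 3.0), x)
        history.append(x)
    return x, history


x_star, history = ccp_minimize(0.1)
values = [f(x) for x in history]
```

Starting from a positive point, the iterates approach the local minimizer x = 1 and the objective never increases from one step to the next, which is the descent property that makes CCP attractive when gradients of the original objective are unavailable or uninformative.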
Beyond ROUGE: N-Gram Subspace Features for LLM Hallucination Detection
Li, Jerry, Papalexakis, Evangelos
Large Language Models (LLMs) have demonstrated effectiveness across a wide variety of natural language tasks. However, hallucination remains a fundamental problem that limits their trustworthiness in generating consistent, truthful information. Detecting hallucinations has quickly become an important topic, with various methods such as uncertainty estimation, LLM judges, retrieval-augmented generation (RAG), and consistency checks showing promise. Many of these methods build upon foundational metrics, such as ROUGE, BERTScore, or perplexity, which often lack the semantic depth necessary to detect hallucinations effectively. In this work, we propose a novel approach inspired by ROUGE that constructs an N-Gram frequency tensor from LLM-generated text. This tensor captures richer semantic structure by encoding co-occurrence patterns, enabling better differentiation between factual and hallucinated content. We demonstrate this by applying tensor decomposition methods to extract singular values from each mode and using these as input features to train a multi-layer perceptron (MLP) binary classifier for hallucinations. Our method is evaluated on the HaluEval dataset and demonstrates significant improvements over traditional baselines, as well as competitive performance against state-of-the-art LLM judges.
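The feature pipeline described above can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' implementation: it assumes a trigram count tensor indexed by vocabulary ids, mode-n unfoldings in the style of a higher-order SVD, and a hypothetical `top_k` truncation so that texts with different vocabularies yield feature vectors of the same length. The names `trigram_tensor` and `mode_singular_values` are invented for this example.

```python
import numpy as np


def trigram_tensor(tokens):
    """Build T[i, j, k] = number of times the trigram (v_i, v_j, v_k) occurs."""
    vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
    size = len(vocab)
    tensor = np.zeros((size, size, size))
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        tensor[vocab[a], vocab[b], vocab[c]] += 1.0
    return tensor


def mode_singular_values(tensor, top_k=3):
    """Concatenate the top_k singular values of each mode-n unfolding."""
    features = []
    for mode in range(tensor.ndim):
        # Mode-n unfolding: move axis `mode` to the front and flatten the rest.
        unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        sv = np.linalg.svd(unfolded, compute_uv=False)
        # Zero-pad so the feature length does not depend on the vocabulary size.
        padded = np.zeros(top_k)
        padded[: min(top_k, sv.size)] = sv[:top_k]
        features.append(padded)
    return np.concatenate(features)


tokens = "the cat sat on the mat and the cat slept".split()
tensor = trigram_tensor(tokens)
features = mode_singular_values(tensor, top_k=3)
```

The resulting vector (here of length 9, three singular values per mode) is the kind of fixed-size input an MLP classifier could consume. Tokenization, the choice of N, and any normalization are left open because the abstract does not specify them.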
QuadKAN: KAN-Enhanced Quadruped Motion Control via End-to-End Reinforcement Learning
Legged robots offer mobility where wheeled platforms fail, such as stairs, rubble, soft substrates, and cluttered indoor-outdoor settings, enabling applications in inspection, search and rescue, agriculture, and planetary exploration [1]. Robust locomotion control is therefore a foundational capability for practical quadrupedal systems, underpinning safe navigation and dependable operation across diverse terrains and disturbances [2]. Deep reinforcement learning (DRL) has emerged as a compelling paradigm for such control because it optimizes closed-loop policies through interaction and can produce adaptive behaviors [3]. A substantial body of prior work has focused on training blind controllers that rely exclusively on proprioceptive inputs such as inertial measurement units (IMUs) and joint feedback [4]. While these blind policies can traverse uneven and unknown terrains through large-scale simulation and domain randomization, they inherently lack foresight: without exteroceptive input, they respond only upon contact and struggle to proactively avoid obstacles or plan foot placement on irregular ground. Vision complements proprioception by providing anticipatory geometric information, enabling early detection of distant obstacles and terrain changes [5]. As a result, cross-modal policies that integrate proprioception with depth imaging have gained prominence, facilitating safer and more efficient locomotion through earlier trajectory adjustments. Most existing cross-modal pipelines adopt multilayer perceptrons (MLPs) for the proprioceptive encoder and for the decision head that fuses proprioception with vision.
Peptidomic-Based Prediction Model for Coronary Heart Disease Using a Multilayer Perceptron Neural Network
Coronary heart disease (CHD) is a leading cause of death worldwide and contributes significantly to annual healthcare expenditures. To develop a non-invasive diagnostic approach, we designed a model based on a multilayer perceptron (MLP) neural network, trained on 50 key urinary peptide biomarkers selected via genetic algorithms. Treatment and control groups, each comprising 345 individuals, were balanced using the Synthetic Minority Over-sampling Technique (SMOTE). The neural network was trained using a stratified validation strategy. Using a network with three hidden layers of 60 neurons each and an output layer of two neurons, the model achieved a precision, sensitivity, and specificity of 95.67 percent, with an F1-score of 0.9565. The area under the ROC curve (AUC) reached 0.9748 for both classes, while the Matthews correlation coefficient (MCC) and Cohen's kappa coefficient were 0.9134 and 0.9131, respectively, demonstrating its reliability in detecting CHD. These results indicate that the model provides a highly accurate and robust non-invasive diagnostic tool for coronary heart disease.
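For readers less familiar with the reported metrics, the sketch below computes them from a binary confusion matrix. The counts used are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study. The function name `binary_metrics` is likewise an assumption.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / total
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical balanced example: 100 positives and 100 negatives, 10 errors each.
metrics = binary_metrics(tp=90, fn=10, fp=10, tn=90)
```

With balanced classes and symmetric errors, as in this toy example, MCC and kappa both reduce to sensitivity + specificity - 1 (0.8 here). The same identity applied to the abstract's figures gives 2 × 0.9567 - 1 = 0.9134, which is in line with the reported MCC and kappa.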
Graph Contrastive Learning versus Untrained Baselines: The Role of Dataset Size
Khanna, Smayan, Gökmen, Doruk Efe, Kondor, Risi, Vitelli, Vincenzo
Graph Contrastive Learning (GCL) has emerged as a leading paradigm for self-supervised learning on graphs, with strong performance reported on standardized datasets and growing applications ranging from genomics to drug discovery. We ask a basic question: does GCL actually outperform untrained baselines? We find that GCL's advantage depends strongly on dataset size and task difficulty. On standard datasets, untrained Graph Neural Networks (GNNs), simple multilayer perceptrons, and even handcrafted statistics can rival or exceed GCL. On the large molecular dataset ogbg-molhiv, we observe a crossover: GCL lags at small scales but pulls ahead beyond a few thousand graphs, though this gain eventually plateaus. On synthetic datasets, GCL accuracy approximately scales with the logarithm of the number of graphs and its performance gap (compared with untrained GNNs) varies with respect to task complexity. Moving forward, it is crucial to identify the role of dataset size in benchmarks and applications, as well as to design GCL algorithms that avoid performance plateaus.
An Efficient GNNs-to-KANs Distillation via Self-Attention Dynamic Sampling with Potential for Consumer Electronics Edge Deployment
Cui, Can, Fu, Zilong, Huang, Penghe, Li, Yuanyuan, Deng, Wu, Li, Dongyan
Knowledge distillation (KD) is crucial for deploying deep learning models in resource-constrained edge environments, particularly within the consumer electronics sector, including smart home devices, wearable technology, and mobile terminals. These applications place higher demands on model compression and inference speed, necessitating the transfer of knowledge from Graph Neural Networks (GNNs) to more efficient Multi-Layer Perceptron (MLP) models. However, due to their fixed activation functions and fully connected architecture, MLPs face challenges in rapidly capturing the complex neighborhood dependencies learned by GNNs, thereby limiting their performance in edge environments. To address these limitations, this paper introduces a knowledge distillation framework from GNNs to Kolmogorov-Arnold Networks (KANs), called Self-Attention Dynamic Sampling Distillation (SA-DSD). This study improves the Fourier KAN (FR-KAN) and replaces the MLP with the improved FR-KAN+ as the student model. Through the incorporation of learnable frequency bases and phase-shift mechanisms, along with algorithmic optimization, FR-KAN significantly improves its nonlinear fitting capability while effectively reducing computational complexity. Building on this, a margin-level sampling probability matrix, based on teacher-student prediction consistency, is constructed, and an adaptive weighted loss mechanism is designed to mitigate performance degradation in the student model due to the lack of explicit neighborhood aggregation. Extensive experiments conducted on six real-world datasets demonstrate that SA-DSD achieves performance improvements of 3.05%-3.62% over three GNN teacher models and 15.61% over the FR-KAN+ model. Moreover, when compared with key benchmark models, SA-DSD achieves a 16.96x reduction in parameter count and a 55.75% decrease in inference time.
High-Fidelity Prediction of Perturbed Optical Fields using Fourier Feature Networks
Jandrell, Joshua R., Cox, Mitchell A.
Predicting the effects of physical perturbations on optical channels is critical for advanced photonic devices, but existing modelling techniques are often computationally intensive or require exhaustive characterisation. We present a novel data-efficient machine learning framework that learns the perturbation-dependent transmission matrix of a multimode fibre. To overcome the challenge of modelling the resulting highly oscillatory functions, we encode the perturbation into a Fourier Feature basis, enabling a compact multi-layer perceptron to learn the mapping with high fidelity. On experimental data from a compressed fibre, our model predicts the output field with a 0.995 complex correlation to the ground truth, improving accuracy by an order of magnitude over standard networks while using 85% fewer parameters. This approach provides a general tool for modelling complex optical systems from sparse measurements.
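A Fourier feature encoding of a scalar perturbation can be sketched as below. This is a generic illustration under stated assumptions, not the authors' code: it assumes the perturbation is a single scalar normalized to roughly [0, 1], and the frequency list is a hypothetical choice. The function name `fourier_features` is invented for the example.

```python
import math


def fourier_features(perturbation, frequencies):
    """Map a scalar p to [sin(2*pi*f*p), cos(2*pi*f*p)] for each frequency f."""
    features = []
    for freq in frequencies:
        angle = 2.0 * math.pi * freq * perturbation
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# Hypothetical frequency set; in practice these may be fixed, sampled, or tuned.
frequencies = [1.0, 2.0, 4.0, 8.0]
encoded = fourier_features(0.25, frequencies)
```

The encoded vector, rather than the raw scalar, would then be fed to the MLP. Supplying sinusoids at several frequencies gives the network direct access to oscillatory components that a small MLP on the raw input tends to fit poorly.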
The Demon is in Ambiguity: Revisiting Situation Recognition with Single Positive Multi-Label Learning
Lin, Yiming, Niu, Yuchen, Wang, Shang, Huang, Kaizhu, Wang, Qiufeng, Jin, Xiao-Bo
Situation recognition (SR) is a fundamental task in computer vision that aims to extract structured semantic summaries from images by identifying key events and their associated entities. Specifically, given an input image, the model must first classify the main visual events (verb classification), then identify the participating entities and their semantic roles (semantic role labeling), and finally localize these entities in the image (semantic role localization). Existing methods treat verb classification as a single-label problem, but we show through a comprehensive analysis that this formulation fails to address the inherent ambiguity in visual event recognition, as multiple verb categories may reasonably describe the same image. This paper makes three key contributions. First, we reveal through empirical analysis that verb classification is inherently a multi-label problem due to the ubiquitous semantic overlap between verb categories. Second, given the impracticality of fully annotating large-scale datasets with multiple labels, we propose to reformulate verb classification as a single positive multi-label learning (SPMLL) problem, a novel perspective in SR research. Third, we design a comprehensive multi-label evaluation benchmark for SR that fairly evaluates model performance in a multi-label setting. To address the challenges of SPMLL, we further develop the Graph Enhanced Verb Multilayer Perceptron (GE-VerbMLP), which combines graph neural networks to capture label correlations and adversarial training to optimize decision boundaries. Extensive experiments on real-world datasets show that our approach achieves more than 3% improvement on the more meaningful multi-label Average Precision (MAP) metric while remaining competitive on traditional top-1 and top-5 accuracy metrics.
To our knowledge, this is the first work to formulate, solve, and evaluate verb classification in the SPMLL setting, providing theoretical insights and practical tools for advancing situation recognition research. Modern multimedia applications increasingly demand systems that can understand images at both the object level (recognizing individual entities) and the event level (comprehending interactions and activities). Situation Recognition (SR) has emerged as a crucial task addressing this need by extracting structured semantic representations from images [25], [26].
A Sobel-Gradient MLP Baseline for Handwritten Character Recognition
We revisit the classical Sobel operator to ask a simple question: Are first-order edge maps sufficient to drive an all-dense multilayer perceptron (MLP) for handwritten character recognition (HCR), as an alternative to convolutional neural networks (CNNs)? Using only horizontal and vertical Sobel derivatives as input, we train an MLP on MNIST and EMNIST Letters. Despite its extreme simplicity, the resulting network reaches 98% accuracy on MNIST digits and 92% on EMNIST letters -- approaching CNNs while offering a smaller memory footprint and transparent features. Our findings highlight that much of the class-discriminative information in handwritten character images is already captured by first-order gradients, making edge-aware MLPs a compelling option for HCR.
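The input representation described above can be sketched in plain Python as follows. The Sobel kernels are the standard ones. The border handling ("valid" region only), the absence of normalization, and the function names are assumptions, since the abstract does not specify the preprocessing.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal derivative
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical derivative


def correlate_valid(image, kernel):
    """Slide a 3x3 kernel over the image, keeping only fully covered positions."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 2):
        row = []
        for c in range(cols - 2):
            row.append(
                sum(kernel[i][j] * image[r + i][c + j] for i in range(3) for j in range(3))
            )
        out.append(row)
    return out


def sobel_features(image):
    """Flatten the two Sobel maps into one vector, as input for a dense MLP."""
    gx = correlate_valid(image, SOBEL_X)
    gy = correlate_valid(image, SOBEL_Y)
    return [v for row in gx for v in row] + [v for row in gy for v in row]


# Tiny 4x4 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1] for _ in range(4)]
features = sobel_features(image)
```

For this vertical edge the horizontal derivative responds at every position while the vertical derivative is zero everywhere, so the feature vector separates edge orientation before any learning takes place.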
Ranked Set Sampling-Based Multilayer Perceptron: Improving Generalization via Variance-Based Bounds
Li, Feijiang, Zhang, Liuya, Wang, Jieting, Yan, Tao, Qian, Yuhua
The multilayer perceptron (MLP), one of the most fundamental neural networks, is extensively used for classification and regression tasks. In this paper, we establish a new generalization error bound, which reveals how the variance of the empirical loss influences the generalization ability of the learning model. Inspired by this learning bound, we advocate reducing the variance of the empirical loss to enhance the ability of the MLP. As is well known, bagging is a popular ensemble method for variance reduction. However, bagging produces the base training data sets by Simple Random Sampling (SRS), which exhibits a high degree of randomness. To handle this issue, we introduce an ordered structure in the training data set by Ranked Set Sampling (RSS) to further reduce the variance of the loss, and develop an RSS-MLP method. Theoretical results show that the variances of the empirical exponential loss and the logistic loss estimated by RSS are smaller than those estimated by SRS. To validate the performance of RSS-MLP, we conduct comparison experiments on twelve benchmark data sets in terms of the two convex loss functions under two fusion methods. Extensive experimental results and analysis illustrate the effectiveness and rationality of the proposed method.
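The classical ranked set sampling procedure can be sketched as below. This is a generic textbook version, not the RSS-MLP implementation: the ranking variable (`key`), the set size, the number of cycles, and the toy population are all assumptions made for illustration.

```python
import random


def ranked_set_sample(population, set_size, cycles, key, rng):
    """Draw set_size * cycles units by classical ranked set sampling.

    In each cycle, for every rank r, draw a fresh simple random sample of
    set_size units, sort it by `key`, and keep only the r-th ranked unit.
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = rng.sample(population, set_size)
            candidates.sort(key=key)
            sample.append(candidates[rank])
    return sample


rng = random.Random(0)
# Hypothetical (feature, label) pairs standing in for a training set.
population = [(x, x % 2) for x in range(100)]
subset = ranked_set_sample(
    population, set_size=5, cycles=4, key=lambda item: item[0], rng=rng
)
```

Because each retained unit comes from a different rank position, the subset is spread more evenly across the ranking variable than a simple random sample of the same size, which is the intuition behind the variance reduction claimed above. A bagging-style ensemble would repeat this draw once per base learner.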