AITopics | Perceptrons

Perceptrons

Neural Architecture Search with Bayesian Optimisation and Optimal Transport

Neural Information Processing Systems | Nov-20-2025, 23:13:24 GMT

Bayesian Optimisation (BO) refers to a class of methods for global optimisation of a function f which is only accessible via point evaluations. It is typically used in settings where f is expensive to evaluate. A common use case for BO in machine learning is model selection, where it is not possible to analytically model the generalisation performance of a statistical model, and we resort to noisy and expensive training and validation procedures to choose the best model. Conventional BO methods have focused on Euclidean and categorical domains, which, in the context of model selection, only permit tuning scalar hyper-parameters of machine learning algorithms. However, with the surge of interest in deep learning, there is an increasing demand to tune neural network architectures. In this work, we develop NASBOT, a Gaussian process-based BO framework for neural architecture search. To accomplish this, we develop a distance metric in the space of neural network architectures which can be computed efficiently via an optimal transport program. This distance might be of independent interest to the deep learning community as it may find applications outside of BO. We demonstrate that NASBOT outperforms other alternatives for architecture search in several cross-validation-based model selection tasks on multi-layer perceptrons and convolutional neural networks.

bayesian optimisation, bayesian optimisation and optimal transport, neural architecture search, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.59)

Recurrent Relational Networks

Neural Information Processing Systems | Nov-20-2025, 22:53:33 GMT

This paper is concerned with learning to solve tasks that require a chain of interdependent steps of relational inference, like answering complex questions about the relationships between objects, or solving puzzles where the smaller elements of a solution mutually constrain each other. We introduce the recurrent relational network, a general-purpose module that operates on a graph representation of objects. As a generalization of Santoro et al. [2017]'s relational network, it can augment any neural network model with the capacity to do many-step relational reasoning. We achieve state-of-the-art results on the bAbI textual question-answering dataset with the recurrent relational network, consistently solving 20/20 tasks. As bAbI is not particularly challenging from a relational reasoning point of view, we introduce Pretty-CLEVR, a new diagnostic dataset for relational reasoning. In the Pretty-CLEVR set-up, we can vary the question to control for the number of relational reasoning steps that are required to obtain the answer. Using Pretty-CLEVR, we probe the limitations of multi-layer perceptrons, relational networks, and recurrent relational networks. Finally, we show how recurrent relational networks can learn to solve Sudoku puzzles from supervised training data, a challenging task requiring upwards of 64 steps of relational reasoning. We achieve state-of-the-art results amongst comparable methods by solving 96.6% of the hardest Sudoku puzzles.

name change, recurrent relational network, relational reasoning, (4 more...)

Industry: Leisure & Entertainment > Games > Sudoku (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.59)

NTK-Guided Implicit Neural Teaching

Zhang, Chen, Zuo, Wei, Cheng, Bingyang, Wang, Yikun, Kou, Wei-Bin, WU, Yik Chung, Wong, Ngai

arXiv.org Artificial Intelligence | Nov-20-2025

Implicit Neural Representations (INRs) parameterize continuous signals via multilayer perceptrons (MLPs), enabling compact, resolution-independent modeling for tasks like image, audio, and 3D reconstruction. However, fitting high-resolution signals demands optimizing over millions of coordinates, incurring prohibitive computational costs. To address this, we propose NTK-Guided Implicit Neural Teaching (NINT), which accelerates training by dynamically selecting coordinates that maximize global functional updates. Leveraging the Neural Tangent Kernel (NTK), NINT scores examples by the norm of their NTK-augmented loss gradients, capturing both fitting errors and heterogeneous leverage (self-influence and cross-coordinate coupling). This dual consideration enables faster convergence compared to existing methods. Through extensive experiments, we demonstrate that NINT reduces training time by nearly half while maintaining or improving representation quality, establishing state-of-the-art acceleration among recent sampling-based strategies.
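
The coordinate-scoring idea in this abstract can be illustrated with a toy example. The sketch below is not the NINT implementation; it assumes a model that is linear in its parameters (hypothetical random Fourier features), so that the empirical NTK is simply the Gram matrix of the feature vectors. The score used here, |r_j| times the norm of column j of K, is one plausible reading of "norm of the NTK-augmented loss gradient", and every function name is made up for illustration.

```python
import math
import random

def features(x, freqs):
    """Hypothetical Fourier features. For a model linear in its parameters,
    the parameter gradient of f(x) is exactly this feature vector."""
    return [math.sin(w * x) for w in freqs] + [math.cos(w * x) for w in freqs]

def ntk_matrix(xs, freqs):
    """Empirical NTK: K[i][j] = <grad_theta f(x_i), grad_theta f(x_j)>."""
    phis = [features(x, freqs) for x in xs]
    return [[sum(a * b for a, b in zip(pi, pj)) for pj in phis] for pi in phis]

def ntk_scores(K, residuals):
    """Score coordinate j by the size of the functional change that a
    gradient step on it alone would cause at all coordinates:
    |r_j| * ||K[:, j]||. The residual captures fitting error; the kernel
    column captures self-influence (K[j][j]) and cross-coordinate coupling."""
    n = len(residuals)
    return [abs(residuals[j]) * math.sqrt(sum(K[i][j] ** 2 for i in range(n)))
            for j in range(n)]

def select_top_k(scores, k):
    """Indices of the k highest-scoring coordinates."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

random.seed(0)
xs = [i / 15 for i in range(16)]                   # 16 coordinates in [0, 1]
freqs = [1.0, 3.0, 7.0]                            # assumed feature frequencies
K = ntk_matrix(xs, freqs)
residuals = [random.uniform(-1, 1) for _ in xs]    # stand-in for f(x) - y
scores = ntk_scores(K, residuals)
chosen = select_top_k(scores, 4)                   # mini-batch to train on next
```

In this toy setting the diagonal of K is constant (sin² + cos² = 1 per frequency), so differences between scores come only from the residuals and the cross-coordinate coupling; in a real MLP the self-influence term would also vary across coordinates.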

artificial intelligence, machine learning, representation, (17 more...)

2511.15487

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Fast Post-Hoc Confidence Fusion for 3-Class Open-Set Aerial Object Detection

Loukovitis, Spyridon, Karampinis, Vasileios, Voulodimos, Athanasios

arXiv.org Artificial Intelligence | Nov-20-2025

Developing reliable UAV navigation systems requires robust air-to-air object detectors capable of distinguishing between objects seen during training and previously unseen objects. While many methods address closed-set detection and achieve high-confidence recognition of in-domain (ID) targets, they generally do not tackle open-set detection, which requires simultaneous handling of both ID and out-of-distribution (OOD) objects. Existing open-set approaches typically rely on a single uncertainty score with thresholding, limiting flexibility and often conflating OOD objects with background clutter. In contrast, we propose a lightweight, model-agnostic post-processing framework that explicitly separates background from unknown objects while preserving the base detector's performance. Our approach extends open-set detection beyond binary ID/OOD classification to real-time three-way classification among ID targets, OOD objects, and background. To this end, we employ a fusion scheme that aggregates multiple confidence estimates and per-detection features using a compact multilayer perceptron (MLP). Incorporating different logit variants into the MLP consistently enhances performance across both binary and three-class classification without compromising throughput. Extensive ablation and comparative experiments confirm that our method surpasses threshold-based baselines in two-class classification by an average of 2.7% AUROC, while retaining or improving open-set mAP. Furthermore, our study uniquely enables robust three-class classification, a critical capability for safe UAV navigation, where OOD objects must be actively avoided and background regions safely ignored. Comparative analysis highlights that our method surpasses competitive techniques in AUROC across datasets, while improving closed-set mAP by up to 9 points, an 18% relative gain.
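
As a rough illustration of the fusion step described above, the following sketch feeds a handful of per-detection confidence features through a compact MLP that outputs three class probabilities. It is a minimal sketch under stated assumptions, not the authors' model: the feature set, layer sizes, and random weights are all hypothetical, and a real system would train the weights on labelled detections.

```python
import math
import random

LABELS = ("ID", "OOD", "background")

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden ReLU layer followed by a 3-way softmax."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    logits = [sum(w * hi for w, hi in zip(row, h)) + b
              for row, b in zip(W2, b2)]
    return softmax(logits)

def init(rows, cols, rng):
    """Random (untrained) weights, for illustration only."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_in, n_hidden = 4, 8
W1, b1 = init(n_hidden, n_in, rng), [0.0] * n_hidden
W2, b2 = init(len(LABELS), n_hidden, rng), [0.0] * len(LABELS)

# Hypothetical features for one detection:
# [max softmax score, max logit, predictive entropy, normalised box area]
detection = [0.62, 3.1, 0.9, 0.05]
probs = mlp_forward(detection, W1, b1, W2, b2)
label = LABELS[probs.index(max(probs))]
```

Because the MLP runs on a few scalars per detection after the base detector has finished, a design like this leaves the detector's own outputs unchanged, which is consistent with the post-hoc, model-agnostic framing in the abstract.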

artificial intelligence, detection, machine learning, (13 more...)

2511.15343

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.54)

Vehicle Routing Problems via Quantum Graph Attention Network Deep Reinforcement Learning

Giang, Le Tung, Viet, Vu Hoang, Tung, Nguyen Xuan, Van Chien, Trinh, Hwang, Won-Joo

arXiv.org Artificial Intelligence | Nov-20-2025

The vehicle routing problem (VRP) is a fundamental NP-hard task in intelligent transportation systems with broad applications in logistics and distribution. Deep reinforcement learning (DRL) with Graph Neural Networks (GNNs) has shown promise, yet classical models rely on large multi-layer perceptrons (MLPs) that are parameter-heavy and memory-bound. We propose a Quantum Graph Attention Network (Q-GAT) within a DRL framework, where parameterized quantum circuits (PQCs) replace conventional MLPs at critical readout stages. The hybrid model maintains the expressive capacity of graph attention encoders while reducing trainable parameters by more than 50%. Using proximal policy optimization (PPO) with greedy and stochastic decoding, experiments on VRP benchmarks show that Q-GAT achieves faster convergence and reduces routing cost by about 5% compared with classical GAT baselines. These results demonstrate the potential of PQC-enhanced GNNs as compact and effective solvers for large-scale routing and logistics optimization.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2511.15175

Country: Asia > Vietnam (0.29)

Genre: Research Report > New Finding (0.34)

Industry: Transportation > Freight & Logistics Services (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

PolyKAN: Efficient Fused GPU Operators for Polynomial Kolmogorov-Arnold Network Variants

Yu, Mingkun, Zhong, Heming, Huang, Dan, Lu, Yutong, Jiang, Jiazhi

arXiv.org Artificial Intelligence | Nov-20-2025

Kolmogorov-Arnold Networks (KANs) promise higher expressive capability and stronger interpretability than Multi-Layer Perceptrons, particularly in the domain of AI for Science. However, practical adoption has been hindered by low GPU utilization of existing parallel implementations. To address this challenge, we present a GPU-accelerated operator library named PolyKAN, which is the first general open-source implementation of KAN and its variants. PolyKAN fuses the forward and backward passes of polynomial KAN layers into a concise set of optimized CUDA kernels. Four orthogonal techniques underpin the design: (i) a lookup table with linear interpolation that replaces runtime-expensive math-library functions; (ii) 2D tiling to expose thread-level parallelism while preserving memory locality; (iii) a two-stage reduction scheme converting scattered atomic updates into a single controllable merge step; and (iv) coefficient-layout reordering yielding unit-stride reads under the tiled schedule. Using a KAN variant, Chebyshev KAN, as a case study, PolyKAN delivers 1.2× to 10× faster inference and 1.4× to 12× faster training than a Triton + cuBLAS baseline, with identical accuracy on speech, audio-enhancement, and tabular-regression workloads on both high-end and consumer-grade GPUs.
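
Technique (i) can be sketched on the CPU to show the idea. The example below is an illustration, not PolyKAN's kernel: it assumes a Chebyshev KAN layer that squashes inputs with tanh before evaluating the Chebyshev basis (a common formulation, but not confirmed by the abstract), replaces the tanh call with a precomputed table plus linear interpolation, and evaluates the polynomials with the standard three-term recurrence. The table range, table size, and function names are all hypothetical.

```python
import math

def chebyshev(n, x):
    """T_n(x) via the recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def build_lut(fn, lo, hi, size):
    """Sample fn at `size` evenly spaced points on [lo, hi]."""
    step = (hi - lo) / (size - 1)
    return [fn(lo + i * step) for i in range(size)], lo, step

def lut_eval(lut, x):
    """Approximate fn(x) by linear interpolation between table entries;
    inputs outside the table range are clamped to its end points."""
    table, lo, step = lut
    pos = min(max((x - lo) / step, 0.0), float(len(table) - 1))
    i = min(int(pos), len(table) - 2)
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])

# Assumed settings: tanh tabulated on [-4, 4] with 1025 samples.
tanh_lut = build_lut(math.tanh, -4.0, 4.0, 1025)

def cheby_kan_basis(x, degree):
    """Basis values T_0..T_degree at the squashed input, using the table."""
    u = lut_eval(tanh_lut, x)
    return [chebyshev(k, u) for k in range(degree + 1)]

basis = cheby_kan_basis(0.3, 4)
```

The trade-off is accuracy against table size: with this spacing the interpolation error for tanh stays far below typical single-precision training noise, while each evaluation costs one multiply-add and two table reads instead of a transcendental function call.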

kan variant, machine learning, natural language, (18 more...)

2511.14852

Country: Asia > China (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

MTP: Exploring Multimodal Urban Traffic Profiling with Modality Augmentation and Spectrum Fusion

Xiang, Haolong, Wang, Peisi, Xu, Xiaolong, Yi, Kun, Zhang, Xuyun, Sheng, Quanzheng, Beheshti, Amin, Fan, Wei

arXiv.org Artificial Intelligence | Nov-18-2025

With rapid urbanization, traffic signals from various sensors play a significant role in monitoring the state of cities, providing a strong foundation for ensuring safe travel, reducing traffic congestion, and optimizing urban mobility. Most existing methods for traffic signal modeling rely on the original data modality, i.e., direct numerical readings from city sensors. However, this unimodal approach overlooks the semantic information present in multimodal, heterogeneous urban data, which hinders a comprehensive understanding of traffic signals and limits the accurate prediction of complex traffic dynamics. To address this problem, we propose a novel Multimodal framework, MTP, for urban Traffic Profiling, which learns multimodal features from numeric, visual, and textual perspectives. The three branches provide a multimodal view of urban traffic signal learning in the frequency domain, while the frequency learning strategies refine the information to be extracted. Specifically, we first perform visual augmentation of the traffic signals, transforming the original modality into frequency images and periodicity images for visual learning. We also augment the traffic signals with descriptive texts based on the specific topic, background information, and item description for textual learning. To complement the numeric information, we use frequency multilayer perceptrons to learn from the original modality. We design a hierarchical contrastive learning scheme across the three branches to fuse the spectra of the three modalities. Finally, extensive experiments on six real-world datasets demonstrate superior performance compared with state-of-the-art approaches.

artificial intelligence, machine learning, natural language, (16 more...)

2511.10218

Country:

Oceania (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.69)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

Catastrophic Forgetting in Kolmogorov-Arnold Networks

Rahman, Mohammad Marufur, Wang, Guanchu, Zhou, Kaixiong, Chen, Minghan, Yang, Fan

arXiv.org Artificial Intelligence | Nov-18-2025

Catastrophic forgetting is a longstanding challenge in continual learning, where models lose knowledge from earlier tasks when learning new ones. While various mitigation strategies have been proposed for Multi-Layer Perceptrons (MLPs), recent architectural advances like Kolmogorov-Arnold Networks (KANs) have been suggested to offer intrinsic resistance to forgetting by leveraging localized spline-based activations. However, the practical behavior of KANs under continual learning remains unclear, and their limitations are not well understood. To address this, we present a comprehensive study of catastrophic forgetting in KANs and develop a theoretical framework that links forgetting to activation support overlap and intrinsic data dimension. We validate these analyses through systematic experiments on synthetic and vision tasks, measuring forgetting dynamics under varying model configurations and data complexity. Further, we introduce KAN-LoRA, a novel adapter design for parameter-efficient continual fine-tuning of language models, and evaluate its effectiveness in knowledge editing tasks. Our findings reveal that while KANs exhibit promising retention in low-dimensional algorithmic settings, they remain vulnerable to forgetting in high-dimensional domains such as image classification and language modeling. These results advance the understanding of KANs' strengths and limitations, offering practical insights for continual learning system design.

kan, machine learning, natural language, (16 more...)

2511.12828

Country: Europe (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Batch Matrix-form Equations and Implementation of Multilayer Perceptrons

Wesselink, Wieger, Grooten, Bram, van de Wetering, Huub, Xiao, Qiao, Mocanu, Decebal Constantin

arXiv.org Artificial Intelligence | Nov-18-2025

Multilayer perceptrons (MLPs) remain fundamental to modern deep learning, yet their algorithmic details are rarely presented in complete, explicit batch matrix-form. Rather, most references express gradients per sample or rely on automatic differentiation. Although automatic differentiation can achieve equally high computational efficiency, the use of batch matrix-form makes the computational structure explicit, which is essential for transparent, systematic analysis and optimization in settings such as sparse neural networks. This paper fills that gap by providing a mathematically rigorous and implementation-ready specification of MLPs in batch matrix-form. We derive forward and backward equations for all standard and advanced layers, including batch normalization and softmax, and validate all equations using the symbolic mathematics library SymPy. From these specifications, we construct uniform reference implementations in NumPy, PyTorch, JAX, TensorFlow, and a high-performance C++ backend optimized for sparse operations. Our main contributions are: (1) a complete derivation of batch matrix-form backpropagation for MLPs, (2) symbolic validation of all gradient equations, (3) uniform Python and C++ reference implementations grounded in a small set of matrix primitives, and (4) demonstration of how explicit formulations enable efficient sparse computation. Together, these results establish a validated, extensible foundation for understanding, teaching, and researching neural network algorithms.
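
To make "batch matrix-form" concrete, here is a minimal sketch for a single linear layer, written in plain Python rather than any of the paper's backends. It assumes a row-wise batch layout (X is N×D, W is K×D, b has length K), which may differ from the paper's notation. Under that assumption the forward pass is Y = X Wᵀ + 1 bᵀ, and for an upstream gradient DY the backward pass is DX = DY W, DW = DYᵀ X, and Db = the column sums of DY. The helper names are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def linear_forward(X, W, b):
    """Y = X W^T + 1 b^T for a whole batch at once (X: NxD, W: KxD)."""
    Y = matmul(X, transpose(W))
    return [[y + bj for y, bj in zip(row, b)] for row in Y]

def linear_backward(X, W, DY):
    """Batch gradients: DX = DY W, DW = DY^T X, Db = column sums of DY."""
    DX = matmul(DY, W)
    DW = matmul(transpose(DY), X)
    Db = [sum(col) for col in zip(*DY)]
    return DX, DW, Db

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]     # N = 3 samples, D = 2 inputs
W = [[0.1, -0.2], [0.3, 0.4]]                # K = 2 outputs
b = [0.5, -0.5]
G = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.0]]    # fixed upstream gradient DY

def loss(X, W, b):
    """Scalar test loss sum(Y * G), chosen so that dL/dY equals G."""
    Y = linear_forward(X, W, b)
    return sum(y * g for ry, rg in zip(Y, G) for y, g in zip(ry, rg))

DX, DW, Db = linear_backward(X, W, G)
```

A quick way to gain confidence in equations like these is a finite-difference check: perturb one entry of W by a small epsilon, recompute `loss`, and compare the slope with the matching entry of `DW`. The paper reports doing the analogous validation symbolically with SymPy.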

artificial intelligence, implementation, machine learning, (17 more...)

2511.11918

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Evaluating Multiple Instance Learning Strategies for Automated Sebocyte Droplet Counting

Adelipour, Maryam, Carneiro, Gustavo, Kim, Jeongkwon

arXiv.org Artificial Intelligence | Nov-18-2025

Sebocytes are lipid-secreting cells whose differentiation is marked by the accumulation of intracellular lipid droplets, making their quantification a key readout in sebocyte biology. Manual counting is labor-intensive and subjective, motivating automated solutions. Here, we introduce a simple attention-based multiple instance learning (MIL) framework for sebocyte image analysis. Nile Red-stained sebocyte images were annotated into 14 classes according to droplet counts, expanded via data augmentation to about 50,000 cells. Two models were benchmarked: a baseline multi-layer perceptron (MLP) trained on aggregated patch-level counts, and an attention-based MIL model leveraging precomputed ResNet-50 feature embeddings with trainable instance weighting. Experiments using five-fold cross-validation showed that the baseline MLP achieved more stable performance (mean MAE = 5.6) compared with the attention-based MIL, which was less consistent (mean MAE = 10.7) but occasionally superior in specific folds. These findings indicate that simple bag-level aggregation provides a robust baseline for slide-level droplet counting, while attention-based MIL requires task-aligned pooling and regularization to fully realize its potential in sebocyte image analysis.
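
The contrast between bag-level aggregation and trainable instance weighting can be sketched as follows. This is an illustration under assumptions, not the authors' model: it uses the common attention-pooling form a_k = softmax_k(wᵀ tanh(V h_k)) with bag embedding z = Σ a_k h_k, tiny made-up embeddings in place of ResNet-50 features, and random untrained parameters.

```python
import math
import random

def attention_pool(instances, V, w):
    """Attention-based MIL pooling: weight each instance embedding h_k by
    a_k = softmax_k(w . tanh(V h_k)) and return the weighted sum."""
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights

def mean_pool(instances):
    """Baseline bag-level aggregation: unweighted mean of the instances."""
    n, dim = len(instances), len(instances[0])
    return [sum(h[d] for h in instances) / n for d in range(dim)]

rng = random.Random(0)
dim, attn_dim, n_instances = 4, 3, 5
# Hypothetical patch embeddings for one bag (one image).
bag_instances = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(n_instances)]
V = [[rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(attn_dim)]
w = [rng.gauss(0.0, 0.5) for _ in range(attn_dim)]

bag_embedding, attn_weights = attention_pool(bag_instances, V, w)
baseline_embedding = mean_pool(bag_instances)
```

When w is all zeros every instance receives the same weight and attention pooling reduces to the mean, so the attention model contains the simple aggregation as a special case; the extra flexibility only helps if the learned weights align with the counting task, which matches the abstract's remark about task-aligned pooling and regularization.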

artificial intelligence, droplet, machine learning, (18 more...)

2509.04895

Country: Asia > South Korea (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Dermatology (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)
