AITopics

2511.03531

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

arXiv.org Machine LearningNov-4-2025

Binary perceptron computational gap -- a parametric fl RDT view

Stojnic, Mihailo

Recent studies suggest that asymmetric binary perceptron (ABP) likely exhibits the so-called statistical-computational gap characterized with the appearance of two phase transitioning constraint density thresholds: \textbf{\emph{(i)}} the \emph{satisfiability threshold} $α_c$, below/above which ABP succeeds/fails to operate as a storage memory; and \textbf{\emph{(ii)}} \emph{algorithmic threshold} $α_a$, below/above which one can/cannot efficiently determine ABP's weight so that it operates as a storage memory. We consider a particular parametric utilization of \emph{fully lifted random duality theory} (fl RDT) [85] and study its potential ABP's algorithmic implications. A remarkable structural parametric change is uncovered as one progresses through fl RDT lifting levels. On the first two levels, the so-called $\c$ sequence -- a key parametric fl RDT component -- is of the (natural) decreasing type. A change of such phenomenology on higher levels is then connected to the $α_c$ -- $α_a$ threshold change. Namely, on the second level concrete numerical values give for the critical constraint density $α=α_c\approx 0.8331$. While progressing through higher levels decreases this estimate, already on the fifth level we observe a satisfactory level of convergence and obtain $α\approx 0.7764$. This allows to draw two striking parallels: \textbf{\emph{(i)}} the obtained constraint density estimate is in a remarkable agrement with range $α\in (0.77,0.78)$ of clustering defragmentation (believed to be responsible for failure of locally improving algorithms) [17,88]; and \textbf{\emph{(ii)}} the observed change of $\c$ sequence phenomenology closely matches the one of the negative Hopfield model for which the existence of efficient algorithms that closely approach similar type of threshold has been demonstrated recently [87].

artificial intelligence, machine learning, perceptron, (19 more...)

arXiv.org Machine Learning

2511.01037

Country:

North America > United States > Colorado > Denver County > Denver (0.04)
Africa > Sudan (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(11 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.63)

arXiv.org Machine LearningNov-4-2025

A Proof of Learning Rate Transfer under $μ$P

Hayou, Soufiane

We provide the first proof of learning rate transfer with width in a linear multi-layer perceptron (MLP) parametrized with $μ$P, a neural network parameterization designed to ``maximize'' feature learning in the infinite-width limit. We show that under $μP$, the optimal learning rate converges to a \emph{non-zero constant} as width goes to infinity, providing a theoretical explanation to learning rate transfer. In contrast, we show that this property fails to hold under alternative parametrizations such as Standard Parametrization (SP) and Neural Tangent Parametrization (NTP). We provide intuitive proofs and support the theoretical findings with extensive empirical results.

artificial intelligence, convergence, machine learning, (17 more...)

arXiv.org Machine Learning

2511.01734

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ballesteros-Jerez, Javier, Martínez-Gómez, Jesus, García-Varea, Ismael, Orozco-Barbosa, Luis, Castillo-Cara, Manuel

Hybrid Neural Network-Based Indoor Localisation System for Mobile Robots Using CSI Data in a Robotics Simulator

We present a hybrid neural network model for inferring the position of mobile robots using Channel State Information (CSI) data from a Massive MIMO system. By leveraging an existing CSI dataset, our approach integrates a Convolutional Neural Network (CNN) with a Multilayer Perceptron (MLP) to form a Hybrid Neural Network (HyNN) that estimates 2D robot positions. CSI readings are converted into synthetic images using the TINTO tool. The localisation solution is integrated with a robotics simulator, and the Robot Operating System (ROS), which facilitates its evaluation through heterogeneous test cases, and the adoption of state estimators like Kalman filters. Our contributions illustrate the potential of our HyNN model in achieving precise indoor localisation and navigation for mobile robots in complex environments. The study follows, and proposes, a generalisable procedure applicable beyond the specific use case studied, making it adaptable to different scenarios and datasets.

artificial intelligence, deep learning, machine learning, (15 more...)

doi: 10.1007/978-3-032-00986-9_11

2511.01797

Country: Europe > Spain > Castilla-La Mancha (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Jamet, Alexandre Valentin, Vavouliotis, Georgios, Jiménez, Daniel A., Alvarez, Lluc, Casas, Marc

A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering

To alleviate the performance and energy overheads of contemporary applications with large data footprints, we propose the Two Level Perceptron (TLP) predictor, a neural mechanism that effectively combines predicting whether an access will be off-chip with adaptive prefetch filtering at the first-level data cache (L1D). TLP is composed of two connected microarchitectural perceptron predictors, named First Level Predictor (FLP) and Second Level Predictor (SLP). FLP performs accurate off-chip prediction by using several program features based on virtual addresses and a novel selective delay component. The novelty of SLP relies on leveraging off-chip prediction to drive L1D prefetch filtering by using physical addresses and the FLP prediction as features. TLP constitutes the first hardware proposal targeting both off-chip prediction and prefetch filtering using a multi-level perceptron hardware approach. TLP only requires 7KB of storage. To demonstrate the benefits of TLP we compare its performance with state-of-the-art approaches using off-chip prediction and prefetch filtering on a wide range of single-core and multi-core workloads. Our experiments show that TLP reduces the average DRAM transactions by 30.7% and 17.7%, as compared to a baseline using state-of-the-art cache prefetchers but no off-chip prediction mechanism, across the single-core and multi-core workloads, respectively, while recent work significantly increases DRAM transactions. As a result, TLP achieves geometric mean performance speedups of 6.2% and 11.8% across single-core and multi-core workloads, respectively. In addition, our evaluation demonstrates that TLP is effective independently of the L1D prefetching logic.

artificial intelligence, machine learning, prediction, (15 more...)

doi: 10.1109/HPCA57654.2024.00046

2403.15181

Country: North America > United States (0.46)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.77)

Zhong, Luoyan, Kim, Heather Jin Hee, Losey, Dylan P., Nunez, Cara M.

Curvature-Aware Calibration of Tactile Sensors for Accurate Force Estimation on Non-Planar Surfaces

Flexible tactile sensors are increasingly used in real-world applications such as robotic grippers, prosthetic hands, wearable gloves, and assistive devices, where they need to conform to curved and irregular surfaces. However, most existing tactile sensors are calibrated only on flat substrates, and their accuracy and consistency degrade once mounted on curved geometries. This limitation restricts their reliability in practical use. To address this challenge, we develop a calibration model for a widely used resistive tactile sensor design that enables accurate force estimation on one-dimensional curved surfaces. We then train a neural network (a multilayer perceptron) to predict local curvature from baseline sensor outputs recorded under no applied load, achieving an R2 score of 0.91. The proposed approach is validated on five daily objects with varying curvatures under forces from 2 N to 8 N. Results show that the curvature-aware calibration maintains consistent force accuracy across all surfaces, while flat-surface calibration underestimates force as curvature increases. Our results demonstrate that curvature-aware modeling improves the accuracy, consistency, and reliability of flexible tactile sensors, enabling dependable performance across real-world applications.

artificial intelligence, curvature, machine learning, (16 more...)

2510.25965

Country:

North America > United States (0.29)
Europe > Austria (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Materials > Chemicals (0.46)
Health & Medicine > Health Care Technology (0.34)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Ma, Zixuan, Huang, Chenfeng

Effective Series Decomposition and Components Learning for Time Series Generation

Time series generation focuses on modeling the underlying data distribution and resampling to produce authentic time series data. Key components, such as trend and seasonality, drive temporal fluctuations, yet many existing approaches fail to employ interpretative decomposition methods, limiting their ability to synthesize meaningful trend and seasonal patterns. To address this gap, we introduce Seasonal-Trend Diffusion (STDiffusion), a novel framework for multivariate time series generation that integrates diffusion probabilistic models with advanced learnable series decomposition techniques, enhancing the interpretability of the generation process. Our approach separates the trend and seasonal learning into distinct blocks: a Multi-Layer Perceptron (MLP) structure captures the trend, while adaptive wavelet distillation facilitates effective multi-resolution learning of seasonal components. This decomposition improves the interpretability of the model on multiple scales. In addition, we designed a comprehensive correction mechanism aimed at ensuring that the generated components exhibit a high degree of internal consistency and preserve meaningful interrelationships with one another. Our empirical studies on eight real-world datasets demonstrate that STDiffusion achieves state-of-the-art performance in time series generation tasks. Furthermore, we extend the model's application to multi-window long-sequence time series generation, which delivered reliable results and highlighted its robustness and versatility.

artificial intelligence, machine learning, time sery generation, (15 more...)

2511.00747

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

arXiv.org Artificial IntelligenceNov-3-2025

Balancing Knowledge Updates: Toward Unified Modular Editing in LLMs

Liu, Jiahao, Wang, Zijian, Zhao, Kuo, Hu, Dong

Knowledge editing has emerged as an efficient approach for updating factual knowledge in large language models (LLMs), typically achieved by first locating key knowledge-storage modules and then modifying their parameters. However, most existing methods focus exclusively on updating the weights of Multi-Layer Perceptron (MLP) modules, which are commonly identified as the primary repositories of factual information. Other important components, such as attention (Attn) modules--one of the core modules in LLMs--are often ignored during editing. This biased allocation of updates can leave residual outdated knowledge in the model and limit the effectiveness of knowledge editing. In this paper, we conduct comprehensive and systematic knowledge localization experiments on advanced LLMs, revealing that Attn modules play a substantial role in factual knowledge storage and retrieval, especially in earlier layers. Building on these insights, we propose IntAttn-Edit, a novel method that extends the associative memory paradigm to jointly update both MLP and Attn modules. Our approach employs a knowledge balancing strategy that proportionally allocates update magnitudes based on each module's measured contribution to knowledge storage. Extensive experiments on popular benchmarks demonstrate that IntAttn-Edit consistently achieves superior results over existing methods, delivering higher edit success, improved generalization, and robust knowledge preservation. Further empirical analysis shows that our knowledge balancing strategy enables the editing performance to remain within the optimal range across different settings.

large language model, machine learning, natural language, (18 more...)

2510.274

Genre: Research Report > Promising Solution (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Talpur, Unzela, Syed, Zafi Sherhan, Syed, Muhammad Shehram Shah, Syed, Abbas Shah

Cross-Corpus Validation of Speech Emotion Recognition in Urdu using Domain-Knowledge Acoustic Features

arXiv.org Artificial IntelligenceNov-3-2025

Speech Emotion Recognition (SER) is a key affective computing technology that enables emotionally intelligent artificial intelligence. While SER is challenging in general, it is particularly difficult for low-resource languages such as Urdu. This study investigates Urdu SER in a cross-corpus setting, an area that has remained largely unexplored. We employ a cross-corpus evaluation framework across three different Urdu emotional speech datasets to test model generalization. Two standard domain-knowledge based acoustic feature sets, eGeMAPS and ComParE, are used to represent speech signals as feature vectors which are then passed to Logistic Regression and Multilayer Perceptron classifiers. Classification performance is assessed using unweighted average recall (UAR) whilst considering class-label imbalance. Results show that Self-corpus validation often overestimates performance, with UAR exceeding cross-corpus evaluation by up to 13%, underscoring that cross-corpus evaluation offers a more realistic measure of model robustness. Overall, this work emphasizes the importance of cross-corpus validation for Urdu SER and its implications contribute to advancing affective computing research for underrepresented language communities.

artificial intelligence, dataset, machine learning, (12 more...)

2510.26823

Country:

Europe (0.46)
Asia (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

arXiv.org Artificial IntelligenceOct-31-2025

LiGen: GAN-Augmented Spectral Fingerprinting for Indoor Positioning

Lin, Jie, Lee, Hsun-Yu, Li, Ho-Ming, Wu, Fang-Jing

Accurate and robust indoor localization is critical for smart building applications, yet existing Wi-Fi-based systems are often vulnerable to environmental conditions. This work presents a novel indoor localization system, called LiGen, that leverages the spectral intensity patterns of ambient light as fingerprints, offering a more stable and infrastructure-free alternative to radio signals. To address the limited spectral data, we design a data augmentation framework based on generative adversarial networks (GANs), featuring two variants: PointGAN, which generates fingerprints conditioned on coordinates, and FreeGAN, which uses a weak localization model to label unconditioned samples. Our positioning model, leveraging a Multi-Layer Perceptron (MLP) architecture to train on synthesized data, achieves submeter-level accuracy, outperforming Wi-Fi-based baselines by over 50\%. LiGen also demonstrates strong robustness in cluttered environments. To the best of our knowledge, this is the first system to combine spectral fingerprints with GAN-based data augmentation for indoor localization.

artificial intelligence, fingerprint, machine learning, (18 more...)

2508.03024

Country:

Asia > Taiwan (0.15)
North America > United States (0.14)

Genre: Research Report (0.41)

Industry: Information Technology > Smart Houses & Appliances (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)