AITopics | multilayer perceptron

Collaborating Authors

multilayer perceptron

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Scaling can lead to compositional generalization

Neural Information Processing SystemsJun-17-2026, 05:57:09 GMT

Can neural networks systematically capture discrete, compositional task structure despite their continuous, distributed nature? The impressive capabilities of largescale neural networks suggest that the answer to this question is yes. However, even for the most capable models, there are still frequent failure cases that raise doubts about their compositionality. Here, we seek to understand what it takes for a standard neural network to generalize over tasks that share compositional structure. We find that simply scaling data and model size leads to compositional generalization. We show that this holds across different task encodings as long as the training distribution sufficiently covers the task space. In line with this finding, we prove that standard multilayer perceptrons can approximate a general class of compositional task families to arbitrary precision using only a linear number of neurons with respect to the number of task modules. Finally, we uncover that if networks successfully compositionally generalize, the constituents of a task can be linearly decoded from their hidden activations. We show that this metric correlates with failures of text-to-image generation models to compose known concepts.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
North America > Mexico (0.28)
Asia > Middle East (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

f4b31bee138ff5f7b84ce1575a738f95-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 02:52:34 GMT

international conference, neural network, perturbation, (11 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Theoretical Compression Bounds for Wide Multilayer Perceptrons

Cheairi, Houssam El, Gamarnik, David, Mazumder, Rahul

arXiv.org Artificial IntelligenceDec-9-2025

Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.

artificial intelligence, machine learning, pruning, (16 more...)

arXiv.org Artificial Intelligence

2512.06288

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Add feedback

Batch Matrix-form Equations and Implementation of Multilayer Perceptrons

Wesselink, Wieger, Grooten, Bram, van de Wetering, Huub, Xiao, Qiao, Mocanu, Decebal Constantin

arXiv.org Artificial IntelligenceNov-18-2025

Multilayer perceptrons (MLPs) remain fundamental to modern deep learning, yet their algorithmic details are rarely presented in complete, explicit \emph{batch matrix-form}. Rather, most references express gradients per sample or rely on automatic differentiation. Although automatic differentiation can achieve equally high computational efficiency, the usage of batch matrix-form makes the computational structure explicit, which is essential for transparent, systematic analysis, and optimization in settings such as sparse neural networks. This paper fills that gap by providing a mathematically rigorous and implementation-ready specification of MLPs in batch matrix-form. We derive forward and backward equations for all standard and advanced layers, including batch normalization and softmax, and validate all equations using the symbolic mathematics library SymPy. From these specifications, we construct uniform reference implementations in NumPy, PyTorch, JAX, TensorFlow, and a high-performance C++ backend optimized for sparse operations. Our main contributions are: (1) a complete derivation of batch matrix-form backpropagation for MLPs, (2) symbolic validation of all gradient equations, (3) uniform Python and C++ reference implementations grounded in a small set of matrix primitives, and (4) demonstration of how explicit formulations enable efficient sparse computation. Together, these results establish a validated, extensible foundation for understanding, teaching, and researching neural network algorithms.

artificial intelligence, implementation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.11918

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

An MLP Baseline for Handwriting Recognition Using Planar Curvature and Gradient Orientation

Nouri, Azam

arXiv.org Artificial IntelligenceOct-27-2025

This study investigates whether second-order geometric cues - planar curvature magnitude, curvature sign, and gradient orientation - are sufficient on their own to drive a multilayer perceptron (MLP) classifier for handwritten character recognition (HCR), offering an alternative to convolutional neural networks (CNNs). Using these three handcrafted feature maps as inputs, our curvature-orientation MLP achieves 97 percent accuracy on MNIST digits and 89 percent on EMNIST letters. These results underscore the discriminative power of curvature-based representations for handwritten character images and demonstrate that the advantages of deep learning can be realized even with interpretable, hand-engineered features.

machine learning, orientation, recognition, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.5120/ijca2025925791

2508.11803

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback

Scaling can lead to compositional generalization

Redhardt, Florian, Akram, Yassir, Schug, Simon

arXiv.org Artificial IntelligenceOct-27-2025

Can neural networks systematically capture discrete, compositional task structure despite their continuous, distributed nature? The impressive capabilities of large-scale neural networks suggest that the answer to this question is yes. However, even for the most capable models, there are still frequent failure cases that raise doubts about their compositionality. Here, we seek to understand what it takes for a standard neural network to generalize over tasks that share compositional structure. We find that simply scaling data and model size leads to compositional generalization. We show that this holds across different task encodings as long as the training distribution sufficiently covers the task space. In line with this finding, we prove that standard multilayer perceptrons can approximate a general class of compositional task families to arbitrary precision using only a linear number of neurons with respect to the number of task modules. Finally, we uncover that if networks successfully compositionally generalize, the constituents of a task can be linearly decoded from their hidden activations. We show that this metric correlates with failures of text-to-image generation models to compose known concepts.

artificial intelligence, compositional generalization, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.07207

Country:

Europe (0.67)
North America > United States (0.46)
North America > Mexico (0.28)
Asia > Middle East (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Information flow in multilayer perceptrons: an in-depth analysis

Armano, Giuliano

arXiv.org Artificial IntelligenceOct-17-2025

Analysing how information flows along the layers of a multilayer perceptron is a topic of paramount importance in the field of artificial neural networks. After framing the problem from the point of view of information theory, in this position article a specific investigation is conducted on the way information is processed, with particular reference to the requirements imposed by supervised learning. To this end, the concept of information matrix is devised and then used as formal framework for understanding the aetiology of optimisation strategies and for studying the information flow. The underlying research for this article has also produced several key outcomes: i) the definition of a parametric optimisation strategy, ii) the finding that the optimisation strategy proposed in the information bottleneck framework shares strong similarities with the one derived from the information matrix, and iii) the insight that a multilayer perceptron serves as a kind of "adaptor", meant to process the input according to the given objective.

artificial intelligence, information, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.13846

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Add feedback

1cfa81af29c6f2d8cacb44921722e753-Supplemental.pdf

Neural Information Processing SystemsOct-9-2025, 13:30:13 GMT

artificial intelligence, machine learning, translation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Hybrid Quantum-Classical Policy Gradient for Adaptive Control of Cyber-Physical Systems: A Comparative Study of VQC vs. MLP

Aueawatthanaphisut, Aueaphum, Tun, Nyi Wunna

arXiv.org Artificial IntelligenceOct-8-2025

The comparative evaluation between classical and quantum reinforcement learning (QRL) paradigms was conducted to investigate their convergence behavior, robustness under observational noise, and computational efficiency in a benchmark control environment. The study employed a multilayer perceptron (MLP) agent as a classical baseline and a parameterized variational quantum circuit (VQC) as a quantum counterpart, both trained on the CartPole-v1 environment over 500 episodes. Empirical results demonstrated that the classical MLP achieved near-optimal policy convergence with a mean return of 498.7 +/- 3.2, maintaining stable equilibrium throughout training. In contrast, the VQC exhibited limited learning capability, with an average return of 14.6 +/- 4.8, primarily constrained by circuit depth and qubit connectivity. Noise robustness analysis further revealed that the MLP policy deteriorated gracefully under Gaussian perturbations, while the VQC displayed higher sensitivity at equivalent noise levels. Despite the lower asymptotic performance, the VQC exhibited significantly lower parameter count and marginally increased training time, highlighting its potential scalability for low-resource quantum processors. The results suggest that while classical neural policies remain dominant in current control benchmarks, quantum-enhanced architectures could offer promising efficiency advantages once hardware noise and expressivity limitations are mitigated.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2510.0601

Country: Asia > Thailand (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)

Add feedback

Peptidomic-Based Prediction Model for Coronary Heart Disease Using a Multilayer Perceptron Neural Network

Celis-Porras, Jesus

arXiv.org Artificial IntelligenceSep-5-2025

Coronary heart disease (CHD) is a leading cause of death worldwide and contributes significantly to annual healthcare expenditures. To develop a non-invasive diagnostic approach, we designed a model based on a multilayer perceptron (MLP) neural network, trained on 50 key urinary peptide biomarkers selected via genetic algorithms. Treatment and control groups, each comprising 345 individuals, were balanced using the Synthetic Minority Over-sampling Technique (SMOTE). The neural network was trained using a stratified validation strategy. Using a network with three hidden layers of 60 neurons each and an output layer of two neurons, the model achieved a precision, sensitivity, and specificity of 95.67 percent, with an F1-score of 0.9565. The area under the ROC curve (AUC) reached 0.9748 for both classes, while the Matthews correlation coefficient (MCC) and Cohen's kappa coefficient were 0.9134 and 0.9131, respectively, demonstrating its reliability in detecting CHD. These results indicate that the model provides a highly accurate and robust non-invasive diagnostic tool for coronary heart disease.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.03884

Country:

North America > United States (0.28)
Europe (0.28)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Add feedback