MLP baseline
An MLP Baseline for Handwriting Recognition Using Planar Curvature and Gradient Orientation
This study investigates whether second-order geometric cues (planar curvature magnitude, curvature sign, and gradient orientation) are sufficient on their own to drive a multilayer perceptron (MLP) classifier for handwritten character recognition (HCR), offering an alternative to convolutional neural networks (CNNs). Using these three handcrafted feature maps as inputs, our curvature-orientation MLP achieves 97 percent accuracy on MNIST digits and 89 percent on EMNIST letters. These results underscore the discriminative power of curvature-based representations for handwritten character images and demonstrate that the advantages of deep learning can be realized even with interpretable, hand-engineered features.
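As a concrete illustration of the three feature maps, here is a minimal NumPy sketch. The paper does not spell out its discretization, so this uses finite differences via `np.gradient` and the standard level-set (isophote) curvature formula as one plausible choice:

```python
import numpy as np

def curvature_orientation_features(img):
    """Compute three handcrafted maps from a grayscale image treated as a
    2-D intensity surface: curvature magnitude, curvature sign, and
    gradient orientation."""
    Iy, Ix = np.gradient(img.astype(float))   # np.gradient returns axis-0 (y) first
    Ixy, Ixx = np.gradient(Ix)
    Iyy, _ = np.gradient(Iy)
    # Level-set curvature of the isophotes of the intensity surface
    num = Ixx * Iy**2 - 2.0 * Ixy * Ix * Iy + Iyy * Ix**2
    den = (Ix**2 + Iy**2) ** 1.5 + 1e-8       # epsilon avoids division by zero
    kappa = num / den
    magnitude = np.abs(kappa)
    sign = np.sign(kappa)
    orientation = np.arctan2(Iy, Ix)          # gradient angle in [-pi, pi]
    return magnitude, sign, orientation
```

The three returned arrays would then be flattened and concatenated as the MLP input vector.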
A Primer on Kolmogorov-Arnold Networks (KANs) for Probabilistic Time Series Forecasting
Vaca-Rubio, Cristian J., Pereira, Roberto, Blanco, Luis, Zeydan, Engin, Caus, Màrius
This work introduces the Probabilistic Kolmogorov-Arnold Network (P-KAN), a novel probabilistic extension of Kolmogorov-Arnold Networks (KANs) for time series forecasting. By replacing scalar weights with spline-based functional connections and directly parameterizing predictive distributions, P-KANs offer expressive yet parameter-efficient models capable of capturing nonlinear and heavy-tailed dynamics. We evaluate P-KANs on satellite traffic forecasting, where uncertainty-aware predictions enable dynamic thresholding for resource allocation. Results show that P-KANs consistently outperform multilayer perceptron (MLP) baselines in both accuracy and calibration, achieving superior efficiency-risk trade-offs while using significantly fewer parameters. We instantiate P-KANs with two predictive distributions, Gaussian and Student-t. The Gaussian variant provides robust, conservative forecasts suitable for safety-critical scenarios, whereas the Student-t variant yields sharper distributions that improve efficiency under stable demand. These findings establish P-KANs as a powerful framework for probabilistic forecasting with direct applicability to satellite communications and other resource-constrained domains.
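The two ingredients of the abstract (per-edge learnable 1-D functions instead of scalar weights, and an output head that parameterizes a distribution) can be sketched as follows. This is a toy forward pass only: the radial-basis bumps stand in for the B-splines the paper uses, and all layer sizes are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_spline(x, coeffs, grid):
    """One KAN 'weight': a learnable 1-D function. Here a sum of
    radial-basis bumps on a fixed grid, a stand-in for B-splines."""
    return (coeffs * np.exp(-((x[..., None] - grid) ** 2) / 0.5)).sum(-1)

def pkan_forward(x, C1, C2, grid):
    """Tiny two-layer P-KAN sketch: in_dim -> hidden -> (mu, log_sigma).
    Every scalar connection is its own function; nodes just sum."""
    h = np.stack([sum(edge_spline(x[:, i], C1[i, j], grid)
                      for i in range(x.shape[1]))
                  for j in range(C1.shape[1])], axis=1)
    out = np.stack([sum(edge_spline(h[:, j], C2[j, k], grid)
                        for j in range(h.shape[1]))
                    for k in range(2)], axis=1)
    mu, log_sigma = out[:, 0], out[:, 1]
    return mu, np.exp(log_sigma)            # sigma kept positive via exp

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood the Gaussian variant would minimize;
    the Student-t variant swaps in the t log-density."""
    return 0.5 * np.mean(np.log(2 * np.pi * sigma**2) + (y - mu)**2 / sigma**2)
```

Training would backpropagate the NLL into the spline coefficients `C1`, `C2`; calibrated predictive intervals then follow directly from `(mu, sigma)`.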
From MNIST to ImageNet: Understanding the Scalability Boundaries of Differentiable Logic Gate Networks
Brändle, Sven, Aczel, Till, Plesner, Andreas, Wattenhofer, Roger
Differentiable Logic Gate Networks (DLGNs) are a fast and energy-efficient alternative to conventional feed-forward networks. With learnable combinations of logic gates, DLGNs enable fast inference through hardware-friendly execution. Since DLGNs have only recently gained attention, these networks are still in their infancy, particularly in the design and scalability of their output layer. To date, the architecture has primarily been tested on datasets with up to ten classes. This work examines the behavior of DLGNs on large multi-class datasets. We investigate their general expressiveness and scalability, and evaluate alternative output strategies. Using both synthetic and real-world datasets, we provide key insights into the importance of temperature tuning and its impact on output-layer performance. We evaluate the conditions under which the Group-Sum layer performs well and how it can be applied to large-scale classification with up to 2000 classes.

Figure 1: DLGNs (blue) consistently outperform MLPs (red) across classification tasks with up to 2000 classes, illustrating the potential of logic-gate-based architectures to remain effective on large-scale classification problems.

Deep artificial neural networks have improved immensely in recent years, exhibiting impressive performance across a wide range of tasks (Golroudbari & Sabour, 2023; Noor & Ige, 2024; Ekundayo & Ezugwu, 2025). However, these improvements come with rapidly growing computational costs (Thompson et al., 2020; Rosenfeld, 2021; Tripp et al., 2024), which constrains their deployment in many real-world environments, particularly on edge devices and mobile phones (Zhang et al., 2020; Zheng, 2025).
Reviews: Deep Set Prediction Networks
Summary: This paper presents an approach for machine learning tasks in which the prediction takes the form of a set. The authors propose using a set encoder (composed of permutation-invariant operations) at prediction time, recovering the output set through an optimization procedure. Since the model output is a vector of continuous features for each set element, this can be carried out with nested gradient-descent optimization. To handle set prediction from an external feature vector, the work proposes a combined loss function that encourages the representation of the ground truth to be close to the obtained features. Results on the MNIST and CLEVR datasets outperform those of an MLP baseline.
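The inner optimization loop described above can be sketched in a few lines. The encoder here is a toy stand-in (sum of a shared element-wise nonlinearity, hence permutation-invariant), and the step size and iteration count are arbitrary; the paper's full method additionally learns the encoder and a latent predictor:

```python
import numpy as np

def encode(S, W):
    """Permutation-invariant set encoder: a sum over elements of a shared
    nonlinearity, so the row order of S does not matter."""
    return np.tanh(S @ W.T).sum(0)

def decode(z, W, n_elems, dim, steps=300, lr=0.05, seed=0):
    """Nested (inner) gradient descent: adjust the set elements themselves
    until their encoding matches the target representation z."""
    rng = np.random.default_rng(seed)
    S = rng.normal(scale=0.1, size=(n_elems, dim))
    for _ in range(steps):
        enc = encode(S, W)
        r = 2.0 * (enc - z)                              # d||enc - z||^2 / d enc
        grad = ((1.0 - np.tanh(S @ W.T) ** 2) * r) @ W   # chain rule through tanh
        S -= lr * grad
    return S
```

Because the loss compares encodings rather than element orderings, no matching between predicted and ground-truth elements is needed during this inner loop.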
QuasiNet: a neural network with trainable product layers
Malinovská, Kristína, Holenda, Slavomír, Malinovský, Ľudovít
Classical neural networks achieve only limited convergence on hard problems such as XOR or parity when the number of hidden neurons is small. Motivated by improving the success rate of neural networks on these problems, we propose a new neural network model inspired by existing models with so-called product neurons, together with a learning rule derived from classical error backpropagation that elegantly handles mutually exclusive situations. Unlike existing product neurons, whose weights are preset and not adaptable, our product layers also learn. We tested the model and compared its success rate to a classical multilayer perceptron on the aforementioned problems as well as on other hard problems such as the two spirals. Our results indicate that our model is clearly more successful than the classical MLP and has the potential to be used in many tasks and applications.
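For intuition on why multiplicative units help with parity-style problems, here is a classical product unit with trainable exponent weights (note: this is the older product-neuron formulation the paper builds on, not QuasiNet's own backprop-derived rule; with bipolar inputs the exponents should stay integer-valued to avoid complex results):

```python
import numpy as np

def product_layer(x, W):
    """Product layer: each unit multiplies, rather than sums, its weighted
    inputs: y_j = prod_i x_i ** W[j, i]."""
    return np.prod(np.power(x[None, :], W), axis=1)
```

With inputs encoded as {-1, +1} and all exponents set to 1, a single product unit computes the parity of its inputs exactly, a function that a small summing MLP notoriously struggles to learn.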
HyperSNN: A new efficient and robust deep learning model for resource constrained control applications
Yan, Zhanglu, Wang, Shida, Tang, Kaiwen, Wong, Weng-Fai
In light of the increasing adoption of edge computing in areas such as intelligent furniture, robotics, and smart homes, this paper introduces HyperSNN, an innovative method for control tasks that uses spiking neural networks (SNNs) in combination with hyperdimensional computing. HyperSNN substitutes expensive 32-bit floating-point multiplications with 8-bit integer additions, resulting in reduced energy consumption while enhancing robustness and potentially improving accuracy. Our model was tested on OpenAI Gym benchmarks, including Cartpole, Acrobot, MountainCar, and Lunar Lander. HyperSNN achieves control accuracies that are on par with conventional machine learning methods but with only 1.36% to 9.96% of the energy expenditure. Furthermore, our experiments showed increased robustness when using HyperSNN. We believe that HyperSNN is especially suitable for interactive, mobile, and wearable devices, promoting energy-efficient and robust system design. Furthermore, it paves the way for the practical implementation of complex algorithms like model predictive control (MPC) in real-world industrial scenarios.
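The integer-only flavor of hyperdimensional computing referenced above can be illustrated with the generic bipolar primitives below (binding, bundling, and similarity search all reduce to int8 storage with integer adds). This is textbook HDC, not HyperSNN's specific encoding, and the spiking component is omitted entirely:

```python
import numpy as np

D = 1024  # hypervector dimensionality (illustrative)
rng = np.random.default_rng(0)

def random_hv():
    """Random bipolar hypervector stored as int8."""
    return rng.choice(np.array([-1, 1], dtype=np.int8), size=D)

def bind(a, b):
    """Binding: element-wise multiply; stays in {-1, +1}, so int8-safe."""
    return (a * b).astype(np.int8)

def bundle(hvs):
    """Bundling: integer addition in a wider accumulator, then majority
    sign to return to int8."""
    s = np.sum(np.stack(hvs).astype(np.int32), axis=0)
    return np.where(s >= 0, 1, -1).astype(np.int8)

def similarity(a, b):
    """Integer dot product used to pick the best-matching prototype."""
    return int(a.astype(np.int32) @ b.astype(np.int32))
```

Every operation here is an integer add, multiply-by-±1, or comparison, which is what makes this style of computing attractive on the edge hardware the paper targets.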