AITopics | mlp network

mlp network

060b2af0081a460f7f466f7f174d9052-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 11:13:15 GMT

epoch, iteration, posterior, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Simultaneous Weight and Architecture Optimization for Neural Networks

Huang, Zitong, Montazerin, Mansooreh, Srivastava, Ajitesh

arXiv.org Artificial Intelligence, Oct-10-2024

Neural networks are trained by choosing an architecture and training the parameters. The choice of architecture is often made by trial and error or with Neural Architecture Search (NAS) methods. While NAS provides some automation, it often relies on discrete steps that optimize the architecture and then train the parameters. We introduce a novel neural network training framework that fundamentally transforms the process by learning architecture and parameters simultaneously with gradient descent. With the appropriate setting of the loss function, it can discover sparse and compact neural networks for given datasets. Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other (irrespective of their architectures and weights). To train a neural network with a given dataset, we randomly sample a neural network embedding in the embedding space and then perform gradient descent using our custom loss function, which incorporates a sparsity penalty to encourage compactness. The decoder generates a neural network corresponding to the embedding. Experiments demonstrate that our framework can discover sparse and compact neural networks while maintaining high performance.

artificial intelligence, machine learning, mlp, (19 more...)

2410.08339

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
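
The abstract of "Simultaneous Weight and Architecture Optimization for Neural Networks" mentions gradient descent on a loss that includes a sparsity penalty. The sketch below is not the paper's encoder-decoder framework; it only illustrates, under stated assumptions, how a sparsity penalty can push unneeded parameters to exactly zero. It swaps in a plain linear model and proximal gradient descent with an L1 penalty (soft-thresholding). The names `soft_threshold`, `train_sparse_linear`, `l1_strength`, and the toy data are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; the proximal step for an L1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_sparse_linear(xs, ys, l1_strength=0.1, learning_rate=0.1, steps=500):
    """Minimize mean squared error + l1_strength * sum(|w|) by proximal gradient descent."""
    n_features = len(xs[0])
    weights = [0.0] * n_features
    for _ in range(steps):
        # Gradient of the mean squared error with respect to each weight.
        grads = [0.0] * n_features
        for x, y in zip(xs, ys):
            error = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(n_features):
                grads[j] += 2.0 * error * x[j] / len(xs)
        # Gradient step on the data term, then shrinkage for the sparsity penalty.
        weights = [
            soft_threshold(w - learning_rate * g, learning_rate * l1_strength)
            for w, g in zip(weights, grads)
        ]
    return weights


# Toy data (assumed for illustration): the target depends only on the first feature.
xs = [[1.0, 0.5], [2.0, -1.0], [3.0, 0.3], [4.0, -0.7], [0.5, 1.0], [1.5, -0.2]]
ys = [2.0 * x[0] for x in xs]
weights = train_sparse_linear(xs, ys)
print(weights)
```

On this toy problem the weight on the irrelevant second feature ends at exactly zero, while the first weight stays close to its true value of 2 (slightly shrunk by the penalty).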

Neural Moving Horizon Estimation: A Systematic Literature Review

Mobeen, Surrayya, Cristobal, Jann, Singoji, Shashank, Rassas, Basaam, Izadi, Mohammadreza, Shayan, Zeinab, Yazdanshenas, Amin, Kaur, Harneet, Barnsley, Robert, Elliott, Lana, Faieghi, Reza

arXiv.org Artificial Intelligence, Jun-21-2024

The neural moving horizon estimator (NMHE) is a relatively new and powerful state estimator that combines the strengths of neural networks (NNs) and model-based state estimation techniques. Various approaches exist for constructing NMHEs, each with its unique advantages and limitations. However, a comprehensive literature review that consolidates existing knowledge, outlines design guidelines and highlights future research directions is currently lacking. This systematic literature review synthesizes the existing knowledge on NMHE, addressing the above knowledge gap. The paper (1) explains the fundamental principles of NMHE, (2) explores different NMHE architectures, discussing the pros and cons of each, (3) investigates the NN architectures used in NMHE, providing insights for future designs, (4) examines the real-time implementability of current approaches, offering recommendations for practical applications, and (5) discusses the current limitations of NMHE approaches and outlines directions for future research. These insights can significantly improve the design and application of NMHE, which is critical for enhancing state estimation in complex systems.

artificial intelligence, estimation, machine learning, (18 more...)

2406.15578

Country: North America > Canada (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Transportation (0.68)
Energy > Oil & Gas (0.48)
Materials > Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Leveraging KANs For Enhanced Deep Koopman Operator Discovery

Nehma, George, Tiwari, Madhur

arXiv.org Artificial Intelligence, Jun-6-2024

Multi-layer perceptrons (MLPs) have been extensively utilized in discovering Deep Koopman operators for linearizing nonlinear dynamics. With the emergence of Kolmogorov-Arnold Networks (KANs) as a more efficient and accurate alternative to the MLP Neural Network, we propose a comparison of the performance of each network type in the context of learning Koopman operators with control. In this work, we propose a KANs-based deep Koopman framework with applications to an orbital Two-Body Problem (2BP) and the pendulum for data-driven discovery of linear system dynamics. KANs were found to be superior in nearly all aspects of training: learning 31 times faster, being 15 times more parameter-efficient, and predicting 1.25 times more accurately than the MLP Deep Neural Networks (DNNs) in the case of the 2BP. Thus, KANs show potential as an efficient tool in the development of Deep Koopman Theory.

kan, koopman operator, operator, (16 more...)

2406.02875

Country:

North America > United States > Florida > Brevard County > Melbourne (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Deep Grokking: Would Deep Neural Networks Generalize Better?

Fan, Simin, Pascanu, Razvan, Jaggi, Martin

arXiv.org Machine Learning, May-29-2024

Recent research on the grokking phenomenon has illuminated the intricacies of neural networks' training dynamics and their generalization behaviors. Grokking refers to a sharp rise of the network's generalization accuracy on the test set, which occurs long after an extended overfitting phase, during which the network perfectly fits the training set. While the existing research primarily focuses on shallow networks such as 2-layer MLP and 1-layer Transformer, we explore grokking on deep networks (e.g. 12-layer MLP). We empirically replicate the phenomenon and find that deep neural networks can be more susceptible to grokking than their shallower counterparts. Meanwhile, we observe an intriguing multi-stage generalization phenomenon when increasing the depth of the MLP model, where the test accuracy exhibits a secondary surge, which is scarcely seen on shallow models. We further uncover compelling correspondences between the decrease in feature ranks and the phase transition from overfitting to the generalization stage during grokking. Additionally, we find that the multi-stage generalization phenomenon often aligns with a double-descent pattern in feature ranks. These observations suggest that internal feature rank could serve as a more promising indicator of the model's generalization behavior compared to the weight-norm. We believe our work is the first one to dive into grokking in deep neural networks, and investigate the relationship of feature rank and generalization performance.

accuracy, feature rank, generalization, (15 more...)

2405.19454

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On Optimal Sampling for Learning SDF Using MLPs Equipped with Positional Encoding

Lin, Guying, Yang, Lei, Liu, Yuan, Zhang, Congyi, Hou, Junhui, Jin, Xiaogang, Komura, Taku, Keyser, John, Wang, Wenping

arXiv.org Artificial Intelligence, Jan-2-2024

Neural implicit fields, such as the neural signed distance field (SDF) of a shape, have emerged as a powerful representation for many applications, e.g., encoding a 3D shape and performing collision detection. Typically, implicit fields are encoded by Multi-layer Perceptrons (MLP) with positional encoding (PE) to capture high-frequency geometric details. However, a notable side effect of such PE-equipped MLPs is the noisy artifacts present in the learned implicit fields. While increasing the sampling rate could in general mitigate these artifacts, in this paper we aim to explain this adverse phenomenon through the lens of Fourier analysis. We devise a tool to determine the appropriate sampling rate for learning an accurate neural implicit field without undesirable side effects. Specifically, we propose a simple yet effective method to estimate the intrinsic frequency of a given network with randomized weights based on the Fourier analysis of the network's responses. It is observed that a PE-equipped MLP has an intrinsic frequency much higher than the highest frequency component in the PE layer. Sampling against this intrinsic frequency following the Nyquist-Shannon sampling theorem allows us to determine an appropriate training sampling rate. We empirically show in the setting of SDF fitting that this recommended sampling rate is sufficient to secure accurate fitting results, while further increasing the sampling rate would not noticeably reduce the fitting error. Training PE-equipped MLPs simply with our sampling strategy leads to performance superior to existing methods.

frequency, pe-equipped mlp, spectrum, (14 more...)

2401.01391

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
Asia > Middle East > Oman (0.04)
Asia > China > Hong Kong (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)
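
The abstract of "On Optimal Sampling for Learning SDF Using MLPs Equipped with Positional Encoding" describes estimating a network's intrinsic frequency from the Fourier analysis of its responses and then choosing a sampling rate with the Nyquist-Shannon theorem. The sketch below illustrates only that two-step idea on a one-dimensional signal. It is not the authors' estimator: the stand-in "network response" is a sum of two sinusoids rather than a real PE-equipped MLP, and `highest_significant_frequency`, `relative_threshold`, and `probe_rate` are hypothetical names and choices.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| for k = 0 .. N//2 using a direct (O(N^2)) discrete Fourier transform."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples)
        )
        magnitudes.append(abs(total))
    return magnitudes


def highest_significant_frequency(samples, sample_rate, relative_threshold=0.1):
    """Highest frequency whose magnitude is at least relative_threshold times the peak."""
    magnitudes = dft_magnitudes(samples)
    peak = max(magnitudes[1:])  # ignore the constant (k = 0) component
    significant = [
        k for k in range(1, len(magnitudes))
        if magnitudes[k] >= relative_threshold * peak
    ]
    return max(significant) * sample_rate / len(samples)


def nyquist_sampling_rate(max_frequency):
    """Sampling-rate bound from the Nyquist-Shannon theorem: twice the highest frequency."""
    return 2.0 * max_frequency


# Stand-in response (assumed): sinusoids at 3 and 12 cycles per unit length,
# probed densely at 64 samples per unit length.
probe_rate = 64
response = [
    math.sin(2 * math.pi * 3 * t / probe_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / probe_rate)
    for t in range(probe_rate)
]
intrinsic_frequency = highest_significant_frequency(response, probe_rate)
training_rate = nyquist_sampling_rate(intrinsic_frequency)
print(intrinsic_frequency, training_rate)
```

For this stand-in signal the highest significant component is at 12 cycles per unit length, so the suggested rate is 24 samples per unit length. For a real network the response would come from evaluating the randomly initialized model on a dense grid.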

A free from local minima algorithm for training regressive MLP neural networks

Montisci, Augusto

arXiv.org Artificial Intelligence, Aug-22-2023

In this article an innovative method for training regressive MLP networks is presented, which is not subject to local minima. The Error-Back-Propagation algorithm, proposed by Rumelhart, Hinton, and Williams, has had the merit of favouring the development of machine learning techniques, which have permeated every branch of research and technology since the mid-1980s. This extraordinary success is largely due to the black-box approach, but this same factor was also seen as a limitation as soon as more challenging problems were approached. One of the most critical aspects of the training algorithms was that of local minima of the loss function, typically the mean squared error of the output on the training set. In fact, as the most popular training algorithms are driven by the derivatives of the loss function, there is no way to determine whether a minimum that has been reached is local or global. The algorithm presented in this paper avoids the problem of local minima, as the training is based on the properties of the distribution of the training set, or more precisely on its image inside the neural network. The performance of the algorithm is shown for a well-known benchmark.

algorithm, artificial intelligence, machine learning, (15 more...)

2308.11532

Country: Europe > Italy > Sardinia > Cagliari (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.50)

Learning Quadruped Locomotion using Bio-Inspired Neural Networks with Intrinsic Rhythmicity

Yang, Chuanyu, Pu, Can, Wei, Tianqi, Wang, Cong, Li, Zhibin

arXiv.org Artificial Intelligence, May-12-2023

Abstract: Biological studies reveal that neural circuits in the spinal cord called central pattern generators (CPGs) oscillate and generate rhythmic signals, which are the underlying mechanism responsible for the rhythmic locomotion behaviors of animals. Inspired by the CPG's capability to naturally generate rhythmic patterns, researchers have attempted to create mathematical models of CPGs and utilize them for the … I. INTRODUCTION: Animals are able to adapt their locomotion gait pattern to suit the locomotion velocity and ground condition. Efforts have been put into discovering the underlying mechanism of animal locomotion, and have obtained evidence that legged locomotion is rhythmic in nature. … evidence of special neurons called central pattern generators in the animal spinal cord. … One approach is to add an … The phase increment is added to the current phase and is used as the phase for the next timestep in a bootstrap manner [6], [7], [4]. To reproduce the agile locomotion behaviors of animals in legged robots, researchers have looked into animals for … Inspired by findings of CPGs within the animal nervous system, there have also been attempts in utilizing the …

artificial intelligence, frequency, machine learning, (15 more...)

2305.073

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.90)

Automatic Procurement Fraud Detection with Machine Learning

Bai, Jin, Qiu, Tong

arXiv.org Artificial Intelligence, Apr-20-2023

Although procurement fraud is a critical problem in almost every free market, audit departments still rely heavily on reports from informed sources to detect it. With our generous cooperator, SF Express, sharing access to its database of procurements that took place from 2015 to 2017, our team studies how machine learning techniques could help with auditing one of the most serious crimes in the current Chinese market, namely procurement fraud. By representing each procurement event as 9 specific features, we construct neural network models to identify suspicious procurements and classify their fraud types. Through testing our models on over 50000 samples collected from the procurement database, we have shown that such models, despite having room for improvement, are useful in detecting procurement fraud.

artificial intelligence, machine learning, procurement, (14 more...)

2304.10105

Country:

Asia > China (0.14)
Africa > South Africa > Gauteng > Pretoria (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Law Enforcement & Public Safety > Fraud (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Predicting the Risk of Complications in Coronary Artery Bypass Operations using Neural Networks

Neural Information Processing Systems, Apr-6-2023, 18:33:31 GMT

Experiments demonstrated that sigmoid multilayer perceptron (MLP) networks provide slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations at the Lahey Clinic. MLP networks with no hidden layer and networks with one hidden layer were trained using stochastic gradient descent with early stopping. MLP networks and logistic regression used the same input features and were evaluated using bootstrap sampling with 50 replications. ROC areas for predicting mortality using preoperative input features were 70.5% for logistic regression and 76.0% for MLP networks. Regularization provided by early stopping was an important component of improved performance.

coronary artery bypass operation, mlp network, neural network, (7 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.65)
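
The coronary artery bypass abstract reports ROC areas evaluated using bootstrap sampling with 50 replications. The sketch below shows one simple way such a number could be computed, assuming the model's risk scores and the true outcomes are already available. The abstract does not specify the exact bootstrap protocol, so resampling scored examples with replacement is an assumption here, and `roc_area`, `bootstrap_roc_area`, and the example data are hypothetical.

```python
import random


def roc_area(scores, labels):
    """Area under the ROC curve: fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def bootstrap_roc_area(scores, labels, replications=50, seed=0):
    """Mean ROC area over bootstrap resamples drawn with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to compute an ROC area")
    rng = random.Random(seed)
    indices = range(len(scores))
    areas = []
    while len(areas) < replications:
        sample = [rng.choice(indices) for _ in indices]
        sampled_labels = [labels[i] for i in sample]
        if len(set(sampled_labels)) < 2:
            continue  # a resample with a single class has no defined ROC area
        areas.append(roc_area([scores[i] for i in sample], sampled_labels))
    return sum(areas) / len(areas)


# Made-up risk scores and outcomes (1 = complication occurred), for illustration only.
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
print(roc_area(example_scores, example_labels))
print(bootstrap_roc_area(example_scores, example_labels))
```

Comparing two models this way would mean computing the bootstrap ROC area for each model's scores on the same resamples, which this sketch leaves out for brevity.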
