AITopics | Perceptrons

Perceptrons

Perceptron Collaborative Filtering

arXiv.org Artificial Intelligence, Jun-17-2024

Multivariate logistic regression classifiers are a common way to implement collaborative filtering, a method of automatically predicting a user's interests by collecting preference or taste information from many other users, but neural networks can achieve similar results. A recommender system is a subclass of information filtering system that provides suggestions for the items most pertinent to a particular user. A perceptron, or neural network, is a machine learning model designed to fit complex datasets using backpropagation and gradient descent. When coupled with advanced optimization techniques, the model may prove to be a good substitute for classical logistic classifiers. The optimizations include feature scaling, mean normalization, regularization, hyperparameter tuning, and using stochastic or mini-batch gradient descent instead of regular (batch) gradient descent. In this use case, we use the perceptron in the recommender system to fit its parameters to data from a multitude of users, and then use it to predict the preference or interest of a particular user.
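
The optimizations named in the abstract above can be illustrated with a minimal sketch: a single sigmoid unit trained with mean normalization, L2 regularization, and mini-batch gradient descent. This is not the paper's implementation. The toy data, the feature meanings, the function names, and the hyperparameter values are all assumptions chosen for illustration.

```python
import math
import random


def mean_normalize(rows):
    """Shift each feature column to zero mean and divide by its range."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard against constant columns
    scaled = [[(v - m) / s for v, m, s in zip(r, means, spans)] for r in rows]
    return scaled, means, spans


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that the user likes the item, from one sigmoid unit."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def train(rows, labels, lr=0.5, l2=0.01, batch_size=2, epochs=500, seed=0):
    """Mini-batch gradient descent on the L2-regularized logistic loss."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    indices = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new batch composition every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * n_features
            grad_b = 0.0
            for i in batch:
                err = predict(weights, bias, rows[i]) - labels[i]
                for j in range(n_features):
                    grad_w[j] += err * rows[i][j]
                grad_b += err
            m = len(batch)
            for j in range(n_features):
                # Averaged batch gradient plus the L2 penalty term.
                weights[j] -= lr * (grad_w[j] / m + l2 * weights[j])
            bias -= lr * grad_b / m  # the bias is not regularized
    return weights, bias


# Hypothetical toy data: [mean rating from similar users, liked items in the same genre].
ratings = [[4.5, 12], [4.0, 9], [3.8, 10], [1.5, 2], [2.0, 1], [1.0, 3]]
liked = [1, 1, 1, 0, 0, 0]

scaled, means, spans = mean_normalize(ratings)
weights, bias = train(scaled, liked)
for row, label in zip(scaled, liked):
    print(label, round(predict(weights, bias, row), 3))
```

A new user's features would need to be scaled with the stored `means` and `spans` before calling `predict`, since the weights were fitted on normalized inputs.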

gradient descent, hyperparameter, neural network, (14 more...)

doi: 10.22214/ijraset.2023.49044

2407.00067

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > India > West Bengal > Kolkata (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.81)

Jacobian-Enhanced Neural Networks

Berguin, Steven H.

arXiv.org Artificial Intelligence, Jun-17-2024

Jacobian-Enhanced Neural Networks (JENN) are densely connected multi-layer perceptrons, whose training process is modified to predict partial derivatives accurately. Their main benefit is better accuracy with fewer training points compared to standard neural networks. These attributes are particularly desirable in the field of computer-aided design, where there is often the need to replace computationally expensive, physics-based models with fast-running approximations, known as surrogate models or meta-models. Since a surrogate emulates the original model accurately in near-real time, it yields a speed benefit that can be used to carry out orders of magnitude more function calls quickly. However, in the special case of gradient-enhanced methods, there is the additional value proposition that partial derivatives are accurate, which is a critical property for one important use-case: surrogate-based optimization. This work derives the complete theory and exemplifies its superiority over standard neural nets for surrogate-based optimization.
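
The idea of training on derivatives as well as values can be sketched with a loss that penalizes both kinds of error. The example below is an assumption-laden illustration rather than the JENN formulation: it uses a one-hidden-layer tanh network with scalar input and output, and a hypothetical weighting factor `gamma` on the derivative term.

```python
import math


def mlp_value_and_slope(params, x):
    """One-hidden-layer tanh network with scalar input and output.

    Returns the prediction y(x) and its input derivative dy/dx.
    """
    w, b, v, c = params  # hidden weights, hidden biases, output weights, output bias
    y = c
    dy_dx = 0.0
    for w_k, b_k, v_k in zip(w, b, v):
        h = math.tanh(w_k * x + b_k)
        y += v_k * h
        dy_dx += v_k * w_k * (1.0 - h * h)  # chain rule through tanh
    return y, dy_dx


def gradient_enhanced_loss(params, xs, ys, slopes, gamma=1.0):
    """Half mean squared error on values plus gamma times the same on derivatives."""
    total = 0.0
    for x, y_true, s_true in zip(xs, ys, slopes):
        y_hat, s_hat = mlp_value_and_slope(params, x)
        total += (y_hat - y_true) ** 2 + gamma * (s_hat - s_true) ** 2
    return total / (2.0 * len(xs))


# Hypothetical parameters and sample points, used only to exercise the functions.
params = ([1.0, -0.5], [0.0, 0.3], [2.0, 1.0], 0.1)
xs = [-1.0, 0.0, 1.0]
ys = [mlp_value_and_slope(params, x)[0] for x in xs]
slopes = [mlp_value_and_slope(params, x)[1] for x in xs]

print(gradient_enhanced_loss(params, xs, ys, slopes))
print(gradient_enhanced_loss(params, xs, ys, [s + 1.0 for s in slopes]))
```

The second call shows why the extra term matters: the values match the targets exactly, yet the loss is nonzero because the derivatives are off, so an optimizer would still be pushed to correct the slope.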

derivative, neural network, optimization, (15 more...)

2406.09132

Country:

North America > United States (0.04)
Europe > United Kingdom > England > West Sussex (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Addressing Shortcomings in Fair Graph Learning Datasets: Towards a New Benchmark

Qian, Xiaowei, Guo, Zhimeng, Li, Jialiang, Mao, Haitao, Li, Bingheng, Wang, Suhang, Ma, Yao

arXiv.org Artificial Intelligence, Jun-17-2024

Fair graph learning plays a pivotal role in numerous practical applications. Recently, many fair graph learning methods have been proposed; however, their evaluation often relies on poorly constructed semi-synthetic datasets or substandard real-world datasets. In such cases, even a basic Multilayer Perceptron (MLP) can outperform Graph Neural Networks (GNNs) in both utility and fairness. In this work, we illustrate that many datasets fail to provide meaningful information in the edges, which may challenge the necessity of using graph structures in these problems. To address these issues, we develop and introduce a collection of synthetic, semi-synthetic, and real-world datasets that fulfill a broad spectrum of requirements. These datasets are thoughtfully designed to include relevant graph structures and bias information crucial for the fair evaluation of models. The proposed synthetic and semi-synthetic datasets offer the flexibility to create data with controllable bias parameters, thereby enabling the generation of desired datasets with user-defined bias values with ease. Moreover, we conduct systematic evaluations of these proposed datasets and establish a unified evaluation approach for fair graph learning models. Our extensive experimental results with fair graph learning methods across our datasets demonstrate their effectiveness in benchmarking the performance of these methods. Our datasets and the code for reproducing our experiments are available at https://github.com/XweiQ/Benchmark-GraphFairness.

dataset, fairness, graph structure, (12 more...)

2403.06017

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > Pennsylvania (0.04)
North America > United States > Michigan (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry: Banking & Finance (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Graph Knowledge Distillation to Mixture of Experts

Rumiantsev, Pavel, Coates, Mark

arXiv.org Machine Learning, Jun-17-2024

In terms of accuracy, Graph Neural Networks (GNNs) are the best architectural choice for the node classification task. Their drawback in real-world deployment is the latency that emerges from the neighbourhood processing operation. One solution to the latency issue is to perform knowledge distillation from a trained GNN to a Multi-Layer Perceptron (MLP), where the MLP processes only the features of the node being classified (and possibly some pre-computed structural information). However, the performance of such MLPs in both transductive and inductive settings remains inconsistent for existing knowledge distillation techniques. We propose to address the performance concerns by using a specially-designed student model instead of an MLP. Our model, named Routing-by-Memory (RbM), is a form of Mixture-of-Experts (MoE), with a design that enforces expert specialization. By encouraging each expert to specialize on a certain region on the hidden representation space, we demonstrate experimentally that it is possible to derive considerably more consistent performance across multiple datasets.
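
For readers unfamiliar with the distillation step mentioned above, the following sketch shows the widely used soft-target loss: the student is trained to match the teacher's temperature-softened class distribution for a node. It is a generic illustration, not the Routing-by-Memory model or the loss used in the paper. The logits and the temperature value are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0.0)
    # The T^2 factor is the customary rescaling so gradients stay comparable across temperatures.
    return temperature ** 2 * kl


# Hypothetical per-node logits: the teacher (GNN) saw the neighbourhood,
# the student (MLP) saw only the node's own features.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, -0.5]

print(distillation_loss(student, teacher))
print(distillation_loss(teacher, teacher))
```

In practice this term is usually combined with an ordinary cross-entropy loss on the labelled nodes. How the two are weighted is a tuning choice that the abstract does not specify.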

baseline, dataset, representation, (15 more...)

2406.11919

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Grad-Instructor: Universal Backpropagation with Explainable Evaluation Neural Networks for Meta-learning and AutoML

Ino, Ryohei

arXiv.org Artificial Intelligence, Jun-15-2024

This paper presents a novel method for autonomously enhancing deep neural network training. My approach employs an Evaluation Neural Network (ENN) trained via deep reinforcement learning to predict the performance of the target network. The ENN then works as an additional evaluation function during backpropagation. Computational experiments with Multi-Layer Perceptrons (MLPs) demonstrate the method's effectiveness. By processing input data at 0.15^2 times its original resolution, the ENNs facilitated efficient inference. Results indicate that MLPs trained with the proposed method achieved a mean test accuracy of 93.02%, which is 2.8% higher than those trained solely with conventional backpropagation or with L1 regularization. The proposed method's test accuracy is comparable to networks initialized with He initialization while reducing the difference between test and training errors. These improvements are achieved without increasing the number of epochs, thus avoiding the risk of overfitting. Additionally, the proposed method dynamically adjusts gradient magnitudes according to the training stage. The optimal ENN for enhancing MLPs can be predicted, reducing the time spent exploring optimal training methodologies. The explainability of ENNs is also analyzed using Grad-CAM, demonstrating their ability to visualize evaluation bases and supporting the Strong Lottery Ticket hypothesis.

mlp, neural network, test accuracy, (13 more...)

2406.10559

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)

Ada-HGNN: Adaptive Sampling for Scalable Hypergraph Neural Networks

Wang, Shuai, Zhang, David W., Huang, Jia-Hong, Rudinac, Stevan, Kackovic, Monika, Wijnberg, Nachoem, Worring, Marcel

arXiv.org Artificial Intelligence, Jun-14-2024

Hypergraphs serve as an effective model for depicting complex connections in various real-world scenarios, from social to biological networks. The development of Hypergraph Neural Networks (HGNNs) has emerged as a valuable method to manage the intricate associations in data, though scalability is a notable challenge due to memory limitations. In this study, we introduce a new adaptive sampling strategy specifically designed for hypergraphs, which tackles their unique complexities in an efficient manner. We also present a Random Hyperedge Augmentation (RHA) technique and an additional Multilayer Perceptron (MLP) module to improve the robustness and generalization capabilities of our approach. Thorough experiments with real-world datasets have proven the effectiveness of our method, markedly reducing computational and memory demands while maintaining performance levels akin to conventional HGNNs and other baseline models. This research paves the way for improving both the scalability and efficacy of HGNNs in extensive applications. We will also make our codebase publicly accessible.

artificial intelligence, hypergraph, machine learning, (14 more...)

2405.13372

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(4 more...)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

Deep Sketched Output Kernel Regression for Structured Prediction

Ahmad, Tamim El, Yang, Junjie, Laforgue, Pierre, d'Alché-Buc, Florence

arXiv.org Machine Learning, Jun-13-2024

By leveraging the kernel trick in the output space, kernel-induced losses provide a principled way to define structured output prediction tasks for a wide variety of output modalities. In particular, they have been successfully used in the context of surrogate non-parametric regression, where the kernel trick is typically exploited in the input space as well. However, when inputs are images or texts, more expressive models such as deep neural networks seem more suited than non-parametric methods. In this work, we tackle the question of how to train neural networks to solve structured output prediction tasks, while still benefiting from the versatility and relevance of kernel-induced losses. We design a novel family of deep neural architectures, whose last layer predicts in a data-dependent finite-dimensional subspace of the infinite-dimensional output feature space deriving from the kernel-induced loss. This subspace is chosen as the span of the eigenfunctions of a randomly-approximated version of the empirical kernel covariance operator. Interestingly, this approach unlocks the use of gradient descent algorithms (and consequently of any neural architecture) for structured prediction. Experiments on synthetic tasks as well as real-world supervised graph prediction problems show the relevance of our method.

international conference, kernel, prediction, (9 more...)

2406.09253

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)

Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing

Sivagnanam, Amutheezan, Pettet, Ava, Lee, Hunter, Mukhopadhyay, Ayan, Dubey, Abhishek, Laszka, Aron

arXiv.org Artificial Intelligence, Jun-8-2024

An emergency responder management (ERM) system dispatches responders, such as ambulances, when it receives requests for medical aid. ERM systems can also proactively reposition responders between predesignated waiting locations to cover any gaps that arise due to the prior dispatch of responders or significant changes in the distribution of anticipated requests. Optimal repositioning is computationally challenging due to the exponential number of ways to allocate responders between locations and the uncertainty in future requests. The state-of-the-art approach in proactive repositioning is a hierarchical approach based on spatial decomposition and online Monte Carlo tree search, which may require minutes of computation for each decision in a domain where seconds can save lives. We address the issue of long decision times by introducing a novel reinforcement learning (RL) approach, based on the same hierarchical decomposition, but replacing online search with learning. To address the computational challenges posed by large, variable-dimensional, and discrete state and action spaces, we propose: (1) actor-critic based agents that incorporate transformers to handle variable-dimensional states and actions, (2) projections to fixed-dimensional observations to handle complex states, and (3) combinatorial techniques to map continuous actions to discrete allocations. We evaluate our approach using real-world data from two U.S. cities, Nashville, TN and Seattle, WA. Our experiments show that compared to the state of the art, our approach reduces computation time per decision by three orders of magnitude, while also slightly reducing average ambulance response time by 5 seconds.
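
The abstract does not say which combinatorial technique maps continuous actions to discrete allocations. As one simple possibility, the sketch below uses largest-remainder rounding to turn continuous per-location scores into whole numbers of responders that sum to the fleet size. The function name, the scores, and the fleet size are hypothetical, and the authors' method may differ.

```python
import math


def allocate(scores, total_units):
    """Turn non-negative continuous scores into integer counts summing to total_units."""
    weight = sum(scores)
    if weight <= 0:
        raise ValueError("scores must have a positive sum")
    quotas = [total_units * s / weight for s in scores]  # ideal fractional shares
    counts = [math.floor(q) for q in quotas]
    leftover = total_units - sum(counts)
    # Hand the remaining units to the largest fractional parts (ties: lower index first).
    order = sorted(range(len(scores)), key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical continuous action for three waiting locations and seven ambulances.
print(allocate([0.5, 0.3, 0.2], 7))
```

Whatever the scores, the result always sums to `total_units`, so a continuous policy output can never produce an allocation that uses more or fewer responders than exist.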

average response time, responder, response time, (14 more...)

2405.13205

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.24)
North America > United States > Washington > King County > Seattle (0.24)
Europe > Austria > Vienna (0.14)
(2 more...)

Genre:

Overview (0.88)
Research Report (0.84)

Industry:

Health & Medicine (1.00)
Transportation > Infrastructure & Services (0.67)
Transportation > Ground > Road (0.67)
Transportation > Passenger (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)

Local vs. Global Interpretability: A Computational Complexity Perspective

Bassan, Shahaf, Amir, Guy, Katz, Guy

arXiv.org Artificial Intelligence, Jun-7-2024

The local and global interpretability of various ML models has been studied extensively in recent years. However, despite significant progress in the field, many known results remain informal or lack sufficient mathematical rigor. We propose a framework for bridging this gap, by using computational complexity theory to assess local and global perspectives of interpreting ML models. We begin by proposing proofs for two novel insights that are essential for our analysis: (1) a duality between local and global forms of explanations; and (2) the inherent uniqueness of certain global explanation forms. We then use these insights to evaluate the complexity of computing explanations, across three model types representing the extremes of the interpretability spectrum: (1) linear models; (2) decision trees; and (3) neural networks. Our findings offer insights into both the local and global interpretability of these models. For instance, under standard complexity assumptions such as P != NP, we prove that selecting global sufficient subsets in linear models is computationally harder than selecting local subsets. Interestingly, with neural networks and decision trees, the opposite is true: it is harder to carry out this task locally than globally. We believe that our findings demonstrate how examining explainability through a computational complexity lens can help us develop a more rigorous grasp of the inherent interpretability of ML models.

explanation, global sufficient reason, sufficient reason, (14 more...)

2406.02981

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.31)

TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification

Ahamed, Md Atik, Cheng, Qiang

arXiv.org Artificial Intelligence, Jun-6-2024

Time series classification (TSC) on multivariate time series is a critical problem. We propose a novel multi-view approach integrating frequency-domain and time-domain features to provide complementary contexts for TSC. Our method fuses continuous wavelet transform spectral features with temporal convolutional or multilayer perceptron features. We leverage the Mamba state space model for efficient and scalable sequence modeling. We also introduce a novel tango scanning scheme to better model sequence relationships. Experiments on 10 standard benchmark datasets demonstrate our approach achieves an average 6.45% accuracy improvement over state-of-the-art TSC models.

representation, sequence, tscmamba, (14 more...)

2406.04419

Country:

North America > United States > Kentucky (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.86)