AITopics | Perceptrons


Perceptrons


[D] ReLU activated feed-forward network learns from back. Why? • r/MachineLearning

@machinelearnbot | Mar-11-2018, 19:25:20 GMT

I've been spending some time looking at the convergence behavior of different neural networks trained on MNIST data with cross-entropy loss. I started by training deeper and deeper networks using sigmoid-type activations until the learning efficiency got too low, then switched to ReLU activations. After the switch, my network converged without too many problems, but I noticed that the learning rates exhibited an interesting pattern. In particular, it takes a complete epoch before the loss begins to fall. My weights and biases are initialized uniformly, with the weights drawn between -0.1 and 0.1.

artificial intelligence, feed-forward network learn, machine learning, (5 more...)

@machinelearnbot

Industry: Media > News (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.40)


Attention-based Graph Neural Network for Semi-supervised Learning

Thekumparampil, Kiran K., Wang, Chong, Oh, Sewoong, Li, Li-Jia

arXiv.org Machine Learning | Mar-9-2018

Recently popularized graph neural networks achieve state-of-the-art accuracy on a number of standard benchmark datasets for graph-based semi-supervised learning, improving significantly over existing approaches. These architectures alternate between a propagation layer that aggregates the hidden states of the local neighborhood and a fully-connected layer. Perhaps surprisingly, we show that a linear model that removes all the intermediate fully-connected layers is still able to achieve performance comparable to the state-of-the-art models. This significantly reduces the number of parameters, which is critical for semi-supervised learning, where the number of labeled examples is small. This in turn leaves room for designing more innovative propagation layers. Based on this insight, we propose a novel graph neural network that removes all the intermediate fully-connected layers and replaces the propagation layers with attention mechanisms that respect the structure of the graph. The attention mechanism allows us to learn a dynamic and adaptive local summary of the neighborhood to achieve more accurate predictions. In a number of experiments on benchmark citation network datasets, we demonstrate that our approach outperforms competing methods. By examining the attention weights among neighbors, we show that our model provides some interesting insights into how neighbors influence each other.

artificial intelligence, machine learning, node, (14 more...)

arXiv.org Machine Learning

1803.03735

Country: North America > United States (0.46)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Machine Learning From Scratch: The Perceptron Model

#artificialintelligence | Feb-26-2018, 03:06:34 GMT

Learn how to build a perceptron model from scratch with JavaScript! Super excited to do this series for you guys! Be sure to leave any questions or feedback below, give a thumbs up, and subscribe for more machine learning!

artificial intelligence, machine learning, perceptron model

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.79)


Perceptron learning algorithm doesn't work

#artificialintelligence | Feb-23-2018, 10:51:30 GMT

However, the program runs into an infinite loop and the weights tend to become very large. What should I do to debug my program? If you can point out what's going wrong, that would also be appreciated. What I'm doing here is first generating some data points at random and assigning labels to them according to a linear target function, then using perceptron learning to learn this linear function.
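The setup described in this question can be sketched as follows. This is not the poster's program (which is not shown here), only a minimal illustration under stated assumptions: the names `make_data` and `train_perceptron`, the target weights, the margin filter, and the epoch cap are all hypothetical choices made for this example. The cap on epochs is one simple way to guarantee the loop terminates even if the data turns out not to be linearly separable.

```python
import random


def make_data(n, w_true, b_true, margin=0.1, seed=0):
    """Sample n random 2-D points and label them with a linear target function.

    Points closer than `margin` to the decision boundary are rejected
    (an assumption of this sketch) so the data is separable with a clear gap.
    """
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        s = w_true[0] * x[0] + w_true[1] * x[1] + b_true
        if abs(s) < margin:
            continue  # too close to the boundary, resample
        data.append((x, 1 if s > 0 else -1))
    return data


def train_perceptron(data, max_epochs=1000):
    """Classic perceptron rule with labels in {-1, +1} and a hard epoch cap."""
    w = [0.0, 0.0]
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, y in data:
            s = w[0] * x[0] + w[1] * x[1] + b
            if y * s <= 0:  # misclassified (or exactly on the boundary)
                w[0] += y * x[0]
                w[1] += y * x[1]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no updates: converged
            return w, b, epoch
    return w, b, max_epochs  # cap reached; data may not be separable


if __name__ == "__main__":
    points = make_data(40, w_true=(1.0, -1.0), b_true=0.2, seed=1)
    w, b, epochs = train_perceptron(points)
    print("weights:", w, "bias:", b, "epochs:", epochs)
```

When debugging a loop that never ends, it can help to check that the labels are in {-1, +1} (not {0, 1}) if the update is written as `w += y * x`, and that the stopping test counts mistakes over a full pass rather than a single point.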

artificial intelligence, machine learning, perceptron, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)


Structured Control Nets for Deep Reinforcement Learning

Srouji, Mario, Zhang, Jian, Salakhutdinov, Ruslan

arXiv.org Artificial Intelligence | Feb-22-2018

In recent years, Deep Reinforcement Learning has made impressive advances in solving several important benchmark problems for sequential decision making. Many control applications use a generic multilayer perceptron (MLP) for non-vision parts of the policy network. In this work, we propose a new neural network architecture for the policy network representation that is simple yet effective. The proposed Structured Control Net (SCN) splits the generic MLP into two separate sub-modules: a nonlinear control module and a linear control module. Intuitively, the nonlinear control is for forward-looking and global control, while the linear control stabilizes the local dynamics around the residual of global control. We hypothesize that this will bring together the benefits of both linear and nonlinear policies: improve training sample efficiency, final episodic reward, and generalization of learned policy, while requiring a smaller network and being generally applicable to different training methods. We validated our hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom 2D urban driving environment, with various ablation and generalization tests, trained with multiple black-box and policy gradient training methods. The proposed architecture has the potential to improve upon broader control tasks by incorporating problem specific priors into the architecture. As a case study, we demonstrate much improved performance for locomotion tasks by emulating the biological central pattern generators (CPGs) as the nonlinear part of the architecture.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1802.08311

Country: North America > United States (0.68)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games (0.70)
Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)


The Birth of AI and The First AI Hype Cycle

@machinelearnbot | Feb-18-2018, 16:19:40 GMT

Every decade seems to have its technological buzzwords: we had personal computers in the 1980s; the Internet and World Wide Web in the 1990s; smartphones and social media in the 2000s; and Artificial Intelligence (AI) and Machine Learning in this decade. While AI is among today's most popular topics, a commonly forgotten fact is that it was actually born in 1950 and went through a hype cycle between 1956 and 1982. The purpose of this article is to highlight some of the achievements that took place during the boom phase of this cycle and explain what led to its bust phase. The lessons to be learned from this hype cycle should not be overlooked: its successes formed the archetypes for machine learning algorithms used today, and its shortcomings indicated the dangers of overenthusiasm in promising fields of research and development. Although the first computers were developed during World War II [1,2], what seemed to truly spark the field of AI was a question proposed by Alan Turing in 1950 [3]: can a machine imitate human intelligence?

artificial intelligence, computer, machine learning, (15 more...)

@machinelearnbot

Country: Europe > United Kingdom > England (0.14)

Industry:

Information Technology (0.49)
Government (0.48)
Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > History (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.33)


Generating Neural Networks with Neural Networks

Deutsch, Lior

arXiv.org Machine Learning | Feb-17-2018

Hypernetworks are neural networks that transform a random input vector into weights for a specified target neural network. We formulate the hypernetwork training objective as a compromise between accuracy and diversity, where the diversity takes into account trivial symmetry transformations of the target network. We show that this formulation naturally arises as a relaxation of an optimistic probability distribution objective for the generated networks, and we explain how it is related to variational inference. We use multi-layered perceptrons to form the mapping from the low dimensional input random vector to the high dimensional weight space, and demonstrate how to reduce the number of parameters in this mapping by weight sharing. We perform experiments on a four layer convolutional target network which classifies MNIST images, and show that the generated weights are diverse and have interesting distributions.

artificial intelligence, hypernetwork, machine learning, (18 more...)

arXiv.org Machine Learning

1801.01952

Country: North America (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)


Neural Granger Causality for Nonlinear Time Series

Tank, Alex, Covert, Ian, Foti, Nicholas, Shojaie, Ali, Fox, Emily

arXiv.org Machine Learning | Feb-16-2018

While most classical approaches to Granger causality detection assume linear dynamics, many interactions in applied domains, like neuroscience and genomics, are inherently nonlinear. In these cases, using linear models may lead to inconsistent estimation of Granger causal interactions. We propose a class of nonlinear methods by applying structured multilayer perceptrons (MLPs) or recurrent neural networks (RNNs) combined with sparsity-inducing penalties on the weights. By encouraging specific sets of weights to be zero (in particular through the use of convex group-lasso penalties), we can extract the Granger causal structure. To further contrast with traditional approaches, our framework naturally enables us to efficiently capture long-range dependencies between series either via our RNNs or through an automatic lag selection in the MLP. We show that our neural Granger causality methods outperform state-of-the-art nonlinear Granger causality methods on the DREAM3 challenge data. This data consists of nonlinear gene expression and regulation time courses with only a limited number of time points. The successes we show in this challenging dataset provide a powerful example of how deep learning can be useful in cases that go beyond prediction on large datasets. We likewise demonstrate our methods in detecting nonlinear interactions in a human motion capture dataset.

artificial intelligence, machine learning, penalty, (15 more...)

arXiv.org Machine Learning

1802.05842

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)


Artificial Neural Networks – Part 1: The XOr Problem

#artificialintelligence | Feb-13-2018, 08:58:38 GMT

Introduction: This is the first in a series of posts exploring artificial neural network (ANN) implementations. The purpose of the article is to help the reader gain an intuition for the basic concepts before moving on to the algorithmic implementations that will follow.

architecture, artificial intelligence, machine learning, (15 more...)

#artificialintelligence

Country: North America > United States > California > San Diego County > San Diego (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.37)


Artificial Neural Networks – Part 2: MLP Implementation for XOr

#artificialintelligence | Feb-13-2018, 08:58:29 GMT

As promised in part one, this second part details a Java implementation of a multilayer perceptron (MLP) for the XOr problem. In fact, as you will see, the core classes are designed to support any MLP with a single hidden layer. First, it will help to give a quick overview of how MLP networks can be used to make predictions for the XOr problem. For a more detailed explanation, please review part one of this post. The image at the top of this article depicts the architecture for a multilayer perceptron network designed specifically to solve the XOr problem.
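To make the idea of a single-hidden-layer MLP predicting XOr concrete, here is a small sketch in Python rather than the article's Java. It is not the article's implementation: the weights below are hand-picked for illustration (a trained network would learn its own), and the function names `step` and `xor_mlp` are hypothetical. One hidden unit acts like OR, the other like AND, and the output fires when OR is on but AND is off.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Forward pass of a 2-2-1 network with hand-chosen weights (assumed values)."""
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)    # hidden unit 1: behaves like OR
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)   # hidden unit 2: behaves like AND
    return step(1.0 * h_or - 1.0 * h_and - 0.5)  # output: OR and not AND


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor_mlp(a, b))
```

A single perceptron cannot represent this mapping because XOr is not linearly separable; the hidden layer is what makes it possible.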

artificial intelligence, implementation, machine learning, (19 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.97)
