 Perceptrons


Artificial Neural Network:

#artificialintelligence

Deep learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, called artificial neural networks. In a nutshell, below is the function of a neuron. Axon: the stem that carries the neuron's output. A perceptron works in a similar way to a neuron: it takes inputs, performs a transformation, and produces a result. Inside the perceptron, we typically compute a weighted sum of the inputs and pass it through a step function.
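
The perceptron computation described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the article: the function names and the AND weights are assumptions chosen for the example.

```python
def step(z):
    """Heaviside step activation: 1 if z is non-negative, else 0."""
    return 1 if z >= 0 else 0


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the step function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


# Hypothetical weights and bias that make the perceptron behave like logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

and_outputs = [
    perceptron([a, b], AND_WEIGHTS, AND_BIAS)
    for a in (0, 1)
    for b in (0, 1)
]
print(and_outputs)  # -> [0, 0, 0, 1]
```

Only the input pair (1, 1) pushes the weighted sum above zero, so only that pair produces an output of 1.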


Quantum activation functions for quantum neural networks

arXiv.org Artificial Intelligence

The field of artificial neural networks is expected to strongly benefit from recent developments of quantum computers. In particular, quantum machine learning, a class of quantum algorithms which exploit qubits for creating trainable neural networks, will provide more power to solve problems such as pattern recognition, clustering and machine learning in general. The building block of feed-forward neural networks consists of one layer of neurons connected to an output neuron that is activated according to an arbitrary activation function. The corresponding learning algorithm goes under the name of Rosenblatt perceptron. Quantum perceptrons with specific activation functions are known, but a general method to realize arbitrary activation functions on a quantum computer is still lacking. Here we fill this gap with a quantum algorithm which is capable of approximating any analytic activation function to any given order of its power series. Unlike previous proposals providing irreversible measurement-based and simplified activation functions, here we show how to approximate any analytic function to any required accuracy without the need to measure the states encoding the information. Thanks to the generality of this construction, any feed-forward neural network may acquire the universal approximation properties according to Hornik's theorem. Our results recast the science of artificial neural networks in the architecture of gate-model quantum computers.
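
As a purely classical illustration of approximating an analytic activation function "to any given order of its power series", the sketch below truncates the Maclaurin series of tanh at increasing orders. It is not the quantum algorithm from the paper; the function name and the test point are assumptions, and the series only converges for |x| < pi/2.

```python
import math

# Maclaurin coefficients of tanh(x) for the powers x, x^3, x^5 and x^7.
TANH_COEFFS = {1: 1.0, 3: -1.0 / 3.0, 5: 2.0 / 15.0, 7: -17.0 / 315.0}


def tanh_series(x, order):
    """Power series of tanh truncated at the given order (valid for |x| < pi/2)."""
    return sum(c * x ** k for k, c in TANH_COEFFS.items() if k <= order)


x = 0.5  # hypothetical evaluation point inside the radius of convergence
errors = [abs(math.tanh(x) - tanh_series(x, order)) for order in (1, 3, 5, 7)]
for order, err in zip((1, 3, 5, 7), errors):
    print(f"order {order}: absolute error {err:.2e}")
```

Each additional term shrinks the error at this point, which is the sense in which a higher series order buys higher accuracy.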


Collaborative Reflection-Augmented Autoencoder Network for Recommender Systems

arXiv.org Artificial Intelligence

As deep learning techniques have expanded to real-world recommendation tasks, many deep neural network based Collaborative Filtering (CF) models have been developed to project user-item interactions into latent feature space, based on various neural architectures, such as multi-layer perceptrons, auto-encoders and graph neural networks. However, the majority of existing collaborative filtering systems are not well designed to handle missing data. Particularly, in order to inject the negative signals in the training phase, these solutions largely rely on negative sampling from unobserved user-item interactions and simply treat them as negative instances, which degrades recommendation performance. To address these issues, we develop a Collaborative Reflection-Augmented Autoencoder Network (CRANet), that is capable of exploring transferable knowledge from observed and unobserved user-item interactions. The network architecture of CRANet is formed of an integrative structure with a reflective receptor network and an information fusion autoencoder module, which endows our recommendation framework with the ability of encoding users' implicit pairwise preferences on both interacted and non-interacted items. Additionally, a parametric regularization-based tied-weight scheme is designed to perform robust joint training of the two-stage CRANet model. We finally experimentally validate CRANet on four diverse benchmark datasets corresponding to two recommendation tasks, to show that debiasing the negative signals of user-item interactions improves the performance as compared to various state-of-the-art recommendation techniques. Our source code is available at https://github.com/akaxlh/CRANet.
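
The negative sampling baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of the general practice, not CRANet and not code from the linked repository; the function name, item ids and sample sizes are assumptions.

```python
import random


def sample_negatives(observed, num_items, num_negatives, rng):
    """Draw items the user has NOT interacted with and treat them as negatives."""
    candidates = [item for item in range(num_items) if item not in observed]
    return rng.sample(candidates, min(num_negatives, len(candidates)))


rng = random.Random(0)        # fixed seed so the sketch is repeatable
observed_items = {1, 4, 7}    # hypothetical items one user interacted with
negatives = sample_negatives(observed_items, num_items=10, num_negatives=3, rng=rng)

# Label 1 for observed interactions, 0 for sampled "negatives".
training_pairs = [(item, 1) for item in sorted(observed_items)]
training_pairs += [(item, 0) for item in negatives]
print(training_pairs)
```

The weakness is visible in the labels: an unobserved item may simply be one the user never saw, yet it is trained on as a dislike.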


The Concept of Artificial Neurons (Perceptrons) in Neural Networks

#artificialintelligence

Today, we officially begin our Neural Networks and Deep Learning Course as introduced here. We'll begin with a solid introduction to the concept of artificial neurons (perceptrons) in neural networks. Artificial neurons (also called Perceptrons, Units or Nodes) are the simplest elements or building blocks in a neural network. They are inspired by biological neurons that are found in the human brain. In this article, we'll discuss how perceptrons are inspired by biological neurons, draw the structure of a perceptron, discuss the two mathematical functions inside a perceptron and finally, we'll perform some calculations inside a perceptron.


Robust Linear Predictions: Analyses of Uniform Concentration, Fast Rates and Model Misspecification

arXiv.org Machine Learning

Linear prediction is the cornerstone of a significant group of statistical learning algorithms including linear regression, Support Vector Machines (SVM), regularized regressions (such as ridge, elastic net, lasso, and their variants), logistic regression, Poisson regression, probit models, single-layer perceptrons, and tensor regression, just to name a few. Thus, developing a deeper understanding of the pertinent linear prediction models and generalizing the methods to provide unified theoretical bounds is of critical importance to the machine learning community. For the past few decades, researchers have unveiled different aspects of these linear models. Bartlett and Shawe-Taylor (1999) obtained high confidence generalization error bounds for SVMs and other learning algorithms such as boosting and Bayesian posterior classifier. Vapnik-Chervonenkis (VC) theory (Vapnik, 2013) and Rademacher complexity (Bartlett and Mendelson, 2001, 2002) have been instrumental in the machine learning literature to provide generalization bounds (Shalev-Shwartz and Ben-David, 2014). Theoretical properties of the multiple-instance extensions of SVM were analyzed by Doran and Ray (2014). Joint first authors contributed equally to this work.


Day 48: 60 days of Data Science and Machine Learning Series

#artificialintelligence

A multilayer perceptron (MLP) is a class of feedforward artificial neural network composed of an input layer that receives the signal, an output layer that makes a decision or prediction about the input, and an arbitrary number of hidden layers that perform the computation.
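
The layer structure described above can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions: the sigmoid activation, the layer sizes and all weight values are made up for the example and do not come from the article.

```python
import math


def sigmoid(z):
    """Logistic activation squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Feed the signal through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [0.25, 0.75], [-1.0, 1.0]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]
output = mlp_forward([1.0, 2.0], layers)
print(output)
```

Adding more hidden layers only means appending more (weights, biases) pairs to the list; the forward pass itself does not change.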


Machine Learning for Zombies

#artificialintelligence

Multilayer Perceptrons (MLPs) are complex algorithms that take a lot of compute power and a *ton* of data in order to produce satisfactory results in reasonable timeframes. Let's start with what they're not: neural networks, despite the name and every blog post and intro to machine learning textbook you've probably read up till now, are not analogs of the human brain. There are some *very* surface-level similarities, but the actual functionality of a neural network has almost nothing in common with the neurons that make up the approximately three pounds of meat that sits between your ears and defines everything you do and how you experience reality. Just like a lot of other machine learning algorithms, they use the formula "label equals weight times data value plus offset" (or y = w*x + b) to define where they draw their lines/hyperplanes for making predictions. (In machine learning, that slope is called a weight.)
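
To make the "weight times data value plus offset" formula concrete, here is a tiny sketch with made-up numbers. The function names and the values of w and b are assumptions for illustration only.

```python
def predict(x, w, b):
    """The line y = w*x + b: w is the slope (weight), b is the offset (bias)."""
    return w * x + b


def boundary(w, b):
    """The x value where the line crosses zero, i.e. where a prediction flips sign."""
    return -b / w


w, b = 2.0, 1.0             # hypothetical weight and offset
y = predict(3.0, w, b)      # 2.0 * 3.0 + 1.0
crossing = boundary(w, b)   # -1.0 / 2.0
print(y, crossing)          # -> 7.0 -0.5
```

With more than one input, w*x becomes a sum of weight-times-value terms and the single crossing point becomes a hyperplane.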


RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality

arXiv.org Artificial Intelligence

Compared to convolutional layers, fully-connected (FC) layers are better at modeling the long-range dependencies but worse at capturing the local patterns, hence usually less favored for image recognition. In this paper, we propose a methodology, Locality Injection, to incorporate local priors into an FC layer via merging the trained parameters of a parallel conv kernel into the FC kernel. Locality Injection can be viewed as a novel Structural Re-parameterization method since it equivalently converts the structures via transforming the parameters. Based on that, we propose a multi-layer-perceptron (MLP) block named RepMLP Block, which uses three FC layers to extract features, and a novel architecture named RepMLPNet. The hierarchical design distinguishes RepMLPNet from the other concurrently proposed vision MLPs. As it produces feature maps of different levels, it qualifies as a backbone model for downstream tasks like semantic segmentation. Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation. The code and models are available at https://github.com/DingXiaoH/RepMLP.
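
The idea of merging a parallel conv kernel into an FC kernel rests on the fact that a convolution is a linear map, so it can be written as a matrix and added to the FC weights. The sketch below shows this for a 1D signal with zero padding. It is a simplified illustration under assumed shapes and values, not the authors' implementation from the linked repository.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product, i.e. a bias-free FC layer."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def conv1d_same(vector, kernel):
    """1D cross-correlation with zero padding so the output keeps the input length."""
    pad, n = len(kernel) // 2, len(vector)
    out = []
    for i in range(n):
        total = 0.0
        for j, k in enumerate(kernel):
            src = i + j - pad
            if 0 <= src < n:
                total += k * vector[src]
        out.append(total)
    return out


def conv_as_matrix(kernel, n):
    """The n x n matrix whose product with a vector equals conv1d_same."""
    pad = len(kernel) // 2
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j, k in enumerate(kernel):
            src = i + j - pad
            if 0 <= src < n:
                matrix[i][src] = k
    return matrix


# Hypothetical trained parameters and input.
fc_weights = [[0.1 * (r + c) for c in range(4)] for r in range(4)]
kernel = [0.5, 1.0, -0.5]
x = [1.0, 2.0, 3.0, 4.0]

# Training-time structure: FC branch plus a parallel conv branch.
two_branch = [a + b for a, b in zip(matvec(fc_weights, x), conv1d_same(x, kernel))]

# Inference-time structure: the conv kernel is folded into the FC weights.
merged = [
    [w + c for w, c in zip(w_row, c_row)]
    for w_row, c_row in zip(fc_weights, conv_as_matrix(kernel, 4))
]
single_branch = matvec(merged, x)
print(two_branch, single_branch)
```

Both structures give the same output up to floating-point rounding, which is what makes the conversion equivalent rather than approximate.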


Eigenspace Restructuring: a Principle of Space and Frequency in Neural Networks

arXiv.org Artificial Intelligence

Understanding the fundamental principles behind the massive success of neural networks is one of the most important open questions in deep learning. However, due to the highly complex nature of the problem, progress has been relatively slow. In this note, through the lens of infinite-width networks, a.k.a. neural kernels, we present one such principle resulting from hierarchical localities. It is well-known that the eigenstructure of infinite-width multilayer perceptrons (MLPs) depends solely on the concept frequency, which measures the order of interactions. We show that the topologies from deep convolutional networks (CNNs) restructure the associated eigenspaces into finer subspaces. In addition to frequency, the new structure also depends on the concept space, which measures the spatial distance among nonlinear interaction terms. The resulting fine-grained eigenstructure dramatically improves the network's learnability, empowering it to simultaneously model a much richer class of interactions, including Long-Range-Low-Frequency interactions, Short-Range-High-Frequency interactions, and various interpolations and extrapolations in-between. Additionally, model scaling can improve the resolutions of interpolations and extrapolations and, therefore, the network's learnability. Finally, we prove a sharp characterization of the generalization error for infinite-width CNNs of any depth in the high-dimensional setting. Two corollaries follow: (1) infinite-width deep CNNs can break the curse of dimensionality without losing their expressivity, and (2) scaling improves performance in both the finite and infinite data regimes.