 Perceptrons


Group-Connected Multilayer Perceptron Networks

arXiv.org Machine Learning

Despite the success of deep learning in domains such as image, voice, and graphs, there has been little progress in deep representation learning for domains without a known structure between features. One example is a tabular dataset of different demographic and clinical factors, where the feature interactions are not given as a prior. In this paper, we propose Group-Connected Multilayer Perceptron (GMLP) networks to enable deep representation learning in these domains. GMLP is based on the idea of learning expressive feature combinations (groups) and exploiting them to reduce the network complexity by defining local group-wise operations. During the training phase, GMLP learns a sparse feature grouping matrix using a temperature-annealed softmax with an added entropy loss term to encourage sparsity. Furthermore, an architecture is suggested which resembles binary trees, where group-wise operations are followed by pooling operations to combine information, reducing the number of groups as the network grows in depth. To evaluate the proposed method, we conducted experiments on five different real-world datasets covering various application areas. Additionally, we provide visualizations on MNIST and synthesized data. According to the results, GMLP is able to successfully learn and exploit expressive feature combinations and achieve state-of-the-art classification performance on different datasets.
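The interplay between softmax temperature and entropy mentioned in the abstract can be illustrated with a minimal sketch. This is not the GMLP implementation: the logits, the three-group setup, and the temperature schedule below are hypothetical values chosen only to show that lowering the temperature drives a softmax row toward a near one-hot (sparse) assignment, which is also what an entropy penalty rewards.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical routing logits: one feature scored against three candidate groups.
logits = [1.0, 0.5, -0.2]

# Hypothetical annealing schedule: high temperature first, low temperature last.
for temperature in (2.0, 1.0, 0.1):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

As the temperature drops, the printed entropy shrinks and the largest probability approaches 1, so an entropy term added to the loss pushes in the same direction as the annealing.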


The Mathematics of Data Science: Understanding the foundations of Deep Learning through Linear Regression

#artificialintelligence

In this longish post, I have tried to explain Deep Learning starting from familiar ideas like machine learning. This approach forms a part of my forthcoming book. You can connect with me on LinkedIn to know more about the book. I have used this approach in my teaching. It is based on 'learning by exception,' i.e. understanding one concept and its limitations and then understanding how the subsequent concept overcomes that limitation. We thus develop a chain of thought that starts with linear regression and extends to multilayer perceptron (Deep Learning).



Transparent Classification with Multilayer Logical Perceptrons and Random Binarization

arXiv.org Machine Learning

Models with transparent inner structure and high classification performance are required to reduce potential risk and provide trust for users in domains like health care, finance, security, etc. However, existing models struggle to satisfy both properties simultaneously. In this paper, we propose a new hierarchical rule-based model for classification tasks, named Concept Rule Sets (CRS), which has both a strong expressive ability and a transparent inner structure. To address the challenge of efficiently learning the non-differentiable CRS model, we propose a novel neural network architecture, Multilayer Logical Perceptron (MLLP), which is a continuous version of CRS. Using MLLP and the Random Binarization (RB) method we propose, we can search for the discrete solution of CRS in continuous space using gradient descent and ensure the discrete CRS acts almost the same as the corresponding continuous MLLP. Experiments on 12 public data sets show that CRS outperforms the state-of-the-art approaches and the complexity of the learned CRS is close to the simple decision tree.

Introduction

Owing to its strong data-modeling ability, machine learning, especially deep learning, has become the main paradigm for decision-making systems (Goodfellow et al. 2016; Doshi-Velez and Kim 2017). These decision-making systems are widely used in important areas such as medicine, finance, politics, and law, where people need explanations of why decisions are made to ensure their safety and protect their rights (Goodman and Flaxman 2016; Lipton 2016). As a result, the demand for the transparency of machine learning methods is increasing, which is crucial for earning the trust of users (Doshi-Velez and Kim 2017) and reducing potential risks and bugs (Chu et al. 2018). However, most machine learning models can hardly ensure good predictive ability and transparency at the same time, and sacrificing transparency for good performance could result in serious consequences.


All About Perceptron in Deep Learning Why Bias is Used in Neural Networks

#artificialintelligence

Complete Video Series on "Hands on Artificial Intelligence, Machine Learning & Deep Learning using TensorFlow, Keras and Python"

I am Gulshan Yadav, an Embedded Systems Development professional with nearly 13 years of R&D experience in the design and development of embedded products in the Automotive, IoT and AI domains.

About this Video:
This video will explain
1. What is a Perceptron / Artificial Neuron
2. Basic Building Blocks of a Perceptron
3. How a Perceptron works
4. Why Bias is used in Perceptrons and Artificial Neural Networks

Social Links:
Twitter: https://twitter.com/techopcode
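The role of the bias term listed in the video topics can be shown with a small sketch. It is not taken from the video: it assumes a single perceptron with a step activation, the classic perceptron learning rule, and a hypothetical AND-gate dataset. Without a bias, the decision boundary must pass through the origin, so no choice of weights can classify all four AND inputs correctly; with a bias, the rule converges.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 when the weighted sum plus bias is positive."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, use_bias=True, epochs=20, learning_rate=1.0):
    """Perceptron learning rule; the bias stays at 0 when use_bias is False."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            if use_bias:
                bias += learning_rate * error
    return weights, bias


def count_correct(samples, weights, bias):
    """Number of samples whose prediction matches the target."""
    return sum(predict(weights, bias, x) == t for x, t in samples)


# Hypothetical toy dataset: the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

for use_bias in (True, False):
    weights, bias = train_perceptron(AND_GATE, use_bias=use_bias)
    label = "with bias" if use_bias else "without bias"
    print(label, weights, bias,
          count_correct(AND_GATE, weights, bias), "of", len(AND_GATE))
```

The bias acts as a learnable threshold: it lets the perceptron require both inputs to be active before firing, which the weights alone cannot express here.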


Cardiac Arrhythmia Classification by Multi-Layer Perceptron and Convolution Neural Networks

#artificialintelligence

The electrocardiogram (ECG) plays an important role in the medical field, as it records the heart signal over time and is used to discover numerous cardiovascular diseases. If a recorded ECG signal has a certain irregularity in its predefined features, this is called arrhythmia, the types of which include tachycardia, bradycardia, supraventricular arrhythmias, and ventricular arrhythmias, etc. This has encouraged us to do research that consists of distinguishing between several arrhythmias by using deep neural network algorithms such as multi-layer perceptron (MLP) and convolution neural network (CNN). The TensorFlow library, developed by Google for deep learning and machine learning, is used in Python to implement the algorithms proposed here. The proposed algorithm consists of four hidden layers with weights and biases in the MLP, and four-layer convolution neural networks which map ECG samples to the different classes of arrhythmia.


The invisible workers of the AI era

#artificialintelligence

In the early days of research on Artificial Intelligence, Frank Rosenblatt, a scientist at Cornell University in the United States, invented what he called the "perceptron". The perceptron was an algorithm designed to classify objects it was shown and an ancestor of modern Artificial Intelligence. When Rosenblatt became a little boastful at a press conference in 1958, the New York Times picked up on it and went a little overboard with excitement. "NEW NAVY DEVICE LEARNS BY DOING; Psychologist Shows Embryo of Computer Designed to Read and Grow Wiser", read the title of an article. The Navy said the perceptron would be the first non-living mechanism "capable of receiving, recognizing and identifying its surroundings without any human training or control". Does this tone sound familiar?


What Are Neural Networks?

#artificialintelligence

Many of the biggest advances in AI are driven by artificial neural networks. Artificial Neural Networks (ANNs) are collections of mathematical functions joined together in a format inspired by the neural networks found in the human brain. These ANNs are capable of extracting complex patterns from data and applying these patterns to unseen data in order to classify or recognize it. In this way, the machine "learns". That's a quick rundown on neural networks, but let's take a closer look at them to better understand what they are and how they operate.


Richer priors for infinitely wide multi-layer perceptrons

arXiv.org Machine Learning

It is well-known that the distribution over functions induced through a zero-mean iid prior distribution over the parameters of a multi-layer perceptron (MLP) converges to a Gaussian process (GP), under mild conditions. We extend this result firstly to independent priors with general zero or non-zero means, and secondly to a family of partially exchangeable priors which generalise iid priors. We discuss how the second prior arises naturally when considering an equivalence class of functions in an MLP and through training processes such as stochastic gradient descent. The model resulting from partially exchangeable priors is a GP, with an additional level of inference in the sense that the prior and posterior predictive distributions require marginalisation over hyperparameters. We derive the kernels of the limiting GP in deep MLPs, and show empirically that these kernels avoid certain pathologies present in previously studied priors. We empirically evaluate our claims of convergence by measuring the maximum mean discrepancy between finite width models and limiting models. We compare the performance of our new limiting model to some previously discussed models on synthetic regression problems. We observe increasing ill-conditioning of the marginal likelihood and hyper-posterior as the depth of the model increases, drawing parallels with finite width networks which require notoriously involved optimisation tricks.


Flatsomatic: A Method for Compression of Somatic Mutation Profiles in Cancer

arXiv.org Machine Learning

In this study, we present Flatsomatic, a Variational Auto Encoder (VAE) optimized to compress somatic mutations, which allows for unbiased data compression whilst maintaining the signal. We compared two different neural network architectures for the VAE: Multilayer Perceptron (MLP) and bidirectional LSTM. The somatic profiles we used to train our models consisted of 8,062 Pan-Cancer patients from The Cancer Genome Atlas and 989 cell lines from the COSMIC cell line project. The profiles for each patient were represented by the genomic loci where somatic mutations occurred and, to reduce sparsity, the locations with a frequency <5 were removed. We enhanced the VAE performance by changing its evidence lower bound, and devised an F1-score based loss showing that it helps the VAE learn better than with binary cross-entropy. We also employed beta-VAE to weight the variational regularisation term in the loss function and showed the best performance through a preliminary function to increase the weight of the regularisation term with each epoch. We assessed the reconstruction ability of the VAE using the micro F1-score metric and showed that our best performing model was a 2-layer deep MLP VAE. Our analysis also showed that the size of the latent space did not have a significant effect on the VAE learning ability. We compared the Flatsomatic embeddings created to a lower dimension version of the data from principal component analysis, showing superior performance of Flatsomatic, and performed K-means clustering on both datasets to draw comparisons to known cancer types of each profile. Finally, we present results that confirm that the Flatsomatic representations of 64 dimensions maintain the same predictive power as the original 8,298 dimensions vector, through prediction of drug response.
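The micro F1-score used above to assess reconstruction can be sketched as follows. This is a generic illustration of the metric, not the authors' evaluation code: the function name and the tiny binary profiles are hypothetical. Micro-averaging pools true positives, false positives, and false negatives over every position of every profile before computing F1, so the abundant true negatives in sparse mutation vectors do not inflate the score.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over rows of 0/1 labels, pooling counts across all entries."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denominator = 2 * tp + fp + fn
    # With no positives in either input the score is undefined; return 0.0 by convention.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical binary mutation profiles (rows) and their reconstructions.
original = [[1, 0, 0, 1], [0, 0, 1, 0]]
reconstructed = [[1, 0, 1, 1], [0, 0, 0, 0]]

print(round(micro_f1(original, reconstructed), 4))
```

Here the pooled counts are 2 true positives, 1 false positive, and 1 false negative, giving 2·2 / (2·2 + 1 + 1) = 2/3.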