Perceptrons
An interpretable neural network model through piecewise linear approximation
Guo, Mengzhuo, Zhang, Qingpeng, Liao, Xiuwu, Zeng, Daniel Dajun
Most existing interpretable methods explain a black-box model in a post-hoc manner, which uses simpler models or data analysis techniques to interpret the predictions after the model is learned. However, they (a) may derive contradictory explanations on the same predictions given different methods and data samples, and (b) focus on using simpler models to provide higher descriptive accuracy at the sacrifice of prediction accuracy. To address these issues, we propose a hybrid interpretable model that combines a piecewise linear component and a nonlinear component. The first component describes the explicit feature contributions by piecewise linear approximation to increase the expressiveness of the model. The other component uses a multi-layer perceptron to capture feature interactions and implicit nonlinearity, and increase the prediction performance. Different from the post-hoc approaches, the interpretability is obtained once the model is learned in the form of feature shapes. We also provide a variant to explore higher-order interactions among features to demonstrate that the proposed model is flexible for adaptation. Experiments demonstrate that the proposed model can achieve good interpretability by describing feature shapes while maintaining state-of-the-art accuracy.
Memory capacity of neural networks with threshold and ReLU activations
Overwhelming theoretical and empirical evidence shows that mildly overparametrized neural networks -- those with more connections than the size of the training data -- are often able to memorize the training data with $100\%$ accuracy. This was rigorously proved for networks with sigmoid activation functions and, very recently, for ReLU activations. Addressing a 1988 open question of Baum, we prove that this phenomenon holds for general multilayered perceptrons, i.e. neural networks with threshold activation functions, or with any mix of threshold and ReLU activations. Our construction is probabilistic and exploits sparsity.
The differences between Artificial and Biological Neural Networks
Although artificial neurons and perceptrons were inspired by the biological processes scientists were able to observe in the brain back in the 1950s, they do differ from their biological counterparts in several ways. Birds have inspired flight and horses have inspired locomotives and cars, yet none of today's transportation vehicles resemble metal skeletons of living, breathing, self-replicating animals. Still, our limited machines are even more powerful in their own domains (thus, more useful to us humans) than their animal "ancestors" could ever be. It is easy to draw the wrong conclusions about the possibilities in AI research by anthropomorphizing Deep Neural Networks, but artificial and biological neurons do differ in more ways than just the materials of their containers. The idea behind perceptrons (the predecessors to artificial neurons) is that it is possible to mimic certain parts of neurons, such as dendrites, cell bodies and axons, using simplified mathematical models of what limited knowledge we have of their inner workings: signals can be received from dendrites, and sent down the axon once enough signals have been received.
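To make the "fire once enough signals were received" idea concrete, here is a minimal sketch of a threshold unit. The function name, the weights, and the threshold value are illustrative assumptions, not something taken from the article.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 (fire) once the weighted sum of incoming signals reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Two equally weighted "dendrites" (assumed values). With a threshold of 2.0
# the unit fires only when both inputs are active, i.e. it behaves like AND.
and_weights = [1.0, 1.0]
outputs = [
    threshold_neuron([a, b], and_weights, threshold=2.0)
    for a in (0, 1)
    for b in (0, 1)
]
print(outputs)  # [0, 0, 0, 1]
```

Lowering the threshold to 1.0 would turn the same unit into a logical OR, which shows how much of the behavior sits in a single number.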
Intelligent Road Inspection with Advanced Machine Learning; Hybrid Prediction Models for Smart Mobility and Transportation Maintenance Systems
Karballaeezadeh, Nader, Zaremotekhases, Farah, Shamshirband, Shahaboddin, Mosavi, Amir, Nabipour, Narjes, Csiba, Peter, Varkonyi-Koczy, Annamaria R.
School of the Built Environment, Oxford Brookes University, Oxford OX3 0BP, UK; a.mosavi@brookes.ac.uk Abstract: Prediction models in mobility and transportation maintenance systems have been dramatically improved through the use of machine learning methods. The traditional road inspection systems based on the pavement condition index (PCI) are often associated with critical safety, energy and cost issues. Alternatively, the proposed models utilize surface deflection data from falling weight deflectometer (FWD) tests to predict the PCI. The machine learning methods are the single multi-layer perceptron (MLP) and radial basis function (RBF) neural networks as well as their hybrids, i.e., Levenberg-Marquardt (MLP-LM), scaled conjugate gradient (MLP-SCG), imperialist competitive (RBF-ICA), and genetic algorithms (RBF-GA). Furthermore, the committee machine intelligent systems (CMIS) method was adopted to combine the results and improve the accuracy of the modeling. The results of the analysis have been verified using four criteria: average percent relative error (APRE), average absolute percent relative error (AAPRE), root mean square error (RMSE), and standard error (SD). The CMIS model outperforms the other models with the promising results of APRE 2.3303, AAPRE 11.6768, RMSE 12.0056, and SD 0.0210. Introduction In road transportation, pavement plays a vital role as the part of the road that is in direct contact with vehicles. Users' judgment about the quality of road service is primarily predicated upon pavement conditions. The Maintenance, Rehabilitation, and Reconstruction (MR&R) program of a pavement network is a multidimensional decision-making process that takes into account several considerations.
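Three of the error criteria named in the abstract can be sketched as follows. The definitions below are the commonly used ones and are an assumption here: the excerpt does not give the formulas, and in particular the sign convention for APRE (predicted minus actual) may differ from the paper's. The PCI values are made up for illustration.

```python
import math


def apre(actual, predicted):
    """Average percent relative error (signed; assumed convention: predicted - actual)."""
    return 100.0 / len(actual) * sum((p - a) / a for a, p in zip(actual, predicted))


def aapre(actual, predicted):
    """Average absolute percent relative error."""
    return 100.0 / len(actual) * sum(abs((p - a) / a) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical PCI values, not data from the paper.
actual_pci = [50.0, 80.0, 100.0]
predicted_pci = [55.0, 76.0, 100.0]
print(apre(actual_pci, predicted_pci))
print(aapre(actual_pci, predicted_pci))
print(rmse(actual_pci, predicted_pci))
```

Because APRE is signed, over- and under-predictions cancel, so it measures bias, while AAPRE and RMSE measure the size of the errors.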
Perceptron and its implementation in Python
The dataset that we consider for implementing the Perceptron is the Iris flower dataset. This dataset contains 4 features that describe each flower, and each flower belongs to one of 3 classes. We strip the last 50 rows of the dataset, which belong to the class 'Iris-virginica', and use only the 2 classes 'Iris-setosa' and 'Iris-versicolor'. These two classes are linearly separable, so the perceptron algorithm is guaranteed to converge, eventually finding weights that separate them. Visualizing the dataset with 2 of the features, we can see that the two classes can be clearly separated by drawing a straight line between them. Our goal is to write an algorithm that finds that line and classifies all of these data points correctly.
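As a rough sketch of the algorithm described above, the following uses the classic perceptron update rule on two features. To stay self-contained it does not load Iris; the six points are hypothetical stand-ins for two linearly separable classes, and the function names and default parameters are assumptions rather than the article's own code.

```python
def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the line x falls."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else -1


def train_perceptron(samples, labels, learning_rate=1.0, max_epochs=100):
    """Perceptron rule: adjust the weights only when a point is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if predict(weights, bias, x) != y:
                # Move the decision boundary toward classifying x correctly.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: the classes are separated
            break
    return weights, bias


# Hypothetical stand-in data: two features per sample, labels -1 and +1.
samples = [(1.0, 1.0), (1.5, 0.5), (2.0, 1.0), (4.0, 4.5), (5.0, 4.0), (4.5, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(weights, bias)
print(predictions == labels)
```

The early exit when an epoch produces no mistakes relies on the classes being linearly separable; on non-separable data the loop would simply run until `max_epochs`.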
Wine quality rapid detection using a compact electronic nose system: application focused on spoilage thresholds by acetic acid
Gamboa, Juan C. Rodriguez, E., Eva Susana Albarracin, da Silva, Adenilton J., Leite, Luciana, Ferreira, Tiago A. E.
It is crucial for the wine industry to have methods like electronic nose systems (E-Noses) for real-time monitoring of acetic acid thresholds in wines, preventing spoilage or determining quality. In this paper, we show that the portable and compact self-developed E-Nose, based on thin film semiconductor (SnO2) sensors and trained with an approach that uses a deep Multilayer Perceptron (MLP) neural network, can perform early detection of wine spoilage thresholds in routine tasks of wine quality control. To obtain rapid and online detection, we propose a rising-window method focused on raw data processing to find an early portion of the sensor signals with the best recognition performance. Our approach was compared with the conventional approach employed in E-Noses for gas recognition, which involves feature extraction and selection techniques for preprocessing data, followed by a Support Vector Machine (SVM) classifier. The results show that it is possible to classify three wine spoilage levels in 2.7 seconds after the gas injection point, implying a methodology 63 times faster than the results obtained with the conventional approach in our experimental setup.
Under the Hood of Deep Learning
The previous image is a simple architecture for a deep neural network. The goal of this post is to understand the details of deep learning and build your own network, rather than use the existing models as a black box! In this post, we will go over a simple neural network that can learn to recognize hand-written digits (MNIST dataset). Currently, there are various types of neural networks, but for the sake of simplicity, we will start with the vanilla form (aka "Multilayer Perceptron"). Please note that the circles in the previous diagrams are called neurons.
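A forward pass through such a vanilla network can be sketched in a few lines. This is only an illustration under assumptions: the single hidden layer of 16 neurons, the ReLU activation, the random untrained weights, and the random "image" are all placeholders, not the architecture from the post. Only the input size (28x28 = 784 pixels) and the 10 output classes follow from MNIST itself.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: each neuron takes a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained)."""
    weights = [[rng.gauss(0.0, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w1, b1 = init_layer(784, 16, rng)  # input pixels -> hidden neurons (16 is assumed)
w2, b2 = init_layer(16, 10, rng)   # hidden neurons -> one output per digit

image = [rng.random() for _ in range(784)]  # stand-in for a flattened 28x28 digit
hidden = relu(dense(image, w1, b1))
probabilities = softmax(dense(hidden, w2, b2))
print(len(probabilities), sum(probabilities))
```

With untrained weights the ten probabilities are close to uniform; training would adjust `w1`, `b1`, `w2`, and `b2` so that the entry for the correct digit dominates.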
Emergence of Network Motifs in Deep Neural Networks
Zambra, Matteo, Testolin, Alberto, Maritan, Amos
Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called "network motifs". In this article we show that network science tools can be successfully applied also to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Our simulations show that the final network topology is primarily shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks.
Online Algorithms for Multiclass Classification using Partial Labels
Bhattacharjee, Rajarshi, Manwani, Naresh
In this paper, we propose online algorithms for multiclass classification using partial labels. We propose two variants of Perceptron called Avg Perceptron and Max Perceptron to deal with the partial labeled data. We also propose Avg Pegasos and Max Pegasos, which are extensions of Pegasos algorithm. We also provide mistake bounds for Avg Perceptron and regret bound for Avg Pegasos. We show the effectiveness of the proposed approaches by experimenting on various data sets and comparing them with the standard Perceptron and Pegasos.
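The abstract does not spell out the update rules, so the following is not the authors' Avg Perceptron. It is a hypothetical sketch of what a perceptron-style update could look like when each example comes with a set of candidate labels instead of one true label: if the top-scoring class lies outside the candidate set, every candidate class is pulled toward the example by an equal share and the predicted class is pushed away. All names are assumptions.

```python
def class_scores(weights, x):
    """One linear score per class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def partial_label_update(weights, x, candidates):
    """Hypothetical averaged update; returns True if the weights were changed."""
    s = class_scores(weights, x)
    predicted = max(range(len(s)), key=lambda k: s[k])
    if predicted in candidates:
        return False  # no mistake can be detected from the partial label
    share = 1.0 / len(candidates)
    for k in candidates:
        weights[k] = [w + share * xi for w, xi in zip(weights[k], x)]
    weights[predicted] = [w - xi for w, xi in zip(weights[predicted], x)]
    return True


# Three classes, two features, weights start at zero.
weights = [[0.0, 0.0] for _ in range(3)]
x = (1.0, 0.0)
candidates = {1, 2}  # the true label is known only to be class 1 or class 2

first = partial_label_update(weights, x, candidates)
second = partial_label_update(weights, x, candidates)
print(first, second, weights)
```

Spreading the positive update evenly over the candidate set is one simple way to use a partial label without knowing which candidate is correct.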
Bandit Multiclass Linear Classification for the Group Linear Separable Case
Fakcharoenphol, Jittat, Prompak, Chayutpong
We consider the online multiclass linear classification under the bandit feedback setting. Beygelzimer, P\'{a}l, Sz\"{o}r\'{e}nyi, Thiruvenkatachari, Wei, and Zhang [ICML'19] considered two notions of linear separability, weak and strong linear separability. When examples are strongly linearly separable with margin $\gamma$, they presented an algorithm based on Multiclass Perceptron with mistake bound $O(K/\gamma^2)$, where $K$ is the number of classes. They employed a rational kernel to deal with examples under the weakly linearly separable condition, and obtained the mistake bound of $\min(K\cdot 2^{\tilde{O}(K\log^2(1/\gamma))},K\cdot 2^{\tilde{O}(\sqrt{1/\gamma}\log K)})$. In this paper, we refine the notion of weak linear separability to support the notion of class grouping, called the group weakly linearly separable condition. This situation may arise from the fact that class structures contain inherent grouping. We show that under this condition, we can also use the rational kernel and obtain the mistake bound of $K\cdot 2^{\tilde{O}(\sqrt{1/\gamma}\log L)}$, where $L\leq K$ represents the number of groups.