Perceptrons
Multi-objective Evolutionary Federated Learning
Federated learning is an emerging technique used to prevent the leakage of private information. Unlike centralized learning that needs to collect data from users and store them collectively on a cloud server, federated learning makes it possible to learn a global model while the data are distributed on the users' devices. However, compared with the traditional centralized approach, the federated setting consumes considerable communication resources of the clients, which is indispensable for updating global models and prevents this technique from being widely used. In this paper, we aim to optimize the structure of the neural network models in federated learning using a multi-objective evolutionary algorithm to simultaneously minimize the communication costs and the global model test errors. A scalable method for encoding network connectivity is adapted to federated learning to enhance the efficiency in evolving deep neural networks. Experimental results on both multilayer perceptrons and convolutional neural networks indicate that the proposed optimization method is able to find optimized neural network models that can not only significantly reduce communication costs but also improve the learning performance of federated learning compared with the standard fully connected neural networks.
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Chen, Howard, Suhr, Alane, Misra, Dipendra, Snavely, Noah, Artzi, Yoav
We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task. We introduce the Touchdown task and dataset, where an agent must first follow navigation instructions in a real-life visual urban environment to a goal position, and then identify in the observed image a location described in natural language to find a hidden object. The data contains 9,326 examples of English instructions and spatial descriptions paired with demonstrations. We perform qualitative linguistic analysis, and show that the data displays richer use of spatial reasoning compared to related resources. Empirical analysis shows the data presents an open challenge to existing methods.
An Adversarial Approach for Explainable AI in Intrusion Detection Systems
Marino, Daniel L., Wickramasinghe, Chathurika S., Manic, Milos
Despite the growing popularity of modern machine learning techniques (e.g. Deep Neural Networks) in cyber-security applications, most of these models are perceived as a black box by the user. Adversarial machine learning offers an approach to increase our understanding of these models. In this paper we present an approach to generate explanations for incorrect classifications made by data-driven Intrusion Detection Systems (IDSs). An adversarial approach is used to find the minimum modifications (of the input features) required to correctly classify a given set of misclassified samples. The magnitude of such modifications is used to visualize the most relevant features that explain the reason for the misclassification. The presented methodology generated satisfactory explanations that describe the reasoning behind the misclassifications, with descriptions that match expert knowledge. The advantages of the presented methodology are that it 1) is applicable to any classifier with defined gradients, 2) does not require any modification of the classifier model, and 3) can be extended to perform further diagnosis (e.g. vulnerability assessment) and gain further understanding of the system. Experimental evaluation was conducted on the NSL-KDD99 benchmark dataset using Linear and Multilayer perceptron classifiers. The results are shown using intuitive visualizations in order to improve the interpretability of the results.
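The idea of a "minimum modification" can be illustrated for the simplest case, a linear binary classifier. The sketch below is not the authors' method (the abstract describes an approach for any classifier with defined gradients); it assumes a linear score w·x + b and uses the closed-form smallest L2 change that moves a sample just across the decision boundary. The names `minimal_change`, `weights`, `bias` and the toy numbers are all hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def minimal_change(weights, bias, x, margin=1e-6):
    """Smallest L2 change to x that flips the sign of the linear score.

    Assumption: the classifier is sign(w.x + b). The shortest path to the
    decision boundary is along w; `margin` pushes slightly past it.
    """
    score = dot(weights, x) + bias
    norm_sq = dot(weights, weights)
    scale = -score / norm_sq * (1 + margin)
    return [scale * w for w in weights]


# Hypothetical misclassified sample: true class is positive, score is negative.
weights = [0.5, -2.0, 0.1]
bias = 0.2
x = [1.0, 1.0, 1.0]

delta = minimal_change(weights, bias, x)
x_fixed = [xi + di for xi, di in zip(x, delta)]

# Features ranked by how much they had to change (largest first).
relevance = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
```

In this toy case the feature with the largest required change is the one with the largest weight magnitude, which is the kind of per-feature magnitude the abstract proposes to visualize.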
Multi-label classification search space in the MEKA software
de Sá, Alex G. C., Freitas, Alex A., Pappa, Gisele L.
This technical report describes the multi-label classification (MLC) search space in the MEKA software, including the traditional/meta MLC algorithms, and the traditional/meta/pre-processing single-label classification (SLC) algorithms. The SLC search space is also studied because it is part of the MLC search space, as several methods use problem transformation methods to create a solution (i.e., a classifier) for an MLC problem. This was done in order to better understand the MLC algorithms. Finally, we propose a grammar that formally expresses this understanding.
Breakthrough neural network paves the way for quantum AI
Italian researchers recently developed the first functioning quantum neural network by running a special algorithm on an actual quantum computer. The team, led by Francesco Tacchino of the University of Pavia in Italy, pre-published their research on ArXiv earlier this month in a research paper titled "An Artificial Neuron Implemented on an Actual Quantum Processor." Basically, they developed a single-layer artificial neural network (ANN) that runs on a quantum computer. This kind of rudimentary ANN is called a perceptron, and it's the basic building block of more robust neural networks. Previous attempts at building a perceptron on a quantum system have involved treating individual qubits as neurons in a network.
Deep into Learning : 4. Perceptron and other alien creatures
In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector. In simple words, a perceptron is a computational unit (or simply a function) which takes in a set of inputs and gives out a result. The role of the perceptron is to find a function which will approximately satisfy all the given data. Let's say X₁ = (x₁₁, x₁₂, x₁₃) is one data point in a data set X = (X₁, X₂, …, Xₙ).
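The description above can be made concrete with a minimal sketch of the classic perceptron learning rule. The function names, the learning rate, the epoch limit and the toy data (logical AND) are assumptions chosen for illustration, not taken from the article.

```python
def predict(weights, bias, x):
    """Return 1 if the linear score w.x + b is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule: adjust the weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:  # every sample is classified correctly
            break
    return weights, bias


# Hypothetical toy data: logical AND, which is linearly separable.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]

w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
```

Because the AND data are linearly separable, the rule stops updating once every point falls on the correct side of the learned line; on data that are not linearly separable it would simply run until the epoch limit.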
28. Neural Networks
In the context of this course, we view neural networks as "just" another nonlinear hypothesis space. On the practical side, unlike trees and tree-based ensembles (our other major nonlinear hypothesis spaces), neural networks can be fit using gradient-based optimization methods. We discuss the specific case of the multilayer perceptron for multiclass classification, which we view as a generalization of multinomial logistic regression from linear to nonlinear score functions.
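One way to see the "generalization" mentioned above: multinomial logistic regression applies a softmax to linear scores, while a multilayer perceptron applies the same softmax to scores computed through a nonlinear hidden layer. The sketch below shows only the forward pass; the tanh activation, the single hidden layer and all parameter values are assumptions for illustration rather than details from the course.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def affine(W, b, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def logistic_regression_probs(W, b, x):
    """Multinomial logistic regression: linear scores, then softmax."""
    return softmax(affine(W, b, x))


def mlp_probs(W1, b1, W2, b2, x):
    """One-hidden-layer MLP: nonlinear scores, then the same softmax."""
    hidden = [math.tanh(h) for h in affine(W1, b1, x)]
    return softmax(affine(W2, b2, hidden))


# Hypothetical parameters: 2 input features, 3 hidden units, 3 classes.
x = [0.5, -1.0]
W_lin, b_lin = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]
W1, b1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.1, -0.1]
W2, b2 = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 1.0]], [0.0, 0.0, 0.0]

p_linear = logistic_regression_probs(W_lin, b_lin, x)
p_mlp = mlp_probs(W1, b1, W2, b2, x)
```

Both models end in the same softmax, so both can be fit by gradient-based minimization of the same log loss; only the score function differs.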
Toward Efficient Breast Cancer Diagnosis and Survival Prediction Using L-Perceptron
Mansourifar, Hadi, Shi, Weidong
Breast cancer is the most frequently reported cancer type among women around the globe, and it has the second-highest female fatality rate among all cancer types. Despite all the progress made in prevention and early intervention, early prognosis and survival prediction rates are still unsatisfactory. In this paper, we propose a novel type of perceptron called L-Perceptron, which outperforms all the previous supervised learning methods by reaching 97.42% and 98.73% in terms of accuracy and sensitivity, respectively, on the Wisconsin Breast Cancer dataset. Experimental results on Haberman's Breast Cancer Survival dataset show the superiority of the proposed method by reaching 75.18% and 83.86% in terms of accuracy and F1 score, respectively. The results are the best reported ones obtained in 10-fold cross-validation in the absence of any preprocessing or feature selection.
Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
Barbier, Jean, Krzakala, Florent, Macris, Nicolas, Miolane, Léo, Zdeborová, Lenka
Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Non-rigorous predictions for the optimal errors existed for special cases of GLMs, e.g. for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multi-purpose algorithms. This paper is divided into two parts that can be read independently: the first part (main part) presents the model and main results, discusses some applications and sketches the main ideas of the proof. The second part (supplementary information) is much more detailed and provides more examples as well as all the proofs.
Adaptive Extreme Learning Machine for Recurrent Beta-basis Function Neural Network Training
Chouikhi, Naima, Alimi, Adel M.
The Beta Basis Function Neural Network (BBFNN) is a special kind of kernel-basis neural network. It is a feedforward network typified by the use of the beta function as a hidden activation function. Beta is a flexible transfer function representing richer forms than the common existing functions. As in every network, the architecture setting and the learning method are the two main challenges faced by BBFNN. In this paper, a new architecture and training algorithm are proposed for the BBFNN. An Extreme Learning Machine (ELM) is used as a training approach for BBFNN with the aim of speeding up the training process. The peculiarity of ELM is that it reduces computing time and complexity relative to the BBFNN learning algorithms already in use, such as backpropagation, OLS, etc. For the architectural design, a recurrent structure is added to the common BBFNN architecture in order to make it better able to deal with complex, nonlinear and time-varying problems. Throughout this paper, the conceived recurrent ELM-trained BBFNN is tested on a number of tasks related to time series prediction, classification and regression. Experimental results show noticeable achievements of the proposed network compared to common feed-forward and recurrent networks trained by ELM and using hyperbolic tangent as activation function. These achievements are in terms of accuracy and robustness against data breakdowns such as noise signals. The appeal of machine learning has resurged owing to the high popularity of data mining and analysis. In fact, in a world full of available data varieties, computational processing seems to be very useful, as it is cheap and powerful and it ensures affordable data handling [1] [2]. Automatic data treatment has provided quick and accurate models which are capable of handling much more complex data and delivering more precise results. They work not only on small data but also on very large-scale data [3].