Perceptrons
A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Finzi, Marc, Welling, Max, Wilson, Andrew Gordon
Symmetries and equivariance are fundamental to the generalization of neural networks on domains such as images, graphs, and point clouds. Existing work has primarily focused on a small number of groups, such as the translation, rotation, and permutation groups. In this work we provide a completely general algorithm for solving for the equivariant layers of matrix groups. In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before, including $\mathrm{O}(1,3)$, $\mathrm{O}(5)$, $\mathrm{Sp}(n)$, and the Rubik's cube group. Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems. We release our software library to enable researchers to construct equivariant layers for arbitrary matrix groups.
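As a rough illustration of what "solving for the equivariant layers" can mean, the sketch below finds all linear maps $W$ that commute with a permutation representation by stacking the constraints $\rho(g) W = W \rho(g)$ for a set of group generators and taking a numerical nullspace. This is a minimal NumPy sketch under stated assumptions, not the authors' released library: the helper names `permutation_matrix` and `equivariant_basis`, the choice of the group $S_4$, and the dense SVD solver are all illustrative.

```python
import numpy as np

def permutation_matrix(perm):
    """Return the matrix P with P[i, perm[i]] = 1 (hypothetical helper)."""
    n = len(perm)
    P = np.zeros((n, n))
    P[np.arange(n), perm] = 1.0
    return P

def equivariant_basis(generators, tol=1e-10):
    """Basis of all W with g @ W == W @ g for every generator g.

    With row-major flattening, vec(g W) = (g kron I) vec(W) and
    vec(W g) = (I kron g.T) vec(W), so the constraint is linear in vec(W).
    """
    n = generators[0].shape[0]
    eye = np.eye(n)
    constraints = [np.kron(g, eye) - np.kron(eye, g.T) for g in generators]
    C = np.vstack(constraints)
    _, s, vt = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    # Rows of vt beyond the rank span the nullspace of C.
    return vt[rank:].reshape(-1, n, n)

# Assumption: S_4 acting on R^4, generated by a transposition and a 4-cycle.
swap = permutation_matrix([1, 0, 2, 3])
cycle = permutation_matrix([1, 2, 3, 0])
basis = equivariant_basis([swap, cycle])
print(basis.shape)
```

For the permutation group this recovers a two-dimensional solution space (spanned by the identity and the all-ones matrix), which matches the well-known form of permutation-equivariant linear layers.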
On the Computational Intelligibility of Boolean Classifiers
Audemard, Gilles, Bellart, Steve, Bounia, Louenas, Koriche, Frédéric, Lagniez, Jean-Marie, Marquis, Pierre
In this paper, we investigate the computational intelligibility of Boolean classifiers, characterized by their ability to answer XAI queries in polynomial time. The classifiers under consideration are decision trees, DNF formulae, decision lists, decision rules, tree ensembles, and Boolean neural nets. Using 9 XAI queries, including both explanation queries and verification queries, we show the existence of a large intelligibility gap between the families of classifiers. On the one hand, all 9 XAI queries are tractable for decision trees. On the other hand, none of them is tractable for DNF formulae, decision lists, random forests, boosted decision trees, Boolean multilayer perceptrons, and binarized neural networks.
Thermal transmittance prediction based on the application of artificial neural networks on heat flux method results
Gumbarević, Sanjin, Milovanović, Bojan, Gaši, Mergim, Bagarić, Marina
Deep energy renovation of the building stock has come more into focus in the European Union due to energy-efficiency-related directives. Many buildings that must undergo deep energy renovation are old and may lack design or renovation documentation, or the materials in their building elements may have degraded over time. Thermal transmittance (i.e. U-value) is one of the most important parameters for determining the transmission heat losses through building envelope elements. It depends on the thickness and thermal properties of all the materials that form a building element. The in-situ U-value can be determined using the ISO 9869-1 standard (Heat Flux Method, HFM). Still, measurement duration is one of the reasons why HFM is not widely used in field testing before the renovation design process commences. This paper analyzes the possibility of reducing the measurement time by conducting parallel measurements with one heat-flux sensor. This parallelization could be achieved by applying a specific class of Artificial Neural Network (ANN) to HFM results to predict unknown heat flux based on collected interior and exterior air temperatures. Once a satisfactory prediction is achieved, the HFM sensor can be relocated to another measuring location. The paper compares four ANN cases applied to HFM results for a measurement on one multi-layer wall: a multilayer perceptron with three neurons in one hidden layer, long short-term memory with 100 units, a gated recurrent unit with 100 units, and a combination of 50 long short-term memory units and 50 gated recurrent units. The analysis gave promising results in terms of predicting the heat flux rate based on the two input temperatures. Additional analysis on another wall showed possible limitations of the method, which serve as a direction for further research on this topic.
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
Collins, Benoit, Hayase, Tomohiro
Free Probability Theory (FPT) provides rich knowledge for handling mathematical difficulties caused by random matrices that appear in research on deep neural networks (DNNs), such as dynamical isometry, the Fisher information matrix, and training dynamics. FPT suits this research because the DNN's parameter-Jacobian and input-Jacobian are polynomials of layerwise Jacobians. However, the critical assumption, namely the asymptotic freeness of the layerwise Jacobians, has not been completely proven so far. The asymptotic freeness assumption plays a fundamental role in this research in propagating spectral distributions through the layers. In the present work, we prove the asymptotic freeness of the layerwise Jacobians of multilayer perceptrons with Haar distributed orthogonal matrices, which are essential for achieving dynamical isometry.
Learning needle insertion from sample task executions
Automating a robotic task, e.g., robotic suturing, can be very complex and time-consuming. Learning a task model to perform the task autonomously is invaluable for making the technology, robotic surgery, accessible to a wider community. Robotic surgery data can be easily logged, and the collected data can be used to learn task models. This will reduce the time and cost of robotic surgery, as a surgeon can supervise the robot operation or give high-level commands instead of controlling the tools at a low level. We present a dataset of needle insertion in soft tissue with two arms, where Arm 1 inserts the needle into the tissue and Arm 2 actively manipulates the soft tissue to ensure the desired and actual exit points are the same. This is important in real surgery because suturing without active manipulation of tissue may fail, as the stitch may not grip enough tissue to resist the force applied during suturing. The dataset includes 60 successful trials recorded by 3 pairs of stereo cameras. Moreover, we present Deep-robot Learning from Demonstrations, which predicts the desired state of the robot at the time step after t (which the optimal action taken at t yields) by looking at the video of the past time steps of the task execution, i.e. an N-step time history, where N is the memory time window. The experimental results illustrate that our proposed deep model architecture outperforms the existing methods. Although the solution is not yet ready to be deployed on a real robot, the results indicate the possibility of future development for real robot deployment.
Predict Customer Churn with Neural Network
In real-world situations, data scientists often start an analysis with a simple and easy-to-implement model such as linear or logistic regression. This approach has various advantages, such as getting a sense of the data at minimal cost and giving food for thought on how to solve a business problem. In this blog post, I decided to start from the opposite side by applying a multilayer perceptron model (neural network) to predict customer churn. I think it is quite fun and exciting to try different algorithms or at least to know how you can solve a problem in a more sophisticated way. Customer churn is when a customer decides to stop using services, content, or products from a company.
Neural Network: How it works and its industry use cases
Neural networks are a series of algorithms that mimic the operations of a human brain to recognize relationships between vast amounts of data. They are used in a variety of applications in financial services, from forecasting and marketing research to fraud detection and risk assessment. A neural network has many layers. Each layer performs a specific function, and the more complex the network is, the more layers it has. That's why such a layered network is also called a multi-layer perceptron.
Perceptron
The perceptron is one of the most fundamental concepts of deep learning, which every data scientist is expected to master. It is a supervised learning algorithm for binary classification. In this article, we will develop a solid intuition about the perceptron with the help of an example. Without any further delay, let's begin!
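To make the idea of a supervised binary classifier concrete, here is a minimal sketch of the classic perceptron update rule in plain Python. It is an illustration only, not the example from the article: the function names `train_perceptron` and `predict`, the learning rate, the epoch limit, and the toy AND dataset are all assumptions chosen for this sketch.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule for labels in {0, 1} (illustrative helper)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = y - prediction
            if error != 0:
                # Move the decision boundary toward the misclassified point.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return weights, bias

def predict(weights, bias, x):
    """Threshold the weighted sum at zero."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Toy, linearly separable data: the logical AND of two inputs.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions)  # → [0, 0, 0, 1]
```

The weights change only when a point is misclassified, so on linearly separable data such as this the loop stops once a full pass produces no mistakes.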
From perceptrons to deep learning
Have you ever wondered if it's possible to learn all there is to know about machine learning and deep learning from a book? Machine Learning--A Journey to Deep Learning, with Exercises and Answers is designed to give the self-taught student a solid foundation in machine learning with step-by-step solutions to the formative exercises and many concrete examples. By going through this text, readers should become able to apply and understand machine learning algorithms as well as create new ones. The statistical approach leads to the definition of regularization out of the example of regression. Building on regression, we develop the theory of perceptrons and logistic regression.
Proof of the Contiguity Conjecture and Lognormal Limit for the Symmetric Perceptron
Abbe, Emmanuel, Li, Shuangping, Sly, Allan
We consider the symmetric binary perceptron model, a simple model of neural networks that has gathered significant attention in the statistical physics, information theory and probability theory communities, with recent connections made to the performance of learning algorithms in Baldassi et al. '15. We establish that the partition function of this model, normalized by its expected value, converges to a lognormal distribution. As a consequence, this allows us to establish several conjectures for this model: (i) it proves the contiguity conjecture of Aubin et al. '19 between the planted and unplanted models in the satisfiable regime; (ii) it establishes the sharp threshold conjecture; (iii) it proves the frozen 1-RSB conjecture in the symmetric case, conjectured first by Krauth-Mézard '89 in the asymmetric case. In a recent concurrent work of Perkins-Xu [PX21], the last two conjectures were also established by proving that the partition function concentrates on an exponential scale. This left open the contiguity conjecture and the lognormal limit characterization, which are established here. In particular, our proof technique relies on a dense counterpart of the small graph conditioning method, which was developed for sparse models in the celebrated work of Robinson and Wormald.
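For readers unfamiliar with the model, here is a sketch of the usual setup. It is stated as an assumption, since the abstract does not spell it out, and the symbols $n$, $m$, $\alpha$, $\kappa$, $g_a$, and $Z$ are notation chosen for this note. Let $g_1, \dots, g_m$ be i.i.d. standard Gaussian vectors in $\mathbb{R}^n$ with $m = \lfloor \alpha n \rfloor$, and fix a margin $\kappa > 0$. The partition function counts the sign vectors that satisfy every symmetric constraint:

$$Z = \#\left\{ x \in \{-1,+1\}^n : |\langle g_a, x \rangle| \le \kappa \sqrt{n} \ \text{ for all } a = 1, \dots, m \right\}.$$

For each fixed $x$, the quantity $\langle g_a, x \rangle / \sqrt{n}$ is a standard Gaussian and the $m$ constraints are independent, so $\mathbb{E}[Z] = 2^n \, \mathbb{P}(|N(0,1)| \le \kappa)^m$. The lognormal limit in the abstract concerns the ratio $Z / \mathbb{E}[Z]$.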