 Perceptrons


Comparing Multilayer Perceptron and Multiple Regression Models for Predicting Energy Use in the Balkans

arXiv.org Machine Learning

Global demographic and economic changes have a critical impact on total energy consumption, which is why demographic and economic parameters have to be taken into account when predicting energy consumption. This research applies a multiple linear regression model and a neural network model, specifically a multilayer perceptron, to predict energy consumption. Data from five Balkan countries for the period 1995-2014 were considered in the analysis. Gross domestic product, total population, and CO2 emissions were taken as predictor variables, while energy consumption was used as the dependent variable. The analyses showed that CO2 emissions have the highest impact on energy consumption, followed by gross domestic product, while population has the lowest impact. The results from both analyses were then used to make predictions on the same data, and the predicted values were compared with the real values. The multilayer perceptron model was observed to predict energy consumption better than the regression model.


Construct a neural network (multilayer perceptrons) using micro:bit

#artificialintelligence

Let's learn the basics of neural networks using micro:bit. For a neural network to learn to solve various tasks, backpropagation is generally required. Learning with backpropagation takes considerable computation time. Doing this on a device as tiny as the micro:bit, with its low computing power, is not realistic, even though it is not impossible. Therefore, here we will try the forward calculation only, using the edge weights of an already trained neural network.
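As a rough illustration of "forward calculation only", the sketch below pushes inputs through a small multilayer perceptron whose weights are fixed in advance. It is plain Python rather than the article's own micro:bit code, and the names `forward` and `XOR_LAYERS`, the sigmoid activation, and the hand-picked XOR weights are all assumptions made for this example, not values from the article.

```python
import math


def sigmoid(x):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, layers):
    """Propagate inputs through a list of (weights, biases) layers.

    Each weights entry holds one row of incoming weights per neuron.
    No learning happens here: the weights are only read, never updated.
    """
    activations = inputs
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical pre-trained weights (hand-picked, not from the article)
# that make a 2-2-1 network approximate XOR.
XOR_LAYERS = [
    ([[20.0, 20.0], [-20.0, -20.0]], [-10.0, 30.0]),  # hidden: ~OR, ~NAND
    ([[20.0, 20.0]], [-30.0]),                        # output: ~AND
]

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, round(forward([a, b], XOR_LAYERS)[0]))
```

Because the forward pass is only a handful of multiply-adds and one activation per neuron, it is far cheaper than training, which is the reason given above for running only this part on a small device.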


On Breiman's Dilemma in Neural Networks: Phase Transitions of Margin Dynamics

arXiv.org Machine Learning

Margin enlargement over training data has been an important strategy in machine learning since perceptrons, for the purpose of boosting the robustness of classifiers toward good generalization ability. Yet Breiman shows a dilemma (Breiman, 1999): a uniform improvement on the margin distribution does not necessarily reduce generalization errors. In this paper, we revisit Breiman's dilemma in deep neural networks with recently proposed spectrally normalized margins. A novel perspective is provided to explain Breiman's dilemma based on phase transitions in the dynamics of normalized margin distributions, which reflect the trade-off between the expressive power of models and the complexity of data. When data complexity is comparable to the model expressiveness in the sense that both training and test data share similar phase transitions in normalized margin dynamics, two efficient ways are derived to predict the trend of generalization or test error via classic margin-based generalization bounds with restricted Rademacher complexities. On the other hand, over-expressive models that exhibit uniform improvements on training margins, as a distinct phase transition to test margin dynamics, may lose such prediction power and fail to prevent overfitting. Experiments are conducted to show the validity of the proposed method with some basic convolutional networks, AlexNet, VGG-16, and ResNet-18, on several datasets including Cifar10/100 and mini-ImageNet.


Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron

arXiv.org Machine Learning

Modern machine learning focuses on highly expressive models that are able to fit or interpolate the data completely, resulting in zero training loss. For such models, we show that the stochastic gradients of common loss functions satisfy a strong growth condition. Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov acceleration matches the convergence rate of the deterministic setting for both convex and strongly-convex functions. In the non-convex setting, this condition implies that SGD can find a first-order stationary point as efficiently as full gradient descent. Under interpolation, we also show that all smooth loss functions with a finite-sum structure satisfy a weaker growth condition. Given this weaker condition, we prove that SGD with a constant step-size attains the deterministic convergence rate in both the strongly-convex and convex settings. Under additional assumptions, the above results enable us to prove an O(1/k^2) mistake bound for $k$ iterations of a stochastic perceptron algorithm using the squared-hinge loss. Finally, we validate our theoretical findings with experiments on synthetic and real datasets.
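To make the "stochastic perceptron with squared-hinge loss" setting more concrete, here is a minimal sketch of constant step-size SGD on the loss max(0, 1 - y w·x)^2 over linearly separable toy data, where a zero-loss (interpolating) solution exists. It uses plain SGD without Nesterov acceleration, so it is not the accelerated method analysed in the paper; the function name, the toy data, and the step size are assumptions chosen for illustration.

```python
import random


def squared_hinge_sgd(data, step_size=0.1, iterations=2000, seed=0):
    """Constant step-size SGD on the squared-hinge loss.

    data is a list of (features, label) pairs with label in {-1, +1}.
    Returns the learned weights and the number of sampled points that
    were misclassified at the moment they were drawn.
    """
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    mistakes = 0
    for _ in range(iterations):
        x, y = rng.choice(data)
        score = sum(wi * xi for wi, xi in zip(w, x))
        if y * score <= 0:
            mistakes += 1
        slack = max(0.0, 1.0 - y * score)
        # The gradient of slack**2 with respect to w is -2 * slack * y * x,
        # so a descent step adds step_size * 2 * slack * y * x.
        w = [wi + step_size * 2.0 * slack * y * xi for wi, xi in zip(w, x)]
    return w, mistakes


# Hypothetical separable toy data (a separator through the origin exists).
TOY_DATA = [
    ([2.0, 1.0], 1),
    ([1.0, 2.0], 1),
    ([-1.0, -2.0], -1),
    ([-2.0, -1.0], -1),
]

weights, mistakes = squared_hinge_sgd(TOY_DATA)
print(weights, mistakes)
```

Once every point has margin at least 1, the slack and therefore the stochastic gradient are zero for every sample. That is the interpolation regime the abstract refers to, and it is why a constant step size does not keep the iterates bouncing around the solution.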


Binary Matrix Guessing Problem

arXiv.org Artificial Intelligence

We introduce the Binary Matrix Guessing Problem and provide two algorithms to solve it. The first is the Elementwise Probing Algorithm (EPA), which is very fast under a score that utilizes the Frobenius distance. The second is the Additive Reinforcement Learning Algorithm, which combines ideas from the perceptron algorithm and reinforcement learning. This algorithm is significantly slower than the first one, but it is less restrictive and generalizes better. We compare the computational performance of both algorithms and provide numerical results. Reason for withdrawal: the paper will be rewritten with experiments replicated on verified and validated hardware and software.


6 Steps To Write Any Machine Learning Algorithm From Scratch: Perceptron Case Study

#artificialintelligence

Writing a machine learning algorithm from scratch is an extremely rewarding learning experience. It provides you with that "ah ha!" moment where it finally clicks, and you understand what's really going on under the hood. Some algorithms are just more complicated than others, so start with something simple, such as the single layer Perceptron. I'll walk you through the following 6-step process to write algorithms from scratch, using the Perceptron as a case-study: This goes back to what I originally stated. If you don't understand the basics, don't tackle an algorithm from scratch. For the Perceptron, let's go ahead and answer these questions: After you have a basic understanding of the model, it's time to start doing your research. Some people learn better with textbooks, some people learn better with video.
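For readers who want to see what a from-scratch single-layer Perceptron can look like, here is a minimal sketch of the classic perceptron learning rule trained on the logical AND function. It is not the article's own code: the function names, the AND data set, the learning rate of 1.0, and the epoch count are assumptions chosen for this illustration.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, learning_rate=1.0, epochs=20):
    """Classic perceptron rule: nudge weights only when a sample is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training set: the logical AND function, which is linearly
# separable and therefore learnable by a single-layer Perceptron.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
for x, target in AND_SAMPLES:
    print(x, target, predict(weights, bias, x))
```

A function such as XOR is not linearly separable, so this rule would never settle on it. That limitation is the usual motivation for moving on to multilayer perceptrons.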


Learning to Reason

arXiv.org Artificial Intelligence

Automated theorem proving has long been a key task of artificial intelligence. Proofs form the bedrock of rigorous scientific inquiry. Many tools for both partially and fully automating their derivations have been developed over the last half a century. Some examples of state-of-the-art provers are E (Schulz, 2013), VAMPIRE (Kovács & Voronkov, 2013), and Prover9 (McCune, 2005-2010). Newer theorem provers, such as E, use superposition calculus in place of more traditional resolution and tableau based methods. There have also been a number of past attempts to apply machine learning methods to guiding proof search. Suttner & Ertel proposed a multilayer-perceptron based method using hand-engineered features as far back as 1990; Urban et al. (2011) apply machine learning to tableau calculus; and Loos et al. (2017) recently proposed a method for guiding the E theorem prover using deep neural networks. All of this prior work, however, has one common limitation: it relies on the axioms of classical first-order logic. Very little attention has been paid to automated theorem proving for non-classical logics. One of the only recent examples is McLaughlin & Pfenning (2008), who applied the polarized inverse method to intuitionistic propositional logic. The literature is otherwise mostly silent. This is truly unfortunate, as there are many reasons to desire non-classical proofs over classical ones. Constructive/intuitionistic proofs should be of particular interest to computer scientists thanks to the well-known Curry-Howard correspondence (Howard, 1980), which tells us that all terminating programs correspond to a proof in intuitionistic logic and vice versa. This work explores using Q-learning (Watkins, 1989) to inform proof search for a specific non-classical logic called Core Logic (Tennant, 2017).


Machine learning plasma-surface interface for coupling sputtering and gas-phase transport simulations

arXiv.org Artificial Intelligence

Thin film processing by means of sputter deposition inherently depends on the interaction of energetic particles with a target surface and the subsequent particle transport. The length and time scales of the underlying physical phenomena span orders of magnitude. A theoretical description which bridges all time and length scales is not practically possible. Advantage can be taken particularly from the well-separated time scales of the fundamental surface and plasma processes. Initially, surface properties may be calculated from a surface model and stored for a number of representative cases. Subsequently, the surface data may be provided to gas-phase transport simulations via appropriate model interfaces (e.g., analytic expressions or look-up tables) and utilized to define insertion boundary conditions. During run-time evaluation, however, the maintained surface data may prove to be insufficient. In this case, missing data may be obtained by interpolation (common), extrapolation (inaccurate), or be supplied on-demand by the surface model (computationally inefficient). In this work, a potential alternative is established based on machine learning techniques using artificial neural networks. As a proof of concept, a multilayer perceptron network is trained and verified with sputtered particle distributions obtained from transport of ions in matter based simulations for Ar projectiles bombarding a Ti-Al composite. It is demonstrated that the trained network is able to predict the sputtered particle distributions for unknown, arbitrarily shaped incident ion energy distributions. It is consequently argued that the trained network may be readily used as a machine learning based model interface (e.g., by quasi-continuously sampling the desired sputtered particle distributions from the network), which is sufficiently accurate also in scenarios on which it has not been previously trained.


Deep Quality-Value (DQV) Learning

arXiv.org Machine Learning

We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Learning. DQV uses temporal-difference learning to train a Value neural network and uses this network for training a second Quality-value network that learns to estimate state-action values. We first test DQV's update rules with Multilayer Perceptrons as function approximators on two classic RL problems, and then extend DQV with the use of Deep Convolutional Neural Networks, 'Experience Replay' and 'Target Neural Networks' for tackling four games of the Atari Arcade Learning environment. Our results show that DQV learns significantly faster and better than Deep Q-Learning and Double Deep Q-Learning, suggesting that our algorithm can potentially be a better performing synchronous temporal difference algorithm than what is currently present in DRL.
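The idea of training a state-value estimate with temporal-difference learning and then reusing it as the bootstrap target for the state-action estimate can be sketched in tabular form. The code below is only one reading of the abstract: it replaces both neural networks with lookup tables, and the function name `dqv_update`, the two-step chain environment, and the values of `alpha` and `gamma` are assumptions made for illustration.

```python
from collections import defaultdict


def dqv_update(V, Q, state, action, reward, next_state, done,
               alpha=0.5, gamma=0.9):
    """One tabular update in which V and Q share the same TD target.

    The target bootstraps from V(next_state) for both tables, so Q is
    trained with the help of V rather than from its own maximum.
    """
    target = reward if done else reward + gamma * V[next_state]
    V[state] += alpha * (target - V[state])
    Q[(state, action)] += alpha * (target - Q[(state, action)])


# Hypothetical two-step chain: state 0 -> state 1 -> terminal, with a
# reward of 1 only on the final transition.
V = defaultdict(float)
Q = defaultdict(float)
for _ in range(50):
    dqv_update(V, Q, 0, "right", 0.0, 1, done=False)
    dqv_update(V, Q, 1, "right", 1.0, None, done=True)

print(V[0], V[1], Q[(0, "right")], Q[(1, "right")])
```

In this toy chain both tables approach the discounted return (about 0.9 from state 0 and 1.0 from state 1). The paper's versions replace the tables with multilayer perceptrons or convolutional networks as described above.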


6 Steps To Write Any Machine Learning Algorithm From Scratch: Perceptron Case Study

#artificialintelligence

This goes back to what I originally stated. If you don't understand the basics, don't tackle an algorithm from scratch. For the Perceptron, let's go ahead and answer these questions: After you have a basic understanding of the model, it's time to start doing your research. Some people learn better with textbooks, some people learn better with video. Personally, I like to bounce around and use various types of sources. For the mathematical details, textbooks do a great job, but for more practical examples, I prefer blog posts and YouTube videos. Now that we've gathered our sources, it's time to start learning.