Goto

Collaborating Authors

 Country


Improving Neural Network Learning Through Dual Variable Learning Rates

arXiv.org Machine Learning

This paper introduces and evaluates a novel training method for neural networks: Dual Variable Learning Rates (DVLR). Building on techniques and insights from behavioral psychology, the dual learning rates are used to emphasize correct and incorrect responses differently, thereby making the feedback to the network more specific. Further, the learning rates are varied as a function of the network's performance, thereby making it more efficient. DVLR was implemented on both a simple feedforward neural network and a convolutional neural network. Both networks are trained faster and achieve an increased accuracy on the MNIST and CIFAR-10 domains demonstrating that DVLR is a promising, psychologically motivated technique for training neural network models.


Very simple statistical evidence that AlphaGo has exceeded human limits in playing GO game

arXiv.org Artificial Intelligence

Deep learning technology is making great progress in solving the challenging problems of artificial intelligence, hence machine learning based on artificial neural networks is in the spotlight again. In some areas, artificial intelligence based on deep learning is beyond human capabilities. It seemed extremely difficult for a machine to beat a human in a Go game, but AlphaGo has shown to beat a professional player in the game. By looking at the statistical distribution of the distance in which the Go stones are laid in succession, we find a clear trace that Alphago has surpassed human abilities. The AlphaGo than professional players and professional players than ordinary players shows the laying of stones in the distance becomes more frequent. In addition, AlphaGo shows a much more pronounced difference than that of ordinary players and professional players.


Breaking Batch Normalization for better explainability of Deep Neural Networks through Layer-wise Relevance Propagation

arXiv.org Artificial Intelligence

The lack of transparency of neural networks stays a major break for their use. The Layerwise Relevance Propagation technique builds heat-maps representing the relevance of each input in the model s decision. The relevance spreads backward from the last to the first layer of the Deep Neural Network. Layer-wise Relevance Propagation does not manage normalization layers, in this work we suggest a method to include normalization layers. Specifically, we build an equivalent network fusing normalization layers and convolutional or fully connected layers. Heatmaps obtained with our method on MNIST and CIFAR 10 datasets are more accurate for convolutional layers. Our study also prevents from using Layerwise Relevance Propagation with networks including a combination of connected layers and normalization layer.


Facets of the PIE Environment for Proving, Interpolating and Eliminating on the Basis of First-Order Logic

arXiv.org Artificial Intelligence

PIE is a Prolog-embedded environment for automated reasoning on the basis of first-order logic. Its main focus is on formulas, as constituents of complex formalizations that are structured through formula macros, and as outputs of reasoning tasks such as second-order quantifier elimination and Craig interpolation. It supports a workflow based on documents that intersperse macro definitions, invocations of reasoners, and LaTeX-formatted natural language text. Starting from various examples, the paper discusses features and application possibilities of PIE along with current limitations and issues for future research.


AMP Chain Graphs: Minimal Separators and Structure Learning Algorithms

arXiv.org Artificial Intelligence

We address the problem of finding a minimal separator in an Andersson-Madigan-Perlman chain graph (AMP CG), namely, finding a set Z of nodes that separate a given non-adjacent pair of nodes such that no proper subset of Z separates that pair. We analyze several versions of this problem and offer polynomial-time algorithms for each. These include finding a minimal separator from a restricted set of nodes, finding a minimal separator for two given disjoint sets, and testing whether a given separator is minimal. We provide an extension of the decomposition approach for learning Bayesian networks (BNs) proposed by (Xie et. al.) to learn AMP CGs, which include BNs as a special case, under the faithfulness assumption and prove its correctness using the minimal separator results. The advantages of this decomposition approach hold in the more general setting: reduced complexity and increased power of computational independence tests. In addition, we show that the PC-like algorithm is order-dependent, in the sense that the output can depend on the order in which the variables are given. We propose two modifications of the PC-like algorithm that remove part or all of this order-dependence. Simulations under a variety of settings demonstrate the competitive performance of our decomposition-based method, called LCD-AMP, in comparison with the (modified version of) PC-like algorithm. In fact, the decomposition-based algorithm usually outperforms the PC-like algorithm. We empirically show that the results of both algorithms are more accurate and stable when the sample size is reasonably large and the underlying graph is sparse.


Declarative Memory-based Structure for the Representation of Text Data

arXiv.org Artificial Intelligence

In the era of intelligent computing, computational progress in text processing is an essential consideration. Many systems have been developed to process text over different languages. Though, there is considerable development, they still lack in understanding of the text, i.e., instead of keeping text as knowledge, many treat text as a data. In this work we introduce a text representation scheme which is influenced by human memory infrastructure. Since texts are declarative in nature, a structural organization would foster efficient computation over text. We exploit long term episodic memory to keep text information observed over time. This not only keep fragments of text in an organized fashion but also reduces redundancy and stores the temporal relation among them. Wordnet has been used to imitate semantic memory, which works at word level to facilitate the understanding about individual words within text. Experimental results of various operation performed over episodic memory and growth of knowledge infrastructure over time is reported.


A Double Q-Learning Approach for Navigation of Aerial Vehicles with Connectivity Constraint

arXiv.org Artificial Intelligence

This paper studies the trajectory optimization problem for an aerial vehicle with the mission of flying between a pair of given initial and final locations. The objective is to minimize the travel time of the aerial vehicle ensuring that the communication connectivity constraint required for the safe operation of the aerial vehicle is satisfied. We consider two different criteria for the connectivity constraint of the aerial vehicle which leads to two different scenarios. In the first scenario, we assume that the maximum continuous time duration that the aerial vehicle is out of the coverage of the ground base stations (GBSs) is limited to a given threshold. In the second scenario, however, we assume that the total time periods that the aerial vehicle is not covered by the GBSs is restricted. Based on these two constraints, we formulate two trajectory optimization problems. To solve these non-convex problems, we use an approach based on the double Q-learning method which is a model-free reinforcement learning technique and unlike the existing algorithms does not need perfect knowledge of the environment. Moreover, in contrast to the well-known Q-learning technique, our double Q-learning algorithm does not suffer from the over-estimation issue. Simulation results show that although our algorithm does not require prior information of the environment, it works well and shows near optimal performance.


Provable Representation Learning for Imitation Learning via Bi-level Optimization

arXiv.org Artificial Intelligence

A common strategy in modern learning systems is to learn a representation that is useful for many tasks, a.k.a. representation learning. We study this strategy in the imitation learning setting for Markov decision processes (MDPs) where multiple experts' trajectories are available. We formulate representation learning as a bi-level optimization problem where the "outer" optimization tries to learn the joint representation and the "inner" optimization encodes the imitation learning setup and tries to learn task-specific parameters. We instantiate this framework for the imitation learning settings of behavior cloning and observation-alone. Theoretically, we show using our framework that representation learning can provide sample complexity benefits for imitation learning in both settings. We also provide proof-of-concept experiments to verify our theory.


Symbolic Learning and Reasoning with Noisy Data for Probabilistic Anchoring

arXiv.org Artificial Intelligence

Robotic agents should be able to learn from sub-symbolic sensor data, and at the same time, be able to reason about objects and communicate with humans on a symbolic level. This raises the question of how to overcome the gap between symbolic and sub-symbolic artificial intelligence. We propose a semantic world modeling approach based on bottom-up object anchoring using an object-centered representation of the world. Perceptual anchoring processes continuous perceptual sensor data and maintains a correspondence to a symbolic representation. We extend the definitions of anchoring to handle multi-modal probability distributions and we couple the resulting symbol anchoring system to a probabilistic logic reasoner for performing inference. Furthermore, we use statistical relational learning to enable the anchoring framework to learn symbolic knowledge in the form of a set of probabilistic logic rules of the world from noisy and sub-symbolic sensor input. The resulting framework, which combines perceptual anchoring and statistical relational learning, is able to maintain a semantic world model of all the objects that have been perceived over time, while still exploiting the expressiveness of logical rules to reason about the state of objects which are not directly observed through sensory input data. To validate our approach we demonstrate, on the one hand, the ability of our system to perform probabilistic reasoning over multi-modal probability distributions, and on the other hand, the learning of probabilistic logical rules from anchored objects produced by perceptual observations. The learned logical rules are, subsequently, used to assess our proposed probabilistic anchoring procedure. We demonstrate our system in a setting involving object interactions where object occlusions arise and where probabilistic inference is needed to correctly anchor objects.


KBSET -- Knowledge-Based Support for Scholarly Editing and Text Processing with Declarative LaTeX Markup and a Core Written in SWI-Prolog

arXiv.org Artificial Intelligence

KBSET is an environment that provides support for scholarly editing in two flavors: First, as a practical tool KBSET/Letters that accompanies the development of editions of correspondences (in particular from the 18th and 19th century), completely from source documents to PDF and HTML presentations. Second, as a prototypical tool KBSET/NER for experimentally investigating novel forms of working on editions that are centered around automated named entity recognition. KBSET can process declarative application-specific markup that is expressed in LaTeX notation and incorporate large external fact bases that are typically provided in RDF. KBSET includes specially developed LaTeX styles and a core system that is written in SWI-Prolog, which is used there in many roles, utilizing that it realizes the potential of Prolog as a unifying language.