Goto

Collaborating Authors

 Mathematical & Statistical Methods


Compiling Optimal Numeric Planning to Mixed Integer Linear Programming

AAAI Conferences

Compilation techniques in planning reformulate a problem into an alternative encoding for which efficient, off-the-shelf solvers are available. In this work, we present a novel mixed-integer linear programming (MILP) compilation for cost-optimal numeric planning with instantaneous actions. While recent works on the problem are restricted to actions that modify variables present in simple numeric conditions, our MILP formulation, in addition, handles linear conditions and linear action effects on numeric state variables. Such problems are particularly challenging due to the state-dependency of the action effects. Experiments show that our approach, in addition to being the state of the art for the more general problem class, is competitive with heuristic search-based planners on domains with only simple numeric conditions.


Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients

arXiv.org Machine Learning

The ADAM optimizer is exceedingly popular in the deep learning community. Often it works very well, sometimes it doesn't. Why? We interpret ADAM as a combination of two aspects: for each weight, the update direction is determined by the sign of stochastic gradients, whereas the update magnitude is determined by an estimate of their relative variance. We disentangle these two aspects and analyze them in isolation, gaining insight into the mechanisms underlying ADAM. This analysis also extends recent results on adverse effects of ADAM on generalization, isolating the sign aspect as the problematic one. Transferring the variance adaptation to SGD gives rise to a novel method, completing the practitioner's toolbox for problems where ADAM fails.


Francy - An Interactive Discrete Mathematics Framework for GAP

arXiv.org Machine Learning

Data visualization and interaction with large data sets is known to be essential and critical in many businesses today, and the same applies to research and teaching, in this case, when exploring large and complex mathematical objects. GAP is a computer algebra system for computational discrete algebra with an emphasis on computational group theory. The existing XGAP package for GAP works exclusively on the X Window System. It lacks abstraction between its mathematical and graphical cores, making it difficult to extend, maintain, or port. In this paper, we present Francy, a graphical semantics package for GAP. Francy is responsible for creating a representational structure that can be rendered using many GUI frameworks independent from any particular programming language or operating system. Building on this, we use state of the art web technologies that take advantage of an improved REPL environment, which is currently under development for GAP. The integration of this project with Jupyter provides a rich graphical environment full of features enhancing the usability and accessibility of GAP.


A Gentle Introduction to the Chi-Squared Test for Machine Learning

#artificialintelligence

Take my free 7-day email crash course now (with sample code). Click to sign-up and also get a free PDF Ebook version of the course.


fastai/numerical-linear-algebra

#artificialintelligence

This course is focused on the question: How do we do matrix computations with acceptable speed and acceptable accuracy? This course was taught in the University of San Francisco's Masters of Science in Analytics program, summer 2017 (for graduate students studying to become data scientists). The course is taught in Python with Jupyter Notebooks, using libraries such as Scikit-Learn and Numpy for most lessons, as well as Numba (a library that compiles Python to C for faster performance) and PyTorch (an alternative to Numpy for the GPU) in a few lessons. Accompanying the notebooks is a playlist of lecture videos, available on YouTube. If you are ever confused by a lecture or it goes too quickly, check out the beginning of the next video, where I review concepts from the previous lecture, often explaining things from a new perspective or with different illustrations, and answer questions.


Finding GEMS: Multi-Scale Dictionaries for High-Dimensional Graph Signals

arXiv.org Machine Learning

Abstract--Modern data introduces new challenges to classic signal processing approaches, leading to a growing interest in the field of graph signal processing. A powerful and well established model for real world signals in various domains is sparse representation over a dictionary, combined with the ability to train the dictionary from signal examples. This model has been successfully applied to graph signals as well by integrating the underlying graph topology into the learned dictionary. Nonetheless, dictionary learning methods for graph signals are typically restricted to small dimensions due to the computational constraints that the dictionary learning problem entails, and due to the direct use of the graph Laplacian matrix. In this paper, we propose a dictionary learning algorithm that applies to a broader class of graph signals, and is capable of handling much higher dimensional data. We incorporate the underlying graph topology both implicitly, by forcing the learned dictionary atoms to be sparse combinations of graph-wavelet functions, and explicitly, by adding direct graph constraints to promote smoothness in both the feature and manifold domains. The resulting atoms are thus adapted to the data of interest while adhering to the underlying graph structure and possessing a desired multi-scale property. Experimental results on several datasets, representing both synthetic and real network data of different nature, demonstrate the effectiveness of the proposed algorithm for graph signal processing even in high dimensions. In recent years, the field of graph signal processing has been gaining momentum. By merging concepts of spectral graph theory and harmonic analysis, it aims at extending classical signal processing approaches to signals having a complex and irregular underlying structure. Such signals emerge in numerous modern applications of diverse sources, such as transportation, energy, biological-, social-, and sensor-networks [1], [2].


PAC-Bayes Control: Synthesizing Controllers that Provably Generalize to Novel Environments

arXiv.org Artificial Intelligence

Our goal is to synthesize controllers for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between robustness of controllers to novel environments and generalization of hypotheses in supervised learning. In particular, we utilize the Probably Approximately Correct (PAC)-Bayes framework, which allows us to obtain upper bounds (that hold with high probability) on the expected cost of (stochastic) controllers across novel environments. We propose control synthesis algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (Relative Entropy Programming in particular) in the setting where we are optimizing over a finite control policy space. In the more general setting of continuously parameterized controllers, we minimize this upper bound using stochastic gradient descent. We present examples of our approach in the context of obstacle avoidance control with depth measurements. Our simulated examples demonstrate the potential of our approach to provide strong generalization guarantees on controllers for robotic systems with continuous state and action spaces, complicated (e.g., nonlinear) dynamics, and rich sensory inputs (e.g., depth measurements).


Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces

arXiv.org Artificial Intelligence

Motivated by the success of reinforcement learning (RL) for discrete-time tasks such as AlphaGo and Atari games, there has been a recent surge of interest in using RL for continuous-time control of physical systems (cf. many challenging tasks in OpenAI Gym and the DeepMind Control Suite). Since discretization of time is susceptible to error, it is methodologically more desirable to handle the system dynamics directly in continuous time. However, very few techniques exist for continuous-time RL and they lack flexibility in value function approximation. In this paper, we propose a novel framework for continuous-time value function approximation based on reproducing kernel Hilbert spaces. The resulting framework is so flexible that it can accommodate any kind of kernel-based approach, such as Gaussian processes and the adaptive projected subgradient method, and it allows us to handle uncertainties and nonstationarity without prior knowledge about the environment or what basis functions to employ. We demonstrate the validity of the presented framework through experiments.


Free Book: Applied Stochastic Processes

@machinelearnbot

If you look at chapter 5 (six degrees of separation) it applies to Youtube videos as well, in the sense that there is a path involving no more than six links from any Youtube video to any other one. Using a recursive algorithm for (automated) crawling is not a good idea though, as explained in chapter 5. Also, some videos are somewhat disconnected from the vast majority of Youtube videos. For instance, can you start with a video of the Beatles, and end up after any amount of browsing, discovering a machine learning video? Maybe not, and it means that the Youtube graph is not fully connected, and you need a number of seed videos from each connected component when doing your browsing, in order to retrieve all of them.


Data-driven Analytics for Business Architectures: Proposed Use of Graph Theory

arXiv.org Machine Learning

In the age of globalization with fierce competition, dynamic marketplaces and changing customer demands, it is critical that business organizations understand their structures and processses, align business strategy and organization's capabilities and investment, detect and reorganize redundant business capabilities (especially after mergers and acquisitions), and recognize business innovations. Consequently, great efforts have been made to design and improve Business Architecture (BA) models for solving those challenges. Business Architecture (BA) is a blueprint of the enterprise that provides a common understanding of the organization and is used to align strategic objectives and tactical demands, which articulates the structure of an enterprise in terms of its capabilities, governance structure, business processes, and business information [1]. In the past few decades, various BA models have been developed, of which some major models including: ArchiMate [2] that is maintained by the Archimate Foundation and approved as technical standard by the Open Group, and can be used to formally describe business operations; Business Architecture Working Group (BAWG) [3] that is founded as a part of the Objected Management Group (OMG) for establishing industry standards, supporting the creation, and alignment of business blueprints; Business driven analytics is implemented in CBM and explore what and how business insights can be obtained through the data-driven analytics. We conclude the paper by summarizing our current work, and discussing future directions.