RG-Flow: A hierarchical and explainable flow model based on renormalization group and sparse prior

arXiv.org Artificial Intelligence

Flow-based generative models have become an important class of unsupervised learning approaches. In this work, we incorporate the key idea of the renormalization group (RG) and a sparse prior distribution to design a hierarchical flow-based generative model, called RG-Flow, which can separate information at different scales in images, with disentangled representations at each scale. We demonstrate our method mainly on the CelebA dataset and show that the disentangled representations at different scales enable semantic manipulation and style mixing of the images. To visualize the latent representation, we introduce receptive fields for flow-based models and find that the receptive fields learned by RG-Flow are similar to those of convolutional neural networks. In addition, we replace the widely adopted Gaussian prior distribution with sparse prior distributions to further enhance the disentanglement of representations. One of the most important unsupervised learning tasks is to learn the data distribution and build generative models. Over the past few years, various types of generative models have been proposed. Yet the latent variables are on equal footing and mixed globally. Here, we propose a new flow-based model, RG-Flow, which is inspired by the idea of the renormalization group in statistical physics. RG-Flow imposes locality and hierarchical structure on bijective transformations. It allows us to access information at different scales in the original images through latent variables at different locations, which offers better explainability.
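
For readers unfamiliar with flow-based models, the identity below is the standard change-of-variables formula that such models rely on in general; it is not taken from the RG-Flow paper. The symbols are assumptions introduced here for illustration: $f$ is a bijective map from an image $x$ to latent variables $z = f(x)$, and $p_Z$ is the prior (Gaussian in the common case, sparse in the variant described above).

```latex
\log p_X(x) = \log p_Z\bigl(f(x)\bigr) + \log \left| \det \frac{\partial f(x)}{\partial x} \right|
```

Because $f$ is invertible, sampling amounts to drawing $z \sim p_Z$ and computing $x = f^{-1}(z)$.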


USA tops AI readiness index – Government & civil service news

#artificialintelligence

The USA has been named as the country best prepared to realise the benefits of artificial intelligence (AI) technologies in public service delivery, topping the 2020 Government AI Readiness Index. Meanwhile Singapore, which led the 2019 list, has fallen to sixth place. The index – compiled by UK-based consultants Oxford Insights and Canada's International Development Research Centre (IDRC) – examines how well-placed nations are to take advantage of the benefits of AI in their internal operations and the delivery of public services. This year, 172 countries were reviewed. The ranking measures AI readiness across three criteria: government willingness to adopt AI, and the ability to adapt and innovate to do so; availability of AI expertise and tools from the technology sector; and capabilities in building AI tools, providing them with high-quality data, and building them into public services.


Commentary: Applying AI To Decision-Making In Shipping And Commodities Markets

#artificialintelligence

The views expressed here are solely those of the author and do not necessarily represent the views of FreightWaves or its affiliates. In this installment of the AI in Supply Chain series (#AIinSupplyChain), we explore the topic of decision-making in the shipping and commodities markets. Before we proceed, it is important to note four characteristics of the freight shipping industry that were highlighted by Roar Adland, a professor of shipping economics at the Norwegian School of Economics. In an August 2017 blog post on LinkedIn, "4 things shipping had long before Uber," he noted the following. First, shipping inherently utilizes dynamic pricing because of the volatile nature of rates, and this has been the case for a few centuries. Second, the industry already matches demand and supply in a highly efficient manner.


Google Maps uses AI to better predict when there's a massive traffic jam

#artificialintelligence

Google confirmed this week that more than 1 billion kilometres are driven with Google Maps in more than 220 countries and territories. That's a lot of traffic data to digest. Yet somehow, Google Maps can show you instant traffic updates from the moment you start navigating: which direction to take, how long the trip will take, whether traffic along the route is heavy or light, and so much more. While all of this appears simple, there's a ton going on behind the scenes that enables Google Maps to crunch the numbers and get you from Point A to Point B safely and on time.


Matrix Shuffle-Exchange Networks for Hard 2D Tasks

arXiv.org Machine Learning

Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field that prevents their application to more complex 2D tasks. We propose a new neural model, called Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has comparable speed to a convolutional neural network. It is derived from the Neural Shuffle-Exchange network and has $\mathcal{O}(\log{n})$ layers and $\mathcal{O}(n^2 \log{n})$ total time and space complexity for processing an $n \times n$ data matrix. We show that the Matrix Shuffle-Exchange network is well-suited for algorithmic and logical reasoning tasks on matrices and dense graphs, exceeding convolutional and graph neural network baselines. Its distinct advantage is the capability of retaining full long-range dependency modelling when generalizing to larger instances - much larger than could be processed with models equipped with a dense attention mechanism.


Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning

arXiv.org Artificial Intelligence

Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have to program the reward function and, hence, there is the opportunity to treat reward functions as white boxes instead -- to show the reward function's code to the RL agent so it can exploit its internal structures to learn optimal policies faster. In this paper, we show how to accomplish this idea in two steps. First, we propose reward machines (RMs), a type of finite state machine that supports the specification of reward functions while exposing reward function structure. We then describe different methodologies to exploit such structures, including automated reward shaping, task decomposition, and counterfactual reasoning for data augmentation. Experiments on tabular and continuous domains show the benefits of exploiting reward structure across different tasks and RL agents.
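
To make the idea of a reward machine concrete, here is a minimal sketch of a finite state machine that emits rewards on transitions. It is an illustration under stated assumptions, not the paper's implementation: the class name `RewardMachine`, the event labels, and the "get coffee, then go to the office" task are all hypothetical.

```python
class RewardMachine:
    """Hypothetical minimal reward machine: a finite state machine whose
    transitions are triggered by high-level events and emit rewards."""

    def __init__(self, initial_state, transitions, terminal_states):
        # transitions maps (state, event) -> (next_state, reward)
        self.initial_state = initial_state
        self.transitions = transitions
        self.terminal_states = set(terminal_states)
        self.state = initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, event):
        # Unlisted (state, event) pairs keep the current state and give zero reward.
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def done(self):
        return self.state in self.terminal_states


# Illustrative task (an assumption): reward 1 for reaching the office
# only after coffee has been picked up.
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
    terminal_states={"u_done"},
)

# Visiting the office first earns nothing; the order of events matters.
rewards = [rm.step(event) for event in ["office", "coffee", "office"]]
print(rewards, rm.done())
```

Because the machine's states and transitions are visible to the agent, the agent can tell which subtask it is in, which is the kind of structure that reward shaping and task decomposition can exploit.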


A Simple Framework for Uncertainty in Contrastive Learning

arXiv.org Machine Learning

Contrastive approaches to representation learning have recently shown great promise. In contrast to generative approaches, these contrastive models learn a deterministic encoder with no notion of uncertainty or confidence. In this paper, we introduce a simple approach based on "contrasting distributions" that learns to assign uncertainty for pretrained contrastive representations. In particular, we train a deep network from a representation to a distribution in representation space, whose variance can be used as a measure of confidence. In our experiments, we show that this deep uncertainty model can be used (1) to visually interpret model behavior, (2) to detect new noise in the input to deployed models, (3) to detect anomalies, where we outperform 10 baseline methods across 11 tasks with improvements of up to 14% absolute, and (4) to classify out-of-distribution examples, where our fully unsupervised model is competitive with supervised methods. The success of supervised learning relies heavily on large datasets with semantic annotations. But as the prediction tasks we are interested in become increasingly complex -- such as applications in radiology (Irvin et al., 2019), law (Wang et al., 2013), and autonomous driving (Maurer et al., 2016) -- the expense and difficulty of annotation quickly grow to be unmanageable. As such, learning useful representations without human annotation is an important, longstanding problem. These "unsupervised" approaches largely span two categories: generative and discriminative. Generative models seek to capture the data density using ideas from approximate Bayesian inference (Hinton et al., 2006; Kingma & Welling, 2013; Rezende et al., 2014) and game theory (Goodfellow et al., 2014; Dumoulin et al., 2016).


Deep Distributional Time Series Models and the Probabilistic Forecasting of Intraday Electricity Prices

arXiv.org Machine Learning

Recurrent neural networks (RNNs) with rich feature vectors of past values can provide accurate point forecasts for series that exhibit complex serial dependence. We propose two approaches to constructing deep time series probabilistic models based on a variant of RNN called an echo state network (ESN). The first is where the output layer of the ESN has stochastic disturbances and a shrinkage prior for additional regularization. The second approach employs the implicit copula of an ESN with Gaussian disturbances, which is a deep copula process on the feature space. Combining this copula with a non-parametrically estimated marginal distribution produces a deep distributional time series model. The resulting probabilistic forecasts are deep functions of the feature vector and also marginally calibrated. In both approaches, Bayesian Markov chain Monte Carlo methods are used to estimate the models and compute forecasts. The proposed deep time series models are suitable for the complex task of forecasting intraday electricity prices. Using data from the Australian National Electricity Market, we show that our models provide accurate probabilistic price forecasts. Moreover, the models provide a flexible framework for incorporating probabilistic forecasts of electricity demand as additional features. We demonstrate that doing so, in the deep distributional time series model in particular, increases price forecast accuracy substantially.


PMI-Masking: Principled masking of correlated spans

arXiv.org Machine Learning

Masking tokens uniformly at random constitutes a common flaw in the pretraining of Masked Language Models (MLMs) such as BERT. We show that such uniform masking allows an MLM to minimize its training objective by latching onto shallow local signals, leading to pretraining inefficiency and suboptimal downstream performance. To address this flaw, we propose PMI-Masking, a principled masking strategy based on the concept of Pointwise Mutual Information (PMI), which jointly masks a token n-gram if it exhibits high collocation over the corpus. PMI-Masking motivates, unifies, and improves upon prior more heuristic approaches that attempt to address the drawback of random uniform token masking, such as whole-word masking, entity/phrase masking, and random-span masking. Specifically, we show experimentally that PMI-Masking reaches the performance of prior masking approaches in half the training time, and consistently improves performance at the end of training. In the couple of years since BERT was introduced in a seminal paper by Devlin et al. (2019a), Masked Language Models (MLMs) have rapidly advanced the NLP frontier (Sun et al., 2019; Liu et al., 2019; Joshi et al., 2020; Raffel et al., 2019). At the heart of the MLM approach is the task of predicting a masked subset of the text given the remaining, unmasked text. The text itself is broken up into tokens, each token consisting of a word or part of a word; thus "chair" constitutes a single token, but out-of-vocabulary words like "eigen-value" are broken up into several sub-word tokens.
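
As a rough illustration of the quantity behind this strategy, the sketch below computes plain bigram PMI, log p(w1, w2) / (p(w1) p(w2)), from raw counts on a toy corpus. It is an assumption-laden simplification: the function name `bigram_pmi` and the corpus are hypothetical, and the paper's measure for longer n-grams and its actual masking procedure are not reproduced here.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent token pair, from raw counts."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (w1, w2), count in bigrams.items():
        p_joint = count / n_bi
        # Probability of the pair if the two tokens occurred independently.
        p_indep = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        pmi[(w1, w2)] = math.log(p_joint / p_indep)
    return pmi


# Toy corpus (illustrative only).
tokens = "new york is big . new york is old . the city is new".split()
pmi = bigram_pmi(tokens)

# A collocation such as "new york" scores higher than an incidental pair
# such as "is new", so a PMI-based masker would prefer to mask it as a unit.
print(pmi[("new", "york")] > pmi[("is", "new")])
```

On tiny corpora, raw PMI is inflated for pairs of rare words, so a practical version would typically apply a minimum-count filter before ranking spans.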


Artificial Intelligence: Research Impact on Key Industries; the Upper-Rhine Artificial Intelligence Symposium (UR-AI 2020)

arXiv.org Artificial Intelligence

The TriRhenaTech alliance presents a collection of accepted papers of the cancelled tri-national 'Upper-Rhine Artificial Intelligence Symposium' planned for 13th May 2020 in Karlsruhe. The TriRhenaTech alliance is a network of universities in the Upper-Rhine Trinational Metropolitan Region comprising the German universities of applied sciences in Furtwangen, Kaiserslautern, Karlsruhe, and Offenburg, the Baden-Wuerttemberg Cooperative State University Loerrach, the French university network Alsace Tech (comprising 14 'grandes écoles' in the fields of engineering, architecture and management) and the University of Applied Sciences and Arts Northwestern Switzerland. The alliance's common goal is to reinforce the transfer of knowledge, research, and technology, as well as the cross-border mobility of students.