AITopics

2012.05267

Country:

South America > French Guiana > Guyane > Cayenne (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Ben-Younes, Hédi, Zablocki, Éloi, Pérez, Patrick, Cord, Matthieu

Driving Behavior Explanation with Multi-level Fusion

arXiv.org Artificial Intelligence, Dec-9-2020

In this era of active development of autonomous vehicles, it becomes crucial to provide driving systems with the capacity to explain their decisions. In this work, we focus on generating high-level driving explanations as the vehicle drives. We present BEEF, for BEhavior Explanation with Fusion, a deep architecture which explains the behavior of a trajectory prediction model. Supervised by annotated justifications of human driving decisions, BEEF learns to fuse features from multiple levels. Leveraging recent advances in the multi-modal fusion literature, BEEF is carefully designed to model the correlations between high-level decision features and mid-level perceptual features. The flexibility and efficiency of our approach are validated with extensive experiments on the HDD and BDD-X datasets.

explanation, explanation module, traffic

2012.04983

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.05)
Europe > Italy > Veneto > Venice (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (0.94)
Automobiles & Trucks (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Chen, Ziqi, Min, Martin Renqiang, Parthasarathy, Srinivasan, Ning, Xia

Molecule Optimization via Fragment-based Generative Models

arXiv.org Machine Learning, Dec-8-2020

In drug discovery, molecule optimization is an important step that modifies drug candidates into better ones in terms of desired drug properties. With recent advances in Artificial Intelligence, this traditionally in vitro process has been increasingly facilitated by in silico approaches. We present an innovative in silico approach to computationally optimizing molecules and formulate the problem as generating optimized molecular graphs via deep generative models. Our generative models follow the key idea of fragment-based drug design, and optimize molecules by modifying their small fragments. Our models learn how to identify the to-be-optimized fragments and how to modify such fragments by learning from the differences between molecules that have good and bad properties. In optimizing a new molecule, our models apply the learned signals to decode optimized fragments at the predicted location of the fragments. We also construct multiple such models into a pipeline such that each of the models in the pipeline is able to optimize one fragment, and thus the entire pipeline is able to modify multiple fragments of a molecule if needed. We compare our models with other state-of-the-art methods on benchmark datasets and demonstrate that our methods significantly outperform others with more than 80% property improvement under moderate molecular similarity constraints, and more than 10% property improvement under high molecular similarity constraints.

fragment, molecule, optimization

2012.04231

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Lopez, Ryan, Atzberger, Paul J.

Variational Autoencoders for Learning Nonlinear Dynamics of Physical Systems

arXiv.org Artificial Intelligence, Dec-7-2020

We develop data-driven methods for incorporating physical information for priors to learn parsimonious representations of nonlinear systems arising from parameterized PDEs and mechanics. Our approach is based on Variational Autoencoders (VAEs) for learning nonlinear state space models from observations. We develop ways to incorporate geometric and topological priors through general manifold latent space representations. We investigate the performance of our methods for learning low-dimensional representations for the nonlinear Burgers equation and constrained mechanical systems.

latent space, manifold, representation

2012.03448

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Cantelobre, Théophile, Guedj, Benjamin, Pérez-Ortiz, María, Shawe-Taylor, John

A PAC-Bayesian Perspective on Structured Prediction with Implicit Loss Embeddings

arXiv.org Machine Learning, Dec-7-2020

Many practical machine learning tasks can be framed as structured prediction problems, where several output variables are predicted and considered interdependent. Recent theoretical advances in structured prediction have focused on obtaining fast-rate convergence guarantees, especially in the Implicit Loss Embedding (ILE) framework. PAC-Bayes has gained interest recently for its capacity to produce tight risk bounds for predictor distributions. This work proposes a novel PAC-Bayes perspective on the ILE structured prediction framework. We present two generalization bounds, on the risk and excess risk, which yield insights into the behavior of ILE predictors. Two learning algorithms are derived from these bounds.

algorithm, pac-bayesian structured prediction, predictor

2012.0378

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Overview (1.00)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Cemgil, A. Taylan, Ghaisas, Sumedh, Dvijotham, Krishnamurthy, Gowal, Sven, Kohli, Pushmeet

Autoencoding Variational Autoencoder

arXiv.org Machine Learning, Dec-7-2020

Does a Variational AutoEncoder (VAE) consistently encode typical samples generated from its decoder? This paper shows that the perhaps surprising answer to this question is "No"; a (nominally trained) VAE does not necessarily amortize inference for typical samples that it is capable of generating. We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self-consistency. Our approach hinges on an alternative construction of the variational approximation distribution to the true posterior of an extended VAE model with a Markov chain alternating between the encoder and the decoder. The method can be used to train a VAE model from scratch or, given an already trained VAE, it can be run as a post-processing step in an entirely self-supervised way without access to the original training data. Our experimental analysis reveals that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks. We provide experimental results on the ColorMnist and CelebA benchmark datasets that quantify the properties of the learned representations and compare the approach with a baseline that is specifically trained for the desired property.
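The self-consistency question raised in this abstract can be illustrated with a toy round trip: decode a latent value, encode the result, and compare it with the starting point. The sketch below is not the paper's method. It assumes deterministic one-dimensional maps in place of a VAE's stochastic encoder and decoder, and every function name is hypothetical.

```python
# Toy sketch: deterministic 1-D maps stand in for a VAE's stochastic
# encoder and decoder. All functions here are hypothetical illustrations.

def decode(z: float) -> float:
    """Assumed decoder: maps a latent value to an observation."""
    return 2.0 * z + 1.0

def consistent_encode(x: float) -> float:
    """Exact inverse of decode, so the round trip recovers z."""
    return (x - 1.0) / 2.0

def inconsistent_encode(x: float) -> float:
    """Ignores the decoder's offset, so the round trip drifts."""
    return x / 2.0

def round_trip_gap(encode, latents):
    """Mean absolute difference between z and encode(decode(z))."""
    return sum(abs(encode(decode(z)) - z) for z in latents) / len(latents)

latents = [-1.0, 0.0, 0.5, 2.0]
gap_consistent = round_trip_gap(consistent_encode, latents)
gap_inconsistent = round_trip_gap(inconsistent_encode, latents)
```

A gap of zero means the encoder maps generated samples back to the latents that produced them. A nonzero gap is the kind of inconsistency the abstract describes, here in a deliberately simplified form.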

decoder, encoder, representation

2012.03715

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > China (0.04)

Genre: Research Report (0.81)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine Learning, Dec-7-2020

DiffPrune: Neural Network Pruning with Deterministic Approximate Binary Gates and $L_0$ Regularization

Shulman, Yaniv

Modern neural network architectures typically have many millions of parameters and can be pruned significantly without substantial loss in effectiveness, which demonstrates that they are over-parameterized. The contribution of this work is two-fold. The first is a method for approximating a multivariate Bernoulli random variable by means of a deterministic and differentiable transformation of any real-valued multivariate random variable. The second is a method for model selection by element-wise multiplication of parameters with approximate binary gates that may be computed deterministically or stochastically and take on exact zero values. Sparsity is encouraged by the inclusion of a surrogate regularization to the $L_0$ loss. Since the method is differentiable, it enables straightforward and efficient learning of model architectures by an empirical risk minimization procedure with stochastic gradient descent and theoretically enables conditional computation during training. The method also supports any arbitrary group sparsity over parameters or activations and therefore offers a framework for unstructured or flexible structured model pruning. To conclude, experiments are performed to demonstrate the effectiveness of the proposed approach.
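The gating idea in this abstract (multiply each parameter by a gate that can be exactly zero, and penalize the number of open gates) can be sketched as follows. This is only an illustration under stated assumptions: the clipped-linear gate and the sum-of-gates penalty are stand-ins chosen for simplicity, not the transformation or regularizer defined in the paper, and all names and values are hypothetical.

```python
# Illustrative sketch only: the clipped-linear gate is an assumed stand-in,
# not the deterministic transformation defined in the paper.

def hard_gate(alpha: float) -> float:
    """Deterministic gate in [0, 1]: exactly 0 for alpha <= 0, exactly 1 for alpha >= 1."""
    return min(1.0, max(0.0, alpha))

def gated_weights(weights, alphas):
    """Element-wise product of parameters and their gates."""
    return [w * hard_gate(a) for w, a in zip(weights, alphas)]

def sparsity_penalty(alphas):
    """Assumed surrogate for the L0 count: the sum of gate values."""
    return sum(hard_gate(a) for a in alphas)

weights = [0.8, -1.5, 0.3, 2.0]
alphas = [-0.4, 0.5, 0.0, 1.7]  # hypothetical learned gate parameters

pruned = gated_weights(weights, alphas)
penalty = sparsity_penalty(alphas)
num_active = sum(1 for w in pruned if w != 0.0)
```

Because the gate reaches exact zero, pruned parameters drop out of the model entirely rather than merely becoming small, which is what makes the penalty a usable proxy for a parameter count in this toy setting.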

neural network, pruning, random variable

2012.03653

Country:

Europe > France (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Paster, Keiran, McIlraith, Sheila A., Ba, Jimmy

Planning from Pixels using Inverse Dynamics Models

arXiv.org Artificial Intelligence, Dec-4-2020

Learning task-agnostic dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion. These task-conditioned models adaptively focus modeling capacity on task-relevant dynamics, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.

Deep reinforcement learning has proven to be a powerful and effective framework for solving a diversity of challenging decision-making problems (Silver et al., 2017a; Berner et al., 2019). However, these algorithms are typically trained to maximize a single reward function, ignoring information that is not directly relevant to the task at hand. This way of learning is in stark contrast to how humans learn (Tenenbaum, 2018). Without being prompted by a specific task, humans can still explore their environment, practice achieving imaginary goals, and in so doing learn about the dynamics of the environment. When subsequently presented with a novel task, humans can utilize this learned knowledge to bootstrap learning -- a property we would like our artificial agents to have. In this work, we investigate one way to bridge this gap by learning world models (Ha & Schmidhuber, 2018) that enable the realization of previously unseen tasks. By modeling the task-agnostic dynamics of an environment, an agent can make predictions about how its own actions may affect the environment state without the need for additional samples from the environment. Prior work has shown that by using powerful function approximators to model environment dynamics, training an agent entirely within its own world models can result in large gains in sample efficiency (Ha & Schmidhuber, 2018).

action sequence, agent step, glamor

2012.02419

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.94)

Franchi, Gianni, Bursuc, Andrei, Aldea, Emanuel, Dubuisson, Severine, Bloch, Isabelle

Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification

arXiv.org Machine Learning, Dec-4-2020

Bayesian neural networks (BNNs) have long been considered an ideal, yet unscalable solution for improving the robustness and the predictive uncertainty of deep neural networks. While they could capture more accurately the posterior distribution of the network parameters, most BNN approaches are either limited to small networks or rely on constraining assumptions such as parameter independence. These drawbacks have enabled the prominence of simple, but computationally heavy approaches such as Deep Ensembles, whose training and testing costs increase linearly with the number of networks. In this work we aim for efficient deep BNNs amenable to complex computer vision architectures, e.g. ResNet50 DeepLabV3+, and tasks, e.g. semantic segmentation, with fewer assumptions on the parameters. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles. LP-BNNs attain competitive results across multiple metrics in several challenging benchmarks for image classification, semantic segmentation and out-of-distribution detection.

ensemble, lp-bnn, neural network

2012.02818

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial Intelligence, Dec-3-2020

Research Progress of News Recommendation Methods

Qin, Jing

Because researchers aim to study personalized recommendations for different business fields, a summary of recommendation methods in specific fields is of practical significance. News recommendation systems were the earliest research field regarding recommendation systems, and were also the earliest recommendation field to apply the collaborative filtering method. In addition, news is real-time and rich in content, which makes news recommendation methods more challenging than in other fields. Thus, this paper summarizes the research progress regarding news recommendation methods. From 2018 to 2020, developed news recommendation methods were mainly deep learning-based, attention-based, and knowledge graph-based. As of 2020, there are many news recommendation methods that combine attention mechanisms and knowledge graphs. However, these methods were all developed based on basic methods (the collaborative filtering method, the content-based recommendation method, and a mixed recommendation method combining the two). In order to allow researchers to have a detailed understanding of the development process of news recommendation methods, the news recommendation methods surveyed in this paper, which cover nearly 10 years, are divided into three categories according to the abovementioned basic methods. Firstly, the paper introduces the basic ideas of each category of methods and then summarizes the recommendation methods that are combined with other methods based on each category of methods and according to the time sequence of research results. Finally, this paper also summarizes the challenges confronting news recommendation systems.

news recommendation method, proceedings, recommendation method

2012.0236

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
Europe > Austria > Vienna (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Services (0.94)
Information Technology > Security & Privacy (0.93)
Media > News (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)