AITopics

1905.1312

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Komanduru, Abi, Honorio, Jean

On the Correctness and Sample Complexity of Inverse Reinforcement Learning

arXiv.org Machine Learning Jun-2-2019

Inverse reinforcement learning (IRL) is the problem of finding a reward function that generates a given optimal policy for a given Markov Decision Process. This paper presents an algorithm-independent geometric analysis of the IRL problem with finite states and actions. An L1-regularized Support Vector Machine formulation of the IRL problem, motivated by the geometric analysis, is then proposed with the basic objective of the inverse reinforcement learning problem in mind: to find a reward function that generates a specified optimal policy. The paper further analyzes the proposed formulation of inverse reinforcement learning with $n$ states and $k$ actions, and shows a sample complexity of $O(n^2 \log (nk))$ for recovering a reward function that generates a policy that satisfies Bellman's optimality condition with respect to the true transition probabilities.
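To make "a reward function that generates a policy that satisfies Bellman's optimality condition" concrete, here is a minimal sketch of the check itself for a finite MDP. It is not the paper's L1-regularized SVM formulation. The function name, the state-only reward vector, the discount factor `gamma`, and the toy MDP are all assumptions made for illustration.

```python
import numpy as np

def satisfies_bellman_optimality(P, policy, reward, gamma=0.9, tol=1e-9):
    """Check whether `policy` is optimal for `reward` under transitions `P`.

    P:      array (k, n, n), P[a, s, t] = probability of s -> t under action a
    policy: int array (n,), action chosen in each state
    reward: array (n,), state-only reward (an assumption of this sketch)
    """
    k, n, _ = P.shape
    states = np.arange(n)
    # Transition matrix induced by the policy: row s is P[policy[s], s, :].
    P_pi = P[policy, states, :]
    # Policy evaluation: v = reward + gamma * P_pi @ v.
    v = np.linalg.solve(np.eye(n) - gamma * P_pi, reward)
    # Action values for every (action, state) pair, shape (k, n).
    q = reward[None, :] + gamma * (P @ v)
    # Optimality: the chosen action attains the maximum in every state.
    return bool(np.all(q[policy, states] >= q.max(axis=0) - tol))

# Hypothetical toy MDP: action 0 stays put, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
reward = np.array([0.0, 1.0])
print(satisfies_bellman_optimality(P, np.array([1, 0]), reward))
print(satisfies_bellman_optimality(P, np.array([0, 1]), reward))
```

In the toy example only state 1 is rewarded, so moving to state 1 and staying there passes the check, while the reverse policy does not.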

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1906.00422

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Borovikov, Igor, Harder, Jesse, Sadovsky, Michael, Beirami, Ahmad

Towards Interactive Training of Non-Player Characters in Video Games

arXiv.org Artificial Intelligence Jun-2-2019

There is a high demand for high-quality Non-Player Characters (NPCs) in video games. Hand-crafting their behavior is a labor-intensive and error-prone engineering process with limited controls exposed to the game designers. We propose to create such NPC behaviors interactively by training an agent in the target environment using imitation learning with a human in the loop. While traditional behavior cloning may fall short of achieving the desired performance, we show that interactivity can substantially improve it with a modest amount of human effort. The model we train is a multi-resolution ensemble of Markov models, which can be used as is or can be further "compressed" into a more compact model for inference on consumer devices. We illustrate our approach on an example in OpenAI Gym, where a human can help to quickly train an agent with only a handful of interactive demonstrations. We also outline our experiments with NPC training for a first-person shooter game currently in development.

artificial intelligence, demonstration, machine learning, (17 more...)

1906.00535

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Europe > Russia (0.04)
Asia > Russia > Siberian Federal District > Krasnoyarsk Krai > Krasnoyarsk (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Artificial Intelligence Jun-1-2019

Synthesizing Datalog Programs using Numerical Relaxation

Si, Xujie, Raghothaman, Mukund, Heo, Kihong, Naik, Mayur

The problem of learning logical rules from examples arises in diverse fields, including program synthesis, logic programming, and machine learning. Existing approaches either involve solving computationally difficult combinatorial problems, or performing parameter estimation in complex statistical models. In this paper, we present Difflog, a technique to extend the logic programming language Datalog to the continuous setting. By attaching real-valued weights to individual rules of a Datalog program, we naturally associate numerical values with individual conclusions of the program. Analogous to the strategy of numerical relaxation in optimization problems, we can now first determine the rule weights which cause the best agreement between the training labels and the induced values of output tuples, and subsequently recover the classical discrete-valued target program from the continuous optimum. We evaluate Difflog on a suite of 34 benchmark problems from recent literature in knowledge discovery, formal verification, and database query-by-example, and demonstrate significant improvements in learning complex programs with recursive rules, invented predicates, and relations of arbitrary arity.

logic & formal reasoning, machine learning, tuple, (20 more...)

1906.00163

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Kang, Hyunwook, Mynbay, Aydar, Morrison, James R., Park, Jinkyoo

Scalable and transferable learning of algorithms via graph embedding for multi-robot reward collection

arXiv.org Artificial Intelligence Jun-1-2019

Can the success of reinforcement learning methods for combinatorial optimization problems be extended to multi-robot scheduling problems in stochastic contexts? Three issues are particularly important in this context: quality of the resulting decisions, scalability, and transferability. To achieve these ends, we generalize the concept of clique potential to stochastic clique potential. We extend a mean field inference fixed point iteration with this new concept and use it to modify the structure2vec method. We next propose a new reinforcement learning framework combining a graph representation of the problem and a consensus auction inspired by heuristics in the problem domain. This representation enables transferability in terms of the number of robots. Sequential encoding of information through multiple layers of our extended structure2vec results in 96% optimal performance of the learned heuristics. While training tractability is inherited from single robot methods in the literature, use of a multi-robot consensus auction-based relaxation of the maximum operation in the Bellman optimality equation allows for scalable selection of actions in the fitted Q-iteration. We apply our framework to multi-robot reward collection (MRRC) problems in stochastic environments with linear or non-linear rewards. In stochastic environments with non-linear rewards, the new method achieves 20% superior performance relative to the popular sequential greedy assignment (SGA) algorithm. Linear scalability in terms of training is achieved and demonstrated. Transferability is demonstrated by the use of a heuristic trained with three robots that continues to achieve 95% optimal performance when applied to problems with various numbers of robots. We further mention the results obtained when extending the approach to identical parallel machine scheduling (IPMS) problems.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1905.12204

Country:

Asia > South Korea (0.28)
North America > United States > Texas (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence May-31-2019

Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology

Ie, Eugene, Jain, Vihan, Wang, Jing, Narvekar, Sanmit, Agarwal, Ritesh, Wu, Rui, Cheng, Heng-Tze, Lustman, Morgane, Gatto, Vince, Covington, Paul, McFadden, Jim, Chandra, Tushar, Boutilier, Craig

Recommender systems have become ubiquitous, transforming user interactions with products, services and content in a wide variety of domains. In content recommendation, recommenders generally surface relevant and/or novel personalized content based on learned models of user preferences (e.g., as in collaborative filtering [Breese et al., 1998, Konstan et al., 1997, Srebro et al., 2004, Salakhutdinov and Mnih, 2007]) or predictive models of user responses to specific recommendations. Well-known applications of recommender systems include video recommendations on YouTube [Covington et al., 2016], movie recommendations on Netflix [Gomez-Uribe and Hunt, 2016] and playlist construction on Spotify [Jacobson et al., 2016]. It is increasingly common to train deep neural networks (DNNs) [van den Oord et al., 2013, Wang et al., 2015, Covington et al., 2016, Cheng et al., 2016] to predict user responses (e.g., click-through rates, content engagement, ratings, likes) to generate, score and serve candidate recommendations. Practical recommender systems largely focus on myopic prediction--estimating a user's immediate response to a recommendation--without considering the long-term impact on subsequent user behavior. This can be limiting: modeling a recommendation's stochastic impact on the future affords opportunities to trade off user engagement in the near-term for longer-term benefit (e.g., by probing a user's interests, or improving satisfaction).

artificial intelligence, machine learning, slate, (19 more...)

1905.12767

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > New York (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(17 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology > Services (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Gudur, Gautham Krishna, Sundaramoorthy, Prahalathan, Umaashankar, Venkatesh

ActiveHARNet: Towards On-Device Deep Bayesian Active Learning for Human Activity Recognition

arXiv.org Machine Learning May-31-2019

Various health-care applications, such as assisted living and fall detection, require modeling of user behavior through Human Activity Recognition (HAR). HAR using mobile- and wearable-based deep learning algorithms has been on the rise owing to advancements in pervasive computing. However, two other challenges need to be addressed: first, the deep learning model should support on-device incremental training (model updating) from real-time incoming data points to learn user behavior over time, while also being resource-friendly; second, a suitable ground-truthing technique (like Active Learning) should help establish labels on the fly while also selecting only the most informative data points to query from an oracle. Hence, in this paper, we propose ActiveHARNet, a resource-efficient deep ensembled model which supports on-device Incremental Learning and inference, with capabilities to represent model uncertainties through approximations in Bayesian Neural Networks using dropout. This is combined with suitable acquisition functions for active learning. Empirical results on two publicly available wrist-worn HAR and fall detection datasets indicate that ActiveHARNet achieves a considerable efficiency boost during inference across different users, with a substantially lower number of acquired pool points (at least 60% reduction) during incremental learning on both datasets, evaluated with various acquisition functions, thus demonstrating the feasibility of deployment and Incremental Learning.
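As a rough illustration of how dropout-based uncertainty can drive an acquisition function, the sketch below ranks unlabeled pool points by the entropy of their averaged stochastic predictions. Maximum entropy is one commonly used acquisition function; the abstract does not say which ones the paper uses, so treat this choice, the function name, and the toy probabilities as assumptions.

```python
import numpy as np

def select_by_max_entropy(mc_probs, num_queries):
    """Pick the pool points whose averaged prediction is most uncertain.

    mc_probs:    array (T, N, C) of class probabilities from T stochastic
                 forward passes (dropout kept active) over N pool points
    num_queries: how many pool points to send to the oracle for labels
    """
    # Average the T stochastic passes to approximate the predictive distribution.
    mean_probs = mc_probs.mean(axis=0)
    # Predictive entropy per pool point; the small constant guards log(0).
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)
    # Indices of the highest-entropy points, most uncertain first.
    return np.argsort(-entropy)[:num_queries]

# Hypothetical pool of three points, two identical passes each.
one_pass = np.array([[0.99, 0.01], [0.50, 0.50], [0.80, 0.20]])
mc_probs = np.stack([one_pass, one_pass])
print(select_by_max_entropy(mc_probs, 2))
```

The point with probabilities (0.5, 0.5) is ranked first because its predictive distribution is closest to uniform.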

artificial intelligence, deep learning, machine learning, (12 more...)

1906.00108

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Consumer Health (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Castellana, Daniele, Bacciu, Davide

Bayesian Tensor Factorisation for Bottom-up Hidden Tree Markov Models

arXiv.org Machine Learning May-31-2019

The Bottom-Up Hidden Tree Markov Model is a highly expressive model for tree-structured data. Unfortunately, it cannot be used in practice due to the intractable size of its state-transition matrix. We propose a new approximation which relies on the Tucker factorisation of tensors. The probabilistic interpretation of this approximation allows us to define a new probabilistic model for tree-structured data. Hence, we define the new approximated model and derive its learning algorithm. Then, we empirically assess the effective power of the new model by evaluating it on two different tasks. In both cases, our model outperforms the other approximated model known in the literature.
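For readers unfamiliar with the Tucker factorisation, the sketch below shows the generic three-way case: a large tensor is represented by a small core tensor plus one factor matrix per mode. This is plain Tucker reconstruction, not the paper's Bayesian model, and the sizes and names are assumptions chosen for illustration.

```python
import numpy as np

def tucker_reconstruct(core, factors):
    """Rebuild a 3-way tensor from a Tucker core and three factor matrices.

    core:    array (p, q, r)
    factors: three arrays of shapes (I, p), (J, q), (K, r)
    Returns an array of shape (I, J, K).
    """
    a, b, c = factors
    # Multiply the core by one factor matrix along each of its three modes.
    return np.einsum("pqr,ip,jq,kr->ijk", core, a, b, c)

# Hypothetical sizes: 5 values per mode, compressed to rank 2 per mode.
rng = np.random.default_rng(0)
core = rng.random((2, 2, 2))
factors = [rng.random((5, 2)) for _ in range(3)]
full = tucker_reconstruct(core, factors)

# The factorised form stores far fewer numbers than the full tensor.
num_full = full.size
num_tucker = core.size + sum(f.size for f in factors)
print(full.shape, num_full, num_tucker)
```

The saving grows quickly with the number of modes, which is why a factorised form can keep a transition tensor tractable when the full tensor is not.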

approximation, artificial intelligence, machine learning, (17 more...)

1905.13528

Country: Europe (0.28)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Marra, Giuseppe, Kuželka, Ondřej

Neural Markov Logic Networks

arXiv.org Machine Learning May-31-2019

We introduce Neural Markov Logic Networks (NMLNs), a statistical relational learning system that borrows ideas from Markov logic. Like Markov Logic Networks (MLNs), NMLNs are an exponential-family model for modelling distributions over possible worlds, but unlike MLNs, they do not rely on explicitly specified first-order logic rules. Instead, NMLNs learn an implicit representation of such rules as a neural network that acts as a potential function on fragments of the relational structure. Interestingly, any MLN can be represented as an NMLN. Similarly to the recently proposed Neural Theorem Provers (NTPs) [Rocktäschel and Riedel, 2017], NMLNs can exploit embeddings of constants but, unlike NTPs, NMLNs also work well in their absence. This is extremely important for predicting in settings other than the transductive one.

artificial intelligence, machine learning, neural markov logic network, (14 more...)

1905.13462

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

#artificialintelligence May-30-2019, 19:28:47 GMT

AAIEA2019

The Workshop on Accelerating Artificial Intelligence for Embedded Autonomy aims at gathering researchers and practitioners in the fields of autonomy, automated reasoning, planning algorithms, and embedded systems to discuss the development of novel hardware architectures that can accelerate the wide variety of AI algorithms demanded by advanced autonomous and intelligent systems. Topics of interest include hardware architectures and design methodologies to accelerate: applications based on deep learning, skill-level and instinctive autonomy based on deep reinforcement learning, storage and retrieval of facts in knowledge bases, logical reasoning methods such as deduction, search for classical planning algorithms and Hierarchical Task Networks (HTN), inference in probabilistic models such as Bayesian networks and probabilistic logic, planning algorithms for Markov Decision Processes (MDP), and planning algorithms for Partially Observable Markov Decision Processes (POMDP).

artificial intelligence, machine learning, planning algorithm, (4 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)