AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

arXiv.org Artificial Intelligence, Aug-8-2020

Online Multi-modal Person Search in Videos

Xia, Jiangyue, Rao, Anyi, Huang, Qingqiu, Xu, Linning, Wen, Jiangtao, Lin, Dahua

The task of searching for specific people in videos has seen increasing potential in real-world applications, such as video organization and editing. Most existing approaches are devised to work offline, where identities can only be inferred after an entire video is examined. This precludes such methods from being applied to online services or applications that require real-time responses. In this paper, we propose an online person search framework, which can recognize people in a video on the fly. At its heart, this framework maintains a multimodal memory bank as the basis for person recognition and updates it dynamically with a policy obtained by reinforcement learning. Our experiments on a large movie dataset show that the proposed method is effective, not only achieving remarkable improvements over online schemes but also outperforming offline methods.

computer vision, machine learning, reinforcement learning, (14 more...)

2008.03546

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.96)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

arXiv.org Artificial Intelligence, Aug-8-2020

Hierarchial Reinforcement Learning in StarCraft II with Human Expertise in Subgoals Selection

Xu, Xinyi, Huang, Tiancheng, Wei, Pengfei, Narayan, Akshay, Leong, Tze-Yun

This work is inspired by recent advances in hierarchical reinforcement learning (HRL) (Barto and Mahadevan 2003; Hengst 2010), and improvements in learning efficiency with heuristic-based subgoal selection and hindsight experience replay (HER) (Andrychowicz et al. 2017; Levy et al. 2019). We propose a new method to integrate HRL, HER and effective subgoal selection based on human expertise to support sample-efficient learning and enhance interpretability of the agent's behavior. Human expertise remains indispensable in many areas such as medicine (Buch, Ahmed, and Maruthappu 2018) and law (Cath 2018), where interpretability, explainability and transparency are crucial in the decision-making process, for ethical and legal reasons. Our method simplifies the complex task sets for achieving the overall objectives by decomposing them into subgoals at different levels of abstraction. Incorporating relevant subjective knowledge also significantly reduces the computational resources spent in exploration for RL, especially in high-speed, changing, and complex environments where the transition dynamics cannot be effectively learned and modelled in a short time. Experimental results in two StarCraft II (SC2) minigames demonstrate that our method can achieve better sample efficiency than flat and end-to-end RL methods, and provide an effective method for explaining the agent's performance.
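As a rough illustration of the hindsight experience replay idea mentioned in the abstract, the sketch below relabels a failed episode with the state it actually reached, so that a sparse reward becomes informative. This is a generic toy version of the "final state" relabeling strategy and not the authors' implementation; the `Transition` type, the integer states, and the 0/-1 reward scheme are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    """Hypothetical goal-conditioned transition (not from the paper)."""
    state: int
    action: int
    next_state: int
    goal: int
    reward: float


def sparse_reward(next_state: int, goal: int) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if next_state == goal else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Pretend the state reached at the end of the episode was the goal
    all along, and recompute every reward against that substitute goal."""
    achieved = episode[-1].next_state
    return [
        t._replace(goal=achieved, reward=sparse_reward(t.next_state, achieved))
        for t in episode
    ]


# An episode that aimed for state 5 but only reached state 2.
original_goal = 5
path = [(0, 1, 1), (1, 1, 2)]  # (state, action, next_state)
episode = [
    Transition(s, a, s2, original_goal, sparse_reward(s2, original_goal))
    for s, a, s2 in path
]
relabeled = relabel_with_final_state(episode)

# Both versions would typically be stored in the replay buffer.
replay_buffer = episode + relabeled
```

In the original episode every reward is -1, while the relabeled copy ends with a reward of 0, which gives the learner a success signal even though the intended goal was never reached.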

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2008.03444

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Aug-8-2020

One for Many: Transfer Learning for Building HVAC Control

Xu, Shichao, Wang, Yixuan, Wang, Yanzhi, O'Neill, Zheng, Zhu, Qi

The design of building heating, ventilation, and air conditioning (HVAC) systems is critically important, as HVAC accounts for around half of building energy consumption and directly affects occupant comfort, productivity, and health. Traditional HVAC control methods are typically based on explicit physical models of building thermal dynamics, which often require significant effort to develop and struggle to achieve sufficient accuracy and efficiency for runtime building control and sufficient scalability for field implementations. Recently, deep reinforcement learning (DRL) has emerged as a promising data-driven method that provides good control performance without analyzing physical models at runtime. However, a major challenge to DRL (and many other data-driven learning methods) is the long training time it takes to reach the desired performance. In this work, we present a novel transfer-learning-based approach to overcome this challenge. Our approach can effectively transfer a DRL-based HVAC controller trained for the source building to a controller for the target building with minimal effort and improved performance, by decomposing the design of the neural network controller into a transferable front-end network that captures building-agnostic behavior and a back-end network that can be efficiently trained for each specific building. We conducted experiments on a variety of transfer scenarios between buildings with different sizes, numbers of thermal zones, materials and layouts, air conditioner types, and ambient weather conditions. The experimental results demonstrated the effectiveness of our approach in significantly reducing the training time, energy cost, and temperature violations.

controller, machine learning, reinforcement learning, (15 more...)

2008.03625

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > California > Riverside County > Riverside (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Construction & Engineering > HVAC (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Arenz, Oleg, Neumann, Gerhard

Non-Adversarial Imitation Learning and its Connections to Adversarial Methods

arXiv.org Machine Learning, Aug-8-2020

Imitation learning (IL, Schaal, 1999; Osa et al., 2018) and inverse reinforcement learning (IRL, Ng and Russell, 2000) are two related areas of research that aim to teach agents by providing demonstrations of the desired behavior. Whereas imitation learning aims to learn a policy that results in a similar behavior, inverse reinforcement learning focuses on inferring a reward function that might have been optimized by the demonstrator, aiming to better generalize to different environments. Both areas of research are often formalized as distribution-matching, that is, the learned policy (or the optimal policy for IRL) should induce a distribution over states and actions that is close to the expert's distribution with respect to a given (usually non-metric) distance. Commonly applied distances are the forward Kullback-Leibler (KL) divergence (e.g., Ziebart, 2010), which maximizes the likelihood of the demonstrated state-action pairs under the agent's distribution, and the reverse Kullback-Leibler (RKL) divergence (e.g., Arenz et al., 2016; Fu et al., 2018; Ghasemipour et al., 2020) which minimizes the expected discrimination information (Kullback and Leibler, 1951) of state-action pairs sampled from the agent's distribution. However, since the emergence of generative adversarial networks (GANs, Goodfellow et al., 2014) as a solution technique for both areas, other divergences have been investigated such as the Jensen-Shannon divergence (Ho and Ermon, 2016), the Wasserstein distance (Xiao et al., 2019) and general f-divergences (Ke et al., 2019; Ghasemipour et al., 2020).
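To make the difference between the divergences named above concrete, the following sketch evaluates the forward KL, reverse KL, and Jensen-Shannon divergence on a toy two-outcome distribution. It assumes discrete distributions given as plain lists, and the `expert` and `agent` values are invented for illustration; the paper itself works with distributions over states and actions.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists.
    Terms with p_i == 0 contribute nothing; q_i must be positive elsewhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js(p, q):
    """Jensen-Shannon divergence: a symmetrised KL against the mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy distributions over two state-action pairs (illustrative values only).
expert = [0.5, 0.5]
agent = [0.9, 0.1]

forward_kl = kl(expert, agent)   # expectation under the expert's samples
reverse_kl = kl(agent, expert)   # expectation under the agent's samples
js_div = js(expert, agent)

print(f"forward KL = {forward_kl:.4f}")
print(f"reverse KL = {reverse_kl:.4f}")
print(f"JS         = {js_div:.4f}")
```

The two KL values differ because the divergence is not symmetric, which is one reason the choice of direction matters for the behavior an imitation learner ends up with; the Jensen-Shannon divergence gives the same value in either argument order.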

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2008.03525

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > United States > New York > Richmond County > New York City (0.04)
(9 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

#artificialintelligence, Aug-7-2020, 16:59:34 GMT

Artificial Intelligence Masterclass

Online Courses Udemy - Artificial Intelligence Masterclass: Enter the new era of Hybrid AI Models optimized by Deep NeuroEvolution, with a complete toolkit of ML, DL & AI models. Created by Hadelin de Ponteves, Kirill Eremenko, and the SuperDataScience Team. English, Italian [Auto]. Description: Today, we are bringing you the king of our AI courses: The Artificial Intelligence MASTERCLASS. Are you keen on Artificial Intelligence? Do you want to learn to build the most powerful AI model developed so far and even play against it? Sounds tempting, right? Then the Artificial Intelligence Masterclass course is the right choice for you. This ultimate AI toolbox is all you need to nail it down with ease. You will get a 10-hour step-by-step guide and the full roadmap, which will help you build your own Hybrid AI Model from scratch.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Genre: Instructional Material > Training Manual (0.37)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.59)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence, Aug-7-2020, 07:56:06 GMT

Markov Decision Process

A machine learning algorithm may be tasked with an optimization problem. Using reinforcement learning, the algorithm attempts to optimize the actions it takes within an environment in order to maximize the expected reward. Where supervised learning techniques require correct input/output pairs to create a model, reinforcement learning models the problem as a Markov decision process and learns from reward feedback, balancing exploration of untried actions against exploitation of actions already known to pay off. Reinforcement learning is well suited to Markov decision processes whose transition probabilities and rewards are unspecified or unknown.
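A minimal sketch of these ideas is tabular Q-learning on a tiny chain-shaped Markov decision process. The agent never sees the transition or reward functions directly; it only observes the outcome of each step, and an epsilon-greedy rule trades off exploration and exploitation. The four-state chain, the reward of 1 at the last state, and all hyperparameters are assumptions chosen for illustration.

```python
import random

N_STATES = 4          # states 0..3; state 3 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Hypothetical environment: a deterministic chain that pays
    a reward of 1 only when the goal state is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Bootstrap from the next state unless the episode has ended.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(policy)
```

After training, the greedy policy moves right in every non-terminal state, which is the shortest route to the reward in this toy chain.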

inductive learning, markov decision process, reinforcement learning, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)

#artificialintelligence, Aug-7-2020, 01:41:04 GMT

Deep Reinforcement Learning & Its Applications

How To Make Most Successful Apps For Your Business? AI system 'should be recognised as inventor'

artificial intelligence, deep reinforcement learning, machine learning, (1 more...)

Industry: Information Technology > Security & Privacy (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Dodaro, Carmine, Eiter, Thomas, Ogris, Paul, Schekotihin, Konstantin

Managing caching strategies for stream reasoning with reinforcement learning

arXiv.org Artificial Intelligence, Aug-7-2020

Efficient decision-making over continuously changing data is essential for many application domains such as cyber-physical systems and industry digitalization. Modern stream reasoning frameworks allow one to model and solve various real-world problems using incremental and continuous evaluation of programs as new data arrives in the stream. Applied techniques use, e.g., Datalog-like materialization or truth maintenance algorithms to avoid costly re-computations, thus ensuring low latency and high throughput of a stream reasoner. However, the expressiveness of existing approaches is quite limited; for example, they cannot be used to encode problems with constraints, which often appear in practice. In this paper, we suggest a novel approach that uses Conflict-Driven Constraint Learning (CDCL) to efficiently update legacy solutions through intelligent management of learned constraints. In particular, we study the applicability of reinforcement learning to continuously assess the utility of learned constraints computed in previous invocations of the solving algorithm for the current one. Evaluations conducted on real-world reconfiguration problems show that providing a CDCL algorithm with relevant learned constraints from previous iterations results in significant performance improvements of the algorithm in stream reasoning scenarios.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2008.03212

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy > Calabria (0.04)
Europe > Germany > Brandenburg > Potsdam (0.04)
Europe > Austria > Vienna (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.66)

Polymenakos, Kyriakos, Rontsis, Nikitas, Abate, Alessandro, Roberts, Stephen

SafePILCO: a software tool for safe and data-efficient policy synthesis

arXiv.org Machine Learning, Aug-7-2020

SafePILCO is a software tool for safe and data-efficient policy search with reinforcement learning. It extends the known PILCO algorithm, originally written in MATLAB, to support safe learning. We provide a Python implementation and leverage existing libraries that allow the codebase to remain short and modular, which is appropriate for wider use by the verification, reinforcement learning, and control communities.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2008.03273

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)