AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

A Review on Computational Intelligence Techniques in Cloud and Edge Computing

Asim, Muhammad, Wang, Yong, Wang, Kezhi, Huang, Pei-Qiu

arXiv.org Artificial Intelligence, Jul-27-2020

Cloud computing (CC) is a centralized computing paradigm that accumulates resources centrally and provides these resources to users through the Internet. Although CC holds a large number of resources, it may not be acceptable for real-time mobile applications, as it is usually far away from users geographically. On the other hand, edge computing (EC), which distributes resources to the network edge, enjoys increasing popularity in applications with low-latency and high-reliability requirements. EC provides resources in a decentralized manner and can respond to users' requirements faster than conventional CC, but with limited computing capacity. As both CC and EC are resource-sensitive, several big issues arise, such as how to conduct job scheduling, resource allocation, and task offloading, which significantly influence the performance of the whole system. To tackle these issues, many optimization problems have been formulated. These optimization problems usually have complex properties, such as non-convexity and NP-hardness, which may not be addressed by traditional convex optimization-based solutions. Computational intelligence (CI), consisting of a set of nature-inspired computational approaches, has recently exhibited great potential in addressing these optimization problems in CC and EC. This paper provides an overview of research problems in CC and EC and recent progress in addressing them with the help of CI techniques. Informative discussions and future research trends are also presented, with the aim of offering insights to the readers and motivating new research directions.

evolutionary algorithm, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TETCI.2020.3007905

2007.14215

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(9 more...)

Genre: Overview (1.00)

Industry:

Telecommunications (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)
Energy > Power Industry (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Cloud Computing (1.00)
(9 more...)

Noisy Agents: Self-supervised Exploration by Predicting Auditory Events

Gan, Chuang, Chen, Xiaoyu, Isola, Phillip, Torralba, Antonio, Tenenbaum, Joshua B.

arXiv.org Artificial Intelligence, Jul-27-2020

Humans integrate multiple sensory modalities (e.g. visual and audio) to build a causal understanding of the physical world. In this work, we propose a novel type of intrinsic motivation for Reinforcement Learning (RL) that encourages the agent to understand the causal effect of its actions through auditory event prediction. First, we allow the agent to collect a small amount of acoustic data and use K-means to discover underlying auditory event clusters. We then train a neural network to predict the auditory events and use the prediction errors as intrinsic rewards to guide RL exploration. Experimental results on Atari games show that our new intrinsic motivation significantly outperforms several state-of-the-art baselines. We further visualize our noisy agents' behavior in a physics environment and demonstrate that our newly designed intrinsic reward leads to the emergence of physical interaction behaviors (e.g. contact with objects).
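
The two stages described in this abstract (clustering acoustic data into auditory events, then rewarding the agent for prediction errors) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the 2-D toy "audio features", the deterministic centroid initialization, and the use of cross-entropy as the prediction error. The probability vectors stand in for the output of the neural network the abstract mentions.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_cluster(point, centroids):
    """Index of the nearest centroid, i.e. the auditory event label."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain K-means. Assumption: centroids start at evenly spaced data points."""
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign_cluster(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(group) for dim in zip(*group)]
    return centroids


def intrinsic_reward(predicted_probs, event_label):
    """Prediction error (cross-entropy) used as the exploration bonus."""
    return -math.log(max(predicted_probs[event_label], 1e-12))


# Hypothetical acoustic features forming two well-separated event clusters.
audio_features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                  (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids = kmeans(audio_features, k=2)

# A new sound is mapped to an event; a poorly predicted event earns more reward.
event = assign_cluster((10.5, 10.5), centroids)
reward_surprised = intrinsic_reward([0.9, 0.1], event)
reward_expected = intrinsic_reward([0.1, 0.9], event)
```

In this toy setup the surprising event yields the larger bonus, which is the property that would push an agent toward interactions whose sounds it cannot yet predict.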

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2007.13729

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation

Kostrikov, Ilya, Nachum, Ofir

arXiv.org Machine Learning, Jul-27-2020

In reinforcement learning, it is typical to use the empirically observed transitions and rewards to estimate the value of a policy via either model-based or Q-fitting approaches. Although straightforward, these techniques in general yield biased estimates of the true value of the policy. In this work, we investigate the potential for statistical bootstrapping to be used as a way to take these biased estimates and produce calibrated confidence intervals for the true value of the policy. We identify conditions - specifically, sufficient data size and sufficient coverage - under which statistical bootstrapping in this setting is guaranteed to yield correct confidence intervals. In practical situations, these conditions often do not hold, and so we discuss and propose mechanisms that can be employed to mitigate their effects. We evaluate our proposed method and show that it can yield accurate confidence intervals in a variety of conditions, including challenging continuous control environments and small data regimes.
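
As a rough illustration of the idea of statistical bootstrapping, the sketch below builds a percentile confidence interval by resampling a dataset with replacement and re-running an estimator on each resample. It is a simplification under stated assumptions: the estimator is a plain mean over per-episode returns (a stand-in for the model-based or Q-fitting estimators the abstract refers to), and the names `bootstrap_confidence_interval` and `mean_return` are hypothetical.

```python
import random


def mean_return(episode_returns):
    """Stand-in value estimator: average of observed episode returns."""
    return sum(episode_returns) / len(episode_returns)


def bootstrap_confidence_interval(samples, estimator, num_resamples=1000,
                                  alpha=0.1, seed=0):
    """Percentile bootstrap interval at confidence level 1 - alpha."""
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(num_resamples):
        # Resample the dataset with replacement, then re-estimate the value.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    estimates.sort()
    lower = estimates[int((alpha / 2) * (num_resamples - 1))]
    upper = estimates[int((1 - alpha / 2) * (num_resamples - 1))]
    return lower, upper


# Hypothetical returns collected from evaluation episodes.
returns = [float(r) for r in range(1, 21)]
lower, upper = bootstrap_confidence_interval(returns, mean_return)
```

The interval narrows as more data is collected. As the abstract notes, its correctness depends on sufficient data size and coverage, which this toy example does not check.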

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

2007.13609

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.83)

Python for Computer Vision with OpenCV and Deep Learning

#artificialintelligence, Jul-26-2020, 09:16:54 GMT

Created by Jose Portilla. Welcome to the ultimate online course on Python for Computer Vision! This course is your best resource for learning how to use the Python programming language for Computer Vision. We'll be exploring how to use Python and the OpenCV (Open Computer Vision) library to analyze images and video data. The most popular platforms in the world are generating never-before-seen amounts of image and video data. Now more than ever, it's necessary for developers to gain the skills to work with image and video data using computer vision.

artificial intelligence, machine learning, reinforcement learning, (10 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.96)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.42)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ugurkanates/awesome-real-world-rl

#artificialintelligence, Jul-25-2020, 12:20:47 GMT

This list is a big compilation of all things that try to adapt Reinforcement Learning techniques to the real world, whether by mixing real-world data into training or by adapting simulations in a better way. It will also include some Imitation Learning and Meta Learning along the way. If anything is missing, feel free to open a PR; I'm all for community contributions. I'm open to new categories, so just read the contributing doc and submit a pull request. You can also help by starring our lovely repository and sharing it. Be safe! One part of the list covers any academic work related to RL in the real world. The other part collects anything that doesn't fit elsewhere but is still related.

machine learning, reinforcement learning, ugurkanate awesome-real-world-rl, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.36)

Automated Database Indexing using Model-free Reinforcement Learning

Licks, Gabriel Paludo, Meneguzzi, Felipe

arXiv.org Artificial Intelligence, Jul-25-2020

Configuring databases for efficient querying is a complex task, often carried out by a database administrator. Solving the problem of building indexes that truly optimize database access requires a substantial amount of database and domain knowledge, the lack of which often results in wasted space and memory for irrelevant indexes, possibly jeopardizing database performance for querying and certainly degrading performance for updating. We develop an architecture to solve the problem of automatically indexing a database by using reinforcement learning to optimize queries by indexing data throughout the lifetime of a database. In our experimental evaluation, our architecture shows superior performance compared to related work on reinforcement learning and genetic algorithms, maintaining near-optimal index configurations and efficiently scaling to large databases.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2007.14244

Country:

North America > United States > New York > New York County > New York City (0.04)
South America > Brazil > Rio Grande do Sul (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Human Preference Scaling with Demonstrations For Deep Reinforcement Learning

Cao, Zehong, Wong, KaiChiu, Lin, Chin-Teng

arXiv.org Artificial Intelligence, Jul-25-2020

Current reward learning from human preferences can be used to solve complex reinforcement learning (RL) tasks without access to the reward function by defining a single fixed preference between pairs of trajectory segments. However, the judgement of preferences between trajectories is not dynamic and still requires more than 1,000 human inputs. In this study, we propose a human preference scaling model that naturally reflects the human perception of the degree of choice between trajectories and then develop a human-demonstration preference model via supervised learning to reduce the number of human inputs. The proposed human preference scaling model with demonstrations can effectively solve complex RL tasks and achieve higher cumulative rewards in simulated robot locomotion (MuJoCo games) relative to the single fixed human preferences. Furthermore, our human-demonstration preference model only needs human feedback for less than 0.01% of the agent's interactions with the environment and reduces the cost of human inputs by up to 30% compared to existing approaches. To present the flexibility of our approach, we released a video (https://youtu.be/jQPe1OILT0M) showing comparisons of behaviours of agents trained with different types of human inputs. We believe that our naturally inspired human preference scaling with demonstrations is beneficial for precise reward learning and can potentially be applied to state-of-the-art RL systems, such as autonomy-level driving systems.
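
To make the contrast between a single fixed preference and a scaled preference concrete, here is a minimal sketch of a preference-based reward-learning loss. It assumes the common Bradley-Terry style formulation from earlier preference-learning work, in which the probability of preferring one segment is a logistic function of the difference in summed predicted rewards. Treating the "degree of choice" as a soft label in [0, 1] is an assumption made for illustration; the abstract does not specify the paper's exact model, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def preference_probability(rewards_a, rewards_b):
    """P(segment A preferred over B) from summed predicted rewards."""
    return sigmoid(sum(rewards_a) - sum(rewards_b))


def preference_loss(prob_a, label):
    """Cross-entropy between predicted preference and a human label.

    label = 1.0 or 0.0 is a fixed (hard) preference; values in between
    encode a degree of preference (assumed soft-label form).
    """
    eps = 1e-12
    prob_a = min(max(prob_a, eps), 1.0 - eps)
    return -(label * math.log(prob_a) + (1.0 - label) * math.log(1.0 - prob_a))


# Predicted per-step rewards for two hypothetical trajectory segments.
segment_a = [0.5, 0.8, 0.7]
segment_b = [0.1, 0.2, 0.3]
p_a = preference_probability(segment_a, segment_b)

hard_loss = preference_loss(p_a, 1.0)   # rater says: A, definitely
soft_loss = preference_loss(p_a, 0.7)   # rater says: A, but only somewhat
```

Minimizing this loss over many labeled pairs would fit the reward predictor. A soft label lets one comparison carry information about how strongly the rater prefers a segment rather than only which one.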

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2007.12904

Country:

Oceania > Australia > Tasmania (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Exploring Exploration: Comparing Children with RL Agents in Unified Environments

#artificialintelligence, Jul-24-2020, 00:58:40 GMT

Despite recent advances in artificial intelligence (AI) research, human children are still by far the best learners we know of, learning impressive skills like language and high-level reasoning from very little data. Children's learning is supported by highly efficient, hypothesis-driven exploration: in fact, they explore so well that many machine learning researchers have been inspired to put videos like the one below in their talks to motivate research into exploration methods. However, because applying results from studies in developmental psychology can be difficult, this video is often the extent to which such research actually connects with human cognition. Why is directly applying research from developmental psychology to problems in AI so hard? For one, taking inspiration from developmental studies can be difficult because the environments that human children and artificial agents are typically studied in can be very different.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

Forestier, Sébastien, Portelas, Rémy, Mollard, Yoan, Oudeyer, Pierre-Yves

arXiv.org Artificial Intelligence, Jul-24-2020

Intrinsically motivated spontaneous exploration is a key enabler of autonomous lifelong learning in human children. It enables the discovery and acquisition of large repertoires of skills through self-generation, self-selection, self-ordering and self-experimentation of learning goals. We present an algorithmic approach called Intrinsically Motivated Goal Exploration Processes (IMGEP) to enable similar properties of autonomous or self-supervised learning in machines. The IMGEP algorithmic architecture relies on several principles: 1) self-generation of goals, generalized as fitness functions; 2) selection of goals based on intrinsic rewards; 3) exploration with incremental goal-parameterized policy search and exploitation of the gathered data with a batch learning algorithm; 4) systematic reuse of information acquired when targeting a goal for improving towards other goals. We present a particularly efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a population-based policy and an object-centered modularity in goals and mutations. We provide several implementations of this architecture and demonstrate their ability to automatically generate a learning curriculum within several experimental setups, including a real humanoid robot that can explore multiple spaces of goals with several hundred continuous dimensions. While no particular target goal is provided to the system, this curriculum allows the discovery of skills that act as stepping stones for learning more complex skills, e.g. nested tool use. We show that learning diverse spaces of goals with intrinsic motivations is more efficient for learning complex skills than only trying to directly learn these complex skills.
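
A heavily simplified goal exploration loop, in the spirit of principles 1, 3 and 4 above, might look like the sketch below. It is not the authors' architecture: goals are sampled uniformly rather than selected by intrinsic reward (principle 2 is omitted), there is no modularity or batch learning, and `toy_environment`, the 2-D parameter space, and all other names are hypothetical.

```python
import random


def toy_environment(params):
    """Hypothetical environment: the outcome is the parameters clipped to [0, 1]."""
    return tuple(min(max(p, 0.0), 1.0) for p in params)


def goal_distance(outcome, goal):
    """Squared distance between an observed outcome and a target goal."""
    return sum((o - g) ** 2 for o, g in zip(outcome, goal))


def goal_exploration(environment, num_random=10, num_goal_directed=90,
                     noise=0.05, seed=0):
    rng = random.Random(seed)
    memory = []  # (policy parameters, observed outcome) pairs

    # Bootstrap phase: try random policy parameters.
    for _ in range(num_random):
        params = (rng.uniform(-1.0, 2.0), rng.uniform(-1.0, 2.0))
        memory.append((params, environment(params)))

    for _ in range(num_goal_directed):
        # Self-generate a goal in outcome space (uniform sampling assumed).
        goal = (rng.random(), rng.random())
        # Reuse past experience: start from the parameters whose outcome
        # came closest to this goal, whatever goal they were collected for.
        best_params, _ = min(memory, key=lambda e: goal_distance(e[1], goal))
        # Incremental policy search: perturb those parameters and try again.
        params = tuple(p + rng.gauss(0.0, noise) for p in best_params)
        memory.append((params, environment(params)))
    return memory


memory = goal_exploration(toy_environment)
```

Because every rollout is stored regardless of which goal prompted it, data gathered while targeting one goal remains available when a different goal is sampled later.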

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

1708.0219

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)

Genre:

Research Report (1.00)
Instructional Material (0.87)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.94)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(3 more...)

Deep Inverse Reinforcement Learning for Structural Evolution of Small Molecules

Agyemang, Brighter, Wu, Wei-Ping, Addo, Daniel, Kpiebaareh, Michael Y., Nanor, Ebenezer, Haruna, Charles Roland

arXiv.org Artificial Intelligence, Jul-24-2020

The size and quality of the chemical libraries available to the drug discovery pipeline are crucial for developing new drugs or repurposing existing drugs. Existing techniques such as combinatorial organic synthesis and High-Throughput Screening usually make the process extraordinarily tough and complicated, since the search space of synthetically feasible drugs is enormous. While reinforcement learning has mostly been exploited in the literature for generating novel compounds, the requirement of designing a reward function that succinctly represents the learning objective could prove daunting in certain complex domains. Generative Adversarial Network-based methods also mostly discard the discriminator after training and can be hard to train. In this study, we propose a framework for training a compound generator and learning a transferable reward function based on the entropy maximization inverse reinforcement learning paradigm. Our experiments show that the inverse reinforcement learning route offers a rational alternative for generating chemical compounds in domains where reward function engineering may be less appealing or impossible while data exhibiting the desired objective is readily available.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2008.11804

Country:

Asia > China > Sichuan Province > Chengdu (0.05)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
