AITopics

2211.01496

Country: North America > United States > Arizona > Maricopa County > Tempe (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Felizardo, Leonardo Kanashiro, Paiva, Francisco Caio Lima, Costa, Anna Helena Reali, Del-Moral-Hernandez, Emilio

Reinforcement Learning Applied to Trading Systems: A Survey

arXiv.org Artificial IntelligenceNov-1-2022

Financial domain tasks, such as trading in market exchanges, are challenging and have long attracted researchers. The recent achievements and the consequent notoriety of Reinforcement Learning (RL) have also increased its adoption in trading tasks. RL uses a framework with well-established formal concepts, which raises its attractiveness in learning profitable trading strategies. However, RL use without due attention in the financial area can prevent new researchers from following standards or failing to adopt relevant conceptual guidelines. In this work, we embrace the seminal RL technical fundamentals, concepts, and recommendations to perform a unified, theoretically-grounded examination and comparison of previous research that could serve as a structuring guide for the field of study. A selection of twenty-nine articles was reviewed under our classification that considers RL's most common formulations and design patterns from a large volume of available studies. This classification allowed for precise inspection of the most relevant aspects regarding data input, preprocessing, state and action composition, adopted RL techniques, evaluation setups, and overall results. Our analysis approach organized around fundamental RL concepts allowed for a clear identification of current system design best practices, gaps that require further investigation, and promising research opportunities. Finally, this review attempts to promote the development of this field of study by facilitating researchers' commitment to standards adherence and helping them to avoid straying away from the RL constructs' firm ground.

evolutionary algorithm, machine learning, reinforcement learning, (20 more...)

2212.06064

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

arXiv.org Artificial IntelligenceNov-1-2022

Inferring school district learning modalities during the COVID-19 pandemic with a hidden Markov model

Panaggio, Mark J., Fang, Mike, Bang, Hyunseung, Armstrong, Paige A., Binder, Alison M., Grass, Julian E., Magid, Jake, Papazian, Marc, Shapiro-Mendoza, Carrie K, Parks, Sharyn E.

In this study, learning modalities offered by public schools across the United States were investigated to track changes in the proportion of schools offering fully in-person, hybrid and fully remote learning over time. Learning modalities from 14,688 unique school districts from September 2020 to June 2021 were reported by Burbio, MCH Strategic Data, the American Enterprise Institute's Return to Learn Tracker and individual state dashboards. A model was needed to combine and deconflict these data to provide a more complete description of modalities nationwide. A hidden Markov model (HMM) was used to infer the most likely learning modality for each district on a weekly basis. This method yielded higher spatiotemporal coverage than any individual data source and higher agreement with three of the four data sources than any other single source. The model output revealed that the percentage of districts offering fully in-person learning rose from 40.3% in September 2020 to 54.7% in June of 2021 with increases across 45 states and in both urban and rural districts. This type of probabilistic model can serve as a tool for fusion of incomplete and contradictory data sources in support of public health surveillance and research efforts.

artificial intelligence, machine learning, modality, (18 more...)

2211.00708

Country:

North America > United States > Vermont (0.05)
North America > United States > Virginia (0.05)
North America > United States > South Carolina (0.05)
(18 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Public Health (1.00)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Celemin, Carlos, Pérez-Dattari, Rodrigo, Chisari, Eugenio, Franzese, Giovanni, Rosa, Leandro de Souza, Prakash, Ravi, Ajanović, Zlatan, Ferraz, Marta, Valada, Abhinav, Kober, Jens

Interactive Imitation Learning in Robotics: A Survey

arXiv.org Artificial IntelligenceOct-31-2022

Interactive Imitation Learning (IIL) is a branch of Imitation Learning (IL) where human feedback is provided intermittently during robot execution allowing an online improvement of the robot's behavior. In recent years, IIL has increasingly started to carve out its own space as a promising data-driven alternative for solving complex robotic tasks. The advantages of IIL are its data-efficient, as the human feedback guides the robot directly towards an improved behavior, and its robustness, as the distribution mismatch between the teacher and learner trajectories is minimized by providing feedback directly over the learner's trajectories. Nevertheless, despite the opportunities that IIL presents, its terminology, structure, and applicability are not clear nor unified in the literature, slowing down its development and, therefore, the research of innovative formulations and discoveries. In this article, we attempt to facilitate research in IIL and lower entry barriers for new practitioners by providing a survey of the field that unifies and structures it. In addition, we aim to raise awareness of its potential, what has been accomplished and what are still open research questions. We organize the most relevant works in IIL in terms of human-robot interaction (i.e., types of feedback), interfaces (i.e., means of providing feedback), learning (i.e., models learned from feedback and function approximators), user experience (i.e., human perception about the learning process), applications, and benchmarks. Furthermore, we analyze similarities and differences between IIL and RL, providing a discussion on how the concepts offline, online, off-policy and on-policy learning should be transferred to IIL from the RL literature. We particularly focus on robotic applications in the real world and discuss their implications, limitations, and promising future areas of research.

evolutionary algorithm, machine learning, reinforcement learning, (24 more...)

2211.006

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > Los Angeles County (0.45)
North America > United States > Massachusetts > Middlesex County (0.27)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > New Finding (0.87)
Research Report > Experimental Study > Negative Result (0.33)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(8 more...)

arXiv.org Artificial IntelligenceOct-31-2022

Teacher-student curriculum learning for reinforcement learning

Schraner, Yanick

Reinforcement learning (rl) is a popular paradigm for sequential decision making problems. The past decade's advances in rl have led to breakthroughs in many challenging domains such as video games, board games, robotics, and chip design. The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying rl to real-world problems. Transfer learning has been applied to reinforcement learning such that the knowledge gained in one task can be applied when training in a new task. Curriculum learning is concerned with sequencing tasks or data samples such that knowledge can be transferred between those tasks to learn a target task that would otherwise be too difficult to solve. Designing a curriculum that improves sample efficiency is a complex problem. In this thesis, we propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task. Our method is independent of human domain knowledge and manual curriculum design. We evaluated our methods on two reinforcement learning benchmarks: grid world and the challenging Google Football environment. With our method, we can improve the sample efficiency and generality of the student compared to tabula-rasa reinforcement learning.

curriculum, machine learning, reinforcement learning, (17 more...)

2210.17368

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(8 more...)

Genre:

Research Report > New Finding (0.92)
Instructional Material > Course Syllabus & Notes (0.87)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Arumugam, Dilip, Singh, Satinder

Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning. As the computation of exact solutions to Bayesian reinforcement-learning problems is intractable, much of the literature has focused on developing suitable approximation algorithms. In this work, before diving into algorithm design, we first define, under mild structural assumptions, a complexity measure for BAMDP planning. As efficient exploration in BAMDPs hinges upon the judicious acquisition of information, our complexity measure highlights the worst-case difficulty of gathering information and exhausting epistemic uncertainty. To illustrate its significance, we establish a computationally-intractable, exact planning algorithm that takes advantage of this measure to show more efficient planning. We then conclude by introducing a specific form of state abstraction with the potential to reduce BAMDP complexity and gives rise to a computationally-tractable, approximate planning algorithm.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2210.16872

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Arumugam, Dilip, Ho, Mark K., Goodman, Noah D., Van Roy, Benjamin

On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources. Prior work has drawn inspiration from this fact and leveraged an information-theoretic model of such behaviors or policies as communication channels operating under a bounded rate constraint. Meanwhile, a parallel line of work also capitalizes on the same principles from rate-distortion theory to formalize capacity-limited decision making through the notion of a learning target, which facilitates Bayesian regret bounds for provably-efficient learning algorithms. In this paper, we aim to elucidate this latter perspective by presenting a brief survey of these information-theoretic models of capacity-limited decision making in biological and artificial agents.

artificial intelligence, machine learning, reinforcement learning, (10 more...)

2210.16877

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Overview (0.54)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Arumugam, Dilip, Van Roy, Benjamin

Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment. Recent empirical successes in model-based reinforcement learning with function approximation, however, eschew the true model in favor of a surrogate that, while ignoring various facets of the environment, still facilitates effective planning over behaviors. Recently formalized as the value equivalence principle, this algorithmic technique is perhaps unavoidable as real-world reinforcement learning demands consideration of a simple, computationally-bounded agent interacting with an overwhelmingly complex environment, whose underlying dynamics likely exceed the agent's capacity for representation. In this work, we consider the scenario where agent limitations may entirely preclude identifying an exactly value-equivalent model, immediately giving rise to a trade-off between identifying a model that is simple enough to learn while only incurring bounded sub-optimality. To address this problem, we introduce an algorithm that, using rate-distortion theory, iteratively computes an approximately-value-equivalent, lossy compression of the environment which an agent may feasibly target in lieu of the true model. We prove an information-theoretic, Bayesian regret bound for our algorithm that holds for any finite-horizon, episodic sequential decision-making problem. Crucially, our regret bound can be expressed in one of two possible forms, providing a performance guarantee for finding either the simplest model that achieves a desired sub-optimality gap or, alternatively, the best model given a limit on agent capacity.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

2206.02072

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Romania > Sud-Est Development Region > Tulcea County > Tulcea (0.04)
(4 more...)

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.67)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Reward Shaping Using Convolutional Neural Network

Sami, Hani, Otrok, Hadi, Bentahar, Jamal, Mourad, Azzam, Damiani, Ernesto

In this paper, we propose Value Iteration Network for Reward Shaping (VIN-RS), a potential-based reward shaping mechanism using Convolutional Neural Network (CNN). The proposed VIN-RS embeds a CNN trained on computed labels using the message passing mechanism of the Hidden Markov Model. The CNN processes images or graphs of the environment to predict the shaping values. Recent work on reward shaping still has limitations towards training on a representation of the Markov Decision Process (MDP) and building an estimate of the transition matrix. The advantage of VIN-RS is to construct an effective potential function from an estimated MDP while automatically inferring the environment transition matrix. The proposed VIN-RS estimates the transition matrix through a self-learned convolution filter while extracting environment details from the input frames or sampled graphs. Due to (1) the previous success of using message passing for reward shaping; and (2) the CNN planning behavior, we use these messages to train the CNN of VIN-RS. Experiments are performed on tabular games, Atari 2600 and MuJoCo, for discrete and continuous action space. Our results illustrate promising improvements in the learning speed and maximum cumulative reward compared to the state-of-the-art.

artificial intelligence, machine learning, vin-rs, (19 more...)

2210.16956

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (0.93)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Ravuri, Aditya, Andersson, Tom R., Kazlauskaite, Ieva, Tebbutt, Will, Turner, Richard E., Hosking, J. Scott, Lawrence, Neil D., Kaiser, Markus

Ice Core Dating using Probabilistic Programming

arXiv.org Artificial IntelligenceOct-29-2022

However, before ice core data can have scientific value, the chronology must be inferred by estimating the age as a function of depth. Under certain conditions, chemicals locked in the ice display quasi-periodic cycles that delineate annual layers. Manually counting these noisy seasonal patterns to infer the chronology can be an imperfect and time-consuming process, and does not capture uncertainty in a principled fashion. In addition, several ice cores may be collected from a region, introducing an aspect of spatial correlation between them. We present an exploration of the use of probabilistic models for automatic dating of ice cores, using probabilistic programming to showcase its use for prototyping, automatic inference and maintainability, and demonstrate common failure modes of these tools.

artificial intelligence, machine learning, observation model, (16 more...)

2210.16568

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Antarctica > West Antarctica > Antarctic Peninsula (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)