Review for NeurIPS paper: A Unifying View of Optimism in Episodic Reinforcement Learning
Weaknesses: While I like the duality result, I do not find this paper substantial enough to merit acceptance. The paper shows that a class of model-optimistic algorithms can be implemented efficiently (with minor modifications). However, none of the state-of-the-art algorithms are model-optimistic. This is somewhat inherent to this class of algorithms: the transition model scales with S^2, whereas the optimal bounds, which scale linearly in S, are achieved by value-optimistic algorithms. Value-optimistic algorithms are not only more computationally efficient but also more statistically efficient, so making model-optimistic algorithms more efficient is not a very significant result.
Review for NeurIPS paper: A Unifying View of Optimism in Episodic Reinforcement Learning
The reviewers are in agreement that this is interesting and well-presented work. The main concern was the extent to which the results will help derive state-of-the-art algorithms in the future. I find the contribution reasonable without this and hope the community will figure out how, or whether, these results are useful. Please do take the reviewers' minor suggestions into consideration when preparing the final version.
A Unifying View of Optimism in Episodic Reinforcement Learning
In this paper we provide a general framework for designing, analyzing and implementing optimistic algorithms in the episodic reinforcement learning problem. This framework is built upon Lagrangian duality, and demonstrates that every model-optimistic algorithm that constructs an optimistic MDP has an equivalent representation as a value-optimistic dynamic programming algorithm. Previously, it was thought that these two classes of algorithms were distinct, with model-optimistic algorithms benefiting from a cleaner probabilistic analysis while value-optimistic algorithms are easier to implement and thus more practical. With the framework developed in this paper, we show that it is possible to get the best of both worlds by providing a class of algorithms which have a computationally efficient dynamic-programming implementation and also a simple probabilistic analysis. Besides being able to capture many existing algorithms in the tabular setting, our framework can also address large-scale problems under realizable function approximation, where it enables a simple model-based analysis of some recently proposed methods.
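The duality the abstract alludes to can be sketched schematically as follows. This is an illustrative example, not the paper's construction: the choice of an $\ell_1$ confidence ball around the empirical model $\hat P$, the radius $\beta(s,a)$, and the bonus form are all assumptions for the sketch. A model-optimistic backup picks the most favorable transition kernel within the confidence set:

```latex
% Model-optimistic backup over an \ell_1 confidence ball of radius \beta(s,a)
% around the empirical transition model \hat P:
\max_{\substack{P'(\cdot\mid s,a)\,\in\,\Delta(\mathcal{S}) \\
      \|P'(\cdot\mid s,a)-\hat P(\cdot\mid s,a)\|_1 \le \beta(s,a)}}
  \; r(s,a) + \bigl\langle P'(\cdot\mid s,a),\, V \bigr\rangle
% By Lagrangian (Holder-type) duality, and up to boundary effects of the
% probability simplex, the maximization admits the closed form
\;=\; r(s,a) + \bigl\langle \hat P(\cdot\mid s,a),\, V \bigr\rangle
      + \frac{\beta(s,a)}{2}\,\operatorname{span}(V),
% where span(V) = max_s V(s) - min_s V(s). This is exactly a
% value-optimistic Bellman update with exploration bonus
% b(s,a) = \beta(s,a) * span(V) / 2.
```

The right-hand side is a standard dynamic-programming backup with an additive bonus, which is why the model-optimistic algorithm inherits a value-optimistic implementation.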
Active Inference or Control as Inference? A Unifying View
Watson, Joe, Imohiosen, Abraham, Peters, Jan
Active inference (AI) is a persuasive theoretical framework from computational neuroscience that seeks to describe action and perception as inference-based computation. However, this framework has yet to provide practical sensorimotor control algorithms that are competitive with alternative approaches. In this work, we frame active inference through the lens of control as inference (CaI), a body of work that presents trajectory optimization as inference. From the wider view of `probabilistic numerics', CaI offers principled, numerically robust optimal control solvers that provide uncertainty quantification, and can scale to nonlinear problems with approximate inference. We show that AI may be framed as partially-observed CaI when the cost function is defined specifically in the observation states.
Multi-Target Prediction: A Unifying View on Problems and Methods
Waegeman, Willem, Dembczynski, Krzysztof, Huellermeier, Eyke
Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse type. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research.
Explanation-Based Generalization: A Unifying View
Mitchell, T. M., Keller, R., Kedar-Cabelli, S.
"The problem of formulating general concepts from specific training examples has long been a major focus of machine learning research. While most previous research has focused on empirical methods for generalizing from a large number of training examples using no domain-specific knowledge, in the past few years new methods have been developed for applying domain-specific knowledge to formulate valid generalizations from single training examples. The characteristic common to these methods is that their ability to generalize from a single example follows from their ability to explain why the training example is a member of the concept being learned. This paper proposes a general, domain-independent mechanism, called EBG, that unifies previous approaches to explanation-based generalization. The EBG method is illustrated in the context of several example problems, and used to contrast several existing systems for explanation-based generalization. The perspective on explanation-based generalization afforded by this general method is also used to identify open research problems in this area." Machine Learning, 1 (1), 47–80.