Autonomous Learning of Features for Control: Experiments with Embodied and Situated Agents
Milano, Nicola, Nolfi, Stefano
arXiv.org Artificial Intelligence
Indeed, previous works demonstrated that combined models of this type can speed up learning and/or achieve better performance in continuous problem domains as well. In particular, the research reported in (Riedmiller & Voigtländer, 2012; Mattner, Lange & Riedmiller, 2012; Ha & Schmidhuber, 2018) demonstrated that adding a feature-extraction network is beneficial, at least for problems that can benefit from dimensionality reduction and that involve a perspective transformation of the observation states. In this paper we report new data that provide further evidence of the utility of feature extraction, allow the relative efficacy of alternative methods to be compared, and demonstrate the importance of continuing to update the extracted features during the training of the policy network. The reported data further support the hypothesis that feature extraction can enhance learning, including in continuous problem domains in which relevant features extend over space and time. Indeed, using feature extraction enabled us to obtain significantly better results in 3 of the 4 problems considered. The use of problems that involve agents operating on the basis of egocentric information, instead of allocentric information as in previous studies, demonstrates that feature extraction can be advantageous in general, irrespective of the need to perform a perspective transformation. Moreover, the use of problems that involve relatively compact observation vectors, rather than large observation vectors as in previous studies, demonstrates that feature extraction can also be advantageous in problems that do not benefit from dimensionality reduction.
The data collected by training the feature-extraction network either before the policy network, as in previous studies, or also during the training of the policy network demonstrate that the latter technique is much more effective and that the method proposed in this paper for realizing continuous training is sound. Finally, the comparison of different self-supervised techniques for extracting useful features demonstrates that sequence-to-sequence learning produces the best results and outperforms the other methods used in previous studies on the problems considered.
Sep-15-2020
- Country:
  - Europe > Italy (0.04)
  - Oceania > Australia > Queensland > Brisbane (0.04)
  - North America > United States > New York > New York County > New York City (0.04)
  - North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
  - Asia > Middle East > Qatar (0.04)
- Genre:
  - Research Report > Experimental Study (0.50)
  - Research Report > New Finding (0.47)
- Industry:
  - Leisure & Entertainment > Sports (0.30)
- Technology: