Collaborating Authors

 Yang, Ruihan


Neural Volumetric Memory for Visual Locomotion Control

arXiv.org Artificial Intelligence

Legged robots have the potential to expand the reach of autonomy beyond paved roads. In this work, we consider the difficult problem of locomotion on challenging terrains using a single forward-facing depth camera. Due to the partial observability of the problem, the robot has to rely on past observations to infer the terrain currently beneath it. To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric memory architecture that explicitly accounts for the SE(3) equivariance of the 3D world. NVM aggregates feature ...

The control of such behaviors requires tight coupling with perception because vision is needed to provide details of the terrain right beneath the robot and the 3D scene immediately around it. This problem is also partially observable. Immediately relevant terrain information is often occluded from the robot's current frame of observation, forcing it to rely on past observations for control decisions. For this reason, while blind controllers that are learned in simulation using reinforcement learning have achieved impressive results in agility and robustness [33, 36, 38], there are clear limitations on how much they can do. How to incorporate perception into the pipeline to produce an integrated visuomotor controller thus remains an open problem.


Diffusion Probabilistic Modeling for Video Generation

arXiv.org Artificial Intelligence

Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against five baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality for all datasets. Furthermore, by introducing a scalable version of the Continuous Ranked Probability Score (CRPS) applicable to video, we show that our model also outperforms existing approaches in their probabilistic frame forecasting ability.
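The abstract does not say how the scalable video variant of CRPS is constructed. As a rough illustration only, the sketch below uses the standard ensemble estimator of CRPS for a scalar, CRPS ≈ E|X − y| − ½·E|X − X′|, and averages it over pixels. The per-pixel averaging, the function names `crps_ensemble` and `mean_pixel_crps`, and the flat-list frame layout are assumptions made for this example, not the paper's definition.

```python
from itertools import product


def crps_ensemble(samples, observation):
    """Ensemble estimate of CRPS for one scalar: E|X - y| - 0.5 * E|X - X'|.

    `samples` are draws from the forecast distribution and `observation`
    is the value that actually occurred. Lower is better.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one forecast sample")
    # Average distance between forecast samples and the observation.
    term_obs = sum(abs(x - observation) for x in samples) / m
    # Average distance between all ordered pairs of forecast samples.
    term_spread = sum(abs(a - b) for a, b in product(samples, repeat=2)) / (m * m)
    return term_obs - 0.5 * term_spread


def mean_pixel_crps(sample_frames, observed_frame):
    """Average the scalar CRPS over pixels (an assumption for this sketch).

    `sample_frames` is a list of sampled frames and `observed_frame` is the
    ground-truth frame; every frame is a flat list of pixel intensities.
    """
    n = len(observed_frame)
    total = 0.0
    for i in range(n):
        pixel_samples = [frame[i] for frame in sample_frames]
        total += crps_ensemble(pixel_samples, observed_frame[i])
    return total / n
```

With a single sample the estimator reduces to the absolute error, so a deterministic forecaster can be scored on the same scale as a stochastic one.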


Deep Music Analogy Via Latent Representation Disentanglement

arXiv.org Machine Learning

Analogy is a key solution to automated music generation, characterized by its ability to generate both natural and creative pieces based on only a few examples. In general, an analogy is made by partially transferring the music abstractions, i.e., high-level representations and their relationships, from one piece to another; however, this procedure requires disentangling music representations, which takes little effort for musicians but is non-trivial for computers. Three sub-problems arise: extracting latent representations from the observation, disentangling the representations so that each part has a unique semantic interpretation, and mapping the latent representations back to actual music. An explicitly-constrained conditional variational auto-encoder (EC2-VAE) is proposed as a unified solution to all three sub-problems. In this study, we focus on disentangling the pitch and rhythm representations of 8-beat music clips conditioned on chords. In producing music analogies, this model helps us to realize the imaginary situation of "what if" a piece were composed using a different pitch contour, rhythm pattern, chord progression, etc., by borrowing the representations from other pieces. Finally, we validate the proposed disentanglement method using objective measurements and evaluate the analogy examples through a subjective study.


Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy

arXiv.org Machine Learning

A fundamental issue in reinforcement learning algorithms is the balance between exploration of the environment and exploitation of information already obtained by the agent. In particular, exploration plays a critical role in both the efficiency and the efficacy of the learning process. However, existing exploration methods rely on task-agnostic designs, which may perform well in one environment but be ill-suited to another. To learn an effective and efficient exploration policy in an automated manner, we formalize a feasible metric for measuring the utility of exploration based on counterfactual reasoning. Based on that metric, we propose an end-to-end algorithm that learns an exploration policy by meta-learning. We demonstrate that our method achieves good results compared to previous works on high-dimensional control tasks in the MuJoCo simulator.


Artificial Intelligence for Prosthetics - challenge solutions

arXiv.org Machine Learning

In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe their algorithms. In this work, we describe the challenge and present thirteen solutions that used deep reinforcement learning approaches. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each team implemented different modifications of the known algorithms by, for example, dividing the task into subtasks, learning low-level control, or by incorporating expert knowledge and using imitation learning.
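Two of the heuristics named above, frame skipping and discretization of the action space, can be sketched generically. This is a minimal illustration, not any team's implementation: `CountingEnv` is a hypothetical toy environment, the `(observation, reward, done)` step signature is an assumption, and the two-level activation set `(0.0, 1.0)` is chosen only for the example.

```python
class CountingEnv:
    """Hypothetical toy environment: reward 1.0 per step, ends after `horizon` steps."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class FrameSkip:
    """Repeat each chosen action `skip` times and sum the rewards."""

    def __init__(self, env, skip):
        if skip < 1:
            raise ValueError("skip must be at least 1")
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                # Stop repeating once the episode has ended.
                break
        return obs, total_reward, done


def discretize(index, n_dims, levels=(0.0, 1.0)):
    """Decode an integer into one activation level per action dimension.

    The index is read as a base-len(levels) number, least significant
    digit first, so n_dims dimensions give len(levels) ** n_dims actions.
    """
    base = len(levels)
    action = []
    for _ in range(n_dims):
        index, digit = divmod(index, base)
        action.append(levels[digit])
    return action
```

For example, wrapping `CountingEnv(horizon=10)` with `skip=4` means the agent makes three decisions per episode instead of ten, and `discretize(5, 3)` decodes to `[1.0, 0.0, 1.0]`.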