Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion
Grossman, Lev; Plancher, Brian
arXiv.org Artificial Intelligence
Deep reinforcement learning (DRL) continues to see increased attention by the robotics community due to its ability to learn complex behaviors in both simulated and real environments. These methods have been successfully applied to a host of robotic tasks, including dexterous manipulation [1], quadrupedal locomotion [2], and high-speed drone racing [3]. Despite these successes, DRL remains largely sample inefficient, depending on enormous amounts of training data to learn. As much of this data is kept in replay buffers during training, DRL is extremely memory ...

Importantly, unlike simply reducing the number of observations stored in the buffer, which decreases the memory footprint at the cost of reduced learning performance, our quantization scheme is able to reduce memory usage without impacting the training performance. We present experiments across four popular simulated robotic locomotion domains, using two of the most popular DRL algorithms, the on-policy Proximal Policy Optimization (PPO) and the off-policy Soft Actor-Critic (SAC), and find that our approach can reduce the memory footprint by as much as 4.2× without impacting training performance.
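The excerpt above does not spell out the quantization scheme itself, so the following is only an illustrative sketch of the general idea: rounding each observation to a lower-precision format before it is stored in the buffer. The class name `PackedObservationBuffer`, the choice of IEEE half precision, and the sample values are all assumptions made for illustration, not the authors' implementation. The sketch only compares float32 and float16 storage, so it shows a 2× reduction rather than the paper's reported figure.

```python
import struct


class PackedObservationBuffer:
    """Hypothetical toy buffer that stores each observation as packed bytes.

    fmt is a struct format character: "f" for float32, "e" for float16.
    Values outside the float16 range make struct.pack raise OverflowError.
    """

    def __init__(self, fmt="e"):
        self.fmt = fmt
        self.item_size = struct.calcsize("<" + fmt)
        self._rows = []

    def add(self, obs):
        # Packing rounds each value to the nearest representable number.
        self._rows.append(struct.pack(f"<{len(obs)}{self.fmt}", *obs))

    def get(self, index):
        row = self._rows[index]
        count = len(row) // self.item_size
        return list(struct.unpack(f"<{count}{self.fmt}", row))

    def nbytes(self):
        # Payload bytes only; Python object overhead is ignored.
        return sum(len(row) for row in self._rows)


# Assumed sample observation (e.g. a few joint positions and velocities).
observation = [0.5, -1.25, 3.0, 0.1]

full = PackedObservationBuffer("f")   # float32 storage
half = PackedObservationBuffer("e")   # float16 storage
full.add(observation)
half.add(observation)

restored = half.get(0)
max_error = max(abs(a - b) for a, b in zip(observation, restored))
```

In this sketch the half-precision buffer uses half the bytes of the float32 one, and the rounding error on values of this magnitude stays small. Whether such rounding leaves training performance intact is the empirical question the paper addresses; the sketch does not demonstrate it.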
Apr-22-2023
- Country:
- North America > United States (0.04)
- Asia
- Middle East > Jordan (0.04)
- South Korea > Incheon
- Incheon (0.04)
- Genre:
- Research Report (0.45)
- Industry:
- Leisure & Entertainment (0.48)
- Transportation (0.34)
- Information Technology (0.34)
- Technology: