Offline Behavior Distillation
Inspired by dataset distillation (DD) [Wang et al., 2018, Zhao et al., … (Corollary 1). Extensive experiments on nine datasets of the D4RL benchmark [Fu et al., 2020], spanning multiple environments and data qualities, show that our Av-PBC markedly improves OBD performance. Moreover, Av-PBC converges significantly faster, requiring only a quarter of the distillation steps needed by DBC and PBC.
Genre:
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
Country:
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Middle East > Jordan (0.04)
Industry:
- Information Technology (0.93)
- Telecommunications (0.68)
Technology:
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Robots (0.67)
Country:
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Portugal > Porto > Porto (0.04)
Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Genre:
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Country:
- Asia > China > Tianjin Province > Tianjin (0.04)
- North America > United States > Montana (0.04)
- North America > Canada > Quebec > Montreal (0.04)
Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Genre:
- Research Report (0.69)
- Instructional Material (0.46)
Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
Technology:
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Country:
- Europe > United Kingdom (0.04)
- Europe > France (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
Genre:
- Research Report > New Finding (1.00)
- Personal (0.67)
Technology:
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Industry:
- Government (0.67)
- Information Technology (0.46)
Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)