behavior policy
Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England (0.04)
Technology:
31839b036f63806cba3f47b93af8ccb5-Paper.pdf
Offline reinforcement learning (RL) tasks require the agent to learn from a precollected dataset with no further interactions with the environment. Despite the potential tosurpass thebehavioral policies, RL-based methods aregenerally impractical duetothetraining instability andbootstrapping theextrapolation errors, which always require careful hyperparameter tuning via online evaluation.
Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Country:
Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Country:
- Asia > China > Shanghai > Shanghai (0.40)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Technology:
Country:
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (14 more...)
Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Country:
- Asia > Middle East > Jordan (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > New Jersey (0.04)
- (2 more...)
Genre:
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Workflow (0.68)
Industry:
- Leisure & Entertainment > Games (0.46)
- Education > Educational Setting (0.46)
Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (4 more...)
Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Optimal Treatment Allocation for Efficient Policy Evaluation in Sequential Decision Making Ting Li
A/B testing is critical for modern technological companies to evaluate the effectiveness of newly developed products against standard baselines. This paper studies optimal designs that aim to maximize the amount of information obtained from online experiments to estimate treatment effects accurately.
Country:
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > North Carolina (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
Genre:
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Industry:
- Health & Medicine (1.00)
- Transportation > Passenger (0.46)
Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)