visitation
A Code
Input preprocessing. We convert all images to grayscale and resize them to 84x84. [...] It is a convolutional neural network with fixed random weights. In Atari, we use 128 parallel environments, and in Habitat, we use 1 environment, as it does not support multithreading. We use the same hyperparameters as in large scale curiosity: a learning rate of 0.0001 for all models, a discount factor [...] Future prediction and multimodal association can be complementary forms of curiosity. Further work could explore other ways of combining intrinsic rewards, such as switching between the complementary forms.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- North America > United States > Massachusetts (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
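The input preprocessing and the stated training settings in the "A Code" entry above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `preprocess_frame`, the `SETTINGS` dictionary, the BT.601 luma weights, nearest-neighbour resizing, and scaling to [0, 1] are all assumptions, since the excerpt only says "grayscale" and "resize to 84x84".

```python
import numpy as np

# Values quoted in the excerpt; the dictionary layout itself is hypothetical.
SETTINGS = {
    "learning_rate": 1e-4,     # "a learning rate of 0.0001 for all models"
    "num_envs_atari": 128,     # "128 parallel environments" in Atari
    "num_envs_habitat": 1,     # Habitat "does not support multithreading"
    "frame_size": 84,          # frames are resized to 84x84
}


def preprocess_frame(frame, size=SETTINGS["frame_size"]):
    """Convert an RGB frame of shape (H, W, 3) to a (size, size) grayscale array.

    Returns float32 values scaled to [0, 1] (the scaling is an assumption).
    """
    frame = np.asarray(frame, dtype=np.float32)
    # Assumption: ITU-R BT.601 luma weights; the source does not name a formula.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame @ weights  # shape (H, W)
    h, w = gray.shape
    # Assumption: nearest-neighbour sampling; the source does not name a method.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = gray[rows[:, None], cols[None, :]]  # shape (size, size)
    return resized / 255.0


# Usage: an Atari-sized dummy frame (210x160 RGB) becomes an 84x84 array.
dummy = np.full((210, 160, 3), 255, dtype=np.uint8)
obs = preprocess_frame(dummy)
```

A real pipeline would more likely use an image library's area or bilinear interpolation; nearest-neighbour is used here only to keep the sketch dependency-free.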
Checklist
The checklist follows the references. Please do not modify the questions and only use the provided macros for your answers. The checklist section does not count towards the page limit. Do the main claims made in the abstract and introduction accurately reflect the paper's [...] Did you describe the limitations of your work? Did you discuss any potential negative societal impacts of your work?
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- North America > United States > Massachusetts (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning
Wang, Yiran, Liu, Chenshu, Li, Yunfan, Amani, Sanae, Zhou, Bolei, Yang, Lin F.
The exploration & exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods have achieved great success in tackling hard-exploration problems. However, they necessitate extensive hyperparameter tuning on different environments, which heavily limits the applicability and accessibility of this line of methods. In this paper, we characterize this problem via analysis of the agent behavior, concluding that choosing a proper hyperparameter is fundamentally difficult. We then identify the difficulty and the instability of the optimization when the agent learns with curiosity. We propose our method, hyperparameter robust exploration (Hyper), which extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training. We theoretically justify that Hyper is provably efficient under the function approximation setting and empirically demonstrate its appealing performance and robustness in various environments.
This Joshua Tree search and rescue team tries to head off calamity before it strikes
It's 4 p.m. in Joshua Tree National Park and the air temperature is hovering around 99 degrees -- relatively mild for an August afternoon. But at ground level, the sand along the popular Hidden Valley Nature Trail has reached a scorching 136. "I don't want my bare feet on that," says ranger Anna Marini as she shows her thermometer gun reading to a couple visiting from Switzerland, who are appropriately awed. Marini uses the tool as a prop to engage hikers traversing this surreal desert wilderness that's roughly the size of Rhode Island. As the coordinator of the park's Preventative Search and Rescue Program, her mission is to protect visitors from hazards that include extreme heat, razor-sharp cacti and thirsty bees.
- North America > United States > Rhode Island (0.25)
- Europe > Switzerland (0.25)
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > California (0.05)
Leveraging Computer Vision in the Intensive Care Unit (ICU) for Examining Visitation and Mobility
Siegel, Scott, Zhang, Jiaqing, Bandyopadhyay, Sabyasachi, Nerella, Subhash, Silva, Brandon, Baslanti, Tezcan, Bihorac, Azra, Rashidi, Parisa
Despite the importance of closely monitoring patients in the Intensive Care Unit (ICU), many aspects are still assessed in a limited manner due to the time constraints imposed on healthcare providers. For example, although excessive visitations during rest hours can potentially exacerbate the risk of circadian rhythm disruption and delirium, visitations are not captured in the ICU. Likewise, while mobility can be an important indicator of recovery or deterioration in ICU patients, it is only captured sporadically or not captured at all. In the past few years, the computer vision field has found application in many domains by reducing the human burden. Using computer vision systems in the ICU can also potentially enable assessments that do not currently exist or enhance the frequency and accuracy of existing assessments while reducing the staff workload. In this study, we leverage a state-of-the-art noninvasive computer vision system based on depth imaging to characterize ICU visitations and patients' mobility. We then examine the relationship between visitation and several patient outcomes, such as pain, acuity, and delirium. We found that increased visitations were associated with deteriorating patient acuity and with the incidence of delirium. In contrast, self-reported pain, reported using the Defense and Veteran Pain Rating Scale (DVPRS), was correlated with decreased visitations. Our findings highlight the feasibility and potential of using noninvasive autonomous systems to monitor ICU patients.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Greece > Central Macedonia > Thessaloniki (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation
Yang, Zhao, Moerland, Thomas M., Preuss, Mike, Plaat, Aske
Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards. The key insight of Go-Explore was that successful exploration requires an agent to first return to an interesting state ('Go'), and only then explore into unknown terrain ('Explore'). We refer to such exploration after a goal is reached as 'post-exploration'. In this paper, we present a clear ablation study of post-exploration in a general intrinsically motivated goal exploration process (IMGEP) framework, which the Go-Explore paper did not show. We study the isolated potential of post-exploration by turning it on and off within the same algorithm, under both tabular and deep RL settings, on both discrete navigation and continuous control tasks. Experiments on a range of MiniGrid and MuJoCo environments show that post-exploration indeed helps IMGEP agents reach more diverse states and boosts their performance. In short, our work suggests that RL researchers should consider using post-exploration in IMGEP when possible, since it is effective, method-agnostic and easy to implement.
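The 'Go' then 'Explore' pattern described in this abstract can be illustrated with a toy loop. This is only a sketch under strong simplifying assumptions, not the paper's IMGEP algorithm: the 1-D chain environment, the function name `go_then_explore`, and all parameters are hypothetical. Goals are sampled only from already visited states and the return to the goal is assumed to succeed, so in this toy every newly discovered state comes from post-exploration. In the paper's setting, the goal-reaching phase itself can also discover states.

```python
import random


def go_then_explore(n_states=20, episodes=50, post_steps=5, seed=0):
    """Toy 'Go' / post-exploration loop on a chain of states 0..n_states-1.

    Returns the set of visited states. Setting post_steps=0 switches
    post-exploration off.
    """
    rng = random.Random(seed)
    visited = {0}  # the agent starts in state 0
    for _ in range(episodes):
        # 'Go': choose a previously visited state as the goal and return to it.
        # Simplification: the return always succeeds and finds nothing new.
        state = rng.choice(sorted(visited))
        # 'Explore' (post-exploration): random steps taken after the goal
        # is reached, clamped to the ends of the chain.
        for _ in range(post_steps):
            state = min(max(state + rng.choice((-1, 1)), 0), n_states - 1)
            visited.add(state)
    return visited


# Usage: compare coverage with post-exploration switched off and on.
without_post = go_then_explore(post_steps=0)
with_post = go_then_explore(post_steps=5)
```

Toggling `post_steps` mirrors, very loosely, the on/off ablation the abstract describes; the toy says nothing about how large the benefit is in real environments.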