AITopics | visitation

visitation

Learning Process Rewards via Success Visitation Matching for Efficient RL

Tsao, Raymond, Wagenmaker, Andrew, Levine, Sergey

arXiv.org Machine Learning | Jun-23-2026

In many modern applications of reinforcement learning (RL), the natural reward for a task of interest is inherently sparse: a reward of 0 is given everywhere except when the task is completed, when a reward of +1 is given. Training a policy to maximize such a sparse reward requires solving a challenging credit assignment problem, leading to slow or ineffective RL improvement. We propose a simple approach to transform a sparse outcome reward into a dense process reward. Our approach relies on training a discriminator to distinguish between previous successful and unsuccessful episodes, and using this discriminator to incentivize the RL-learned policy to match the state-action visitations of successful episodes, while avoiding those of unsuccessful episodes. By incentivizing the policy to match the visitations over all states, not just those that correspond to task success, this reward provides dense feedback on whether progress is being made towards task completion, and, we show, provably achieves this without changing the optimal policy. Focusing on finetuning of robotic control policies, we demonstrate that our approach leads to significantly faster RL finetuning performance on both simulated and real-world manipulation tasks, as compared to simply maximizing the sparse outcome reward.
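The idea of turning a sparse outcome reward into a dense one with a success/failure discriminator can be illustrated with a small tabular sketch. This is not the authors' implementation: the count-based discriminator, the `smoothing` parameter, the log-odds reward form, and the names `fit_count_discriminator` and `dense_reward` are all assumptions made for illustration. The abstract does not give the exact reward formula.

```python
import math
from collections import Counter


def fit_count_discriminator(success_episodes, failure_episodes, smoothing=1.0):
    """Hypothetical tabular discriminator.

    Returns D(s, a): the smoothed fraction of visits to (s, a) that came
    from successful episodes. Each episode is a list of (state, action) pairs.
    """
    succ = Counter(sa for episode in success_episodes for sa in episode)
    fail = Counter(sa for episode in failure_episodes for sa in episode)

    def discriminator(state, action):
        s = succ[(state, action)] + smoothing
        f = fail[(state, action)] + smoothing
        return s / (s + f)

    return discriminator


def dense_reward(discriminator, state, action):
    """Assumed reward form: log-odds that (s, a) belongs to a successful episode."""
    d = discriminator(state, action)
    return math.log(d) - math.log(1.0 - d)


# Toy data: one successful and one unsuccessful episode.
success = [[("s0", "right"), ("s1", "right")]]
failure = [[("s0", "left"), ("s2", "left")]]

D = fit_count_discriminator(success, failure)
r_good = dense_reward(D, "s0", "right")   # visited only in the successful episode
r_bad = dense_reward(D, "s0", "left")     # visited only in the unsuccessful episode
r_unseen = dense_reward(D, "s9", "up")    # never visited, so D = 0.5
```

In this sketch, state-action pairs seen mostly in successful episodes receive a positive reward, pairs seen mostly in failures receive a negative one, and unvisited pairs receive zero, so the policy gets feedback at every step rather than only at task completion.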

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Machine Learning

2606.2364

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Checklist

Neural Information Processing Systems | Apr-25-2026, 17:32:32 GMT

The checklist follows the references. Please read the checklist guidelines carefully for information on how to answer these questions. You are strongly encouraged to include a justification to your answer, either by referencing the appropriate section of your paper or providing a brief inline description. Please do not modify the questions and only use the provided macros for your answers. Note that the Checklist section does not count towards the page limit. In your paper, please delete this instructions block and only keep the Checklist section heading above along with the questions/answers below.

machine learning, reinforcement learning, transition, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training

Bhaskara, Vin, Wang, Haicheng

arXiv.org Machine Learning | Apr-22-2026

Local prediction-error-based curiosity rewards focus on the current transition without considering the world model's cumulative prediction error across all visited transitions. We introduce Curiosity-Critic, which grounds its intrinsic reward in the improvement of this cumulative objective, and show that it reduces to a tractable per-step form: the difference between the current prediction error and the asymptotic error baseline of the current state transition. We estimate this baseline online with a learned critic co-trained alongside the world model; regressing a single scalar, the critic converges well before the world model saturates, redirecting exploration toward learnable transitions without oracle knowledge of the noise floor. The reward is higher for learnable transitions and collapses toward the baseline for stochastic ones, effectively separating epistemic (reducible) from aleatoric (irreducible) prediction error online. Prior prediction-error curiosity formulations, from Schmidhuber (1991) to learned-feature-space variants, emerge as special cases corresponding to specific approximations of this baseline. Experiments on a stochastic grid world show that Curiosity-Critic outperforms prediction-error and visitation-count baselines in convergence speed and final world model accuracy.
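The per-step form described in the abstract can be written compactly as follows. The symbols are assumptions introduced here, not the paper's notation: $e_t$ stands for the world model's current prediction error on the transition $(s_t, a_t, s_{t+1})$, and $\hat{b}_\phi$ for the critic's online estimate of that transition's asymptotic error baseline.

```latex
r_t \;=\; e_t \;-\; \hat{b}_\phi(s_t, a_t, s_{t+1})
```

Read this way, a transition whose current error still sits well above its estimated baseline yields a large reward, while a transition whose error has already reached its baseline yields a reward near zero.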

artificial intelligence, machine learning, transition, (18 more...)

arXiv.org Machine Learning

2604.18701

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

59112692262234e3fad47fa8eabf03a4-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 19:31:38 GMT

However, extrinsic rewards may be insufficiently informative to encourage an agent to explore and understand its environment, particularly in partially observed settings where the agent has a limited view of its environment.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Massachusetts (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Checklist

Neural Information Processing Systems | Feb-8-2026, 11:56:10 GMT

The checklist follows the references. Please do not modify the questions and only use the provided macros for your answers. The Checklist section does not count towards the page limit. Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope? Did you describe the limitations of your work? Did you discuss any potential negative societal impacts of your work?

machine learning, reinforcement learning, transition, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

A Code

Neural Information Processing Systems | Aug-15-2025, 17:27:30 GMT

We convert all images to grayscale and resize to 84x84. It is a convolutional neural network with fixed random weights. In Atari, we use 128 parallel environments, and in Habitat, we use 1 environment, as it does not support multithreading. We use the same hyperparameters as in large scale curiosity: a learning rate of 0.0001 for all models, a discount factor … Future prediction and multimodal association can be complementary forms of curiosity. Further work could explore other ways of combining intrinsic rewards, such as switching between the complementary forms.
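The preprocessing step mentioned above (grayscale conversion followed by a resize to 84x84) might look like the sketch below. The luma weights, the nearest-neighbor interpolation, and the function names `to_grayscale` and `resize_nearest` are assumptions chosen for illustration; the excerpt does not say which conversion or interpolation the authors used.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values.

    Uses the common 0.299/0.587/0.114 luma weights (an assumption here).
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def resize_nearest(image, out_h=84, out_w=84):
    """Resize a 2D image (list of rows) by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Toy 210x160 frame (the native Atari resolution): left half white, right half black.
frame = [
    [(255, 255, 255) if x < 80 else (0, 0, 0) for x in range(160)]
    for _ in range(210)
]
obs = resize_nearest(to_grayscale(frame))
```

A real pipeline would more likely use an image library with area or bilinear interpolation; the pure-Python version is only meant to make the two steps explicit.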

baseline, curiosity, experiment, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Information is Power: Intrinsic Control via Information Capture

Neural Information Processing Systems | Aug-14-2025, 16:06:40 GMT

Figure 1: The agent uses a latent state space model to represent beliefs about the world, including dynamic objects like the goat. The blue window represents the agent's field-of-view, which defines the extent of the

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Massachusetts (0.04)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning

Wang, Yiran, Liu, Chenshu, Li, Yunfan, Amani, Sanae, Zhou, Bolei, Yang, Lin F.

arXiv.org Machine Learning | Dec-4-2024

The exploration & exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods achieved great success in tackling hard-exploration problems. However, they necessitate extensive hyperparameter tuning on different environments, which heavily limits the applicability and accessibility of this line of methods. In this paper, we characterize this problem via analysis of the agent behavior, concluding the fundamental difficulty of choosing a proper hyperparameter. We then identify the difficulty and the instability of the optimization when the agent learns with curiosity. We propose our method, hyperparameter robust exploration (Hyper), which extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training. We theoretically justify that Hyper is provably efficient under function approximation setting and empirically demonstrate its appealing performance and robustness in various environments.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

2412.03767

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

This Joshua Tree search and rescue team tries to head off calamity before it strikes

Los Angeles Times | Sep-1-2024, 10:00:25 GMT

It's 4 p.m. in Joshua Tree National Park and the air temperature is hovering around 99 degrees -- relatively mild for an August afternoon. But at ground level, the sand along the popular Hidden Valley Nature Trail has reached a scorching 136. "I don't want my bare feet on that," says ranger Anna Marini as she shows her thermometer gun reading to a couple visiting from Switzerland, who are appropriately awed. Marini uses the tool as a prop to engage hikers traversing this surreal desert wilderness that's roughly the size of Rhode Island. As the coordinator of the park's Preventative Search and Rescue Program, her mission is to protect visitors from hazards that include extreme heat, razor-sharp cacti and thirsty bees.

artificial intelligence, marini, visitation, (9 more...)

Los Angeles Times

Country:

North America > United States > Rhode Island (0.25)
Europe > Switzerland (0.25)
North America > United States > District of Columbia > Washington (0.05)
North America > United States > California (0.05)

Industry: Health & Medicine > Consumer Health (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)

Leveraging Computer Vision in the Intensive Care Unit (ICU) for Examining Visitation and Mobility

Siegel, Scott, Zhang, Jiaqing, Bandyopadhyay, Sabyasachi, Nerella, Subhash, Silva, Brandon, Baslanti, Tezcan, Bihorac, Azra, Rashidi, Parisa

arXiv.org Artificial Intelligence | Jul-12-2024

Despite the importance of closely monitoring patients in the Intensive Care Unit (ICU), many aspects are still assessed in a limited manner due to the time constraints imposed on healthcare providers. For example, although excessive visitations during rest hours can potentially exacerbate the risk of circadian rhythm disruption and delirium, it is not captured in the ICU. Likewise, while mobility can be an important indicator of recovery or deterioration in ICU patients, it is only captured sporadically or not captured at all. In the past few years, the computer vision field has found application in many domains by reducing the human burden. Using computer vision systems in the ICU can also potentially enable non-existing assessments or enhance the frequency and accuracy of existing assessments while reducing the staff workload. In this study, we leverage a state-of-the-art noninvasive computer vision system based on depth imaging to characterize ICU visitations and patients' mobility. We then examine the relationship between visitation and several patient outcomes, such as pain, acuity, and delirium. We found an association between deteriorating patient acuity and the incidence of delirium with increased visitations. In contrast, self-reported pain, reported using the Defense and Veteran Pain Rating Scale (DVPRS), was correlated with decreased visitations. Our findings highlight the feasibility and potential of using noninvasive autonomous systems to monitor ICU patients.

delirium, intensive care unit, publisher, (11 more...)

arXiv.org Artificial Intelligence

2403.06322

Country: