pessimistic
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Motivated by the success of ensembles for uncertainty estimation in supervised learning, we take a renewed look at how ensembles of $Q$-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL). We begin by identifying a critical flaw in a popular algorithmic choice used by many ensemble-based RL algorithms, namely the use of shared pessimistic target values when computing each ensemble member's Bellman error. Through theoretical analyses and construction of examples in toy MDPs, we demonstrate that shared pessimistic targets can paradoxically lead to value estimates that are effectively optimistic. Given this result, we propose MSG, a practical offline RL algorithm that trains an ensemble of $Q$-functions with independently computed targets based on completely separate networks, and optimizes a policy with respect to the lower confidence bound of predicted action values. Our experiments on the popular D4RL and RL Unplugged offline RL benchmarks demonstrate that on challenging domains such as antmazes, MSG with deep ensembles surpasses well-tuned state-of-the-art methods by a wide margin. Additionally, through ablations on benchmark domains, we verify the critical significance of using independently trained $Q$-functions and study the role of ensemble size. Finally, as using separate networks per ensemble member can become computationally costly with larger neural network architectures, we investigate whether efficient ensemble approximations developed for supervised learning can be similarly effective, and demonstrate that they do not match the performance and robustness of MSG with separate networks, highlighting the need for new efforts on efficient uncertainty estimation directed at RL.
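To make the distinction the abstract draws concrete, the sketch below trains each ensemble member against its own target network and has the policy ascend the ensemble's lower confidence bound. This is a minimal PyTorch sketch under assumed names (`QNet`, `msg_critic_loss`, `msg_actor_loss`, `beta`), not the authors' released implementation, and it omits practical details such as any policy regularization.

```python
# Minimal sketch of independent-target ensemble training for offline RL,
# assuming continuous actions. All names here are illustrative, not taken
# from the paper's code release.
import torch
import torch.nn as nn


class QNet(nn.Module):
    """A small state-action value network."""

    def __init__(self, s_dim, a_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(s_dim + a_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1)).squeeze(-1)


def msg_critic_loss(qs, q_targets, policy, batch, gamma=0.99):
    """Sum of per-member Bellman errors, each with its OWN target network.

    The flawed alternative the abstract warns about would instead compute a
    single shared pessimistic target (e.g. a lower confidence bound over all
    target networks) and regress every member toward that one value.
    """
    s, a, r, s_next, done = batch
    with torch.no_grad():
        a_next = policy(s_next)
    loss = 0.0
    for q, q_tgt in zip(qs, q_targets):
        with torch.no_grad():
            # Member i bootstraps only from its own target network.
            y = r + gamma * (1.0 - done) * q_tgt(s_next, a_next)
        loss = loss + nn.functional.mse_loss(q(s, a), y)
    return loss


def msg_actor_loss(qs, policy, s, beta=4.0):
    """Policy ascends the ensemble lower confidence bound: mean - beta * std."""
    a = policy(s)
    q_vals = torch.stack([q(s, a) for q in qs])  # [num_members, batch]
    lcb = q_vals.mean(dim=0) - beta * q_vals.std(dim=0)
    return -lcb.mean()
```

The key design point is that pessimism enters only through the actor's lower confidence bound; the critic targets themselves stay independent, which is what keeps the ensemble's members diverse enough to measure uncertainty.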
COVID consumers: Pessimistic, but spending more online - Search Engine Land
Consumer sentiment has turned sharply negative as the virus has disrupted every aspect of daily American life. According to a consumer survey from Engine, 88% of consumers in the U.S. are now concerned about the pandemic. And according to another survey of roughly 2,600 U.S. adults from L.E.K. Consulting and Civis, between 80% and 90% of adults expect a recession next year. In addition to measuring consumer sentiment, the latter survey explored how the coronavirus has shifted buying patterns across industries. Generally, it finds "significant increases in at-home activities, particularly cooking at home, watching television, browsing social media and exercising at home."
- Health & Medicine (0.85)
- Retail (0.53)
Decision Making with Dynamic Uncertain Events
Kalech, Meir, Reches, Shulamit
When to make a decision is a key question in decision-making problems characterized by uncertainty. In this paper we deal with decision making in environments where information arrives dynamically, and we address the tradeoff between waiting and stopping strategies. On the one hand, waiting to obtain more information reduces uncertainty, but it comes at a cost. On the other hand, stopping and deciding based on the current expected utility avoids the cost of waiting, but the decision rests on uncertain information. We propose an optimal algorithm and two approximation algorithms. We prove that one approximation is optimistic, waiting at least as long as the optimal algorithm, while the other is pessimistic, stopping no later than the optimal algorithm. We evaluate our algorithms theoretically and empirically, showing that the decision quality of both approximations is near-optimal while they run much faster than the optimal algorithm. The experiments also indicate that the cost function is a key factor in choosing the most effective algorithm.
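To illustrate the waiting-versus-stopping tradeoff the abstract describes, here is a toy one-step-lookahead rule in Python. The two-state model, the `should_stop` rule, and every parameter name are my own illustrative assumptions; this myopic rule is not the paper's optimal algorithm or either of its approximations, only a small instance of the general cost/information tradeoff it studies.

```python
# Hypothetical two-state, two-signal model (not the authors' algorithms):
# noisy binary evidence arrives each step, waiting costs `wait_cost` per
# step, and a myopic rule stops once one more observation is no longer
# expected to pay for itself.


def best_action_utility(p_good, utilities):
    """Expected utility of the best action given P(state = good)."""
    return max(p_good * u_good + (1 - p_good) * u_bad
               for u_good, u_bad in utilities)


def posterior(p_good, signal, acc=0.8):
    """Bayes update for a binary signal with accuracy `acc`."""
    like_good = acc if signal else 1 - acc
    like_bad = (1 - acc) if signal else acc
    z = p_good * like_good + (1 - p_good) * like_bad
    return p_good * like_good / z


def should_stop(p_good, utilities, wait_cost, acc=0.8):
    """Myopic rule: stop if one more signal isn't worth its cost."""
    stop_value = best_action_utility(p_good, utilities)
    # Marginal probability of observing signal = 1 under current belief.
    p_sig = p_good * acc + (1 - p_good) * (1 - acc)
    wait_value = (
        p_sig * best_action_utility(posterior(p_good, True, acc), utilities)
        + (1 - p_sig) * best_action_utility(posterior(p_good, False, acc),
                                            utilities)
    ) - wait_cost
    return stop_value >= wait_value


# Example: two actions (invest / hold) whose payoffs depend on the state.
utilities = [(10.0, -8.0), (1.0, 1.0)]  # (payoff if good, payoff if bad)
print(should_stop(0.55, utilities, wait_cost=0.5))  # False: worth waiting
print(should_stop(0.95, utilities, wait_cost=0.5))  # True: decide now
```

The two printed cases show the tradeoff directly: under an uncertain belief (0.55) another observation is expected to improve the decision by more than its cost, while under a confident belief (0.95) waiting no longer pays and the rule stops.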