Teaching Precommitted Agents: Model-Free Policy Evaluation and Control in Quasi-Hyperbolic Discounted MDPs
Time-inconsistent preferences, where agents favor smaller-sooner over larger-later rewards, are a key feature of human and animal decision-making. Quasi-Hyperbolic (QH) discounting provides a simple yet powerful model for this behavior, but its integration into the reinforcement learning (RL) framework has been limited. We make two primary contributions: (i) we formally characterize the structure of the optimal policy, proving for the first time that it reduces to a simple one-step non-stationary form; and (ii) we design the first practical, model-free algorithms for both policy evaluation and Q-learning in this setting, both with provable convergence guarantees. Our results provide foundational insights for incorporating QH preferences in RL.

Reinforcement learning (RL) [12] provides a powerful framework for sequential decision-making, where an agent interacts with an environment to maximize long-term cumulative rewards.
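QH (beta-delta) discounting weights an immediate reward fully and every later reward by an extra present-bias factor. A minimal sketch of the discount model itself (parameter values are illustrative, not taken from the paper):

```python
def qh_weights(beta, gamma, horizon):
    """Quasi-hyperbolic discount weights: 1, beta*gamma, beta*gamma^2, ..."""
    return [1.0 if k == 0 else beta * gamma ** k for k in range(horizon)]

def qh_return(rewards, beta=0.6, gamma=0.95):
    """QH-discounted value of a finite reward sequence, evaluated from step 0."""
    weights = qh_weights(beta, gamma, len(rewards))
    return sum(w * r for w, r in zip(weights, rewards))

# Present bias: an immediate reward of 10 beats a reward of 14 delayed by
# three steps under QH discounting, even though plain exponential
# discounting with the same gamma prefers the delayed 14.
immediate = [10.0]
delayed = [0.0, 0.0, 0.0, 14.0]
```

With these parameters, `qh_return(immediate)` is 10 while `qh_return(delayed)` is roughly 7.2, whereas exponential discounting values the delayed option at `0.95**3 * 14`, roughly 12.0; this is the smaller-sooner-over-larger-later preference the abstract describes.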
The Paradox of Doom: Acknowledging Extinction Risk Reduces the Incentive to Prevent It
Growiec, Jakub, Prettner, Klaus
We investigate the salience of extinction risk as a source of impatience. Our framework distinguishes between human extinction risk and individual mortality risk while allowing for various degrees of intergenerational altruism. Additionally, we consider the evolutionarily motivated "selfish gene" perspective. We find that the risk of human extinction is an indispensable component of the discount rate, whereas individual mortality risk can be hedged against - partially or fully, depending on the setup - through human reproduction. Overall, we show that in the face of extinction risk, people become more impatient rather than more farsighted. Thus, the greater the threat of extinction, the less incentive there is to invest in avoiding it. Our framework can help explain why humanity consistently underinvests in the mitigation of catastrophic risks, from climate change, through pandemic prevention, to the emerging risks of transformative artificial intelligence.
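The mechanism can be sketched with a standard hazard-augmented discount rate (an illustrative decomposition, not the authors' exact model): the effective rate adds the extinction hazard to pure time preference, while individual mortality enters only to the extent it cannot be hedged through reproduction:

```latex
\rho \;=\; \delta \;+\; h_{\mathrm{ext}} \;+\; (1-\alpha)\, h_{\mathrm{mort}},
\qquad 0 \le \alpha \le 1,
```

where $\delta$ is pure time preference, $h_{\mathrm{ext}}$ is the extinction hazard rate, $h_{\mathrm{mort}}$ is the individual mortality hazard, and $\alpha$ captures the degree to which mortality risk is hedged via offspring. Because $h_{\mathrm{ext}}$ raises $\rho$ one-for-one, a greater extinction threat makes agents discount the future more heavily, shrinking the present value of any investment in mitigation.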
Adaptive Traffic-Following Scheme for Orderly Distributed Control of Multi-Vehicle Systems
Jain, Anahita, Idris, Husni, Clarke, John-Paul, Delahaye, Daniel
We present an adaptive control scheme to enable the emergence of order within distributed, autonomous multi-agent systems. Past studies showed that under high-density conditions, order generated from traffic-following behavior reduces travel times, while under low densities, choosing direct paths is more beneficial. In this paper, we leveraged those findings to allow aircraft to independently and dynamically adjust their degree of traffic-following behavior based on the current state of the airspace. This enables aircraft to follow other traffic only when beneficial. Quantitative analyses revealed that dynamic traffic-following behavior results in lower aircraft travel times at the cost of minimal levels of additional disorder to the airspace. The sensitivity of these benefits to temporal and spatial horizons was also investigated. Overall, this work highlights the benefits, and potential necessity, of incorporating self-organizing behavior in making distributed, autonomous multi-agent systems scalable.

Autonomous vehicle operations are expected to increase in the airspace over the coming decades. Initially, applications will include non-passenger operations, such as fire fighting or cargo delivery using uncrewed aerial vehicles of different sizes. Eventually, the scope will expand to passenger-carrying vehicles for urban or regional air mobility. These vehicles are expected to interact and integrate with other traffic within the same airspace. For scalability, this airspace of the future will be a collective system of autonomous vehicles, where each vehicle makes increasingly independent decisions.
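One way to realize density-dependent traffic-following is to blend each vehicle's direct-to-goal heading with the local mean traffic heading, weighting the latter more as local density rises. A toy sketch of such a rule (the blending function and the `density_ref` parameter are illustrative assumptions, not the paper's controller):

```python
import math

def blend_heading(goal_heading, traffic_heading, density, density_ref=1.0):
    """Steer between the direct-to-goal heading and the local mean traffic
    heading, following traffic more strongly as local density rises.
    Headings are in radians; density is vehicles per unit area."""
    w = density / (density + density_ref)  # in [0, 1), grows with density
    # Interpolate on the circle via unit vectors to handle angle wrap-around.
    gx, gy = math.cos(goal_heading), math.sin(goal_heading)
    tx, ty = math.cos(traffic_heading), math.sin(traffic_heading)
    return math.atan2((1 - w) * gy + w * ty, (1 - w) * gx + w * tx)
```

At zero density the vehicle flies straight at its goal; as density grows, the commanded heading approaches the local traffic heading, which is the "follow other traffic only when beneficial" behavior described above.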
Information-Seeking Decision Strategies Mitigate Risk in Dynamic, Uncertain Environments
Barendregt, Nicholas W., Gold, Joshua I., Josić, Krešimir, Kilpatrick, Zachary P.
To survive in dynamic and uncertain environments, individuals must develop effective decision strategies that balance information gathering and decision commitment. Models of such strategies often prioritize either optimizing tangible payoffs, like reward rate, or gathering information to support a diversity of (possibly unknown) objectives. However, our understanding of the relative merits of these two approaches remains incomplete, in part because direct comparisons have been limited to idealized, static environments that lack the dynamic complexity of the real world. Here we compared the performance of normative reward- and information-seeking strategies in a dynamic foraging task. Both strategies show similar transitions between exploratory and exploitative behaviors as environmental uncertainty changes. However, we find subtle disparities in the actions they take, resulting in meaningful performance differences: whereas reward-seeking strategies generate slightly more reward on average, information-seeking strategies provide more consistent and predictable outcomes. Our findings support the adaptive value of information-seeking behaviors that can mitigate risk with minimal reward loss.
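Information-seeking strategies of the kind compared here choose actions that maximally reduce uncertainty about the environment's state. A minimal ingredient of such a strategy is the expected information gain (expected entropy reduction) of an observation; a sketch under a simple discrete belief model (not the authors' foraging task):

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def expected_info_gain(prior, likelihood):
    """Expected entropy reduction of the belief after one observation.
    prior[s] = P(state s); likelihood[s][o] = P(obs o | state s)."""
    n_obs = len(likelihood[0])
    expected_posterior_entropy = 0.0
    for o in range(n_obs):
        p_o = sum(prior[s] * likelihood[s][o] for s in range(len(prior)))
        if p_o == 0.0:
            continue
        posterior = [prior[s] * likelihood[s][o] / p_o for s in range(len(prior))]
        expected_posterior_entropy += p_o * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy
```

A perfectly informative observation recovers the full prior entropy (log 2 nats for a uniform two-state belief) and an uninformative one gains nothing; an information-seeking agent picks the action maximizing this quantity, whereas a reward-seeking agent maximizes expected reward rate.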
Function-Coherent Gambles
The desirable gambles framework provides a foundational approach to imprecise probability theory but relies heavily on linear utility assumptions. This paper introduces function-coherent gambles, a generalization that accommodates non-linear utility while preserving essential rationality properties. We establish core axioms for function-coherence and prove a representation theorem that characterizes acceptable gambles through continuous linear functionals. The framework is then applied to analyze various forms of discounting in intertemporal choice, including hyperbolic, quasi-hyperbolic, scale-dependent, and state-dependent discounting. We demonstrate how these alternatives to constant-rate exponential discounting can be integrated within the function-coherent framework. This unified treatment provides theoretical foundations for modeling sophisticated patterns of time preference within the desirability paradigm, bridging a gap between normative theory and observed behavior in intertemporal decision-making under genuine uncertainty.
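The discount families mentioned above can be compared directly through their weight curves; a short sketch (parameter values are illustrative):

```python
def exponential(t, gamma=0.95):
    """Constant-rate discounting: the weight ratio between adjacent steps is fixed."""
    return gamma ** t

def hyperbolic(t, k=1.0):
    """Hyperbolic discounting: measured impatience falls as delay grows."""
    return 1.0 / (1.0 + k * t)

def quasi_hyperbolic(t, beta=0.6, gamma=0.95):
    """Beta-delta discounting: a one-off present-bias factor, then exponential decay."""
    return 1.0 if t == 0 else beta * gamma ** t

# The one-step weight ratio w(t+1)/w(t) is constant under exponential
# discounting but increases with delay under hyperbolic discounting --
# the source of time-inconsistent preference reversals.
```

This delay-dependence of the hyperbolic weight ratio is precisely the kind of observed behavior that constant-rate exponential discounting cannot capture and that motivates the more general framework.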
Partial Identifiability in Inverse Reinforcement Learning For Agents With Non-Exponential Discounting
Skalse, Joar, Abate, Alessandro
The aim of inverse reinforcement learning (IRL) is to infer an agent's preferences from observing their behaviour. Usually, preferences are modelled as a reward function, $R$, and behaviour is modelled as a policy, $\pi$. One of the central difficulties in IRL is that multiple preferences may lead to the same observed behaviour. That is, $R$ is typically underdetermined by $\pi$, which means that $R$ is only partially identifiable. Recent work has characterised the extent of this partial identifiability for different types of agents, including optimal and Boltzmann-rational agents. However, work so far has only considered agents that discount future reward exponentially: this is a serious limitation, especially given that extensive work in the behavioural sciences suggests that humans are better modelled as discounting hyperbolically. In this work, we characterise partial identifiability in IRL for agents with non-exponential discounting: our results apply in particular to hyperbolic discounting, but they also extend to agents that use other types of (non-exponential) discounting. Significantly, we show that IRL is generally unable to infer enough information about $R$ to identify the correct optimal policy, which entails that IRL alone can be insufficient to adequately characterise the preferences of such agents.
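The underdetermination of $R$ by $\pi$ already shows up in a one-step example: a Boltzmann-rational policy is invariant to a constant shift of the action values, so two distinct reward functions produce identical observed behaviour. A toy illustration of this ambiguity (not the paper's general characterisation):

```python
import math

def boltzmann(qvals, temp=1.0):
    """Boltzmann-rational action distribution: softmax of action values."""
    exps = [math.exp(q / temp) for q in qvals]
    z = sum(exps)
    return [e / z for e in exps]

# Two different reward vectors, one a constant shift of the other,
# induce the same policy -- behaviour alone cannot tell them apart.
policy_a = boltzmann([1.0, 2.0, 3.0])
policy_b = boltzmann([11.0, 12.0, 13.0])
```

The two policies agree (up to floating-point error), so an observer of behaviour cannot distinguish the underlying rewards; the paper's contribution is to characterise how much worse this ambiguity becomes once discounting is non-exponential.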