
Discount Rate


Non-stationary and Varying-discounting Markov Decision Processes for Reinforcement Learning

Chen, Zhizuo, Allen, Theodore T.

arXiv.org Machine Learning

Algorithms developed under stationary Markov Decision Processes (MDPs) often face challenges in non-stationary environments, and infinite-horizon formulations may not directly apply to finite-horizon tasks. To address these limitations, we introduce the Non-stationary and Varying-discounting MDP (NVMDP) framework, which naturally accommodates non-stationarity and allows discount rates to vary with time and transitions. Infinite-horizon, stationary MDPs emerge as special cases of NVMDPs for identifying an optimal policy, and finite-horizon MDPs are also subsumed within the NVMDP formulations. Moreover, NVMDPs provide a flexible mechanism to shape optimal policies, without altering the state space, action space, or the reward structure. We establish the theoretical foundations of NVMDPs, including assumptions, state- and action-value formulation and recursion, matrix representation, optimality conditions, and policy improvement under finite state and action spaces. Building on these results, we adapt dynamic programming and generalized Q-learning algorithms to NVMDPs, along with formal convergence proofs. For problems requiring function approximation, we extend the Policy Gradient Theorem and the policy improvement bound in Trust Region Policy Optimization (TRPO), offering proofs in both scalar and matrix forms. Empirical evaluations in a non-stationary gridworld environment demonstrate that NVMDP-based algorithms successfully recover optimal trajectories under multiple reward and discounting schemes, whereas original Q-learning fails. These results collectively show that NVMDPs provide a theoretically sound and practically effective framework for reinforcement learning, requiring only minor algorithmic modifications while enabling robust handling of non-stationarity and explicit optimal policy shaping.
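The varying-discount idea can be sketched as a tiny tabular Q-learning update in which the discount factor depends on the time step. This is an illustrative sketch, not the paper's exact NVMDP algorithm; the `gamma_of_t` schedule and the time-indexed Q-table are assumptions made for the example.

```python
import numpy as np

def varying_discount_q_update(Q, t, s, a, r, s_next, gamma_of_t, alpha=0.1):
    """One tabular Q-learning step with a time-dependent discount rate.

    Q is indexed by (time, state, action) because values in a
    non-stationary MDP may depend on the time step itself.
    """
    target = r + gamma_of_t(t) * Q[t + 1, s_next].max()
    Q[t, s, a] += alpha * (target - Q[t, s, a])
    return Q

# Toy usage: 3 time steps, 2 states, 2 actions, discount decaying with t.
Q = np.zeros((4, 2, 2))
gamma = lambda t: 0.9 / (1 + t)   # hypothetical varying-discount schedule
Q = varying_discount_q_update(Q, t=0, s=0, a=1, r=1.0, s_next=1, gamma_of_t=gamma)
```

The only change relative to standard Q-learning is that both the discount factor and the Q-table carry a time index, which is what lets the same machinery absorb non-stationarity.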



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance. The paper presents an algorithm that achieves optimal regret for sellers in posted-price auctions with strategic buyers. The intuition behind the definition of regret is not clear enough: what does a small regret mean for the seller? There should be more elaboration on this intuition. The paper is well-written, with proofs and theorems clearly stated.
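For readers with the same question, one common formalization of seller regret in posted-price auctions (a standard textbook-style definition, not necessarily the exact one used in the reviewed paper) compares realized revenue against the best fixed posted price in hindsight:

```latex
\mathrm{Regret}(T)
  = \underbrace{\max_{p \ge 0} \sum_{t=1}^{T} p \,\mathbb{1}\{p \le v_t\}}_{\text{revenue of the best fixed price in hindsight}}
  \;-\;
    \underbrace{\sum_{t=1}^{T} p_t \,\mathbb{1}\{\text{buyer accepts } p_t\}}_{\text{seller's realized revenue}}
```

where $v_t$ is the buyer's valuation in round $t$ and $p_t$ the posted price. Under this reading, small regret means the seller earns nearly as much as if the single best price had been known in advance.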


The Paradox of Doom: Acknowledging Extinction Risk Reduces the Incentive to Prevent It

Growiec, Jakub, Prettner, Klaus

arXiv.org Artificial Intelligence

We investigate the salience of extinction risk as a source of impatience. Our framework distinguishes between human extinction risk and individual mortality risk while allowing for various degrees of intergenerational altruism. Additionally, we consider the evolutionarily motivated "selfish gene" perspective. We find that the risk of human extinction is an indispensable component of the discount rate, whereas individual mortality risk can be hedged against - partially or fully, depending on the setup - through human reproduction. Overall, we show that in the face of extinction risk, people become more impatient rather than more farsighted. Thus, the greater the threat of extinction, the less incentive there is to invest in avoiding it. Our framework can help explain why humanity consistently underinvests in mitigation of catastrophic risks, ranging from climate change mitigation, via pandemic prevention, to addressing the emerging risks of transformative artificial intelligence.
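The mechanism can be seen in a standard expected-utility formulation with a constant extinction hazard (a textbook-style illustration, not the paper's exact model). If humanity survives to time $t$ with probability $e^{-\lambda t}$, then

```latex
\mathbb{E}[U]
  \;=\; \int_0^\infty e^{-\lambda t}\, e^{-\rho t}\, u(c_t)\, dt
  \;=\; \int_0^\infty e^{-(\rho + \lambda)\, t}\, u(c_t)\, dt ,
```

so the extinction hazard $\lambda$ adds directly to the pure rate of time preference $\rho$: a higher perceived risk of extinction raises the effective discount rate and makes long-run payoffs, including the payoff of preventing extinction itself, worth less today.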



Not All Water Consumption Is Equal: A Water Stress Weighted Metric for Sustainable Computing

Wu, Yanran, Hua, Inez, Ding, Yi

arXiv.org Artificial Intelligence

Water consumption is an increasingly critical dimension of computing sustainability, especially as AI workloads rapidly scale. However, current water impact assessments often overlook where and when water stress is more severe. To fill this gap, we present SCARF, the first general framework that evaluates the water impact of computing by factoring in both spatial and temporal variations in water stress. SCARF calculates an Adjusted Water Impact (AWI) metric that considers both consumption volume and local water stress over time. Through three case studies on LLM serving, datacenters, and semiconductor fabrication plants, we show the hidden opportunities for reducing water impact by optimizing location and time choices, paving the way for water-sustainable computing. The code is available at https://github.com/jojacola/SCARF.
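The core idea of a stress-weighted metric can be sketched in a few lines. The weighting scheme below is an illustrative assumption, not SCARF's exact AWI formula; the paper's repository has the real implementation.

```python
def adjusted_water_impact(consumption, stress):
    """Hypothetical AWI-style metric: water use weighted by local stress.

    consumption[i] is the volume consumed in period i at some location,
    and stress[i] is a dimensionless water-stress weight for that
    place and time. Summing the products penalizes consumption that
    happens where and when water is scarce.
    """
    return sum(c * w for c, w in zip(consumption, stress))

# Same total volume, different stress profiles -> different impact.
low  = adjusted_water_impact([10, 10], [0.2, 0.2])   # water-rich region
high = adjusted_water_impact([10, 10], [1.5, 1.5])   # water-stressed region
```

The point of the example is that two workloads with identical raw consumption can differ sharply in weighted impact, which is what makes location and time choices an optimization lever.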


Function-Coherent Gambles

Wheeler, Gregory

arXiv.org Artificial Intelligence

The desirable gambles framework provides a foundational approach to imprecise probability theory but relies heavily on linear utility assumptions. This paper introduces {\em function-coherent gambles}, a generalization that accommodates non-linear utility while preserving essential rationality properties. We establish core axioms for function-coherence and prove a representation theorem that characterizes acceptable gambles through continuous linear functionals. The framework is then applied to analyze various forms of discounting in intertemporal choice, including hyperbolic, quasi-hyperbolic, scale-dependent, and state-dependent discounting. We demonstrate how these alternatives to constant-rate exponential discounting can be integrated within the function-coherent framework. This unified treatment provides theoretical foundations for modeling sophisticated patterns of time preference within the desirability paradigm, bridging a gap between normative theory and observed behavior in intertemporal decision-making under genuine uncertainty.
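The discounting alternatives mentioned in the abstract correspond to simple discount-weight functions. These are the standard textbook forms; the parameter values are illustrative, and the scale- and state-dependent variants are omitted since they need extra structure.

```python
import math

def exponential(t, r=0.1):
    """Constant-rate exponential discounting: d(t) = exp(-r t)."""
    return math.exp(-r * t)

def hyperbolic(t, k=0.1):
    """Hyperbolic discounting: d(t) = 1 / (1 + k t)."""
    return 1.0 / (1.0 + k * t)

def quasi_hyperbolic(t, beta=0.7, delta=0.95):
    """Quasi-hyperbolic (beta-delta) discounting: d(0) = 1, d(t) = beta * delta**t."""
    return 1.0 if t == 0 else beta * delta ** t
```

At long horizons the hyperbolic weight decays much more slowly than the exponential one (e.g. at t = 50 with the defaults, 1/6 versus e^{-5}), which is the pattern of time preference the exponential model cannot capture.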


RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning

Hisaki, Yukinari, Ono, Isao

arXiv.org Artificial Intelligence

In this paper, we propose an off-policy deep reinforcement learning (DRL) method utilizing the average reward criterion. While most existing DRL methods employ the discounted reward criterion, this can potentially lead to a discrepancy between the training objective and performance metrics in continuing tasks, making the average reward criterion a recommended alternative. We introduce RVI-SAC, an extension of the state-of-the-art off-policy DRL method, Soft Actor-Critic (SAC) (Haarnoja et al., 2018a;b), to the average reward criterion. Our proposal consists of (1) Critic updates based on RVI Q-learning (Abounadi et al., 2001), (2) Actor updates introduced by the average reward soft policy improvement theorem, and (3) automatic adjustment of Reset Cost enabling the application of the average reward criterion to tasks with termination.

These methods utilize the discounted reward criterion, which is applicable to a variety of MDP-formulated tasks (Puterman, 1994). In particular, for continuing tasks where there is no natural breakpoint in episodes, such as robot locomotion (Todorov et al., 2012) or Access Control Queuing Tasks (Sutton & Barto, 2018), where the interaction between an agent and an environment can continue indefinitely, the discount rate plays a role in keeping the infinite-horizon return bounded. However, discounting introduces an undesirable effect in continuing tasks by prioritizing rewards closer to the current time over those in the future. An approach to mitigate this effect is to bring the discount rate closer to 1, but it is commonly known that a large discount rate can lead to instability and slower convergence (Fujimoto et al., 2018; Dewanto & Gallagher, 2021).
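The RVI Q-learning building block can be sketched in tabular form. This is a sketch of the critic update that RVI-SAC builds on, not the full deep RL algorithm; the choice of reference pair and step size is illustrative.

```python
import numpy as np

def rvi_q_update(Q, s, a, r, s_next, ref=(0, 0), alpha=0.1):
    """One tabular RVI Q-learning step (average-reward criterion).

    Instead of discounting, the value of a fixed reference state-action
    pair, f(Q) = Q[ref], is subtracted from the target. This term acts
    as a running estimate of the average reward and keeps the Q-table
    bounded without any discount factor.
    """
    target = r - Q[ref] + Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Toy usage: 2 states, 2 actions.
Q = np.zeros((2, 2))
Q = rvi_q_update(Q, s=0, a=0, r=1.0, s_next=1)
```

Comparing this to the standard discounted target r + γ·max Q(s', ·) makes the trade-off in the abstract concrete: the reference subtraction replaces the discount factor entirely, so there is no γ to push toward 1.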


Time preference, wealth and utility inequality: A microeconomic interaction and dynamic macroeconomic model connection approach

Kato, Takeshi

arXiv.org Artificial Intelligence

Based on interactions between individuals and others and references to social norms, this study reveals the impact of heterogeneity in time preference on wealth distribution and inequality. We present a novel approach that connects the interactions between microeconomic agents that generate heterogeneity to the dynamic equations for capital and consumption in macroeconomic models. Using this approach, we estimate the impact of changes in the discount rate due to microeconomic interactions on capital, consumption, and utility, and on the degree of inequality. The results show that intercomparisons with others regarding consumption significantly affect capital, i.e. wealth inequality. Furthermore, the impact on utility is never small, and social norms can reduce this impact. Our supporting evidence shows that the quantitative results of the inequality calculations correspond to survey data from cohort and cross-cultural studies. This study's micro-macro connection approach can be deployed to connect microeconomic interactions, such as exchange, interest and debt, redistribution, mutual aid and time preference, to dynamic macroeconomic models.
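A standard Ramsey-type formulation illustrates the kind of micro-macro connection the abstract describes (illustrative only; the paper's exact dynamic equations may differ). Each agent $i$'s interaction-shaped discount rate $\rho_i$ enters the capital accumulation and consumption Euler equations:

```latex
\dot{k}_i = f(k_i) - c_i - \delta k_i ,
\qquad
\frac{\dot{c}_i}{c_i} = \frac{f'(k_i) - \rho_i - \delta}{\theta} ,
```

so a higher $\rho_i$ (greater impatience) lowers the steady-state capital stock implied by $f'(k_i^*) = \rho_i + \delta$, and heterogeneity in $\rho_i$ across interacting agents translates directly into dispersion in long-run wealth.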


Consensus group decision making under model uncertainty with a view towards environmental policy making

Koundouri, Phoebe, Papayiannis, Georgios I., Petracou, Electra V., Yannacopoulos, Athanasios N.

arXiv.org Artificial Intelligence

Group decision making is an important field with interesting applications in various disciplines, among them environmental economics. Group decision making often requires that all or the majority of agents in the group agree to a single proposal or opinion, i.e. consensus. This is particularly true in cases where there is no coercion involved in the implementation of the decision made, so that the implementation of the decision depends on the good will, or rather the acceptance of the common decision by all members of the group. To make the discussion more concrete we consider the following generic situation: Assume that a group of agents, G, has to reach a common decision concerning policies regarding a future contingency X. Policies may refer, for instance, to the cost of abatement measures for protection against X, which clearly requires the acceptance of a commonly acceptable estimate for the value of X by every member of the group, as well as the acceptance of a commonly acceptable discount factor. Typically, different members of the group will have different valuations for X, and will therefore report different costs for the adverse effects of X. Moreover, different members of the group will have different discount rates for calculating the present value of the future adverse effect X.
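The effect of heterogeneous discount rates is easy to make concrete with standard present-value arithmetic (the group members and rates below are hypothetical, chosen only to illustrate the spread):

```python
def present_value(cost, rate, horizon):
    """Discounted present value of a future cost arriving `horizon` years from now."""
    return cost / (1.0 + rate) ** horizon

# Hypothetical group: identical estimated damage X, different discount rates.
members = {"A": 0.01, "B": 0.03, "C": 0.07}
valuations = {m: present_value(1000.0, r, horizon=50) for m, r in members.items()}
```

Even with full agreement on the size of the future damage, the three members value it today at roughly 608, 228, and 34 monetary units respectively, which is precisely why a commonly acceptable discount factor is as contested as the estimate of X itself.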