Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret

Neural Information Processing Systems

We study risk-sensitive reinforcement learning in episodic Markov decision processes with unknown transition kernels, where the goal is to optimize the total reward under the risk measure of exponential utility. We propose two provably efficient model-free algorithms, Risk-Sensitive Value Iteration (RSVI) and Risk-Sensitive Q-learning (RSQ). These algorithms implement a form of risk-sensitive optimism in the face of uncertainty, which adapts to both risk-seeking and risk-averse modes of exploration.
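
For intuition, the exponential-utility objective replaces the expected total reward with $(1/\beta) \log \mathbb{E}[\exp(\beta \sum_h r_h)]$, whose Bellman backup takes a log-exp form. The following is a minimal Python/NumPy sketch of the resulting value iteration on a *known* tabular MDP; it omits the exploration bonuses that make RSVI provably efficient under unknown transitions, and all names (P, R, beta) are illustrative assumptions, not the paper's code.

    import numpy as np

    def risk_sensitive_vi(P, R, H, beta):
        """Exponential-utility value iteration on a known tabular MDP (sketch).

        P: array (H, S, A, S) of transition probabilities (assumed known here)
        R: array (H, S, A) of rewards
        beta: nonzero risk parameter; a log-sum-exp trick would be needed
              for numerical stability at large |beta|
        Returns a greedy policy (H, S) and values V of shape (H+1, S).
        """
        _, S, A, _ = P.shape
        V = np.zeros((H + 1, S))
        pi = np.zeros((H, S), dtype=int)
        for h in reversed(range(H)):
            # Exponential-utility Bellman backup:
            # Q_h(s,a) = r_h(s,a) + (1/beta) log E_{s'~P}[exp(beta * V_{h+1}(s'))]
            Q = R[h] + (1.0 / beta) * np.log(P[h] @ np.exp(beta * V[h + 1]))
            V[h] = Q.max(axis=1)
            pi[h] = Q.argmax(axis=1)
        return pi, V

Here beta > 0 yields risk-seeking and beta < 0 risk-averse behavior, and the standard expected-reward objective is recovered in the limit beta -> 0.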



Planning with General Objective Functions: Going Beyond Total Rewards

Neural Information Processing Systems

Standard sequential decision-making paradigms aim to maximize the cumulative reward when interacting with an unknown environment, i.e., maximize $\sum_{h=1}^H r_h$ where $H$ is the planning horizon. However, this paradigm fails to model important practical applications, e.g., safe control that aims to maximize the lowest reward, i.e., maximize $\min_{h=1}^H r_h$. In this paper, based on techniques from sketching algorithms, we propose a novel planning algorithm for deterministic systems that handles a large class of objective functions of the form $f(r_1, r_2, \ldots, r_H)$ that are of interest in practical applications. We show that efficient planning is possible if $f$ is symmetric under permutation of coordinates and satisfies certain technical conditions. Complementing our algorithm, we further prove that removing any of the conditions makes the problem intractable in the worst case, demonstrating the necessity of our conditions.
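
As a concrete special case, the max-min (safe-control) objective mentioned above admits a simple bottleneck dynamic program in deterministic systems, because $\min(r_1, \ldots, r_H)$ decomposes recursively along a single trajectory. The Python sketch below handles only this special case, not the paper's sketching-based algorithm for general symmetric $f$; the interfaces f(h, s, a) and r(h, s, a) are hypothetical placeholders.

    import math

    def maxmin_plan(f, r, states, actions, H, s0):
        """Bottleneck (max-min) planning in a known deterministic system.

        f(h, s, a) -> next state, r(h, s, a) -> reward.
        Computes max over length-H trajectories of min_h r_h via
        V_h(s) = max_a min(r(h, s, a), V_{h+1}(f(h, s, a))).
        """
        V = {s: math.inf for s in states}  # empty suffix: min over nothing
        plan = {}
        for h in reversed(range(H)):
            V_new, plan_h = {}, {}
            for s in states:
                best_a, best_v = None, -math.inf
                for a in actions:
                    v = min(r(h, s, a), V[f(h, s, a)])
                    if v > best_v:
                        best_a, best_v = a, v
                V_new[s], plan_h[s] = best_v, best_a
            V, plan[h] = V_new, plan_h
        return V[s0], plan

Determinism is what makes this recursion exact: each (state, step) pair fixes the remainder of the trajectory, so the bottleneck value needs no distributional bookkeeping.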


Bayesian Ambiguity Contraction-based Adaptive Robust Markov Decision Processes for Adversarial Surveillance Missions

Choi, Jimin, Li, Max Z.

arXiv.org Artificial Intelligence

Collaborative Combat Aircraft (CCAs) are envisioned to enable autonomous Intelligence, Surveillance, and Reconnaissance (ISR) missions in contested environments, where adversaries may act strategically to deceive or evade detection. These missions pose challenges due to model uncertainty and the need for safe, real-time decision-making. Robust Markov Decision Processes (RMDPs) provide worst-case guarantees but are limited by static ambiguity sets that capture initial uncertainty without adapting to new observations. This paper presents an adaptive RMDP framework tailored to ISR missions with CCAs. We introduce a mission-specific formulation in which aircraft alternate between movement and sensing states. Adversarial tactics are modeled as a finite set of transition kernels, each capturing assumptions about how adversarial sensing or environmental conditions affect rewards. Our approach incrementally refines policies by eliminating inconsistent threat models, allowing agents to shift from conservative to aggressive behaviors while maintaining robustness. We provide theoretical guarantees showing that the adaptive planner converges as credible sets contract to the true threat model and maintains safety under uncertainty. Experiments under Gaussian and non-Gaussian threat models across diverse network topologies show higher mission rewards and fewer exposure events compared to nominal and static robust planners.
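
One plausible reading of the elimination step is a Bayesian filter over the finite set of candidate transition kernels: observed transitions reweight the threat models, low-credibility models are dropped from the ambiguity set, and the robust planner takes a worst case only over the survivors. The NumPy sketch below illustrates that loop under stated assumptions; it is not the paper's implementation, and the threshold eps, function names, and array shapes are all hypothetical.

    import numpy as np

    def contract_ambiguity(post, kernels, s, a, s_next, eps=0.05):
        """One Bayesian contraction step for the ambiguity set (sketch).

        kernels: list of K candidate transition arrays, each (S, A, S),
                 one per hypothesized threat model
        post:    current posterior weights over those K models
        Models whose posterior mass drops below eps are eliminated, so
        the set handed to the robust planner shrinks as evidence accrues.
        """
        likes = np.array([P[s, a, s_next] for P in kernels])
        post = post * likes
        post = post / post.sum()
        keep = post >= eps
        return post[keep] / post[keep].sum(), \
               [P for P, m in zip(kernels, keep) if m]

    def robust_backup(R, kernels, V_next):
        """Worst-case Bellman backup over the remaining ambiguity set:
        Q(s,a) = r(s,a) + min_k E_{s'~P_k}[V_next(s')]."""
        Q = np.stack([R + P @ V_next for P in kernels])  # (K, S, A)
        return Q.min(axis=0)

As the credible set contracts toward the true kernel, the min over surviving models approaches a nominal backup, which matches the abstract's description of agents shifting from conservative to aggressive behavior while retaining robustness.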