AITopics | interest function

Checklist

Neural Information Processing SystemsApr-24-2026, 07:17:19 GMT

In the main text we present the TD and ETD algorithms for policy evaluation under linear function approximation, as a way to recognize the existing literature on emphatic algorithms [27]. We here present the derivation for policy evaluation under general function approximation. Following standard notation [41], capital letters for states, actions or rewards represent the random variable at time t (i.e. St is the random variable at time t) and lowercase letters represent their instantiation (i.e. St = sis the random variable St taking value sat time t).

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

008079ec00eec9760ee93af5434ee932-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 07:17:16 GMT

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Add feedback

Adaptive Interest for Emphatic Reinforcement Learning

Neural Information Processing SystemsDec-23-2025, 16:38:37 GMT

Emphatic algorithms have shown great promise in stabilizing and improving reinforcement learning by selectively emphasizing the update rule. Although the emphasis fundamentally depends on an interest function which defines the intrinsic importance of each state, most approaches simply adopt a uniform interest over all states (except where a hand-designed interest is possible based on domain knowledge). In this paper, we investigate adaptive methods that allow the interest function to dynamically vary over states and iterations. In particular, we leverage meta-gradients to automatically discover online an interest function that would accelerate the agent's learning process. Empirical evaluations on a wide range of environments show that adapting the interest is key to provide significant gains. Qualitative analysis indicates that the learned interest function emphasizes states of particular importance, such as bottlenecks, which can be especially useful in a transfer learning setting.

adaptive interest, emphatic reinforcement learning, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Add feedback

Adaptive Interest for Emphatic Reinforcement Learning

Neural Information Processing SystemsMay-26-2025, 14:43:35 GMT

Emphatic algorithms have shown great promise in stabilizing and improving reinforcement learning by selectively emphasizing the update rule. Although the emphasis fundamentally depends on an interest function which defines the intrinsic importance of each state, most approaches simply adopt a uniform interest over all states (except where a hand-designed interest is possible based on domain knowledge). In this paper, we investigate adaptive methods that allow the interest function to dynamically vary over states and iterations. In particular, we leverage meta-gradients to automatically discover online an interest function that would accelerate the agent's learning process. Empirical evaluations on a wide range of environments show that adapting the interest is key to provide significant gains.

artificial intelligence, emphatic reinforcement learning, machine learning, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Add feedback

Adaptive Interest for Emphatic Reinforcement Learning

Neural Information Processing SystemsOct-9-2024, 08:37:19 GMT

Emphatic algorithms have shown great promise in stabilizing and improving reinforcement learning by selectively emphasizing the update rule. Although the emphasis fundamentally depends on an interest function which defines the intrinsic importance of each state, most approaches simply adopt a uniform interest over all states (except where a hand-designed interest is possible based on domain knowledge). In this paper, we investigate adaptive methods that allow the interest function to dynamically vary over states and iterations. In particular, we leverage meta-gradients to automatically discover online an interest function that would accelerate the agent's learning process. Empirical evaluations on a wide range of environments show that adapting the interest is key to provide significant gains.

adaptive interest, emphatic reinforcement learning, interest function

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Add feedback

Checklist

Neural Information Processing SystemsFeb-18-2024, 04:15:31 GMT

In the main text we present the TD and ETD algorithms for policy evaluation under linear function approximation, as a way to recognize the existing literature on emphatic algorithms [27]. We here present the derivation for policy evaluation under general function approximation. Following standard notation [41], capital letters for states, actions or rewards represent the random variable at time t (i.e. S

interest function, objective, value function, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Adaptive Interest for Emphatic Reinforcement Learning

Neural Information Processing SystemsFeb-18-2024, 04:15:28 GMT

Emphatic algorithms have shown great promise in stabilizing and improving reinforcement learning by selectively emphasizing the update rule. Although the emphasis fundamentally depends on an interest function which defines the intrinsic importance of each state, most approaches simply adopt a uniform interest over all states (except where a hand-designed interest is possible based on domain knowledge). In this paper, we investigate adaptive methods that allow the interest function to dynamically vary over states and iterations. In particular, we leverage meta-gradients to automatically discover online an interest function that would accelerate the agent's learning process. Empirical evaluations on a wide range of environments show that adapting the interest is key to provide significant gains. Qualitative analysis indicates that the learned interest function emphasizes states of particular importance, such as bottlenecks, which can be especially useful in a transfer learning setting.

algorithm, interest function, international conference, (12 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Add feedback

Interactive Multi Interest Process Pattern Discovery

Vazifehdoostirani, Mozhgan, Genga, Laura, Lu, Xixi, Verhoeven, Rob, van Laarhoven, Hanneke, Dijkman, Remco

arXiv.org Artificial IntelligenceAug-28-2023

Existing PPDMs typically are unsupervised and focus on a single dimension of interest, such as discovering frequent patterns. We present an interactive multi-interest-driven framework for process pattern discovery aimed at identifying patterns that are optimal according to a multi-dimensional analysis goal. The proposed approach is iterative and interactive, thus taking experts' knowledge into account during the discovery process. The paper focuses on a concrete analysis goal, i.e., deriving process patterns that affect the process outcome. We evaluate the approach on real-world event logs in both interactive and fully automated settings. The approach extracted meaningful patterns validated by expert knowledge in the interactive setting. Patterns extracted in the automated settings consistently led to prediction performance comparable to or better than patterns derived considering single-interest dimensions without requiring user-defined thresholds.

evaluation, interest function, process pattern, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-41620-0_18

2308.14475

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)

Add feedback

Toward Discovering Options that Achieve Faster Planning

Wan, Yi, Sutton, Richard S.

arXiv.org Artificial IntelligenceSep-29-2022

We propose a new objective for option discovery that emphasizes the computational advantage of using options in planning. In a sequential machine, the speed of planning is proportional to the number of elementary operations used to achieve a good policy. For episodic tasks, the number of elementary operations depends on the number of options composed by the policy in an episode and the number of options being considered at each decision point. To reduce the amount of computation in planning, for a given set of episodic tasks and a given number of options, our objective prefers options with which it is possible to achieve a high return by composing few options, and also prefers a smaller set of options to choose from at each decision point. We develop an algorithm that optimizes the proposed objective. In a variant of the classic four-room domain, we show that 1) a higher objective value is typically associated with fewer number of elementary planning operations used by the option-value iteration algorithm to obtain a near-optimal value function, 2) our algorithm achieves an objective value that matches it achieved by two human-designed options 3) the amount of computation used by option-value iteration with options discovered by our algorithm matches it with the human-designed options, 4) the options produced by our algorithm also make intuitive sense--they seem to move to and terminate at the entrances of rooms.

adjustable option, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2205.12515

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

IDNP: Interest Dynamics Modeling using Generative Neural Processes for Sequential Recommendation

Du, Jing, Ye, Zesheng, Yao, Lina, Guo, Bin, Yu, Zhiwen

arXiv.org Artificial IntelligenceAug-9-2022

Recent sequential recommendation models rely increasingly on consecutive short-term user-item interaction sequences to model user interests. These approaches have, however, raised concerns about both short- and long-term interests. (1) {\it short-term}: interaction sequences may not result from a monolithic interest, but rather from several intertwined interests, even within a short period of time, resulting in their failures to model skip behaviors; (2) {\it long-term}: interaction sequences are primarily observed sparsely at discrete intervals, other than consecutively over the long run. This renders difficulty in inferring long-term interests, since only discrete interest representations can be derived, without taking into account interest dynamics across sequences. In this study, we address these concerns by learning (1) multi-scale representations of short-term interests; and (2) dynamics-aware representations of long-term interests. To this end, we present an \textbf{I}nterest \textbf{D}ynamics modeling framework using generative \textbf{N}eural \textbf{P}rocesses, coined IDNP, to model user interests from a functional perspective. IDNP learns a global interest function family to define each user's long-term interest as a function instantiation, manifesting interest dynamics through function continuity. Specifically, IDNP first encodes each user's short-term interactions into multi-scale representations, which are then summarized as user context. By combining latent global interest with user context, IDNP then reconstructs long-term user interest functions and predicts interactions at upcoming query timestep. Moreover, IDNP can model such interest functions even when interaction sequences are limited and non-consecutive. Extensive experiments on four real-world datasets demonstrate that our model outperforms state-of-the-arts on various evaluation metrics.

interaction, representation, sequence, (17 more...)

arXiv.org Artificial Intelligence

2208.046

Country: