AITopics | Markov Models

Markov Models

Review for NeurIPS paper: Belief-Dependent Macro-Action Discovery in POMDPs using the Value of Information

Neural Information Processing Systems | Jan-26-2025, 03:35:45 GMT

The authors did a good job of addressing reviewer concerns in the response. There were some lingering concerns about whether the authors had picked the best baselines to compare against in their experiments. Additional experiments and/or more careful justification for the choices made would help. I would recommend that the authors take the reviewers' comments into account in preparing the final version of the paper.

belief-dependent macro-action discovery, information, neurips paper, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Reviews: Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives

Neural Information Processing Systems | Jan-26-2025, 03:15:46 GMT

Two out of three reviewers appreciated the contributions of this paper, with one expert reviewer praising almost every aspect of the paper. On the negative side, one reviewer took issue with the proposed setting, highlighting that the utility of the proposed objective function is somewhat dubious in the general context of multi-objective decision making. I agree with this reviewer that having "multi-objective" in the title of the paper may set the wrong expectations for some readers, and I suggest that the authors consider changing the title of the paper for its final version to avoid such misunderstandings. Furthermore, the final version should discuss the relationship between this paper and the very recent work of Rosenberg and Mansour (2019), which studies essentially the same problem in episodic MDPs. Other than these concerns, the paper is worthy of being published without major changes.

regret minimization, reinforcement learning, vectorial feedback and complex objective, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning

Castagna, Alberto

arXiv.org Artificial Intelligence | Jan-26-2025

Reinforcement Learning (RL) enables an intelligent agent to optimise its performance in a task by continuously taking actions from an observed state and receiving feedback from the environment in the form of rewards. RL typically uses tables or linear approximators to learn a mapping over state-action tuples that maximises the reward. Combining RL with deep neural networks (DRL) significantly increases its scalability and enables it to address more complex problems than before. However, DRL also inherits downsides from both RL and deep learning. Although DRL improves generalisation across similar state-action pairs when compared to simpler RL policy representations like tabular methods, it still requires the agent to adequately explore the state-action space. Additionally, deep methods require more training data, with the volume of data escalating with the complexity and size of the neural network. As a result, deep RL requires a long time to collect enough agent-environment samples and to successfully learn the underlying policy. Furthermore, even a slight alteration to the task often invalidates any previously acquired knowledge. To address these shortcomings, Transfer Learning (TL) has been introduced, which enables the use of external knowledge from other tasks or agents to enhance a learning process. The goal of TL is to reduce the learning complexity for an agent dealing with an unfamiliar task by simplifying the exploration process. This is achieved by lowering the amount of new information required by its learning model, resulting in a reduced overall convergence time...

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2501.15495

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Episodic Novelty Through Temporal Distance

Jiang, Yuhua, Liu, Qihan, Yang, Yiqin, Ma, Xiaoteng, Zhong, Dianyu, Hu, Hao, Yang, Jun, Liang, Bin, Xu, Bo, Zhang, Chongjie, Zhao, Qianchuan

arXiv.org Artificial Intelligence | Jan-26-2025

Exploration in sparse reward environments remains a significant challenge in reinforcement learning, particularly in Contextual Markov Decision Processes (CMDPs), where environments differ across episodes. Existing episodic intrinsic motivation methods for CMDPs primarily rely on count-based approaches, which are ineffective in large state spaces, or on similarity-based methods that lack appropriate metrics for state comparison. To address these shortcomings, we propose Episodic Novelty Through Temporal Distance (ETD), a novel approach that introduces temporal distance as a robust metric for state similarity and intrinsic reward computation. By employing contrastive learning, ETD accurately estimates temporal distances and derives intrinsic rewards based on the novelty of states within the current episode. Extensive experiments on various benchmark tasks demonstrate that ETD significantly outperforms state-of-the-art methods, highlighting its effectiveness in enhancing exploration in sparse reward CMDPs.
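As a point of reference for the count-based baselines mentioned in the abstract, here is a minimal sketch of an episodic count bonus. It is not ETD: the class name `EpisodicCountBonus` and the 1/sqrt(N) bonus shape are assumptions chosen for illustration, not details taken from the paper.

```python
from collections import Counter
from math import sqrt


class EpisodicCountBonus:
    """Hypothetical episodic count-based intrinsic reward (not the ETD method)."""

    def __init__(self):
        # Visit counts for hashable states, valid for the current episode only.
        self.counts = Counter()

    def reset(self):
        """Clear the counts at the start of every episode."""
        self.counts.clear()

    def bonus(self, state):
        """Record a visit and return 1 / sqrt(number of visits this episode)."""
        self.counts[state] += 1
        return 1.0 / sqrt(self.counts[state])


tracker = EpisodicCountBonus()
first_visit = tracker.bonus("room_a")   # first visit in the episode
second_visit = tracker.bonus("room_a")  # repeat visit, smaller bonus
other_state = tracker.bonus("room_b")   # a different state is novel again
tracker.reset()
after_reset = tracker.bonus("room_a")   # counts do not carry across episodes
```

In a very large or continuous state space nearly every state is seen only once per episode, so a bonus of this kind stays at its maximum and carries little signal, which is the weakness the abstract attributes to count-based approaches.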

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2501.15418

Country:

Asia > China (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Formal Verification of Markov Processes with Learned Parameters

Maaz, Muhammad, Chan, Timothy C. Y.

arXiv.org Artificial Intelligence | Jan-26-2025

We introduce the problem of formally verifying properties of Markov processes where the parameters are the output of machine learning models. Our formulation is general and solves a wide range of problems, including verifying properties of probabilistic programs that use machine learning, and subgroup analysis in healthcare modeling. We show that for a broad class of machine learning models, including linear models, tree-based models, and neural networks, verifying properties of Markov chains like reachability, hitting time, and total reward can be formulated as a bilinear program. We develop a decomposition and bound propagation scheme for solving the bilinear program and show through computational experiments that our method solves the problem to global optimality up to 100x faster than state-of-the-art solvers. We also release markovml, an open-source tool for building Markov processes, integrating pretrained machine learning models, and verifying their properties, available at https://github.com/mmaaz-git/markovml.
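For orientation, the reachability property named in the abstract can be computed for a chain with fixed, known transition probabilities by iterating the standard fixed-point equations. The sketch below does only that; it is not the paper's bilinear-programming method and does not use the markovml API. The function name `reachability` and the gambler's-ruin example are assumptions for illustration.

```python
def reachability(P, targets, max_iters=10_000, tol=1e-12):
    """Probability of eventually reaching any state in `targets` from each state.

    P is a row-stochastic transition matrix given as a list of lists.
    Starting from 0 on non-target states, the iteration converges to the
    least fixed point of x[s] = sum_t P[s][t] * x[t], with x[s] = 1 on targets.
    """
    n = len(P)
    x = [1.0 if s in targets else 0.0 for s in range(n)]
    for _ in range(max_iters):
        new = [
            1.0 if s in targets else sum(P[s][t] * x[t] for t in range(n))
            for s in range(n)
        ]
        # Stop once no entry changes by more than the tolerance.
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


# Hypothetical example: fair gambler's ruin on states 0..4 (0 and 4 absorbing).
P = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
probs = reachability(P, targets={4})
```

For this fair walk the probability of reaching state 4 from state i is i/4, which gives a quick sanity check. In the setting the abstract describes, the entries of P would instead be outputs of a machine learning model, so the property has to be verified over a range of inputs rather than evaluated once.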

artificial intelligence, machine learning, markov process, (17 more...)

arXiv.org Artificial Intelligence

2501.15767

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Reviews: Markov Random Fields for Collaborative Filtering

Neural Information Processing Systems | Jan-25-2025, 16:34:57 GMT

The paper presents a novel method for recommendation with collaborative filtering based on Markov Random Fields (MRF). Starting from a general approach that regresses the full graph of items, the paper shows that a valid approximation can be obtained by proceeding with subgraphs that represent Markov blankets of an initial set of items. This approach yields significant computational gains while also improving recommendation performance over the state of the art, represented here by variational auto-encoders. As a general comment, I wonder whether accounting for popularity bias makes sense in this approach and whether the authors considered it. The claims are well supported by theoretical analysis.

collaborative filtering, markov random field, recommendation performance, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Social Media (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.63)

Reviews: Markov Random Fields for Collaborative Filtering

Neural Information Processing Systems | Jan-25-2025, 16:34:47 GMT

Reviewers were initially quite favorable with respect to this paper and your response lifted some remaining doubts (especially from Reviewer #1). I am happy to recommend acceptance, congratulations! I would recommend that you take the reviewer comments into account to prepare a camera-ready version. In particular, it seems to be important to incorporate some of the discussion in bullets 1 and 2 in your response (regarding Mult-VAE and the high-level summary or pseudocode).

collaborative filtering, markov random field, reviewer

Neural Information Processing Systems

Technology:

Information Technology > Communications > Social Media (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Reviews: Sampling Networks and Aggregate Simulation for Online POMDP Planning

Neural Information Processing Systems | Jan-25-2025, 13:23:14 GMT

Author feedback: I thank the authors for the feedback. The feedback was of high quality and satisfied my concerns. I suggest that a version, perhaps compressed, of "Explaining limitations of our work" from the author feedback, which I enjoyed reading, be added to the final version of the paper. The paper "Sampling Networks and Aggregate Simulation for Online POMDP Planning" proposes a new solution to computing policies for large POMDP problems that is based on factorizing the belief distribution using a mean field approximation during planning and execution and extending aggregate simulation to POMDPs. In short, the proposed POMDP planner projects factorized beliefs forward in time, building a computational graph as it goes, and then computes gradients backward in time over the graph to improve the policy.
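To make the factorization idea concrete, the sketch below represents a belief over binary state variables as a product of independent marginals, which is the general mean-field form. It is an illustration under stated assumptions (binary variables, the hypothetical helper `joint_probability`), not the planner described in the paper.

```python
from itertools import product
from math import prod


def joint_probability(marginals, assignment):
    """Belief in a full assignment under a fully factorized (mean-field) belief.

    marginals[i] is P(variable i = 1); assignment[i] is 0 or 1.
    """
    return prod(p if bit else 1.0 - p for p, bit in zip(marginals, assignment))


# Three binary state variables need 3 numbers instead of 2**3 joint entries.
marginals = [0.9, 0.5, 0.2]
belief = joint_probability(marginals, (1, 0, 0))  # 0.9 * 0.5 * 0.8

# The factorized belief is still a valid distribution over all assignments.
total = sum(joint_probability(marginals, a) for a in product((0, 1), repeat=3))
```

The storage cost grows linearly with the number of state variables rather than exponentially, at the price of ignoring correlations between them; this is the kind of limitation of the factorization assumption that the reviews of this paper discuss.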

approximation, online pomdp planning, sampling network and aggregate simulation, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Reviews: Sampling Networks and Aggregate Simulation for Online POMDP Planning

Neural Information Processing Systems | Jan-25-2025, 13:23:03 GMT

All reviewers appreciate the practical approach to tackling POMDPs with large state and observation spaces using factorized beliefs and aggregate simulation. Reviewers had some concerns about the limitations imposed by the factorization assumption, but these concerns were addressed in the author feedback. Reviewers are particularly happy with the quality of the rebuttal and encourage the authors to incorporate the discussion of the algorithm's limitations in the final draft.

limitation, online pomdp planning, sampling network and aggregate simulation, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)

Reviews: Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Neural Information Processing Systems | Jan-25-2025, 07:58:15 GMT

UPDATE: I have read the authors' response and increased my score. Specifically, the authors corrected my understanding of Property 1 and properly framed the relaxation of the problem in Section 5. Please include similar clarifications in the final work. There was also a lot of discussion among the reviewers about how the paper relates to the Robust MDP literature, which needs to be covered better in the current work. Papers such as "Reinforcement Learning in Robust Markov Decision Processes" and "Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions" were brought up by others; they seem applicable in the current setting and could be empirical competitors to RATS. I very much like the constraints used to study planning in non-stationary environments in this paper, and the min-max-inspired RATS algorithm seems like an appropriate game-theoretic approach.

assumption, markov decision process, reinforcement learning, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)
