Markov Models
Reviews: Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
The reviewers felt that this paper was well-executed, even though the proposed approach is a rather straightforward application of techniques from the robust MDP literature (specifically, minimax planning with appropriately defined uncertainty sets derived from a Lipschitzness assumption). For the final version, the authors should improve the discussion of related literature on robust MDPs (e.g., "Reinforcement Learning in Robust Markov Decision Processes" by Lim et al., NIPS 2013, and references therein) and on MDPs with non-stationary transitions (e.g., "Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions" by Abbasi-Yadkori et al., NIPS 2013, and references therein).
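To make the minimax planning idea mentioned in this review concrete, here is a minimal sketch of worst-case value iteration. It is not the paper's algorithm: the uncertainty set is assumed to be a small finite list of candidate transition models (a stand-in for the Lipschitz-derived sets), the adversary is assumed to choose a model independently for each state-action pair, and the function name and argument layout are hypothetical.

```python
def robust_value_iteration(models, rewards, gamma=0.9, iters=200):
    """Worst-case value iteration over a finite set of transition models.

    models[m][s][a] is a list of next-state probabilities under candidate
    model m; rewards[s][a] is the immediate reward. The finite model set is
    an illustrative assumption, not the construction used in the paper.
    """
    n_states = len(rewards)
    n_actions = len(rewards[0])
    values = [0.0] * n_states
    for _ in range(iters):
        new_values = []
        for s in range(n_states):
            best = float("-inf")
            for a in range(n_actions):
                # The adversary picks the model that minimizes the backup.
                worst = min(
                    rewards[s][a]
                    + gamma * sum(p * v for p, v in zip(P[s][a], values))
                    for P in models
                )
                # The agent picks the action with the best worst case.
                best = max(best, worst)
            new_values.append(best)
        values = new_values
    return values
```

With a single candidate model this reduces to ordinary value iteration; adding models can only lower the returned values, which is the sense in which the plan is a worst-case one.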
Reviews: Learning Multiple Markov Chains via Adaptive Allocation
This paper aims at learning a collection of transition matrices of ergodic Markov chains, where at each round the algorithm can select one of the chains and observe which state it moved to. The problem consists in designing a strategy such that learning occurs uniformly over all chains at the best possible rate. The paper is theoretical in nature: the background on Markov chains is properly introduced, and the algorithm is clearly described and thoroughly analyzed. The paper in its current form is a stronger submission than its previous version. It is more focused, the assumptions are clearer, it is more detailed, and it is an overall better read.
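The setting described in this review can be illustrated with a small simulation. This is only a sketch under stated assumptions: the allocation rule below (sample the chain whose least-visited state has the fewest visits) is a hypothetical heuristic, not the strategy analyzed in the paper, and `step` and `learn_chains` are names introduced here for illustration.

```python
import random

def step(P, state, rng):
    """Sample the next state of a chain with row-stochastic matrix P."""
    return rng.choices(range(len(P)), weights=P[state])[0]

def learn_chains(chains, budget, seed=0):
    """Estimate each chain's transition matrix from `budget` total samples."""
    rng = random.Random(seed)
    n_chains = len(chains)
    states = [0] * n_chains  # current state of every chain
    # counts[k][i][j] = observed transitions i -> j in chain k
    counts = [[[0] * len(P) for _ in P] for P in chains]
    for _ in range(budget):
        # Hypothetical rule: pick the chain whose least-visited state
        # has been left the fewest times so far.
        k = min(range(n_chains),
                key=lambda i: min(sum(row) for row in counts[i]))
        nxt = step(chains[k], states[k], rng)
        counts[k][states[k]][nxt] += 1
        states[k] = nxt
    # Empirical transition frequencies; unvisited rows stay all-zero.
    estimates = [[[c / max(1, sum(row)) for c in row] for row in cnt]
                 for cnt in counts]
    return estimates, counts
```

Because only the selected chain advances at each round, a chain with a rarely visited state needs more of the budget to reach the same accuracy, which is why the allocation has to adapt to what has been observed.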
Reviews: Learning Multiple Markov Chains via Adaptive Allocation
All reviewers felt that this is a well-executed paper with good writing and solid results, therefore clearly worthy of acceptance. The only general complaint was that the setting may have been somewhat poorly motivated, and the authors should consider providing an illustrative motivating example in the final version of the paper.
Computational Protein Science in the Era of Large Language Models (LLMs)
Fan, Wenqi, Zhou, Yi, Wang, Shijie, Yan, Yuyao, Liu, Hui, Zhao, Qian, Song, Le, Li, Qing
Considering the significance of proteins, computational protein science has always been a critical scientific field, dedicated to revealing knowledge and developing applications within the protein sequence-structure-function paradigm. In the last few decades, Artificial Intelligence (AI) has made significant impacts in computational protein science, leading to notable successes in specific protein modeling tasks. However, those previous AI models still meet limitations, such as the difficulty in comprehending the semantics of protein sequences, and the inability to generalize across a wide range of protein modeling tasks. Recently, LLMs have emerged as a milestone in AI due to their unprecedented language processing & generalization capability. They can promote comprehensive progress in fields rather than solving individual tasks. As a result, researchers have actively introduced LLM techniques in computational protein science, developing protein Language Models (pLMs) that skillfully grasp the foundational knowledge of proteins and can be effectively generalized to solve a diversity of sequence-structure-function reasoning problems. While witnessing prosperous developments, it's necessary to present a systematic overview of computational protein science empowered by LLM techniques. First, we summarize existing pLMs into categories based on their mastered protein knowledge, i.e., underlying sequence patterns, explicit structural and functional information, and external scientific languages. Second, we introduce the utilization and adaptation of pLMs, highlighting their remarkable achievements in promoting protein structure prediction, protein function prediction, and protein design studies. Then, we describe the practical application of pLMs in antibody design, enzyme design, and drug discovery. Finally, we specifically discuss the promising future directions in this fast-growing field.
Review for NeurIPS paper: Natural Policy Gradient Primal-Dual Method for Constrained Markov Decision Processes
Strengths: Comments about the paper: This paper presents a convergence analysis of primal-dual natural policy gradient methods under the CMDP framework. Several recent works have shown convergence of policy gradients and optimality bounds (e.g., Agarwal et al., Mei et al.), but the paper extends similar analysis to (a) natural policy gradients and (b) the CMDP framework with constraints. Overall, it achieves a sublinear rate of convergence in the CMDP framework, similar to other related works with convergence analysis. The analysis is done for the general MDP case with function approximation and restricted policy classes. It is a very well written paper that is easy to follow, with significant theoretical derivation and proof details.
Review for NeurIPS paper: Natural Policy Gradient Primal-Dual Method for Constrained Markov Decision Processes
After reading the authors' rebuttal, the reviewers discussed their concerns about this paper. Ultimately, a consensus was not reached, as reviewer #1 feels that the issues raised in her/his review were not properly addressed in the authors' feedback. The other reviewers also share some of the concerns raised by reviewer #1, but, given the rebuttals, they believe the authors can fix them in the final version and make the contribution of their paper clearer. I agree with them, and so I suggest accepting the paper, but I recommend that the authors take into consideration the issues raised in the reviews and address them carefully in the final version of the paper.
Learning Restricted Boltzmann Machines with Sparse Latent Variables
Restricted Boltzmann Machines (RBMs) are a common family of undirected graphical models with latent variables. An RBM is described by a bipartite graph, with all observed variables in one layer and all latent variables in the other. We consider the task of learning an RBM given samples generated according to it. The best algorithms for this task currently have time complexity \tilde{O}(n^2) for ferromagnetic RBMs (i.e., with attractive potentials) but \tilde{O}(n^d) for general RBMs, where n is the number of observed variables and d is the maximum degree of a latent variable. Let the \textit{MRF neighborhood} of an observed variable be its neighborhood in the Markov Random Field of the marginal distribution of the observed variables.
Reviews: Gradient-based Adaptive Markov Chain Monte Carlo
Originality: First-order gradient-based MCMC methods have to deal with determining an appropriate length scale for each variable. NUTS is one approach, and this paper gives another, whereby a parameter theta of a proposal distribution is adaptively improved to account for the covariance structure. At the same time, theta is adapted to account for the entropy of the proposal distribution. This trade-off for theta is rolled into a new speed measure, which is the central point of this paper. The paper includes a lower bound on the speed measure that can be directly differentiated, resulting in a practical algorithm. The paper also includes a heuristic that makes this adaptive MCMC algorithm applicable to MALA as well.
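For readers unfamiliar with MALA, the following is a minimal one-dimensional sketch of the standard Metropolis-adjusted Langevin algorithm with a fixed step size. It does not implement the adaptive scheme or the speed measure discussed in this review; the function name `mala` and its arguments are assumptions made for illustration, and the fixed `step_size` plays the role of the length scale that an adaptive method would tune.

```python
import math
import random

def mala(log_p, grad_log_p, x0, step_size, n_steps, seed=0):
    """Sample from a 1-D density exp(log_p) with fixed-step MALA."""
    rng = random.Random(seed)

    def log_q(to, frm):
        # Log density (up to a constant) of proposing `to` from `frm`.
        mean = frm + 0.5 * step_size ** 2 * grad_log_p(frm)
        return -((to - mean) ** 2) / (2.0 * step_size ** 2)

    x, samples, accepted = x0, [], 0
    for _ in range(n_steps):
        # Langevin proposal: gradient drift plus Gaussian noise.
        prop = (x + 0.5 * step_size ** 2 * grad_log_p(x)
                + step_size * rng.gauss(0.0, 1.0))
        # Metropolis-Hastings correction for the asymmetric proposal.
        log_alpha = (log_p(prop) + log_q(x, prop)
                     - log_p(x) - log_q(prop, x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = prop
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps
```

For example, `mala(lambda x: -0.5 * x * x, lambda x: -x, 0.0, 1.0, 20000)` targets a standard normal. A step size that is too small gives high acceptance but slow exploration, while one that is too large gives frequent rejections, which is the tension an adaptive method has to resolve.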
Reviews: Large Scale Markov Decision Processes with Changing Rewards
I still feel that the work would be greatly improved by adding numerical experiments. In particular, the authors refer to a specific setting called 'online MDP', where the dynamics, that is, the transition probabilities, are known while the reward is not. Regret minimization then refers to the idea of minimizing the 'regret' given that rewards could be chosen/observed in an adversarial manner. The authors start with a (rather technical) introduction, present related work, and explain the main ideas based on concise preliminaries. Afterwards, an extension to large state spaces is provided, which uses approximate occupancy measures and thereby avoids concrete state mappings.