Markov Models
Identification of Nonparametric Dynamic Causal Structure and Latent Process in Climate System
Fu, Minghao, Huang, Biwei, Li, Zijian, Zheng, Yujia, Ng, Ignavier, Hu, Yingyao, Zhang, Kun
The study of learning causal structure with latent variables has advanced the understanding of the world by uncovering causal relationships and latent factors, e.g., Causal Representation Learning (CRL). However, in real-world scenarios, such as those in climate systems, causal relationships are often nonparametric, dynamic, and exist among both observed variables and latent variables. These challenges motivate us to consider a general setting in which causal relations are nonparametric and unrestricted in their occurrence, which is unconventional to current methods. To solve this problem, with the aid of 3-measurement in temporal structure, we theoretically show that both latent variables and processes can be identified up to minor indeterminacy under mild assumptions. Moreover, we tackle the general nonlinear Causal Discovery (CD) from observations, e.g., temperature, as a specific task of learning independent representation, through the principle of functional equivalence. Based on these insights, we develop an estimation approach simultaneously recovering both the observed causal structure and latent causal process in a nontrivial manner. Simulation studies validate the theoretical foundations and demonstrate the effectiveness of the proposed methodology. In the experiments involving climate data, this approach offers a powerful and in-depth understanding of the climate system.
Inverse Reinforcement Learning with Switching Rewards and History Dependency for Characterizing Animal Behaviors
Ke, Jingyang, Wu, Feiyang, Wang, Jiyi, Markowitz, Jeffrey, Wu, Anqi
Traditional approaches to studying decision-making in neuroscience focus on simplified behavioral tasks where animals perform repetitive, stereotyped actions to receive explicit rewards. While informative, these methods constrain our understanding of decision-making to short timescale behaviors driven by explicit goals. In natural environments, animals exhibit more complex, long-term behaviors driven by intrinsic motivations that are often unobservable. Recent works in time-varying inverse reinforcement learning (IRL) aim to capture shifting motivations in long-term, freely moving behaviors. However, a crucial challenge remains: animals make decisions based on their history, not just their current state. To address this, we introduce SWIRL (SWitching IRL), a novel framework that extends traditional IRL by incorporating time-varying, history-dependent reward functions. SWIRL models long behavioral sequences as transitions between short-term decision-making processes, each governed by a unique reward function. SWIRL incorporates biologically plausible history dependency to capture how past decisions and environmental contexts shape behavior, offering a more accurate description of animal decision-making. We apply SWIRL to simulated and real-world animal behavior datasets and show that it outperforms models lacking history dependency, both quantitatively and qualitatively. This work presents the first IRL model to incorporate history-dependent policies and rewards to advance our understanding of complex, naturalistic decision-making in animals. Historically, decision making in neuroscience has been studied using simplified assays where animals perform repetitive, stereotyped actions (such as licks, nose pokes, or lever presses) in response to sensory stimuli to obtain an explicit reward. 
While this approach has its advantages, it has limited our understanding of decision making to scenarios where animals are instructed to achieve an explicit goal over brief timescales, usually no more than tens of seconds.
GATE: Adaptive Learning with Working Memory by Information Gating in Multi-lamellar Hippocampal Formation
Liu, Yuechen, Wang, Zishun, Qiao, Chen, Xu, Zongben
Hippocampal formation (HF) can rapidly adapt to varied environments and build flexible working memory (WM). To mirror the HF's mechanisms of generalization and WM, we propose a model named Generalization and Associative Temporary Encoding (GATE), which deploys a 3-D multi-lamellar dorsoventral (DV) architecture and learns to build up internal representations from externally driven information layer by layer. In each lamella, the HF regions EC3-CA1-EC5-EC3 form a re-entrant loop that discriminately maintains information through EC3 persistent activity and selectively reads out the retained information through CA1 neurons. CA3 and EC5 further provide gating functions that control these processes. After learning complex WM tasks, GATE forms neuron representations that align with experimental records, including splitter, lap, evidence, trace, and delay-active cells, as well as conventional place cells. Crucially, the DV architecture in GATE also captures information ranging from detailed to abstract, which enables rapid generalization when the cue, environment, or task changes, with learned representations inherited. GATE promises a viable framework for understanding the HF's flexible memory mechanisms and for progressively developing brain-inspired intelligent systems.
Reviews: Unsupervised Risk Estimation Using Only Conditional Independence Structure
I found the paper very well presented and enjoyable to read. The basic problem is interesting, and the approach presented has some salient features, notably the fact that one does not have to make parametric assumptions on the underlying distribution. The high-level idea of imposing structural assumptions but nonetheless relying on discriminative models was quite elegant. The basic insight in estimating the risk from unlabelled data is that by encoding a certain structural assumption - namely, that the data comprises three independent views - one implicitly gets information about the class-conditional risks by considering the first three moments of the label vectors. This leads to a system of equations which may be solved to infer the class-conditional risks.
Reviews: Optimal Tagging with Markov Chain Optimization
Optimization of the link structure for PR is not a new topic. Apart from papers mentioned in Related work, there are also those not reviewed, including "PageRank Optimization by Edge Selection" by Csaji et al., "Maximizing PageRank with New Backlinks" by Olsen, "PageRank Optimization in Polynomial Time by Stochastic Shortest Path Reformulation" by Csaji et al. **The novelty** of the study is questionable. The probability of reaching the target state \sigma can be viewed as the state's stationary probability for the graph, where the added edges are directed to the state \sigma and the matrix of transition probabilities is raised to an appropriate power. This observation does not immediately reduce the problem of the paper to a known task; however, it may partially explain the similarity between the theoretical part and the works of Olsen, where the stationary probability is maximized. In particular, Section 4 resembles the work "Maximizing PageRank with New Backlinks" (not cited in the paper), where M. Olsen considered a reduction of a Markov chain optimization problem to the independent set problem, which is equivalent to the vertex cover problem. Theorems 5.1 and 5.3 are reasonable, but very simple, and resemble Lemmas 1 and 2 from [15].
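The quantity this review discusses, the probability of reaching a target state \sigma, can be illustrated with a minimal sketch. It is not the reviewed paper's algorithm: the function name `reach_probability`, the 3-state transition matrices, and the step budget `k` are all hypothetical, and the sketch only shows how adding an edge directed at the target changes the probability of reaching it within `k` steps.

```python
def reach_probability(P, start, target, k):
    """Probability of visiting `target` within k steps when starting at `start`.

    P is a row-stochastic transition matrix given as a list of lists.
    """
    n = len(P)
    # Make the target absorbing so that mass arriving there is never lost.
    Q = [row[:] for row in P]
    Q[target] = [1.0 if j == target else 0.0 for j in range(n)]
    # Propagate the start distribution k steps through the modified chain.
    dist = [1.0 if i == start else 0.0 for i in range(n)]
    for _ in range(k):
        dist = [sum(dist[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return dist[target]


# Hypothetical chain: state 2 is the target; state 0 has no direct edge to it.
P_before = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]
# Same chain after adding an edge 0 -> 2 (row 0 renormalized to be uniform).
P_after = [
    [1 / 3, 1 / 3, 1 / 3],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]

before = reach_probability(P_before, start=0, target=2, k=2)
after = reach_probability(P_after, start=0, target=2, k=2)
```

Here `after` exceeds `before`, since the new edge gives state 0 a direct route to the target.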
Reviews: Pairwise Choice Markov Chains
This paper considers the problem of developing flexible choice models that are not constrained to satisfy traditional, restrictive choice axioms (such as Luce's axiom of independence of irrelevant alternatives, IIA), but that can be tractably inferred from data. A (discrete) choice model over n items specifies probabilities of the form p(i,S) = Prob(i chosen from S) for each subset of items S \subseteq [n] and each item i \in S. One of the most widely used models of discrete choice is the multinomial logit (MNL) choice model, which can be inferred efficiently from data but which is constrained to satisfy IIA and other restrictive assumptions. The paper proposes a new Markov chain based model of discrete choice that is parametrized by an (n x n) pairwise selection probability matrix. The model avoids several of the earlier restrictive assumptions, but is shown to satisfy an interesting property termed contractibility, which in turn also implies a reasonable property of uniform expansion. Parameter estimation in the model is done by maximum likelihood (the log-likelihood function is non-concave in general, but the experiments suggest that good parameters are learned).
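To make the IIA constraint mentioned above concrete, the following is a minimal sketch of the standard MNL choice probabilities, p(i,S) = exp(u_i) / \sum_{j \in S} exp(u_j). The item names and utility values are hypothetical, and the sketch illustrates only the MNL baseline, not the Markov chain model proposed in the reviewed paper.

```python
import math


def mnl_choice_probs(utilities, subset):
    """MNL probability of choosing each item in `subset`.

    `utilities` maps item -> real-valued utility u_i (illustrative values).
    """
    weights = {i: math.exp(utilities[i]) for i in subset}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


# Hypothetical utilities for three items.
utilities = {"a": 1.0, "b": 0.0, "c": -0.5}

full = mnl_choice_probs(utilities, ["a", "b", "c"])
pair = mnl_choice_probs(utilities, ["a", "b"])

# IIA: the odds of "a" over "b" do not depend on whether "c" is offered.
ratio_full = full["a"] / full["b"]
ratio_pair = pair["a"] / pair["b"]
```

Both ratios equal exp(u_a - u_b), which is exactly the restriction that more flexible choice models aim to relax.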
Reviews: Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
The paper is well-written and clear. The proposed idea is interesting. I have the following comments/questions: 1) Does the Lipschitz assumption hold here with a probability or is it assumed to always hold? 2) Figure 1: should it be \bar{s}_2 instead of s_2 in the caption? The use of bar for non-sets is confusing. I do not see the need for the last intersection in Equation 4. 4) When you repeatedly apply Equation 4, the number of states that satisfy the safety constraint shrinks because you use Lipschitz in the worst scenario sense.
Reviews: On Mixtures of Markov Chains
The paper is globally sound and makes a new contribution to an important topic. However some technicalities need to be addressed and a revised version should be encouraged. Major remarks: - There is a confusion on whether the Markov chains under consideration are supposed to be stationary or not. Indeed, the concept of t-trail either requires that the Markov chains under study are stationary or one should specify that all the trails start with the same initial distribution, i.e. these trails are observations of (X_1, X_2, X_3) and not (X_s, X_{s+1}, X_{s+2}) for some s. I first understood that you adopt the second approach (as you count the parameters of initial distributions as free parameters) but in the real data experiments, you take many (3001)-trails and break them into 3000 overlapping 3-trails (by the way, you need a 3002-trail to obtain these).
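The counting remark at the end of this review can be checked with a short sketch. It assumes that a t-trail is a sequence of t consecutive states, which is the reading the reviewer's arithmetic implies; the helper name `overlapping_trails` and the dummy state sequences are hypothetical.

```python
def overlapping_trails(states, t=3):
    """Return every window of t consecutive states, i.e. (X_s, ..., X_{s+t-1})."""
    return [tuple(states[s:s + t]) for s in range(len(states) - t + 1)]


# Dummy state sequences of the two lengths discussed in the review.
trail_3001 = list(range(3001))
trail_3002 = list(range(3002))

n_from_3001 = len(overlapping_trails(trail_3001))
n_from_3002 = len(overlapping_trails(trail_3002))
```

A sequence of n states yields n - 2 overlapping 3-trails, so 3000 of them require 3002 states. Note that only the first window starts at X_1; the others start at X_s for s > 1, which is why the stationarity question raised above matters.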
Reviews: Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling
Technically the paper is very strong. The results presented by the authors are, to the best of my knowledge, novel and significant. However my main criticism of the paper is that the presentation is very esoteric. This is clear already in the introduction where the authors fail to explain some of the basic notation that is central to the remainder of the paper, see (1)-(3) below. This continues throughout the paper making it hard to read for non-experts in the field, see e.g.
Reviews: Poisson-Gamma dynamical systems
The proposed model is novel and practical, as seen from the experimental result. It is rare to see a Bayesian nonparametric model being applied to large data as it is generally not very scalable. It is a feat to see this model applied to data with high dimensions (9000 dimensions with millions of events). I am interested to know how much time is spent for training? It would be good to also present the computational time (say in the supplementary material).