We analyze the asymptotic behavior of agents engaged in an infinite-horizon partially observable stochastic game, as formalized by the interactive POMDP framework. We show that when agents' initial beliefs satisfy a truth compatibility condition, their behavior converges to a subjective ɛ-equilibrium in finite time, and to a subjective equilibrium in the limit. This result generalizes a similar result for repeated games to partially observable stochastic games. However, the equilibrating process turns out to be difficult to demonstrate computationally, because it is hard to come up with initial beliefs that are both natural and satisfy the truth compatibility condition. Our results therefore shed some negative light on using equilibria as a solution concept for decision making in partially observable stochastic games.
When operating in stochastic, partially observable, multiagent settings, it is crucial to accurately predict the actions of other agents. In my thesis work, I propose methodologies for learning the policies of external agents, in the form of finite state controllers, from their observed behavior. To perform this task, I adopt Bayesian learning algorithms based on nonparametric prior distributions, which provide the flexibility required to infer models of unknown complexity. These methods are to be embedded in decision-making frameworks for autonomous planning in partially observable multiagent systems.
Regulation of gene expression often involves proteins that bind to particular regions of DNA. Determining the binding sites for a protein and its specificity usually requires extensive biochemical and/or genetic experimentation. In this paper we illustrate the use of a neural network to obtain the desired information with much less experimental effort. It is often fairly easy to obtain a set of moderate-length sequences, perhaps one or two hundred base pairs each, that each contain binding sites for the protein being studied. For example, the upstream regions of a set of genes that are all regulated by the same protein should each contain binding sites for that protein.
We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our approach is based on the discovery of a set of latent, shared dynamical behaviors. Using a beta process prior, the size of the set and the sharing pattern are both inferred from data. We develop efficient Markov chain Monte Carlo methods based on the Indian buffet process representation of the predictive distribution of the beta process, without relying on a truncated model. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth and death proposals. We examine the benefits of our proposed feature-based model on several synthetic datasets, and also demonstrate promising results on unsupervised segmentation of visual motion capture data.
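As a rough illustration of the Indian buffet process mentioned above, the following sketch draws a sharing pattern from the IBP prior: each time series picks an existing behavior with probability proportional to how many earlier series use it, then adds a Poisson-distributed number of new behaviors. This covers only the generative prior, not the MCMC sampler, the sum-product computation, or the birth and death proposals. The function names, the concentration parameter `alpha`, and the hand-rolled Poisson sampler are assumptions made for this example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_series, alpha, seed=0):
    """Sample which latent behaviors each time series uses, under an IBP prior.

    Returns (assignments, feature_counts): assignments[i] is the set of behavior
    indices used by series i, and feature_counts[k] is how many series use behavior k.
    """
    rng = random.Random(seed)
    feature_counts = []
    assignments = []
    for i in range(1, num_series + 1):
        chosen = set()
        # Reuse an existing behavior k with probability m_k / i.
        for k, m_k in enumerate(feature_counts):
            if rng.random() < m_k / i:
                chosen.add(k)
        # Introduce Poisson(alpha / i) behaviors not seen in earlier series.
        for _ in range(sample_poisson(alpha / i, rng)):
            feature_counts.append(0)
            chosen.add(len(feature_counts) - 1)
        for k in chosen:
            feature_counts[k] += 1
        assignments.append(chosen)
    return assignments, feature_counts


assignments, feature_counts = sample_ibp(num_series=5, alpha=2.0, seed=1)
for index, behaviors in enumerate(assignments):
    print(f"series {index}: behaviors {sorted(behaviors)}")
```

The number of behaviors is not fixed in advance; it grows with the data, which is the property that lets the size of the behavior set be inferred rather than specified.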
In an earlier paper, a new theory of measure-free "conditional" objects was presented. In this paper, emphasis is placed upon the motivation of the theory. The central part of this motivation is established through an example involving a knowledge-based system. In order to evaluate combination of evidence for this system, using observed data, auxiliary attribute and diagnosis variables, and inference rules connecting them, one must first choose an appropriate algebraic logic description pair (ALDP): a formal language or syntax followed by a compatible logic or semantic evaluation (or model). Three common options for this highly non-unique choice are briefly discussed, the logics being Classical Logic, Fuzzy Logic, and Probability Logic. In all three, the key operator representing implication for the inference rules is interpreted as the often-used disjunction of a negation, (b => a) = (b' ∨ a), for any events a, b. However, another reasonable interpretation of the implication operator is through the familiar form of probabilistic conditioning. It can be shown, quite surprisingly, that the ALDP corresponding to Probability Logic cannot be used as a rigorous basis for this interpretation! To fill this gap, a new ALDP is constructed consisting of "conditional objects", extending ordinary Probability Logic and compatible with the desired conditional probability interpretation of inference rules. It is also shown that this choice of ALDP leads to feasible computations for the combination of evidence evaluation in the example. In addition, a number of basic properties of conditional objects and the resulting Conditional Probability Logic are given, including a characterization property and a developed calculus of relations.
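The gap between the two readings of implication can be seen in a small numeric example. The sketch below is not taken from the paper: it assumes a fair six-sided die as the sample space and two illustrative events, and compares the probability of the disjunction-of-a-negation event with the conditional probability of a given b. All names in it are hypothetical.

```python
from fractions import Fraction

# Assumed sample space for illustration: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Probability of an event under the uniform measure on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def material_implication(a, b):
    """The event (b' or a): the complement of b, together with a."""
    return (OMEGA - b) | a


def conditional_probability(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    return prob(a & b) / prob(b)


a = frozenset({2, 4, 6})  # the roll is even
b = frozenset({4, 5, 6})  # the roll is greater than 3

p_material = prob(material_implication(a, b))
p_conditional = conditional_probability(a, b)

print("P(b' or a) =", p_material)
print("P(a | b)   =", p_conditional)
```

Here the material-implication event has probability 5/6 while the conditional probability is 2/3. In general P(b' ∨ a) = 1 - P(b) + P(a ∧ b), which is never smaller than P(a | b) and equals it only in special cases, so treating a rule as the event (b' ∨ a) does not in general reproduce the conditional probability reading.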