Goto

Collaborating Authors

 Undirected Networks


Binary hidden Markov models and varieties

arXiv.org Machine Learning

The technological applications of hidden Markov models have been extremely diverse and successful, including natural language processing, gesture recognition, gene sequencing, and Kalman filtering of physical measurements. HMMs are highly non-linear statistical models, and just as linear models are amenable to linear algebraic techniques, non-linear models are amenable to commutative algebra and algebraic geometry. This paper closely examines HMMs in which all the hidden random variables are binary. Its main contributions are (1) a birational parametrization for every such HMM, with an explicit inverse for recovering the hidden parameters in terms of observables, (2) a semialgebraic model membership test for every such HMM, and (3) minimal defining equations for the 4-node fully binary model, comprising 21 quadrics and 29 cubics, which were computed using Grobner bases in the cumulant coordinates of Sturmfels and Zwiernik. The new model parameters in (1) are rationally identifiable in the sense of Sullivant, Garcia-Puente, and Spielvogel, and each model's Zariski closure is therefore a rational projective variety of dimension 5. Grobner basis computations for the model and its graph are found to be considerably faster using these parameters. In the case of two hidden states, item (2) supersedes a previous algorithm of Schonhuth which is only generically defined, and the defining equations (3) yield new invariants for HMMs of all lengths $\geq 4$. Such invariants have been used successfully in model selection problems in phylogenetics, and one can hope for similar applications in the case of HMMs.


A Unified Approach for Modeling and Recognition of Individual Actions and Group Activities

arXiv.org Machine Learning

Recognizing group activities is challenging due to the difficulties in isolating individual entities, finding the respective roles played by the individuals and representing the complex interactions among the participants. Individual actions and group activities in videos can be represented in a common framework as they share the following common feature: both are composed of a set of low-level features describing motions, e.g., optical flow for each pixel or a trajectory for each feature point, according to a set of composition constraints in both temporal and spatial dimensions. In this paper, we present a unified model to assess the similarity between two given individual or group activities. Our approach avoids explicit extraction of individual actors, identifying and representing the inter-person interactions. With the proposed approach, retrieval from a video database can be performed through Query-by-Example; and activities can be recognized by querying videos containing known activities. The suggested video matching process can be performed in an unsupervised manner. We demonstrate the performance of our approach by recognizing a set of human actions and football plays.


Predictive Information Rate in Discrete-time Gaussian Processes

arXiv.org Machine Learning

We derive expressions for the predicitive information rate (PIR) for the class of autoregressive Gaussian processes AR(N), both in terms of the prediction coefficients and in terms of the power spectral density. The latter result suggests a duality between the PIR and the multi-information rate for processes with mutually inverse power spectra (i.e. with poles and zeros of the transfer function exchanged). We investigate the behaviour of the PIR in relation to the multi-information rate for some simple examples, which suggest, somewhat counter-intuitively, that the PIR is maximised for very `smooth' AR processes whose power spectra have multiple poles at zero frequency. We also obtain results for moving average Gaussian processes which are consistent with the duality conjectured earlier. One consequence of this is that the PIR is unbounded for MA(N) processes.


Considering State in Plan Recognition with Lexicalized Grammars

AAAI Conferences

This paper documents extending the ELEXIR (Engine for LEXicalized Intent Recognition) system (Geib 2009; Geib 2011) with a world model. This is a significant increase in the expressiveness of the plan recognition system and allows a number of additions to the algorithm, most significantly conditioning probabilities for recognized plans on the state of the world during execution. Since, ELEXIR falls in the family of gramatical methods for plan recognition in viewing the problem of plan recognition as that of parsing, this paper will also briefly discuss how this extension relates to state of the art proposals in the natural language community regarding probabilistic parsing.


Using POMDPs to Control an Accuracy-Processing Time Trade-Off in Video Surveillance

AAAI Conferences

With rapid profusion of video data, automated surveillanceand intrusion detection is becoming closer to reality. In orderto provide timely responses while limiting false alarms, an intrusiondetection system must balance resources (e.g., time)and accuracy. In this paper, we show how such a system canbe modeled with a partially observable Markov decision process(POMDP), representing possible computer vision filtersand their costs in a way that is similar to human vision systems.The POMDP representation can be optimized to producea dynamic sequence of operations and achieve a tradeoffbetween time and detection quality, taking into accountuncertainty in the filter predictions. In a set of experiments onactual video data, we show that our method can both outperformstatic “expert” models and scale to large dynamic domains.These results suggest that our method could be usedin real-world intrusion detection systems.


Integrating Learner Help Requests Using a POMDP in an Adaptive Training System

AAAI Conferences

This paper describes the development and empirical testing of an intelligent tutoring system (ITS) with two emerging methodologies: (1) a partially observable Markov decision process (POMDP) for representing the learner model and (2) inquiry modeling, which informs the learner model with questions learners ask during instruction. POMDPs have been successfully applied to non-ITS domains but, until recently, have seemed intractable for large-scale intelligent tutoring challenges. New, ITS-specific representations leverage common regularities in intelligent tutoring to make a POMDP practical as a learner model. Inquiry modeling is a novel paradigm for informing learner models by observing rich features of learners’ help requests such as categorical content, context, and timing. The experiment described in this paper demonstrates that inquiry modeling and planning with POMDPs can yield significant and substantive learning improvements in a realistic, scenario-based training task.


Solving Goal Hybrid Markov Decision Processes Using Numeric Classical Planners

AAAI Conferences

We present the domain-independent HRFF algorithm, which solves goal-oriented HMDPs by incrementally aggregating plans generated by the Metric-FF planner into a policy defined over discrete and continuous state variables. HRFF takes into account non-monotonic state variables, and complex combinations of many discrete and continuous probability distributions. We introduce new data structures and algorithmic paradigms to deal with continuous state spaces: hybrid hierarchical hash tables, domain determinization based on dynamic domain sampling or on static computation of probability distributions' modes, optimization settings under Metric-FF based on plan probability and length. We compare with HAO* on the Rover domain and show that HRFF outperforms HAO* by many order of magnitudes in terms of computation time and memory usage. We also experiment challenging and combinatorial HMDP versions of benchmarks from numeric classical planning, with continuous dead-ends and non-monotonic continuous state variables.


Twenty-Five Years of Combining Symbolic and Numeric Learning

AAAI Conferences

For nearly 25 years my research group has investigated the use of domain knowledge, expressed in some version of mathematical logic, that is refined or exploited by numeric-based learning algorithms. These include what we called knowledge-based neural networks and knowledge-based support vector machines. I will cover the key ideas of these methods, as well as the behind-the-scenes motivations that lead to them. I will also describe why we switched from using the phrase 'prior knowledge' to using 'advice.' Finally, I will cover some of our recent work on fast learning and inference for Markov Logic Networks (which can be viewed as a knowledge-based graphical model).


Challenges in Learning Optimum Models for Complex First Order Activity Recognition Settings

AAAI Conferences

Non intrusive activity recognition systems typically read values from sensors deployed in an environment and combine them with user annotated activities to build a probabilistic model. Recently, features constructed from activity specific conjunctions of binary sensor values have been shown to improve the classification accuracy. Such systems employ greedy feature induction techniques to find the observation features and combine them with state transition distribution in a Hidden Markov Model or a Conditional Random Field. An exhaustive search for optimum features is infeasible in this exponential feature space. We have recently extended the rule ensemble learning using hierarchical kernels (RELHKL) framework, that learns a sparse set of simple features and their optimum weights, to structured output spaces for learning optimum observation features along with the transition features and their weights. The exponentially large space of conjunctions is handled efficiently by exploiting its hierarchical structure. Our experiments have shown good improvement over other approaches. Although such approaches solve propositional classification problems optimally, their first-order extension is non-trivial and is a challenging problem. In this paper, we discuss about the challenges involved in leveraging the RELHKL in structured output spaces approach to learn optimum features in complex first order activity recognition settings.


Recognizing Continuous Social Engagement Level in Dyadic Conversation by Using Turn-taking and Speech Emotion Patterns

AAAI Conferences

Recognizing social interests plays an important role of aiding human-computer interaction and human collaborative works. The recognition of social interest could be of great help to determine the smoothness of the interaction, which could be an indicator for group work performance and relationship. From socio-psychological theories, social engagement is the observable form of inner social interest, and represented as patterns of turn-taking and speech emotion during a face-to-face conversation. With these two kinds of features, a multi-layer learning structure is proposed to model the continuous trend of engagement. The level of engagement is classified into “high” and “low” two levels according to human-annotated score. In the result of assessing two-level engagemet, the highest accuracy of our model can reach 79.1%.