Markov Models
An MDP-Based Winning Approach to Autonomous Power Trading: Formalization and Empirical Analysis
Urieli, Daniel (The University of Texas at Austin) | Stone, Peter (The University of Texas at Austin)
With the efforts of moving to sustainable and reliable energy supply, electricity markets are undergoing far-reaching changes. Due to the high-cost of failure in the real-world, it is important to test new market structures in simulation. This is the focus of the Power Trading Agent Competition (Power TAC), which proposes autonomous electricity broker agents as a means for stabilizing the electricity grid. This paper focuses on the question: how should an autonomous electricity broker agent act in competitive electricity markets to maximize its profit. We formalize the complete electricity trading problem as a continuous, high-dimensional Markov Decision Process (MDP), which is computationally intractable to solve. Our formalization provides a guideline for approximating the MDP's solution, and for extending existing solutions. We show that a previously champion broker can be viewed as approximating the solution using a lookahead policy. We present TacTex15, which improves upon this previous approximation and achieves state-of-the-art performance in competitions and controlled experiments. Using thousands of experiments against 2015 finalist brokers, we analyze TacTex15's performance and the reasons for its success. We find that lookahead policies can be effective, but their performance can be sensitive to errors in the transition function prediction, specifically demand-prediction.
Deep Activity Recognition Models with Triaxial Accelerometers
Alsheikh, Mohammad Abu (Nanyang Technological University) | Selim, Ahmed (Trinity College Dublin) | Niyato, Dusit (Nanyang Technological University) | Doyle, Linda (Trinity College Dublin) | Lin, Shaowei (Institute for Infocomm Research) | Tan, Hwee-Pink (Singapore Management University)
Despite the widespread installation of accelerometers in almost all mobile phones and wearable devices, activity recognition using accelerometers is still immature due to the poor recognition accuracy of existing recognition methods and the scarcity of labeled training data. We consider the problem of human activity recognition using triaxial accelerometers and deep learning paradigms. This paper shows that deep activity recognition models (a) provide better recognition accuracy of human activities, (b) avoid the expensive design of handcrafted features in existing systems, and (c) utilize the massive unlabeled acceleration samples for unsupervised feature extraction. We show substantial recognition improvement on real world datasets over state-of-the-art methods of human activity recognition using triaxial accelerometers.
An Analysis of Trimming in Digital Social Networks
Murimi, Renita Margaret (Oklahoma Baptist University)
The study of network sizes in digital social networks is a research question of significant interest. Here, we explore the phenomenon of trimming, which is the decrease in the size of oneโs network, and analyze if the rules of social exchange theory โ namely, status consistency and reciprocity- can affect trimming. To this end, we use a Hidden Markov Model to investigate the relationship between the frequency of interaction and oneโs network size, in which we are able to control for the current size of oneโs digital social network. We find that there are significant patterns in sharing tendencies in digital social networks. One is that users who do not share enough are the group that is most likely to be trimmed from a network. Another is that users prefer to have moderate sized networks, i.e. networks with 500 โ 1000 friends and prefer friends with moderate sharing tendencies (sharing approximately once a week). We also find that oneโs sharing preferences over time tend to align with moderate sharing.
SlimShot: Probabilistic Inference for Web-Scale Knowledge Bases
Gribkoff, Eric (University of Washington) | Suciu, Dan (University of Washington)
Increasingly large Knowledge Bases are being created, by crawling the Web or other corpora of documents, and by extracting facts and relations using machine learning techniques. To manage the uncertainty in the data, these KBs rely on probabilistic engines based on Markov Logic Networks (MLN), for which probabilistic inference remains a major challenge. Today's state of the art systems reduce the task of inference to weighted model counting and use an MCMC algorithm wrapped around SampleSAT to generate approximately uniform samples. This approach offers no theoretical error guarantees and, as we show, suffers from poor performance in practice. In this paper we describe SlimShot (Scalable Lifted Inference and Monte Carlo Sampling Hybrid Optimization Technique), a probabilistic inference engine for Web-Scale knowledge bases. SlimShot converts the MLN to a tuple-independent probabilistic database, then uses a simple Monte Carlo-based inference, with three key enhancements: (1) it combines sampling with safe query evaluation, (2) it estimates a conditional probability by jointly computing the numerator and denominator, and (3) it adjusts the proposal distribution based on the sample cardinality. In combination, these three techniques allow us to give formal error guarantees, and we demonstrate empirically that SlimShot outperforms today's state of the art probabilistic inference engines used in knowledge bases.
Efficient Inference in Dual-Emission FHMM for Energy Disaggregation
Lange, Henning (Aalto University) | Bergรฉs, Mario (Carnegie Mellon University)
In this paper an extension to factorial hidden Semi Markov Models is introduced that allows modeling more than one sequence of emissions of the individual HMM chains, as well as a joint emission of all chains. Since exact inference in factorial hidden Markov Models is computationally intractable, an approximate inference technique is introduced that reduces the computational costs by first constraining the successor state space of the model, allowing state changes at statistically significant points in time (events) and by discarding low probability paths (truncating). Furthermore, by being agnostic about state durations the computational costs are further decreased. These assumptions allow for efficient inference that is less susceptible to local minima and allows one to specify the computational burden a priori. The performance of the inference technique is evaluated empirically on a synthetic data set whereas incorporating the feature emissions is evaluated on real world data in the context of energy disaggregation. Energy disaggregation tackles the problem of decomposing whole home energy measurements into the power traces of constituent appliances, and is a natural application for this type of models.
A statistical learning strategy for closed-loop control of fluid flows
Guรฉniat, Florimond, Mathelin, Lionel, Hussaini, M. Yousuff
This work discusses a closed-loop control strategy for complex systems utilizing scarce and streaming data. A discrete embedding space is first built using hash functions applied to the sensor measurements from which a Markov process model is derived, approximating the complex system's dynamics. A control strategy is then learned using reinforcement learning once rewards relevant with respect to the control objective are identified. This method is designed for experimental configurations, requiring no computations nor prior knowledge of the system, and enjoys intrinsic robustness. It is illustrated on two systems: the control of the transitions of a Lorenz 63 dynamical system, and the control of the drag of a cylinder flow. The method is shown to perform well.
Grid Based Nonlinear Filtering Revisited: Recursive Estimation & Asymptotic Optimality
Kalogerias, Dionysios S., Petropulu, Athina P.
We revisit the development of grid based recursive approximate filtering of general Markov processes in discrete time, partially observed in conditionally Gaussian noise. The grid based filters considered rely on two types of state quantization: The \textit{Markovian} type and the \textit{marginal} type. We propose a set of novel, relaxed sufficient conditions, ensuring strong and fully characterized pathwise convergence of these filters to the respective MMSE state estimator. In particular, for marginal state quantizations, we introduce the notion of \textit{conditional regularity of stochastic kernels}, which, to the best of our knowledge, constitutes the most relaxed condition proposed, under which asymptotic optimality of the respective grid based filters is guaranteed. Further, we extend our convergence results, including filtering of bounded and continuous functionals of the state, as well as recursive approximate state prediction. For both Markovian and marginal quantizations, the whole development of the respective grid based filters relies more on linear-algebraic techniques and less on measure theoretic arguments, making the presentation considerably shorter and technically simpler.
Support Consistency of Direct Sparse-Change Learning in Markov Networks
Liu, Song, Suzuki, Taiji, Relator, Raissa, Sese, Jun, Sugiyama, Masashi, Fukumizu, Kenji
We study the problem of learning sparse structure changes between two Markov networks $P$ and $Q$. Rather than fitting two Markov networks separately to two sets of data and figuring out their differences, a recent work proposed to learn changes \emph{directly} via estimating the ratio between two Markov network models. In this paper, we give sufficient conditions for \emph{successful change detection} with respect to the sample size $n_p, n_q$, the dimension of data $m$, and the number of changed edges $d$. When using an unbounded density ratio model we prove that the true sparse changes can be consistently identified for $n_p = \Omega(d^2 \log \frac{m^2+m}{2})$ and $n_q = \Omega({n_p^2})$, with an exponentially decaying upper-bound on learning error. Such sample complexity can be improved to $\min(n_p, n_q) = \Omega(d^2 \log \frac{m^2+m}{2})$ when the boundedness of the density ratio model is assumed. Our theoretical guarantee can be applied to a wide range of discrete/continuous Markov networks.
AI, NLP: Way Deeper Than Microsoft's Marketing
Welcome to the natural language processing (NLP) edition of my "under-the-hood" series on AI (artificial intelligence) technology. Most major tech companies are announcing initiatives, introducing products (e.g., Amazon (NASDAQ:AMZN) with its Echo device and platform) or experiments in public (e.g., Microsoft's (NASDAQ:MSFT) Tay debacle). On Seeking Alpha (and I am sure in many other places as well), the topic of natural language processing seems to be confusing many people because they are not clear on what new technologies here can do and what they should expect from them in the immediate future. Even though I have touched on the subject before - e.g. on my explanation of IBM's (NYSE:IBM) Watson - I feel like I have not explained the specific topic of natural language processing comprehensively. There is a specific reason why the topic is getting so much attention and it has to do with the convergence of various subfields of computer science and statistics.