Markov Models
A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era
Ren, Zhao, Chang, Yi, Nguyen, Thanh Tam, Tan, Yang, Qian, Kun, Schuller, Björn W.
Heart sound auscultation has been demonstrated to be beneficial in clinical usage for early screening of cardiovascular diseases. Because auscultation requires well-trained professionals, automatic auscultation based on signal processing and machine learning can assist diagnosis and reduce the burden of training professional clinicians. Nevertheless, classic machine learning offers limited performance improvement in the era of big data. Deep learning has achieved better performance than classic machine learning in many research fields, as it employs more complex model architectures with a stronger capability of extracting effective representations. Deep learning has been successfully applied to heart sound analysis in the past years. As most review works about heart sound analysis were published before 2017, the present survey is the first comprehensive overview to summarise papers on heart sound analysis with deep learning from the six years 2017--2022. We introduce both classic machine learning and deep learning for comparison, and further offer insights about the advances and future research directions in deep learning for heart sound analysis.
Learning-Based Data Storage [Vision] (Technical Report)
Deep neural networks (DNNs) and their variants have been extensively used for a wide spectrum of real applications such as image classification, face/speech recognition, fraud detection, and so on. Beyond serving many important machine learning tasks, DNNs, as artificial networks emulating the way brain cells function, can also store non-linear relationships between input and output data, which suggests the potential of storing data via DNNs. We envision a new paradigm of data storage, "DNN-as-a-Database", where data are encoded in well-trained machine learning models. Compared with conventional data storage that directly records data in raw formats, learning-based structures (e.g., DNN) can implicitly encode data pairs of inputs and outputs and compute/materialize actual output data of different resolutions only if input data are provided. This new paradigm can greatly enhance data security by allowing flexible data privacy settings on different levels, achieve low space consumption and fast computation with the acceleration of new hardware (e.g., Diffractive Neural Network and AI chips), and can be generalized to distributed DNN-based storage/computing. In this paper, we propose this novel concept of learning-based data storage, which utilizes a learning structure called learning-based memory unit (LMU), to store, organize, and retrieve data. As a case study, we use DNNs as the engine in the LMU, and study the data capacity and accuracy of the DNN-based data storage. Our preliminary experimental results show the feasibility of the learning-based data storage by achieving high (100%) accuracy of the DNN storage. We explore and design effective solutions to utilize the DNN-based data storage to manage and query relational tables. We discuss how to generalize our solutions to other data types (e.g., graphs) and environments such as distributed DNN storage/computing.
Game Theoretic Decision Making by Actively Learning Human Intentions Applied on Autonomous Driving
Dai, Siyu, Bae, Sangjae, Isele, David
The ability to estimate human intentions and interact with human drivers intelligently is crucial for autonomous vehicles to successfully achieve their objectives. In this paper, we propose a game theoretic planning algorithm that models human opponents with an iterative reasoning framework and estimates human latent cognitive states through probabilistic inference and active learning. By modeling the interaction as a partially observable Markov decision process with adaptive state and action spaces, our algorithm is able to accomplish real-time lane changing tasks in a realistic driving simulator. We compare our algorithm's lane changing performance in dense traffic with a state-of-the-art autonomous lane changing algorithm to show the advantage of iterative reasoning and active learning in terms of avoiding overly conservative behaviors and achieving the driving objective successfully.
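The probabilistic inference over latent human states mentioned above can be illustrated with a plain Bayes update. This is only a minimal sketch, not the authors' algorithm: the driver types, the observation, and the likelihood values are hypothetical and chosen for illustration.

```python
from typing import Dict


def update_belief(prior: Dict[str, float],
                  likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes update: posterior(s) is proportional to prior(s) * P(obs | s)."""
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every state")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical latent driver types with a uniform prior.
belief = {"cooperative": 0.5, "aggressive": 0.5}

# Assumed likelihood of observing "the other driver slows down" per type.
slows_down = {"cooperative": 0.8, "aggressive": 0.2}

belief = update_belief(belief, slows_down)
print(belief)
```

After one observation the belief shifts toward the type that better explains the behaviour; a planner could then choose actions against this updated belief.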
machine-learning-engineer-skills-career-path
Machine Learning (ML) is the branch of Artificial Intelligence in which algorithms learn from the data provided in order to make predictions on unseen data. Recently, the demand for Machine Learning engineers has grown rapidly across healthcare, finance, e-commerce, and other sectors. According to Glassdoor, the median ML engineer salary is $131,290 per annum. In 2021, the global ML market was valued at $15.44 billion. It is expected to grow at a compound annual growth rate (CAGR) above 38% until 2029.
My Actions Speak Louder Than Your Words: When User Behavior Predicts Their Beliefs about Agents' Attributes
Gurney, Nikolos, Pynadath, David, Wang, Ning
A widely cited explanation for how humans think about trustworthiness posits that people consider three factors, or traits, of a person (or agent) when they evaluate trustworthiness: ability, benevolence, and integrity [20]. It is common practice for intelligent agent researchers to adapt a psychometric inventory of this three-factor model of trustworthiness for assessing users' perceived trustworthiness of agents [19]. In theory, administering the inventory prior to an interaction allows researchers to assess the role of anticipated agent trustworthiness in users' behavior, while post hoc administration allows researchers to assess whether particular elements of an interaction, perhaps an experimental manipulation, impacted users' opinions of the agent. In practice, however, people frequently misuse information when they form judgments and make decisions [11, 17]. For example, a person who is momentarily happy (sad), perhaps from reminiscing about a positive (negative) event from their recent past, is likely to rate their life satisfaction as higher (lower) than if you asked them when they were in a neutral state [25]. Regardless of the saliency of information, the normative approach is to always use it the same way.
The configurable tree graph (CT-graph): measurable problems in partially observable and distal reward environments for lifelong reinforcement learning
Soltoggio, Andrea, Ben-Iwhiwhu, Eseoghene, Peridis, Christos, Ladosz, Pawel, Dick, Jeffery, Pilly, Praveen K., Kolouri, Soheil
Many real-world problems are characterized by a large number of observations, confounding and spurious correlations, partially observable states, and distal, dynamic rewards with hierarchical reward structures. Such conditions make it hard for both animals and machines to learn complex skills. The learning process requires discovering what is important and what can be ignored, how the reward function is structured, and how to reuse knowledge across different tasks that share common properties. For these reasons, the application of standard reinforcement learning (RL) algorithms (Sutton and Barto, 2018) to solve structured problems is often not effective. Limitations of current RL algorithms include the problem of exploration with sparse rewards (Pathak et al., 2017), dealing with partially observable Markov decision problems (POMDPs) (Ladosz et al., 2021), coping with large amounts of confounding stimuli (Thrun, 2000; Kim et al., 2019), and reusing skills for efficiently learning multiple tasks in a lifelong learning setting (Mendez and Eaton, 2020). Standard reinforcement learning algorithms are best suited when the problem can be formulated as a single-task problem in an observable Markov decision problem (MDP). Under these assumptions, with complete observability and with static and frequent rewards, deep reinforcement learning (DRL) (Mnih et al., 2015; Li, 2017) has gained popularity due to the ability to learn an approximated Q-value function directly from raw pixel data in the Atari 2600 platform. This and similar algorithms stack multiple frames to derive states of an MDP, and use a basic ε-greedy exploration policy. In more complex cases with partial observability and sparse rewards, extensions have been proposed to include more advanced exploration techniques (Ladosz et al., 2022), e.g.
Critic Sequential Monte Carlo
Lioutas, Vasileios, Lavington, Jonathan Wilder, Sefas, Justice, Niedoba, Matthew, Liu, Yunpeng, Zwartsenberg, Berend, Dabiri, Setareh, Wood, Frank, Scibior, Adam
We introduce CriticSMC, a new algorithm for planning as inference built from a composition of sequential Monte Carlo with learned Soft-Q function heuristic factors. These heuristic factors, obtained from parametric approximations of the marginal likelihood ahead, more effectively guide SMC towards the desired target distribution, which is particularly helpful for planning in environments with hard constraints placed sparsely in time. Compared with previous work, we modify the placement of such heuristic factors, which allows us to cheaply propose and evaluate large numbers of putative action particles, greatly increasing inference and planning efficiency. CriticSMC is compatible with informative priors, whose density function need not be known, and can be used as a model-free control algorithm. Our experiments on collision avoidance in a high-dimensional simulated driving task show that CriticSMC significantly reduces collision rates at a low computational cost while maintaining realism and diversity of driving behaviors across vehicles and environment scenarios.
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Jin, Tiancheng, Lancewicki, Tal, Luo, Haipeng, Mansour, Yishay, Rosenberg, Aviv
The standard assumption in reinforcement learning (RL) is that agents observe feedback for their actions immediately. However, in practice feedback is often observed with delay. This paper studies online learning in episodic Markov decision processes (MDPs) with unknown transitions, adversarially changing costs, and unrestricted delayed bandit feedback. More precisely, the feedback for the agent in episode $k$ is revealed only at the end of episode $k + d^k$, where the delay $d^k$ can change over episodes and is chosen by an oblivious adversary. We present the first algorithms that achieve near-optimal $\sqrt{K + D}$ regret, where $K$ is the number of episodes and $D = \sum_{k=1}^K d^k$ is the total delay, significantly improving upon the best known regret bound of $(K + D)^{2/3}$.
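To get a feel for the gap between the two rates, the sketch below evaluates them for illustrative values. The numbers chosen for K and D are hypothetical, and constants and logarithmic factors are ignored, so only the growth rates are comparable.

```python
import math


def sqrt_rate(num_episodes: int, total_delay: int) -> float:
    """Growth rate sqrt(K + D), ignoring constants and log factors."""
    return math.sqrt(num_episodes + total_delay)


def two_thirds_rate(num_episodes: int, total_delay: int) -> float:
    """Growth rate (K + D)^(2/3), ignoring constants and log factors."""
    return (num_episodes + total_delay) ** (2.0 / 3.0)


K = 10_000   # hypothetical number of episodes
D = 90_000   # hypothetical total delay, the sum of d^k over all episodes

new_rate = sqrt_rate(K, D)
old_rate = two_thirds_rate(K, D)

# The ratio of the two rates is (K + D)^(1/6), so it grows with K + D.
ratio = old_rate / new_rate
print(f"sqrt rate: {new_rate:.1f}, 2/3 rate: {old_rate:.1f}, ratio: {ratio:.2f}")
```

Because the ratio equals $(K + D)^{1/6}$, the advantage of the $\sqrt{K + D}$ rate widens as the number of episodes or the total delay increases.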
Robust Control for Dynamical Systems with Non-Gaussian Noise via Formal Abstractions
Badings, Thom (Radboud University) | Romao, Licio (University of Oxford) | Abate, Alessandro (University of Oxford) | Parker, David (University of Oxford) | Poonawala, Hasan A. (University of Kentucky) | Stoelinga, Marielle (Radboud University) | Jansen, Nils (University of Twente)
Controllers for dynamical systems that operate in safety-critical settings must account for stochastic disturbances. Such disturbances are often modeled as process noise in a dynamical system, and common assumptions are that the underlying distributions are known and/or Gaussian. In practice, however, these assumptions may be unrealistic and can lead to poor approximations of the true noise distribution. We present a novel controller synthesis method that does not rely on any explicit representation of the noise distributions. In particular, we address the problem of computing a controller that provides probabilistic guarantees on safely reaching a target, while also avoiding unsafe regions of the state space. First, we abstract the continuous control system into a finite-state model that captures noise by probabilistic transitions between discrete states. As a key contribution, we adapt tools from the scenario approach to compute probably approximately correct (PAC) bounds on these transition probabilities, based on a finite number of samples of the noise. We capture these bounds in the transition probability intervals of a so-called interval Markov decision process (iMDP). This iMDP is, with a user-specified confidence probability, robust against uncertainty in the transition probabilities, and the tightness of the probability intervals can be controlled through the number of samples. We use state-of-the-art verification techniques to provide guarantees on the iMDP and compute a controller for which these guarantees carry over to the original control system. In addition, we develop a tailored computational scheme that reduces the complexity of the synthesis of these guarantees on the iMDP. Benchmarks on realistic control systems show the practical applicability of our method, even when the iMDP has hundreds of millions of transitions.
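The idea of bounding a transition probability from a finite number of noise samples, with an interval that tightens as samples are added, can be sketched as follows. Note the substitution: the abstract uses the scenario approach, whereas this sketch uses Hoeffding's inequality as a simpler stand-in. The function name and the sample counts are hypothetical.

```python
import math
from typing import Tuple


def hoeffding_interval(successes: int, n_samples: int,
                       confidence: float) -> Tuple[float, float]:
    """Interval containing the true probability with the given confidence.

    Uses Hoeffding's inequality (not the scenario approach):
    P(|p_hat - p| >= eps) <= 2 * exp(-2 * N * eps^2).
    """
    if n_samples <= 0 or not 0 <= successes <= n_samples:
        raise ValueError("need n_samples > 0 and 0 <= successes <= n_samples")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    delta = 1.0 - confidence
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))
    p_hat = successes / n_samples
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)


# Hypothetical counts: how many sampled successor states landed in one
# particular discrete region of the abstraction.
coarse = hoeffding_interval(30, 100, confidence=0.99)
fine = hoeffding_interval(3_000, 10_000, confidence=0.99)
print(coarse, fine)
```

Both intervals are centred on the same empirical frequency of 0.3, but the one built from 10,000 samples is ten times narrower. Intervals of this kind are what an interval MDP stores on each transition.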
Characteristics of Restricted Boltzmann Machines, Part 1 (Thermodynamics + Machine Learning)
Abstract: Understanding the dynamics of a system is important in many scientific and engineering domains. This problem can be approached by learning state transition rules from observations using machine learning techniques. Such observed time-series data often consist of sequences of many continuous variables with noise and ambiguity, but we often need rules of dynamics that can be modeled with a few essential variables. In this work, we propose a method for extracting a small number of essential hidden variables from high-dimensional time-series data and for learning state transition rules between these hidden variables. The proposed method is based on the Restricted Boltzmann Machine (RBM), which treats observable data in the visible layer and latent features in the hidden layer.
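The relationship between the visible and hidden layers can be sketched with the textbook binary RBM, in which each hidden unit is active with probability sigmoid(c_j + sum_i v_i * W_ij). This is an assumption for illustration: the proposed method may use a different variant, and the weights, biases, and input below are hypothetical values.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible: List[float],
                         weights: List[List[float]],
                         hidden_bias: List[float]) -> List[float]:
    """Return p(h_j = 1 | v) for every hidden unit of a binary RBM.

    weights[i][j] connects visible unit i to hidden unit j.
    """
    return [
        sigmoid(hidden_bias[j]
                + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]


# Hypothetical model: 3 visible units, 2 hidden units.
W = [[0.5, -0.5],
     [1.0, 1.0],
     [-0.5, 0.5]]
c = [0.0, 0.0]
v = [1.0, 1.0, 0.0]

probs = hidden_probabilities(v, W, c)
print(probs)
```

Here the three observed values are summarised by two hidden activation probabilities, which is the sense in which the hidden layer acts as a compact set of latent features.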