 Uncertainty


Knowledge Base of an Expert System Used for Dyslalic Children Therapy

arXiv.org Artificial Intelligence

In order to improve speech therapy for children, we developed a fuzzy expert system based on a speech therapy guide. This guide, written in natural language, was formalized using the fuzzy logic paradigm. In this manner we obtained a knowledge base with over 150 rules and 19 linguistic variables. All of this research, including the expert system validation, is part of the TERAPERS project (financed by the National Agency for Scientific Research, Romania). I. INTRODUCTION The main objectives of the speech therapy expert system developed by our team are [1]: personalized therapy (the therapy must be in accordance with the child's problem level, context and possibilities); speech therapist assistant (the expert system offers suggestions regarding which exercises are better at a specific moment and for a specific child); and (self) teaching (when the system's conclusion differs from the speech therapist's conclusion, the therapist must be able to change the knowledge base).
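As a rough illustration of what a rule over linguistic variables can look like, the sketch below evaluates a few fuzzy rules with triangular membership functions and a min operator for AND. The variable names (`error_rate`, `attention`), the term boundaries and the three rules are all hypothetical; none of them comes from the TERAPERS knowledge base.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms on a 0..100 scale. The padding beyond 0 and
# 100 gives the scale endpoints full membership in "low" and "high".
TERMS = {
    "low": (-1.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 101.0),
}

# Hypothetical rules: (antecedents joined by AND, recommended exercise type).
RULES = [
    ({"error_rate": "high", "attention": "low"}, "short_easy_exercise"),
    ({"error_rate": "medium"}, "standard_exercise"),
    ({"error_rate": "low", "attention": "high"}, "advanced_exercise"),
]


def evaluate(inputs):
    """Return the firing strength of each recommendation (min for AND)."""
    strengths = {}
    for antecedents, conclusion in RULES:
        degrees = [triangular(inputs[var], *TERMS[term])
                   for var, term in antecedents.items()]
        firing = min(degrees)
        # If several rules share a conclusion, keep the strongest one.
        strengths[conclusion] = max(strengths.get(conclusion, 0.0), firing)
    return strengths


result = evaluate({"error_rate": 75.0, "attention": 25.0})
```

A therapist-facing system would then rank or defuzzify these strengths; that step is omitted here.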


Architecture of a Fuzzy Expert System Used for Dyslalic Children Therapy

arXiv.org Artificial Intelligence

In this paper we present the architecture of a fuzzy expert system used for the therapy of dyslalic children. With a fuzzy approach we can create a better model of speech therapist decisions. A software interface was developed for validation of the system. The main objectives of this task are: personalized therapy (the therapy must be in accordance with the child's problem level, context and possibilities); speech therapist assistant (the expert system offers suggestions regarding which exercises are better at a specific moment and for a specific child); and (self) teaching (when the system's conclusion differs from the speech therapist's conclusion, the therapist must be able to change the knowledge base). Keywords: fuzzy expert systems, speech therapy 1. Introduction In this article we refer to the LOGOMON system developed in the TERAPERS project by the authors.


Supervised Dictionary Learning by a Variational Bayesian Group Sparse Nonnegative Matrix Factorization

arXiv.org Machine Learning

Since the appearance of the seminal paper [1], NMF has become a popular data decomposition technique due to successful applications in a still growing number of fields where data are nonnegative, such as pixel intensities in computer vision, amplitude spectra in audio signal analysis and EEG signal analysis, term counts in document clustering problems, and item ratings in collaborative filtering. NMF aims at decomposing a nonnegative data matrix into the product of a nonnegative dictionary matrix and a nonnegative coefficient matrix. Throughout this paper the data matrix will be regarded as a collection of data samples organized columnwise, the dictionary as a collection of features organized columnwise, and the coefficient matrix as the coefficients obtained when the data are projected onto the dictionary. Under assumptions of linearity and nonnegativity, when the underlying dimensionality is lower than the dimensionality of the original data space, dimensionality reduction of the data can effectively be achieved this way. Although the decomposition is nonunique in general, NMF is able to produce strictly additive decompositions perceived as part-based by adding additional bias to the model [1], [2]. To this end, different sparsity promoting regularizers have been proposed for divergence-based NMF [3]. Also, to include higher order data descriptions, many other variants have been developed, e.g.
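For orientation, here is a minimal sketch of plain NMF using the classical multiplicative updates for the squared-error objective. It is not the variational Bayesian group sparse method of the paper. The symbols `x` (data), `d` (dictionary) and `h` (coefficients), the rank, and the toy matrix are assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(x, d, h):
    """Squared Frobenius distance between x and the product d h."""
    approx = matmul(d, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))


def scale(m, num, den, eps):
    """Elementwise m * num / (den + eps); keeps entries nonnegative."""
    return [[mv * nv / (dv + eps) for mv, nv, dv in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, num, den)]


def nmf(x, rank, iterations=200, eps=1e-9, seed=0):
    """Approximate x by d h with nonnegative d (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    d = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(len(x))]
    h = [[rng.random() + 0.1 for _ in range(len(x[0]))] for _ in range(rank)]
    for _ in range(iterations):
        dt = transpose(d)
        # h <- h * (d^T x) / (d^T d h)
        h = scale(h, matmul(dt, x), matmul(matmul(dt, d), h), eps)
        ht = transpose(h)
        # d <- d * (x h^T) / (d h h^T)
        d = scale(d, matmul(x, ht), matmul(d, matmul(h, ht)), eps)
    return d, h


# Toy nonnegative data with samples in columns (illustrative values only).
x = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
d, h = nmf(x, rank=2)
```

Because each update multiplies by a ratio of nonnegative quantities, `d` and `h` stay nonnegative without any explicit projection step.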


Bayesian Inference for Gaussian Process Classifiers with Annealing and Pseudo-Marginal MCMC

arXiv.org Machine Learning

Kernel methods have revolutionized the fields of pattern recognition and machine learning. Their success, however, critically depends on the choice of kernel parameters. Using Gaussian process (GP) classification as a working example, this paper focuses on Bayesian inference of covariance (kernel) parameters using Markov chain Monte Carlo (MCMC) methods. The motivation is that, compared to standard optimization of kernel parameters, MCMC methods have been systematically demonstrated to be superior in quantifying uncertainty in predictions. Recently, the Pseudo-Marginal MCMC approach has been proposed as a practical inference tool for GP models. In particular, it amounts to replacing the analytically intractable marginal likelihood by an unbiased estimate obtainable by approximate methods and importance sampling. After discussing the potential drawbacks of employing importance sampling, this paper proposes the application of annealed importance sampling. The results empirically demonstrate that, compared to importance sampling, annealed importance sampling can reduce the variance of the estimate of the marginal likelihood exponentially in the number of data points at a computational cost that scales only polynomially. The results on real data demonstrate that employing annealed importance sampling in the Pseudo-Marginal MCMC approach represents a step forward in the development of fully automated exact inference engines for GP models.
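To make the idea of an unbiased marginal likelihood estimate concrete, the sketch below applies plain (not annealed) importance sampling to a toy conjugate model in which the exact answer is known. The model (a standard normal prior on a scalar `theta` and a unit-variance normal likelihood), the proposal (the prior itself) and the sample size are assumptions chosen for illustration; they are not the GP classifier of the paper.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def is_marginal_likelihood(y, n_samples=20000, seed=0):
    """Estimate p(y) = integral of p(y | theta) p(theta) by importance sampling.

    The proposal equals the prior N(0, 1), so each importance weight
    reduces to the likelihood p(y | theta).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(0.0, 1.0)          # draw from the proposal
        total += normal_pdf(y, theta, 1.0)   # importance weight
    return total / n_samples


y_obs = 1.0
estimate = is_marginal_likelihood(y_obs)
# In this toy model the marginal is available in closed form: y ~ N(0, 2).
exact = normal_pdf(y_obs, 0.0, 2.0)
```

In a pseudo-marginal sampler, an estimate of this kind would stand in for the exact marginal likelihood inside the Metropolis-Hastings acceptance ratio.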


Scalable Recommendation with Poisson Factorization

arXiv.org Artificial Intelligence

We develop a Bayesian Poisson matrix factorization model for forming recommendations from sparse user behavior data. These data are large user/item matrices where each user has provided feedback on only a small subset of items, either explicitly (e.g., through star ratings) or implicitly (e.g., through views or purchases). In contrast to traditional matrix factorization approaches, Poisson factorization implicitly models each user's limited attention to consume items. Moreover, because of the mathematical form of the Poisson likelihood, the model needs only to explicitly consider the observed entries in the matrix, leading to both scalable computation and good predictive performance. We develop a variational inference algorithm for approximate posterior inference that scales up to massive data sets. This is an efficient algorithm that iterates over the observed entries and adjusts an approximate posterior over the user/item representations. We apply our method to large real-world user data containing users rating movies, users listening to songs, and users reading scientific papers. In all these settings, Bayesian Poisson factorization outperforms state-of-the-art matrix factorization methods.
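The claim that only observed entries need explicit treatment can be checked numerically. Under a Poisson likelihood with rate equal to the inner product of a user vector and an item vector, a zero entry contributes only the negative rate, and the sum of all rates factorizes into one inner product of column sums. The sketch below compares a dense evaluation with this sparse shortcut. The names `theta` and `beta`, the toy factor values and the toy counts are assumptions for illustration.

```python
import math


def rate(theta_u, beta_i):
    """Poisson rate for one user/item pair: inner product of the factors."""
    return sum(t * b for t, b in zip(theta_u, beta_i))


def dense_loglik(observed, theta, beta):
    """Poisson log-likelihood summed over every cell of the matrix."""
    total = 0.0
    for u, theta_u in enumerate(theta):
        for i, beta_i in enumerate(beta):
            y = observed.get((u, i), 0)
            lam = rate(theta_u, beta_i)
            total += y * math.log(lam) - lam - math.lgamma(y + 1)
    return total


def sparse_loglik(observed, theta, beta):
    """Same value, touching only the nonzero cells plus two column sums."""
    total = 0.0
    for (u, i), y in observed.items():
        total += y * math.log(rate(theta[u], beta[i])) - math.lgamma(y + 1)
    k = len(theta[0])
    theta_sum = [sum(t[j] for t in theta) for j in range(k)]
    beta_sum = [sum(b[j] for b in beta) for j in range(k)]
    # Sum of all rates equals (sum of user vectors) . (sum of item vectors).
    return total - rate(theta_sum, beta_sum)


# Toy factors (3 users, 4 items, 2 components) and nonzero counts only.
theta = [[0.5, 1.0], [1.5, 0.2], [0.3, 0.3]]
beta = [[1.0, 0.1], [0.2, 0.8], [0.6, 0.6], [0.05, 1.2]]
observed = {(0, 0): 2, (1, 2): 1, (2, 3): 4}

dense = dense_loglik(observed, theta, beta)
sparse = sparse_loglik(observed, theta, beta)
```

The sparse version costs time proportional to the number of nonzero entries plus the number of users and items, rather than their product.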


Modelling Data Dispersion Degree in Automatic Robust Estimation for Multivariate Gaussian Mixture Models with an Application to Noisy Speech Processing

arXiv.org Machine Learning

The trimming scheme with a prefixed cutoff portion is known as a method of improving the robustness of statistical models such as multivariate Gaussian mixture models (MGMMs) in small scale tests by alleviating the impacts of outliers. However, when this method is applied to real-world data, such as noisy speech processing, it is hard to know the optimal cutoff portion for removing the outliers, and the scheme sometimes removes useful data samples as well. In this paper, we propose a new method based on measuring the dispersion degree (DD) of the training data to avoid this problem, so as to realise automatic robust estimation for MGMMs. The DD model is studied by using two different measures. For each one, we theoretically prove that the DD of the data samples in a context of MGMMs approximately obeys a specific (chi or chi-square) distribution. The proposed method is evaluated on a real-world application with a moderately-sized speaker recognition task. Experiments show that the proposed method can significantly improve the robustness of the conventional training method of GMMs for speaker recognition.
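The abstract does not define its two dispersion measures, so the sketch below only illustrates a related standard fact: for a d-dimensional Gaussian, the squared Mahalanobis distance of a sample follows a chi-square distribution with d degrees of freedom, which yields a cutoff from the distribution itself instead of a fixed trimming portion. The restriction to two dimensions with a diagonal covariance (where the chi-square quantile has the closed form -2 ln(1 - p)), the function names and the sample points are assumptions for illustration.

```python
import math


def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)


def squared_mahalanobis(point, mean, variances):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((x - m) ** 2 / v for x, m, v in zip(point, mean, variances))


def trim_outliers(points, mean, variances, coverage=0.99):
    """Keep points whose squared distance lies within the chi-square cutoff."""
    cutoff = chi2_quantile_2df(coverage)
    return [p for p in points
            if squared_mahalanobis(p, mean, variances) <= cutoff]


# Two plausible samples and one gross outlier for a single 2-D component.
points = [(0.1, -0.2), (1.0, 0.5), (8.0, 8.0)]
kept = trim_outliers(points, mean=(0.0, 0.0), variances=(1.0, 1.0))
```

For a mixture, a test of this kind would be applied per component, using the component each sample is assigned to.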


Bayesian estimation of possible causal direction in the presence of latent confounders using a linear non-Gaussian acyclic structural equation model with individual-specific effects

arXiv.org Machine Learning

We consider learning the possible causal direction of two observed variables in the presence of latent confounding variables. Several existing methods have been shown to consistently estimate causal direction assuming linear or some type of nonlinear relationship and no latent confounders. However, the estimation results could be distorted if either assumption is actually violated. In this paper, we first propose a new linear non-Gaussian acyclic structural equation model with individual-specific effects that allows latent confounders to be considered. We then propose an empirical Bayesian approach for estimating possible causal direction using the new model. We demonstrate the effectiveness of our method using artificial and real-world data.


Effective Bayesian Modeling of Groups of Related Count Time Series

arXiv.org Machine Learning

Time series of counts arise in a variety of forecasting applications, for which traditional models are generally inappropriate. This paper introduces a hierarchical Bayesian formulation applicable to count time series that can easily account for explanatory variables and share statistical strength across groups of related time series. We derive an efficient approximate inference technique, and illustrate its performance on a number of datasets from supply chain planning.


Topic words analysis based on LDA model

arXiv.org Machine Learning

Social network analysis (SNA), a research field that describes and models the social connections of a given group of people, is popular among network services. Our topic words analysis project is an SNA method for visualizing the topic words in emails sent from Obama.com to accounts registered in Columbus, Ohio. Based on the Latent Dirichlet Allocation (LDA) model, a popular topic model in SNA, our project characterizes the preferences of senders for a target group of recipients. Gibbs sampling is used to estimate the topic and word distributions. Our training and testing data are emails from the carbon-free server Datagreening.com. We use the parallel computing tool BashReduce for word processing and generate related words under each latent topic to discover typical information in political news sent specifically to local Columbus recipients. Running on two instances with BashReduce, our project achieves an almost 30% speedup in processing the raw contents, compared with processing them locally on one instance. Also, the experimental results show that the LDA model applied in our project provides a precision rate 53.96% higher than the TF-IDF model in finding target words, provided that an appropriate size for the topic words list is selected.


Credal Model Averaging for classification: representing prior ignorance and expert opinions

arXiv.org Machine Learning

Bayesian model averaging (BMA) is the state-of-the-art approach for overcoming model uncertainty. Yet, especially on small data sets, the results yielded by BMA might be sensitive to the prior over the models. Credal Model Averaging (CMA) addresses this problem by replacing the single prior over the models with a set of priors (credal set). Such an approach solves the problem of how to choose the prior over the models and automates sensitivity analysis. We discuss various CMA algorithms for building an ensemble of logistic regressors characterized by different sets of covariates. We show how CMA can be appropriately tuned to the case in which one is prior-ignorant and to the case in which domain knowledge is instead available. CMA detects prior-dependent instances, namely instances in which a different class is more probable depending on the prior over the models. On such instances CMA suspends judgment, returning multiple classes. We thoroughly compare different BMA and CMA variants on a real case study, predicting the presence of Alpine marmot burrows in an Alpine valley. We find that BMA is almost a random guesser on the instances recognized as prior-dependent by CMA.
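A much reduced sketch of how a prior-dependent instance can be detected: with only two candidate models and a credal set given by an interval for the prior probability of the first model, the BMA class probability is monotone in that prior, so its lower and upper values occur at the interval endpoints. If they fall on opposite sides of 0.5, both classes are returned. The two-model setting, the interval form of the credal set and all numbers are assumptions for illustration; the paper's algorithms for ensembles of logistic regressors are more general.

```python
def bma_probability(prior_m1, marg_lik, class1_prob):
    """BMA probability of class 1 for a given prior on model 1.

    marg_lik: marginal likelihoods of the two models.
    class1_prob: probability of class 1 predicted by each model.
    """
    w1 = prior_m1 * marg_lik[0]
    w2 = (1.0 - prior_m1) * marg_lik[1]
    return (w1 * class1_prob[0] + w2 * class1_prob[1]) / (w1 + w2)


def cma_classify(prior_interval, marg_lik, class1_prob):
    """Return the set of classes that are optimal under some prior in the set."""
    ends = [bma_probability(p, marg_lik, class1_prob) for p in prior_interval]
    lower, upper = min(ends), max(ends)
    if lower > 0.5:
        return {1}
    if upper < 0.5:
        return {0}
    return {0, 1}  # prior-dependent instance: judgment is suspended


# The models disagree strongly: the answer depends on the prior.
undecided = cma_classify((0.1, 0.9), (1.0, 1.0), (0.9, 0.2))
# The models agree: the answer is the same for every prior in the set.
decided = cma_classify((0.1, 0.9), (1.0, 1.0), (0.9, 0.7))
```

Plain BMA corresponds to collapsing the interval to a single point, in which case at most a tie at exactly 0.5 could produce two classes.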