Regression
Experiments Using Belief Functions and Weights of Evidence incorporating Statistical Data and Expert Opinions
McLeish, Mary, Yao, P., Cecile, M., Stirtzinger, T.
This paper presents some ideas and results on using uncertainty management methods in the presence of data, in preference to other statistical and machine learning methods. A medical domain is used as a test-bed, with data available from a large hospital database system that collects symptom and outcome information about patients. Data are often missing and of many variable types, and sample sizes for particular outcomes are not large. Uncertainty management methods are useful for such domains and have the added advantage of allowing for expert modification of belief values originally obtained from data. Methodological considerations for using belief functions on statistical data are dealt with in some detail. Expert opinions are incorporated at various levels of the project development, and results are reported on an application to liver disease diagnosis. Recent results contrasting the use of weights of evidence and logistic regression on another medical domain are also presented.
Comparing Expert Systems Built Using Different Uncertain Inference Systems
Vaughan, David S., Perrin, Bruce M., Yadrick, Robert M.
This study compares the inherent intuitiveness or usability of the most prominent methods for managing uncertainty in expert systems, including those of EMYCIN, PROSPECTOR, Dempster-Shafer theory, fuzzy set theory, simplified probability theory (assuming marginal independence), and linear regression using probability estimates. Participants in the study gained experience in a simple, hypothetical problem domain through a series of learning trials. They were then randomly assigned to develop an expert system using one of the six Uncertain Inference Systems (UISs) listed above. Performance of the resulting systems was then compared. The results indicate that the systems based on the PROSPECTOR and EMYCIN models were significantly less accurate for certain types of problems compared to systems based on the other UISs. Possible reasons for these differences are discussed.
Regression for sets of polynomial equations
Király, Franz Johannes, von Bünau, Paul, Müller, Jan Saputra, Blythe, Duncan, Meinecke, Frank, Müller, Klaus-Robert
We propose a method called ideal regression for approximating an arbitrary system of polynomial equations by a system of a particular type. Using techniques from approximate computational algebraic geometry, we show how we can solve ideal regression directly without resorting to numerical optimization. Ideal regression is useful whenever the solution to a learning problem can be described by a system of polynomial equations. As an example, we demonstrate how to formulate Stationary Subspace Analysis (SSA), a source separation problem, in terms of ideal regression, which also yields a consistent estimator for SSA. We then compare this estimator in simulations with previous optimization-based approaches for SSA.
Bayesian Compressed Regression
Guhaniyogi, Rajarshi, Dunson, David B.
As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimensionality reduction approaches, the exact posterior distribution conditional on the compressed data is available analytically, speeding up computation by many orders of magnitude while also bypassing robustness issues due to convergence and mixing problems with MCMC. Model averaging is used to reduce sensitivity to the random projection matrix, while accommodating uncertainty in the subspace dimension. Strong theoretical support is provided for the approach by showing near parametric convergence rates for the predictive density in the large p small n asymptotic paradigm. Practical performance relative to competitors is illustrated in simulations and real data applications.
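The compress-then-fit idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes Gaussian random projection entries, a known noise variance `sigma2`, and a zero-mean Gaussian prior with variance `tau2` on the compressed coefficients, none of which are specified in the abstract. The helper names `compress`, `solve`, and `posterior_mean` are hypothetical.

```python
import random

def compress(X, Phi):
    """Project each p-dimensional row x of X to m dimensions: z = Phi x."""
    return [[sum(a * b for a, b in zip(phi_row, x)) for phi_row in Phi] for x in X]

def solve(A, b):
    """Solve A v = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def posterior_mean(Z, y, sigma2=1.0, tau2=10.0):
    """Closed-form posterior mean of beta for y ~ N(Z beta, sigma2 I),
    beta ~ N(0, tau2 I) (assumed prior; no MCMC is needed)."""
    m = len(Z[0])
    # Posterior precision: Z^T Z / sigma2 + I / tau2
    A = [[sum(z[i] * z[j] for z in Z) / sigma2 + (1.0 / tau2 if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    # Right-hand side: Z^T y / sigma2
    b = [sum(z[i] * y_i for z, y_i in zip(Z, y)) / sigma2 for i in range(m)]
    return solve(A, b)

# Synthetic demo: n = 20 observations, p = 50 predictors, compressed to m = 3.
random.seed(0)
n, p, m = 20, 50, 3
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [x[0] - 2 * x[1] + random.gauss(0, 0.1) for x in X]
Phi = [[random.gauss(0, 1) for _ in range(p)] for _ in range(m)]  # random projection
Z = compress(X, Phi)
beta = posterior_mean(Z, y)

# Predict for a new point by compressing it with the same Phi.
x_new = [random.gauss(0, 1) for _ in range(p)]
z_new = compress([x_new], Phi)[0]
prediction = sum(b_j * z_j for b_j, z_j in zip(beta, z_new))
```

Only the m-by-m system is ever solved, which is where the storage and computation savings come from. The model averaging over several projection matrices mentioned in the abstract is not shown here.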
Better subset regression
To find efficient screening methods for high dimensional linear regression models, this paper studies the relationship between model fitting and screening performance. Under a sparsity assumption, we show that a subset that includes the true submodel always yields a smaller residual sum of squares (i.e., has better model fitting) than any subset that does not, in a general asymptotic setting. This indicates that, for screening important variables, we could follow a "better fitting, better screening" rule, i.e., pick a "better" subset that has better model fitting. To seek such a better subset, we consider the optimization problem associated with best subset regression. An EM algorithm, called orthogonalizing subset screening, and its accelerated version are proposed for searching for the best subset. Although the two algorithms cannot guarantee that a subset they yield is the best, their monotonicity property ensures that the subset has better model fitting than initial subsets generated by popular screening methods, and thus the subset can have better screening performance asymptotically. Simulation results show that our methods are very competitive in high dimensional variable screening even for finite sample sizes.
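The "better fitting, better screening" rule can be illustrated with a toy comparison of residual sums of squares. This sketch is only an illustration under strong assumptions: it compares single-predictor, no-intercept least-squares fits on made-up data, whereas the paper's result is asymptotic and concerns general subsets. The function name `rss_single` and the data are hypothetical, and the paper's orthogonalizing subset screening algorithm is not reproduced.

```python
def rss_single(x, y):
    """Residual sum of squares of the no-intercept least-squares fit y ~ beta * x."""
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return sum((b - beta * a) ** 2 for a, b in zip(x, y))

# Made-up illustrative data: y is roughly 2 * x_true and unrelated to x_noise.
x_true = [1.0, 2.0, 3.0, 4.0]
x_noise = [1.0, -1.0, 1.0, -1.0]
y = [2.1, 3.9, 6.2, 7.8]

# Rank candidate one-variable "subsets" by model fit and keep the best.
candidates = {"x_true": x_true, "x_noise": x_noise}
best = min(candidates, key=lambda name: rss_single(candidates[name], y))
```

Here the candidate containing the true predictor has the smaller residual sum of squares, so a fit-based rule selects it.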
Variational Inference in Nonconjugate Models
Mean-field variational methods are widely used for approximate posterior inference in many probabilistic models. In a typical application, mean-field methods approximately compute the posterior with a coordinate-ascent optimization algorithm. When the model is conditionally conjugate, the coordinate updates are easily derived and in closed form. However, many models of interest, like the correlated topic model and Bayesian logistic regression, are nonconjugate. In these models, mean-field methods cannot be directly applied and practitioners have had to develop variational algorithms on a case-by-case basis. In this paper, we develop two generic methods for nonconjugate models, Laplace variational inference and delta method variational inference. Our methods have several advantages: they allow for easily derived variational algorithms with a wide class of nonconjugate models; they extend and unify some of the existing algorithms that have been derived for specific models; and they work well on real-world datasets. We studied our methods on the correlated topic model, Bayesian logistic regression, and hierarchical Bayesian logistic regression.
K-Nearest Neighbour algorithm coupled with logistic regression in medical case-based reasoning systems. Application to prediction of access to the renal transplant waiting list in Brittany
Campillo-Gimenez, Boris, Jouini, Wassim, Bayat, Sahar, Cuggia, Marc
Introduction. Case Based Reasoning (CBR) is an emerging decision making paradigm in medical research where new cases are solved relying on previously solved similar cases. Usually, a database of solved cases is provided, and every case is described through a set of attributes (inputs) and a label (output). Extracting useful information from this database can help the CBR system provide more reliable results on the yet to be solved cases. Objective. For that purpose we suggest a general framework where a CBR system, viz. the K-Nearest Neighbor (K-NN) algorithm, is combined with various information obtained from a Logistic Regression (LR) model. Methods. LR is applied to the case database to assign weights to the attributes as well as to the solved cases. Thus, five possible decision making systems based on K-NN and/or LR were identified: a standalone K-NN, a standalone LR, and three soft K-NN algorithms that rely on weights derived from the results of the LR. The described approaches are evaluated on access to the renal transplant waiting list. Results and conclusion. The results show that our suggested approach, where the K-NN algorithm relies on both weighted attributes and weighted cases, can efficiently deal with irrelevant attributes, whereas the four other approaches suffer from this kind of noisy setup. The robustness of this approach suggests interesting perspectives for medical problem solving tools using CBR methodology.
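One way to picture the attribute-weighting part of this framework is a K-NN classifier whose distance scales each attribute by a weight. The sketch below is an assumption-laden illustration, not the authors' method: the abstract does not say how weights are derived from the LR model, so the weights are simply passed in (for example, they might be absolute LR coefficients), case weighting is omitted, and the function name `weighted_knn_predict` and the data are hypothetical.

```python
import math
from collections import Counter

def weighted_knn_predict(cases, labels, query, attr_weights, k=3):
    """Majority vote among the k solved cases nearest to `query`,
    using a Euclidean distance with per-attribute weights."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2
                             for w, x, y in zip(attr_weights, a, b)))
    ranked = sorted(range(len(cases)), key=lambda i: dist(cases[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up data: attribute 0 determines the label, attribute 1 is noise.
cases = [(0.0, 0.0), (0.2, 0.0), (1.0, 5.0), (0.9, 5.1)]
labels = [0, 0, 1, 1]
query = (0.1, 5.0)

plain = weighted_knn_predict(cases, labels, query, (1.0, 1.0), k=3)     # noise dominates
weighted = weighted_knn_predict(cases, labels, query, (1.0, 0.0), k=3)  # noise ignored
```

With equal weights the irrelevant attribute dominates the distance and pulls the vote toward label 1; giving it zero weight lets the informative attribute decide, and the vote returns label 0.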
Scalable Matrix-valued Kernel Learning for High-dimensional Nonlinear Multivariate Regression and Granger Causality
Sindhwani, Vikas, Quang, Minh Ha, Lozano, Aurelie C.
We propose a general matrix-valued multiple kernel learning framework for high-dimensional nonlinear multivariate regression problems. This framework allows a broad class of mixed norm regularizers, including those that induce sparsity, to be imposed on a dictionary of vector-valued Reproducing Kernel Hilbert Spaces. We develop a highly scalable and eigendecomposition-free algorithm that orchestrates two inexact solvers for simultaneously learning both the input and output components of separable matrix-valued kernels. As a key application enabled by our framework, we show how high-dimensional causal inference tasks can be naturally cast as sparse function estimation problems, leading to novel nonlinear extensions of a class of Graphical Granger Causality techniques. Our algorithmic developments and extensive empirical studies are complemented by theoretical analyses in terms of Rademacher generalization bounds.
Soft Rule Ensembles for Statistical Learning
Akdemir, Deniz, Heslot, Nicolas
In this article, supervised learning problems are solved using soft rule ensembles. We first review the importance sampling learning ensembles (ISLE) approach, which is useful for generating hard rules. The soft rules are then obtained with logistic regression from the corresponding hard rules. To deal with the perfect separation problem related to logistic regression, Firth's bias-corrected likelihood is used. Various examples and simulation results show that soft rule ensembles can improve predictive performance over hard rule ensembles.
Nonparametric Basis Pursuit via Sparse Kernel-based Learning
Bazerque, Juan Andres, Giannakis, Georgios B.
Signal processing tasks as fundamental as sampling, reconstruction, minimum mean-square error interpolation and prediction can be viewed under the prism of reproducing kernel Hilbert spaces. Endowing this vantage point with contemporary advances in sparsity-aware modeling and processing promotes the nonparametric basis pursuit advocated in this paper as the overarching framework for the confluence of kernel-based learning (KBL) approaches leveraging sparse linear regression, nuclear-norm regularization, and dictionary learning. The novel sparse KBL toolbox goes beyond translating sparse parametric approaches to their nonparametric counterparts, to incorporate new possibilities such as multi-kernel selection and matrix smoothing. The impact of sparse KBL on signal processing applications is illustrated through test cases from cognitive radio sensing, microarray data imputation, and network traffic prediction.