AITopics | Directed Networks

Collaborating Authors

Directed Networks

News Overviews Instructional Materials AI-Alerts Classics

Markov Boundary Discovery with Ridge Regularized Linear Models

arXiv.org Machine LearningSep-13-2015

Ridge regularized linear models (RRLMs), such as ridge regression and the SVM, are a popular group of methods that are used in conjunction with coefficient hypothesis testing to discover explanatory variables with a significant multivariate association to a response. However, many investigators are reluctant to draw causal interpretations of the selected variables due to the incomplete knowledge of the capabilities of RRLMs in causal inference. Under reasonable assumptions, we show that a modified form of RRLMs can get very close to identifying a subset of the Markov boundary by providing a worst-case bound on the space of possible solutions. The results hold for any convex loss, even when the underlying functional relationship is nonlinear, and the solution is not unique. Our approach combines ideas in Markov boundary and sufficient dimension reduction theory. Experimental results show that the modified RRLMs are competitive against state-of-the-art algorithms in discovering part of the Markov boundary from gene expression data.

artificial intelligence, machine learning, markov boundary, (19 more...)

arXiv.org Machine Learning

1509.03935

Country: North America > United States > California (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
(2 more...)

Add feedback

Adaptive Low-Complexity Sequential Inference for Dirichlet Process Mixture Models

Tsiligkaridis, Theodoros, Forsythe, Keith W.

arXiv.org Machine LearningSep-11-2015

We develop a sequential low-complexity inference procedure for Dirichlet process mixtures of Gaussians for online clustering and parameter estimation when the number of clusters are unknown a-priori. We present an easily computable, closed form parametric expression for the conditional likelihood, in which hyperparameters are recursively updated as a function of the streaming data assuming conjugate priors. Motivated by large-sample asymptotics, we propose a novel adaptive low-complexity design for the Dirichlet process concentration parameter and show that the number of classes grow at most at a logarithmic rate. We further prove that in the large-sample limit, the conditional likelihood and data predictive distribution become asymptotically Gaussian. We demonstrate through experiments on synthetic and real data sets that our approach is superior to other online state-of-the-art methods.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Machine Learning

1409.8185

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.84)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Newton-based maximum likelihood estimation in nonlinear state space models

Kok, Manon, Dahlin, Johan, Schön, Thomas B., Wills, Adrian

arXiv.org Machine LearningSep-11-2015

Maximum likelihood (ML) estimation using Newton's method in nonlinear state space models (SSMs) is a challenging problem due to the analytical intractability of the log-likelihood and its gradient and Hessian. We estimate the gradient and Hessian using Fisher's identity in combination with a smoothing algorithm. We explore two approximations of the log-likelihood and of the solution of the smoothing problem. The first is a linearization approximation which is computationally cheap, but the accuracy typically varies between models. The second is a sampling approximation which is asymptotically valid for any SSM but is more computationally costly. We demonstrate our approach for ML parameter estimation on simulated data from two different SSMs with encouraging results. Keywords: Maximum likelihood, parameter estimation, nonlinear state space models, Fisher's identity, extended Kalman filters, particle methods, Newton optimization.

approximation, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1016/j.ifacol.2015.12.160

1502.03655

Country: Europe > Sweden (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Variational Bayesian State-Space Approach to Online Passive-Aggressive Regression

Salas, Arnold, Roberts, Stephen J., Osborne, Michael A.

arXiv.org Machine LearningSep-8-2015

Online Passive-Aggressive (PA) learning is a class of online margin-based algorithms suitable for a wide range of real-time prediction tasks, including classification and regression. PA algorithms are formulated in terms of deterministic point-estimation problems governed by a set of user-defined hyperparameters: the approach fails to capture model/prediction uncertainty and makes their performance highly sensitive to hyperparameter configurations. In this paper, we introduce a novel PA learning framework for regression that overcomes the above limitations. We contribute a Bayesian state-space interpretation of PA regression, along with a novel online variational inference scheme, that not only produces probabilistic predictions, but also offers the benefit of automatic hyperparameter tuning. Experiments with various real-world data sets show that our approach performs significantly better than a more standard, linear Gaussian state-space model.

artificial intelligence, machine learning, regression, (14 more...)

arXiv.org Machine Learning

1509.02438

Country:

Europe (0.28)
North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Supervised Collective Classification for Crowdsourcing

Chen, Pin-Yu, Lien, Chia-Wei, Chu, Fu-Jen, Ting, Pai-Shun, Cheng, Shin-Ming

arXiv.org Machine LearningSep-7-2015

Crowdsourcing utilizes the wisdom of crowds for collective classification via information (e.g., labels of an item) provided by labelers. Current crowdsourcing algorithms are mainly unsupervised methods that are unaware of the quality of crowdsourced data. In this paper, we propose a supervised collective classification algorithm that aims to identify reliable labelers from the training data (e.g., items with known labels). The reliability (i.e., weighting factor) of each labeler is determined via a saddle point algorithm. The results on several crowdsourced data show that supervised methods can achieve better classification accuracy than unsupervised methods, and our proposed method outperforms other algorithms.

artificial intelligence, labeler, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/GLOCOMW.2015.7414077

1507.06682

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry: Education (0.93)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Simultaneous Clustering and Model Selection for Multinomial Distribution: A Comparative Study

Hasnat, Md. Abul, Velcin, Julien, Bonnevay, Stéphane, Jacques, Julien

arXiv.org Machine LearningSep-6-2015

In this paper, we study different discrete data clustering methods, which use the Model-Based Clustering (MBC) framework with the Multinomial distribution. Our study comprises several relevant issues, such as initialization, model estimation and model selection. Additionally, we propose a novel MBC method by efficiently combining the partitional and hierarchical clustering techniques. We conduct experiments on both synthetic and real data and evaluate the methods using accuracy, stability and computation time. Our study identifies appropriate strategies to be used for discrete data analysis with the MBC methods. Moreover, our proposed method is very competitive w.r.t.

artificial intelligence, dataset, machine learning, (12 more...)

arXiv.org Machine Learning

1505.02324

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Stochastic gradient variational Bayes for gamma approximating distributions

Knowles, David A.

arXiv.org Machine LearningSep-4-2015

While stochastic variational inference is relatively well known for scaling inference in Bayesian probabilistic models, related methods also offer ways to circumnavigate the approximation of analytically intractable expectations. The key challenge in either setting is controlling the variance of gradient estimates: recent work has shown that for continuous latent variables, particularly multivariate Gaussians, this can be achieved by using the gradient of the log posterior. In this paper we apply the same idea to gamma distributed latent variables given gamma variational distributions, enabling straightforward "black box" variational inference in models where sparsity and non-negativity are appropriate. We demonstrate the method on a recently proposed gamma process model for network data, as well as a novel sparse factor analysis. We outperform generic sampling algorithms and the approach of using Gaussian variational distributions on transformed variables.

artificial intelligence, inference, machine learning, (17 more...)

arXiv.org Machine Learning

1509.01631

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.46)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)

Add feedback

Reliable ABC model choice via random forests

Pudlo, Pierre, Marin, Jean-Michel, Estoup, Arnaud, Cornuet, Jean-Marie, Gautier, Mathieu, Robert, Christian P.

arXiv.org Machine LearningSep-2-2015

Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. We propose a novel approach based on a machine learning tool named random forests to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with random forests and postponing the approximation of the posterior probability of the predicted MAP for a second stage also relying on random forests. Compared with earlier implementations of ABC model choice, the ABC random forest approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least fifty), and (iv) it includes an approximation of the posterior probability of the selected model. The call to random forests will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. The proposed methodologies are implemented in the R package abcrf available on the CRAN.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1406.6288

Country:

North America > United States (0.28)
Europe > France (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Add feedback

Fast rates in statistical and online learning

van Erven, Tim, Grünwald, Peter D., Mehta, Nishant A., Reid, Mark D., Williamson, Robert C.

arXiv.org Machine LearningSep-1-2015

The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning --- a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most of these conditions are special cases of a single, unifying condition, that comes in two forms: the central condition for 'proper' learning algorithms that always output a hypothesis in the given model, and stochastic mixability for online algorithms that may make predictions outside of the model. We show that under surprisingly weak assumptions both conditions are, in a certain sense, equivalent. The central condition has a re-interpretation in terms of convexity of a set of pseudoprobabilities, linking it to density estimation under misspecification. For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning. Yet, while the Bernstein condition is two-sided, the central condition is one-sided, making it more suitable to deal with unbounded losses. In its stochastic mixability form, our condition generalizes both a stochastic exp-concavity condition identified by Juditsky, Rigollet and Tsybakov and Vovk's notion of mixability. Our unifying conditions thus provide a substantial step towards a characterization of fast rates in statistical learning, similar to how classical mixability characterizes constant regret in the sequential prediction with expert advice setting.

central condition, fast rate, mixability, (14 more...)

arXiv.org Machine Learning

1507.02592

Country:

North America > United States > New York (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
(7 more...)

Genre: Research Report (0.63)

Industry: Education > Educational Setting > Online (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A fast numerical method for max-convolution and the application to efficient max-product inference in Bayesian networks

Serang, Oliver

arXiv.org Machine LearningSep-1-2015

In many fields it is common to have access to information about sums of random variables and to desire information about those variables themselves. In mass spectrometry, when two (or more) analytes with similar mass-to-charge 1 are measured, the intensity of the resulting peak is a function of the sum of abundances of those analytes (this problem occurs not only in the mass spectrometry of small molecules, but also in measuring isotope measurement in elemental and nuclear mass spectrometry). In transcriptomics, the abundance of a particular non-unique read (i.e., an RNA sequence that maps to multiple locations in the transcriptome or genome) provides information about the sum of the abundances of all transcripts that contain the read (each transcript weighted by how many copies of the read it carries). Proteomics has its own version of non-unique reads, shared peptides which can be found in multiple proteins (not only are shared peptides the principal source of difficulty in protein inference [14, 17, 18], they are also responsible for the difficulty evaluating putatative sets of discovered proteins [16, 19]). In population genetics, the prior knowledge about population structure can suggest an expected number of individuals with a particular genotype, which in turn yields probabilistic information about the individuals whose aggregate genotypes are expected to produce that sum (inference is particularly pronounced in polyploids, which increase the dimensionality of the problem [15]).

artificial intelligence, bioinformatics, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1089/cmb.2015.0013

1501.02627

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Add feedback