AITopics | Directed Networks

Collaborating Authors

Directed Networks

News Overviews Instructional Materials AI-Alerts Classics

Data Augmentation via Levy Processes

Wager, Stefan, Fithian, William, Liang, Percy

arXiv.org Machine LearningMar-21-2016

If a document is about travel, we may expect that short snippets of the document should also be about travel. We introduce a general framework for incorporating these types of invariances into a discriminative classifier. The framework imagines data as being drawn from a slice of a Lévy process. If we slice the Lévy process at an earlier point in time, we obtain additional pseudo-examples, which can be used to train the classifier. We show that this scheme has two desirable properties: it preserves the Bayes decision boundary, and it is equivalent to fitting a generative model in the limit where we rewind time back to 0. Our construction captures popular schemes such as Gaussian feature noising and dropout training, as well as admitting new generalizations. Black-box discriminative classifiers such as logistic regression, neural networks, and SVMs are the go-to solution in machine learning: they are simple to apply and often perform well. However, an expert may have additional knowledge to exploit, often taking the form of a certain family of transformations that should usually leave labels fixed. For example, in object recognition, an image of a cat rotated, translated, and peppered with a small amount of noise is probably still a cat.

artificial intelligence, machine learning, vy process, (14 more...)

arXiv.org Machine Learning

1603.0634

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.36)
Research Report > Experimental Study (0.36)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data

Hjelm, R. Devon, Plis, Sergey M., Calhoun, Vince C.

arXiv.org Machine LearningMar-21-2016

Independent component analysis (ICA), as an approach to the blind source-separation (BSS) problem, has become the de-facto standard in many medical imaging settings. Despite successes and a large ongoing research effort, the limitation of ICA to square linear transformations have not been overcome, so that general INFOMAX is still far from being realized. As an alternative, we present feature analysis in medical imaging as a problem solved by Helmholtz machines, which include dimensionality reduction and reconstruction of the raw data under the same objective, and which recently have overcome major difficulties in inference and learning with deep and nonlinear configurations. We demonstrate one approach to training Helmholtz machines, variational auto-encoders (VAE), as a viable approach toward feature extraction with magnetic resonance imaging (MRI) data.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1603.06624

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.30)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback

Phase transitions and sample complexity in Bayes-optimal matrix factorization

Kabashima, Yoshiyuki, Krzakala, Florent, Mézard, Marc, Sakata, Ayaka, Zdeborová, Lenka

arXiv.org Machine LearningMar-21-2016

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

doi: 10.1109/TIT.2016.2556702

1402.1298

Country:

Asia (0.92)
Europe (0.67)
North America > United States > California (0.45)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.44)

Add feedback

The Computational Power of Dynamic Bayesian Networks

Brulé, Joshua

arXiv.org Artificial IntelligenceMar-19-2016

Bayesian networks are probabilistic graphical models that represent a set of random variables and their conditional dependencies via a directed acyclic graph. Explicitly modeling the conditional dependencies between random variables permit efficient algorithms to perform inference and learning in the network. Causal Bayesian networks have the additional requirement that all edges in the network model a causal relationship. Dynamic Bayesian networks are the time-generalization of Bayesian networks and relate variables to each other over adjacent time steps. Dynamic Bayesian networks unify and extend a number of state-space models including hidden Markov models, hierarchical hidden Markov models and Kalman filters. Dynamic Bayesian networks can also be seen as the natural extension of acyclic causal models to models that permit cyclic causal relationships, while avoiding problems with causal models that try to model temporal relationships with an atemporal description [1]. A natural question is what is the expressive power of such networks. The result in this paper shows that although discrete dynamic Bayesian networks are sub-Turing in computational power, introducing continuous random variables with discrete children is sufficient to model Turing-complete computation.

artificial intelligence, bayesian network, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1603.06125

Country:

North America > United States > California (0.29)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Probabilistic Machine Learning Approach to Detect Industrial Plant Faults

Xiao, Wei

arXiv.org Machine LearningMar-18-2016

Fault detection in industrial plants is a hot research area as more and more sensor data are being collected throughout the industrial process. Automatic data-driven approaches are widely needed and seen as a promising area of investment. This paper proposes an effective machine learning algorithm to predict industrial plant faults based on classification methods such as penalized logistic regression, random forest and gradient boosted tree. A fault's start time and end time are predicted sequentially in two steps by formulating the original prediction problems as classification problems. The algorithms described in this paper won first place in the Prognostics and Health Management Society 2015 Data Challenge.

artificial intelligence, machine learning, start time, (18 more...)

arXiv.org Machine Learning

1603.0577

Country: North America > United States (1.00)

Genre: Research Report (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

New Optimisation Methods for Machine Learning

Defazio, Aaron

arXiv.org Machine LearningMar-18-2016

A thesis submitted for the degree of Doctor of Philosophy of The Australian National University. In this work we introduce several new optimisation methods for problems in machine learning. Our algorithms broadly fall into two categories: optimisation of finite sums and of graph structured objectives. The finite sum problem is simply the minimisation of objective functions that are naturally expressed as a summation over a large number of terms, where each term has a similar or identical weight. Such objectives most often appear in machine learning in the empirical risk minimisation framework in the non-online learning setting. The second category, that of graph structured objectives, consists of objectives that result from applying maximum likelihood to Markov random field models. Unlike the finite sum case, all the non-linearity is contained within a partition function term, which does not readily decompose into a summation. For the finite sum problem, we introduce the Finito and SAGA algorithms, as well as variants of each. For graph-structured problems, we take three complementary approaches. We look at learning the parameters for a fixed structure, learning the structure independently, and learning both simultaneously. Specifically, for the combined approach, we introduce a new method for encouraging graph structures with the "scale-free" property. For the structure learning problem, we establish SHORTCUT, a O(n^{2.5}) expected time approximate structure learning method for Gaussian graphical models. For problems where the structure is known but the parameters unknown, we introduce an approximate maximum likelihood learning algorithm that is capable of learning a useful subclass of Gaussian graphical models.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1510.02533

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry:

Education (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Sparse Coding with Earth Mover's Distance for Multi-Instance Histogram Representation

Zhang, Mohua, Peng, Jianhua, Liu, Xuejie, Wang, Jim Jing-Yan

arXiv.org Machine LearningMar-14-2016

Sparse coding (Sc) has been studied very well as a powerful data representation method. It attempts to represent the feature vector of a data sample by reconstructing it as the sparse linear combination of some basic elements, and a $L_2$ norm distance function is usually used as the loss function for the reconstruction error. In this paper, we investigate using Sc as the representation method within multi-instance learning framework, where a sample is given as a bag of instances, and further represented as a histogram of the quantized instances. We argue that for the data type of histogram, using $L_2$ norm distance is not suitable, and propose to use the earth mover's distance (EMD) instead of $L_2$ norm distance as a measure of the reconstruction error. By minimizing the EMD between the histogram of a sample and the its reconstruction from some basic histograms, a novel sparse coding method is developed, which is refereed as SC-EMD. We evaluate its performances as a histogram representation method in tow multi-instance learning problems --- abnormal image detection in wireless capsule endoscopy videos, and protein binding site retrieval. The encouraging results demonstrate the advantages of the new method over the traditional method using $L_2$ norm distance.

histogram, sparse, vector, (14 more...)

arXiv.org Machine Learning

doi: 10.1007/s00521-016-2269-9

1502.02377

Country:

North America > United States > New York > Erie County > Buffalo (0.14)
Asia > China > Henan Province > Zhengzhou (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (0.49)
Information Technology > Security & Privacy (0.46)
Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Learning Network of Multivariate Hawkes Processes: A Time Series Approach

Etesami, Jalal, Kiyavash, Negar, Zhang, Kun, Singhal, Kushagra

arXiv.org Machine LearningMar-14-2016

Learning the influence structure of multiple time series data is of great interest to many disciplines. This paper studies the problem of recovering the causal structure in network of multivariate linear Hawkes processes. In such processes, the occurrence of an event in one process affects the probability of occurrence of new events in some other processes. Thus, a natural notion of causality exists between such processes captured by the support of the excitation matrix. We show that the resulting causal influence network is equivalent to the Directed Information graph (DIG) of the processes, which encodes the causal factorization of the joint distribution of the processes. Furthermore, we present an algorithm for learning the support of excitation matrix (or equivalently the DIG). The performance of the algorithm is evaluated on synthesized multivariate Hawkes networks as well as a stock market and MemeTracker real-world dataset.

artificial intelligence, hawke process, machine learning, (18 more...)

arXiv.org Machine Learning

1603.04319

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.70)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (0.48)

Technology:

Information Technology > Communications > Social Media (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Top-$K$ Ranking from Pairwise Comparisons: When Spectral Ranking is Optimal

Jang, Minje, Kim, Sunghyun, Suh, Changho, Oh, Sewoong

arXiv.org Machine LearningMar-14-2016

We explore the top-$K$ rank aggregation problem. Suppose a collection of items is compared in pairs repeatedly, and we aim to recover a consistent ordering that focuses on the top-$K$ ranked items based on partially revealed preference information. We investigate the Bradley-Terry-Luce model in which one ranks items according to their perceived utilities modeled as noisy observations of their underlying true utilities. Our main contributions are two-fold. First, in a general comparison model where item pairs to compare are given a priori, we attain an upper and lower bound on the sample size for reliable recovery of the top-$K$ ranked items. Second, more importantly, extending the result to a random comparison model where item pairs to compare are chosen independently with some probability, we show that in slightly restricted regimes, the gap between the derived bounds reduces to a constant factor, hence reveals that a spectral method can achieve the minimax optimality on the (order-wise) sample size required for top-$K$ ranking. That is to say, we demonstrate a spectral method alone to be sufficient to achieve the optimality and advantageous in terms of computational complexity, as it does not require an additional stage of maximum likelihood estimation that a state-of-the-art scheme employs to achieve the optimality. We corroborate our main results by numerical experiments.

artificial intelligence, machine learning, ranking, (13 more...)

arXiv.org Machine Learning

1603.04153

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

A Statistical Decision-Theoretic Framework for Social Choice

Soufiani, Hossein Azari, Parkes, David C., Xia, Lirong

arXiv.org Artificial IntelligenceMar-12-2016

In this paper, we take a statistical decision-theoretic viewpoint on social choice, putting a focus on the decision to be made on behalf of a system of agents. In our framework, we are given a statistical ranking model, a decision space, and a loss function defined on (parameter, decision) pairs, and formulate social choice mechanisms as decision rules that minimize expected loss. This suggests a general framework for the design and analysis of new social choice mechanisms. We compare Bayesian estimators, which minimize Bayesian expected loss, for the Mallows model and the Condorcet model respectively, and the Kemeny rule. We consider various normative properties, in addition to computational complexity and asymptotic behavior. In particular, we show that the Bayesian estimator for the Condorcet model satisfies some desired properties such as anonymity, neutrality, and monotonicity, can be computed in polynomial time, and is asymptotically different from the other two rules when the data are generated from the Condorcet model for some ground truth parameter.

artificial intelligence, kendall, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1410.7856

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback