AITopics | Directed Networks

Collaborating Authors

Directed Networks

News Overviews Instructional Materials AI-Alerts Classics

Probabilistic Inference of Twitter Users' Age based on What They Follow

Chamberlain, Benjamin Paul, Humby, Clive, Deisenroth, Marc Peter

arXiv.org Machine LearningFeb-24-2017

Twitter provides an open and rich source of data for studying human behaviour at scale and is widely used in social and network sciences. However, a major criticism of Twitter data is that demographic information is largely absent. Enhancing Twitter data with user ages would advance our ability to study social network structures, information flows and the spread of contagions. Approaches toward age detection of Twitter users typically focus on specific properties of tweets, e.g., linguistic features, which are language dependent. In this paper, we devise a language-independent methodology for determining the age of Twitter users from data that is native to the Twitter ecosystem. The key idea is to use a Bayesian framework to generalise ground-truth age information from a few Twitter users to the entire network based on what/whom they follow.

artificial intelligence, machine learning, twitter user, (16 more...)

arXiv.org Machine Learning

1601.04621

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

Sarcasm Detection with Machine Learning in Spark

#artificialintelligenceFeb-23-2017, 22:01:39 GMT

This post is inspired by a site I found whilst searching for a way to detect sarcasm within sentences. As humans we sometimes struggle detecting sarcasm when we have a lot more contextual information available to us. People are emotive when they speak, they use certain tones and these traits can help us understand when someone is being sarcastic. However we don't always catch it! So how the hell could a computer detect this, when all it has is text.

artificial intelligence, machine learning, training data, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Spectral Clustering using PCKID - A Probabilistic Cluster Kernel for Incomplete Data

Løkse, Sigurd, Bianchi, Filippo Maria, Salberg, Arnt-Børre, Jenssen, Robert

arXiv.org Machine LearningFeb-23-2017

In this paper, we propose PCKID, a novel, robust, kernel function for spectral clustering, specifically designed to handle incomplete data. By combining posterior distributions of Gaussian Mixture Models for incomplete data on different scales, we are able to learn a kernel for incomplete data that does not depend on any critical hyperparameters, unlike the commonly used RBF kernel. To evaluate our method, we perform experiments on two real datasets. PCKID outperforms the baseline methods for all fractions of missing values and in some cases outperforms the baseline methods with up to 25 percentage points.

data mining, machine learning, spectral, (15 more...)

arXiv.org Machine Learning

1702.0719

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Scalable Inference for Nested Chinese Restaurant Process Topic Models

Chen, Jianfei, Zhu, Jun, Lu, Jie, Liu, Shixia

arXiv.org Machine LearningFeb-22-2017

Nested Chinese Restaurant Process (nCRP) topic models are powerful nonparametric Bayesian methods to extract a topic hierarchy from a given text corpus, where the hierarchical structure is automatically determined by the data. Hierarchical Latent Dirichlet Allocation (hLDA) is a popular instance of nCRP topic models. However, hLDA has only been evaluated at small scale, because the existing collapsed Gibbs sampling and instantiated weight variational inference algorithms either are not scalable or sacrifice inference quality with mean-field assumptions. Moreover, an efficient distributed implementation of the data structures, such as dynamically growing count matrices and trees, is challenging. In this paper, we propose a novel partially collapsed Gibbs sampling (PCGS) algorithm, which combines the advantages of collapsed and instantiated weight algorithms to achieve good scalability as well as high model quality. An initialization strategy is presented to further improve the model quality. Finally, we propose an efficient distributed implementation of PCGS through vectorization, pre-processing, and a careful design of the concurrent data structures and communication strategy. Empirical studies show that our algorithm is 111 times more efficient than the previous open-source implementation for hLDA, with comparable or even better model quality. Our distributed implementation can extract 1,722 topics from a 131-million-document corpus with 28 billion tokens, which is 4-5 orders of magnitude larger than the previous largest corpus, with 50 machines in 7 hours.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1702.07083

Country:

Europe (1.00)
Asia > India (0.28)
Asia > Japan (0.28)
(2 more...)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Restaurants (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Add feedback

The Mathematics of Machine Learning

#artificialintelligenceFeb-21-2017, 03:55:20 GMT

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of many easy-to-use machine and deep learning packages such as scikit-learn, Weka, Tensorflow etc. Machine Learning theory is a field that intersects statistical, probabilistic, computer science and algorithmic aspects arising from learning iteratively from data and finding hidden insights which can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and getting good results. There are many reasons why the mathematics of Machine Learning is important and I'll highlight some of them below: The main question when trying to understand an interdisciplinary field such as Machine Learning is the amount of maths necessary and the level of maths needed to understand these techniques.

artificial intelligence, bayesian inference, machine learning, (11 more...)

#artificialintelligence

Industry: Education (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Add feedback

Social Learning and Diffusion of Pervasive Goods: An Empirical Study of an African App Store

Nia, Meisam Hejazi, Ratchford, Brian T., Bruce, Norris

arXiv.org Machine LearningFeb-21-2017

In this study, the authors develop a structural model that combines a macro diffusion model with a micro choice model to control for the effect of social influence on the mobile app choices of customers over app stores. Social influence refers to the density of adopters within the proximity of other customers. Using a large data set from an African app store and Bayesian estimation methods, the authors quantify the effect of social influence and investigate the impact of ignoring this process in estimating customer choices. The findings show that customer choices in the app store are explained better by offline than online density of adopters and that ignoring social influence in estimations results in biased estimates. Furthermore, the findings show that the mobile app adoption process is similar to adoption of music CDs, among all other classic economy goods. A counterfactual analysis shows that the app store can increase its revenue by 13.6% through a viral marketing policy (e.g., a sharing with friends and family button).

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1702.06661

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Media (0.92)
Telecommunications (0.92)
Information Technology (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Amortised MAP Inference for Image Super-resolution

Sønderby, Casper Kaae, Caballero, Jose, Theis, Lucas, Shi, Wenzhe, Huszár, Ferenc

arXiv.org Machine LearningFeb-21-2017

Image super-resolution (SR) is an underdetermined inverse problem, where a large number of plausible high-resolution images can explain the same downsampled image. Most current single image SR methods use empirical risk minimisation, often with a pixel-wise mean squared error (MSE) loss. However, the outputs from such methods tend to be blurry, over-smoothed and generally appear implausible. A more desirable approach would employ Maximum a Posteriori (MAP) inference, preferring solutions that always have a high probability under the image prior, and thus appear more plausible. Direct MAP estimation for SR is non-trivial, as it requires us to build a model for the image prior from samples. Furthermore, MAP inference is often performed via optimisation-based iterative algorithms which don't compare well with the efficiency of neural-network-based alternatives. Here we introduce new methods for amortised MAP inference whereby we calculate the MAP estimate directly using a convolutional neural network. We first introduce a novel neural network architecture that performs a projection to the affine subspace of valid SR solutions ensuring that the high resolution output of the network is always consistent with the low resolution input. We show that, using this architecture, the amortised MAP inference problem reduces to minimising the cross-entropy between two distributions, similar to training generative models. We propose three methods to solve this optimisation problem: (1) Generative Adversarial Networks (GAN) (2) denoiser-guided SR which backpropagates gradient-estimates from denoising to train the network, and (3) a baseline method using a maximum-likelihood-trained image prior. Our experiments show that the GAN based approach performs best on real image data. Lastly, we establish a connection between GANs and amortised variational inference as in e.g. variational autoencoders.

artificial intelligence, discriminator, machine learning, (19 more...)

arXiv.org Machine Learning

1610.0449

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Towards a Common Implementation of Reinforcement Learning for Multiple Robotic Tasks

Martínez-Tenor, Angel, Fernández-Madrigal, Juan Antonio, Cruz-Martín, Ana, González-Jiménez, Javier

arXiv.org Artificial IntelligenceFeb-21-2017

Mobile robots are increasingly being employed for performing complex tasks in dynamic environments. Reinforcement learning (RL) methods are recognized to be promising for specifying such tasks in a relatively simple manner. However, the strong dependency between the learning method and the task to learn is a well-known problem that restricts practical implementations of RL in robotics, often requiring major modifications of parameters and adding other techniques for each particular task. In this paper we present a practical core implementation of RL which enables the learning process for multiple robotic tasks with minimal per-task tuning or none. Based on value iteration methods, this implementation includes a novel approach for action selection, called Q-biased softmax regression (QBIASSR), which avoids poor performance of the learning process when the robot reaches new unexplored states. Our approach takes advantage of the structure of the state space by attending the physical variables involved (e.g., distances to obstacles, X,Y,{\theta} pose, etc.), thus experienced sets of states may favor the decision-making process of unexplored or rarely-explored states. This improvement has a relevant role in reducing the tuning of the algorithm for particular tasks. Experiments with real and simulated robots, performed with the software framework also introduced here, show that our implementation is effectively able to learn different robotic tasks without tuning the learning method. Results also suggest that the combination of true online SARSA({\lambda}) with QBIASSR can outperform the existing RL core algorithms in low-dimensional robotic tasks.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.eswa.2017.11.011

1702.06329

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

On the Sample Complexity of Learning Graphical Games

Honorio, Jean

arXiv.org Machine LearningFeb-20-2017

We analyze the sample complexity of learning graphical games from purely behavioral data. We assume that we can only observe the players' joint actions and not their payoffs. We analyze the sufficient and necessary number of samples for the correct recovery of the set of pure-strategy Nash equilibria (PSNE) of the true game. Our analysis focuses on directed graphs with $n$ nodes and at most $k$ parents per node. Sparse graphs correspond to ${k \in O(1)}$ with respect to $n$, while dense graphs correspond to ${k \in O(n)}$. By using VC dimension arguments, we show that if the number of samples is greater than ${O(k n \log^2{n})}$ for sparse graphs or ${O(n^2 \log{n})}$ for dense graphs, then maximum likelihood estimation correctly recovers the PSNE with high probability. By using information-theoretic arguments, we show that if the number of samples is less than ${\Omega(k n \log^2{n})}$ for sparse graphs or ${\Omega(n^2 \log{n})}$ for dense graphs, then any conceivable method fails to recover the PSNE with arbitrary probability.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1601.07243

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)

Add feedback

Approximate Bayes learning of stochastic differential equations

Batz, Philipp, Ruttor, Andreas, Opper, Manfred

arXiv.org Machine LearningFeb-17-2017

Gaussian processes are used as flexible models for these functions and estimates are calculated directly from dense data sets using Gaussian process regression. We also develop an approximate expectation maximization algorithm to deal with the unobserved, latent dynamics between sparse observations. The posterior over states is approximated by a piecewise linearized process of the Ornstein-Uhlenbeck type and the maximum a posteriori estimation of the drift is facilitated by a sparse Gaussian process approximation. I. INTRODUCTION Dynamical systems in the physical world evolve in continuous time and often the (noisy) dynamics is described naturally in terms of (stochastic) differential equations [1]. However, due to missing information and/or the complexity of a system it may be difficult to derive such a model from first principles. Instead, the goal often is to fit it to observations of the state at discrete points in time [2]. So far most inference approaches for these systems have dealt with the estimation of parameters contained in the drift function (e.g. Assumptions for the stochastic part were often simple: additive noise with the diffusion constant as the only parameter to estimate. But as both drift and diffusion can be nonlinear functions of the state vector, a nonparametric estimation would be a natural generalization, when a large number of data points is available. Previous nonparametric approaches were based on solving the adjoint Fokker-Planck equation [5] and on kernel estimators [6] and are effectively restricted to one-dimensional models. An alternative would be a Bayesian nonparametric approach, where prior knowledge on the unknown functions--such as smoothness, variability, or periodicity--can be encoded in a probability distribution. A recent result by [7, 8] presented an important step in this direction. The authors have shown that Gaussian processes (GPs) provide a natural family of prior probability measures over drift functions. If a path of the stochastic dynamics is observed densely, the posterior process over the drift is also a GP. Unfortunately, this simplicity is lost, when observations are not dense, but separated by larger time intervals. In [7] the case of sparse observations has been treated by a Monte Carlo approach, which alternates between sampling complete diffusion paths of the stochastic differential equation (SDE) and sampling from GP for the drift given a philipp.batz@tu-berlin.de

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1702.0539

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback