AITopics

1809.02188

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)

Yang, Sikun, Koeppl, Heinz

Collapsed Variational Inference for Nonparametric Bayesian Group Factor Analysis

arXiv.org Machine Learning, Sep-10-2018

Group factor analysis (GFA) methods have been widely used to infer the common structure and the group-specific signals from multiple related datasets in various fields including systems biology and neuroimaging. To date, most available GFA models require Gibbs sampling or slice sampling to perform inference, which prevents the practical application of GFA to large-scale data. In this paper we present an efficient collapsed variational inference (CVI) algorithm for the nonparametric Bayesian group factor analysis (NGFA) model built upon a hierarchical beta-Bernoulli process. Our CVI algorithm proceeds by marginalizing out the group-specific beta process parameters, and then approximating the true posterior in the collapsed space using mean-field methods. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our CVI algorithm for the NGFA compared with state-of-the-art GFA methods.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1809.03566

Country:

Europe > Finland > Paijanne Tavastia > Lahti (0.05)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Khodayar, Mahdi, Mohammadi, Saeed, Khodayar, Mohammad, Wang, Jianhui, Liu, Guangyi

Convolutional Graph Auto-encoder: A Deep Generative Neural Architecture for Probabilistic Spatio-temporal Solar Irradiance Forecasting

arXiv.org Machine Learning, Sep-10-2018

Machine learning on graph-structured data is an important and omnipresent task for a vast variety of applications including anomaly detection and dynamic network analysis. In this paper, a deep generative model is introduced to capture continuous probability densities corresponding to the nodes of an arbitrary graph. In contrast to all learning formulations in the area of discriminative pattern recognition, we propose a scalable generative optimization/algorithm theoretically proved to capture distributions at the nodes of a graph. Our model is able to generate samples from the probability densities learned at each node. This probabilistic data generation model, i.e., the convolutional graph auto-encoder (CGAE), is devised based on the localized first-order approximation of spectral graph convolutions, deep learning, and variational Bayesian inference. We apply our CGAE to a new problem, spatio-temporal probabilistic solar irradiance prediction. Multiple solar radiation measurement sites in a wide area in the northern states of the US are modeled as an undirected graph. Using our proposed model, the distribution of future irradiance given historical radiation observations is estimated for every site/node. Numerical results on the National Solar Radiation Database show state-of-the-art performance for probabilistic radiation prediction on geographically distributed irradiance data in terms of reliability, sharpness, and continuous ranked probability score.

artificial intelligence, machine learning, prediction, (17 more...)

1809.03538

Country:

North America > United States > Michigan (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)

Genre: Research Report (0.64)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Sep-9-2018

Variational Approximation Accuracy in Bayesian Non-negative Matrix Factorization

Hayashi, Naoki

Non-negative matrix factorization (NMF) is a knowledge discovery method used in many fields, and its variational inference and Gibbs sampling methods are also well known. However, the variational approximation accuracy has not yet been clarified, since NMF is not statistically regular and the prior used in the variational Bayesian NMF (VBNMF) has zero or divergence points. In this paper, using algebraic geometrical methods, we theoretically analyze the difference in the negative log evidence/marginal likelihood (free energy) between VBNMF and Bayesian NMF, and give an asymptotic lower bound on the approximation accuracy. The results quantitatively show how well the VBNMF algorithm can approximate Bayesian NMF.

artificial intelligence, machine learning, matrix factorization, (14 more...)

1809.02963

Country: North America > United States (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, Sep-7-2018

Deep Recurrent Survival Analysis

Ren, Kan, Qin, Jiarui, Zheng, Lei, Yang, Zhengyu, Zhang, Weinan, Qiu, Lin, Yu, Yong

Survival analysis is a hotspot in statistical research for modeling time-to-event information with data censorship handling, which has been widely used in many applications such as clinical research, information systems and other fields with survivorship bias. Many works have been proposed for survival analysis, ranging from traditional statistical methods to machine learning models. However, the existing methodologies either utilize counting-based statistics on the segmented data, or make a prior assumption on the event probability distribution w.r.t. time. Moreover, few works consider sequential patterns within the feature space. In this paper, we propose a Deep Recurrent Survival Analysis model which combines deep learning for conditional probability prediction at a fine-grained level of the data, and survival analysis for tackling the censorship. By capturing the time dependency through modeling the conditional probability of the event for each sample, our method predicts the likelihood of the true event occurrence and estimates the survival rate over time, i.e., the probability of the non-occurrence of the event, for the censored data. Meanwhile, without assuming any specific form of the event probability distribution, our model shows great advantages over the previous works on fitting various sophisticated data distributions. In experiments on three real-world tasks from different fields, our model significantly outperforms the state-of-the-art solutions under various metrics.

artificial intelligence, machine learning, survival analysis, (21 more...)
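
The Deep Recurrent Survival Analysis abstract above describes deriving a survival rate from per-interval conditional event probabilities. A minimal sketch of that chain-rule step follows, assuming discrete time intervals and a hypothetical list `hazards` of conditional probabilities; in the paper these would come from a recurrent network, which is not reproduced here. The function names are illustrative only.

```python
from itertools import accumulate
from operator import mul


def survival_curve(hazards):
    """Survival after each interval: S_l = prod_{i <= l} (1 - h_i)."""
    return list(accumulate((1.0 - h for h in hazards), mul))


def event_probabilities(hazards):
    """Probability the event falls in interval l: h_l * prod_{i < l} (1 - h_i)."""
    survived_before = [1.0] + survival_curve(hazards)
    return [h * s for h, s in zip(hazards, survived_before)]


# Hypothetical conditional event probabilities for three intervals.
hazards = [0.1, 0.2, 0.5]
survival = survival_curve(hazards)
event_probs = event_probabilities(hazards)
```

The event probabilities and the final survival value sum to one, so a censored sample can contribute its survival probability to a likelihood while an uncensored sample contributes its event probability.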

1809.02403

Country:

Europe > Portugal > Lisbon > Lisbon (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Law > Civil Rights & Constitutional Law (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Nonparametric Spectral Estimation

Tobar, Felipe

Spectral estimation (SE) aims to identify how the energy of a signal (e.g., a time series) is distributed across different frequencies. This can become particularly challenging when only partial and noisy observations are available, where current methods fail to handle uncertainty appropriately. In this context, we propose a joint probabilistic model for signals, observations and spectra, where SE is addressed as an inference problem. Assuming a Gaussian process prior over the signal, we apply Bayes' rule to find the analytic posterior distribution of the spectrum given a set of observations. Besides its expressiveness and natural account of spectral uncertainty, the proposed model also provides a functional-form representation of the power spectral density, which can be optimised efficiently. Comparison with previous approaches is addressed theoretically, showing that the proposed method is an infinite-dimensional variant of the Lomb-Scargle approach, and also empirically through three experiments.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1809.02196

Country:

North America > United States > New York (0.05)
South America > Chile (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Logistic Regression Augmented Community Detection for Network Data with Application in Identifying Autism-Related Gene Pathways

Zhao, Yunpeng, Pan, Qing, Du, Chengan

When searching for gene pathways leading to specific disease outcomes, additional information on gene characteristics is often available that may help differentiate genes related to the disease from irrelevant background when connections involving both types of genes are observed and their relationships to the disease are unknown. We propose a method to single out irrelevant background genes with the help of auxiliary information through a logistic regression, and to cluster relevant genes into cohesive groups using the adjacency matrix. The expectation-maximization algorithm is modified to maximize a joint pseudo-likelihood assuming latent indicators for relevance to the disease and latent group memberships, as well as Poisson or multinomial distributed link numbers within and between groups. A robust version allowing arbitrary linkage patterns within the background is further derived. Asymptotic consistency of label assignments under the stochastic blockmodel is proven. Superior performance and robustness in finite samples are observed in simulation studies. The proposed robust method identifies previously missed gene sets underlying autism-related neurological diseases using diverse data sources including de novo mutations, gene expressions and protein-protein interactions.

artificial intelligence, data mining, machine learning, (17 more...)

1809.02262

Country: North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.69)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Autism (0.71)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.63)

Dynamic Hierarchical Empirical Bayes: A Predictive Model Applied to Online Advertising

Yuan, Yuan, Dong, Xiaojing, Dong, Chen, Sun, Yiwen, Yan, Zhenyu, Pani, Abhishek

Predicting keyword performance, such as number of impressions, click-through rate (CTR), conversion rate (CVR), revenue per click (RPC), and cost per click (CPC), is critical for sponsored search in the online advertising industry. An interesting phenomenon is that, despite the size of the overall data, the data are very sparse at the individual unit level. To overcome the sparsity and leverage hierarchical information across the data structure, we propose a Dynamic Hierarchical Empirical Bayesian (DHEB) model that dynamically determines the hierarchy through a data-driven process and provides shrinkage-based estimations. Our method is also equipped with an efficient empirical approach to derive inferences through the hierarchy. We evaluate the proposed method on both simulated and real-world datasets and compare it to several competitive models. The results favor the proposed method among all comparisons in terms of both accuracy and efficiency. Finally, we design a two-phase system to serve predictions in real time.

artificial intelligence, data mining, machine learning, (19 more...)

1809.02213

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.50)

Industry:

Marketing (1.00)
Information Technology > Services (0.85)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Hands-on Experience with Gaussian Processes (GPs): Implementing GPs in Python - I

Tiwari, Kshitij

This document complements our website, which was developed with the aim of introducing students to Gaussian Processes (GPs). GPs are non-parametric Bayesian regression models that are widely used by statisticians and geospatial data scientists for modeling spatial data. Several open source libraries in Matlab [1], Python [2], R [3], etc., are already available for simple plug-and-use. The objective of this handout, and in turn the website, is to allow users to develop stand-alone GPs in Python with minimal external dependencies. To this end, we only use the default Python modules and assist users in developing their own GPs from scratch, giving them an in-depth knowledge of what goes on under the hood. The module covers GP inference using maximum likelihood estimation (MLE) and gives examples of 1D (dummy) spatial data.

artificial intelligence, machine learning, maximum likelihood estimation, (16 more...)
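
The Gaussian process handout summarized above covers GP inference with MLE on 1D data using only default Python modules. A minimal sketch in that spirit follows; it is not the handout's own code. The RBF kernel, the fixed noise level, the grid of candidate lengthscales, and all function names are assumptions made for illustration. The lengthscale is chosen by maximizing the log marginal likelihood over the grid.

```python
import math


def rbf(x1, x2, lengthscale, variance=1.0):
    """Squared-exponential kernel for scalar inputs (assumed kernel choice)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def cholesky(a):
    """Lower-triangular factor L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit(xs, ys, lengthscale, noise=1e-2):
    """Return the weights K^{-1} y and the log marginal likelihood."""
    n = len(xs)
    k = [[rbf(xs[i], xs[j], lengthscale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(k)
    alpha = solve_cholesky(low, ys)
    log_ml = (-0.5 * sum(y * a for y, a in zip(ys, alpha))
              - sum(math.log(low[i][i]) for i in range(n))
              - 0.5 * n * math.log(2.0 * math.pi))
    return alpha, log_ml


def predict_mean(x_new, xs, alpha, lengthscale):
    """Posterior mean at x_new: k(x_new, X) K^{-1} y."""
    return sum(rbf(x_new, x, lengthscale) * a for x, a in zip(xs, alpha))


# Dummy 1D data and a hypothetical grid of lengthscales.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [math.sin(x) for x in xs]
candidates = [0.3, 1.0, 3.0]
best_ls = max(candidates, key=lambda ls: fit(xs, ys, ls)[1])
alpha, log_ml = fit(xs, ys, best_ls)
mean_at_2 = predict_mean(2.0, xs, alpha, best_ls)
```

A grid search keeps the sketch dependency-free; a gradient-based optimizer over all kernel hyperparameters would be the more usual choice in practice.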

1809.01913

Country:

North America > United States > Virginia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Finland (0.04)

Genre:

Research Report (0.51)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Hawkins, Cole, Zhang, Zheng

Variational Bayesian Inference for Robust Streaming Tensor Factorization and Completion

Streaming tensor factorization is a powerful tool for processing high-volume and multi-way temporal data in Internet networks, recommender systems and image/video data analysis. Existing streaming tensor factorization algorithms rely on least-squares data fitting and they do not possess a mechanism for tensor rank determination. This leaves them susceptible to outliers and vulnerable to over-fitting. This paper presents a Bayesian robust streaming tensor factorization model to identify sparse outliers, automatically determine the underlying tensor rank and accurately fit low-rank structure. We implement our model in Matlab and compare it with existing algorithms on tensor datasets generated from dynamic MRI and Internet traffic.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1809.02153

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.83)