AITopics

1902.03932

Country:

North America > United States > New York (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Schubert, Erich, Zimek, Arthur

ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 "Heidelberg"

arXiv.org Machine LearningFeb-10-2019

This paper documents the release of the ELKI data mining framework, version 0.7.5. ELKI is an open source (AGPLv3) data mining software written in Java. The focus of ELKI is research in algorithms, with an emphasis on unsupervised methods in cluster analysis and outlier detection. In order to achieve high performance and scalability, ELKI offers data index structures such as the R*-tree that can provide major performance gains. ELKI is designed to be easy to extend for researchers and students in this domain, and welcomes contributions of additional methods. ELKI aims at providing a large collection of highly parameterizable algorithms, in order to allow easy and fair evaluation and benchmarking of algorithms. We will first outline the motivation for this release, the plans for the future, and then give a brief overview over the new functionality in this version. We also include an appendix presenting an overview on the overall implemented functionality.

density-based clustering, survey article, upstream oil & gas, (22 more...)

1902.03616

Country:

North America > United States > Wisconsin (0.14)
North America > United States > Virginia (0.13)
North America > United States > New York (0.13)
(3 more...)

Genre: Research Report > Experimental Study (0.45)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(4 more...)

Vergari, Antonio, Molina, Alejandro, Peharz, Robert, Ghahramani, Zoubin, Kersting, Kristian, Valera, Isabel

Automatic Bayesian Density Analysis

arXiv.org Machine LearningFeb-10-2019

Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for {exploratory data analysis} are usually not flexible enough to deal with the uncertainty inherent to real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference. As a result, supervision from statisticians is usually needed to find the right model for the data. However, since domain experts are not necessarily also experts in statistics, we propose Automatic Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible at large. Specifically, ABDA allows for automatic and efficient missing value estimation, statistical data type and likelihood discovery, anomaly detection and dependency structure mining, on top of providing accurate density estimation. Extensive empirical evidence shows that ABDA is a suitable tool for automatic exploratory analysis of mixed continuous and discrete tabular data.

abda, dataset, likelihood model, (15 more...)

1807.09306

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Floriana (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Valenzuela, Cristobal, Tobar, Felipe

Low-pass filtering as Bayesian inference

arXiv.org Machine LearningFeb-9-2019

This is because the concentration of energy at a specific range of frequencies might be indicative of mechanical faults [1], cardiac anomalies [2], astronomical discoveries [3, 4], and whale calls from submarine audio recordings [5] to name a few. The standard practice to isolate components within a specific frequency range from a time-series observation, referred to as filtering, isto convolve the observations with an object called linear filter. This convolution removes all frequencies that do not correspond to the desired frequency range, thus, filtering out unimportant frequencies. Thetheoretical rationale behind this approach is supported by the application of the Convolution Theorem [6] to power spectral densities (PSD): the PSD of a filtered time series corresponds to the PSD of the linear filter (user-designed) multiplied by the PSD of the observed time series (not controllable). This result allows for designing thelinear filter so as to remove unwanted frequency components to then perform the numerical convolution.

frequency, gaussian process, time sery, (14 more...)

1902.03427

Country:

South America > Chile (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)

James, Nick, Marchant, Roman, Gerlach, Richard, Cripps, Sally

Bayesian Nonparametric Adaptive Spectral Density Estimation for Financial Time Series

Discrimination between non-stationarity and long-range dependency is a difficult and long-standing issue in modelling financial time series. This paper uses an adaptive spectral technique which jointly models the non-stationarity and dependency of financial time series in a non-parametric fashion assuming that the time series consists of a finite, but unknown number, of locally stationary processes, the locations of which are also unknown. The model allows a non-parametric estimate of the dependency structure by modelling the auto-covariance function in the spectral domain. All our estimates are made within a Bayesian framework where we use aReversible Jump Markov Chain Monte Carlo algorithm for inference. We study the frequentist properties of our estimates via a simulation study, and present a novel way of generating time series data from a nonparametric spectrum. Results indicate that our techniques perform well across a range of data generating processes. We apply our method to a number of real examples and our results indicate that several financial time series exhibit both long-range dependency and non-stationarity.

nonparametric adaptive spectral density estimation, time sery, volatility, (9 more...)

1902.0335

Country:

Europe > United Kingdom (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.70)

Industry: Banking & Finance (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Seshadri, Arjun, Peysakhovich, Alexander, Ugander, Johan

Discovering Context Effects from Raw Choice Data

Many applications in preference learning assume that decisions come from the maximization of a stable utility function. Yet a large experimental literature shows that individual choices and judgements can be affected by "irrelevant" aspects of the context in which they are made. An important class of such contexts is the composition of the choice set. In this work, our goal is to discover such choice set effects from raw choice data. We introduce an extension of the Multinomial Logit (MNL) model, called the context dependent random utility model (CDM), which allows for a particular class of choice set effects. We show that the CDM can be thought of as a second-order approximation to a general choice system, can be inferred optimally using maximum likelihood and, importantly, is easily interpretable. We apply the CDM to both real and simulated choice data to perform principled exploratory analyses for the presence of choice set effects.

cdm, discovering context effect, probability, (15 more...)

1902.03266

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Fong, Edwin, Lyddon, Simon, Holmes, Chris

Scalable Nonparametric Sampling from Multimodal Posteriors with the Posterior Bootstrap

Increasingly complex datasets pose a number of challenges for Bayesian inference. Conventional posterior sampling based on Markov chain Monte Carlo can be too computationally intensive, is serial in nature and mixes poorly between posterior modes. Further, all models are misspecified, which brings into question the validity of the conventional Bayesian update. We present a scalable Bayesian nonparametric learning routine that enables posterior sampling through the optimization of suitably randomized objective functions. A Dirichlet process prior on the unknown data distribution accounts for model misspecification, and admits an embarrassingly parallel posterior bootstrap algorithm that generates independent and exact samples from the nonparametric posterior distribution. Our method is particularly adept at sampling from multimodal posterior distributions via a random restart mechanism. We demonstrate our method on Gaussian mixture model and sparse logistic regression examples.

multimodal posterior, posterior, scalable nonparametric sampling, (13 more...)

1902.03175

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Yassine, Hachem, Badiu, Mihai-Alin, Coon, Justin

Model-Based Detector for SSDs in the Presence of Inter-cell Interference

In this paper, we consider the problem of reducing the bit error rate of flash-based solid state drives (SSDs) when cells are subject to inter-cell interference (ICI). By observing that the outputs of adjacent victim cells can be correlated due to common aggressors, we propose a novel channel model to accurately represent the true flash channel. This model, equivalent to a finite-state Markov channel model, allows the use of the sum-product algorithm to calculate more accurate posterior distributions of individual cell inputs given the joint outputs of victim cells. These posteriors can be easily mapped to the log-likelihood ratios that are passed as inputs to the soft LDPC decoder. When the output is available with high precision, our simulation showed that a significant reduction in the bit-error rate can be obtained, reaching $99.99\%$ reduction compared to current methods, when the diagonal coupling is very strong. In the realistic case of low-precision output, our scheme provides less impressive improvements due to information loss in the process of quantization. To improve the performance of the new detector in the quantized case, we propose a new iterative scheme that alternates multiple times between the detector and the decoder. Our simulations showed that the iterative scheme can significantly improve the bit error rate even in the quantized case.

aggressor, detector, quantization, (14 more...)

1902.01212

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Pillutla, Krishna, Roulet, Vincent, Kakade, Sham M., Harchaoui, Zaid

A Smoother Way to Train Structured Prediction Models

We present a framework to train a structured prediction model by performing smoothing on the inference algorithm it builds upon. Smoothing overcomes the non-smoothness inherent to the maximum margin structured prediction objective, and paves the way for the use of fast primal gradient-based optimization algorithms. We illustrate the proposed framework by developing a novel primal incremental optimization algorithm for the structural support vector machine. The proposed algorithm blends an extrapolation scheme for acceleration and an adaptive smoothing scheme and builds upon the stochastic variance-reduced gradient algorithm. We establish its worst-case global complexity bound and study several practical variants, including extensions to deep structured prediction. We present experimental results on two real-world problems, namely named entity recognition and visual object localization. The experimental results show that the proposed framework allows us to build upon efficient inference algorithms to develop large-scale optimization algorithms for structured prediction which can achieve competitive performance on the two real-world problems.

algorithm, optimization algorithm, oracle call, (11 more...)

1902.03228

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Kajihara, Takafumi, Kanagawa, Motonobu, Nakaguchi, Yuuki, Khandelwal, Kanishka, Fukumiziu, Kenji

Model Selection for Simulator-based Statistical Models: A Kernel Approach

arXiv.org Machine LearningFeb-7-2019

We propose a novel approach to model selection for simulator-based statistical models. The proposed approach defines a mixture of candidate models, and then iteratively updates the weight coefficients for those models as well as the parameters in each model simultaneously; this is done by recursively applying Bayes' rule, using the recently proposed kernel recursive ABC algorithm. The practical advantage of the method is that it can be used even when a modeler lacks appropriate prior knowledge about the parameters in each model. We demonstrate the effectiveness of the proposed approach with a number of experiments, including model selection for dynamical systems in ecology and epidemiology.

coefficient, experiment, model selection, (16 more...)

1902.02517

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.05)
Asia > Middle East > Lebanon (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)