AITopics

doi: 10.1007/s00180-017-0721-7

1703.07305

Country: North America > United States (0.27)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Mathews, George M., Vial, John

Overcoming model simplifications when quantifying predictive uncertainty

arXiv.org Machine Learning, Mar-21-2017

It is generally accepted that all models are wrong -- the difficulty is determining which are useful. Here, a useful model is considered as one that is capable of combining data and expert knowledge, through an inversion or calibration process, to adequately characterize the uncertainty in predictions of interest. This paper derives conditions that specify which simplified models are useful and how they should be calibrated. To start, the notion of an optimal simplification is defined. This relates the model simplifications to the nature of the data and predictions, and determines when a standard probabilistic calibration scheme is capable of accurately characterizing uncertainty. Furthermore, two additional conditions are defined for suboptimal models that determine when the simplifications can be safely ignored. The first allows a suboptimally simplified model to be used in a way that replicates the performance of an optimal model. This is achieved through the judicious selection of a prior term for the calibration process that explicitly includes the nature of the data, predictions and modelling simplifications. The second considers the dependency structure between the predictions and the available data to gain insights into when the simplifications can be overcome by using the right calibration data. The derived conditions are also related to the commonly used calibration schemes based on Tikhonov and subspace regularization. To allow concrete insights to be obtained, the analysis is performed under a linear expansion of the model equations and where the predictive uncertainty is characterized via second order moments only.
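As a rough illustration of the linear, second-order setting described in the abstract, the sketch below calibrates a linear model in Tikhonov form and propagates the parameter covariance to a scalar prediction. The matrix `G`, the prediction weights `y`, the noise and prior scales, and both function names are illustrative assumptions; none of them comes from the paper, and this is not the authors' method.

```python
import numpy as np

def tikhonov_calibrate(G, d, sigma_e, sigma_m):
    """Linear-Gaussian calibration with an isotropic prior (Tikhonov form).

    Assumes d = G @ m + noise, noise ~ N(0, sigma_e^2 I), m ~ N(0, sigma_m^2 I).
    Returns the posterior mean and covariance of the parameters m.
    """
    lam = (sigma_e / sigma_m) ** 2          # Tikhonov weight implied by the prior
    A = G.T @ G + lam * np.eye(G.shape[1])
    mean = np.linalg.solve(A, G.T @ d)
    cov = sigma_e ** 2 * np.linalg.inv(A)
    return mean, cov

def predictive_moments(y, mean, cov):
    """Mean and variance of the scalar prediction s = y @ m."""
    return y @ mean, y @ cov @ y

# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
G = rng.normal(size=(20, 5))
m_true = rng.normal(size=5)
d = G @ m_true + 0.1 * rng.normal(size=20)
y = rng.normal(size=5)

mean, cov = tikhonov_calibrate(G, d, sigma_e=0.1, sigma_m=1.0)
s_mean, s_var = predictive_moments(y, mean, cov)
```

With an isotropic Gaussian prior, the Tikhonov weight equals the ratio of noise variance to prior variance, which is why the regularized solution and the posterior mean coincide in this sketch.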

decision support system, machine learning, prediction, (20 more...)

1703.07198

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Modeling & Simulation (0.68)
Information Technology > Decision Support Systems (0.67)

Serban, Iulian Vlad, Lowe, Ryan, Henderson, Peter, Charlin, Laurent, Pineau, Joelle

A Survey of Available Corpora for Building Data-Driven Dialogue Systems

arXiv.org Artificial Intelligence, Mar-20-2017

During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss appropriate choice of evaluation metrics for the learning objective.

information retrieval, machine learning, reinforcement learning, (25 more...)


1512.05742

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.34)

Industry:

Media > Television (1.00)
Media > Film (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(13 more...)

Independence clustering (without a matrix)

Ryabko, Daniil

The independence clustering problem is considered in the following formulation: given a set $S$ of random variables, it is required to find the finest partitioning $\{U_1,\dots,U_k\}$ of $S$ into clusters such that the clusters $U_1,\dots,U_k$ are mutually independent. Since mutual independence is the target, pairwise similarity measurements are of no use, and thus traditional clustering algorithms are inapplicable. The distribution of the random variables in $S$ is, in general, unknown, but a sample is available. Thus, the problem is cast in terms of time series. Two forms of sampling are considered: i.i.d. and stationary time series, with the main emphasis being on the latter, more general, case. A consistent, computationally tractable algorithm for each of the settings is proposed, and a number of open directions for further research are outlined.
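To see why pairwise measurements cannot settle mutual independence, consider a standard textbook example (not taken from the paper): two independent fair bits X and Y, and Z = X xor Y. The sketch below computes mutual information exactly from the joint distribution; the helper names are illustrative.

```python
import itertools
import math
from collections import Counter

# Joint distribution of (X, Y, Z) with X, Y independent fair bits and Z = X xor Y.
outcomes = [(x, y, x ^ y) for x, y in itertools.product((0, 1), repeat=2)]
joint = {o: 0.25 for o in outcomes}

def marginal(joint, idx):
    """Marginal distribution over the coordinates listed in idx."""
    out = Counter()
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out

def mutual_information(joint, a, b):
    """I(A; B) in bits between coordinate groups a and b (tuples of indices)."""
    pa, pb, pab = marginal(joint, a), marginal(joint, b), marginal(joint, a + b)
    return sum(p * math.log2(p / (pa[k[:len(a)]] * pb[k[len(a):]]))
               for k, p in pab.items() if p > 0)

pairwise = {(i, j): mutual_information(joint, (i,), (j,))
            for i, j in itertools.combinations(range(3), 2)}
joint_dep = mutual_information(joint, (0, 1), (2,))

print(pairwise)    # every pairwise value is 0.0
print(joint_dep)   # 1.0 bit: Z is fully determined by (X, Y)
```

Every pair is independent, yet the three variables are not mutually independent, so the finest valid partition here is the single cluster {X, Y, Z}.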

algorithm, artificial intelligence, machine learning, (17 more...)

1703.067

Country: Asia > Middle East (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Bordes, Florian, Honari, Sina, Vincent, Pascal

Learning to Generate Samples from Noise through Infusion Training

In this work, we investigate a novel training procedure to learn a generative model as the transition operator of a Markov chain, such that, when applied repeatedly to an unstructured random noise sample, it will denoise it into a sample that matches the target distribution from the training set. The novel training procedure to learn this progressive denoising operation involves sampling from a slightly different chain than the model chain used for generation in the absence of a denoising target. In the training chain we infuse information from the training target example that we would like the chains to reach with a high probability. The transition operator learned in this way is able to produce quality and varied samples in a small number of steps. Experiments show competitive results compared to the samples generated with a basic Generative Adversarial Net.

artificial intelligence, infusion rate, machine learning, (16 more...)

1703.06975

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Quebec > Montreal (0.14)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Bayesian Adaptive Data Analysis Guarantees from Subgaussianity

Elder, Sam

The new field of adaptive data analysis seeks to provide algorithms and provable guarantees for models of machine learning that allow researchers to reuse their data, which normally falls outside of the usual statistical paradigm of static data analysis. In 2014, Dwork, Feldman, Hardt, Pitassi, Reingold and Roth introduced one potential model and proposed several solutions based on differential privacy. In previous work in 2016, we described a problem with this model and instead proposed a Bayesian variant, but also found that the analogous Bayesian methods cannot achieve the same statistical guarantees as in the static case. In this paper, we prove the first positive results for the Bayesian model, showing that with a Dirichlet prior, the posterior mean algorithm indeed matches the statistical guarantees of the static case. The main ingredient is a new theorem showing that the $\mathrm{Beta}(\alpha,\beta)$ distribution is subgaussian with variance proxy $O(1/(\alpha+\beta+1))$, a concentration result also of independent interest. We provide two proofs of this result: a probabilistic proof utilizing a simple condition for the raw moments of a positive random variable and a learning-theoretic proof based on considering the beta distribution as a posterior, both of which have implications for other related problems.
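For context, the standard definition of a variance proxy and the elementary variance bound for the beta distribution are written out below. This is textbook background rather than the paper's proof, and the symbol $\sigma^2$ is introduced here only for illustration.

```latex
% X is subgaussian with variance proxy \sigma^2 if, for all \lambda \in \mathbb{R},
\mathbb{E}\!\left[e^{\lambda (X - \mathbb{E}X)}\right]
  \le \exp\!\left(\tfrac{\lambda^2 \sigma^2}{2}\right),
\qquad\text{which implies}\qquad
\operatorname{Var}(X) \le \sigma^2 .

% For X \sim \mathrm{Beta}(\alpha,\beta), using \alpha\beta \le (\alpha+\beta)^2/4:
\operatorname{Var}(X)
  = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}
  \le \frac{1}{4(\alpha+\beta+1)} .
```

Since a variance proxy can never be smaller than the variance, and the bound is attained when $\alpha=\beta$, a proxy of order $1/(\alpha+\beta+1)$ cannot be improved uniformly over all parameter values.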

artificial intelligence, beta distribution, machine learning, (19 more...)

1611.00065

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Challenges in Bayesian Adaptive Data Analysis

Elder, Sam

Traditional statistical analysis requires that the analysis process and data are independent. By contrast, the new field of adaptive data analysis hopes to understand and provide algorithms and accuracy guarantees for research as it is commonly performed in practice, as an iterative process of interacting repeatedly with the same data set, such as repeated tests against a holdout set. Previous work has defined a model with a rather strong lower bound on sample complexity in terms of the number of queries, $n\sim\sqrt q$, arguing that adaptive data analysis is much harder than static data analysis, where $n\sim\log q$ is possible. Instead, we argue that those strong lower bounds point to a limitation of the previous model in that it must consider wildly asymmetric scenarios which do not hold in typical applications. To better understand other difficulties of adaptivity, we propose a new Bayesian version of the problem that mandates symmetry. Since the other lower bound techniques are ruled out, we can more effectively see difficulties that might otherwise be overshadowed. As a first contribution to this model, we produce a new problem using error-correcting codes on which a large family of methods, including all previously proposed algorithms, require roughly $n\sim\sqrt[4]q$. These early results illustrate new difficulties in adaptive data analysis regarding slightly correlated queries on problems with concentrated uncertainty.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1604.02492

Country: North America > United States (0.92)

Genre: Research Report > Experimental Study (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Sarvadevabhatla, Ravi Kiran, Suresh, Sudharshan, Babu, R. Venkatesh

Object category understanding via eye fixations on freehand sketches

arXiv.org Artificial Intelligence, Mar-19-2017

When shown photographic images under a free-viewing (i.e., task-free) paradigm, human eyes preferentially fixate on image locations which are visually salient. Multiple studies [1]-[5] have demonstrated that this fixation mechanism is bottom-up, predominantly driven by image content and richness of detail (color, texture, etc.). This explanation, while satisfactory for photographic images, seems inadequate for certain categories of images such as line drawings. In particular, one class of line drawings - hand-drawn sketches - are sparse and largely devoid of detailed content. In addition, they are typically binary images containing virtually no color-based information (see Figure 1). Even so, multiple studies have demonstrated a "fixations-into-nothing" phenomenon [6]-[9], wherein the eye fixations on the same stimulus by multiple subjects fall on empty regions, yet exhibit enough regularity to make gaze-based inferences. One possible explanation is that the first eye fixation conveys all there is to know ('Gestalt') about the underlying scene semantics [10] and the regularity in the rest of the fixations is a statistical anomaly. However, a more intriguing explanation is that these empty region fixations aim to implicitly verify the overall consistency of the scene content depicted in the sketch [11], [12]. Which of these explanations is correct?

category, data mining, machine learning, (22 more...)


doi: 10.1109/TIP.2017.2675539

1703.06554

Country:

Asia (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
(3 more...)

Hasani, Ramin M., Wang, Guodong, Grosu, Radu

An Automated Auto-encoder Correlation-based Health-Monitoring and Prognostic Method for Machine Bearings

arXiv.org Machine Learning, Mar-18-2017

This paper studies an intelligent technique for health monitoring and prognostics of common rotary machine components, particularly bearings. During a run-to-failure experiment, rich unsupervised features from vibration sensory data are extracted by a trained sparse auto-encoder. Then, the correlation of the extracted attributes of the initial samples (presumably healthy at the beginning of the test) with the succeeding samples is calculated and passed through a moving-average filter. The normalized output is named the auto-encoder correlation-based (AEC) rate, an informative attribute of the system that depicts its health status and precisely identifies the degradation starting point. We show that the AEC technique generalizes well across several run-to-failure tests. AEC collects rich unsupervised features from the vibration data fully autonomously. We demonstrate the superiority of AEC over many other state-of-the-art approaches for the health monitoring and prognostics of machine bearings.
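The correlation, filtering and normalization steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `aec_rate`, the baseline size `n_initial`, the filter length `window`, the use of Pearson correlation and min-max normalization are all assumptions, and synthetic vectors stand in for auto-encoder features.

```python
import numpy as np

def aec_rate(features, n_initial=10, window=5):
    """Correlation-based health indicator from per-sample feature vectors.

    features: array of shape (n_samples, n_features), e.g. auto-encoder outputs.
    n_initial, window: hypothetical settings, not taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    reference = features[:n_initial].mean(axis=0)   # presumed-healthy baseline
    # Pearson correlation of every sample's feature vector with the baseline.
    corr = np.array([np.corrcoef(reference, f)[0, 1] for f in features])
    # Moving-average filter; mode="valid" avoids edge effects from zero padding.
    smoothed = np.convolve(corr, np.ones(window) / window, mode="valid")
    # Min-max normalization to [0, 1].
    span = smoothed.max() - smoothed.min()
    if span == 0:
        return np.zeros_like(smoothed)
    return (smoothed - smoothed.min()) / span

# Synthetic run-to-failure data: features drift away from the baseline over time.
rng = np.random.default_rng(0)
base = rng.normal(size=32)
drift = rng.normal(size=32)
t = np.linspace(0.0, 1.0, 200)[:, None]
data = base + (t ** 4) * 3.0 * drift + 0.05 * rng.normal(size=(200, 32))

rate = aec_rate(data)
```

In this toy setup the rate stays high while samples resemble the healthy baseline and falls as the simulated degradation grows.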

artificial intelligence, bearing, machine learning, (12 more...)

1703.06272

Country: North America > United States (0.29)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Consumer Health (1.00)
Transportation > Ground > Road (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Henelius, Andreas, Ukkonen, Antti, Puolamäki, Kai

Finding Statistically Significant Attribute Interactions

arXiv.org Machine Learning, Mar-16-2017

In many data exploration tasks it is meaningful to identify groups of attribute interactions that are specific to a variable of interest. For instance, in a dataset where the attributes are medical markers and the variable of interest (class variable) is binary indicating presence/absence of disease, we would like to know which medical markers interact with respect to the binary class label. These interactions are useful in several practical applications, for example, to gain insight into the structure of the data, in feature selection, and in data anonymisation. We present a novel method, based on statistical significance testing, that can be used to verify if the data set has been created by a given factorised class-conditional joint distribution, where the distribution is parametrised by a partition of its attributes. Furthermore, we provide a method, named astrid, for automatically finding a partition of attributes describing the distribution that has generated the data. State-of-the-art classifiers are utilised to capture the interactions present in the data by systematically breaking attribute interactions and observing the effect of this breaking on classifier performance. We empirically demonstrate the utility of the proposed method with examples using real and synthetic data.
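One plausible way to "break" interactions between attribute groups, sketched here as an assumption rather than as the astrid implementation, is to shuffle each group of columns jointly within each class. Dependencies inside a group survive, while dependencies across groups are destroyed. The function name and the `partition` input format are hypothetical.

```python
import numpy as np

def permute_within_class(X, y, partition, rng):
    """Break interactions between attribute groups, keeping those inside each group.

    X: (n_samples, n_features) array; y: class labels;
    partition: list of lists of column indices (hypothetical input format).
    """
    X_perm = X.copy()
    for label in np.unique(y):
        rows = np.flatnonzero(y == label)
        for group in partition:
            # One shared row shuffle per group keeps the group's columns aligned.
            shuffled = rng.permutation(rows)
            X_perm[np.ix_(rows, group)] = X[np.ix_(shuffled, group)]
    return X_perm

# Synthetic data: columns 0 and 1 are strongly dependent, column 2 is independent.
rng = np.random.default_rng(0)
n = 500
y = rng.integers(0, 2, size=n)
a = rng.normal(size=n)
X = np.column_stack([a, a + 0.1 * rng.normal(size=n), rng.normal(size=n)])

kept = permute_within_class(X, y, [[0, 1], [2]], rng)
broken = permute_within_class(X, y, [[0], [1], [2]], rng)
corr_kept = np.corrcoef(kept[:, 0], kept[:, 1])[0, 1]
corr_broken = np.corrcoef(broken[:, 0], broken[:, 1])[0, 1]
```

A classifier trained on `kept` versus `broken` data could then be compared on held-out accuracy; a drop for `broken` would suggest that the interaction between columns 0 and 1 matters for the class label.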

classifier, dataset, interaction, (13 more...)

1612.07597

Country:

Europe > Austria > Vienna (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)