AITopics

1812.071

Country:

North America > United States (1.00)
Asia (1.00)
Europe (0.67)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media (1.00)
Leisure & Entertainment > Sports (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

arXiv.org Artificial Intelligence, Dec-8-2018

Sampling-based Bayesian Inference with gradient uncertainty

Park, Chanwoo, Kim, Jae Myung, Ha, Seok Hyeon, Lee, Jungwoo

Deep neural networks (NNs) have achieved impressive performance, often exceeding human performance on many computer vision tasks. However, one of the most challenging remaining issues is that NNs are overconfident in their predictions, which can be very harmful in safety-critical applications. In this paper, we show that predictive uncertainty can be efficiently estimated when we incorporate the concept of gradient uncertainty into posterior sampling. The proposed method is tested on two different datasets, MNIST for in-distribution confusing examples and notMNIST for out-of-distribution data. We show that our method is able to efficiently represent predictive uncertainty on both datasets.
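
As a rough illustration of what "predictive uncertainty from posterior sampling" can mean in practice, the sketch below averages class probabilities over several sampled models and reports the entropy of the averaged distribution. This is a generic summary statistic, not the gradient-uncertainty method of the paper; the function name and the toy probability vectors are assumptions made for illustration.

```python
import math


def predictive_entropy(sampled_probs):
    """Entropy (in nats) of the class distribution averaged over posterior samples.

    sampled_probs: list of probability vectors, one per sampled model.
    """
    n_samples = len(sampled_probs)
    n_classes = len(sampled_probs[0])
    # Average the predicted probability of each class across samples.
    mean = [sum(p[c] for p in sampled_probs) / n_samples for c in range(n_classes)]
    # Skip zero entries, since 0 * log(0) is taken to be 0.
    return -sum(m * math.log(m) for m in mean if m > 0.0)


# Hypothetical predictions from three posterior samples for one input.
confident = [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01], [0.99, 0.005, 0.005]]
disagreeing = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]

print(predictive_entropy(confident))
print(predictive_entropy(disagreeing))
```

When the sampled models disagree, the averaged distribution flattens and the entropy rises, which is the kind of signal one would hope to see on out-of-distribution inputs.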

artificial intelligence, bayesian inference, machine learning, (16 more...)

1812.03285

Country: North America > Canada (0.14)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
(2 more...)

Ratner, Alexander, Hancock, Braden, Dunnmon, Jared, Sala, Frederic, Pandey, Shreyash, Ré, Christopher

Training Complex Models with Multi-Task Weak Supervision

arXiv.org Machine Learning, Dec-7-2018

As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlated labels, and may label different tasks or apply at different levels of granularity. We propose a framework for integrating and modeling such weak supervision sources by viewing them as labeling different related sub-tasks of a problem, which we refer to as the multi-task weak supervision setting. We show that by solving a matrix completion-style problem, we can recover the accuracies of these multi-task sources given their dependency structure, but without any labeled data, leading to higher-quality supervision for training an end model. Theoretically, we show that the generalization error of models trained with this approach improves with the number of unlabeled data points, and characterize the scaling with respect to the task and dependency structures. On three fine-grained classification problems, we show that our approach leads to average gains of 20.2 points in accuracy over a traditional supervised approach, 6.8 points over a majority vote baseline, and 4.1 points over a previously proposed weak supervision method that models tasks separately.
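
For context, the majority vote baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the `ABSTAIN` marker, the integer labels, and the tie-breaking rule are all assumptions. The approach described in the abstract goes further by estimating each source's accuracy rather than weighting all sources equally.

```python
from collections import Counter

ABSTAIN = None  # hypothetical marker for a source that gives no label


def majority_vote(votes):
    """Return the most frequent non-abstaining label, or ABSTAIN if there is none.

    Ties are broken by choosing the smallest label so the result is deterministic.
    """
    counts = Counter(v for v in votes if v is not ABSTAIN)
    if not counts:
        return ABSTAIN
    best_count = max(counts.values())
    return min(label for label, c in counts.items() if c == best_count)


# Each row holds the labels that three weak sources assign to one data point.
weak_labels = [
    [1, 1, 0],
    [0, ABSTAIN, 0],
    [ABSTAIN, ABSTAIN, ABSTAIN],
]
aggregated = [majority_vote(row) for row in weak_labels]
print(aggregated)
```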

accuracy, artificial intelligence, machine learning, (18 more...)

1810.0284

Country:

North America > United States (0.46)
Asia (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.93)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Kłopotek, Mieczysław A., Wierzchoń, Sławomir T.

On Marginally Correct Approximations of Dempster-Shafer Belief Functions from Data

arXiv.org Artificial Intelligence, Dec-7-2018

The Mathematical Theory of Evidence (MTE), a foundation for reasoning under partial ignorance, is criticized for leaving frequencies outside (or aside from) its framework. The seriousness of this accusation is obvious: no experiment may be run to compare the performance of MTE-based models of real-world processes against real-world data. In this paper we consider this problem from the point of view of conditioning in the MTE. We describe the class of belief functions for which marginal consistency with observed frequencies may be achieved and conditional belief functions are proper belief functions, and deal with implications for (marginal) approximation of general belief functions by this class of belief functions and for inference models in MTE.
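
To make the notion of a belief function concrete, the sketch below computes the standard Dempster-Shafer belief and plausibility of an event from a mass assignment over subsets of a frame. It illustrates only the textbook definitions, not the marginally correct approximation studied in the paper; the frame and the mass values are made up for the example.

```python
def belief(mass, event):
    """Bel(A): total mass of the non-empty focal sets contained in A."""
    return sum(m for focal, m in mass.items() if focal and focal <= event)


def plausibility(mass, event):
    """Pl(A): total mass of the focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & event)


# Hypothetical mass assignment over the frame {a, b, c}; masses sum to 1.
frame = frozenset({"a", "b", "c"})
mass = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.3,
    frame: 0.2,  # mass left on the whole frame expresses partial ignorance
}

event = frozenset({"b"})
print(belief(mass, event), plausibility(mass, event))
```

The gap between `belief` and `plausibility` for an event is what distinguishes a belief function from an ordinary probability distribution, for which the two coincide.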

artificial intelligence, belief function, machine learning, (16 more...)

1812.02942

Country: North America > United States > California > San Mateo County (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Malioutov, Dmitry, Meel, Kuldeep S.

MLIC: A MaxSAT-Based framework for learning interpretable classification rules

arXiv.org Artificial Intelligence, Dec-5-2018

The wide adoption of machine learning approaches in industry, government, medicine and science has renewed the interest in interpretable machine learning: many decisions are too important to be delegated to black-box techniques such as deep neural networks or kernel SVMs. Historically, problems of learning interpretable classifiers, including classification rules or decision trees, have been approached by greedy heuristic methods as essentially all the exact optimization formulations are NP-hard. Our primary contribution is a MaxSAT-based framework, called MLIC, which allows principled search for interpretable classification rules expressible in propositional logic. Our approach benefits from the revolutionary advances in the constraint satisfaction community to solve large-scale instances of such problems. In experimental evaluations over a collection of benchmarks arising from practical scenarios, we demonstrate its effectiveness: we show that the formulation can solve large classification problems with tens or hundreds of thousands of examples and thousands of features, and can provide a tunable balance of accuracy vs. interpretability. Furthermore, we show that in many problems interpretability can be obtained at only a minor cost in accuracy. The primary objective of the paper is to show that recent advances in the MaxSAT literature make it realistic to find optimal (or very high quality near-optimal) solutions to large-scale classification problems. The key goal of the paper is to excite researchers in both interpretable classification and in the CP community to take it further and propose richer formulations, and to develop bespoke solvers attuned to the problem of interpretable ML.

accuracy, classifier, mlic, (16 more...)

doi: 10.1007/978-3-319-98334-9_21

1812.01843

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
(3 more...)

Pernkopf, Franz, Roth, Wolfgang, Zoehrer, Matthias, Pfeifenberger, Lukas, Schindler, Guenther, Froening, Holger, Tschiatschek, Sebastian, Peharz, Robert, Mattina, Matthew, Ghahramani, Zoubin

Efficient and Robust Machine Learning for Real-World Systems

While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation and the vision of the Internet-of-Things fuel the interest in resource-efficient approaches. These approaches require a carefully chosen trade-off between performance and resource consumption in terms of computation and energy. On top of this, it is crucial to treat uncertainty in a consistent manner in all but the simplest applications of machine learning systems. In particular, a desideratum for any real-world system is to be robust in the presence of outliers and corrupted data, as well as being 'aware' of its limits, i.e., the system should maintain and provide an uncertainty estimate over its own predictions. These complex demands are among the major challenges in current machine learning research and key to ensuring a smooth transition of machine learning technology into everyday applications. In this article, we provide an overview of the current state of the art of machine learning techniques facilitating these real-world requirements. First we provide a comprehensive review of resource-efficiency in deep neural networks with a focus on techniques for model size reduction, compression and reduced precision. These techniques can be applied during training or as post-processing and are widely used to reduce both computational complexity and memory footprint. As most (practical) neural networks are limited in their ways to treat uncertainty, we contrast them with probabilistic graphical models, which readily serve these desiderata by means of probabilistic inference. In that way, we provide an extensive overview of the current state of the art of robust and efficient machine learning for real-world systems.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1812.0224

Country:

North America > United States (0.67)
Europe > Austria > Styria (0.14)

Genre: Overview (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Maïnassara, Yacouba Boubacar, Kadmiri, Othman, Saussereau, Bruno

Estimation of multivariate asymmetric power GARCH models

It is now widely accepted that volatility models have to incorporate the so-called leverage effect in order to model the dynamics of daily financial returns. We suggest a new class of multivariate power transformed asymmetric models. It includes several functional forms of multivariate GARCH models which are of great interest in financial modeling and time series literature. We provide an explicit necessary and sufficient condition to establish the strict stationarity of the model. We derive the asymptotic properties of the quasi-maximum likelihood estimator of the parameters. These properties are established both when the power of the transformation is known and when it is unknown. The asymptotic results are illustrated by Monte Carlo experiments. An application to real financial data is also proposed.

Introduction

The ARCH (AutoRegressive Conditional Heteroscedastic) model was introduced by Engle (1982) in a univariate context. Since this work, many extensions have been proposed. A first one was suggested four years later, namely the GARCH (Generalised ARCH) model by Bollerslev (1986). This model aimed to improve modeling by considering the past conditional variance (volatility). Both models are based on past conditional heteroscedasticity, which depends on the past values of the return. A consequence is that the volatility has the same magnitude for a negative or a positive return. Financial series have their own characteristics, which are usually difficult to reproduce artificially. An important characteristic is the leverage effect, which treats negative returns differently from positive returns. This is in contradiction with the construction of the GARCH model, because it cannot account for the asymmetry. The TGARCH (Threshold GARCH) model introduced by Rabemananjara and Zakoïan (1993) improves modeling because it accounts for the asymmetry: the volatility is determined by the past negative observations and the past positive observations with different weights. Various asymmetric GARCH processes have been introduced in the econometric literature, for instance the EGARCH (Exponential GARCH) and the log-GARCH models (see Francq et al. (2013), who studied the asymptotic properties of an EGARCH(1,1) model).
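
The contrast between a symmetric GARCH update and a threshold update can be sketched with one-step univariate recursions. The GARCH(1,1) form is standard; the threshold form shown is one common way of writing such a recursion on the conditional standard deviation, and the multivariate power-transformed specification in the paper may differ. All parameter values are illustrative assumptions.

```python
def garch_next_variance(eps_prev, var_prev, omega, alpha, beta):
    """GARCH(1,1): sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2.

    The squared return makes the update blind to the sign of eps_prev.
    """
    return omega + alpha * eps_prev ** 2 + beta * var_prev


def tgarch_next_sigma(eps_prev, sigma_prev, omega, alpha_pos, alpha_neg, beta):
    """Threshold update on the conditional standard deviation:

    sigma_t = omega + alpha_pos * eps^+ - alpha_neg * eps^- + beta * sigma_{t-1},
    with eps^+ = max(eps, 0) and eps^- = min(eps, 0).
    """
    eps_plus = max(eps_prev, 0.0)
    eps_minus = min(eps_prev, 0.0)
    return omega + alpha_pos * eps_plus - alpha_neg * eps_minus + beta * sigma_prev


# Same-sized shocks of opposite sign (illustrative values).
for shock in (1.0, -1.0):
    v = garch_next_variance(shock, 1.0, omega=0.1, alpha=0.1, beta=0.8)
    s = tgarch_next_sigma(shock, 1.0, omega=0.1, alpha_pos=0.05, alpha_neg=0.15, beta=0.8)
    print(shock, v, s)
```

With `alpha_neg` larger than `alpha_pos`, a negative shock raises the next-period volatility more than a positive shock of the same size, which is the leverage effect the symmetric update cannot reproduce.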

artificial intelligence, machine learning, td 1, (19 more...)

1812.02061

Country: Europe > France (0.27)

Genre: Research Report (0.63)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Spiliopoulos, Konstantinos

Information geometry for approximate Bayesian computation

The goal of this paper is to explore the basic Approximate Bayesian Computation (ABC) algorithm via the lens of information theory. ABC is a widely used algorithm in cases where the likelihood of the data is hard to work with or intractable, but one can simulate from it. We use relative entropy ideas to analyze the behavior of the algorithm as a function of the thresholding parameter and of the size of the data. Relative entropy here is data driven as it depends on the values of the observed statistics. We allow a different thresholding parameter for each direction (i.e. for each observed statistic) and compute the weighted effect on each direction. The latter allows one to find important directions via sensitivity analysis, leading to potentially larger acceptance regions, which in turn brings the computational cost of the algorithm down for the same level of accuracy. In addition, we also investigate the bias of the estimators for generic observables as a function of both the thresholding parameters and the size of the data. Our analysis provides error bounds on performance for positive tolerances and finite sample sizes. Simulation studies complement and illustrate the theoretical results.
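
The basic ABC rejection scheme with a separate threshold per observed statistic can be sketched as below. This is a toy illustration under stated assumptions, not the paper's analysis: the Gaussian simulator, the uniform prior, the choice of summary statistics, and the tolerance values are all hypothetical.

```python
import random
import statistics


def summarize(data):
    """Summary statistics used for comparison: (sample mean, sample std)."""
    return (statistics.fmean(data), statistics.stdev(data))


def simulate(theta, n, rng):
    """Toy simulator (an assumption): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed_stats, thresholds, n_obs, n_draws, rng):
    """Keep prior draws whose simulated statistics all fall within the
    per-statistic thresholds of the observed statistics."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # prior draw, assumed Uniform(-5, 5)
        sim_stats = summarize(simulate(theta, n_obs, rng))
        if all(abs(s - o) <= eps
               for s, o, eps in zip(sim_stats, observed_stats, thresholds)):
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)   # synthetic "observed" data
observed_stats = summarize(observed)
thresholds = (0.2, 0.5)             # one tolerance per statistic (direction)
accepted = abc_rejection(observed_stats, thresholds, 50, 5000, rng)
if accepted:
    print(len(accepted), statistics.fmean(accepted))
```

Loosening the threshold on a statistic that carries little information about the parameter enlarges the acceptance region, so fewer simulations are wasted for a comparable approximation quality.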

acceptance region, artificial intelligence, machine learning, (16 more...)

1812.02127

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)

Mantovani, Rafael Gomes, Horváth, Tomáš, Cerri, Ricardo, Junior, Sylvio Barbon, Vanschoren, Joaquin, de Carvalho, André Carlos Ponce de Leon Ferreira

An empirical study on hyperparameter tuning of decision trees

Machine learning algorithms often contain many hyperparameters whose values affect the predictive performance of the induced models in intricate ways. Due to the high number of possibilities for these hyperparameter configurations, and their complex interactions, it is common to use optimization techniques to find settings that lead to high predictive accuracy. However, we lack insight into how to efficiently explore this vast space of configurations: which are the best optimization techniques, how should we use them, and how significant is their effect on predictive or runtime performance? This paper provides a comprehensive approach for investigating the effects of hyperparameter tuning on three Decision Tree induction algorithms, CART, C4.5 and CTree. These algorithms were selected because they are based on similar principles, have shown high predictive performance in several previous works and induce interpretable classification models. Additionally, they contain many interacting hyperparameters to be adjusted. Experiments were carried out with different tuning strategies to induce models and evaluate the relevance of hyperparameters using 94 classification datasets from OpenML. Experimental results indicate that hyperparameter tuning provides statistically significant improvements for C4.5 and CTree in only one-third of the datasets, and in most of the datasets for CART. Different tree algorithms may present different tuning scenarios, but in general, the tuning techniques required relatively few iterations to find accurate solutions. Furthermore, the best technique for all the algorithms was Irace. Finally, we find that tuning a specific small subset of hyperparameters contributes most of the achievable optimal predictive performance.

artificial intelligence, machine learning, optimization problem, (16 more...)

1812.02207

Country:

Europe (1.00)
North America > United States > California (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

A Technical Survey on Statistical Modelling and Design Methods for Crowdsourcing Quality Control

Jin, Yuan, Carman, Mark, Zhu, Ye, Xiang, Yong

Online crowdsourcing provides a scalable and inexpensive means to collect knowledge (e.g. labels) about various types of data items (e.g. text, audio, video). However, it is also known to result in large variance in the quality of recorded responses, which often cannot be directly used for training machine learning systems. To resolve this issue, a lot of work has been conducted to control the response quality such that low-quality responses cannot adversely affect the performance of the machine learning systems. Such work is referred to as quality control for crowdsourcing. Past quality control research can be divided into two major branches: quality control mechanism design and statistical models. The first branch focuses on designing measures, thresholds, interfaces and workflows for payment, gamification, question assignment and other mechanisms that influence workers' behaviour. The second branch focuses on developing statistical models to perform effective aggregation of responses to infer correct responses. The two branches are connected as statistical models (i) provide parameter estimates to support the measure and threshold calculation, and (ii) encode modelling assumptions used to derive (theoretical) performance guarantees for the mechanisms. There are surveys regarding each branch but they lack technical details about the other branch. Our survey is the first to bridge the two branches by providing technical details on how they work together under frameworks that systematically unify crowdsourcing aspects modelled by both of them to determine the response quality. We are also the first to provide taxonomies of quality control papers based on the proposed frameworks. Finally, we specify the current limitations and the corresponding future directions for quality control research.

artificial intelligence, machine learning, proceedings, (19 more...)

1812.02736

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
(2 more...)