AITopics

1808.09446

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
Europe > United Kingdom > England > Surrey > Guildford (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

#artificialintelligence Aug-27-2018, 08:16:06 GMT

The Bayesian Probability: Basis and Particular Utility in AI

Probability was initially, and for quite a long time, called the doctrine of chances. It was the mathematical description of games of chance (dice, cards and so on) and was used to describe and quantify randomness, or aleatory uncertainty. Statisticians use it to describe uncertainty. How can you use probability to describe learning? How can you use it to describe an accumulation of information over time, so that you can modify a probability based on additional knowledge? However, using Bayes' theorem is one thing and being Bayesian is something else.
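The question of how probability can describe an accumulation of information over time can be illustrated with a conjugate update. The following is a minimal sketch, not taken from the article: it assumes a coin-like process with an unknown success probability and a Beta prior, and the class name `BetaBelief` and the observation sequence are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about an unknown success probability."""

    alpha: float = 1.0  # prior pseudo-count of successes (uniform prior by default)
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, success: bool) -> "BetaBelief":
        # Bayes' theorem with a Beta prior and a Bernoulli likelihood:
        # the posterior is again a Beta, with one extra pseudo-count
        # on the side of the observed outcome.
        if success:
            return BetaBelief(self.alpha + 1, self.beta)
        return BetaBelief(self.alpha, self.beta + 1)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


# Hypothetical observations arriving one at a time.
belief = BetaBelief()
for outcome in [True, True, False, True]:
    belief = belief.update(outcome)
    print(f"observed {outcome!s:5} -> posterior mean {belief.mean:.3f}")
```

Each posterior serves as the prior for the next observation, which is one way to read "modifying a probability based on additional knowledge".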

artificial intelligence, bayesian inference, machine learning, (15 more...)

#artificialintelligence

Genre:

Research Report > Strength High (0.50)
Research Report > Experimental Study (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Siddhant, Aditya, Lipton, Zachary C.

Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

arXiv.org Machine Learning Aug-27-2018

Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning, practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the course of one AL run, an agent annotates its dataset, exhausting its labeling budget. Thus, given a new task, an active learner has no opportunity to compare models and acquisition functions. This paper provides a large-scale empirical study of deep active learning, addressing multiple tasks and, for each, multiple datasets, multiple models, and a full suite of acquisition functions. We find that across all settings, Bayesian active learning by disagreement, using uncertainty estimates provided either by Dropout or Bayes-by-Backprop, significantly improves over i.i.d. baselines and usually outperforms classic uncertainty sampling.
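To make the contrast between Bayesian active learning by disagreement (BALD) and classic uncertainty sampling concrete, here is a minimal sketch of the mutual-information form of the BALD score. The abstract does not state which estimator the authors use, so this form, the function names, and the toy probability vectors are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def max_entropy_score(mc_probs):
    """Classic uncertainty sampling: entropy of the averaged prediction."""
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean_p = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    return entropy(mean_p)


def bald_score(mc_probs):
    """Entropy of the mean prediction minus the mean per-pass entropy.

    mc_probs holds one class-probability vector per stochastic forward
    pass for a single example (e.g. one dropout mask or weight sample each).
    """
    expected_entropy = sum(entropy(p) for p in mc_probs) / len(mc_probs)
    return max_entropy_score(mc_probs) - expected_entropy


# Hypothetical outputs of two stochastic passes on three examples.
confident = [[0.9, 0.1], [0.9, 0.1]]   # passes agree and are confident
ambiguous = [[0.5, 0.5], [0.5, 0.5]]   # passes agree that the input is ambiguous
disagree = [[0.9, 0.1], [0.1, 0.9]]    # passes contradict each other

for name, probs in [("confident", confident), ("ambiguous", ambiguous), ("disagree", disagree)]:
    print(name, round(max_entropy_score(probs), 3), round(bald_score(probs), 3))
```

In this sketch the ambiguous example gets the highest possible uncertainty-sampling score but a BALD score of zero, because the passes agree; only the example on which the passes disagree gets a positive BALD score.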

artificial intelligence, machine learning, natural language, (17 more...)

1808.05697

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Kamada, Shin, Ichimura, Takumi, Harada, Toshihide

Adaptive Structural Learning of Deep Belief Network for Medical Examination Data and Its Knowledge Extraction by using C4.5

arXiv.org Artificial Intelligence Aug-27-2018

Deep Learning has a hierarchical network architecture to represent the complicated features of input patterns. The adaptive structural learning method of Deep Belief Network (DBN) has been developed. The method can discover an optimal number of hidden neurons for given input data in a Restricted Boltzmann Machine (RBM) by a neuron generation-annihilation algorithm, and generate a new hidden layer in the DBN by an extension of the algorithm. In this paper, the proposed adaptive structural learning of DBN was applied to comprehensive medical examination data for cancer prediction. The prediction system shows higher classification accuracy (99.8% for training and 95.5% for test) than the traditional DBN. Moreover, explicit knowledge about the relation between input and output patterns was extracted from the trained DBN by C4.5. Some characteristics, extracted in the form of IF-THEN rules for finding cancer at an early stage, are reported in this paper.
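C4.5 builds the decision tree behind such IF-THEN rules by choosing, at each node, the attribute with the highest gain ratio (information gain divided by split information). The sketch below shows that criterion only. How the authors feed DBN outputs into C4.5 is not described in the abstract, so the binarized "unit" values and labels here are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting labels by a categorical attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy remaining after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many small branches.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical data: two binarized attributes and a diagnosis label.
labels = ["cancer", "cancer", "healthy", "healthy"]
unit_a = ["on", "on", "off", "off"]   # separates the classes perfectly
unit_b = ["on", "off", "on", "off"]   # carries no information about the label

print("unit_a:", gain_ratio(unit_a, labels))
print("unit_b:", gain_ratio(unit_b, labels))
```

A rule such as "IF unit_a is on THEN cancer" would correspond to a root split on the attribute with the higher gain ratio.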

artificial intelligence, machine learning, neuron, (13 more...)

1808.08777

Country:

Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.06)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Adler, Jonas, Lunz, Sebastian, Verdier, Olivier, Schönlieb, Carola-Bibiane, Öktem, Ozan

Task adapted reconstruction for inverse problems

arXiv.org Artificial Intelligence Aug-27-2018

The paper considers the problem of performing a task defined on a model parameter that is only observed indirectly through noisy data in an ill-posed inverse problem. A key aspect is to formalize the steps of reconstruction and task as appropriate estimators (non-randomized decision rules) in statistical estimation problems. The implementation makes use of (deep) neural networks to provide a differentiable parametrization of the family of estimators for both steps. These networks are combined and jointly trained against suitable supervised training data in order to minimize a joint differentiable loss function, resulting in an end-to-end task adapted reconstruction method. The suggested framework is generic, yet adaptable, with a plug-and-play structure for adjusting both the inverse problem and the task at hand. More precisely, the data model (forward operator and statistical model of the noise) associated with the inverse problem is exchangeable, e.g., by using a neural network architecture given by a learned iterative method. Furthermore, any task that is encodable as a trainable neural network can be used. The approach is demonstrated on joint tomographic image reconstruction and classification, and on joint tomographic image reconstruction and segmentation.

artificial intelligence, machine learning, reconstruction, (19 more...)

doi: 10.1088/1361-6420/ac28ec

1809.00948

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
(6 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence Aug-26-2018

Water Disaggregation via Shape Features based Bayesian Discriminative Sparse Coding

Wang, Bingsheng, Zhang, Xuchao, Lu, Chang-Tien, Chen, Feng

As freshwater shortages grow more severe by the day, it is critical to take effective measures for water conservation. According to previous studies, information on device-level consumption could lead to significant freshwater conservation. Existing water disaggregation methods focus on learning the signatures of appliances; however, they lack a mechanism to accurately discriminate between the consumption of appliances running in parallel. In this paper, we propose a Bayesian Discriminative Sparse Coding model using a Laplace Prior (BDSC-LP) to substantially enhance disaggregation performance. To derive discriminative basis functions, shape features are presented to describe the low-sampling-rate water consumption patterns. A Gibbs sampling based inference method is designed to extend the discriminative capability of the disaggregation dictionaries. Extensive experiments were performed to validate the effectiveness of the proposed model using both real-world and synthetic datasets.

artificial intelligence, data mining, machine learning, (18 more...)

1808.08951

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Water & Waste Management > Water Management > Water Supplies & Services (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Foulds, James, Pan, Shimei

An Intersectional Definition of Fairness

arXiv.org Machine Learning Aug-25-2018

With the rising influence of machine learning algorithms on many important aspects of our daily lives, there are growing concerns that biases inherent in data can lead the behavior of these algorithms to discriminate against certain populations [1, 2, 4, 6, 8, 28, 29, 15]. In recent years, substantial research effort has been devoted to the development of mathematical definitions of bias, or its opposite, fairness, in algorithms and in data [15, 18, 26, 23, 19, 32]. In this work, we focus on the fairness scenario where there are multiple protected attributes for which we aim to ensure fairness, and which may potentially overlap with each other, such as gender, race, and sexual orientation. Our guiding principle is intersectionality, the core theoretical framework underlying the third-wave feminist movement [13]. The principle of intersectionality states that racism, sexism, and other social systems which harm marginalized groups are interlocking in their effects, such that the lived experience of, e.g., black women, is very different from that of, e.g., white women. Intersectionality was defined by Kimberlé Crenshaw in the 1980s [13] and popularized in the 1990s, e.g. by Patricia Hill Collins [10], although the ideas are much older [11, 35]. In the context of machine learning and fairness, intersectionality was recently considered by [7], who studied the impact of the intersection of gender and skin color on computer vision performance, and by [23, 19], who aimed to protect certain subgroups in order to prevent "fairness gerrymandering."

artificial intelligence, bayesian inference, machine learning, (19 more...)

1807.08362

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Ohio > Summit County > Akron (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre:

Overview (0.66)
Research Report (0.64)

Industry:

Government (0.55)
Law > Civil Rights & Constitutional Law (0.48)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Scipioni, Michele, Pedemonte, Stefano, Santarelli, Maria Filomena, Landini, Luigi

Probabilistic Graphical Modeling approach to dynamic PET direct parametric map estimation and image reconstruction

arXiv.org Machine Learning Aug-24-2018

In the context of dynamic emission tomography, the conventional processing pipeline consists of independent image reconstruction of single time frames, followed by the application of a suitable kinetic model to time activity curves (TACs) at the voxel or region-of-interest level. The relatively new field of 4D PET direct reconstruction, by contrast, seeks to move beyond this scheme and incorporate information from multiple time frames within the reconstruction task. Existing 4D direct models are based on a deterministic description of voxels' TACs, captured by the chosen kinetic model, and treat the photon counting process as the only source of uncertainty. In this work, we introduce a new probabilistic modeling strategy based on the key assumption that the activity time course would be subject to uncertainty even if the parameters of the underlying dynamic process were known. This leads to a hierarchical Bayesian model, which we formulate using the formalism of Probabilistic Graphical Modeling (PGM). The inference of the joint probability density function arising from PGM is addressed using a new gradient-based iterative algorithm, which presents several advantages compared to existing direct methods: it accommodates an arbitrary choice of linear or nonlinear kinetic model; it enables the inclusion of arbitrary (sub)differentiable priors for parametric maps; it is simpler to implement and suitable for integration in computing frameworks for machine learning. Computer simulations and an application to a real patient scan showed how the proposed approach allows us to weight the importance of the kinetic model, providing a bridge between indirect and deterministic direct methods.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1808.08286

Country:

Europe > Italy > Tuscany > Pisa Province > Pisa (0.05)
Europe > Germany (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

arXiv.org Machine Learning Aug-24-2018

Unknown Examples & Machine Learning Model Generalization

Chung, Yeounoh, Haas, Peter J., Upfal, Eli, Kraska, Tim

Over the past decades, researchers and ML practitioners have come up with better and better ways to build, understand and improve the quality of ML models, but mostly under the key assumption that the training data is distributed identically to the testing data. In many real-world applications, however, some potential training examples are unknown to the modeler, due to sample selection bias or, more generally, covariate shift, i.e., a distribution shift between the training and deployment stages. The resulting discrepancy between training and testing distributions leads to poor generalization performance of the ML model and hence biased predictions. We provide novel algorithms that estimate the number and properties of these unknown training examples, the unknown unknowns. This information can then be used to correct the training set, prior to seeing any test data. The key idea is to combine species-estimation techniques with data-driven methods for estimating the feature values for the unknown unknowns. Experiments on a variety of ML models and datasets indicate that taking the unknown examples into account can yield a more robust ML model that generalizes better.
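As an illustration of what a species-estimation technique computes, the sketch below implements the Chao1 estimator, a standard lower-bound estimate of the total number of distinct items from how many were seen exactly once and exactly twice. The abstract does not name the estimator the authors use, so Chao1 is a stand-in chosen for simplicity, and the sample of item identifiers is hypothetical.

```python
from collections import Counter


def chao1(observations):
    """Chao1 estimate of the total number of distinct items, seen or not."""
    counts = Counter(observations)
    s_obs = len(counts)                               # distinct items observed
    f1 = sum(1 for c in counts.values() if c == 1)    # items seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)    # items seen exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used when no item was seen exactly twice.
    return s_obs + f1 * (f1 - 1) / 2


# Hypothetical sample: many items seen only once suggests many still unseen.
sample = ["a", "a", "a", "b", "b", "c", "d", "e"]
estimated_total = chao1(sample)
estimated_unseen = estimated_total - len(set(sample))
print(estimated_total, estimated_unseen)
```

The intuition is that a sample dominated by singletons has probably missed many items, while a sample in which everything has been seen several times has probably missed few.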

artificial intelligence, machine learning, training data, (18 more...)

1808.08294

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Uehara, Masatoshi, Matsuda, Takeru, Komaki, Fumiyasu

Analysis of Noise Contrastive Estimation from the Perspective of Asymptotic Variance

arXiv.org Machine Learning Aug-23-2018

There are many models, often called unnormalized models, whose normalizing constants cannot be calculated in closed form. Maximum likelihood estimation is not directly applicable to unnormalized models. Score matching, contrastive divergence, pseudo-likelihood, Monte Carlo maximum likelihood, and noise contrastive estimation (NCE) are popular methods for estimating the parameters of such models. In this paper, we focus on NCE. The estimator derived from NCE is consistent and asymptotically normal because it is an M-estimator. NCE characteristically uses an auxiliary distribution to calculate the normalizing constant, in the same spirit as importance sampling. In addition, there are several candidate objective functions for NCE. We focus on how to reduce asymptotic variance. First, we propose a method for reducing asymptotic variance by estimating the parameters of the auxiliary distribution. Then, we determine the form of the objective function for which the asymptotic variance takes the smallest value in the original estimator class and the proposed estimator classes. We further analyze the robustness of the estimator.
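The role of the auxiliary distribution can be seen in the standard logistic-regression form of the NCE objective, sketched below for a one-dimensional Gaussian whose log normalizing constant is treated as a free parameter `c`. This is the original NCE objective only, not the variance-reduced variants the paper proposes, and the model, the noise scale, the sample sizes, and all function names are assumptions made for illustration.

```python
import math
import random


def log_model(x, mu, c):
    """Unnormalized Gaussian log-density; c stands in for -log Z."""
    return -0.5 * (x - mu) ** 2 + c


def log_noise(x, scale):
    """Log-density of the N(0, scale^2) auxiliary (noise) distribution."""
    return -0.5 * (x / scale) ** 2 - math.log(scale * math.sqrt(2 * math.pi))


def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))


def nce_objective(data, noise, mu, c, scale):
    """Average log-likelihood of classifying data against noise samples."""
    nu = len(noise) / len(data)  # noise-to-data sample ratio

    def logit(u):
        # Log-odds that u came from the model rather than the noise.
        return log_model(u, mu, c) - log_noise(u, scale) - math.log(nu)

    data_term = sum(log_sigmoid(logit(x)) for x in data) / len(data)
    noise_term = sum(log_sigmoid(-logit(y)) for y in noise) / len(noise)
    return data_term + nu * noise_term


rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(2000)]    # true mean 1, variance 1
noise = [rng.gauss(0.0, 2.0) for _ in range(4000)]   # auxiliary samples

true_c = -math.log(math.sqrt(2 * math.pi))           # true value of -log Z
at_truth = nce_objective(data, noise, mu=1.0, c=true_c, scale=2.0)
at_wrong = nce_objective(data, noise, mu=3.0, c=true_c, scale=2.0)
print(at_truth, at_wrong)
```

Maximizing this objective over `mu` and `c` jointly estimates the model parameter and the normalizing constant; the choice of noise distribution and of objective affects the variance of the resulting estimator, which is the subject of the paper.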

artificial intelligence, asymptotic variance, machine learning, (18 more...)

1808.07983

Country:

North America > United States > New York (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.74)