AITopics

2010.02756

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(23 more...)

Genre: Research Report (0.64)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Tóthová, Katarína, Parisot, Sarah, Lee, Matthew, Puyol-Antón, Esther, King, Andrew, Pollefeys, Marc, Konukoglu, Ender

Probabilistic 3D surface reconstruction from sparse MRI information

arXiv.org Artificial Intelligence, Oct-5-2020

Surface reconstruction from magnetic resonance (MR) imaging data is indispensable in medical image analysis and clinical research. A reliable and effective reconstruction tool should quickly predict accurate, well-localised, high-resolution models, evaluate prediction uncertainty, and work with as little input data as possible. Current deep learning state-of-the-art (SOTA) 3D reconstruction methods, however, often only produce shapes of limited variability placed in a canonical position, or lack uncertainty evaluation. In this paper, we present a novel probabilistic deep learning approach for concurrent 3D surface reconstruction from sparse 2D MR image data and aleatoric uncertainty prediction. Our method is capable of reconstructing large surface meshes from three quasi-orthogonal MR imaging slices from limited training sets whilst modelling the location of each mesh vertex through a Gaussian distribution. Prior shape information is encoded using a built-in linear principal component analysis (PCA) model. Extensive experiments on cardiac MR data show that our probabilistic approach successfully assesses prediction uncertainty while at the same time qualitatively and quantitatively outperforming SOTA methods in shape prediction. Compared to SOTA, we are capable of properly localising and orientating the prediction via the use of a spatially aware neural network.

prediction, reconstruction, surface reconstruction, (15 more...)

2010.02041

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.05)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.05)
(8 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Campbell, Ryan, Finlay, Chris, Oberman, Adam M

Adversarial Boot Camp: label free certified robustness in one epoch

arXiv.org Machine Learning, Oct-5-2020

Machine learning models are vulnerable to adversarial attacks. One approach to addressing this vulnerability is certification, which focuses on models that are guaranteed to be robust for a given perturbation size. A drawback of recent certified models is that they are stochastic: they require multiple computationally expensive model evaluations with random noise added to a given input. In our work, we present a deterministic certification approach which results in a certifiably robust model. This approach is based on an equivalence between training with a particular regularized loss, and the expected values of Gaussian averages. We achieve certified models on ImageNet-1k by retraining a model with this loss for one epoch without the use of label information.
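The drawback of stochastic certified models mentioned in the abstract can be illustrated with a small sketch. It shows the baseline the paper aims to avoid (a classifier averaged over Gaussian noise by repeated evaluation), not the deterministic method the authors propose. The names `base_classifier` and `smoothed_predict`, the toy decision rule, and all parameter values are assumptions made for illustration.

```python
import random
from collections import Counter

def base_classifier(x):
    """Hypothetical stand-in model: class 1 if the first feature is positive."""
    return 1 if x[0] > 0 else 0

def smoothed_predict(x, sigma, n_samples, seed=0):
    """Majority vote of the base classifier over Gaussian-perturbed copies of x."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1  # one model evaluation per noise sample
    return votes.most_common(1)[0][0], votes

# Illustrative input and noise level (assumed values).
label, votes = smoothed_predict([2.0, -1.0], sigma=0.5, n_samples=200)
```

Each prediction here costs `n_samples` model evaluations, which is the expense a deterministic certified model would remove.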

artificial intelligence, machine learning, robustness, (18 more...)

2010.02508

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(7 more...)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Zhang, Ruixiang, Koyama, Masanori, Ishiguro, Katsuhiko

Learning Structured Latent Factors from Dependent Data: A Generative Model Framework from Information-Theoretic Perspective

arXiv.org Machine Learning, Oct-2-2020

Learning controllable and generalizable representations of multivariate data with desired structural properties remains a fundamental problem in machine learning. In this paper, we present a novel framework for learning generative models with various underlying structures in the latent space. We represent the inductive bias in the form of mask variables to model the dependency structure in the graphical model and extend the theory of multivariate information bottleneck to enforce it. Our model provides a principled approach to learning a set of semantically meaningful latent factors that reflect various types of desired structures, such as capturing correlation or encoding invariance, while also offering the flexibility to automatically estimate the dependency structure from data. We show that our framework unifies many existing generative models and can be applied to a variety of tasks including multi-modal data modeling, algorithmic fairness, and invariant risk minimization.

learning structured latent factor, machine learning, natural language, (14 more...)

2007.10623

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > Austria > Vienna (0.14)
(17 more...)

Genre: Research Report (0.41)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.70)

Journal of Artificial Intelligence Research, Sep-27-2020

Improved High Dimensional Discrete Bayesian Network Inference using Triplet Region Construction

Lin, Peng (Capital University of Economics and Business) | Neil, Martin | Fenton, Norman

Performing efficient inference on high dimensional discrete Bayesian Networks (BNs) is challenging. When using exact inference methods the space complexity can grow exponentially with the tree-width, thus making computation intractable. This paper presents a general purpose approximate inference algorithm, based on a new region belief approximation method, called Triplet Region Construction (TRC). TRC reduces the cluster space complexity for factorized models from worst-case exponential to polynomial by performing graph factorization and producing clusters of limited size. Unlike previous generations of region-based algorithms, TRC is guaranteed to converge and effectively addresses the region choice problem that bedevils other region-based algorithms used for BN inference. Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms.

artificial intelligence, interaction triplet, machine learning, (17 more...)

doi: 10.1613/jair.1.12198

AI Access Foundation

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(15 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Dubois, Yann, Kiela, Douwe, Schwab, David J., Vedantam, Ramakrishna

Learning Optimal Representations with the Decodable Information Bottleneck

arXiv.org Machine Learning, Sep-27-2020

We address the question of characterizing and finding optimal representations for supervised learning. Traditionally, this question has been tackled using the Information Bottleneck, which compresses the inputs while retaining information about the targets, in a decoder-agnostic fashion. In machine learning, however, our goal is not compression but rather generalization, which is intimately linked to the predictive family or decoder of interest (e.g. linear classifier). We propose the Decodable Information Bottleneck (DIB) that considers information retention and compression from the perspective of the desired predictive family. As a result, DIB gives rise to representations that are optimal in terms of expected test performance and can be estimated with guarantees. Empirically, we show that the framework can be used to enforce a small generalization gap on downstream classifiers and to predict the generalization ability of neural networks.

artificial intelligence, machine learning, representation, (18 more...)

2009.12789

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(25 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Dabre, Raj, Fujita, Atsushi

Softmax Tempering for Training Neural Machine Translation Models

arXiv.org Artificial Intelligence, Sep-20-2020

Neural machine translation (NMT) models are typically trained using a softmax cross-entropy loss where the softmax distribution is compared against smoothed gold labels. In low-resource scenarios, NMT models tend to over-fit because the softmax distribution quickly approaches the gold label distribution. To address this issue, we propose to divide the logits by a temperature coefficient, prior to applying softmax, during training. In our experiments on 11 language pairs in the Asian Language Treebank dataset and the WMT 2019 English-to-German translation task, we observed significant improvements in translation quality of up to 3.9 BLEU points. Furthermore, softmax tempering makes greedy search as good as beam search decoding in terms of translation quality, enabling a 1.5 to 3.5 times speed-up. We also study the impact of softmax tempering on multilingual NMT and recurrently stacked NMT, both of which aim to reduce the NMT model size by parameter sharing, thereby verifying the utility of temperature in developing compact NMT models. Finally, an analysis of softmax entropies and gradients reveals the impact of our method on the internal behavior of NMT models.
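The tempering step described in this abstract (dividing the logits by a temperature before the softmax during training) can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the example logits, and the temperature of 2.0 are assumptions, and a hard gold label is used instead of smoothed labels to keep the example short.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def tempered_cross_entropy(logits, gold_index, temperature=1.0):
    """Cross-entropy against a hard gold label after dividing logits by a temperature.

    A temperature above 1 flattens the softmax distribution, so the loss stays
    larger even when the model already favours the gold label.
    """
    scaled = [x / temperature for x in logits]
    probs = softmax(scaled)
    return -math.log(probs[gold_index])

logits = [4.0, 1.0, 0.5]  # hypothetical decoder outputs for three tokens
plain = tempered_cross_entropy(logits, gold_index=0, temperature=1.0)
tempered = tempered_cross_entropy(logits, gold_index=0, temperature=2.0)
```

With these assumed values the tempered loss is larger than the plain one, which is consistent with the abstract's motivation: the softmax distribution approaches the gold distribution more slowly.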

machine learning, natural language, softmax, (18 more...)

2009.09372

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Germany > Berlin (0.04)
(11 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial Intelligence, Sep-17-2020

A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

Zhang, Yichi, Ou, Zhijian, Wang, Huixin, Feng, Junlan

Structured belief states are crucial for user goal tracking and database query in task-oriented dialog systems. However, training belief trackers often requires expensive turn-level annotations of every user utterance. In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning. We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables and jointly modeled with system responses given user inputs. Such latent variable modeling enables us to develop semi-supervised learning under the principled variational learning framework. Furthermore, we introduce LABES-S2S, which is a copy-augmented Seq2Seq model instantiation of LABES. In supervised experiments, LABES-S2S obtains strong results on three benchmark datasets of different scales. In utilizing unlabeled dialog data, semi-supervised LABES-S2S significantly outperforms both supervised-only and semi-supervised baselines. Remarkably, we can reduce the annotation demands to 50% without performance loss on MultiWOZ.

belief state, machine learning, natural language, (20 more...)

2009.08115

Country:

Europe > France (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Beijing > Beijing (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine Learning, Sep-16-2020

Defending SVMs against Poisoning Attacks: the Hardness and DBSCAN Approach

Ding, Hu, Yang, Fan, Huang, Jiawei

Adversarial machine learning has attracted a great amount of attention in recent years. In a poisoning attack, the adversary can inject a small number of specially crafted samples into the training data, which make the decision boundary deviate severely and cause unexpected misclassification. Due to the great importance and popular use of support vector machines (SVM), we consider defending SVM against poisoning attacks in this paper. We study two commonly used defense strategies: designing robust SVM algorithms and data sanitization. Though several robust SVM algorithms have been proposed before, most of them either lack adversarial resilience or rely on strong assumptions about the data distribution or the attacker's behavior. Moreover, the research on their complexities is still quite limited. We are the first, to the best of our knowledge, to prove that even the simplest hard-margin one-class SVM with outliers problem is NP-complete, and has no fully PTAS unless P = NP (that is, even an approximation algorithm is hard to achieve). For the data sanitization defense, we link it to the intrinsic dimensionality of data; in particular, we provide a sampling theorem in doubling metrics for explaining the effectiveness of DBSCAN (as a density-based outlier removal method) for defending against poisoning attacks. In our empirical experiments, we compare several defenses including the DBSCAN and robust SVM methods, and investigate the influence of the intrinsic dimensionality and data density on their performance.
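The data sanitization idea in this abstract (using DBSCAN as a density-based outlier filter before training) can be sketched as below. This is a simplified illustration, not the authors' pipeline: it computes only DBSCAN's noise set rather than full cluster labels, runs on all points regardless of class, and uses hypothetical names (`dbscan_noise_mask`, `sanitize`) and assumed values for `eps` and `min_pts`.

```python
import math

def dbscan_noise_mask(points, eps, min_pts):
    """Return a list of booleans, True where DBSCAN would label the point as noise.

    A point is a core point if at least min_pts points (itself included) lie
    within eps of it. Noise points are those that are neither core points nor
    within eps of any core point.
    """
    n = len(points)
    neigh = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_pts for nb in neigh]
    return [not core[i] and not any(core[j] for j in neigh[i]) for i in range(n)]

def sanitize(points, labels, eps, min_pts):
    """Drop the samples flagged as noise and keep the rest for training."""
    noise = dbscan_noise_mask(points, eps, min_pts)
    return [(p, y) for p, y, is_noise in zip(points, labels, noise) if not is_noise]

# Two dense groups plus one isolated, hypothetically poisoned sample.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.4, 0.4),
          (5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (5.4, 5.4),
          (2.5, 10.0)]
labels = [-1, -1, -1, -1, 1, 1, 1, 1, -1]
kept = sanitize(points, labels, eps=1.0, min_pts=3)
```

The surviving samples in `kept` would then be passed to an ordinary SVM trainer. The brute-force neighbour search is quadratic in the number of points, which is acceptable only for a small illustration like this one.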

artificial intelligence, dimension, machine learning, (17 more...)

2006.07757

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(17 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)

arXiv.org Artificial Intelligence, Sep-15-2020

Question Directed Graph Attention Network for Numerical Reasoning over Text

Chen, Kunlong, Xu, Weidi, Cheng, Xingyi, Xiaochuan, Zou, Zhang, Yuyu, Song, Le, Wang, Taifeng, Qi, Yuan, Chu, Wei

Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical ...

Although NumNet achieves superior performance than other numerically-aware models (Hu et al., 2019a; Andor et al., 2019; Geva et al., 2020; Chen et al., 2020), we argue that NumNet is insufficient for sophisticated numerical reasoning, since it lacks two critical ingredients for numerical reasoning: 1. Number Type and Entity Mention. The number comparison graph in NumNet is not able to identify different number types, and lacks the information of entities mentioned in the document that connect the number nodes.

artificial intelligence, machine learning, natural language, (19 more...)

2009.07448

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(7 more...)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)