AITopics

2010.12908

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(25 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Journal of Artificial Intelligence ResearchOct-23-2020

Computing Bayes-Nash Equilibria in Combinatorial Auctions with Verification

Bosshard, Vitor (University of Zurich) | Bünz, Benedikt (Stanford University) | Lubin, Benjamin (Boston University) | Seuken, Sven (University of Zurich)

We present a new algorithm for computing pure-strategy ε-Bayes-Nash equilibria (ε-BNEs) in combinatorial auctions with continuous value and action spaces. An essential innovation of our algorithm is to separate the algorithm's search phase (for finding the ε-BNE) from the verification phase (for computing the ε). Using this approach, we obtain an algorithm that is both very fast and provides theoretical guarantees on the ε it finds. Our main technical contribution is a verification method which allows us to upper bound the ε across the whole continuous value space without making assumptions about the mechanism. Using our algorithm, we can now compute ε-BNEs in multi-minded domains that are significantly more complex than what was previously possible to solve. We release our code under an open-source license to enable researchers to perform algorithmic analyses of auctions, to enable bidders to analyze different strategies, and to facilitate many other applications.

algorithm, artificial intelligence, machine learning, (20 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11525

AI Access Foundation

11525

Journal of Artificial Intelligence Research

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(14 more...)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Nakagawa, Akira, Kato, Keizo

Quantitative Understanding of VAE by Interpreting ELBO as Rate Distortion Cost of Transform Coding

arXiv.org Machine LearningOct-23-2020

Variational autoencoder (VAE) estimates the posterior parameters (mean and variance) of latent variables corresponding to each input data. While it is used for many tasks, the transparency of the model is still an underlying issue. This paper provides a quantitative understanding of VAE property by interpreting VAE as a non-linearly scaled isometric embedding. According to the Rate-distortion theory, the optimal transform coding is achieved by using a PCA-like orthonormal transform where the transform space is isometric to the input. From this analogy, we show theoretically and experimentally that VAE can be mapped to an implicit isometric embedding with a scale factor derived from the posterior parameter. As a result, we can estimate the data probabilities in the input space from the prior, loss metrics, and corresponding posterior parameters. In addition, the quantitative importance of each latent variable can be evaluated like the eigenvalue of PCA. Variational autoencoder (VAE) (Kingma & Welling, 2014) is one of the most successful generative models, estimating posterior parameters of latent variables for each input data. In VAE, the latent representation is obtained by maximizing an evidence lower bound (ELBO). A number of studies (Higgins et al., 2017; Kim & Mnih, 2018; Lopez et al., 2018; Chen et al., 2018; Locatello et al., 2019; Rolínek et al., 2019) have tried to reveal the property of latent variables. To maximize ELBO, Alemi et al. (2018) analysed the rate-distortion (RD) tradeoff. However, the quantitative behavior of the latent space at the optimum RD tradeoff condition is still not clarified well. RD theory (Berger, 1971), which is successfully applied to image compression, formulates that a PCA-like orthonormal transform with uniform coding noise optimizes the RD tradeoff.

artificial intelligence, machine learning, variance, (16 more...)

2007.1519

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Kumar, Aounon, Levine, Alexander, Feizi, Soheil, Goldstein, Tom

Certifying Confidence via Randomized Smoothing

arXiv.org Machine LearningOct-22-2020

Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems. It uses the probabilities of predicting the top two most-likely classes around an input point under a smoothing distribution to generate a certified radius for a classifier's prediction. However, most smoothing methods do not give us any information about the confidence with which the underlying classifier (e.g., deep neural network) makes a prediction. In this work, we propose a method to generate certified radii for the prediction confidence of the smoothed classifier. We consider two notions for quantifying confidence: average prediction score of a class and the margin by which the average prediction score of one class exceeds that of another. We modify the Neyman-Pearson lemma (a key theorem in randomized smoothing) to design a procedure for computing the certified radius where the confidence is guaranteed to stay above a certain threshold. Our experimental results on CIFAR-10 and ImageNet datasets show that using information about the distribution of the confidence scores allows us to achieve a significantly better certified radius than ignoring it. Thus, we demonstrate that extra information about the base classifier at the input point can help improve certified guarantees for the smoothed classifier.

artificial intelligence, certificate, machine learning, (18 more...)

2009.08061

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Maryland (0.04)
(9 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Krishnan, Vishaal, Makdah, Abed AlRahman Al, Pasqualetti, Fabio

Lipschitz Bounds and Provably Robust Training by Laplacian Smoothing

arXiv.org Machine LearningOct-22-2020

In this work we propose a graph-based learning framework to train models with provable robustness to adversarial perturbations. In contrast to regularization-based approaches, we formulate the adversarially robust learning problem as one of loss minimization with a Lipschitz constraint, and show that the saddle point of the associated Lagrangian is characterized by a Poisson equation with weighted Laplace operator. Further, the weighting for the Laplace operator is given by the Lagrange multiplier for the Lipschitz constraint, which modulates the sensitivity of the minimizer to perturbations. We then design a provably robust training scheme using graph-based discretization of the input space and a primal-dual algorithm to converge to the Lagrangian's saddle point. Our analysis establishes a novel connection between elliptic operators with constraint-enforced weighting and adversarial learning. We also study the complementary problem of improving the robustness of minimizers with a margin on their loss, formulated as a loss-constrained minimization problem of the Lipschitz constant. We propose a technique to obtain robustified minimizers, and evaluate fundamental Lipschitz lower bounds by approaching Lipschitz constant minimization via a sequence of gradient $p$-norm minimization problems. Ultimately, our results show that, for a desired nominal performance, there exists a fundamental lower bound on the sensitivity to adversarial perturbations that depends only on the loss function and the data distribution, and that improvements in robustness beyond this bound can only be made at the expense of nominal performance. Our training schemes provably achieve these bounds both under constraints on performance and~robustness.

artificial intelligence, lipschitz constant, machine learning, (15 more...)

2006.03712

Country:

North America > United States > California > Riverside County > Riverside (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Pfrommer, Samuel, Gama, Fernando, Ribeiro, Alejandro

Discriminability of Single-Layer Graph Neural Networks

arXiv.org Machine LearningOct-21-2020

Network data can be conveniently modeled as a graph signal, where data values are assigned to the nodes of a graph describing the underlying network topology. Successful learning from network data requires methods that effectively exploit this graph structure. Graph neural networks (GNNs) provide one such method and have exhibited promising performance on a wide range of problems. Understanding why GNNs work is of paramount importance, particularly in applications involving physical networks. We focus on the property of discriminability and establish conditions under which the inclusion of pointwise nonlinearities to a stable graph filter bank leads to an increased discriminative capacity for high-eigenvalue content. We define a notion of discriminability tied to the stability of the architecture, show that GNNs are at least as discriminative as linear graph filter banks, and characterize the signals that cannot be discriminated by either.

artificial intelligence, gnn, machine learning, (16 more...)

2010.08847

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(10 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceOct-19-2020

Multi-teacher Knowledge Distillation for Knowledge Graph Completion

Wang, Kai, Liu, Yu, Ma, Qian, Sheng, Quan Z.

Link prediction based on knowledge graph embedding (KGE) aims to predict new triples to complete knowledge graphs (KGs) automatically. However, recent KGE models tend to improve performance by excessively increasing vector dimensions, which would cause enormous training costs and save storage in practical applications. To address this problem, we first theoretically analyze the capacity of low-dimensional space for KG embeddings based on the principle of minimum entropy. Then, we propose a novel knowledge distillation framework for knowledge graph embedding, utilizing multiple low-dimensional KGE models as teachers. Under a novel iterative distillation strategy, the MulDE model produces soft labels according to training epochs and student performance adaptively. The experimental results show that MulDE can effectively improve the performance and training speed of low-dimensional KGE models. The distilled 32-dimensional models are very competitive compared to some of state-or-the-art (SotA) high-dimensional methods on several commonly-used datasets.

artificial intelligence, dimension, machine learning, (17 more...)

2010.07152

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
Asia > China > Liaoning Province > Dalian (0.05)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ehret, Benjamin, Henning, Christian, Cervera, Maria R., Meulemans, Alexander, von Oswald, Johannes, Grewe, Benjamin F.

Continual Learning in Recurrent Neural Networks

arXiv.org Machine LearningOct-13-2020

While a diverse collection of continual learning (CL) methods has been proposed to prevent catastrophic forgetting, a thorough investigation of their effectiveness for processing sequential data with recurrent neural networks (RNNs) is lacking. Here, we provide the first comprehensive evaluation of established CL methods on a variety of sequential data benchmarks. Specifically, we shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs. In contrast to feedforward networks, RNNs iteratively reuse a shared set of weights and require working memory to process input samples. We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements, which lead to an increased need for stability at the cost of decreased plasticity for learning subsequent tasks. We additionally provide theoretical arguments supporting this interpretation by studying linear RNNs. Our study shows that established CL methods can be successfully ported to the recurrent case, and that a recent regularization approach based on hypernetworks outperforms weight-importance methods, thus emerging as a promising candidate for CL in RNNs. Overall, we provide insights on the differences between CL in feedforward networks and RNNs, while guiding towards effective solutions to tackle CL on sequential data.

artificial intelligence, experiment, machine learning, (20 more...)

2006.12109

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Robine, Jan, Uelwer, Tobias, Harmeling, Stefan

Discrete Latent Space World Models for Reinforcement Learning

arXiv.org Artificial IntelligenceOct-12-2020

Sample efficiency remains a fundamental issue of reinforcement learning. Model-based algorithms try to make better use of data by simulating the environment with a model. We propose a new neural network architecture for world models based on a vector quantized-variational autoencoder (VQ-VAE) to encode observations and a convolutional LSTM to predict the next embedding indices. A model-free PPO agent is trained purely on simulated experience from the world model. We adopt the setup introduced by Kaiser et al. (2020), which only allows 100K interactions with the real environment, and show that we reach better performance than their SimPLe algorithm in five out of six randomly selected Atari environments, while our model is significantly smaller.

artificial intelligence, machine learning, world model, (15 more...)

2010.05767

Country:

North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

arXiv.org Artificial IntelligenceOct-11-2020

Examining the Ordering of Rhetorical Strategies in Persuasive Requests

Shaikh, Omar, Chen, Jiaao, Saad-Falcon, Jon, Chau, Duen Horng, Yang, Diyi

Interpreting how persuasive language influences audiences has implications across many domains like advertising, argumentation, and propaganda. Persuasion relies on more than a message's content. Arranging the order of the message itself (i.e., ordering specific rhetorical strategies) also plays an important role. To examine how strategy orderings contribute to persuasiveness, we first utilize a Variational Autoencoder model to disentangle content and rhetorical strategies in textual requests from a large-scale loan request corpus. We then visualize interplay between content and strategy through an attentional LSTM that predicts the success of textual requests. We find that specific (orderings of) strategies interact uniquely with a request's content to impact success rate, and thus the persuasiveness of a request.

machine learning, natural language, triplet, (19 more...)

2010.04625

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry:

Government (0.49)
Media (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)