AITopics

doi: 10.3233/FI-2016-1446

1512.08899

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Stable Distribution Alignment Using the Dual of the Adversarial Distance

Usman, Ben, Saenko, Kate, Kulis, Brian

Methods that align distributions by minimizing an adversarial distance between them have recently achieved impressive results. However, these approaches are difficult to optimize with gradient descent and they often do not converge well without careful hyperparameter tuning and proper initialization. We investigate whether turning the adversarial min-max problem into an optimization problem by replacing the maximization part with its dual improves the quality of the resulting alignment and explore its connections to Maximum Mean Discrepancy. Our empirical results suggest that using the dual formulation for the restricted family of linear discriminators results in a more stable convergence to a desirable solution when compared with the performance of a primal min-max GAN-like objective and an MMD objective under the same restrictions. We test our hypothesis on the problem of aligning two synthetic point clouds on a plane and on a real-image domain adaptation problem on digits. In both cases, the dual formulation yields an iterative procedure that gives more stable and monotonic improvement over time.
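As a point of reference for the MMD baseline mentioned in the abstract above, here is a minimal sketch of the biased squared-MMD estimate under a linear kernel, which reduces to the squared distance between the two sample means. The function names and the toy point clouds are illustrative assumptions; this is not the authors' dual objective or their experimental setup.

```python
def mean_vector(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def linear_mmd_squared(xs, ys):
    """Biased squared MMD with the linear kernel k(x, y) = x . y.

    With this kernel the estimate equals ||mean(xs) - mean(ys)||^2.
    """
    mx, my = mean_vector(xs), mean_vector(ys)
    return sum((a - b) ** 2 for a, b in zip(mx, my))


# Hypothetical toy data: a square of points and a shifted copy of it.
source = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
target = [(x + 3.0, y + 4.0) for x, y in source]

print(linear_mmd_squared(source, target))  # 25.0 (means differ by (3, 4))
print(linear_mmd_squared(source, source))  # 0.0
```

Because a linear kernel only compares means, two clouds with the same mean but different shapes would score zero here, which is one reason richer discriminators or kernels are used in practice.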

artificial intelligence, machine learning, objective, (15 more...)

1707.04046

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Overcoming Catastrophic Forgetting by Incremental Moment Matching

Lee, Sang-Woo, Kim, Jin-Hwa, Jun, Jaehyun, Ha, Jung-Woo, Zhang, Byoung-Tak

Catastrophic forgetting is a problem in which a neural network loses the information of the first task after being trained on the second task. Here, we propose a method, incremental moment matching (IMM), to resolve this problem. IMM incrementally matches the moment of the posterior distribution of the neural network trained on the first and the second task, respectively. To make the search space of the posterior parameter smooth, the IMM procedure is complemented by various transfer learning techniques including weight transfer, L2-norm of the old and the new parameter, and a variant of dropout with the old parameter. We analyze our approach on a variety of datasets including the MNIST, CIFAR-10, Caltech-UCSD-Birds, and Lifelog datasets. The experimental results show that IMM achieves state-of-the-art performance by balancing the information between an old and a new network.

artificial intelligence, machine learning, neural network, (16 more...)

1703.08475

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Activation Maximization Generative Adversarial Nets

Zhou, Zhiming, Cai, Han, Rong, Shu, Song, Yuxuan, Ren, Kan, Zhang, Weinan, Yu, Yong, Wang, Jun

Class labels have been empirically shown to be useful in improving the sample quality of generative adversarial nets (GANs). In this paper, we mathematically study the properties of the current variants of GANs that make use of class label information. With class-aware gradient and cross-entropy decomposition, we reveal how class labels and associated losses influence GAN training. Based on that, we propose Activation Maximization Generative Adversarial Networks (AM-GAN) as an advanced solution. Comprehensive experiments have been conducted to validate our analysis and evaluate the effectiveness of our solution, where AM-GAN outperforms other strong baselines and achieves state-of-the-art Inception Score (8.91) on CIFAR-10. In addition, we demonstrate that, with the Inception ImageNet classifier, Inception Score mainly tracks the diversity of the generator, and there is no reliable evidence that it can reflect the true sample quality. We thus propose a new metric, called AM Score, to provide a more accurate estimate of the sample quality. Our proposed model also outperforms the baseline methods on the new metric.
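For readers unfamiliar with the metric discussed in the abstract above, the following is a minimal sketch of the standard Inception Score definition: the exponential of the average KL divergence between each sample's class distribution p(y|x) and the marginal p(y). The function name and the toy probability tables are illustrative assumptions; a real evaluation would obtain p(y|x) from a pretrained classifier, and the AM Score proposed in the paper is not shown here.

```python
import math


def inception_score(cond_probs):
    """exp of the mean KL(p(y|x) || p(y)) over samples.

    cond_probs is a list of rows, one per sample, each row a probability
    distribution over classes.
    """
    n = len(cond_probs)
    num_classes = len(cond_probs[0])
    # Marginal class distribution, averaged over all samples.
    marginal = [sum(row[c] for row in cond_probs) / n for c in range(num_classes)]
    total_kl = 0.0
    for row in cond_probs:
        for p, m in zip(row, marginal):
            if p > 0.0:  # the term 0 * log(0) is treated as 0
                total_kl += p * math.log(p / m)
    return math.exp(total_kl / n)


# Confident predictions spread over both classes: score close to 2.
diverse = [[1.0, 0.0], [0.0, 1.0]]
# Every sample gets the same prediction: score is 1, the minimum.
collapsed = [[0.5, 0.5], [0.5, 0.5]]

print(inception_score(diverse))
print(inception_score(collapsed))
```

The score is bounded above by the number of classes, which is reached only when predictions are both confident and evenly spread across classes.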

artificial intelligence, inception score, machine learning, (18 more...)

1703.02

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.49)

Ganzfried, Sam, Yusuf, Farzana

Optimal Number of Choices in Rating Contexts

In many settings people must give numerical scores to entities from a small discrete set, for instance rating physical attractiveness from 1 to 5 on dating sites, or papers from 1 to 10 for conference reviewing. We study the problem of understanding when using a different number of options is optimal. For concreteness we assume the true underlying scores are integers from 1 to 100. We consider the cases when scores are uniformly random and when they are Gaussian. We study when using 2, 3, 4, 5, and 10 options is optimal in these models. One may expect that using more options would always improve performance in this model, but we show that this is not necessarily the case, and that using fewer choices (even just two) can surprisingly be optimal in certain situations. While in theory for this setting it would be optimal to use all 100 options, in practice this is prohibitive, and it is preferable to utilize a smaller number of options due to humans' limited computational resources. Our results suggest that using a smaller number of options than is typical could be optimal in certain situations. This would have many potential applications, as settings requiring entities to be ranked by humans are ubiquitous.
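To make the setup in the abstract above concrete, here is a minimal sketch that compresses true scores from 1 to 100 into k options and measures how much information is lost. The equal-width bucketing, the reconstruction by bucket mean, and the mean-absolute-error criterion are all assumptions chosen for illustration; the abstract does not specify the authors' mapping or performance measure, so this sketch should not be expected to reproduce their findings.

```python
def bucket(score, k, max_score=100):
    """Map an integer score in 1..max_score to one of k near-equal-width options (0-based)."""
    return (score - 1) * k // max_score


def mean_abs_error(k, max_score=100):
    """Average |true score - bucket mean| when every score is equally likely."""
    members = {}
    for s in range(1, max_score + 1):
        members.setdefault(bucket(s, k, max_score), []).append(s)
    total = 0.0
    for scores in members.values():
        center = sum(scores) / len(scores)  # representative value for this option
        total += sum(abs(s - center) for s in scores)
    return total / max_score


# The option counts listed in the abstract.
for k in (2, 3, 4, 5, 10):
    print(k, round(mean_abs_error(k), 3))
```

A different criterion (for example, how often the compressed ratings preserve the ordering of two entities) or a Gaussian score distribution would change the comparison between values of k.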

artificial intelligence, error 0, social media, (16 more...)

1605.06588

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government > Europe Government > France Government (0.46)

Technology:

Information Technology > Communications > Social Media (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)

New Scientist Online News, Jan-29-2018, 17:36:52 GMT

Facebook is making a chatbot that can fill awkward silences

There are a lot of things that chatbots have yet to master and high on the list is small talk. But researchers at Facebook think the best way to make software prattle away is to give it a personality. Workers were asked to chat in pairs and to give statements describing themselves, including their likes and dislikes. The crowdworkers' chatter was linked to these description statements and used to train the chatbots.

information technology services, management and information, natural language, (4 more...)

New Scientist Online News

AI-Alerts: 2018 > 2018-01 > AAAI AI-Alert for Jan 30, 2018 (1.00)

Industry: Information Technology > Services (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbots (1.00)

Gatt, Albert, Krahmer, Emiel

Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.

computer game, sentence planning and realisation, soccer, (23 more...)

1703.09902

Country:

North America > United States > Massachusetts (0.27)
Europe > Netherlands (0.27)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)

Garcia-Gasulla, Dario, Parés, Ferran, Vilalta, Armand, Moreno, Jonatan, Ayguadé, Eduard, Labarta, Jesús, Cortés, Ulises, Suzumura, Toyotaro

On the Behavior of Convolutional Nets for Feature Extraction

Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive language coded within a trained CNN model (in the case of image data), and reusing it for other purposes is a field of interest, as it provides access to the visual descriptors previously learnt by the CNN after processing millions of images, without requiring an expensive training phase. Contributions to this field (commonly known as feature representation transfer or transfer learning) have been purely empirical so far, extracting all CNN features from a single layer close to the output and testing their performance by feeding them to a classifier. This approach has provided consistent results, although its relevance is limited to classification tasks. In a completely different approach, in this paper we statistically measure the discriminative power of every single feature found within a deep CNN, when used for characterizing every class of 11 datasets. We seek to provide new insights into the behavior of CNN features, particularly the ones from convolutional layers, as this can be relevant for their application to knowledge representation and reasoning. Our results confirm that low and middle level features may behave differently to high level features, but only under certain conditions. We find that all CNN features can be used for knowledge representation purposes both by their presence or by their absence, doubling the information a single CNN feature may provide. We also study how much noise these features may include, and propose a thresholding approach to discard most of it. All these insights have a direct application to the generation of CNN embedding spaces.

artificial intelligence, deep learning, machine learning, (19 more...)

1703.01127

Country:

North America > United States (0.28)
Europe > Spain (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Lample, Guillaume, Chaplot, Devendra Singh

Playing FPS Games with Deep Reinforcement Learning

Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games that involve partially observable states. Typically, deep reinforcement learning methods only utilize visual input for training. We present a method to augment these models to exploit game feature information, such as the presence of enemies or items, during the training phase. Our model is trained to simultaneously learn these features along with minimizing a Q-learning objective, which is shown to dramatically improve the training speed and performance of our agent. Our architecture is also modularized to allow different models to be independently trained for different phases of the game. We show that the proposed architecture substantially outperforms built-in AI agents of the game as well as average humans in deathmatch scenarios.
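The idea of learning game features alongside a Q-learning objective can be sketched as a sum of two losses for a single transition. The one-step Q-learning target below is the standard one; the binary cross-entropy feature loss, the `feature_weight` parameter, and all function names are assumptions for illustration, since the abstract does not give the exact form of the authors' objective.

```python
import math


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_loss(q_value, target):
    """Squared temporal-difference error for one transition."""
    return (target - q_value) ** 2


def feature_loss(predicted_prob, present):
    """Binary cross-entropy for one game feature, e.g. 'an enemy is visible'."""
    p = min(max(predicted_prob, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -math.log(p) if present else -math.log(1.0 - p)


def joint_loss(q_value, target, predicted_prob, present, feature_weight=1.0):
    """Assumed combination: TD loss plus a weighted feature-prediction loss."""
    return q_loss(q_value, target) + feature_weight * feature_loss(predicted_prob, present)


target = td_target(reward=1.0, next_q_values=[0.0, 2.0], gamma=0.5)
print(target)                                # 2.0
print(joint_loss(1.0, target, 0.5, True))    # TD loss of 1.0 plus log(2)
```

In a deep network both terms would be computed from shared layers, so gradients from the feature loss shape the representation used for the Q-values.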

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1609.05521

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Al-Shedivat, Maruan, Dubey, Avinava, Xing, Eric P.

Contextual Explanation Networks

We introduce contextual explanation networks (CENs), a class of models that learn to predict by generating and leveraging intermediate explanations. CENs are deep networks that generate parameters for context-specific probabilistic graphical models which are further used for prediction and play the role of explanations. Contrary to the existing post-hoc model-explanation tools, CENs learn to predict and to explain jointly. Our approach offers two major advantages: (i) for each prediction, valid instance-specific explanations are generated with no computational overhead and (ii) prediction via explanation acts as a regularization and boosts performance in low-resource settings. We prove that local approximations to the decision boundary of our networks are consistent with the generated explanations. Our results on image and text classification and survival analysis tasks demonstrate that CENs are competitive with the state of the art while offering additional insights behind each prediction, valuable for decision support.

explanation, machine learning, natural language, (17 more...)

1705.10301

Country:

Africa > Uganda (0.28)
North America (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area (1.00)
Media > Film (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)