AITopics | Riezler, Stefan

Collaborating Authors

Riezler, Stefan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction

Sokolov, Artem, Hitschler, Julian, Riezler, Stefan

arXiv.org Machine LearningJun-12-2018

Stochastic zeroth-order (SZO), or gradient-free, optimization allows to optimize arbitrary functions by relying only on function evaluations under parameter perturbations, however, the iteration complexity of SZO methods suffers a factor proportional to the dimensionality of the perturbed function. We show that in scenarios with natural sparsity patterns as in structured prediction applications, this factor can be reduced to the expected number of active features over input-output pairs. We give a general proof that applies sparse SZO optimization to Lipschitz-continuous, nonconvex, stochastic objectives, and present an experimental evaluation on linear bandit structured prediction tasks with sparse word-based feature representations that confirm our theoretical results.

inductive learning, optimization, perturbation, (19 more...)

arXiv.org Machine Learning

1806.04458

Country:

Europe (1.00)
North America > Canada (0.47)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning

Kreutzer, Julia, Uyheng, Joshua, Riezler, Stefan

arXiv.org Machine LearningMay-27-2018

We present a study on reinforcement learning (RL) from human bandit feedback for sequence-to-sequence learning, exemplified by the task of bandit neural machine translation (NMT). We investigate the reliability of human bandit feedback, and analyze the influence of reliability on the learnability of a reward estimator, and the effect of the quality of reward estimates on the overall RL task. Our analysis of cardinal (5-point ratings) and ordinal (pairwise preferences) feedback shows that their intra- and inter-annotator $\alpha$-agreement is comparable. Best reliability is obtained for standardized cardinal feedback, and cardinal feedback is also easiest to learn and generalize from. Finally, improvements of over 1 BLEU can be obtained by integrating a regression-based reward estimator trained on cardinal feedback for 800 translations into RL for NMT. This shows that RL is possible even from small amounts of fairly reliable human feedback, pointing to a great potential for applications at larger scale.

deep learning, neural network, translation, (20 more...)

arXiv.org Machine Learning

1805.10627

Country:

Asia (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation

Lam, Tsz Kin, Kreutzer, Julia, Riezler, Stefan

arXiv.org Machine LearningMay-7-2018

We present an approach to interactive-predictive neural machine translation that attempts to reduce human effort from three directions: Firstly, instead of requiring humans to select, correct, or delete segments, we employ the idea of learning from human reinforcements in form of judgments on the quality of partial translations. Secondly, human effort is further reduced by using the entropy of word predictions as uncertainty criterion to trigger feedback requests. Lastly, online updates of the model parameters after every interaction allow the model to adapt quickly. We show in simulation experiments that reward signals on partial translations significantly improve character F-score and BLEU compared to feedback on full translations only, while human effort can be reduced to an average number of $5$ feedback requests for every input.

artificial intelligence, machine translation, translation, (19 more...)

arXiv.org Machine Learning

1805.01553

Country:

Europe (1.00)
Asia > Middle East (0.68)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback

Lawrence, Carolin, Riezler, Stefan

arXiv.org Machine LearningMay-3-2018

Counterfactual learning from human bandit feedback describes a scenario where user feedback on the quality of outputs of a historic system is logged and used to improve a target system. We show how to apply this learning framework to neural semantic parsing. From a machine learning perspective, the key challenge lies in a proper reweighting of the estimator so as to avoid known degeneracies in counterfactual learning, while still being applicable to stochastic gradient optimization. To conduct experiments with human users, we devise an easy-to-use interface to collect human feedback on semantic parses. Our work is the first to show that semantic parsers can be improved significantly by counterfactual learning from logged human feedback data.

artificial intelligence, natural language, query, (18 more...)

arXiv.org Machine Learning

1805.01252

Country:

Europe (1.00)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Can Neural Machine Translation be Improved with User Feedback?

Kreutzer, Julia, Khadivi, Shahram, Matusov, Evgeny, Riezler, Stefan

arXiv.org Machine LearningApr-16-2018

We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments---five-star ratings of translation quality---and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics.

machine translation, survey article, translation, (21 more...)

arXiv.org Machine Learning

1804.05958

Country:

Europe (1.00)
Asia (1.00)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Information Technology > Services (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation

Lawrence, Carolin, Sokolov, Artem, Riezler, Stefan

arXiv.org Machine LearningDec-14-2017

The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback to translations that were predicted by another, historic SMT system. A challenge arises by the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT output space seemingly contradicts the theoretical requirements for counterfactual learning. We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. This can be achieved by additive and multiplicative control variates that avoid degenerate behavior in empirical risk minimization. Our simulation experiments show improvements of up to 2 BLEU points by counterfactual learning from deterministic bandit feedback.

artificial intelligence, machine translation, natural language, (17 more...)

arXiv.org Machine Learning

1707.09118

Country:

Europe > Spain (0.28)
North America > United States > Illinois (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Republic of Türkiye (0.14)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Shared Task on Bandit Learning for Machine Translation

Sokolov, Artem, Kreutzer, Julia, Sunderland, Kellen, Danchenko, Pavel, Szymaniak, Witold, Fürstenau, Hagen, Riezler, Stefan

arXiv.org Machine LearningJul-27-2017

We introduce and describe the results of a novel shared task on bandit learning for machine translation. The task was organized jointly by Amazon and Heidelberg University for the first time at the Second Conference on Machine Translation (WMT 2017). The goal of the task is to encourage research on learning machine translation from weak user feedback instead of human references or post-edits. On each of a sequence of rounds, a machine translation system is required to propose a translation for an input, and receives a real-valued estimate of the quality of the proposed translation for learning. This paper describes the shared task's learning and evaluation setup, using services hosted on Amazon Web Services (AWS), the data and evaluation metrics, and the results of various machine translation architectures and learning protocols.

artificial intelligence, machine translation, translation, (20 more...)

arXiv.org Machine Learning

1707.0905

Country:

North America > United States (0.46)
Europe > Spain (0.28)
North America > Canada (0.28)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.50)

Industry:

Information Technology > Services (0.67)
Education > Educational Setting > Online (0.30)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Bandit Structured Prediction for Neural Sequence-to-Sequence Learning

Kreutzer, Julia, Sokolov, Artem, Riezler, Stefan

arXiv.org Machine LearningApr-21-2017

Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. This feedback is received in the form of a task loss evaluation to a predicted output structure, without having access to gold standard structures. We advance this framework by lifting linear bandit learning to neural sequence-to-sequence learning problems using attention-based recurrent neural networks. Furthermore, we show how to incorporate control variates into our learning algorithms for variance reduction and improved generalization. We present an evaluation on a neural machine translation task that shows improvements of up to 5.89 BLEU points for domain adaptation from simulated bandit feedback.

control variate, deep learning, neural network, (16 more...)

arXiv.org Machine Learning

1704.06497

Country:

North America > United States (1.00)
Europe > United Kingdom > Scotland (0.14)
Asia > Middle East > Qatar (0.14)

Genre: Research Report (1.00)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Stochastic Structured Prediction under Bandit Feedback

Sokolov, Artem, Kreutzer, Julia, Riezler, Stefan, Lo, Christopher

Neural Information Processing SystemsDec-31-2016

Stochastic structured prediction under bandit feedback follows a learning protocol where on each of a sequence of iterations, the learner receives an input, predicts an output structure, and receives partial feedback in form of a task loss evaluation of the predicted structure. We present applications of this learning scenario to convex and non-convex objectives for structured prediction and analyze them as stochastic first-order methods. We present an experimental evaluation on problems of natural language processing over exponential output spaces, and compare convergence speed across different objectives under the practical criterion of optimal task performance on development data and the optimization-theoretic criterion of minimal squared gradient norm. Best results under both criteria are obtained for a non-convex objective for pairwise preference learning under bandit feedback.

algorithm, artificial intelligence, inductive learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.28)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)

Add feedback