 paml



Joint Repetition Suppression and Content Moderation of Large Language Models

arXiv.org Artificial Intelligence

Natural language generation (NLG) is one of the most impactful fields in NLP, and recent years have witnessed its evolution brought about by large language models (LLMs). As the key instrument for writing assistance applications, LLMs are generally prone to replicating or extending offensive content provided in the input. In low-resource data regimes, they can also produce repetitive outputs. Usually, offensive content and repetitions are mitigated with post-hoc methods, including n-gram level blocklists, top-k sampling and nucleus sampling. In this paper, we apply non-exact repetition suppression using token-level and sequence-level unlikelihood loss, and further explore the unlikelihood training objective in order to jointly endow the model with the ability to avoid generating offensive words and phrases from the beginning. Finally, with comprehensive experiments, we demonstrate that our proposed methods work exceptionally well in controlling the repetition and content quality of LLM outputs.
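To make the token-level unlikelihood idea concrete, here is a minimal sketch for a single decoding step. It follows the general form of the objective (the usual likelihood term plus a penalty of -log(1 - p) for each unwanted token) and is not the paper's implementation. The function names, the `alpha` weight, and the choice of negative candidates are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn unnormalized scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_unlikelihood_loss(logits, target, negatives, alpha=1.0):
    """Likelihood loss plus an unlikelihood penalty for one decoding step.

    logits    -- scores over the vocabulary for the next token
    target    -- index of the reference next token
    negatives -- indices of tokens to discourage (hypothetical choice:
                 already generated tokens or blocklisted words)
    alpha     -- weight of the unlikelihood term (assumed hyperparameter)
    """
    probs = softmax(logits)
    # Standard maximum-likelihood term: raise the reference token.
    mle = -math.log(probs[target])
    # Unlikelihood term: push down each negative candidate.
    # The reference token itself is never penalized.
    ul = 0.0
    for c in set(negatives):
        if c != target:
            ul -= math.log(max(1.0 - probs[c], 1e-12))
    return mle + alpha * ul


if __name__ == "__main__":
    # Token 1 is a negative candidate; lowering its score lowers the loss.
    print(token_unlikelihood_loss([0.0, 0.0, 0.0, 0.0], target=0, negatives=[1]))
    print(token_unlikelihood_loss([0.0, -5.0, 0.0, 0.0], target=0, negatives=[1]))
```

With an empty `negatives` list the function reduces to the ordinary negative log-likelihood, which is why the same objective can cover both repetition and offensive-word candidates by changing only the candidate set.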


New Neural Network with 500 Billion Parameters

#artificialintelligence

Google just published a research article about its Pathways Language Model (PaLM), a neural network with about 500 billion parameters. It is unclear to me how many layers and how many neurons (also called nodes) it has. A parameter in this context is a weight attached to a link between two connected neurons, so the number of neurons is at most 500 billion, and most likely much smaller. By contrast, the average human brain has about 86 billion neurons.
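The gap between parameter count and neuron count can be illustrated with a small calculation. The sketch below assumes a plain fully connected network, which is not the architecture of the model discussed above. The function names and layer sizes are hypothetical.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons per layer, input first.
    """
    # One weight per link between neurons in consecutive layers.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias per neuron in every layer except the input layer.
    biases = sum(layer_sizes[1:])
    return weights + biases


def count_neurons(layer_sizes):
    """Total number of neurons across all layers."""
    return sum(layer_sizes)


if __name__ == "__main__":
    sizes = [1000, 1000, 1000]  # hypothetical three-layer network
    print(count_neurons(sizes))           # 3000
    print(count_dense_parameters(sizes))  # 2002000
```

Because the weight count grows with the product of adjacent layer widths while the neuron count grows with their sum, wide networks have far more parameters than neurons.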


Probabilistic Active Meta-Learning

arXiv.org Machine Learning

Data-efficient learning algorithms are essential in many practical applications where data collection is expensive, e.g., in robotics due to wear and tear. To address this problem, meta-learning algorithms use prior experience about tasks to learn new, related tasks efficiently. Typically, a set of training tasks is assumed given or randomly chosen. However, this setting does not take into account the sequential nature that naturally arises when training a model from scratch in real life: how do we collect a set of training tasks in a data-efficient manner? In this work, we introduce task selection based on prior experience into a meta-learning algorithm by conceptualizing the learner and the active meta-learning setting using a probabilistic latent variable model. We provide empirical evidence that our approach improves data-efficiency when compared to strong baselines on simulated robotic experiments.


Policy-Aware Model Learning for Policy Gradient Methods

arXiv.org Artificial Intelligence

A model-based reinforcement learning (MBRL) agent gradually learns a model of the environment as it interacts with it, and uses the learned model to plan and find a good policy. This can be done by planning with samples coming from the model, instead of or in addition to the samples from the environment, e.g., Sutton (1990); Peng & Williams (1993); Sutton et al. (2008); Deisenroth et al. (2015); Talvitie (2017); Ha & Schmidhuber (2018). If learning a model is easier than learning the policy or value function in a model-free manner, MBRL will reduce the number of required interactions with the real world and improve the sample complexity of the agent. However, this is contingent on the ability of the agent to learn an accurate model of the real environment. Therefore, the problem of learning a good model of the environment is of paramount importance to the success of MBRL. This paper addresses the question of how we can approach the problem of learning a model of the environment, and proposes a method called policy-aware model learning (PAML). The conventional approach to model learning in MBRL is to learn a model that is a good predictor of the environment. If the learned model is accurate enough, this leads to a value function or a policy that is close to the optimal one. Learning a good predictive model can be achieved by minimizing some form of a probabilistic loss.
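As a small illustration of the conventional approach described above (not of PAML itself), the sketch below fits a tabular transition model by maximum likelihood, which for discrete states amounts to normalizing transition counts. The data format and function names are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def fit_transition_model(transitions):
    """Maximum-likelihood estimate of P(next_state | state, action).

    transitions is a list of (state, action, next_state) tuples.
    """
    counts = defaultdict(Counter)
    for state, action, next_state in transitions:
        counts[(state, action)][next_state] += 1
    model = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        model[key] = {s2: n / total for s2, n in counter.items()}
    return model


def negative_log_likelihood(model, transitions):
    """Probabilistic loss of a model on observed transitions."""
    return -sum(math.log(model[(s, a)][s2]) for s, a, s2 in transitions)


if __name__ == "__main__":
    # Hypothetical experience from a two-state environment.
    data = [(0, "right", 1), (0, "right", 1), (0, "right", 0), (1, "right", 1)]
    learned = fit_transition_model(data)
    print(learned[(0, "right")])
    print(negative_log_likelihood(learned, data))
```

The count-based estimate minimizes the negative log-likelihood on the observed data, so any other tabular model scores at least as high a loss on the same transitions.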


Personalizing Dialogue Agents via Meta-Learning

arXiv.org Artificial Intelligence

Existing personalized dialogue models use human-designed persona descriptions to improve dialogue consistency. Collecting such descriptions from existing dialogues is expensive and requires hand-crafted feature designs. In this paper, we propose to extend Model-Agnostic Meta-Learning (MAML) (Finn et al., 2017) to personalized dialogue learning without using any persona descriptions. Our model learns to quickly adapt to new personas by leveraging only a few dialogue samples collected from the same user, which is fundamentally different from conditioning the response on the persona descriptions. Empirical results on the Persona-chat dataset (Zhang et al., 2018) indicate that our solution outperforms non-meta-learning baselines on automatic evaluation metrics and in terms of human-evaluated fluency and consistency.
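The MAML idea of learning an initialization that adapts well after a few gradient steps can be sketched on a toy problem. Here each task is a scalar target standing in for one persona, the loss is 0.5 * (theta - target)^2, and the same data is used for adaptation and evaluation. All of this is an illustrative assumption, not the dialogue model from the paper, and the function names and learning rates are hypothetical.

```python
def inner_adapt(theta, target, inner_lr):
    """One gradient step on the task loss 0.5 * (theta - target)**2."""
    grad = theta - target
    return theta - inner_lr * grad


def maml_meta_gradient(theta, target, inner_lr):
    """Gradient of the post-adaptation loss with respect to theta.

    The adapted parameter depends on theta with derivative
    (1 - inner_lr), so the chain rule through the inner step gives
    (adapted - target) * (1 - inner_lr).
    """
    adapted = inner_adapt(theta, target, inner_lr)
    return (adapted - target) * (1.0 - inner_lr)


def meta_train(task_targets, theta=0.0, inner_lr=0.1, meta_lr=0.5, steps=50):
    """Update the shared initialization using the average meta-gradient."""
    for _ in range(steps):
        grads = [maml_meta_gradient(theta, t, inner_lr) for t in task_targets]
        theta -= meta_lr * sum(grads) / len(grads)
    return theta


if __name__ == "__main__":
    # Two hypothetical tasks; the learned initialization settles between them.
    print(meta_train([1.0, 3.0]))
```

In this toy setting the learned initialization converges to the mean of the task targets, the point from which a single inner step gets closest to every task on average.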


SAS is The Leader in The Forrester Wave: Multimodal Predictive Analytics and Machine Learning (PAML) Platforms, Q3 2018

#artificialintelligence

According to SAS, SAS Visual Data Mining and Machine Learning offers users a single platform to solve complex analytical problems. Combining data preparation, visualization, advanced analytics and model deployment, it unifies the entire machine learning process, from data access/transformation and preparation to scoring, in one environment. Running on the SAS Viya engine, SAS Visual Data Mining and Machine Learning includes the latest statistical, machine learning, deep learning and text analysis algorithms that accelerate structured and unstructured data explorations, while also supporting popular open source languages.