AITopics | ardm

ardm

Discriminator Guidance for Autoregressive Diffusion Models

Kelvinius, Filip Ekström, Lindsten, Fredrik

arXiv.org Machine Learning, Oct-24-2023

We introduce discriminator guidance in the setting of Autoregressive Diffusion Models. Discriminators have previously been used to guide the diffusion process in continuous diffusion models, and in this work we derive ways of using a discriminator together with a pretrained generative model in the discrete case. First, we show that using an optimal discriminator will correct the pretrained model and enable exact sampling from the underlying data distribution. Second, to account for the realistic scenario of using a sub-optimal discriminator, we derive a sequential Monte Carlo algorithm which iteratively takes the predictions from the discriminator into account during the generation process. We test these approaches on the task of generating molecular graphs and show how the discriminator improves the generative performance over using only the pretrained model.
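As a rough illustration of the first claim, the sketch below uses a toy categorical distribution rather than molecular graphs. It assumes the standard optimal-discriminator identity d*(x) = p_data(x) / (p_data(x) + p_model(x)), under which reweighting the pretrained model by d* / (1 - d*) recovers the data distribution. The distributions and function names are hypothetical and not taken from the paper.

```python
def optimal_discriminator(p_data, p_model):
    """d*(x) = p_data(x) / (p_data(x) + p_model(x)) for each outcome x."""
    return [pd / (pd + pm) for pd, pm in zip(p_data, p_model)]


def guided_distribution(p_model, d):
    """Reweight the model by the density-ratio estimate d / (1 - d), then normalize."""
    weights = [pm * di / (1.0 - di) for pm, di in zip(p_model, d)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example (made-up numbers): a miscalibrated model over four outcomes.
p_data = [0.1, 0.2, 0.3, 0.4]
p_model = [0.25, 0.25, 0.25, 0.25]

d_star = optimal_discriminator(p_data, p_model)
corrected = guided_distribution(p_model, d_star)
```

With a sub-optimal discriminator the ratio is only approximate, which is the case the abstract's sequential Monte Carlo algorithm is meant to handle.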

artificial intelligence, discriminator, machine learning

arXiv:2310.15817

Country: Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Autoregressive Diffusion Models

Hoogeboom, Emiel, Gritsenko, Alexey A., Bastings, Jasmijn, Poole, Ben, Berg, Rianne van den, Salimans, Tim

arXiv.org Machine Learning, Oct-5-2021

We introduce Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models (Uria et al., 2014) and absorbing discrete diffusion (Austin et al., 2021), which we show are special cases of ARDMs under mild assumptions. ARDMs are simple to implement and easy to train. Unlike standard ARMs, they do not require causal masking of model representations, and can be trained using an efficient objective similar to modern probabilistic diffusion models that scales favourably to high-dimensional data. At test time, ARDMs support parallel generation which can be adapted to fit any given generation budget. We find that ARDMs require significantly fewer steps than discrete diffusion models to attain the same performance. Finally, we apply ARDMs to lossless compression, and show that they are uniquely suited to this task. Contrary to existing approaches based on bits-back coding, ARDMs obtain compelling results not only on complete datasets, but also on compressing single data points. Moreover, this can be done using a modest number of network calls for (de)compression due to the model's adaptable parallel generation.

Deep generative models have made great progress in modelling different sources of data, such as images, text and audio. These models have a wide variety of applications, such as denoising, inpainting, translating and representation learning. A popular type of likelihood-based model is the Autoregressive Model (ARM). Although very effective, ARMs require a pre-specified order in which to generate data, which may not be an obvious choice for some data modalities, for example images. Further, although the likelihood of ARMs can be retrieved with a single neural network call, sampling from a model requires the same number of network calls as the dimensionality of the data.

Recently, modern probabilistic diffusion models have introduced a new training paradigm: instead of optimizing the entire likelihood of a datapoint, a component of the likelihood bound can be sampled and optimized. Works on diffusion on discrete spaces (Sohl-Dickstein et al., 2015; Hoogeboom et al., 2021; Austin et al., 2021) describe a discrete destruction process for which the …

Work done as a research intern at Google Brain.
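The order-agnostic, budgeted generation described in the abstract can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: `toy_model` is a hypothetical stand-in for the network that simply returns uniform distributions, and the even chunking of positions across calls is an assumption made for simplicity.

```python
import random

MASK = None  # placeholder for a dimension that has not been generated yet


def toy_model(x, num_classes):
    """Hypothetical stand-in for the network: one categorical distribution per
    position, computed from the partially filled input in a single call."""
    return [[1.0 / num_classes] * num_classes for _ in x]


def sample_order_agnostic(dim, num_classes, budget, rng):
    """Generate `dim` values in a random order using at most `budget` model calls."""
    order = list(range(dim))
    rng.shuffle(order)  # no pre-specified generation order
    x = [MASK] * dim
    calls = 0
    chunk = -(-dim // budget)  # ceil(dim / budget) positions per call
    for start in range(0, dim, chunk):
        probs = toy_model(x, num_classes)  # one call on the partially filled input
        calls += 1
        # Positions in the same chunk are sampled in parallel from this one call.
        for pos in order[start:start + chunk]:
            x[pos] = rng.choices(range(num_classes), weights=probs[pos])[0]
    return x, calls


sample, calls = sample_order_agnostic(dim=8, num_classes=4, budget=2, rng=random.Random(0))
```

Setting `budget` equal to `dim` gives one call per dimension, as in a standard ARM, while smaller budgets trade sequential conditioning for fewer network calls.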

ardm, generative process, international conference

arXiv:2110.02037

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Beijing > Beijing (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models

Wu, Qingyang, Zhang, Yichi, Li, Yu, Yu, Zhou

arXiv.org Artificial Intelligence, Oct-8-2019

Existing dialog system models require extensive human annotations and are difficult to generalize to different tasks. The recent success of large pre-trained language models such as BERT and GPT-2 (Devlin et al., 2019; Radford et al., 2019) has suggested the effectiveness of incorporating language priors in downstream NLP tasks. However, how much pre-trained language models can help dialog response generation is still under exploration. In this paper, we propose a simple, general, and effective framework: Alternating Recurrent Dialog Model (ARDM). ARDM models each speaker separately and takes advantage of the large pre-trained language model. It requires no supervision from human annotations such as belief states or dialog acts to achieve effective conversations. ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ. Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as persuasion. In persuasion tasks, ARDM is capable of generating humanlike responses to persuade people to donate to a charity.

It has been a longstanding ambition for artificial intelligence researchers to create an intelligent conversational agent that can generate humanlike responses. Recently, data-driven dialog models have become increasingly popular. However, most current state-of-the-art approaches still rely heavily on extensive annotations such as belief states and dialog acts (Lei et al., 2018). Dialog content can vary considerably across dialog tasks, and having a different intent or dialog act annotation scheme for each task is costly. For some tasks, such as open-domain social chat, it is even impossible. Thus, it is difficult to utilize these methods on challenging dialog tasks, such as persuasion and negotiation, where dialog states and acts are difficult to annotate.
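One way to picture "models each speaker separately" is a dialog score that switches between two language models depending on who is speaking. The sketch below assumes the dialog likelihood factorizes turn by turn, with each model conditioning on the full history; that factorization, the callable interface, and the constant-probability stand-in models are all assumptions for illustration, not details given in the abstract.

```python
import math


def dialog_log_prob(turns, user_lm, system_lm):
    """Sum per-turn log-probabilities, switching models by speaker.

    `turns` is a list of (speaker, utterance) pairs; each model is a callable
    (history, utterance) -> probability of the utterance given the history.
    """
    history = []
    total = 0.0
    for speaker, utterance in turns:
        lm = user_lm if speaker == "user" else system_lm
        total += math.log(lm(history, utterance))
        history.append((speaker, utterance))  # both models see the full history
    return total


# Hypothetical stand-ins for two fine-tuned language models.
def user_lm(history, utterance):
    return 0.5


def system_lm(history, utterance):
    return 0.25


turns = [("user", "I need a cheap restaurant."), ("system", "Which area?")]
score = dialog_log_prob(turns, user_lm, system_lm)
```

No belief-state or dialog-act labels appear anywhere in this interface, which matches the abstract's claim that such annotations are not required.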

ardm, language model, pre-trained language model

arXiv:1910.03756

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Middle East > Syria (0.04)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
