Training and Inference on Any-Order Autoregressive Models the Right Way

Oct-9-2024, 19:45:15 GMT - Neural Information Processing Systems

Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference, with important applications such as masked language modeling and image inpainting. In recent years, the family of Any-Order Autoregressive Models (AO-ARMs), closely related to popular models such as BERT and XLNet, has shown breakthrough performance in arbitrary conditional tasks across a wide range of domains. Despite this success, we identify significant improvements to be made to previous formulations of AO-ARMs. First, we show that AO-ARMs suffer from redundancy in their probabilistic model: they define the same distribution in multiple different ways. We alleviate this redundancy by training on a smaller set of univariate conditionals that still supports efficient arbitrary conditional inference. Second, we upweight the training loss for univariate conditionals that are evaluated more frequently during inference.
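
The redundancy mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: an exact probability table over three binary variables stands in for a learned network, and all function names (`make_joint`, `conditional`, `chain_rule`, `count_univariate_conditionals`) are hypothetical. The sketch evaluates the chain rule under every variable ordering and counts the univariate conditionals p(x_i | x_S) that a model covering all orderings would need to represent.

```python
import itertools
import math
import random


def make_joint(n, seed=0):
    """Build a random, normalized joint table over n binary variables (illustrative only)."""
    rng = random.Random(seed)
    states = list(itertools.product([0, 1], repeat=n))
    weights = [rng.random() + 0.1 for _ in states]
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


def marginal(joint, assignment):
    """Sum the joint over all states consistent with a partial assignment {index: value}."""
    return sum(
        p for s, p in joint.items()
        if all(s[i] == v for i, v in assignment.items())
    )


def conditional(joint, i, value, evidence):
    """Exact univariate conditional p(x_i = value | evidence)."""
    return marginal(joint, {**evidence, i: value}) / marginal(joint, evidence)


def chain_rule(joint, x, order):
    """Probability of the full assignment x, factorized along the given variable order."""
    evidence = {}
    prob = 1.0
    for i in order:
        prob *= conditional(joint, i, x[i], evidence)
        evidence[i] = x[i]
    return prob


def count_univariate_conditionals(n):
    """Number of (target i, evidence subset S) pairs with i not in S: n * 2^(n-1)."""
    return n * 2 ** (n - 1)


n = 3
joint = make_joint(n)
x = (1, 0, 1)

# One chain-rule product per ordering; with exact conditionals they all coincide.
per_order = {
    order: chain_rule(joint, x, order)
    for order in itertools.permutations(range(n))
}
for order, prob in per_order.items():
    print(order, round(prob, 6))
print("univariate conditionals:", count_univariate_conditionals(n))
```

With exact conditionals, every ordering yields the same joint probability, so the orderings are different parameterizations of a single distribution. A learned network that approximates each conditional separately is not constrained to agree across orderings, which is one way to read the redundancy the abstract describes.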

any-order autoregressive model, training and inference, univariate conditional

Genre:
- Play > Prospect (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty (0.64)
  - Natural Language (0.64)