Uni [MASK]: Unified Inference in Sequential Decision Problems Micah Carroll 1, Orr Paradise

Neural Information Processing Systems 

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks.