A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints

Neural Information Processing Systems 

Neuro-symbolic learning often requires maximizing the likelihood of a symbolic constraint w.r.t. the neural network's output distribution. Such output distributions are typically assumed to be fully factorized, which precludes applying neuro-symbolic learning to the more expressive autoregressive distributions, e.g., those of transformers. Under such distributions, computing the likelihood of even simple constraints is #P-hard. Instead of attempting to enforce the constraint on the entire output distribution, we propose to do so on a random, local approximation thereof.
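To make the idea concrete, the sketch below illustrates one plausible reading of the "random, local approximation" suggested by the title: a pseudolikelihood-style, fully-factorized distribution built from the model's conditionals around a sampled output, under which the constraint probability becomes tractable. Everything here is a hypothetical stand-in assumed for illustration, not the paper's implementation: the toy binary model (`W`, `b`, `cond_prob`), the example constraint ("exactly one variable is true"), and the fixed sample `y_tilde`.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 binary output variables and an illustrative
# constraint requiring exactly one variable to be true.
N = 3

def constraint(y):
    return sum(y) == 1

# Hypothetical stand-in for the model's conditionals p(y_i = 1 | y_{-i});
# a real setting would query an autoregressive network instead.
W = rng.normal(size=(N, N))
b = rng.normal(size=N)

def cond_prob(i, y):
    """p(y_i = 1 | y_{-i}) for a toy fully-connected binary model."""
    logit = W[i] @ np.array(y, dtype=float) - W[i, i] * y[i] + b[i]
    return 1.0 / (1.0 + np.exp(-logit))

# Step 1: a sample from the model; fixed here for reproducibility.
y_tilde = [1, 0, 0]

# Step 2: the local, fully-factorized approximation around the sample,
# q(y) = prod_i p(y_i | y~_{-i}).
q = np.array([cond_prob(i, y_tilde) for i in range(N)])

# Step 3: under q, the constraint probability is tractable; brute-force
# enumeration over the 2^N assignments suffices at this toy scale.
p_constraint = 0.0
for y in itertools.product([0, 1], repeat=N):
    if constraint(y):
        p_y = np.prod([q[i] if y[i] else 1.0 - q[i] for i in range(N)])
        p_constraint += p_y

loss = -np.log(p_constraint)
print(f"q = {q}")
print(f"Pr_q(constraint) = {p_constraint:.4f}")
print(f"loss = {loss:.4f}")
```

The enumeration in Step 3 stands in for whatever tractable machinery the method actually uses to score the constraint under a fully-factorized distribution; the point of the sketch is only that, once the approximation factorizes, the #P-hard likelihood computation reduces to a tractable one.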