Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective

Shaul, Neta, Gat, Itai, Havasi, Marton, Severo, Daniel, Sriram, Anuroop, Holderrieth, Peter, Karrer, Brian, Lipman, Yaron, Chen, Ricky T. Q.

arXiv.org Artificial Intelligence 

The design space of discrete-space diffusion or flow generative models is significantly less well understood than that of their continuous-space counterparts, with many works focusing only on a simple masked construction. In this work, we aim to take a holistic approach to the construction of discrete generative models based on continuous-time Markov chains and, for the first time, allow the use of arbitrary discrete probability paths, or colloquially, corruption processes. Through the lens of optimizing the symmetric kinetic energy, we propose velocity formulas that can be applied to any given probability path, completely decoupling the probability and velocity, and giving the user the freedom to specify any desirable probability path based on expert knowledge specific to the data domain. Furthermore, we find that a special construction of mixture probability paths optimizes the symmetric kinetic energy for the discrete case. We find that we can outperform the mask construction even in text with kinetic-optimal mixture paths, while we can make use of domain-specific constructions of the probability path over the visual domain.

Generative models over discrete spaces have not seen as much methodological progress as their continuous-space counterparts. For the most part, applications such as large language modeling rely solely on autoregressive models (Radford et al., 2019; Bommasani et al., 2021). The simplicity of autoregressive modeling has also motivated its use for multimodal generation, where other modalities, such as images and videos, are tokenized and modeled within an autoregressive framework (Van den Oord et al., 2016; Team, 2024; Sun et al., 2024). A promising framework that brings iterative refinement to the discrete case is to consider the use of Markov chains within a dynamical generative framework.
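
To make the notion of a discrete probability path, or corruption process, concrete, the following is a minimal sketch of the simple masked mixture path that the abstract uses as its baseline. It assumes the per-token form p_t(x | x1) = (1 - kappa(t)) * delta_mask(x) + kappa(t) * delta_{x1}(x) with a linear scheduler kappa(t) = t. The names `MASK`, `corrupt`, and `kappa` are hypothetical and chosen for illustration only; this is not the kinetic-optimal construction proposed in the paper.

```python
import numpy as np

MASK = -1  # hypothetical id reserved for the mask token (assumption)


def corrupt(x1, t, rng, kappa=lambda t: t):
    """Sample x_t from a masked mixture path, independently per token.

    Each token keeps its clean value with probability kappa(t) and is
    replaced by MASK otherwise. The linear scheduler is an assumption.
    """
    x1 = np.asarray(x1)
    keep = rng.random(x1.shape) < kappa(t)  # True where the clean token survives
    return np.where(keep, x1, MASK)


rng = np.random.default_rng(0)
x1 = np.array([5, 2, 7, 7, 1])  # a toy "clean" token sequence
for t in (0.0, 0.5, 1.0):
    print(t, corrupt(x1, t, rng))
```

At t = 0 every token is masked and at t = 1 the clean sequence is returned unchanged; intermediate times interpolate between the two. A general discrete path in the sense of the abstract would replace this fixed mask-or-data choice with any user-specified distribution over tokens at each time.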
