RiTTA: Modeling Event Relations in Text-to-Audio Generation
