Phenaki

Nov-17-2022, 15:47:32 GMT–#artificialintelligence

We present Phenaki, a model capable of realistic video synthesis given a sequence of textual prompts. Generating videos from text is particularly challenging due to the computational cost, the limited quantity of high-quality text-video data, and the variable length of videos. To address these issues, we introduce a new causal model for learning video representations, which compresses the video into a small representation of discrete tokens. This tokenizer uses causal attention in time, which allows it to work with variable-length videos. To generate video tokens from text, we use a bidirectional masked transformer conditioned on pre-computed text tokens.
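The link between causal attention in time and variable-length videos can be illustrated with a small sketch. This is not Phenaki's implementation: the function names `causal_time_mask` and `masked_average` are hypothetical, and a plain average stands in for learned attention. It only shows the general property that, under a causal mask, the output for a frame depends on that frame and earlier ones, so a shorter video yields the same outputs as the matching prefix of a longer one.

```python
def causal_time_mask(num_frames: int) -> list[list[bool]]:
    """mask[i][j] is True when frame i may attend to frame j (only j <= i)."""
    return [[j <= i for j in range(num_frames)] for i in range(num_frames)]


def masked_average(frames: list[float], mask: list[list[bool]]) -> list[float]:
    """Toy stand-in for attention: each frame averages the frames it may see."""
    out = []
    for row in mask:
        visible = [x for x, allowed in zip(frames, row) if allowed]
        out.append(sum(visible) / len(visible))
    return out


# Hypothetical per-frame features; a real tokenizer would use patch embeddings.
short_video = [1.0, 2.0, 3.0]
long_video = short_video + [4.0, 5.0]

short_out = masked_average(short_video, causal_time_mask(len(short_video)))
long_out = masked_average(long_video, causal_time_mask(len(long_video)))

# Later frames never influence earlier outputs, so the prefix is unchanged.
prefix_matches = long_out[:len(short_video)] == short_out
```

With a bidirectional mask, by contrast, appending frames would change every output, which is one reason a causal-in-time design is a natural fit for videos of differing lengths.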

phenaki, representation, video

Technology:
- Information Technology > Artificial Intelligence (0.86)