Meta's new text-to-video AI generator is like DALL-E for video
The researchers note in the paper that the model has many technical limitations beyond blurry footage and disjointed animation. For example, their training methods cannot learn information that a human might only infer by watching a video, such as whether a waving hand is moving left to right or right to left. Other open problems include generating videos longer than five seconds, videos with multiple scenes and events, and video at higher resolution. Make-A-Video currently outputs 16 frames of video at a resolution of 64 by 64 pixels, which a separate AI model then upscales to 768 by 768.
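To put the quoted resolution figures in perspective, the short sketch below works out how much of the final image the upscaling model has to fill in. It is arithmetic on the numbers reported above only; the function name `upscale_summary` is made up for illustration and is not part of Make-A-Video or any Meta code.

```python
# Figures quoted in the article; everything else here is illustrative.
FRAMES = 16        # frames produced per clip
BASE_SIDE = 64     # pixels per side before upscaling
FINAL_SIDE = 768   # pixels per side after upscaling


def upscale_summary(frames, base_side, final_side):
    """Return scale factors and pixel counts for a square-frame clip."""
    side_factor = final_side / base_side      # growth along each side
    pixel_factor = side_factor ** 2           # growth in pixels per frame
    base_pixels = frames * base_side ** 2     # pixels in the raw clip
    final_pixels = frames * final_side ** 2   # pixels in the upscaled clip
    return {
        "side_factor": side_factor,
        "pixel_factor": pixel_factor,
        "base_pixels": base_pixels,
        "final_pixels": final_pixels,
    }


summary = upscale_summary(FRAMES, BASE_SIDE, FINAL_SIDE)
print(summary)
```

Under these figures each side grows 12 times, so every frame ends up with 144 times as many pixels as the base model generated. That gap between generated and upscaled detail is consistent with the blurriness the researchers acknowledge.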
Sep-30-2022, 11:40:37 GMT