Phenaki
Information
Description
Phenaki is an AI model that generates videos from text, including videos that are several minutes long. It can also create a video from a still image and a text prompt. Its video encoder-decoder outperforms the per-frame baselines currently used in the literature in both spatio-temporal quality and number of tokens per video. To produce video tokens from text, the model uses a bidirectional masked transformer conditioned on pre-computed text tokens. The generated video tokens are then de-tokenized to create the actual video.
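To make the two-stage flow above more concrete, here is a minimal toy sketch of masked, iterative token generation followed by de-tokenization. It is not Phenaki's code: the names toy_predict, generate_video_tokens and detokenize, the cosine unmasking schedule, and the tiny vocabulary are all assumptions made for illustration, and a random stand-in replaces the real transformer and decoder.

```python
import math
import random

MASK = -1  # placeholder id for a video token that has not been generated yet


def toy_predict(text_tokens, video_tokens, vocab_size, rng):
    """Hypothetical stand-in for the bidirectional transformer.

    A real model would attend to text_tokens and to every video position at once;
    this toy ignores both and returns a random (token, confidence) pair per position.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in video_tokens]


def generate_video_tokens(text_tokens, num_tokens, vocab_size=16, steps=4, seed=0):
    """Start fully masked, then commit the most confident predictions each step."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        preds = toy_predict(text_tokens, tokens, vocab_size, rng)
        # Assumed cosine schedule: how many tokens may stay masked after this step.
        remain = int(num_tokens * math.cos(math.pi / 2 * step / steps))
        n_fill = max(1, len(masked) - remain)
        # Commit the masked positions whose predictions have the highest confidence.
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:n_fill]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens


def detokenize(tokens, tokens_per_frame, vocab_size=16):
    """Toy de-tokenizer: map each token id to a value in [0, 1], grouped into frames."""
    values = [t / (vocab_size - 1) for t in tokens]
    return [values[i:i + tokens_per_frame] for i in range(0, len(values), tokens_per_frame)]


if __name__ == "__main__":
    text_tokens = [3, 7, 1]  # stands in for pre-computed text tokens
    video_tokens = generate_video_tokens(text_tokens, num_tokens=12)
    frames = detokenize(video_tokens, tokens_per_frame=4)
    print(len(frames), "frames of", len(frames[0]), "values each")
```

The point of the sketch is only the control flow: all video tokens start masked, several positions are filled in parallel per step rather than one at a time, and the finished token grid is decoded into frames as a separate final stage.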
Phenaki Reviews