Name: Sora
Author: OpenAI

OpenAI's text-to-video diffusion model, announced in February 2024. It generates videos up to one minute long while maintaining visual quality and adherence to the prompt. Like the GPT models, it uses a transformer architecture, which OpenAI credits with superior scaling performance. Videos are represented as collections of patches, analogous to tokens in GPT, which lets the model train on visual data of varying durations, resolutions, and aspect ratios. Its deep language understanding lets it interpret prompts accurately and generate compelling characters that express vibrant emotions. It can create multiple shots within a single video while keeping characters and visual style consistent, and it can either generate an entire video at once or extend an existing video. Sora was first shown as a research preview in February 2024 and was released publicly in December 2024 as 'Sora Turbo', a faster version of the model.
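
The patch representation can be illustrated with a small sketch. This is not Sora's implementation: OpenAI's technical report describes patches taken from a compressed latent representation, while this example cuts raw pixel arrays directly, and the function name `patchify` and the patch sizes (2 frames by 16 by 16 pixels) are assumptions chosen for illustration. It shows how clips with different durations and aspect ratios turn into patch sequences of different lengths but the same per-patch width, which is what lets one transformer consume them all.

```python
import numpy as np

def patchify(video, pt=2, ph=16, pw=16):
    """Split a (T, H, W, C) video into a sequence of flattened spacetime patches.

    Hypothetical helper for illustration; the patch sizes are assumed values.
    """
    T, H, W, C = video.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    # Cut each axis into (number of patches, patch size).
    x = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Bring the three patch-index axes to the front.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch, analogous to one token per row.
    return x.reshape(-1, pt * ph * pw * C)

# Two clips with different durations and aspect ratios.
landscape = np.zeros((8, 64, 128, 3), dtype=np.float32)
portrait = np.zeros((16, 128, 64, 3), dtype=np.float32)

print(patchify(landscape).shape)  # (128, 1536)
print(patchify(portrait).shape)   # (256, 1536)
```

Only the sequence length changes between the two clips (128 versus 256 patches); each patch has the same 1536 values, so no cropping or resizing to a fixed shape is needed before the sequences are fed to a model.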