Sora
OpenAI
Sora is OpenAI's text-to-video diffusion model, announced in February 2024. It generates videos up to one minute long while maintaining visual quality and adherence to the prompt. Like the GPT models, it uses a transformer architecture, chosen for its scaling performance. Videos are represented as patches, analogous to tokens in GPT, which allows training on footage of varying durations, resolutions, and aspect ratios. Its language understanding lets it interpret prompts accurately and render characters that express vivid emotions. It can create multiple shots within a single video while keeping characters and visual style consistent, and it can either generate an entire video at once or extend an existing one. Sora was released as a research preview in February 2024 and publicly in December 2024 as 'Sora Turbo', a faster version.
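The patch representation can be illustrated with a minimal sketch. This is not OpenAI's implementation: the `patchify` and `make_video` helpers, the single-channel nested-list video, and the 2x2x2 patch size are all assumptions chosen for illustration. It only shows how clips with different durations and aspect ratios become sequences of equally sized patch vectors, differing only in sequence length.

```python
from typing import List

# Hypothetical toy format: video[frame][row][col], one channel for brevity.
Video = List[List[List[float]]]


def patchify(video: Video, pt: int, ph: int, pw: int) -> List[List[float]]:
    """Split a video into non-overlapping spacetime patches.

    Each patch spans pt frames, ph rows, and pw columns and is flattened
    into one vector, playing the role a token plays in a language model.
    """
    t, h, w = len(video), len(video[0]), len(video[0][0])
    if t % pt or h % ph or w % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    patches = []
    for t0 in range(0, t, pt):
        for y0 in range(0, h, ph):
            for x0 in range(0, w, pw):
                patches.append([
                    video[ti][yi][xi]
                    for ti in range(t0, t0 + pt)
                    for yi in range(y0, y0 + ph)
                    for xi in range(x0, x0 + pw)
                ])
    return patches


def make_video(t: int, h: int, w: int) -> Video:
    """Build a toy video whose pixel value encodes its own position."""
    return [
        [[float(ti * h * w + yi * w + xi) for xi in range(w)] for yi in range(h)]
        for ti in range(t)
    ]


# Two clips with different durations and aspect ratios.
wide = patchify(make_video(4, 4, 8), 2, 2, 2)  # 4 frames, 4x8 pixels
tall = patchify(make_video(2, 8, 4), 2, 2, 2)  # 2 frames, 8x4 pixels

print(len(wide), len(wide[0]))  # 16 patches of 8 values each
print(len(tall), len(tall[0]))  # 8 patches of 8 values each
```

Both clips yield patch vectors of the same size, so a single transformer could in principle consume either one; only the number of patches changes with the clip's shape.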
Strengths
Caveats
Capabilities
Vision
Audio
Video
Tool Use
Resources
No external resources available