OpenAI launches Sora, a tool that creates video from text
After giving us ChatGPT and DALL-E, OpenAI is showing no signs of slowing down. The AI startup, led by Sam Altman, has announced a new tool that can create videos from text prompts. They are calling it Sora.
The new tool can produce a video of up to a minute long based on instructions given by the user. It can also convert a still image into a motion picture or add new material to an existing video.
“We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction,” OpenAI wrote in a blog post.
The blog also includes a bunch of Sora-generated videos together with the prompts used to generate them.
One of the prompts, for instance, reads: “A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.”
To my surprise, the AI does a pretty great job of creating the visuals.
At the moment, Sora is only available to select creatives (visual artists, designers, and filmmakers) as well as security experts trying to find vulnerabilities that could be exploited.
“Today, Sora is becoming available to red teamers to assess critical areas for harms or risks. We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals,” the OpenAI blog post further reads.
Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.
However, as we’ve come to expect with AI, the tool is not perfect.
It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.
Still, it’s a remarkable feat by OpenAI that may even be seen as justification for the company’s decision to move away from its original plan of being an open-source non-profit organization.
For now, we will have to wait until the tool becomes publicly available to test the full extent of its capabilities.