Movie Gen builds on Meta’s previous work in video synthesis, following 2022’s Make-A-Scene video generator and the Emu image-synthesis model. Using text prompts for guidance, this latest system can, for the first time, generate custom videos with sound, edit and insert changes into existing videos, and transform images of people into realistic personalized videos. […] Movie Gen’s video-generation model can create 1080p high-definition videos up to 16 seconds long at 16 frames per second from text descriptions or an image input. Meta claims the model can handle complex concepts like object motion, subject-object interactions, and camera movements. Meta has posted example videos online and also released a research paper with more technical information about the model.
As for the training data, the company says it trained these models on a combination of “licensed and publicly available datasets.” Ars notes that this “very likely includes videos uploaded by Facebook and Instagram users over the years, although this is speculation based on Meta’s current policies and previous behavior.”