Creating videos from a single image


In a web dominated by video, tools that help create it are increasingly necessary, even when all we have is a single reference image.

Google has been working on exactly this, enabling a DeepMind neural network to create short videos from a single frame.

This AI model, called “Transframer”, is a generative framework for image and video tasks that fills in missing frames from partial context, and it is now capable of generating 30-second videos from a single frame.

The AI uses context images to infer the surroundings of a scene, so it can show what the hidden outline of a piece of furniture looks like without actually seeing it. In effect, it imagines a real object from every angle.

It’s easy to imagine how it does this. If I give a program thousands of pictures of chairs from all possible angles and then show it a photo of a chair from the front, the program can imagine the rest thanks to its previous training.

The artificial depth perception and perspective are noticeable in the demonstration, which makes it easy to envision how something like this could improve video games, not just videos for social media.

If we can already create realistic images with DALL-E and then turn those images into videos, we are one step away from automatically creating videos from a single text prompt.

