This is because, as you might have gathered by now, these AIs are essentially probability machines. They know, more or less, what a futuristic cityscape should look like, based on scraping lots of futuristic cityscapes, but they don't understand the building blocks of the real world, and they can't hold a fixed idea of a world in memory. Instead, they keep reimagining it.
It's these specific problems that AI video company Runway says it's made some progress in fixing with its new Gen-4 models. According to Runway, the new models offer "a new generation of consistent and controllable media," with characters, objects, and scenes now much more likely to look the same across an entire project.
The new Gen-4 models can also "understand the world" and "simulate real-world physics" better than ever before, Runway says. The benefit of going out into the world with an actual video camera is that you can shoot a bridge from one side, then cross over and shoot the same bridge from the other side. With AI, you tend to get a different approximation of a bridge each time—something Runway wants to tackle.
Have a look at the demo videos Runway has put together and you'll see they do a pretty good job in terms of consistency (though, of course, these are hand-picked from a wide pool). The characters in these clips look more or less the same from shot to shot, albeit with some variation in facial hair, clothing, and apparent age.
While Gen-4 models are now available for image-to-video generation for paying Runway users, the scene-to-scene consistency features haven't rolled out yet, so I can't test them personally. I have experimented with creating some short clips in Sora, and consistency and real-world physics remain issues there, with objects appearing out of (and disappearing into) thin air, and characters moving through walls and furniture.
Of course, you only have to look at where AI video technology was a year ago to know that these models are going to get better and better, but generating video is not the same as generating text, or a static image: It requires a lot more computing power and a lot more "thought," as well as a grasp of real-world physics that will be difficult for AI to learn.