SORA
What a happy, mentally stable individual. Definitely not a serial killer grandma.
In case you’re completely out of the loop: OpenAI, the guys who develop ChatGPT, have just announced a new AI service called “SORA”, which, like GPT, takes language prompts, but unlike GPT, outputs video. Realistic video.
Gone are the days of Will Smith eating spaghetti. The age of nearly perfect deepfakes is here. So what’s that mean for you?
Well, as usual, it’s complicated: a blessing for some, a curse for others, and a change in the market for everyone. It’s not going to eliminate nearly as many jobs as the tech nerds think it will, and it’s not going to bring about a content creation utopia, but it will wipe out a certain subset of careers that were already at the bottom of the barrel in terms of filmmaking jobs. Once it becomes possible to feed the video generator photos and not just text, a lot more of the industry will vanish.
Uncanny Valley
First, the phenomenon itself. AI content has an aftertaste, and as of now that flavor is pretty bad. It takes a minor amount of training and exposure before you can recognize a text passage as having been written by ChatGPT, but once you’ve got it, it’s easy. Moderators on most forum sites are already pretty good at spotting (and often removing/banning) content and comments written by AI.
As of now, all the AI that I’ve seen has a particularly strong uncanny valley effect, whether it’s written, photographic, or videographic. Each piece of content always has something “wrong” with it that triggers a little spot in my brain that instantly goes “oh, that’s disgusting”. With text-based AI, it’s the fact that unlike AI in films (which usually has a British accent, a high degree of introspection, and some serious wit), ChatGPT has a tendency to use flowery vocabulary and sentence structure to say things that are completely and totally wrong with complete earnestness. It’s not lying because it’s a computer and doesn’t know what a “lie” is, but that’s what it feels like. You read it and get the sense that whoever you’re talking to is a complete and total psychopath who has no problem saying the craziest things earnestly, for motives (ulterior ones) you don’t understand. You feel like you’re being manipulated. Either that or it feels like you’re talking to someone with the vocabulary of a physicist and the self-awareness of a 4-year-old, which is both more in line with reality and scarier than if you were merely being conned.
Photo and video AIs suffer from similar issues. Hands, of course, still look terrible. Facial expressions often look strangely bland and impersonal, extremely caricaturized, or even just completely insane. The faces themselves tend to appear glossy, as AI photo generators struggle with pores, having been trained on photosets which make use of editing and airbrushing.
Oh and speaking of editing, SORA can’t do it, at all. It has no understanding of the aesthetics of cutting, nor plot. It uses an intensified continuity approach to framing (1 main subject in the shot, max) and to my eyes often comes across as a strange kind of live-take video game. The clips it makes don’t feel cinematic, they feel like someone took their phone out and started streaming while stuck in Unreal Engine. It’s weirdly the worst of both worlds: the clips look just CGI enough to not feel real, but they completely lack the intelligence, artistry, and unrealistic and unique camera angles which make the best animated films great (quick example, you can do so much so quickly with animation that if you’re also a good director, the end result is both amazing AND cheap).
The Good
That said, it is a massive step forward. Landscape shots, especially ones without humans or moving objects, look almost perfect. If you just need stock drone footage, I would now recommend that you generate instead of buy it. I still haven’t seen any shots of anyone eating food, but from a far enough distance I’d expect crowds to look fine.
The potential for vfx, especially compositing, is massive: if you need a shot with a standing army in the background, you can shoot the footage irl, generate the army with SORA (instead of paying a vfx house, which is what you’d normally do) and slap it in there, easy. Continuity might be an issue between shots, that’s about it.
The Problem
The problem with all of this is trust. If you take the perspective that all life is sales (which it is), then using AI footage in any kind of project that you want people to actually watch effectively amounts to lying, which means that the lie had better be good. The minute someone watching an ad realizes that any part of that ad was made by AI, that sale is dead, that customer is gone forever. As far as they’re concerned, you’re a con man. And right now, with the technology so new, so prominent, and so easy to use, you can bet that as soon as SORA gets released to the public, we are going to be inundated with “con men”. People are going to get very good at spotting any inconsistency at all, very fast. And these minor mistakes SORA is making, far from being little hiccups on the path to perfect AI video integration, are going to create a massive schism in the film industry, as some audiences will embrace them, and others will write off entire projects because of them.
So you have to look at your brand to figure out which side of that schism you’re going to be on. If you read my last article on building social media for your business, you’ll understand that the position in the market you occupy determines the kind of content you should post. Ripping the examples from that article, it’s a safe bet that Gucci is going to be sticking with irl professional product photographers for the foreseeable future. Donut Media, on the other hand, will almost certainly use SORA to make the stupidest car memes imaginable, and their fans will love it. An actual car sales company, like Ferrari, will stick with live-shot content and professional CGI. Actual artists (directors) will likely, by and large, refuse to use AI content in their work, and develop cult followings because of it. There will be even more lawsuits and strikes in Hollywood. AI video will develop a negative stigma, and many people will likely refuse to watch it on principle… until it actually becomes so good that no one can tell the difference.
The Silver Lining
For now, if you’re a small-time filmmaker like me, the best bet is to focus on parts of the industry that require specificity, i.e., footage of a specific and unique subject. You can use SORA to generate great drone footage, but you can’t use it to generate great drone footage of your current RE listing (some Realtors will still try though, and kill their own reputations, see above). You can use SORA to generate footage of a wedding, but not your specific wedding reception. You still need an irl videographer for that, and will for a very long time, if not forever. All of that aside: you still have a career in filmmaking as long as you’re good at it. If you can edit, if you can direct, if you can act, if you’re a visionary filmmaker, you’re fine. If you’re selling drone shots on Shutterstock, you’re in trouble.
Final note: in the middle ground of all of this is the product video industry, which I think will get annihilated as soon as SORA figures out how to take photos as input. Want to make a beer commercial? Hire a professional photographer, let them take beautiful shots of the bottle from a couple angles, feed those photos into SORA and let it generate the silky smooth camera movement, while still keeping the perfect lighting and atmosphere the photographer set up. Checkmate.