Kait

Sora is sorta neat

OpenAI announced Sora, a new model for text-to-video, and it's ... fine? I guess? I mean, I know why they announced it - it's legitimately really cool you can type something in and a video vaguely approximating your description in really high resolution shows up.

I just don't think it's really all that useful in real-world contexts.

Don't get me wrong, I appreciate their candor in the "whoopsies" segments, but even in the show-off pieces some of the video is weird to just downright bad.

A screenshot of a video of a woman walking, where her thumb is approximately as long as all her other fingersHands are hard! I get it! But there's also quite literally a "bag lady" (a woman who appears to be carrying at least two gigantic purses), and (especially when the camera moves) the main character floats along the ground without actually walking pretty often.

Are these nitpicky things people aren't going to notice on first glance? Maybe. But remember the outrage around Ugly Sonic? People notice small (or large) discrepancies in their popular entertainment, and the brand suffers for it. To say nothing of advertisers! Imagine trying to market your brand-new (well, "new" in car definitions) car without an accurate model of said car in the ad. Or maybe you really want to buy the latest Danover.

An AI-generated commercial of a generic SUV with the word "Danover" as the brand.It seems like all the current AI output has a limit of "close-ish" for things, from self-driving to video to photos to even text generation. It all requires human editing, often significant for any work of reasonable size, to pull it out of the uncanny valley.

"But look how far they've gotten in such little time!" they cry. "Just wait!"

But nobody's managed to push past that last 10% in any domain. It always requires a human touch to get it "right."

Like the fake Land Rover commercial is interesting, except imagine the difficulty of getting it to match your new product (look and name) exactly. You're almost going to have to CGI it in after, at least parts, at which point you've lost much of the benefit.

Unfortunately, "close enough" is good enough for a lot of people who are lazy, cheap or don't care about quality. The software example I'd give is there probably aren't a lot of companies who'd be willing to pay for software consultant services who are just going to use AI instead, but plenty of those people who message you on LinkedIn willing to pay you $200 for a Facebook clone absolutely are going to pay Copilot $20 a month instead.

And yes, there will be those people (especially levels removed from the actual work) who will think they can replace their employees with chatbots, and it might even work for a little bit. But poorly designed systems always have failure points, and once you hit it you're going to wind up having to scrap the whole thing. A building with a bad foundation can't be fixed through patching.

I have a feeling it's the same in other industries. I do think workers will feel the hit, especially on lower-budget products already or where people see an opportunity to cut corners. I also think our standards as a society will be relaxed a little bit in a lot of areas, simply because the mean will regress

But in good news, I think this'll shake out in a few years where people realize AI isn't replacing everything any more than Web3 did, but AI will have more utility as a tool in the toolkit of professionals. It's just gonna take a bit to get there.

The funny thing is a lot of the uncanny stuff makes it look like the model was trained on CGI videos, which might be a corollary to the prophesied problem of AI training on AI outputs. The dalmatian looks and moves CGI af, and the train looks like a bad photoshop insert where they had a video of a train on flat ground and matted over the background with a picture.