I got access to the newly launched Sora app yesterday. I’ve been using it since launch and wanted to capture my FEELINGS on it as someone who has been playing with a lot of the generative AI tooling for a while now. This is just a loose collection of impressions — Sora feels like a big moment for AI generated video along a few different axes, so I wanted to explore some of my own thoughts on that below. Enjoy!
My first thought: the profound un-creativity of the average user of AI image and video creation tools continues to astound me. People like to make a lot of noise about how AI commodifies art, but in practice it has put into stark relief the difference between people who are doing “artistic” things with AI and those who are producing slop. This is basically a “make your dream movie” model/app, and the masses just make derivative slop with it.
Sora is full of slop.
However, the slop is kind of the point here, and it’s not a bad thing.
Sora turns the trend-chasing fast-follows of social media into part of the consumption process itself. The whole idea of “how can I make something like they make” (performatively photographing your food, re-arranging your house a certain way, etc., to mimic a trend for your own social capital) is gone. You can literally just remix something to “make it your own”.
But this goes beyond the normal connotations of remix. Because you are remixing an AI video with another AI video from the same model, and the source video occupies a specific position in Sora's latent space, a remix is charted so closely against its source that your remixed work is effectively the same as the original. It's both a remix and an original work, and the works collectively provide some gestalt of meaning: they are implicitly derivatives of each other (almost literally, in the math) while also being originals.
Sora embraces this completely. Remixed trend videos and prompts appear as a horizontal feed, where each swipe left or right takes you to another variation of the same source prompt. This means you can seamlessly explore a style of video across different permutations by different creators.
This is also incredibly strange, as it de-emphasizes specific creators and weirdly ends up centering the creative content itself more than a traditional feed does.
This setup also encourages you to engage with trends yourself, because contributing your own spin is so easy.
The net effect of browsing Sora, then, is less about finding creators and more about finding prompts you like and seeing how other people engage with them. Because any creator can engage with any prompt, the feed feels far less like browsing the aggregate creative output of a variety of different people and far more like browsing the collective creative id of a community.
This feels revolutionary in practice in how it embraces remix and full author-death more than any feed-based creative platform before it.
However, the current creative community on Sora is pathetic. The top remix videos in the day since launch have been:
- MLK's “I have a dream…” speech with the dream substituted for insipid things like “I have a dream that Xbox Game Pass was still $10” and “GPUs are affordable.”
- Fake Spongebob Squarepants episodes featuring other meme topics du jour, like the backrooms.
- Lots of random Jesus content.
- Fake CCTV/crime content.
- Lots of cameos featuring people from OpenAI and famous streamers.
This already feels like another instance of the breakdown of the idea that web 2.0 is some great connective tissue between different social strata. It's already very clear that celebrities minted outside of Sora will become the main objects of Cameos in Sora, further cementing their status as untouchable while paradoxically allowing an even closer parasocial relationship with them: you can effectively deepfake yourself hanging out with someone famous.
That said, the model is also Good. It incorporates things like edits into videos while maintaining consistency, which is incredible. The audio is also incredible, even more so because the dialog generated for scenes is intuited from the implied context of your prompt.
The model is also creative in a specific way. There is clearly a lot going on behind the scenes to provoke the model into doing more than your own prompt, and tbh I like the chaos.
That said as well: the tools are awful. The UI is awful, and the only way to remix is to reprompt directly in a small text box. There are no visible prompt-improvement options. No way to generate multiples. Submitted drafts can't be used to seed new prompts (if you want to reuse an edit you made to a prompt). You can't easily make new videos by stitching previous frames or referencing other videos, etc.
This stuff mattered less a few years ago, but there are now a lot of companies with Pretty Good video models, and they are going to live and die by their UX. As far as I can tell, only Moon Valley seems dedicated to thinking about “model UX” beyond a prompt box. Google's AI Studio for Veo is also nice, but still feels more Google than Adobe. Midjourney is getting there, but their tooling is still focused on images (fine, I like their image tools).
This also obviously doesn't matter for Sora right now. The only real moat in AI may be community, and building one around participatory social video to get people in and creating is a good way to build that moat.
Altman said this was a GPT3.5 moment, and I think that tracks here. This will be the primary way a lot of the world first engages with video models, so giving them effectively no options to engage with them beyond a feed and a prompt box is probably okay.
Last thing I'll say: to use these tools creatively, I still need approximately 5x the generation volume and speed that I get from Midjourney, and about 10x what I get from the current gen of Sora. These AI companies want me to use their models; fine, let me use your model! The compute and speed of these things still feel far off from where I'd like them to be. Each generation currently feels so slow that it becomes precious, which is antithetical to actual creative work. I want to rip through generations as fast as they're coming to find what I'm looking for. MJ gets close, but all of them could be better.
I’ll probably keep using and checking out Sora. I like that the slop veil is lifted — everything in the app is AI, which is weirdly liberating? You know what you’re signing up for in there and you aren’t going in for Authentic Content. You’re going in there to see a scene from Spongebob where he smokes a blunt with Gary. You’re looking for a video of Pikachu performing a Boiler Room set. A first person Mario Kart race commentated like an F1 race. Etc.