
Welcome to Today’s AIography!
Good morning, AI filmmakers.
Something quietly significant happened in AI tools this week. Some of the most capable image and video generators may have just moved off the paid cloud platforms and onto working creators' own computers. No monthly subscription. No usage caps. No waiting in the queue behind enterprise customers. For a working filmmaker thinking about which AI tools are worth investing learning hours in over the next year, that shift matters more than the rankings on any leaderboard.
Two open-source AI models shipped this week. Open-source means the underlying files are downloadable for people to use, modify, and in many cases ship commercial work with, instead of living only inside a company's paid platform. The first is HiDream-O1-Image, a new image generator under a permissive license that debuted at #8 on a major public ranking against the big proprietary models. The second is SwiftI2V, a new image-to-video research model from a Hong Kong-led academic and industry team that is designed to bring 2K generation within reach of a high-end consumer graphics card.
These two land into a stack that was already getting serious. Wan2.2 from Alibaba has been maturing since last summer with support for text-to-video, image-to-video, speech-to-video, and character animation. MOVA from OpenMOSS adds synchronized video-and-audio. HiDream, Wan2.2, and MOVA are openly downloadable under commercial-friendly terms; SwiftI2V's license is still unclear.
The story this week is that the pieces of an open-source AI video workflow now fit together into one pipeline, and the question working creators are asking has changed.
In today’s AIography:
What I'm Thinking About
HiDream-O1-Image shipped with arena-leaderboard numbers and an open license.
SwiftI2V makes 2K image-to-video feasible on consumer hardware.
HiDream plus SwiftI2V plus Wan2.2 plus MOVA is a working pipeline.
The text-to-video arena moved again. Seedance leads. Veo is at three.
The Editorial ran 240 prompts across the paid stack. The data showed up.
Action Item: Test the open stack on one paid shot from your current project
Short Takes
One More Thing... (Video of the Week): Corridor Crew on AI video quality
Essential Tools: Unreal Engine 5.8 + MetaHuman 5.8 just dropped
Final Thoughts: What "free" actually costs
Read time: About 9 minutes
WHAT I’M THINKING ABOUT
HiDream-O1-Image (codename "Peanut") debuted on May 8 at #8 on the Artificial Analysis Text-to-Image Arena, a public board where viewers blind-compare images from two anonymous models and vote. #8 places HiDream in the same tier as Google, OpenAI, and Black Forest Labs (the proprietary teams that power most of the paid image generators in your stack). The full model files are downloadable and the license covers commercial use without restriction.
One model handles text-to-image, edits, and reference-image work. Native output runs up to 2K with sharp detail.
An open model that runs on your own machine just placed top ten on a board full of closed-source frontier work. I am genuinely surprised at the position. If your current image stack is Midjourney for stills, Flux Kontext for edits, and Nano Banana for reference work, you now have one open model that does all three. The question is not whether HiDream is a like-for-like replacement on day one. The question is whether your paid stack is buying you enough on each task to justify keeping it. For most working creators today, the honest answer is mixed. HiDream is not yet ahead of Midjourney on photoreal. It is the first unified open model to rank alongside the closed frontier models on the arena, and that gap will close from one side or the other within the year.
SwiftI2V is a new research model from a Hong Kong-led academic and industry team. It generates 2K image-to-video on a high-end consumer graphics card. The published benchmarks run on more powerful data-center hardware, so the speed on a personal workstation is still unverified, but the team's design target is that a home GPU is enough.
Most image-to-video work today happens below 2K because cloud generation at 2K gets expensive fast. If SwiftI2V holds up on personal hardware, the cost-per-second drops by an order of magnitude. That changes what small-budget projects can deliver without compromising on resolution. One caution: the license is not explicitly stated in the documentation yet. Treat the code as research-only until the team confirms it.
Until this week, building an AI video sequence without paying for a cloud service was not a real option for serious work. The open-source pieces existed, but they did not fit together. You could generate a strong reference image in one open model, then watch the open video models choke on it. You could get acceptable video output but no synchronized audio without paying for a third tool. The pipeline kept breaking at the seams. That changed this week.
With HiDream and SwiftI2V landing, the pieces now compose. HiDream generates the reference still. SwiftI2V or Wan2.2 turns that still into video. MOVA layers synchronized audio. The whole pipeline runs on a single high-end home workstation. No cloud subscriptions required for the core steps. All four pieces are downloadable today through community hubs like Hugging Face (the dominant hub for open-source AI models) and ComfyUI (a node-based interface popular with people running these models on their own machines).
Until now, building a serious open pipeline meant accepting gaps or stacking cloud subscriptions on top of them. The open stack is still not a like-for-like replacement for Veo 3.1 plus Seedance plus a paid music tool. It is close enough that the routing question is legitimately open per shot: which of my current cloud subscriptions am I keeping because they are genuinely better, and which am I keeping because I have not tested the alternative? That question used to have an obvious answer. It no longer does.
If you already run AI tools locally, pick one paid image-to-video shot from your current project. Generate the same shot through HiDream plus Wan2.2 on your home workstation. Compare against what your paid pipeline delivered. The exercise takes one hour.
The text-to-video arena moved again. Seedance leads. Veo is at three.

Image credit: arena.ai
The closed-source side of the arena shifted, too. The public text-to-video leaderboard at arena.ai now has Dreamina Seedance 2.0 (ByteDance) at #1, HappyHorse-1.0 (Alibaba) at #2, and Veo 3.1 (Google) at #3. Sora 2 Pro (OpenAI) is at #4. The leaderboard updated on May 12.
The gap between the top two and Veo at #3 is large enough that the audience meaningfully prefers the two Chinese models on blind comparison. HappyHorse is Alibaba's new entry and was not in the top five last month. There is no "Veo 4" anywhere on the leaderboard. There is no public Veo 4 release.
The trade press is still going to call Veo 3.1 the leading text-to-video model. That was true through April. It is no longer true by the blind-comparison data. Pull your trusted test prompt, run it through Seedance and HappyHorse, compare to your most recent Veo output, then route accordingly. If you see a "Veo 4" claim anywhere this week, check the source. One content farm already published a fabricated Veo 4 story this month.

Image credit: The Editorial
The Editorial published a six-week test of the paid AI video stack this month: 240 prompts across narrative, product, motion graphics, and text-to-video. The numbers are useful to have on your desk while reading the rest of this issue.
Runway Gen-3 Alpha Turbo cleared the test with an 85 percent success rate, the highest of the group, and is recommended for client and commercial work. Kling 2.0 is the budget pick at $19.99 per month. Sora 2 holds the premium photoreal slot at $199 per month. The test closed before any of this week's open-source releases shipped.
Six weeks of consistent benchmarking against 240 prompts is useful data. Read it as the floor on what the paid stack delivered through late April. The honest framing this week is that the gap between "what we tested last month" and "what shipped this week" is now measured in days. Re-run the same prompts against the open stack and see how the floor moves.
ACTION ITEM
What to do this week
If you already use local AI tools, this is worth testing now. Pick one shot you are currently paying a cloud subscription to generate. Keep it simple. Image-to-video at 720p or 1080p, under five seconds.
Spend an hour this weekend running the same shot through HiDream-O1-Image (for the reference frame) and Wan2.2 (for the image-to-video pass). Both are free and permissively licensed. Both require a very powerful consumer graphics card, roughly RTX 4090 class.
Compare the output to what your paid stack delivered. Three honest possible outcomes. The paid version is clearly better, and you keep paying. The open version is comparable, and you now have a real routing decision per shot. The open version is better, and you stop paying for that line item.
For everyone else, the practical takeaway is that downloadable alternatives to paid cloud video tools just got meaningfully more serious. You do not need to download anything this weekend to act on this. You just need to be ready for the conversation when collaborators or a studio start asking why a project is still on cloud bills when an open alternative now works. That conversation is on the calendar for working filmmakers within the next six to twelve months. Know the names. HiDream-O1-Image. SwiftI2V. Wan2.2. MOVA. Those are the four to watch when the cost question gets serious.
This is a one-hour test for the technically inclined, and a vocabulary update for everyone else. Both paths matter this week.
SHORT TAKES
Peter Jackson said AI is just a special effect, and consent is the line that matters. In a May 13 No Film School interview, Jackson said AI likenesses are acceptable only with permission, calling unauthorized use “stolen and usurped.” He also rejected comparisons between AI generation and Andy Serkis’s motion-capture work. For most filmmakers, the real issue is not the tool. It’s consent.
Jack Antonoff called AI-art creators "godless whores" in a public broadside this week. In an Instagram post promoting the new Bleachers album, Antonoff blasted AI-made art as a shortcut that cheapens the “ancient ritual” of creating. Put beside Peter Jackson’s more pragmatic “AI is just a special effect” view, the divide sharpens: music is rejecting AI more loudly, while film is approaching it with more nuance.
OpenAI shipped real-time voice intelligence features in its API on May 7, positioned as infrastructure for voice-interactive creator tools. Voice has been the missing piece in most creator stacks. The upgrade will start to matter once the first creator app builds a noticeable workflow on top of it.
Elon Musk's lawsuit against OpenAI is pushing the company's safety record into open court. The dispute centers on whether the for-profit subsidiary structure compromises OpenAI's founding mission. The outcome affects which labs creators can rely on long-term. It does not change what filmmakers can do this week.
Enterprise AI investment keeps consolidating into billion-dollar deals, with SAP's recent $1 billion German AI startup acquisition cited as the headline example. Enterprise wins fund the infrastructure that eventually powers creative tools. The flywheel is still spinning.
ONE MORE THING…
Today’s Video
"What Professional VFX Artists Think About AI-Generated Video Quality" – Corridor Crew
Corridor Crew, the practical VFX team that has spent the past year stress-testing AI-generated video on their channel, dropped an episode on May 9 walking through what working VFX artists actually think about the current state of AI video output. The framing maps directly to this week's debate. Paid versus open. Is the output good enough to ship. Where the seams still show.
The reason this video belongs in your queue is the source. Corridor Crew are not AI evangelists, and they are not AI skeptics either. They are practitioners running their commercial work through the same tools the rest of us pay for, and they tell you on camera what works, what does not, and what they would not yet hand to a client. That is the closest thing to an honest practitioner panel the AI video space has right now.
Watch it the way you would watch dailies on someone else's project. You will pick up framing for your own next round of tests.
ESSENTIAL TOOLS
AI Filmmaking & Content Creation Tools Database
Unreal Engine 5.8 Preview + MetaHuman 5.8 — virtual production stack just leveled up
Epic’s Unreal Engine 5.8 Preview, released May 12, arrives with a substantial leap forward in scale and some of the most filmmaker-relevant virtual production upgrades in years.
The headline is scale. MetaHuman Collections and the new MetaHuman Crowds plugin make it possible to populate real-time environments with thousands of believable digital characters, not just a handful. Mesh-to-MetaHuman can now turn virtually any head or body mesh into a complete MetaHuman character, while MetaHuman Animator finally expands to macOS and Linux. Creators can also preview characters under custom lighting environments before committing textures, which should make look development far more practical.
On the Unreal side, UE 5.8 introduces experimental Mesh Terrain for building massive 3D environments, brings MegaLights to production-ready status, and continues pushing character and scene simulation forward with Control Rig Physics, Direct Mesh Controls, and a Nanite-ready Procedural Vegetation Editor.
For filmmakers building virtual production, previs, digital humans, or AI-assisted worldbuilding pipelines, this feels less like a routine update and more like a glimpse of where real-time filmmaking is headed. It is still a preview release, not production-stable software, but the direction is clear.
Check out our AI Tools Database. We will be adding to it on a regular basis. Got a tip about a great new tool? Send it along to us at: [email protected]
FINAL THOUGHTS
What "free" actually costs
A lot of the conversation around open-source AI tools treats "free" as if it solves the cost problem. It does not. It moves the cost.
What you save in subscription fees, you spend on GPU time, on setup, on debugging research-grade code, and on babysitting a model that does not have a customer support team behind it. The trade is real. The math is not the same for every creator.
The reason this week matters anyway is that the open-source stack just crossed the bar where it can be tested seriously against the paid stack. Before this month, the answer to "Should I run Wan instead of Veo" was almost always "Yes, but only for testing, not for delivery." After HiDream and SwiftI2V landed, the answer is "test it shot by shot and decide."
This isn’t about online AI video tools — your “cloud stack” — disappearing overnight. It’s about locally run and open-source tools — your “local stack” — becoming good enough that cloud-based tools are no longer the only serious option in a production workflow.
I came up cutting in rooms where the question was always whether to invest in the next workstation or pay the post house. The answer was usually some version of "do the math on your next twelve months." It is the same math now, just with a different vendor on the other side of it.
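Here is a minimal sketch of that twelve-month math in Python. Only the $199 per month figure comes from this issue (the Sora 2 price in The Editorial's test). The GPU price, power cost, setup hours, and hourly rate are placeholder assumptions, and the function names are made up for illustration. Swap in your own numbers before drawing any conclusion.

```python
def twelve_month_costs(subscription_per_month, gpu_price, power_per_month,
                       setup_hours, hourly_rate, months=12):
    """Return (cloud_total, local_total) over the given number of months."""
    cloud_total = subscription_per_month * months
    # Local cost = hardware up front + running cost + the value of your setup time.
    local_total = gpu_price + power_per_month * months + setup_hours * hourly_rate
    return cloud_total, local_total


def break_even_months(subscription_per_month, gpu_price, power_per_month,
                      setup_hours, hourly_rate):
    """Months until the local stack has paid for itself, or None if it never does."""
    monthly_saving = subscription_per_month - power_per_month
    if monthly_saving <= 0:
        return None  # running locally costs at least as much per month as the subscription
    upfront = gpu_price + setup_hours * hourly_rate
    return upfront / monthly_saving


if __name__ == "__main__":
    # Assumed inputs: only the 199 subscription figure is taken from this issue.
    inputs = dict(
        subscription_per_month=199,  # Sora 2 price quoted in The Editorial's test
        gpu_price=2000,              # assumption: a high-end consumer card
        power_per_month=15,          # assumption: extra electricity
        setup_hours=10,              # assumption: install, debug, first tests
        hourly_rate=50,              # assumption: what your time is worth
    )
    cloud, local = twelve_month_costs(**inputs)
    print(f"12-month cloud cost: ${cloud}")
    print(f"12-month local cost: ${local}")
    months = break_even_months(**inputs)
    if months is None:
        print("Local never breaks even on these numbers")
    else:
        print(f"Break-even after about {months:.1f} months")
```

The setup-hours line is the one most people leave out, and it is the one that moves the answer the most. If you already own the graphics card, set the GPU price to zero and the picture changes quickly.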
Stay sharp. Keep creating.
— Larry
What did you think of today's newsletter?
If you have specific feedback or anything interesting you’d like to share, please let us know by replying to this email.
AIography may earn a commission for products purchased through some links in this newsletter. This doesn't affect our editorial independence or influence our recommendations—we're just keeping the AI lights on!






