If you’ve ever tried generating an animated video with AI, you already know the frustration. You type a prompt, something appears, and if the feet look wrong — which they almost always do — your only option is to try again. And again. And again.

That loop of prompt-and-pray is exactly what bothers Andrew Carr and Jonathan Jarvis. So they built Cartwheel, a 3D animation startup designed to give creators something most AI tools completely ignore: actual control.

Carr comes from OpenAI. Jarvis spent years as a creative director at Google. Together, they’re taking aim at what they call the “black box” problem of generative AI animation, where the machine makes all the real decisions and the human just watches.

Why 3D Motion Data Is So Hard to Find

Most AI models learned from mountains of text, images, audio, and video. That data is everywhere online. But 3D motion data? That’s a completely different story.

“We knew it was going to be hard,” Jarvis told me, “but it turns out to be harder than we thought by probably a factor of 10 or 100 to get that data.”

[Image: 2D video of backyard dancing translated into a precise 3D skeleton]

So instead of building on the same datasets as everyone else, Cartwheel spent years mapping how humans actually move. Their models study the nuances of real human performance. That means a simple 2D video of someone dancing in a backyard can be translated into a precise, realistic 3D skeleton.

That shift from flat video to 3D assets is what makes Cartwheel different. And it’s the key to giving animators the hands-on control that’s been missing in the AI era.

AI “Sameness” Is a Real Problem. Here’s Their Fix.

There’s a creeping issue with AI-generated content that most people haven’t named yet, but immediately recognize when they see it. Everything starts to look the same. Same style, same motion, same feeling.

Carr and Jarvis see that sameness as a direct result of too little control. When everyone feeds prompts into the same generator and watches it spit out finished video, the results naturally converge toward a kind of bland middle ground.

Cartwheel’s answer is to make the AI output a starting point, not a finished product.

“The output of our system is designed for people to edit,” Carr said. “It’s designed for people to touch and manipulate, and we don’t want someone to type something in and then have it shuffle through to a finished animation. That’s not the point of it. That’s boring — who’s going to watch that?”

[Image: Cartwheel’s editable 3D output compared with a black-box AI generator]

Because Cartwheel generates 3D data rather than flat video, a creator can swap characters, shift the camera, change lighting, or adjust a pose after the AI does its initial work. That editing flexibility also undercuts the sameness problem. Put the same motion on different characters, in different environments, with different timing, and suddenly nothing looks generic anymore.
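To make that concrete, here is a toy sketch in Python of why motion stored as data is easier to rework than rendered pixels. It is not Cartwheel’s format or API. The `MotionClip` and `Character` types, the two-joint leg, and the joint names `hip` and `knee` are all hypothetical simplifications invented for illustration. The idea is only that one clip of joint angles can be retimed and then played back on characters with different proportions.

```python
import math
from dataclasses import dataclass

# Hypothetical, simplified motion format: one angle in degrees per joint per frame.
Frame = dict[str, float]


@dataclass
class MotionClip:
    fps: int
    frames: list[Frame]


@dataclass
class Character:
    name: str
    thigh: float  # bone lengths in arbitrary units
    shin: float


def retime(clip: MotionClip, speed: float) -> MotionClip:
    """Resample a clip by linear interpolation; speed > 1 plays it faster."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    last = len(clip.frames) - 1
    n_out = max(1, round(last / speed)) + 1
    out = []
    for i in range(n_out):
        t = min(i * speed, last)  # position in the source clip
        lo = int(t)
        hi = min(lo + 1, last)
        w = t - lo
        out.append({
            joint: (1 - w) * clip.frames[lo][joint] + w * clip.frames[hi][joint]
            for joint in clip.frames[lo]
        })
    return MotionClip(clip.fps, out)


def foot_position(character: Character, frame: Frame) -> tuple[float, float]:
    """Planar forward kinematics for a two-bone leg with the hip at the origin.

    Angles are measured from straight down, so all-zero angles mean a straight leg.
    """
    hip = math.radians(frame["hip"])
    knee = math.radians(frame["knee"])
    knee_x = character.thigh * math.sin(hip)
    knee_y = -character.thigh * math.cos(hip)
    foot_x = knee_x + character.shin * math.sin(hip + knee)
    foot_y = knee_y - character.shin * math.cos(hip + knee)
    return foot_x, foot_y


# One made-up "step" motion, reused with new timing on two different characters.
step = MotionClip(fps=30, frames=[
    {"hip": 0.0, "knee": 0.0},
    {"hip": 30.0, "knee": -20.0},
    {"hip": 0.0, "knee": 0.0},
])
slow_step = retime(step, speed=0.5)

for character in (Character("giant", 1.0, 1.0), Character("child", 0.4, 0.4)):
    path = [foot_position(character, frame) for frame in slow_step.frames]
    print(character.name, [(round(x, 2), round(y, 2)) for x, y in path])
```

The same three source frames produce a different foot path for each character and a different pacing after retiming. A finished video offers no comparable handle, because the motion is already baked into the pixels.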

Open-Ended Storytelling and the Future of Animation

Beyond faster production, Cartwheel is chasing something more ambitious. The founders call it “open-ended storytelling” or “open-ended world-building,” and it’s aimed squarely at where content demand is heading.

Gaming and social media now require animation at a scale that manual production simply can’t match. Think about how many unique character interactions happen in a single multiplayer game session, or how much short-form video a single creator needs to stay relevant.

Cartwheel’s vision is characters that don’t just run through a fixed set of programmed moves. Instead, they’re powered by motion models that let them react and perform in real time. It’s less like choreographing every frame and more like “rehearsing” with a digital actor that understands the intent of a scene.

The goal, as both founders describe it, is bridging the gap between 2D vision and 3D execution.

[Image: The same 3D motion applied to different characters, breaking the AI sameness problem]

“One of the core hypotheses that we hope is true in the next three years for Cartwheel,” Carr said, “is that everyone will work in 3D even if it’s authored in 2D, even if the final output is just 2D video.”

The Layer Below the Pixels

What Cartwheel is really selling isn’t just faster animation. It’s a philosophy about who should stay in charge of creative work.

The machine handles the biomechanics, the file formats, the technical drudgery that slows every animator down. But the human keeps final say over taste, timing, and the emotional heart of the story.

That’s a meaningfully different pitch from the one most AI tools make right now. Most tools promise to do everything for you. Cartwheel is promising to handle the hard parts while staying out of your way.

Whether that vision lands with professional animators and indie creators alike remains to be seen. But the founders’ combined backgrounds — deep technical research from OpenAI, creative leadership from Google — suggest they understand both sides of the problem better than most.

For anyone tired of watching AI animation tools spit out wonky feet and identical aesthetics, Cartwheel’s approach feels like a genuinely fresh direction. The technology is designed to start the conversation, not end it.