When My Crayon Drawing Started Breathing: Notes on AI Image Generators and ai animate image

There’s a shoebox under my bed that contains, among other things, a crayon drawing of a house I made when I was seven. The house is violently purple, the sun is a jagged yellow circle with eyelashes, and there’s a stick-figure family floating several inches above the grass because I hadn’t figured out ground planes yet. My mom kept it for thirty years, and when she handed me the shoebox last Christmas, I laughed at how earnestly bad it was. Then I stuck it on my desk and forgot about it until a Tuesday night in April when I was procrastinating on actual work and decided to see what an image-to-image AI generator could do with it.

I’d been playing with image-to-image AI for a few weeks by then, mostly the usual stuff—turning selfies into Renaissance portraits, feeding in landscape photos and asking for Studio Ghibli backgrounds. It was fun but felt like a toy. I hadn’t yet had the moment where the technology genuinely rearranged something in my brain. Then I scanned that crayon drawing with my phone, uploaded it to an image-to-image tool, and typed the most literal prompt I could think of: “photorealistic house, golden hour, volumetric lighting, 35mm photograph.” I hit enter fully expecting a mangled mess of purple pixels and deformed stick figures.

What came back wasn’t a joke at all. The house was still purple, but now it was a weathered lavender clapboard with peeling paint and a real wooden porch. The sun had lost its eyelashes and become a warm, low-hanging orb casting long shadows across a lawn that hadn’t existed in the crayon version. The floating stick family had resolved into actual people—a woman in a yellow dress holding the hand of a small child in overalls, their faces blurred just enough to feel like a memory. The composition was exactly the same. The AI hadn’t invented a new image; it had treated my drawing as a blueprint, a set of spatial and conceptual instructions. The jagged yellow sun meant sun, upper left. The purple rectangle meant house, center. The stick figures meant family, foreground. It was the same picture, but it had grown up.

That’s what separates a true image-to-image generator from a style transfer filter or a simple upscaler. It doesn’t just alter the surface; it reads the image as a kind of semantic scaffolding and builds something new within it. The original stays present as a ghost. My crayon drawing was still there, if I squinted, in the placement of the chimney and the triangular roof and the impossible way the sun touched the grass. The AI had seen my seven-year-old intentions and made them real in a way my motor skills never could. I found it profoundly moving, in a way that felt almost embarrassing to admit. It was just a crayon house. But it was also thirty years of distance collapsed into a single frame.

The thing is, once you’ve seen a drawing become a photograph, you start wondering what else it can become. That’s how I fell into the video side of things. I’d heard rumblings about tools that could take a single image and generate a short video from it—not a slideshow, not a morphing transition, but actual motion: hair blowing, water rippling, leaves rustling. The term I kept seeing in forums was “AI Image to Video Generator,” usually typed out in full, as if people were still feeling the shape of the phrase in their mouths. I bookmarked a few of them and forgot, until the crayon house made me remember.

So I took that photorealistic rendering of my childhood drawing and uploaded it to an AI Image to Video Generator I’d found that had a free tier. The UI was bare-bones: drag an image, write an optional motion prompt, choose a duration. I typed “leaves rustling in the trees, curtains moving in the window, woman shifting her weight slightly, late afternoon light slowly fading,” and then I waited. When the six-second clip loaded, I genuinely had to set my coffee down. The curtains in the house’s upper window, which I hadn’t even consciously noticed in the still image, were billowing inward as if a breeze had just passed through. The tree beside the porch, which the image-to-image AI had added from nowhere, now had leaves that trembled with a soft, irregular rhythm. The woman in the yellow dress didn’t walk or wave; she just tilted her head a fraction of an inch, like she’d heard something inside the house. It was subtle to the point of being easy to miss, but it hit me in the chest. My crayon drawing was breathing.

I learned later that this specific flavor of generation is being called “ai animate image” in a lot of developer communities, a phrase that’s weirdly literal and kind of clumsy but absolutely accurate. It’s not “animate” in the cartoon sense of giving a character a walk cycle. It’s more like the AI is inferring the latent motion that the image’s contents imply: what a curtain would naturally do if the air pressure changed, what a human body does when it’s standing still but alive, micro-swaying, adjusting balance. The process takes the same semantic understanding that an image-to-image model uses to turn a crayon sun into a real one and extends it into the time domain. If the model knows that a window implies glass and a breeze implies motion, it can generate the missing frames between a static curtain and a gently moving one. It’s guesswork, educated guesswork, and when it works it feels indistinguishable from magic.

Of course, when it fails it’s a disaster, and I had plenty of disasters. The first time I tried to animate a photo of a family dinner, the animation model apparently decided that the lasagna on the table was a living organism and made it pulsate. Another attempt, with a photo of my friend mid-jump on a hiking trail, turned her hair into a Medusa-like tangle of independent snakes. In one truly cursed output, the AI Image to Video Generator interpreted the steam rising from a coffee mug as the mug itself melting upward into the air. I kept that one in a folder labeled “body horror” and showed it to my friend Dave, who laughed so hard he choked. This technology has no chill, and honestly, that’s part of why I love it. It’s trying something incredibly difficult, imagining a physically plausible future for every pixel in a flat image, and when it gets it wrong, it gets it entertainingly wrong.

Still, the successful clips have changed the way I think about old photos and drawings. I’ve started going through my camera roll with a new kind of attention, not just looking for images that are beautiful, but images that are full of implied motion. A photo of a lake isn’t just a lake anymore; it’s water that wants to lap against the shore. A picture of a friend laughing isn’t just a frozen expression; it’s a head that wants to tilt back, a set of shoulders that want to shake. The image-to-image generator gave me a way to translate my visual ideas across styles and levels of realism. The AI Image to Video Generator gave those translated ideas a pulse. And the phrase “ai animate image,” for all its awkwardness, describes exactly the bridge between the two: the act of recognizing that every still image is a paused moment in a longer story, and that a machine, if you ask it nicely, can press play.

I went back to the shoebox last week. There are dozens of drawings in there—a dragon with three heads, a lopsided birthday cake, a treehouse that defies physics. I’m working through them slowly, one by one, scanning them and feeding them through the image-to-image pipeline, then passing the best results into the video generator. The dragon now breathes a flickering plume of smoke. The treehouse sways slightly in an imagined wind, its rope ladder swinging. Each one feels less like a tech demo and more like a collaboration with my younger self, a kid who drew those things because he wanted to see them move and could only manage a single frame. Thirty years later, I can finally show him the rest of the clip.


Thomas Lore