
Lights, Camera, Algorithm: I Gave 3 AI Tools the Same Input and Got 3 Different Films

  • Writer: S B
  • Jul 28
  • 6 min read
Image: A stylized 3D-rendered camel with coral-pink fuzzy fur and golden metallic details on its face, humps, and hooves, standing against a purple gradient background.
Created by Sophia Banton in collaboration with Google Whisk. The surreal coral camel with golden accents served as source material for all three AI video generation tests, showcasing the distinctive aesthetic each tool would interpret differently.

Published in Towards AI


Among GenAI tools, few are as fascinating to AI enthusiasts, and to those still warming up to AI, as video generation. Image generation is impressive, but video models turn ideas into animated reality, transforming static concepts into moving visuals that can tell stories, demonstrate products, or simply bring imagination to life.


AI video generation has unlocked a new hobby for me: I now create AI short films and even enter AI film festivals, despite having no formal training in the arts. The creative possibilities are endless when you can turn any concept into motion.


Within this hobby I have tested various models on the market: Runway first, then Google Veo 3 when it launched, and then MidJourney V1 when it was released. These are the three AI video models compared here.

But first, a brief overview of how AI video generation works. Today's video models achieve animation via one of two fundamental methods: text-to-video or image-to-video. Text-to-video starts with a text prompt and animates it, while image-to-video starts with an image and animates elements within that scene. Both methods produce comparable results.
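To make that distinction concrete, here is a minimal sketch of what the two modes look like in code. Everything in it is a hypothetical placeholder rather than any specific vendor's API: the client, model, and parameter names are invented for illustration. The point is simply that in text-to-video the prompt invents the entire scene, while in image-to-video a source image locks the look and the prompt only describes the motion.

```python
# Illustrative sketch only: "VideoClient", "generate", and every parameter
# below are hypothetical placeholders, not any real vendor's SDK.
from hypothetical_video_sdk import VideoClient

client = VideoClient(api_key="YOUR_API_KEY")

# Text-to-video: the model invents the entire scene from the prompt alone.
text_clip = client.generate(
    mode="text-to-video",
    prompt="A coral-pink camel with golden hooves walks across a purple desert at dusk",
    duration_seconds=8,
)

# Image-to-video: a source image fixes the look of the scene;
# the prompt only describes how things should move.
image_clip = client.generate(
    mode="image-to-video",
    image_path="surreal_camel.png",
    prompt="Slow push-in as wind ripples the camel's fur",
    duration_seconds=8,
)

text_clip.save("text_to_video.mp4")
image_clip.save("image_to_video.mp4")
```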



Storytelling with AI Film Tools


Telling a story means connecting a collection of video clips into a coherent narrative. In principle, all three models support this: you generate the clips, then assemble them in a video editor like CapCut or DaVinci Resolve. To demonstrate this, I created the same story with all three models: Runway, Veo 3, and MidJourney.


I used the same images for all three AI short films:

  • a surreal fox, camel, and eagle

  • a surreal prickly pear plant and a close-up of a surreal prickly pear plant

  • three surreal desert scenes, a desert cactus, and a surreal desert storm

  • a surreal oasis and a close-up of surreal water droplets


Video clips of each image were generated using the same animation instructions. The films were then assembled by organizing the clips in DaVinci Resolve, with a soundtrack made in Suno AI. Below are the resulting AI short films:


MidJourney V1 film: 

Video generated using MidJourney V1 AI video model

Veo 3 film: 

Video generated using Google Veo 3 AI video model


Runway Gen-4 film: 

Video generated using Runway Gen-4 AI video model

Note: These are the full, unedited clips as generated by each AI tool. The variation in length reflects each platform's default settings: MidJourney creates 5-second clips, Veo 3 creates 8-second clips, and Runway was set to create 10-second clips. For fairness, the first generations were selected. Since MidJourney creates 4 variations at a time, the top-left option was consistently chosen.


These clips can be sped up or slowed down during editing based on the creator's preference, but the natural timing reflects each platform's default approach.

The distinctiveness of each film highlights how differently each AI video model interprets and reconstructs the natural world. But what do these differences actually tell us about the creative process?



Understanding the Results


The same images were used to create each short film, yet we ended up with three different videos. What this shows is that:

  • An artist's creative vision is heavily influenced by the AI tools they use

  • Some AI models are more obedient than others

  • Some AI models override user requests more than others

  • The artist's vision is either compromised or enhanced, depending on the tool


Looking at the results, Google Veo 3 was the most obedient to prompts in terms of both subject matter and camera movement. As a bonus, Veo 3 automatically added native sound to each clip: wind gusts and water splashes that perfectly matched the on-screen content. The audio generated by Veo 3 can also be uncoupled from the video; you don't have to use it. When you load the clips into DaVinci Resolve, the audio track is separate from the video track.


MidJourney followed instructions but took significant creative liberties, often introducing new camera angles or movements to enhance what was requested. This creative interpretation worked well for storytelling and makes it an excellent option for creators less familiar with cinematography terminology. Even my 7-year-old noticed, walking by and commenting, "MidJourney sure knows how to zoom and turn the camera!"


Runway demonstrated a more traditional filmmaking approach and was more obedient than MidJourney. It handled camera movements like pans and zooms effectively but struggled with complex animation requests, such as clouds moving rapidly through the sky. Its output feels more like traditionally filmed footage with stable camera work; the main challenge was creating fluid movement, as the motion often felt more restricted than in either Veo 3 or MidJourney.


Overall, all three models created stunning short films.


The choice isn't just technical; it's about matching the AI's personality to your project's mood. Runway's steady, quiet approach would work best for meditative, contemplative pieces. MidJourney's dynamic camera work would shine in high-energy, dramatic storytelling. Veo 3's detailed execution with environmental audio would deliver the most immersive experience.

These distinct personalities in how each AI interprets the same instructions reveal something important: in video generation, AI isn't just a tool; it's a co-creator.



Who is the Lead Artist?


This collaboration raises an important question: who's leading and who's assisting? Does it matter?


Consider this: no two people will have the same creative vision, and no two AI generations are identical. This gives us novelty in two dimensions: human creativity and AI interpretation. But in this partnership, whose vision ultimately dominates?

After creating these films, I believe we can draw a clear line. Technical execution belongs to the AI, while creative voice remains with the human. The human ultimately chooses which outputs to keep, decides when to iterate, and adapts the process until their vision is realized. The AI handles the complex technical orchestration of turning prompts into moving images, but the human provides the creative direction, makes the editorial decisions, and shapes the final narrative.


This dynamic holds true for image generation as well. When creating the source images for these films, the AI executes the technical rendering while the human guides the creative process through prompt engineering, selection, and iteration.

In this relationship, the human is the director and the AI is the exceptionally talented cinematographer. Each is essential, but with distinct roles in bringing the vision to life.


This dynamic has opened up real opportunities beyond just creative experimentation.



AI Filmmaking as a Viable Career


AI-specific film festivals are now receiving thousands of submissions, and I've entered two major ones: Runway AIFF and Reply. I even participated in the Runway Gen48 48-hour film contest, where creators have just two days to conceive, produce, and submit a complete film.


The quality emerging from these festivals is remarkable. AI films are starting to rival traditional filmmaking in both technical execution and storytelling sophistication. The recognition is real too; these festivals offer substantial cash prizes and career opportunities. This isn't just a novelty anymore; it's becoming a viable career path. The AI filmmaking community includes both traditional filmmakers incorporating AI tools and creators like me who use entirely AI-driven workflows.



Finding Your Voice in the Age of AI


Three tools, three different films, and three different stories. All three are proof of the potential of AI tools to democratize creative expression and unlock new forms of storytelling.


The goal in this new digital economy is to find a way to lead with your voice and your vision. How will people recognize your aesthetic and your brand? How can you create things that reflect your point of view the way the great artists and film directors did?


It starts with practice. It starts with patience. It starts with a willingness to try new tools and push them to their limits.

You can absolutely build your own visual identity with AI. It just won't be the result of one-shot, three-word prompts. It will take skillful and intentional prompting.


Join the Conversation


"AI is the tool, but the vision is human." — Sophia B.


👉 For weekly insights on navigating our AI-driven world, subscribe to AI & Me:

 

 


Let’s Connect

I’m exploring how generative AI is reshaping storytelling, science, and art — especially for those of us outside traditional creative industries.


 


About the Author


Sophia Banton works at the intersection of AI strategy, communication, and human impact. With a background in bioinformatics, public health, and data science, she brings a grounded, cross-disciplinary perspective to the adoption of emerging technologies.


Beyond technical applications, she explores GenAI’s creative potential through storytelling and short-form video, using experimentation to understand how generative models are reshaping narrative, communication, and visual expression.
