I Ran Stable Diffusion 3 Prompts in Midjourney | SD3 vs. Midjourney Prompt Battle

Lexie AI
31 Mar 2024 03:50

TLDR: The video compares images generated by Stable Diffusion 3 (SD3) and Midjourney (MJ) from the same set of prompts. It highlights the strengths and weaknesses of each model in rendering details like weapons, environments, and characters, ultimately favoring SD3 for its more accurate and creative outputs. The video concludes with a call to action for viewers to like and subscribe for more content.

Takeaways

  • 🎨 The video compares Stable Diffusion 3 and Midjourney, two AI art generation models.
  • 🏹 In the first prompt, an Elven Ranger, Stable Diffusion 3's image was missing the bow, while Midjourney version 6 had an arrow going through the elf's thumb.
  • 🦙 The 'llama kid' prompt resulted in a cute child riding a llama in a desert for Stable Diffusion 3, whereas Midjourney struggled with the setting and the depiction of the child.
  • 👮‍♂️ The 'Alien banana cop' prompt saw Stable Diffusion 3 create a Xenomorph police officer with bananas, while Midjourney's interpretation lacked the cop element.
  • 👩‍🎤 The 'Let's go girl' anime-style girl prompt was well-executed by Stable Diffusion 3, with Midjourney improving but still lagging in text rendering.
  • 🐔 The final 'stack of um animals' prompt yielded a creative image from Stable Diffusion 3 that had trouble with the dog and the animals below it, whereas Midjourney produced a more whimsical and amusing result.
  • 🏆 Stable Diffusion 3 won three of the five matchups, showcasing its ability to handle complex prompts and generate more accurate images.
  • 📈 The video highlights the ongoing development and improvement in AI art generation, with both models having their strengths and weaknesses.
  • 🎭 The importance of prompt clarity and specificity is emphasized, as it affects the quality and accuracy of the generated images.
  • 👍 The video encourages viewers to like and subscribe for more content related to AI-generated art and technology.
  • 🌐 The video script provides insights into the current state of AI art generation and its potential for creating diverse and imaginative content.

Q & A

  • What is the main topic of the video transcript?

    -The main topic of the video transcript is a comparison between the results produced by Stable Diffusion 3 (SD3) and Midjourney (MJ), two AI image generation models, based on various prompts.

  • Why is the Stable Diffusion 3 model not widely available to the public?

    -The video transcript does not give a specific reason why Stable Diffusion 3 is not broadly available to the public. However, it mentions that the creator of the video used prompts from someone who has early preview access, implying that the model might be in a limited release or testing phase.

  • What was the first prompt used in the video for comparison?

    -The first prompt used in the video for comparison was 'Faceoff badass Elven Archer', which called for an image of an Elven Ranger with braided platinum hair, a rune-etched bow, glowing eyes, and aiming at a roaring Dragon.

  • What was the issue with the Stable Diffusion 3 image for the 'Faceoff badass Elven Archer' prompt?

    -The issue with the Stable Diffusion 3 image for the 'Faceoff badass Elven Archer' prompt was that the bow was missing, and one of the arrows was replaced by the elf's middle finger.

  • Which model performed better for the 'Faceoff badass Elven Archer' prompt and why?

    -Midjourney version 6 performed better for the 'Faceoff badass Elven Archer' prompt because it got most of the elements right, despite the arrow going through the elf's thumb, which was considered a minor issue compared to the missing bow in the SD3 image.

  • What was the second prompt used in the video for comparison?

    -The second prompt used in the video for comparison was 'Llama kid', which described a digital art picture of a child riding a llama with a bell on its tail through a desert.

  • How did the models handle the 'Llama kid' prompt differently?

    -Stable Diffusion 3 produced an image that was considered adorable and mostly accurate to the prompt, except for the misplaced bell. Midjourney, on the other hand, struggled to accurately depict a child and a desert, and the placement of the bell was also considered odd.

  • Which model won the 'Llama kid' prompt and why?

    -Stable Diffusion 3 won the 'Llama kid' prompt because it provided a more accurate representation of the prompt, despite some minor issues, compared to Midjourney's image, which had more significant deviations from the prompt.

  • What was the third prompt used in the video for comparison?

    -The third prompt used in the video for comparison was 'Alien banana cop', which called for an image of a Xenomorph police officer enjoying a banana during the golden hour in Hawaii.

  • How did the models interpret the 'Alien banana cop' prompt?

    -Stable Diffusion 3 created an image with three unpeeled bananas, implying a humorous or terrifying interpretation of how aliens might eat bananas. Midjourney's interpretation of the prompt resulted in an extraterrestrial monster that didn't look like a cop but was enjoying the golden hour in Hawaii.

  • What was the final verdict for the 'Alien banana cop' prompt?

    -Stable Diffusion 3 was deemed to have taken the lead from Midjourney with the 'Alien banana cop' prompt, as it provided a more accurate and creative interpretation of the prompt.

  • What was the last prompt used in the video for comparison and why was it considered the wildest?

    -The last prompt used in the video was 'Stack of um animals', which described a rooster standing on a cat, which is standing on a dog, which is standing on a mule, which is standing on a turtle. It was considered the wildest because of the absurdity and creativity of the prompt, and the resulting images from both models were quite amusing and unexpected.

  • Which model won the 'Stack of um animals' prompt and why?

    -Midjourney won the 'Stack of um animals' prompt because it produced an image that was more accurate and visually appealing compared to Stable Diffusion 3, which had some trouble with the depiction of the animals, particularly the dog and the mule.

Outlines

00:00

🎨 Stable Diffusion 3 Art Comparison

The paragraph discusses a comparison between the outputs of Stable Diffusion 3 and Midjourney, two AI art generation tools. The author presents a series of prompts and evaluates the resulting images from both systems. The first prompt involves an Elven Ranger, where Stable Diffusion 3 produces a visually impressive image but misses a critical detail: the bow. Midjourney's version has the bow but includes an arrow mistakenly passing through the elf's thumb. The second prompt describes a child riding a llama, with Stable Diffusion 3's image being more accurate and detailed, while Midjourney's interpretation lacks the desert setting and corrects the placement of the bell on the llama. The third prompt, 'alien banana cop,' results in a creative and humorous image from Stable Diffusion 3, whereas Midjourney's rendition misses the 'cop' aspect. The fourth prompt, 'let's go girl,' showcases Stable Diffusion 3's ability to accurately capture the text and theme, outperforming Midjourney. The final prompt, a stack of animals, sees Stable Diffusion 3 struggling with the dog and mule, while Midjourney creates a bizarre but amusing image. The paragraph concludes with a call to action for viewers to like and subscribe to the channel.

Keywords

💡Stable Diffusion 3

Stable Diffusion 3 is the AI image generation model at the center of the video; it generates images from textual prompts. It is a significant part of the video's theme, as it is the tool used to create the various images discussed. The video showcases the capabilities of Stable Diffusion 3 by comparing its outputs with those of another AI tool, Midjourney version 6.

💡prompts

In the context of the video, prompts are textual descriptions or requests that are used to guide AI imaging technologies like Stable Diffusion 3 to generate specific images. They are essential to the video's content as they are the starting point for the image creation process and serve as a basis for comparison between the two AI tools.

💡Midjourney version 6

Midjourney version 6 is the other AI imaging technology in the video, compared against Stable Diffusion 3. It is used to generate images based on textual prompts and is portrayed as a competitor to Stable Diffusion 3 in the video's narrative.

💡roaring Dragon

The term 'roaring Dragon' refers to a specific element of the textual prompt used for the 'Faceoff badass Elven Archer' matchup. It is part of the video's theme as it illustrates the level of detail and creativity required in prompts to generate accurate and impressive AI images.

💡rune-etched bow

A 'rune-etched bow' is a specific detail mentioned in the prompt for the 'Faceoff badass Elven Archer'. It refers to a bow that has magical or ancient symbols (runes) engraved on it, adding a fantasy element to the image. This term is significant as it highlights the importance of detailed descriptions in prompts for creating vivid and thematic AI-generated images.

💡digital art

Digital art refers to the creation of artistic compositions or designs using digital technology, often involving software and other tools to produce images. In the video, digital art is the end product generated by AI technologies like Stable Diffusion 3 and Midjourney version 6 based on textual prompts.

💡Xenomorph

Xenomorph is a term used in the video to describe a fictional alien creature from the movie 'Alien'. In the context of the video, it refers to the prompt for an AI-generated image of an alien police officer enjoying a banana, which is a creative and unusual request that showcases the versatility of AI imaging technologies.

💡anime style

Anime style refers to a specific form of artistic design that originated in Japan, characterized by colorful artwork, fantastical themes, and vibrant characters. In the video, 'anime style' is used as a descriptor in the prompt for an image featuring a girl with white hair and red eyes, indicating the expected aesthetic of the generated image.

💡speech bubble

A speech bubble is a graphical element often used in comics, cartoons, and other forms of visual storytelling to indicate spoken words or thoughts. In the video, the term is used to describe a component of the 'let's go girl' prompt, where the anime-style girl is depicted with a speech bubble containing the text 'let's go together on the live stage'.

💡stack of um animals

The phrase 'stack of um animals' refers to a humorous and whimsical prompt in the video in which a series of animals are described as standing on top of each other, such as a rooster on a cat, a cat on a dog, and so on. This concept is used to evaluate the AI technologies' ability to interpret and create complex and unconventional imagery.

💡rooster

A rooster is a male chicken, often characterized by its distinctive crowing. In the video, the term 'rooster' is used in the context of the 'stack of um animals' prompt, where a rooster is described as standing on a cat, which is part of the whimsical and challenging image that the AI technologies are tasked to generate.

Highlights

Introduction of Stable Diffusion 3 and its current unavailability to the public.

The use of prompts from an individual with early preview access to Stable Diffusion 3.

A comparison between Stable Diffusion 3 and Midjourney version 6 in generating images from prompts.

The description of the first prompt involving an Elven Ranger with a unique detail involving the bow and arrow.

The judgment that Midjourney version 6 had a better interpretation of the Elven Ranger prompt despite a minor issue.

A humorous segment encouraging viewers to like the video.

The second prompt featuring a child riding a llama through a desert, with commentary on the accuracy of the setting.

The creative prompt of an alien banana cop and the interpretation by Stable Diffusion 3.

The contrast between Stable Diffusion 3's and Midjourney's interpretations of the alien banana cop, with a nod to Midjourney's improvement.

The anime-style girl prompt and the detailed description of the generated image.

A critique of Midjourney's text generation capabilities in comparison to Stable Diffusion 3.

The final prompt involving a stack of animals and the creative challenge it presented.

Stable Diffusion 3's struggle with the stack of animals prompt, particularly with the dog and mule.

Midjourney's surprising success with the stack of animals, despite initial skepticism.

A call to action for viewers to subscribe to the channel, which is new and appreciates support.

A mention of additional generated content available at Open AI and a teaser for future developments.