Stable Diffusion 3 vs ChatGPT Dalle-3 vs Midjourney [NEW Best Image Generator?]

AI Andy
3 Mar 2024 20:50

TLDR: The video presents a detailed comparison of three AI image generation models: Stable Diffusion 3, Midjourney, and DALL-E 3. Each model is scored on detail, adherence to the prompt, and 'coolness'. The video walks through a series of prompts, evaluating each model's output and discussing its strengths and weaknesses. The conclusion favors the style and quality of ChatGPT's DALL-E 3 over Midjourney, particularly in text generation and overall aesthetic appeal.

Takeaways

  • 📸 Comparison of three AI models (Stable Diffusion 3, Midjourney, and DALL-E 3) based on their ability to generate images from prompts.
  • 🎨 Evaluation criteria include detail, adherence to the prompt, and coolness factor of the generated images.
  • 🍎 The first prompt was a cinematic photo of a red apple in a classroom with a specific tagline on the blackboard.
  • 🚀 Stable Diffusion 3 was criticized for lacking in coolness but adhered well to the text prompt.
  • 🌟 Midjourney produced images with a higher coolness factor but less clarity of detail and realism.
  • 🖌️ DALL-E 3 demonstrated good clarity, detail, and a dramatic coolness factor in its images.
  • 👨‍🚀 The second prompt involved a painting of an astronaut on a pig with a unique scene description.
  • 🎨 All models had their strengths in interpreting this complex prompt, with DALL-E 3 standing out for its stylized and dramatic depiction.
  • 🦎 A close-up studio photograph of a chameleon was the third prompt, emphasizing detail in texture and color.
  • 🌈 Midjourney excelled in creating a visually appealing and detailed chameleon image, earning a high score for coolness.
  • 🖥️ The fourth prompt was a photo of a 90's desktop computer with specific text and background elements.
  • 🎮 DALL-E 3 provided a stylized and nostalgic image that captured the essence of the prompt effectively.
  • 🏎️ A later prompt for a sports car image with text on the side highlighted the strengths of Stable Diffusion 3 in text generation and style.
  • 🐎 A creative prompt involving a horse balancing on a ball showcased the ability of DALL-E 3 to produce a stylized and imaginative outcome.

Q & A

  • What were the three factors used to compare Stable Diffusion 3, Midjourney, and DALL-E 3?

    - The three factors were detail, adherence to the prompt, and coolness.

  • Which AI model was criticized for lacking in coolness according to the video?

    - Stable Diffusion 3 was criticized for lacking in the coolness factor.

  • In the comparison, which AI model was noted for having a higher coolness factor despite some adherence issues?

    - Midjourney was noted for having a higher coolness factor despite some adherence issues.

  • How did DALL-E 3 perform in terms of detail and coolness for the prompt involving a red apple on a table in a classroom?

    - DALL-E 3 performed well in terms of detail and coolness, showcasing nice typography and dramatic lighting.

  • Which AI model executed the prompt involving an astronaut riding a pig with perfect adherence and cool style?

    - Stable Diffusion 3 executed the astronaut riding a pig prompt with perfect adherence and a cool style.

  • For the prompt about three transparent glass bottles with colored liquids, how accurate was Midjourney with numbers and colors?

    - Midjourney struggled to represent the numbers and colors accurately for the three transparent glass bottles prompt.

  • Which AI model excelled in generating detailed images of animals, specifically a chameleon, according to the video?

    - Midjourney was praised for its ability to generate detailed images of animals, particularly excelling with the chameleon prompt.

  • How did the AI models compare in creating a photo of a 90's desktop computer with specific details mentioned in the prompt?

    - The AI models varied, with DALL-E 3 praised for creating a nostalgic vibe and including a cool SD3 sign, while Midjourney took a more graffiti-heavy, steampunk approach.

  • Which AI model was preferred for its style and text adherence in the video's concluding comparison?

    - The video concluded with a preference for the style and text adherence of ChatGPT's DALL-E 3.

  • Did the video suggest that Midjourney was the weakest in adhering to text-based prompts, and why?

    - Yes, the video suggested Midjourney was the weakest at adhering to text-based prompts, likely due to its focus on visual aesthetics over text accuracy.

Outlines

00:00

🎨 Comparison of AI Image Generation Models

This segment compares three AI image generation models: Stable Diffusion 3, Midjourney, and DALL-E 3. All three receive the same prompt: a cinematic photo of a red apple on a table in a classroom with the words 'Go big or go home' written on the blackboard. The models are ranked on detail, adherence to the prompt, and coolness factor. The speaker critiques the lack of coolness in Stable Diffusion 3 and compares it directly with Midjourney and DALL-E 3, highlighting the strengths and weaknesses of each model in terms of detail clarity, adherence to the prompt, and overall aesthetic appeal.

05:02

🚀 Evaluation of AI Models on Complex Prompts

This segment evaluates the AI models on more complex prompts, including a painting of an astronaut riding a pig, a close-up of a chameleon, and a 90's desktop computer. The speaker praises the adherence and style of Stable Diffusion 3 for the astronaut prompt and criticizes Midjourney's take on it for introducing street art elements. The chameleon prompt is well executed by all models, with Midjourney particularly excelling in animal depictions. The 90's desktop prompt showcases the nostalgic and detailed capabilities of the AI models, with DALL-E 3 standing out for its stylized and dramatic representation.

10:05

🌈 Analysis of AI Generated Images for Complicated Prompts

The speaker analyzes AI-generated images for more complicated prompts, such as transparent glass bottles with colored liquids and an embroidered cloth with a tiger. The models struggle with the sequence and color representation of the bottles, with DALL-E 3 providing a stylized and accurate depiction. The embroidered cloth prompt reveals that Stable Diffusion 3 excels in texture and detail, while Midjourney falls short in text generation. DALL-E 3's response is appreciated for its style and inclusion of fine details, such as the embroidery and the candle's color.

15:06

🏎️ AI Models' Performance on Dynamic and Fantasy Prompts

This segment evaluates the AI models' performance on dynamic and fantasy prompts, including a night photo of a sports car, a horse balancing on a ball, and an anime-style illustration. Stable Diffusion 3 is noted for its ability to handle text and motion blur effectively, while Midjourney struggles with physical accuracy. DALL-E 3 is praised for its stylized and creative interpretations, particularly for the horse on the ball and the anime newsstand, offering a more cinematic and engaging visual experience.

20:09

🌟 Final Thoughts on AI Image Generation Models

In the concluding segment, the speaker reflects on the strengths and weaknesses of the AI image generation models discussed. Stable Diffusion 3 is recognized for its text handling and adherence to prompts, while Midjourney is criticized for its lack of focus on text generation. DALL-E 3 is favored for its style and ability to create visually appealing images. The speaker expresses excitement for the potential of open-source models and ends the video by promoting a link where viewers can find their ideal ChatGPT prompt and by encouraging them to keep watching the video series.

Keywords

💡Stable Diffusion 3

Stable Diffusion 3 is a version of an AI model discussed in the video. It is one of the key technologies being compared based on factors like detail, adherence, and coolness. The video script indicates that while it may lack in the coolness factor, it performs well in terms of adherence to the prompt and detail clarity.

💡Midjourney

Midjourney is another AI image generation model that the video contrasts with Stable Diffusion 3 and DALL-E 3. It is noted for its higher coolness factor and quality in certain prompts, particularly with animals, but sometimes falls short in text adherence and realism.

💡DALL-E 3

DALL-E 3 is the third AI model compared in the video, accessed through ChatGPT. It is praised for its style and ability to create dramatic and stylized images. However, there are instances where it does not perform as expected, particularly with text generation.

💡Prompts

Prompts in this context refer to the specific instructions or descriptions given to the AI models to generate images. The video script evaluates how well each AI model adheres to the prompts and translates them into visual content.

💡Detail

Detail in the video script refers to the level of intricacy and clarity in the images generated by the AI models. It is one of the criteria used to evaluate and compare the models' performances.

💡Adherence

Adherence refers to how closely the AI models follow the instructions given in the prompts. It is a critical factor in assessing the models' ability to accurately generate images that match the user's request.

💡Coolness

Coolness in the context of the video script denotes the aesthetic appeal and stylistic elements of the AI-generated images. It is a subjective measure of how attractive or engaging the output is.

💡Text Generation

Text generation is the ability of the AI models to include and handle textual elements within the generated images. The video script discusses the strengths and weaknesses of each model in terms of text inclusion and accuracy.

💡Image Quality

Image quality refers to the overall visual clarity, detail, and aesthetic appeal of the images produced by the AI models. It encompasses factors like resolution, color accuracy, and the realism of the depiction.

💡Realness Factor

The realness factor indicates how lifelike and believable the AI-generated images appear. It is an important aspect when evaluating the models' ability to create convincing visual representations.

Highlights

Comparison of Stable Diffusion 3, Midjourney, and DALL-E 3 using the same prompt.

Ranking based on detail, adherence, and coolness factors.

Cinematic photo of a red apple in a classroom with the phrase 'go big or go home'.

Critique of Stable Diffusion 3 lacking in coolness factor.

Midjourney's apple image lacks detail clarity but has a higher coolness factor.

DALL-E 3's image with good clarity, detail, and dramatic lighting.

Second prompt featuring an astronaut riding a pig with a unique style.

Stable Diffusion 3's perfect execution and cool style for the astronaut prompt.

Midjourney's street art interpretation of the astronaut prompt.

DALL-E 3's creation of two images for the astronaut prompt with different styles.

Studio photograph of a chameleon with detailed scales and wrinkles.

Midjourney's animal depiction with perfect scales and motion blur.

DALL-E 3's stylized and dramatic photo of the chameleon.

Photo of a 90's desktop computer with nostalgic graffiti and the text 'sd3'.

Midjourney's steampunk-style interpretation of the 90's computer prompt.

DALL-E 3's retro UI and cool style for the 90's computer prompt.

Transparent glass bottles with red, blue, and green liquids on a wooden table.

Midjourney's incorrect order and color representation of the glass bottles.

DALL-E 3's correct order and stylized look for the glass bottles.

Embroidered cloth with 'good night' text and a baby tiger with a lit candle.

Midjourney's moody and cozy interpretation, but with adherence issues.

DALL-E 3's detailed and stylistically pleasing interpretation of the embroidered cloth.

Night photo of a sports car with 'sd3' text on the side and 'faster' text on a road sign.

Midjourney's high-quality interpretation with neon lights and text.

DALL-E 3's stylized composition with incorrect 'sd3' placement but a cool perspective.

Horse balancing on a colorful ball in a field with green grass and mountains.

Midjourney's aesthetically pleasing interpretation with unrealistic physics.

DALL-E 3's more realistic and stylized interpretation of the horse on the ball.

Anime-style illustration of a newsstand with a rainstorm in the background.

Midjourney's creative but off-target vending machine interpretation.

DALL-E 3's vibrant and stylistically rich interpretation of the newsstand.

Final verdict favoring ChatGPT's DALL-E 3 for its style and adherence.