Stable Diffusion & Midjourney: Full Review & Comparison!🚀🌟

AI Samson
28 Nov 2022 · 05:42

TLDR: In this comparison, AI-generated images from Midjourney and Stable Diffusion are evaluated across themes such as portraits, landscapes, and fantasy scenes. Midjourney is praised for its narrative depth, anatomical accuracy, and aesthetic appeal, especially its melancholic undertones. Stable Diffusion shows progress in areas such as landscapes but is criticized for less coherent, more generic outputs, particularly in anatomy and consistency.

Takeaways

  • 🌌 For the "dream of a distant galaxy" prompt, Midjourney creates a more narrative-driven piece that includes characters and context.
  • 💏 In the portrait of an elegant fantasy couple, Midjourney shows better consistency in facial features and anatomy than Stable Diffusion.
  • 👩 Midjourney's tired woman in a Valentino gown has a more engaging composition and feeling, despite her tiny hands.
  • 🤖 Stable Diffusion's output tends to be more abstract and less coherent, especially in its portrayal of hands.
  • 👸 Midjourney's fantasy cyberpunk princess has remarkable abs and a symmetrical background, while Stable Diffusion's version lacks detail and intricate composition.
  • 🌟 Midjourney's portrayal of celebrities such as Timothée Chalamet shows a greater likeness, even though it uses an older dataset.
  • 🦁 In the lion stock photo comparison, Stable Diffusion performs close to Midjourney, suggesting it is catching up in certain areas.
  • 🎨 Stable Diffusion's images are often more generic and lack Midjourney's aesthetic refinement, resulting in more immature and rudimentary output.
  • 🌊 Midjourney's landscape compositions, such as an Icelandic beach, are more engaging and aesthetically pleasing than Stable Diffusion's.
  • 🖌️ The melancholic feel of Midjourney's creations resonates deeply with human emotions and the exploration of our inner shadows.

Q & A

  • What was the main purpose of the comparison between Midjourney and Stable Diffusion in the video?

    - The main purpose was to evaluate and compare how both AI systems generate images from the same prompts, covering a range of themes from portraits to landscapes.

  • How did the character depiction in the 'dream of a distant galaxy' differ between Midjourney and Stable Diffusion?

    - Midjourney included a character gazing into a space odyssey, which gives the image a narrative, while Stable Diffusion produced a more garish and less coherent image.

  • What was observed about the consistency in facial features and anatomy in the 'elegant fantasy couple kissing'?

    - Midjourney showed better consistency in facial features, bodies, and anatomy, whereas Stable Diffusion's output was less detailed and had inaccurate hand depictions.

  • What was the main critique about the composition and feeling of the 'tired woman in a Valentino gown' in the roadside diner scene?

    - The composition and feeling of the scene were more engaging in Midjourney's output, whereas Stable Diffusion produced a more abstract image with less realistic hand depictions.

  • How did the 'fantasy cyberpunk princess' image compare between the two AI systems?

    - Midjourney's image had more intricate details, better anatomy, and leading lines that directed the viewer's gaze effectively, while Stable Diffusion's version was less detailed and had failing anatomy.

  • What observation was made about the portrayal of the celebrity Timothée Chalamet in the AI-generated images?

    - Midjourney's output showed a greater likeness to Timothée Chalamet, despite using an older dataset. Stable Diffusion still managed a resemblance, but with a more boyish appearance.

  • In the comparison of the lion stock photo, how did the two AI systems perform?

    - Stable Diffusion came close to Midjourney in this instance, producing an image that could easily be mistaken for a real photo, which shows its capability in generating realistic images.

  • What is the general critique about Stable Diffusion's output in terms of aesthetics and maturity?

    - Stable Diffusion's output is considered more rudimentary, immature, and lacking an aesthetic eye, often producing generic images similar to those found on stock photo sites.

  • How does Midjourney's approach to image creation differ in terms of style and emotional depth?

    - Midjourney tends to create images with a melancholic feel, which adds emotional depth and engages more deeply with the viewer's emotions and the culture's shadows.

  • What was the final verdict regarding the use of Midjourney and Stable Diffusion for the speaker's work?

    - The speaker prefers to continue using Midjourney for their work because of its superior performance in anatomy, consistency, and overall aesthetic appeal.

  • Who is the speaker in the video and what is their final comment?

    - The speaker is Samson Bowles, who signs off on a positive note, wishing the audience a delightful day.

Outlines

00:00

🎨 Artistic Comparison of AI-Generated Images

This section presents a detailed comparison between two AI art generation models, Midjourney and Stable Diffusion, based on various prompts. The comparison covers different themes, from portraits to landscapes, and evaluates the outputs in terms of narrative, coherence, anatomy, and aesthetic appeal. Midjourney is praised for its consistency in facial features and anatomy, which produces more engaging and easily identifiable compositions. Stable Diffusion, while improving in some areas, is criticized for its less detailed and sometimes garish outputs. The discussion also touches on the removal of nudity and celebrities from Stable Diffusion's dataset and how it affects the generation of certain images, such as a portrait of Timothée Chalamet. The section concludes with a commentary on the overall maturity and aesthetic taste of the outputs, highlighting Midjourney's tendency to evoke a melancholic feel in its images.

05:01

🏞️ Evaluation of AI Art in Landscapes and Stock Photos

In this section, the focus shifts to how the AI art generation models perform on landscapes and stock photos. While Stable Diffusion shows improvement in these areas, it is still not on par with Midjourney. The creator expresses a personal preference for Midjourney because of its better handling of anatomy and consistency, despite Stable Diffusion's advances in landscapes and still life. The section ends with the creator's intention to continue using Midjourney for their work and an invitation for the audience to share their thoughts and preferences. The creator, Samson Bowles, signs off on a positive note, indicating a continued exploration of AI-generated art.

Keywords

💡Midjourney

Midjourney is one of the two AI art generation models compared in the video. It is characterized by its ability to create images with a strong narrative and coherent composition. In the video, Midjourney is noted for producing more engaging and aesthetically pleasing images, as seen in the examples of the dreamy galaxy and the fantasy couple kissing.

💡Stable Diffusion

Stable Diffusion is the other AI art generation model discussed in the video. It is described as producing images that are sometimes less coherent and more garish than Midjourney's. Despite its shortcomings, Stable Diffusion is noted for its improvement in creating landscapes and stock photos, although it still lags behind Midjourney in overall taste and aesthetic appeal.

💡Narrative

In the context of the video, 'narrative' refers to the storytelling element present in the AI-generated images. A strong narrative means that the image tells a story or conveys a specific idea or emotion. The video suggests that Midjourney excels at creating images with a stronger narrative, as seen in the dreamy galaxy and the fantasy couple, where the characters and settings evoke a deeper story.

💡Consistency

Consistency in the video refers to the accuracy and uniformity with which elements such as facial features, anatomy, and overall composition are depicted in the AI-generated images. The video highlights that Midjourney produces images with greater consistency, particularly in rendering body parts like hands and fingers accurately.

💡Aesthetics

Aesthetics pertains to the artistic and visual appeal of the AI-generated images. The video suggests that Midjourney's images have a more refined and pleasing aesthetic, with better composition and a deeper emotional resonance. In contrast, Stable Diffusion's images are described as more rudimentary and generic, lacking the same level of taste and sophistication.

💡Anatomy

Anatomy in this context refers to the accurate and realistic depiction of human body parts in the AI-generated images. The video discusses the importance of anatomical accuracy, particularly in hands, as a marker of the quality of the AI models. Midjourney is praised for its improvements in anatomical consistency, while Stable Diffusion is criticized for its less accurate portrayals.

💡Celebrity

Celebrity in the video refers to the depiction of well-known public figures in the AI-generated images. The discussion around celebrity images touches on the impact of the removal of nudity and celebrities from Stable Diffusion's dataset. Despite this, there is still a passing likeness to celebrities like Timothée Chalamet, indicating that some residual data remains.

💡Dataset

The dataset is the collection of images and data that AI models like Midjourney and Stable Diffusion learn from in order to generate new images. The video discusses the impact of the dataset on the quality of the generated images, noting that Midjourney uses a dataset that is a few years old, while Stable Diffusion uses a more recent one yet still has issues with anatomy and consistency.

💡Emotional Resonance

Emotional resonance refers to the ability of the AI-generated images to evoke emotions or connect with the viewer on a deeper level. The video suggests that Midjourney's images often have a melancholic feel, which may resonate more with viewers by reflecting the darker aspects of human nature and culture.

💡Landscapes

Landscapes in the video refer to the depiction of natural or urban environments in the AI-generated images. The discussion highlights the strengths and weaknesses of both Midjourney and Stable Diffusion in creating landscape images: Midjourney is preferred for its more engaging compositions, while Stable Diffusion is catching up but is not yet at the same level.

💡Stock Photos

Stock photos are pre-existing images that can be licensed for various uses. In the video, stock photos serve as a benchmark for evaluating how well the AI models generate realistic and aesthetically pleasing images. The discussion suggests that while Stable Diffusion is improving in this area, there is still a gap between its output and the quality of Midjourney's images.

Highlights

Comparing Midjourney and Stable Diffusion outputs using identical prompts.

Midjourney's depiction of a distant galaxy includes a character with a compelling narrative.

Stable Diffusion's output for the galaxy prompt is described as garish and less coherent.

In the fantasy couple portrait, Midjourney demonstrates better consistency in facial features and anatomy.

Stable Diffusion's portrayal of the couple lacks the same level of detail and coherence.

The tired woman in a Valentino gown prompt shows Midjourney's stronger composition and engaging feel.

Stable Diffusion's version of the woman is more abstract, with hands likened to a trotter.

Midjourney's fantasy cyberpunk princess has remarkable abs and a well-balanced background.

Stable Diffusion's cyberpunk princess lacks detail and has anatomy issues.

The young Timothée Chalamet prompt shows that Midjourney's output retains a likeness despite using older data.

Stable Diffusion's output of Timothée Chalamet maintains some likeness despite the removal of celebrities from the dataset.

Stable Diffusion's stock photo of a lion is close in quality to Midjourney's, showcasing its improvement.

Stable Diffusion's images are considered more rudimentary and immature, lacking an aesthetic eye.

Midjourney's approach is described as more aesthetic and pleasing, often with a melancholic feel.

The Icelandic beach landscape demonstrates Midjourney's superior performance over Stable Diffusion.

Stable Diffusion shows progress in landscapes and still life, but regresses in anatomy and consistency.

The speaker, Samson Bowles, expresses a preference for Midjourney for his work.

The talk concludes with a reflection on the depth and exploration of shadows within Midjourney's outputs.