Which is better? Midjourney v6 vs. DALL-E 3 vs. Stable Diffusion XL
TLDR
In this video, the host compares image generation results from three AI models: DALL-E 3, Stable Diffusion XL, and Midjourney version 6. The models are tested across five categories (cartoon images, photorealistic humans, architecture, seamless patterns, and logos) to determine which one best captures the essence of each prompt. The audience is encouraged to guess the model behind each image before the reveal, which highlights the strengths and distinctive style of each model.
Takeaways
- 🌟 The video compares image generation results from three AI models: DALL-E 3, Stable Diffusion XL, and Midjourney version 6.
- 📈 DALL-E 3 is available on the Plus plan within ChatGPT, while Midjourney requires a subscription plan and is accessed through Discord.
- 🎨 The models are tested across five categories: cartoon images, photorealistic humans, architecture, seamless patterns, and logos.
- 💡 Users are encouraged to guess which image corresponds to which model in the comments before watching the reveal.
- 🐙 The first category, cartoon images, features an underwater adventure with a cheerful octopus wearing a pirate hat.
- 🎭 The photorealistic human category focuses on generating an image of a street performer playing a saxophone in an urban setting.
- 🏰 For the architecture category, the prompt is to create an image of a Gothic Cathedral complex with detailed features and a surrounding medieval park.
- 🌸 The seamless patterns category involves creating a vintage floral wallpaper with hand-drawn flowers and leaves in pastel colors.
- ☕ The logo category challenges the models to design a logo for a gourmet coffee shop, incorporating a steaming coffee cup and coffee beans.
- 📊 The video concludes with a discussion on the strengths and weaknesses of each model and encourages viewers to suggest further comparisons and tests.
Q & A
What are the three image generation models compared in the video?
-The three image generation models compared in the video are DALL-E 3, Stable Diffusion XL, and Midjourney version 6.
How can one access DALL-E 3 for image generation?
-DALL-E 3 can be accessed through the Plus plan within ChatGPT.
What is the pricing like for Midjourney version 6?
-The basic subscription plan for Midjourney version 6 costs $10 per month, which allows for about 200 image generations.
Which categories did the video use to test the image generators?
-The video tested cartoon images, photorealistic humans, architecture, seamless patterns, and logos. It did not mention any other categories.
What was the prompt given for generating a cartoon image?
-The prompt for generating a cartoon image was 'underwater adventure'.
How many image generations can one get for every $10 spent on Midjourney version 6?
-For every $10 spent on the basic Midjourney version 6 plan, one can get approximately 200 image generations.
What was the common element in all the image prompts used in the video?
-The common element in all the image prompts was that they were designed to fit into one of the five chosen categories.
Which image generation model was considered the most photorealistic according to the video?
-According to the video, Midjourney version 6 was considered the most photorealistic, particularly for the photorealistic human image prompt.
How can one access Midjourney's image generator?
-To access Midjourney's image generator, one needs to subscribe to a plan and then join the Midjourney Discord server. The Midjourney bot can also be added to one's own server for image generation.
What was the general conclusion about the image generation models?
-The general conclusion was that there might not be a true winner as all models performed well, and the preference for a particular style or look would come down to personal choice.
How did the video compare the image generation results?
-The video compared the image generation results by using the same prompt for each model across five different categories and then evaluating and comparing the outputs based on the criteria set forth in the script.
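The blind "guess the model" format described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the host's actual workflow: the function names `blind_round` and `score_guesses` are hypothetical, and no images are generated here. Each category gets a shuffled answer key that maps the anonymous labels viewers see ("Image 1", "Image 2", "Image 3") to the model behind each one.

```python
import random

# Models and categories as named in the video summary.
MODELS = ["DALL-E 3", "Midjourney v6", "Stable Diffusion XL"]
CATEGORIES = [
    "cartoon images",
    "photorealistic humans",
    "architecture",
    "seamless patterns",
    "logos",
]


def blind_round(models, rng):
    """Shuffle the models and return an answer key for one category.

    The key maps the anonymous label shown to viewers to the model
    that (hypothetically) produced that image.
    """
    order = list(models)
    rng.shuffle(order)
    return {f"Image {i}": model for i, model in enumerate(order, start=1)}


def score_guesses(guesses, key):
    """Count how many labels a viewer matched to the correct model."""
    return sum(1 for label, model in key.items() if guesses.get(label) == model)


rng = random.Random(0)  # fixed seed so the shuffle is reproducible
answer_keys = {category: blind_round(MODELS, rng) for category in CATEGORIES}

for category, key in answer_keys.items():
    print(category, "->", list(key))  # only the anonymous labels are shown
```

Using the same prompt for every model within a category, and hiding the model names until the reveal, keeps a viewer's brand preference from influencing which image they pick.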
Outlines
🎨 Image Generation Models Comparison
This paragraph introduces a video comparing three major image generation models: DALL-E 3, Stable Diffusion XL, and Midjourney version 6. The video evaluates these models across five categories: cartoon images, photorealistic humans, architecture, seamless patterns, and logos. Each model is accessed through a different platform and requires its own purchase or subscription. The comparison is based on generating images from given prompts, and viewers are encouraged to guess which image corresponds to which model before the reveal.
🧜♂️ Underwater Adventure: Cartoon Image Comparison
In this section, the video script describes the first round of the comparison, focusing on generating cartoon images based on the prompt 'underwater adventure.' The images created by DALL-E 3, Midjourney version 6, and Stable Diffusion XL are shown, each with a unique interpretation of the prompt. The first image features a cheerful octopus with a pirate hat, surrounded by treasure chests and fish. The second image is more cartoony, with a pirate logo and more fish, while the third has a bubbly style with goggles on the octopus. The viewers are asked to guess which model produced each image before the reveal, which shows that the first image was Midjourney version 6, the second was DALL-E 3, and the third was Stable Diffusion XL.
🎷 Photorealistic Street Performer: Human Image Comparison
This paragraph details the second round of the image generation comparison, focusing on photorealistic human images. The prompt given was to generate an image of a middle-aged Black male street performer playing a saxophone. The first image shows a man wearing a cabby hat and playing the saxophone, with a busy city street in the background. The second image has a man with an unusual saxophone, and the third image features an older man in a toque, playing the saxophone correctly. The viewers are invited to guess the model for each image before the reveal, which indicates that the first image is DALL-E 3, the second is Midjourney version 6, and the third is Stable Diffusion XL.
🏰 Gothic Cathedral: Architectural Image Comparison
The paragraph discusses the third round of the comparison, which is about generating an image of a Gothic cathedral. The prompt includes detailed flying buttresses, pointed arches, stained glass windows, and a surrounding park. The first image is an isometric view showing the garden and buttresses, the second looks more like a photograph with a Gothic style, and the third image resembles a painting with a medieval style. The viewers are asked to identify the model for each image, and the reveal shows that the isometric image was generated by DALL-E 3, the photograph-style image by Midjourney version 6, and the painting-style image by Stable Diffusion XL.
🌸 Vintage Floral Wallpaper: Seamless Texture Comparison
This section of the script covers the fourth round, where the models are tasked with creating a seamless texture of a vintage floral wallpaper. The design should have hand-drawn flowers and leaves in pastel colors. The first image appears hand-drawn and potentially seamless, the second image seems more seamless, and the third image looks more AI-generated. The viewers are prompted to guess the model for each image. According to the reveal, the third image was mistakenly identified as Midjourney version 6, while the first two were correctly identified.
☕️ Gourmet Coffee Shop: Business Logo Comparison
The final round of the comparison involves creating a logo for a gourmet coffee shop. The prompt includes a steaming coffee cup with coffee beans and a cozy, inviting feel with warm tones. The first image attempts text but has spelling errors, the second image is more polished but contains incorrect words, and the third image focuses on the coffee and beans without text. The viewers are asked to choose their favorite and guess the models, with the reveal showing that the first image is DALL-E 3, the second is Midjourney version 6, and the third is Stable Diffusion XL. The video concludes by encouraging viewers to suggest further model tests and to use the new Midjourney version 6 in their Discord server.
Keywords
💡Image Generation
💡DALL-E 3
💡Stable Diffusion XL
💡Midjourney version 6
💡Cartoon Images
💡Photorealistic
💡Architecture
💡Seamless Patterns
💡Logos
💡Personal Preference
Highlights
The video compares image generation results between DALL-E 3, Stable Diffusion XL, and Midjourney version 6 across five categories.
DALL-E 3 is available on the Plus plan within ChatGPT.
Stable Diffusion XL is the newest model from Stable Diffusion and can be accessed through their API or DreamStudio.
Midjourney version 6 requires a subscription plan starting at $10 per month for basic access and 200 image generations.
The categories tested are cartoon images, photorealistic humans, architecture, seamless patterns, and logos.
The video uses a single prompt for each category to test the models' abilities.
The first category, cartoon images, features an underwater adventure with a cheerful octopus wearing a pirate hat.
Midjourney version 6's image of the octopus was judged to follow the prompt most closely in the cartoon category.
In the photorealistic human category, the prompt was to generate an image of a street performer playing a saxophone.
Midjourney version 6 was praised for its photorealistic portrayal of the saxophone player, standing out with light glares and a well-muted background.
The architecture category tested the models with a prompt to create an image of a Gothic Cathedral complex.
DALL-E 3 produced an isometric view of the Gothic Cathedral, while Midjourney version 6's image resembled a photograph.
Stable Diffusion XL's approach to the Gothic Cathedral was more like a painting, with a focus on the church and less on the surroundings.
Seamless textures were the subject of the fourth category, with a vintage floral wallpaper prompt.
The video notes that Midjourney has a feature for creating seamless textures, but it was not used, so that Midjourney would have no advantage in this test.
The final category, business logos, tasked the models with illustrating a logo for a gourmet coffee shop.
The video concludes with a comparison of the logo designs, highlighting the different approaches each model took to the prompt.
A DALL-E 2 generation is showcased for historical context, showing the significant advancements made by the newer models.
The video encourages viewers to suggest different prompts and image types for future comparisons and tests.
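As a rough worked example of the Midjourney pricing mentioned in the highlights ($10 per month for about 200 image generations on the basic plan), the cost per image can be estimated as follows. The helper name `cost_per_image` is hypothetical, and the figures are only the approximate ones quoted in this summary.

```python
def cost_per_image(monthly_price_usd, images_per_month):
    """Approximate cost of one generated image on a flat monthly plan."""
    if images_per_month <= 0:
        raise ValueError("images_per_month must be positive")
    return monthly_price_usd / images_per_month


# Figures quoted in the summary: $10 per month, about 200 generations.
basic_plan = cost_per_image(10, 200)
print(f"${basic_plan:.2f} per image")  # prints "$0.05 per image"
```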