Counterfeit for SDXL and Negative Embeddings Are Out! [Stable Diffusion]

AI is in wonderland
29 Jul 2023 07:12

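TLDR: Alice from AI is in wonderland introduces CounterfeitXL, a model that supports SDXL. Working with this large model (about 7GB), she tests the effects of three Negative Embeddings (Standard, Realistic, Anime-like). Comfy UI is noted for generating cleaner images faster, and upscaling results with Real ESRGAN 4x and Anime6B are shown. She evaluates the influence of each Negative Embedding, then adds a negative prompt from her template and compares how the final image changes.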

Takeaways

  • 🚀 Introduction of CounterfeitXL, a version of the Counterfeit model that supports SDXL.
  • 🗄️ CounterfeitXL and SDXL are large models of approximately 7GB, which makes storage space a concern.
  • 🌟 Three Negative Embeddings made for SDXL are covered: Standard (A), Realistic (B), and Anime-like (C).
  • 🔍 Examination of the effects of each Negative Embedding on image quality and characteristics.
  • 🖼️ Comparison of images generated with and without Negative Embeddings, providing insights into their impact.
  • 🎨 Mention of Comfy UI for producing quicker and cleaner images, suggesting an efficient user interface for AI image generation.
  • 🎥 Plans to post a video on Comfy UI, indicating an upcoming resource for users interested in AI image generation.
  • 🛠️ Technical details provided, including model (Counterfeit XLα), settings, and image generation process, catering to an audience that may wish to replicate the process.
  • 📈 Evaluation of image quality with different configurations, including the addition of a negative prompt for quality improvement.
  • 🌸 Specific mention of cherry blossom petals as an example of detail enhancement through the use of Negative Embeddings.
  • 🔎 Encouragement for viewers to explore various settings and prompts to fully understand the capabilities of the AI model.

Q & A

  • What is the main topic of Alice's presentation?

    -The main topic of Alice's presentation is the introduction of CounterfeitXL, a Counterfeit model that supports SDXL, along with the effects of Negative Embeddings and various settings on image generation.

  • How large are the CounterfeitXL and SDXL models?

    -The CounterfeitXL and SDXL models are quite large, approximately 7GB each.

  • What are the three Negative Embeddings exclusive to SDXL?

    -The three Negative Embeddings exclusive to SDXL are A for Standard, B for Realistic, and C for Anime-like.

  • Where can viewers find prompts and images related to the presentation?

    -Viewers can find the featured prompts and images on the Civitai site.

  • What is Alice's plan for the Comfy UI?

    -Alice plans to try generating images with the Comfy UI and post a video about it next week.

  • What model and settings does Alice use for her first image generation example?

    -Alice uses the Counterfeit XLα model without LoRA, with a prompt of a girl in a school uniform, image size of 1024x1024, total steps of 35, and CFG scale of 7.

  • How does Alice upscale the images?

    -Alice upscales the images using an Upscale Model with denoising strength of 0.3 and a 1.5x scale.

  • What are the effects of the Standard Negative Embedding (A) on the image?

    -The Standard Negative Embedding (A) makes the face more solid and the cherry blossoms more distinct.

  • What issue does the Realistic Negative Embedding (B) seem to have?

    -The Realistic Negative Embedding (B) causes the hand to become distorted and does not seem to make the image more realistic.

  • What is the result of using the Anime-style Negative Embedding (C)?

    -Using the Anime-style Negative Embedding (C) does not result in significant changes, but there are some variations in the image.

  • How does adding a negative prompt from Alice's template affect the image?

    -Adding a negative prompt improves the hand's appearance but alters the composition of the image, making direct comparison difficult.

Outlines

00:00

🖌️ Introduction to Counterfeit XL and SDXL Models

Alice from AI is in wonderland introduces the CounterfeitXL and SDXL models, noting their large size of about 7GB and the storage limitations this creates. She mentions three Negative Embeddings made for SDXL (A is Standard, B is Realistic, and C is Anime-like) and plans to explore their effects. Alice also points to the Civitai site for prompts and images, acknowledges the limitations in image quality, and says she intends to experiment with these models. She likes Comfy UI because it produces images quickly and cleanly, and plans to release a video on it, delayed slightly by this breaking news. The settings for the Counterfeit XLα model are detailed: no LoRA, a prompt of a girl in a school uniform, the image size, the total steps, and the DPM++ 2M SDE Karras sampler. An Upscale Model is set up for a 1.5x upscale with a denoising strength of 0.3, and the output order is Base, Refiner, then Upscale.
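
For readers who want to replicate the run, the reported settings can be collected in one place. The following is a minimal sketch in plain Python that is not tied to any particular UI or API. The class name and field names are hypothetical, and the sampler string is an assumption about how the name mentioned in the video is spelled. Only the values themselves come from this summary.

```python
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


@dataclass(frozen=True)
class GenerationSettings:
    """Settings reported in the video; field names are illustrative only."""
    model: str = "Counterfeit XLα"
    lora: Optional[str] = None           # no LoRA is used
    width: int = 1024
    height: int = 1024
    total_steps: int = 35
    cfg_scale: float = 7.0
    clip_skip: int = 2
    sampler: str = "DPM++ 2M SDE Karras"  # assumed spelling of the sampler name
    upscale_factor: float = 1.5
    denoising_strength: float = 0.3
    stages: Tuple[str, ...] = ("Base", "Refiner", "Upscale")


settings = GenerationSettings()
for name, value in asdict(settings).items():
    print(f"{name}: {value}")
```

Keeping the values in a single frozen object makes it harder to change one setting by accident between comparison runs.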

Keywords

💡Counterfeit and SDXL

Counterfeit and SDXL are the two names behind the model discussed in the video. SDXL (Stable Diffusion XL) is the base model, and CounterfeitXL is the version of the Counterfeit model that supports it. At approximately 7GB, these models are large enough to make storage a concern. The video uses CounterfeitXL to demonstrate image generation and the impact of Negative Embeddings on the output.

💡Negative Embeddings

The Negative Embeddings discussed in the video are made specifically for SDXL and are used to steer image generation away from unwanted qualities. They come in three types: Standard (A), Realistic (B), and Anime-like (C). The video explores the effect of each type on the final image, such as a more solid face and more distinct details.
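
A comparison like the one in the video can be planned as one baseline run plus one run per embedding. The sketch below assumes a UI in which an embedding is activated by writing its name in the negative prompt, and it assumes a fixed seed so that only the embedding changes between runs. The trigger names, function names, and seed value are hypothetical; the summary does not give the real file names.

```python
from typing import Dict, List, Optional

# Hypothetical trigger names: the actual embedding file names are not given here.
EMBEDDINGS: Dict[str, str] = {
    "A (Standard)": "negative_embedding_a",
    "B (Realistic)": "negative_embedding_b",
    "C (Anime-like)": "negative_embedding_c",
}


def build_negative_prompt(embedding: Optional[str], template: str = "") -> str:
    """Join an optional embedding trigger and an optional template prompt."""
    parts = [part for part in (embedding, template) if part]
    return ", ".join(parts)


def comparison_runs(seed: int, template: str = "") -> List[dict]:
    """Return one baseline run plus one run per embedding, all with the same seed."""
    runs = [{"label": "no embedding", "seed": seed,
             "negative_prompt": build_negative_prompt(None, template)}]
    for label, trigger in EMBEDDINGS.items():
        runs.append({"label": label, "seed": seed,
                     "negative_prompt": build_negative_prompt(trigger, template)})
    return runs


for run in comparison_runs(seed=12345):
    print(run)
```

Passing a non-empty `template` adds the same negative prompt text to every run, which keeps the runs comparable with each other even if the composition shifts relative to runs without it.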

💡Comfy UI

Comfy UI is a user interface that is mentioned as being quick and clean in producing images. The video's presenter is considering using it again for its efficiency and plans to showcase it in a video to be posted next week. It is implied that Comfy UI offers a good balance between speed and quality in image generation.

💡Upscale Model

An Upscale Model is a technique used to increase the resolution of images. In the context of the video, the presenter has set up an Upscale Model to generate images at 1.5x upscale with a denoising strength of 0.3. This process aims to improve the quality of the images produced, making them sharper and more detailed.
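
The output resolution of the upscale step follows directly from the base size and the scale factor. The helper below is a small sketch of that arithmetic. The function name is hypothetical, and snapping the result down to a multiple of 8 is an assumption based on common latent diffusion pipelines, not something stated in the video.

```python
from typing import Tuple


def upscaled_size(width: int, height: int, factor: float,
                  multiple: int = 8) -> Tuple[int, int]:
    """Scale both sides by `factor`, then snap each down to a multiple of `multiple`."""
    def scale(side: int) -> int:
        scaled = round(side * factor)
        return (scaled // multiple) * multiple  # assumed constraint, see lead-in

    return scale(width), scale(height)


# The video's setup: a 1024x1024 base image upscaled by 1.5x.
print(upscaled_size(1024, 1024, 1.5))  # (1536, 1536)
```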

💡DPM++ 2M SDE Karras

DPM++ 2M SDE Karras is the sampler used in the image generation process. A sampler is the algorithm that carries out the denoising steps that turn noise into an image. The presenter uses this sampler to generate the images in the video.

💡Real ESRGAN 4x

Real ESRGAN 4x is an advanced image upscaling technique that is used to enhance the quality and resolution of images. In the video, the presenter uses this method in conjunction with Anime6B to generate images in an anime style. The use of Real ESRGAN 4x suggests a focus on achieving a high level of detail and clarity in the final output.

💡Refiner

In the video, the Refiner is a stage in the image generation process. It follows the base model generation and aims to improve the quality of the image by reducing noise and enhancing details. The presenter applies it to the images generated by the base model before upscaling.

💡Anime style

Anime style refers to a specific aesthetic often associated with Japanese animation. In the video, the presenter is experimenting with generating images in an anime style using the Upscale Model. This indicates a focus on creating images with a distinctive look that is characteristic of anime, such as exaggerated features and vibrant colors.

💡Negative prompt

A negative prompt is an input that guides the image generation process away from unwanted features or qualities. By adding a negative prompt from her template, the presenter aims to prevent deformities and achieve a cleaner final image.

💡Cherry blossom petals

Cherry blossom petals are a detail mentioned in the video that serves as an example of the kind of fine details that the image generation process is aiming to capture. The clarity and distinctness of such details are used as a measure of the effectiveness of the models and techniques being discussed.

Highlights

Introduction of a model supporting both Counterfeit and SDXL.

The CounterfeitXL and SDXL models are approximately 7GB in size.

Three Negative Embeddings are exclusively for SDXL: Standard (A), Realistic (B), and Anime-like (C).

Civitai features prompts and images for reference.

Comfy UI is considered for producing quicker and cleaner images.

A video on Comfy UI is planned for the following week, slightly delayed due to breaking news.

The model used is Counterfeit XLα without LoRA.

The image size is set to 1024 by 1024 with total steps of 35.

CFG scale is 7 and clip skip is set to 2.

Images are generated using the DPM++ 2M SDE Karras sampler.

An Upscale Model is set up for 1.5x upscale with denoising strength 0.3.

The impact of Negative Embeddings on image quality is evaluated.

Standard embedding (A) makes the image more solid.

Realistic embedding (B) slightly distorts the hand but makes the face cute when upscaled.

Anime-style embedding (C) doesn't show significant changes but has some variation.

Negative prompt from a template is added for deformity prevention and quality improvement.

The final image with the added negative prompt shows a noticeable change in composition and hand clarity.