Run Stable Diffusion 3 Locally! | ComfyUI Tutorial

Markury AI
12 Jun 2024 | 03:48

TLDR

In this tutorial, the host demonstrates how to run Stable Diffusion 3 Medium, a newly released AI model available on Hugging Face, with ComfyUI. The process involves downloading the necessary files (the model's safetensors file, the text encoders, and the example workflows), updating ComfyUI, and installing the models. The video showcases the model's ability to generate images from natural language prompts, highlights its impressive results, and encourages users to give Stability AI feedback on licensing issues.

Takeaways

  • 🌟 Stable Diffusion 3 Medium is a newly released AI model available on Hugging Face.
  • 📝 To access the model, users must fill out a form and agree to the terms to gain access to the repository.
  • 📚 Essential files to download include 'sd3_medium.safetensors', the text encoders (CLIP G, CLIP L, T5 XXL), and the ComfyUI workflows.
  • 🛠️ Users need to update ComfyUI by running the 'update_comfyui.bat' script in the 'update' folder of the ComfyUI directory.
  • 🔄 After updating, place the downloaded text encoder files in the 'clip' folder within the ComfyUI models directory.
  • 📁 Create an 'sd3' folder in the 'checkpoints' directory if it doesn't exist and place the 'sd3_medium.safetensors' file there.
  • 🖼️ ComfyUI can be launched with the 'run_nvidia_gpu.bat' script to use the new model for image generation.
  • 📝 Load the 'sd3_medium.safetensors' checkpoint in ComfyUI to start using the Stable Diffusion 3 Medium model.
  • 🎨 The model generates images based on natural language prompts, which is more intuitive than tag-style prompts.
  • 🌌 The example prompt demonstrates the model's ability to create detailed, ethereal images, such as a character whose hair resembles the Northern Lights.
  • 📜 There is a call to action for the community to help update the licensing information for the model, as it is currently unclear.

Q & A

  • What is the topic of the tutorial video?

    -The tutorial video is about how to use Stable Diffusion 3 Medium and integrate it with ComfyUI.

  • Where should users go to access the Stable Diffusion 3 Medium model?

    -Users should go to Hugging Face to access the Stable Diffusion 3 Medium model.

  • Is there a form that needs to be filled out to access the model on Hugging Face?

    -Yes, it is a gated model, so users have to fill out a form and agree to the terms to access the repository.

  • What files does the user need to download from Hugging Face for the Stable Diffusion 3 Medium model?

    -The user needs to download 'sd3_medium.safetensors', the text encoders (CLIP G, CLIP L, and T5 XXL in fp16 format), and the ComfyUI workflows.

  • Why is it necessary to close ComfyUI before updating it?

    -Closing ComfyUI is necessary to ensure that the update process is not interrupted and that the new files are correctly installed.

  • How can users update their ComfyUI installation?

    -Users should navigate to their ComfyUI directory, go to the 'update' folder, and run the 'update_comfyui.bat' file.

  • What should users do with the downloaded CLIP models?

    -Users should place the downloaded CLIP models (T5 XXL, CLIP L, and CLIP G) into the 'clip' folder within their ComfyUI models directory.

  • How do users add the Stable Diffusion 3 Medium model to their ComfyUI checkpoints?

    -Users should create an 'sd3' folder in the 'checkpoints' directory of their ComfyUI models folder and drag the 'sd3_medium.safetensors' file into it.

  • What should users do to start ComfyUI with the new model?

    -After updating and adding the model files, users should go back to their base directory, run 'run_nvidia_gpu.bat' to start ComfyUI, and then load the 'sd3_medium.safetensors' checkpoint.

  • What is the example prompt provided in the video for generating an image with Stable Diffusion 3 Medium?

    -The example prompt is for a female character with long flowing hair made of ethereal swirling patterns resembling the Northern Lights or Aurora Borealis.

  • What is the issue mentioned regarding the licensing of the Stable Diffusion 3 Medium model?

    -The licensing is a bit unclear, and the video suggests that the community should open an issue or contact Stability AI to update their license.

Outlines

00:00

🎨 Introduction to Stable Diffusion 3 Medium

The video begins with an introduction to Stable Diffusion 3 Medium, a new model released by Stability AI. The host explains that the model is gated, requiring viewers to fill out a form on Hugging Face to gain access. The process involves downloading the necessary files: 'sd3_medium.safetensors', the text encoders (CLIP G, CLIP L, and T5 XXL), and the ComfyUI workflows for basic inference.
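The download step can also be scripted instead of clicking through the repository page. The following is a minimal sketch, not the method shown in the video. It assumes the 'huggingface_hub' package is installed, that the repository id and file names below match the public listing (verify them on the repository page), and that a token with access to the gated repository is supplied. 'download_all' is a hypothetical helper name.

```python
# Repository id and file names are assumptions based on the public
# Hugging Face listing; check them against the repository page.
REPO_ID = "stabilityai/stable-diffusion-3-medium"
FILES = [
    "sd3_medium.safetensors",
    "text_encoders/clip_g.safetensors",
    "text_encoders/clip_l.safetensors",
    "text_encoders/t5xxl_fp16.safetensors",
    "comfy_example_workflows/sd3_medium_example_workflow_basic.json",
]


def download_all(target_dir, token=None):
    """Download each file into target_dir and return the local paths."""
    # Imported here so this module still loads without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(
            repo_id=REPO_ID,
            filename=name,
            local_dir=target_dir,
            token=token,  # needed because the repository is gated
        )
        for name in FILES
    ]
```

Calling 'download_all("downloads", token="hf_...")' would keep the repository's subfolders under 'downloads'; the form on the model page still has to be accepted first.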

🛠️ Updating Comfy UI and Installing Models

The host instructs viewers on how to update ComfyUI, which is necessary before installing the new models. This includes closing any running instances of ComfyUI and running the 'update_comfyui.bat' file in the 'update' folder. After updating, the host describes how to place the text encoder files in the 'clip' folder and 'sd3_medium.safetensors' in an 'sd3' folder (created if it does not exist) inside the 'checkpoints' folder of the ComfyUI models directory.
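The file placement described above can be sketched as a short script. This is an illustration under stated assumptions: the downloaded files sit together in one folder, the file names match those in the Hugging Face repository, and 'install_models' is a hypothetical helper rather than part of ComfyUI.

```python
import shutil
from pathlib import Path

# Destination subfolders under ComfyUI's 'models' directory, as described
# in the video. File names are assumed to match the Hugging Face repository.
DESTINATIONS = {
    "sd3_medium.safetensors": "checkpoints/sd3",
    "clip_g.safetensors": "clip",
    "clip_l.safetensors": "clip",
    "t5xxl_fp16.safetensors": "clip",
}


def install_models(download_dir, models_dir):
    """Copy downloaded files into place; return the names that were copied."""
    copied = []
    for name, subfolder in DESTINATIONS.items():
        source = Path(download_dir) / name
        if not source.is_file():
            continue  # skip anything that has not been downloaded yet
        target_dir = Path(models_dir) / subfolder
        target_dir.mkdir(parents=True, exist_ok=True)  # creates 'sd3' if missing
        shutil.copy2(source, target_dir / name)
        copied.append(name)
    return copied
```

Copying rather than moving leaves the originals in the download folder, which costs disk space but makes the step safe to repeat.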

🚀 Starting Comfy UI with New Models

With the models installed, the host guides viewers on how to start ComfyUI using the 'run_nvidia_gpu.bat' script. Once ComfyUI is running, the host demonstrates how to load the new 'sd3_medium.safetensors' checkpoint and the CLIP files. The video then shows the model in action, generating an image of a female character with hair resembling the Northern Lights, highlighting the model's ability to interpret natural language prompts effectively.
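Queuing a prompt can also be scripted against a running ComfyUI instance rather than clicked in the interface, which the video does not cover. The sketch below assumes ComfyUI is listening on its default local address and that the workflow has been exported in ComfyUI's API format, which differs from the workflow files loaded in the interface. 'build_queue_request' and 'queue_prompt' are hypothetical helper names.

```python
import json
from urllib import request

# Default local address of a ComfyUI server (an assumption; adjust if changed).
DEFAULT_SERVER = "http://127.0.0.1:8188"


def build_queue_request(workflow_api, server=DEFAULT_SERVER):
    """Build the POST request that queues an API-format workflow."""
    body = json.dumps({"prompt": workflow_api}).encode("utf-8")
    return request.Request(
        f"{server}/prompt",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )


def queue_prompt(workflow_api, server=DEFAULT_SERVER):
    """Send the workflow to the server and return its decoded JSON reply."""
    with request.urlopen(build_queue_request(workflow_api, server)) as resp:
        return json.loads(resp.read())
```

Building the request separately from sending it makes the payload easy to inspect before anything reaches the server.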

📝 Feedback on Model and Licensing

The host concludes by expressing excitement about the release of the Stable Diffusion 3 Medium model and its impressive generation capabilities. However, they also mention that there are issues with the licensing, encouraging viewers to open issues or communicate with Stability AI to address these concerns. The host emphasizes the importance of community involvement in resolving licensing discrepancies.

Keywords

💡Stable Diffusion 3

Stable Diffusion 3 is Stability AI's machine learning model for generating images from text descriptions; the video uses its Medium variant. It is the central tool being introduced, and the tutorial focuses on how to set it up and use it for creating images with specific characteristics.

💡ComfyUI

ComfyUI is a user interface for interacting with AI models like Stable Diffusion. The script mentions updating ComfyUI to ensure compatibility with the new model. It is an essential component in the process of running Stable Diffusion 3 locally.

💡Hugging Face

Hugging Face is a platform that hosts machine learning models, including Stable Diffusion 3. The script instructs viewers to visit Hugging Face to access the model, indicating that it is the source for downloading the necessary files to use the model.

💡Gated Model

A gated model is one that requires some form of access control, such as filling out a form or agreeing to terms and conditions. In the context of the video, Stable Diffusion 3 is a gated model, meaning users must go through a process to gain access to it.

💡Tensors

Tensors are multi-dimensional arrays of numbers used in machine learning to represent data. In the video, 'sd3_medium.safetensors' is the file that stores the model's weight tensors in the safetensors format, and Stable Diffusion 3 Medium needs it to function.
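A safetensors file stores a JSON header ahead of the raw tensor data, so its contents can be listed without loading the weights. This is a minimal sketch using only the standard library, assuming the published file layout (an 8-byte little-endian header length followed by the JSON header). 'read_safetensors_header' is a hypothetical helper name.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON mapping tensor names to
        # their dtype, shape, and byte offsets in the data section.
        return json.loads(f.read(header_len).decode("utf-8"))
```

Because only the header is read, this works quickly even on multi-gigabyte checkpoint files.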

💡Text Encoders

Text encoders are components of AI models that convert text into a format that can be understood by the model. The video mentions downloading CLIP G, CLIP L, and T5 XXL, which are the text encoders used in conjunction with Stable Diffusion 3.

💡Workflows

In ComfyUI, a workflow is a saved graph of connected nodes that defines the steps for a task, such as generating an image with Stable Diffusion 3. The 'basic inference workflow' is mentioned as the one to be used for this purpose.
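A ComfyUI workflow file is JSON, so its contents can be inspected before loading it. The sketch below counts the node types in a workflow, assuming the format saved from the interface, with a top-level "nodes" list whose entries carry a "type" field. That structure is an assumption here and 'count_node_types' is a hypothetical helper name.

```python
import json
from collections import Counter


def count_node_types(workflow_json):
    """Count node types in a ComfyUI workflow saved from the interface."""
    workflow = json.loads(workflow_json)
    # Assumed layout: a top-level "nodes" list, each entry with a "type"
    # field naming the node class.
    return Counter(node["type"] for node in workflow.get("nodes", []))
```

Running this on the text of a workflow file gives a quick overview of which loaders and encoders it expects.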

💡Checkpoints

Checkpoints in machine learning are saved snapshots of a model's state, typically its weights. In the video, the 'sd3_medium.safetensors' file is placed in the 'checkpoints' folder because it holds the saved weights of the Stable Diffusion 3 Medium model.

💡Nvidia GPU

Nvidia GPUs are graphics processing units manufactured by Nvidia, known for their use in machine learning tasks due to their parallel processing capabilities. The video runs 'run_nvidia_gpu.bat', a script that starts ComfyUI so that it uses the GPU to run Stable Diffusion 3.

💡Queue Prompt

Queue Prompt is the ComfyUI action that submits the current workflow, including its text prompt, for generation. In the video, the host queues the example prompt so that Stable Diffusion 3 generates an image matching the description, showcasing the model's ability to understand and create images from natural language.

💡Ethereal

Ethereal refers to something being extremely delicate and light, often associated with a heavenly or spiritual quality. In the script, the term is used to describe the desired appearance of the generated image, indicating the model's ability to interpret and visualize abstract concepts.

💡Aurora Borealis

Aurora Borealis, also known as the Northern Lights, is a natural light display in the Earth's sky, predominantly seen in the high-latitude regions. In the script, it is used as a metaphor for the swirling patterns in the image generation prompt, demonstrating the model's capacity to create visuals inspired by natural phenomena.

Highlights

Introduction to using Stable Diffusion 3 Medium with ComfyUI.

Accessing the gated model on Hugging Face and filling out the form to gain access to the repository.

Downloading necessary files such as 'sd3_medium.safetensors', the text encoders, and the ComfyUI workflows.

Instructions to close and update ComfyUI to the latest version.

Installing CLIP models into the ComfyUI directory.

Creating an 'sd3' folder inside 'checkpoints' and adding the 'sd3_medium.safetensors' file to it.

Starting ComfyUI with Nvidia GPU support.

Loading the 'sd3_medium.safetensors' checkpoint in ComfyUI.

Using the example prompt to generate an image with ComfyUI.

Description of the generated image resembling a female character with ethereal, aurora-like hair.

Discussion of the natural language prompt style used by Stable Diffusion 3 Medium.

Observation of the model's impressive generation capabilities.

Mention of the model's free release of weights and community feedback on licensing.

Call to action for users to open issues or provide feedback to Stability AI regarding licensing.

Emphasis on the community effort needed to update the licensing terms.

Closing remarks and well-wishes for the viewers.