How to UPSCALE with Stable Diffusion. The BEST approaches.

Next Tech and AI
3 Dec 2023 · 22:16

TLDR: This video tutorial explores various methods for upscaling images generated by StableDiffusion 1.5 models, which often have lower resolution compared to StableDiffusion XL. The video introduces viewers to different upscaling techniques, including the use of ESRGAN, Superscale, and ControlNet, and compares their results. It also provides a step-by-step guide on how to install and use these upscalers with the epicRealism model. The narrator emphasizes the importance of selecting the right parameters for upscaling, such as sampling steps, CFG scale, and denoising strength. The video concludes with a comparison of the upscaling methods, highlighting ControlNet as the most stable and detailed option for high-quality image upscaling.

Takeaways

  • 📈 StableDiffusion 1.5 has improved custom models that offer quality and performance comparable to StableDiffusion XL, but with less memory usage.
  • 🔍 The primary limitation of these custom models is their lower resolution, typically trained with 512x512 or 768x768.
  • 🎨 Upscaling is a solution to increase the resolution of images generated by these models.
  • 📚 Models and upscalers can be found on platforms like CivitAI and HuggingFace, with usage instructions provided.
  • 📁 The epicRealism model is used in this tutorial, and the Superscale upscaler is highlighted for its effectiveness.
  • 🛠️ The Automatic1111 WebUI is used for the upscaling process, which is compatible with Windows, Linux, and macOS.
  • ⚙️ Parameters such as sampling steps, CFG scale, and denoising strength are crucial for generating and upscaling images.
  • 🔍 Zooming in with an image viewer reveals pixelation, which can be addressed by using upscalers like ESRGAN or Superscale.
  • 📱 For quick upscaling, the nearest pixel algorithm can be used, but it tends to leave visible pixels; for a smoother, more detailed result, ESRGAN or Superscale is preferred.
  • 🔧 The 'image to image' tab allows for further enhancement of images by removing specific details from the prompt and focusing on overall quality.
  • 🔬 ControlNet is introduced as a neural network for advanced upscaling tasks, providing more stability and detail in the upscaled images.

Q & A

  • What are the advantages of using StableDiffusion 1.5 models over StableDiffusion XL?

    -StableDiffusion 1.5 models offer improved performance, require less memory, and can deliver the same or better quality compared to StableDiffusion XL. However, they typically have a lower resolution, which can be addressed through upscaling techniques.

  • Where can one find the specialized and improved models for StableDiffusion 1.5?

    -The specialized and improved models for StableDiffusion 1.5 can be found at platforms like CivitAI and HuggingFace.

  • What is the recommended sampling step for the epicRealism model?

    -For the epicRealism model, it is recommended to set the sampling steps to something above 20, for instance, 25.

  • What is the purpose of using an upscaler in the context of StableDiffusion models?

    -An upscaler is used to increase the resolution of images generated by StableDiffusion models, particularly when the models are trained with lower resolutions such as 512x512 or 768x768.

  • How does the ESRGAN upscaler differ from the nearest pixels algorithm?

    -The ESRGAN upscaler uses a deep learning network to analyze and scale images more intelligently, resulting in smoother and more detailed images compared to the nearest pixels algorithm, which works directly on pixels and tends to leave visible pixels.

  • What is the significance of using the 'highly detailed' prompt in the image to image tab?

    -Using the 'highly detailed' prompt helps in enhancing the level of detail in the upscaled image, especially when the original prompt contains very few expressions or details.

  • What is the recommended denoising strength when using the Superscale upscaler?

    -The recommended denoising strength when using the Superscale upscaler is around 0.2, as higher values may lead to more deformation and are generally not desired.

  • What is the benefit of using ControlNet for upscaling images?

    -ControlNet provides a more stable upscaling result and can be used as a base for further upscaling iterations, making it suitable for achieving very high resolutions like 8k.

  • How does the Ultimate SD upscale script differ from the SD upscale script?

    -The Ultimate SD upscale script offers more detail in the upscaled image compared to the SD upscale script, making it a better choice for achieving high-quality results.

  • What is the recommended approach if one needs to upscale an image multiple times?

    -If one needs to perform multiple upscaling iterations, using ControlNet is recommended due to its stability and the high quality of the results it produces.

  • What is the main advantage of using the Superscale upscaler over other methods?

    -The Superscale upscaler provides a good balance between detail and smoothness, making it suitable for most cases. It is particularly effective when used in conjunction with the Ultimate SD upscale script for enhanced detail.

  • How can one ensure that the upscaled image does not consume too much VRAM?

    -To prevent high VRAM usage, one can set the tile size to 512x512, which is the default size for StableDiffusion 1.5 and typically does not require as much memory.
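
The answers above mention repeated upscaling passes toward resolutions such as 8k. As a rough sketch of how many passes that implies, the snippet below counts how often an image edge has to be multiplied before it reaches a target. The function name, the 2x factor per pass, and the reading of "8k" as 7680 pixels wide are assumptions made for illustration; the video does not state them.

```python
def upscale_passes(start_px, target_px, factor=2):
    """Count the passes needed to grow an edge of start_px to at least target_px.

    Hypothetical helper: assumes every pass multiplies the edge by `factor`.
    """
    if start_px <= 0 or factor <= 1:
        raise ValueError("start_px must be positive and factor greater than 1")
    passes = 0
    size = start_px
    while size < target_px:
        size *= factor
        passes += 1
    return passes, size


# Assumption: "8k" is taken to mean 7680 pixels on the long edge.
passes, final_size = upscale_passes(512, 7680)
print(passes, final_size)
```

Under these assumptions a 512-pixel edge needs four 2x passes, which is why the stability of each pass matters when iterating.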

Outlines

00:00

🖼️ Upscaling with Custom StableDiffusion 1.5 Models

The paragraph discusses the advantages of using improved custom models for StableDiffusion 1.5 over StableDiffusion XL, particularly in terms of performance and memory usage, despite the challenge of lower resolution outputs. It introduces the concept of upscaling as a solution and outlines the process of using the epicRealism model from CivitAI or HuggingFace. The paragraph also covers the installation of the Superscale upscaler and the parameters to consider for generating images with the model.

05:04

🔍 Exploring Upscaling Techniques and Parameters

This section delves into the use of different upscaling methods, including the high-res fix with ESRGAN, and the importance of denoise strength. It discusses the limitations when using a GPU with 16 GB VRAM and proposes an alternative upscaling approach by resizing the image and choosing an upscaler. The paragraph also explains how to use the epicRealism model for generating detailed images and adjusting parameters for upscaling, emphasizing the need for a detailed prompt and the correct sampling method.

10:07

📈 Advanced Upscaling with Superscale and ControlNet

The paragraph explains how to use the Superscale upscaler for generating high-quality upscaled images without pixelation. It details the process of installing and using the Superscale from the default web UI installation, setting the tile size, and choosing the upscaler. The paragraph also introduces ControlNet as a superior upscaling solution, guiding through its installation, necessary model downloads from HuggingFace, and the steps to integrate it into the upscaling process for enhanced detail and stability.

15:17

🔧 Customizing Upscaling with Ultimate SD Upscale and ControlNet

This part focuses on fine-tuning the upscaling process using the Ultimate SD upscale script and ControlNet for even better results. It provides instructions for installing extensions, setting parameters such as sampling method, tile size, and denoising levels, and choosing the upscale target size. The paragraph also compares the results of using ControlNet with other upscaling methods and emphasizes the stability and detail enhancement ControlNet offers for iterative upscaling.

20:24

📊 Comparing Upscaling Results and Choosing the Best Method

The final paragraph compares the upscaling results obtained from different methods, highlighting the superior detail and stability of ControlNet. It summarizes the best practices for upscaling, recommending ESRGAN for quick results without needing details, the SD upscale script or Ultimate SD upscale script for detailed images, and ControlNet for the best results or when multiple upscaling iterations are required. The paragraph concludes with a call to action for viewers to like or comment if the video was helpful for upscaling their pictures.

Keywords

💡StableDiffusion XL

StableDiffusion XL is a high-resolution image generation model that understands short prompts and creates detailed images. However, it requires more memory and has performance limitations compared to newer, specialized models for StableDiffusion 1.5. In the video, it is mentioned as a starting point for understanding the evolution of these models.

💡Custom Models

Custom models refer to specialized versions of StableDiffusion 1.5 that offer quality and performance improvements over StableDiffusion XL. They are often found on platforms like CivitAI and HuggingFace and come with specific usage instructions. In the context of the video, custom models are used to generate images that can then be upscaled.

💡Upscale

Upscaling is the process of increasing the resolution of an image while maintaining or improving its quality. In the video, various methods of upscaling are discussed and demonstrated, showing how they can enhance the resolution of images generated by custom models.

💡epicRealism Model

The epicRealism model is a specific custom model used in the video to generate images. It is chosen for its ability to create realistic images, which are then upscaled using different methods to compare the results.

💡Superscale

Superscale is an upscaler tool mentioned in the video, used to increase the size of images generated by custom models. It is one of the methods explored for upscaling images in the tutorial.

💡ESRGAN

ESRGAN stands for Enhanced Super-Resolution Generative Adversarial Networks. It is an upscaler that uses deep learning to intelligently scale images, resulting in smoother and more detailed results compared to simpler upscaling algorithms. In the video, ESRGAN is presented as a good choice for most upscaling tasks.
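
To make the contrast with "simpler upscaling algorithms" concrete, here is a minimal sketch of nearest-pixel upscaling in plain Python. It only copies existing pixel values into larger blocks, which is why the result looks blocky when zoomed in; a learned upscaler such as ESRGAN instead predicts new pixel values. The function name and the tiny 2x2 test image are made up for illustration.

```python
def upscale_nearest(pixels, factor):
    """Upscale a 2D grid of pixel values by an integer factor (nearest pixel).

    Each source pixel becomes a factor x factor block of identical values.
    """
    out = []
    for row in pixels:
        # Repeat every value horizontally ...
        wide_row = [value for value in row for _ in range(factor)]
        # ... then repeat the widened row vertically.
        out.extend(list(wide_row) for _ in range(factor))
    return out


small = [[0, 255],
         [255, 0]]
big = upscale_nearest(small, 2)
for row in big:
    print(row)
```

No new information is created: the 4x4 output contains only the two values present in the 2x2 input.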

💡ControlNet

ControlNet is a neural network used for various tasks, including upscaling images with more stability and detail. It is mentioned as the best upscaling solution in the video, providing more detailed results compared to other methods.

💡Sampling Steps

Sampling steps refer to the number of iterations used in the image generation process. In the context of the video, a higher number of sampling steps (e.g., 25) is recommended for generating images with the epicRealism model for better quality.

💡CFG Scale

CFG scale (classifier-free guidance scale) is a parameter that controls how strongly the generated image follows the prompt. Higher values adhere more closely to the prompt, while lower values give the model more freedom. The video uses a CFG scale of 5.
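
The sampling steps and CFG scale values described above can be collected into one settings object before generating. The sketch below is only an illustration: the video sets these values in the Automatic1111 WebUI, and the function name, the dictionary keys, the 512x512 default size, and the example prompt are assumptions rather than anything shown in the video.

```python
def build_generation_settings(prompt, steps=25, cfg_scale=5.0, width=512, height=512):
    """Collect generation parameters and flag values outside the summary's guidance.

    Hypothetical helper: key names are illustrative, not a confirmed WebUI schema.
    """
    warnings = []
    if steps <= 20:
        warnings.append("epicRealism guidance in the summary: use more than 20 steps")
    if max(width, height) > 768:
        warnings.append("SD 1.5 custom models are typically trained at 512x512 or 768x768")
    settings = {
        "prompt": prompt,
        "steps": steps,          # 25 is the example value from the summary
        "cfg_scale": cfg_scale,  # 5 is the value used in the video
        "width": width,          # assumed default, not stated in the summary
        "height": height,        # assumed default, not stated in the summary
    }
    return settings, warnings


# The prompt is a placeholder, not the one used in the video.
settings, warnings = build_generation_settings("portrait photo, highly detailed")
print(settings, warnings)
```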

💡Denoising Strength

Denoising strength is a parameter that controls how much the image-to-image pass is allowed to change the input image: low values stay close to the original, while high values repaint more of it and can introduce deformation. The video suggests a value around 0.2 to add detail while preserving the original image.
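
One way to build intuition for this parameter: in many image-to-image implementations, the strength roughly decides what fraction of the sampling schedule is re-run on the input image. The sketch below uses that simplified rule. It is an assumption for illustration, not a description of what the WebUI does internally, and the exact behaviour there depends on its settings.

```python
def img2img_effective_steps(steps, denoising_strength):
    """Approximate how many sampling steps actually modify the input image.

    Simplified model (assumption): effective steps = floor(steps * strength).
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return min(int(steps * denoising_strength), steps)


# With 25 steps and the suggested strength of 0.2, only a few steps repaint the image.
print(img2img_effective_steps(25, 0.2))
```

Under this model, a low strength leaves most of the original image untouched, which matches the advice to keep the value around 0.2 to avoid deformation.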

💡Tile Size

Tile size is a parameter that sets the dimensions of the smaller parts into which the image is divided for processing during upscaling. Using a tile size of 512x512, as mentioned in the video, is beneficial for managing VRAM usage, especially on systems with lower memory.
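
As a rough illustration of why tiling keeps memory in check, the sketch below counts how many 512x512 tiles are needed to cover an upscaled image. Each tile is processed on its own, so peak memory depends on the tile size rather than on the full image. The function name is hypothetical, and the calculation ignores the overlap or padding that upscale scripts usually add between tiles.

```python
import math


def tile_grid(width, height, tile=512):
    """Return (columns, rows, total) of tiles needed to cover a width x height image.

    Simplification: no overlap or padding between tiles is modelled.
    """
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return cols, rows, cols * rows


# A 512x768 image upscaled 4x becomes 2048x3072.
print(tile_grid(2048, 3072))
```

More tiles mean a longer run, but the work per tile stays at the resolution the model was trained on.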

Highlights

StableDiffusion 1.5 has improved custom models that deliver similar or better quality than StableDiffusion XL with better performance and less memory usage.

The primary challenge with these custom models is their lower resolution, typically trained with 512x512 or 768x768.

Upscaling is a solution to the resolution challenge, allowing for higher quality images from these models.

Different upscaling methods will be compared to show their advantages and results.

Specialized models for StableDiffusion 1.5 can be found at CivitAI and HuggingFace.

The epicRealism model is used for demonstration, with instructions on how to install and use it.

The Superscale upscaler is introduced as a tool for improving image resolution.

The Automatic1111 WebUI is used for the upscaling process across Windows, Linux, and macOS.

Parameters such as sampling steps, sampling method, height, CFG scale, and prompts are crucial for generating images with the epicRealism model.

The high-res fix is mentioned as an alternative upscaler, but caution is advised due to potential memory issues.

ESRGAN is recommended for most cases, while a specific upscaler is suggested for anime images.

The image-to-image tab is used for further upscaling improvements with the removal of specific prompts for greater detail.

The Superscale upscaler is used again for a more detailed upscale, showing significant improvement over the nearest pixels algorithm.

The Ultimate SD upscale script is introduced for even more detail in the upscaled images.

ControlNet is presented as the best upscaling solution, a neural network for various tasks including defining poses and upscaling.

The process for installing and using ControlNet for upscaling is outlined, emphasizing its stability and detail.

A comparison is made between upscaling with and without using a script, highlighting the superior detail and stability of the latter.

ESRGAN is suggested for quick results without needing details, while the SD upscale script or Ultimate SD upscale script is recommended for more detailed images.

ControlNet is the preferred method for the best results or when multiple upscaling iterations are required.