[Breaking] WebUI Forge Now Supports Flux1: Parameter Testing / stablediffusion

AI Art JAPAN
13 Aug 2024 05:40

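TLDR In this video, WebUI Forge is used with Flux1 and its parameters are tested. With the latest version of Forge installed and its Flux settings applied, images are generated at step counts from 20 to 50. On an RTX 4070 Ti, generation takes 30 to 40 seconds at 20 steps and about a minute and a half at 50 steps. Image quality improves as the step count rises, and the speaker personally finds 30 steps a reasonable setting. Several sampling methods are tried, with DPM++ 2M and Euler the speaker's favorites. The CFG scale is also adjusted and the Choppa model is tried. The download location for the Flux1 model is listed in the summary section.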

Takeaways

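  • 🌐 WebUI Forge has been updated to the latest version and now supports Flux1.
  • 🔧 Forge automatically applies settings suited to Flux.
  • 🕒 The tests vary the step count across the range from 20 to 50 steps.
  • 💻 The test environment is an RTX 4070 Ti with 64GB of memory; generating at 20 steps takes about 30 to 40 seconds.
  • 📈 The speaker believes image quality improves as the step count rises to 30, 40, and 50.
  • 🤔 30 steps feels like a reasonable trade-off in generation time, though this is a personal opinion and the speaker is not confident about it.
  • 🎨 Reducing the size to 512x768 at 20 steps can leave certain images very blurry.
  • 🔄 DPM++ 2M and Euler produced images, but Euler a did not. The speaker personally likes DPM++ 2M and Euler.
  • 🔧 All schedulers are then applied to the sampling methods mentioned earlier.
  • 🐱 A prompt describing a cat playing with a smartphone in the toilet is added for further tests.
  • 🚀 Images are also generated with the NF4 and FP8 models: at 30 steps, NF4 takes about 30 to 40 seconds and FP8 about two and a half minutes.
  • 📚 The download location for the Flux1 model is posted in the summary section.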
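
The sweep described in these takeaways (steps from 20 to 50, several samplers, several schedulers) can be written down as a parameter grid. The following is a minimal sketch, not the speaker's actual workflow: the helper `build_payloads` is hypothetical, and the dictionary keys follow the request fields of the Stable Diffusion WebUI txt2img API, which Forge is assumed to accept unchanged. The code only builds the request bodies and sends nothing.

```python
from itertools import product


def build_payloads(prompt, steps_list, samplers, schedulers,
                   cfg_scale=1.0, width=896, height=1152):
    """Build one request body per (steps, sampler, scheduler) combination.

    Key names are assumed to match the WebUI txt2img API; nothing is sent.
    """
    payloads = []
    for steps, sampler, scheduler in product(steps_list, samplers, schedulers):
        payloads.append({
            "prompt": prompt,
            "steps": steps,
            "sampler_name": sampler,
            "scheduler": scheduler,
            "cfg_scale": cfg_scale,
            "width": width,
            "height": height,
        })
    return payloads


# Values taken from the summary: 20 to 50 steps, the speaker's two
# preferred samplers, and two of the schedulers that were tried.
grid = build_payloads(
    "a cat playing with a smartphone in the toilet",
    steps_list=[20, 30, 40, 50],
    samplers=["Euler", "DPM++ 2M"],
    schedulers=["Simple", "Beta"],
)
print(len(grid))  # 4 step counts x 2 samplers x 2 schedulers = 16
```

Keeping the grid as plain data makes it easy to log which combination produced which image before any request is made.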

Q & A

  • What is the main topic of the video script?

    -The main topic of the video script is the use of Flux1 with WebUI Forge, including its compatibility, parameter settings, and performance in generating images.

  • What does the speaker mention about the compatibility of Forge with Flux?

    -The speaker mentions that Forge is compatible with Flux and that it applies settings suited to Flux when the Flux option is selected.

  • What are the various preparations mentioned for generating images with Flux?

    -The speaker refers to various preparations needed for generating images, such as setting the number of steps and the image quality, which are discussed in the second half of the script.

  • What is the speaker's hardware setup for generating images?

    -The speaker uses an RTX 4070 Ti with 64GB of memory; on this setup, 20 steps of image generation take about 30 to 40 seconds.

  • How does the speaker feel about the generation time for different steps?

    -The speaker feels that 20 steps take quite long, around 30 to 40 seconds, compared to the quicker generation times they are used to with SDXL in Forge.

  • What is the speaker's opinion on image quality at different steps?

    -The speaker believes that image quality improves as the number of steps increases from 20 to 30, 40, and 50, but this is stated as a personal opinion.

  • What sampling methods does the speaker discuss?

    -The speaker discusses sampling methods including DPM++ 2M, Euler, and DDIM, along with schedulers such as SGM, Simple, Normal, and Beta, and mentions that SGM and Simple give good image quality.

  • What is the speaker's preference between DPM++ 2M and Euler sampling methods?

    -The speaker personally prefers either DPM++ 2M or Euler sampling methods.

  • What prompt does the speaker use to test the CFG scale?

    -The speaker uses a prompt of 'a cat playing with a smartphone in the toilet' to test the CFG scale.

  • What are the differences between the NF4 and FP8 models as mentioned by the speaker?

    -The speaker mentions that NF4 and FP8 are models used for generating images, with NF4 taking 30 to 40 seconds and FP8 taking about 2 and a half minutes for a 30-step generation at a size of 896 x 1152.

  • Where can the viewers find the download destination of the Flux1 model for Forge?

    -The download location for the Flux1 model for Forge is posted in the video's summary section.

Outlines

00:00

💻 Exploring Flux1 with WebUI Forge

The speaker sets out to use Flux1 with WebUI Forge after updating Forge to the latest version. Forge is compatible with Flux and applies settings suited to it automatically. On the speaker's RTX 4070 Ti system with 64GB of memory, 20 steps typically take 30 to 40 seconds. The speaker finds image quality better at 30, 40, and 50 steps despite the longer generation time, and considers 30 steps a balance between quality and time. They test various sampling methods and schedulers, including DPM++ 2M, Euler, and DDIM, and discuss the quality of the resulting images. They also look at the CFG scale, noting that images get blurry at higher values. Finally, the speaker experiments with different prompts and models, such as NF4 and FP8, comparing their generation times and capacities.

05:02

🚀 Introducing Flux1: Beyond Midjourney

The speaker introduces Flux1 as a much-discussed model that has achieved a score surpassing Midjourney. They point the audience to the summary section for the download link of the Flux1 model for Forge.

Keywords

💡Flux1

Flux1 (FLUX.1) is a text-to-image generation model from Black Forest Labs. In the video, Flux1 is described as a model whose score exceeds that of Midjourney. The video covers running Flux1 in WebUI Forge, which now supports the model.

💡WebUI Forge

WebUI Forge (Stable Diffusion WebUI Forge) is a browser-based interface for running image generation models, built on top of Stable Diffusion WebUI. In the video, the presenter uses the latest version of WebUI Forge to work with Flux1, setting parameters and generating images through the interface.

💡SD, XL, or Flux

These abbreviations refer to the model families Forge can be set up for: 'SD' stands for Stable Diffusion, 'XL' for Stable Diffusion XL (SDXL), and 'Flux' is the model being discussed. The video shows that Forge is compatible with Flux and that selecting Flux applies settings suited to it, which implies that these model families have different characteristics and requirements.

💡Steps

In image generation, 'steps' is the number of denoising iterations used to produce one image. The video varies the number of steps from 20 to 50 to observe the impact on image quality and generation time, and both change noticeably with the step count.
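
The timings reported in the video can be turned into a rough per-step cost. The sketch below assumes generation time grows linearly with the step count and ignores fixed overhead such as model loading; "about a minute and a half" is taken as 90 seconds. The function names are hypothetical and the 30-step figure is an estimate, not a measurement from the video.

```python
def seconds_per_step(total_seconds, steps):
    """Average cost of one step, assuming time scales linearly with steps."""
    return total_seconds / steps


def estimate_seconds(steps, per_step):
    """Projected generation time for a given step count."""
    return steps * per_step


low = seconds_per_step(30, 20)    # lower bound from the 20-step timing
high = seconds_per_step(40, 20)   # upper bound from the 20-step timing
at_50 = seconds_per_step(90, 50)  # from the 50-step timing

# Projected range for the 30 steps the speaker prefers.
est_30 = (estimate_seconds(30, low), estimate_seconds(30, high))

print(low, high, at_50)  # 1.5 2.0 1.8
print(est_30)            # (45.0, 60.0)
```

The 50-step figure falls inside the per-step range derived from the 20-step timing, so the reported numbers are consistent with roughly linear scaling on this hardware.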

💡Image Quality

Image quality is a measure of the visual clarity and detail of an image. The video discusses the image quality at different steps of the generation process, noting that it improves as the number of steps increases. This suggests that the generation process involves a trade-off between quality and speed, with higher quality images requiring more steps and thus more time.

💡CFG Scale

CFG scale is the classifier-free guidance scale, a parameter that controls how strongly the generated image is pushed toward the prompt. The video tests the CFG scale starting from 1 and reports that images get blurry at higher values, which suggests that low values suit Flux1 in this setup.
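
The standard classifier-free guidance mix can be sketched in a few lines. This is the general textbook formula applied to plain Python lists; the function name `apply_cfg` and the sample numbers are illustrative assumptions, and the video does not show how Forge applies guidance internally for Flux1.

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), elementwise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions standing in for the model's two outputs (made-up values).
uncond = [0.0, 1.0, 2.0]  # prediction without the prompt
cond = [1.0, 1.0, 4.0]    # prediction with the prompt

print(apply_cfg(uncond, cond, 1.0))  # [1.0, 1.0, 4.0]
print(apply_cfg(uncond, cond, 3.0))  # [3.0, 1.0, 8.0]
```

At a scale of 1 the result equals the prompt-conditioned prediction, so the starting value used in the video corresponds to no extra amplification. Larger scales extrapolate further away from the unconditioned prediction.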

💡Sampling Methods

Sampling methods (samplers) are the numerical procedures that turn noise into an image step by step during generation. The video tries samplers such as DPM++ 2M, Euler, and DDIM, and pairs them with schedulers such as SGM, Simple, Normal, and Beta, which set how the noise level falls across the steps; different combinations produce different image quality. The presenter personally prefers DPM++ 2M or Euler.
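
To make the roles of sampler and scheduler concrete, here is a toy Euler sampler in the style used by k-diffusion-based tools. It is a sketch under stated assumptions: the denoiser is a stand-in that always predicts the same clean value, the noise levels in `sigmas` are made up, and none of this is taken from Forge's source code. The list of noise levels plays the role of the scheduler, and the loop plays the role of the sampler.

```python
def euler_sample(x, sigmas, denoise):
    """Run Euler steps from the first noise level in `sigmas` down to the last."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = denoise(x, sigma)
        d = (x - denoised) / sigma        # direction pointing toward more noise
        x = x + d * (sigma_next - sigma)  # step to the next, lower noise level
    return x


def toy_denoise(x, sigma):
    """Stand-in model: always predicts the clean value 0.5."""
    return 0.5


sigmas = [1.0, 0.75, 0.5, 0.25, 0.0]  # toy schedule: noise falls linearly to zero
result = euler_sample(2.0, sigmas, toy_denoise)
print(result)  # 0.5
```

A different scheduler would supply a different `sigmas` list to the same loop, while a different sampler would replace the update rule inside it.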

💡Choppa

In the video, 'Choppa' seems to be a term for a specific subject or style of image generated by the AI model. The presenter mentions generating 'Choppa' using a model called NF4, which implies that the output was used to check how the model handles that particular subject or style.

💡NF4 and FP8

NF4 and FP8 are two reduced-precision variants of the Flux1 model used in the video: NF4 stores weights as 4-bit NormalFloat values and FP8 as 8-bit floating-point values. The presenter generates images with both and compares generation time and capacity. In this test NF4 was faster, at about 30 to 40 seconds for 30 steps versus about two and a half minutes for FP8. FP8 keeps more bits per weight, which may preserve more quality.
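
A back-of-the-envelope calculation shows why the two formats differ so much in size. The sketch below assumes a parameter count of about 12 billion for Flux1 and counts only the raw weights, ignoring quantization metadata, the text encoders, and the VAE; the names `PARAMS` and `weight_gib` are hypothetical. The figures are estimates, not file sizes reported in the video.

```python
PARAMS = 12e9  # assumed parameter count for the Flux1 model

# Approximate storage per weight for each format (metadata ignored).
BYTES_PER_PARAM = {"NF4": 0.5, "FP8": 1.0}


def weight_gib(params, bytes_per_param):
    """Raw weight storage in GiB."""
    return params * bytes_per_param / 2**30


nf4 = weight_gib(PARAMS, BYTES_PER_PARAM["NF4"])
fp8 = weight_gib(PARAMS, BYTES_PER_PARAM["FP8"])
print(round(nf4, 1), round(fp8, 1))  # 5.6 11.2
```

Under these assumptions the FP8 weights alone come close to the 12 GB of video memory on an RTX 4070 Ti, which may be part of the reason FP8 generation was so much slower in this test.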

💡Stable Diffusion

Stable Diffusion is a type of AI model known for its ability to generate high-quality images from text prompts. While not directly mentioned in the video script, the term is part of the initial list of models (SD, XL, or Flux). It's likely that Stable Diffusion is one of the models that could be used within the WebUI Forge environment, suggesting that it is a model that can be configured and optimized for various image generation tasks.

Highlights

WebUI Forge now supports Flux1, enhancing compatibility with various models.

Optimal settings for Flux are automatically configured in Forge.

Preparations are necessary for generating images with Flux.

Varying the number of steps from 20 to 50 affects image generation quality.

An RTX 4070 Ti with 64GB of memory is used, with 20 steps taking 30-40 seconds.

Image quality improves with an increase in the number of steps.

30 steps are chosen for a balance between quality and generation time.

Exploring various sampling methods in Flux.

Reducing image size to 512 x 768 at 20 steps.

Certain images become extremely blurry at lower resolutions.

Preference for DPM++ 2M or Euler sampling methods over others.

Applying different schedulers to the sampling method improves image quality.

Testing CFG scale with a starting value of 1.

Euler sampler combined with the Simple scheduler, tested with a cat prompt.

Increasing CFG scale affects image clarity.

Choppa model generates unexpected results when used with Flux1.

Comparison between NF4 and FP8 models in terms of generation time and capacity.

Flux1 model download link provided in the summary section.

Introduction of Flux1, a model said to score higher than Midjourney.