😍 Image-generation AI 'Stable Diffusion web UI': the basics of writing prompts [4W1H+A] & controlling age and gender 💕 How to use the highly recommended uppercase "AND" 😃

のいちゃんねる
2 Apr 2023 · 03:09

TLDR

The video introduces a method for generating preferred images with the 4W1H+A approach: Who, What, When, Where, and How, plus Action, which stands in for the harder-to-express 'Why' of the usual 5W1H. It emphasizes controlling age and style in image generation, using saved prompt styles and subject terms such as 'One Girl' and 'One Woman' to steer the output. It also recommends writing an uppercase 'AND' between keywords to blend them, for example combining 'blonde' and 'brown' hair to get fashionable, glossy brown hair with a golden sheen. Finally, it shows how specifying an age in the prompt changes how subjects such as 'One Girl' and 'One Man' are rendered, illustrating how much customization the prompt alone allows.

Takeaways

  • 🎨 The importance of using the 4W1H+A framework (Who, What, When, Where, and How, plus Action) to guide the creation of desired images.
  • 🌅 The example prompt of a girl sitting on a beach bench in a swimsuit at sunset, smiling, to illustrate the 4W1H+A concept.
  • 🔍 The explanation that the 'Why' in 5W1H is hard to express in a prompt, which led to 4W1H+A, where 'Action' takes its place for clarity.
  • 👧 Controlling age in prompts by using registered styles and negative prompts to refine the image generation.
  • 💽 Saving prompts with the floppy disk icon for easy access and reuse of frequently used prompts.
  • 🎨 Using the 'ChilloutMix' model with no VAE selected for image generation.
  • 🧬 Adjusting parameters to avoid broken generations, such as setting the sampling steps to 50 and making the image 1.5 times taller than it is wide.
  • 🌟 The demonstration of how tweaking the prompt by changing 'One Girl' to 'One Woman' affects the image generation, showing the AI's interpretation.
  • 📈 The impact of specifying age in the prompt to control the generation of characters across different age groups.
  • 💡 The recommendation to write an uppercase 'AND' between keywords to combine their elements for more nuanced image generation.
  • 💎 An example of combining 'blonde' and 'brown' hair to create fashionable, glossy hair with a golden glow.

Q & A

  • What is the 4W1H+A concept mentioned in the script?

    -The 4W1H+A concept refers to the elements Who, What, When, Where, and How, with the addition of 'A' for Action in place of the 'Why' of 5W1H. It is a framework for writing prompts that yield the intended result when generating images.

  • How does the speaker suggest using the 4W1H+A concept?

    -The speaker suggests using the 4W1H+A concept to structure prompts in a way that reduces random elements and makes it easier to generate preferred images. It involves clearly defining the scenario, including the action, to guide the image generation process.

  • What is the purpose of controlling age in the image generation process?

    -Controlling age in the image generation process is important to ensure that the resulting image matches the desired demographic, whether it's a young girl, an adult woman, or any other age group specified by the user.

  • How does the speaker save and reuse prompts?

    -The speaker saves prompts as styles using the floppy disk icon, which allows frequently used prompts to be retrieved and reused easily.

  • What model does the speaker use for image generation?

    -The speaker uses a model called 'ChilloutMix' with no VAE (Variational Autoencoder) selected for image generation.

  • How does the speaker demonstrate how the AI understands the term 'One Girl'?

    -The speaker demonstrates that, with the prompts and parameters used, the AI interprets 'One Girl' as a young girl rather than an adult woman.

  • What happens when the speaker changes the term 'One Girl' to 'One Woman'?

    -When the speaker changes the term to 'One Woman', the generated image shifts from a young girl to an adult woman, showing how the AI responds to different terms.

  • How can specific age groups be controlled in the image generation?

    -Specific age groups can be controlled by adding detailed age parameters to the prompt, such as 10s, 20s, 30s, 40s, 50s, or 60s, which guides the AI to generate images that match the desired age range.

  • What is the uppercase 'AND' technique recommended by the speaker?

    -The uppercase 'AND' technique combines keywords by writing 'AND' in capital letters between them, so that the image blends both concepts. For example, combining 'blonde hair' and 'brown hair' can produce hair that is brown in color but has the luster and moist sheen of blonde hair.

  • Why is it important to confirm settings before generating images?

    -Confirming settings before generating images is crucial to avoid generation breakdowns and to ensure that the final image meets the desired specifications, such as body proportions and the inclusion of the entire body in the image.

  • What parameters does the speaker adjust to ensure the full body is captured in the image?

    -The speaker adjusts the sampling steps to 50 and sets the vertical length to 1.5 times the original to ensure that the entire body is included in the generated image.

  • What sampling method does the speaker prefer?

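    -The speaker prefers a sampling method that the auto-generated transcript renders as 'tpmw' combined with 'SDA Karas'. This is most likely the web UI sampler 'DPM++ SDE Karras', though the transcript does not confirm the exact name.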

Outlines

00:00

🎨 Art Prompt Strategy and AI Image Generation

The paragraph introduces a method for generating preferred images using the 4W1H+A approach: Who, What, When, Where, and How, with Action added in place of 'Why'. The speaker explains why this framework helps prompts yield the intended result, using the example of a girl sitting on a beach bench at sunset. The speaker also discusses negative prompts and saving frequently used prompts with the floppy disk icon. They mention the model in use and the standard 'One Girl' subject term, and stress checking parameters before generating to avoid broken images. The paragraph concludes with how to control age in the generated images and how wording such as 'One Girl' versus 'One Woman' changes the AI's interpretation.

Keywords

💡4W1H+A

The concept of 4W1H+A refers to the elements 'Who, What, When, Where, and How' plus 'Action', which together make a prompt detailed and complete. 'Action' takes the place of the 'Why' in the familiar 5W1H, which is hard to express in an image prompt. In the video, the framework guides viewers in structuring prompts so that fewer details are left to chance and the generated image is closer to what they wanted.
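As a rough illustration of the framework, the sketch below fills one slot per element and joins the slots into a comma-separated prompt. The `Scene` class, its field names, and the way the beach example is split across slots are assumptions made for this example, not the video's own wording.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """One slot per 4W1H+A element (hypothetical helper, not part of the web UI)."""
    who: str
    what: str
    when: str
    where: str
    how: str
    action: str

    def to_prompt(self) -> str:
        # Join the non-empty slots with commas, the usual keyword separator.
        parts = [self.who, self.what, self.when, self.where, self.how, self.action]
        return ", ".join(p.strip() for p in parts if p.strip())


# Assumed mapping of the video's beach example onto the six slots.
beach = Scene(
    who="one girl",
    what="swimsuit",
    when="sunset",
    where="beach",
    how="sitting on a bench",
    action="smiling",
)
prompt = beach.to_prompt()
print(prompt)  # one girl, swimsuit, sunset, beach, sitting on a bench, smiling
```

Filling every slot explicitly is what leaves less for the model to decide at random.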

💡Neon Genesis Evangelion

Neon Genesis Evangelion is a popular Japanese anime series known for its deep psychological themes and complex characters. In the context of the video, it is likely used as a reference to a style or theme that can be applied in the image generation process. The series' distinctive visual and thematic elements could be used to create prompts for generating images with a similar aesthetic.

💡Negative Prompt

A negative prompt is a technique used in image generation where certain elements or characteristics are explicitly excluded from the final output. This helps in refining the generated images to better match the creator's vision by avoiding unwanted features. In the video, the creator discusses using negative prompts to control the age of the characters, ensuring that the generated images align with the intended age group.

💡Floppy Disk Icon

The floppy disk icon is a symbol commonly associated with saving or storing data. In the context of the video, it is mentioned as a convenient feature for saving prompts that the user frequently employs. This allows the user to recall and reuse their preferred prompts easily, streamlining the image generation process.

💡VAE

VAE stands for Variational Autoencoder, a generative neural network that learns a compressed latent representation of data. Stable Diffusion uses one to decode its latent output into the final image. In the video, the creator generates images with no separate VAE selected, which in the web UI means the VAE bundled with the model checkpoint is used rather than an external one.

💡One-Girl

The term 'One-Girl' in the context of the video refers to a prompt or category for generating images of a single female character. The video discusses how the AI interprets this term and how it can be adjusted to generate images of females of different ages, from young girls to adult women, by specifying the age in the prompt.

💡Sampling Steps

Sampling steps are the number of denoising iterations the sampler runs to turn random noise into the final image. In the video, the creator sets the sampling steps to 50 to avoid broken generations, and makes the image 1.5 times taller than it is wide so that the entire body fits in the frame. Together these settings help produce a more coherent and complete image.
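The arithmetic behind these settings can be sketched as follows. The base width of 512 pixels is an assumption (the summary does not state the resolution used), and the key names simply mirror common web UI setting names; rounding to a multiple of 8 reflects the fact that Stable Diffusion works on a latent grid downsampled by a factor of 8.

```python
def portrait_settings(width: int = 512, steps: int = 50, ratio: float = 1.5) -> dict:
    """Return generation settings for a portrait image (hypothetical helper)."""
    height = int(width * ratio)
    # Keep the height a multiple of 8 so it maps cleanly onto the latent grid.
    height -= height % 8
    return {"steps": steps, "width": width, "height": height}


settings = portrait_settings()
print(settings)  # {'steps': 50, 'width': 512, 'height': 768}
```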

💡TPMW

'TPMW' is how the auto-generated transcript renders part of the creator's preferred sampling method. It most likely corresponds to 'DPM++', a family of samplers available in the web UI, although the script does not spell the name out.

💡SDA Karas

'SDA Karas' is the transcript's rendering of the second half of the sampler name, most likely 'SDE Karras'. Taken together with 'TPMW', the creator appears to be using the 'DPM++ SDE Karras' sampler, though the exact name is not confirmed in the script.

💡Seed Value

The seed value is a starting point or initial value used in the image generation process to ensure consistency and reproducibility of results. In the video, the creator mentions fixing the seed value before making changes to the prompt, which allows for controlled variations and the ability to recreate similar images.
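A minimal sketch of the idea: keep the seed constant and change only the prompt, so that differences between images come from the wording rather than from new random noise. The seed `1234`, the scene text, and the `with_fixed_seed` helper are all placeholders for illustration.

```python
def with_fixed_seed(prompts: list[str], seed: int) -> list[dict]:
    """Pair every prompt with the same seed (hypothetical helper)."""
    return [{"prompt": p, "seed": seed} for p in prompts]


scene = "sitting on a bench, beach, sunset, smiling"
jobs = with_fixed_seed([f"one girl, {scene}", f"one woman, {scene}"], seed=1234)
for job in jobs:
    print(job)
```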

💡Age Control

Age control in the context of the video refers to the ability to specify the age of the characters in the image generation process. By including age-related terms in the prompt, the creator can guide the AI to generate images of characters within a desired age range, from children to adults.
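One way to try this systematically is to generate a prompt per age group and compare the results. The exact wording of the age term used in the video is not given in this summary, so the `"20s"`-style phrasing and the `age_variants` helper below are assumptions.

```python
def age_variants(subject: str, decades: list[int], scene: str) -> list[str]:
    """Build one prompt per decade, e.g. 'one woman, 30s, ...' (hypothetical helper)."""
    return [f"{subject}, {decade}s, {scene}" for decade in decades]


variants = age_variants("one woman", [20, 30, 40, 50, 60], "beach, sunset, smiling")
for v in variants:
    print(v)
```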

💡Uppercase AND

Uppercase 'AND' (rendered as 'Large Text &' in the auto-translated transcript) is a prompt keyword written in capital letters between two descriptions so that the generated image blends both. This lets the creator refine characteristics at a finer level than a single keyword allows, as in the video's example of combining blonde and brown hair.
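The combination itself is just string construction: the keyword goes between the descriptions and must be in capital letters for the web UI to treat it as a combining operator. The sketch below assumes the hair example from the video; the `combine_with_and` helper is hypothetical, and how the video places the result inside the full prompt is not shown in this summary.

```python
def combine_with_and(*concepts: str) -> str:
    """Join two or more descriptions with an uppercase AND (hypothetical helper)."""
    cleaned = [c.strip() for c in concepts if c.strip()]
    if len(cleaned) < 2:
        raise ValueError("need at least two concepts to combine")
    # The keyword must be uppercase; a lowercase 'and' is read as ordinary text.
    return " AND ".join(cleaned)


hair = combine_with_and("blonde hair", "brown hair")
print(hair)  # blonde hair AND brown hair
```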

Highlights

The speaker introduces the concept of using 4W1H+A to create ideal prompts for image generation.

4W1H+A stands for Who, What, When, Where, and How, with 'Action' added in place of the harder-to-express 'Why'.

An example prompt is described: a girl in a swimsuit, sitting on a bench and smiling at a beach during sunset.

The speaker emphasizes the importance of reducing random elements in prompts to generate preferred images more easily.

The use of registered styles and floppy disk icons for saving prompts is mentioned for convenience.

The speaker discusses using the 'ChilloutMix' model without VAE (Variational Autoencoder) for image generation.

The concept of 'One Girl' is explained, and how AI interprets it as a young girl rather than an adult woman.

The process of changing the prompt from 'One Girl' to 'One Woman' and its effect on the generated image is described.

The speaker demonstrates how to control the age of the generated character by specifying the age in the prompt.

The use of an uppercase 'AND' to combine keywords for more nuanced control over image generation is introduced.

An example is given of combining 'blonde' and 'brown' hair to create fashionable, glossy hair with a golden glow.

The speaker concludes by thanking the viewers and encouraging them to apply the shared techniques.