[Stable Diffusion] A Collection of Prompt Spells for Composition, Angle, and Line of Sight

AIジェネ [Sharing information on AI illustration generation]
26 Aug 2023 · 08:23

TLDR: The video is a guide to controlling composition and camera angle in AI-generated images. It explains which 'spells' (prompt keywords) produce a given view, such as 'head shot' for chest-up compositions and 'cowboy shot' for compositions cropped at mid-thigh. It stresses that image size affects output quality, recommends image sizes for each composition, and suggests 'hires.fix' for higher quality. It also covers angle spells such as 'shoot from above' and gaze spells such as 'looking viewer', which direct the character's line of sight. Troubleshooting tips include keeping hats out of 'cowboy shot' images and preventing collapsed eyes in full-body images. The video closes by suggesting 'img2img', 'openpose', or a different model when the desired result isn't achieved.

Takeaways

  • 🎨 Use specific composition spells like 'head shot' for generating images from the chest area upwards.
  • 📐 Adjust image size to a square (e.g., 512x512) for easier and potentially higher quality generation.
  • 👖 Include or exclude certain elements (like 'skirt' or 'denim') in the prompt to control what is displayed in the image.
  • 🤠 Use 'cowboy shot' for compositions cropped at mid-thigh, and add '(no hat:1.3)' to avoid generating images with hats.
  • 📏 For full-body compositions, try an image size of 512x1024 to improve eye quality and overall full-body generation.
  • 🔄 Use 'hires.fix' for the highest quality image, especially when the width is 512 and the height is 768.
  • 📸 Specify angles with commands like 'shoot from front', 'above', 'below', 'side', and 'behind' to control the perspective.
  • 👀 Employ 'looking viewer', 'looking up', 'looking down', 'looking side', and 'looking back' to direct the character's line of sight.
  • 🔄 If a desired composition or angle isn't achieved, consider changing image size, using 'img2img', 'openpose', or switching models.
  • 🌐 For unresolved issues with spells, adjusting image size or using reference-based generation methods may be effective solutions.
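
The takeaways above can be combined into a single prompt string. The following is a minimal sketch, assuming the common `(tag:weight)` emphasis syntax; the helper names `weighted` and `build_prompt` and the subject tag `1girl` are hypothetical and do not come from the video.

```python
from typing import Optional


def weighted(tag: str, weight: Optional[float] = None) -> str:
    """Format a spell, optionally with an emphasis weight as (tag:weight)."""
    if weight is None:
        return tag
    return f"({tag}:{weight:g})"


def build_prompt(*parts: str) -> str:
    """Join prompt fragments with commas, skipping empty ones."""
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "1girl",                      # hypothetical subject tag, not from the video
    "cowboy shot",                # composition spell
    weighted("no hat", 1.3),      # suppress the hat that 'cowboy shot' tends to add
    "shoot from above",           # angle spell
    weighted("looking up", 1.2),  # line-of-sight spell with a weight
)
print(prompt)
```

Keeping composition, angle, and gaze as separate fragments makes it easy to swap one of them without retyping the rest of the prompt.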

Q & A

  • What is the main topic of the video?

    -The main topic of the video is how to use specific spells and techniques to get the desired composition and angle in AI-generated images.

  • Why is it important to master composition and angle spells in AI image generation?

    -Mastering composition and angle spells is important because it allows for the creation of images with specific compositions and angles without having to generate multiple images through trial and error.

  • How can you generate an image from the chest area upwards?

    -You can generate an image from the chest area upwards by inserting the 'head shot' spell into the prompt.

  • What should you do if you want to generate a composition from the middle of the thigh?

    -To generate a composition from the middle of the thigh, you can enter 'cowboy shot' into the prompt. To avoid generating an image with a hat, add '(no hat:1.3)' after 'cowboy shot'.

  • What image size is recommended for generating high-quality images from the chest to the top?

    -A square image size, such as 512 by 512, is recommended for generating high-quality images from the chest to the top.

  • How can you improve the quality of eyes in a full-body image?

    -To improve the quality of eyes in a full-body image, set the width to 512 and the height to 1024. For the highest quality, the video instead recommends a width of 512 and a height of 768 combined with 'hires.fix'.

  • What spell can be used to generate an image from the front angle?

    -The spell 'shoot from front' can be used to generate an image from the front angle.

  • How can you make the generated image look upwards?

    -To make the generated image look upwards, you can enter 'looking up: 1.2' in the prompt.

  • What should you do if the model does not reflect the spell content?

    -If the model does not reflect the spell content, changing the image size or using methods like 'img2img', 'openpose', or 'openpose editor' based on a reference image may solve the problem.

  • What is the recommended method for generating a high-quality full-body image?

    -The most recommended method for generating a high-quality full-body image is to set the width to 512 and the height to 768 and use 'hires.fix' for the generation.

  • What should you do if you still can't get the desired composition or angle after trying all the introduced spells?

    -If you still can't get the desired composition or angle, consider changing to a different model or using 'img2img', 'openpose', or 'openpose editor' based on a reference image.
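
The 512 by 768 plus 'hires.fix' recommendation from the answers above could be written down as a generation request. This is only a sketch under stated assumptions: the video does not name a tool, so the field names below (`enable_hr`, `hr_scale`, `denoising_strength`) assume the txt2img API of the AUTOMATIC1111 web UI, and the values 2 and 0.5 are illustrative rather than taken from the video. No request is actually sent.

```python
import json


def txt2img_payload(prompt: str, width: int = 512, height: int = 768,
                    hires_fix: bool = True) -> dict:
    """Build a request body; field names assume the AUTOMATIC1111 web UI API."""
    payload = {"prompt": prompt, "width": width, "height": height}
    if hires_fix:
        payload.update({
            "enable_hr": True,          # assumed flag for 'hires.fix'
            "hr_scale": 2,              # illustrative upscale factor
            "denoising_strength": 0.5,  # illustrative value
        })
    return payload


payload = txt2img_payload("shoot from front, looking viewer")
print(json.dumps(payload, indent=2))
```

If a different front end is used, the same three decisions (prompt, base size, whether to upscale) still apply, but the field names will differ.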

Outlines

00:00

🎨 Composition and Angle Viewpoints in Art

This paragraph discusses the importance of mastering composition and angle viewpoint spells in creating images. It explains that knowing the right spells can help generate images without the need for repeated attempts. The focus is on composition spells for different body parts and the impact of image size on the quality of the generated image. It also touches on how to avoid common issues like the 'hat' problem in cowboy shots and the benefits of using specific image sizes for certain compositions. Lastly, it emphasizes the recommended method for generating high-quality images by adjusting image dimensions and using 'hires.fix'.

05:05

👁️‍🗨️ Adjusting Character's Line of Sight

The second paragraph delves into the intricacies of controlling a character's line of sight in image generation. It describes various spells to achieve different angles such as 'looking viewer', 'looking up', 'looking down', 'looking side', 'looking back', and 'looking away'. The paragraph highlights the challenges of getting the desired line of sight and suggests increasing the value after the colon for more effective results. It also mentions the potential need to change image sizes or use alternative methods like 'img2img' or 'openpose' if the desired outcome is not achieved. The summary concludes with a reminder to remember these spells for efficient image creation and an encouragement to explore the video's summary column for further guidance.
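
The advice to raise the value after the colon when a gaze spell is ignored can be sketched as a small helper. It assumes the `(tag:weight)` emphasis syntax; the function name `bump_weight`, the step of 0.1, and the cap of 1.5 are hypothetical choices, not values from the video.

```python
import re

# Accepts "looking up", "looking up: 1.2", or "(looking up:1.2)".
_SPELL = re.compile(
    r"^\(?\s*(?P<tag>[^:()]+?)\s*(?::\s*(?P<w>\d+(?:\.\d+)?))?\s*\)?$"
)


def bump_weight(spell: str, step: float = 0.1, cap: float = 1.5) -> str:
    """Return the spell with its weight raised by `step`, never above `cap`."""
    match = _SPELL.match(spell.strip())
    if match is None:
        raise ValueError(f"unrecognised spell: {spell!r}")
    weight = float(match.group("w") or 1.0)  # an unweighted spell counts as 1.0
    new_weight = min(round(weight + step, 2), cap)
    return f"({match.group('tag')}:{new_weight:g})"


print(bump_weight("looking up: 1.2"))
print(bump_weight("looking away"))
```

The cap is there because very high weights tend to distort the image; treat the exact limit as something to tune per model.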

Keywords

💡Composition

In the context of the video, 'composition' refers to the arrangement of visual elements within an image. It is a critical aspect of creating art or photography, where the placement and interaction of elements contribute to the overall aesthetic and meaning of the piece. The video discusses various 'composition spells' or techniques, such as 'head shot' and 'cowboy shot', to guide the viewer in achieving desired compositions from different angles and perspectives. For instance, generating a composition from the chest area upwards can be done by inserting 'head shot' into the prompt.

💡Spells

In the video, 'spells' are specific terms or phrases used in prompts to guide the generation of images. They are not magical incantations but inputs that steer the output of an AI model, controlling aspects such as the angle, focus, and details of the generated image. Mastering these 'spells' gives more precise control over image generation and reduces the number of attempts needed to reach the desired result. For example, adding '(no hat:1.3)' after 'cowboy shot' can reduce the likelihood of the generated image including a hat.

💡Image Size

The 'image size' refers to the dimensions of the generated image, which can significantly impact the quality and composition of the output. The video emphasizes the importance of selecting the appropriate image size to avoid issues such as eye collapse or incomplete full-body representations. It suggests that a square image size, like 512 by 512, can be beneficial for certain compositions, while a vertically longer size, such as 512 horizontal by 1024 vertical, may be better for full-body images.
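
The size recommendations above can be kept in a small lookup table. This is a sketch only: the sizes come from the video summary, while the `Size` type, the `size_for` function, and the composition labels used as keys are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Size(NamedTuple):
    width: int
    height: int


# Keys are illustrative labels, not spells from the video.
RECOMMENDED_SIZES = {
    "chest_up": Size(512, 512),            # square, for 'head shot'
    "full_body": Size(512, 1024),          # tall, to keep the eyes intact
    "full_body_hires_fix": Size(512, 768), # combined with 'hires.fix'
}


def size_for(composition: str, use_hires_fix: bool = False) -> Size:
    """Pick a recommended size; full-body shots depend on 'hires.fix'."""
    if composition == "full_body" and use_hires_fix:
        return RECOMMENDED_SIZES["full_body_hires_fix"]
    return RECOMMENDED_SIZES[composition]


print(size_for("chest_up"))
print(size_for("full_body", use_hires_fix=True))
```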

💡Angles

The term 'angles' in the video pertains to the perspective from which an image is viewed or generated. It is a fundamental aspect of photography and art that can dramatically alter the viewer's perception of the subject. The video discusses various angle 'spells', such as 'shoot from front', 'shoot from above', and 'shoot from below', which are used to direct the AI model to produce images from specific viewpoints. These angles can change the dynamic of the image, focusing on different aspects of the subject.

💡Line of Sight

The 'line of sight' refers to the direction in which a subject is looking within an image. This concept is crucial in the video as it guides the viewer on how to control the gaze of the character in the generated image. By using specific 'spells' related to the line of sight, such as 'looking viewer' or 'looking up: 1.2', the viewer can dictate whether the character is looking directly at the camera, upward, or in another direction, thus influencing the narrative and mood of the image.

💡Quality

In the context of the video, 'quality' refers to the visual fidelity and resolution of the generated images. High-quality images have more detail, better clarity, and are generally more realistic. The video suggests that using specific 'spells' and adjusting the image size can dramatically improve the quality, particularly of the eyes, which are a critical element in portrait compositions. However, higher quality often comes at the cost of increased processing time.

💡Prompt

A 'prompt' in the video is a set of instructions or inputs given to an AI model to generate a specific image. It is a crucial aspect of the image generation process, as the prompts directly influence the output. The video discusses how to craft effective prompts by including various 'composition spells' and 'angle spells' to achieve the desired image. The prompts must be carefully constructed to ensure the AI understands the intended composition and viewpoint.

💡Trial and Error

The process of 'trial and error' involves attempting different methods or inputs to achieve a desired outcome, learning from the results, and making adjustments as necessary. In the context of the video, this refers to the iterative process of generating images using different 'spells' and image sizes to find the optimal combination that produces the best visual result. It highlights the importance of experimentation in the image generation process, as the AI model may not always produce the exact desired image on the first attempt.

💡Img2img

The term 'img2img' refers to a method or feature in some AI models that allows for the generation of an image based on an existing image as a reference. This technique can be used when the desired composition or angle cannot be achieved through textual prompts alone. By using 'img2img', users can input a reference image and guide the AI to create a new image that aligns more closely with their vision.

💡Openpose

In the context of the video, 'Openpose' refers to pose-based control of image generation: a skeleton pose, either detected from a reference image or arranged by hand in an 'openpose editor', constrains the pose of the generated character. The video suggests 'Openpose' as a solution when the desired composition or angle is not achieved through prompts alone.

💡Ai Gene

'Ai Gene' (AIジェネ) is the name of the channel that published the video, not a model or technology. Per its tagline, the channel shares information about AI illustration generation.

Highlights

Creating a composition that shows the whole body can be achieved by using specific spells and image sizes.

Mastering spells such as composition and angle viewpoint is crucial for efficient image generation.

For a composition from the chest upwards, using 'head shot' can generate the desired image.

When generating a chest-up composition, remove spells such as 'skirt' or 'denim' from the prompt, since they pull the lower body into the frame.

A square image size, such as 512x512, facilitates easier generation of high-quality images.

The term 'head' in prompts often results in images drawn up to the chest area.

Generating an image from the middle of the thigh can be done using the 'cowboy shot' spell.

To prevent the model from generating an image with a hat, add '(no hat:1.3)' to the prompt.

For a full-body composition, setting the image size to 512x1024 improves the quality of the eyes.

The most recommended method for generating high-quality images is using a width of 512 and a height of 768 with 'hires.fix'.

Entering 'shoot from front' produces a front-facing angle, with the character's gaze directed at the camera.

Angles such as from above, below, or the side can be specified using respective spells.

The 'looking viewer' spell is used to make the character look directly at the camera.

Adjusting the character's gaze upwards or downwards can be done with 'looking up: 1.2' or 'looking down: 1.2'.

For a sideways composition, use 'shoot from front, looking side'.

To generate an image where the character looks behind, use the 'looking back' spell.

The 'looking away: 1.4' spell can make the character look elsewhere instead of the camera.

If the desired composition is not achieved, changing the image size or using 'img2img' or 'openpose' can be solutions.

Understanding and inputting composition and angle spells can save time in generating ideal images.