Adobe Answers the Sora Question & New SD3 Features!

Theoretically Media
18 Apr 2024 · 15:52

TLDR: In this video, the host covers several advancements in AI. Fresh from the NAB convention in Las Vegas, he looks at the newly released Stable Diffusion 3, which excels at prompt understanding and at generating text within images. He also interviews Kyle from Adobe about the upcoming generative AI video features for Adobe Premiere, including object removal, object addition, and generative extend, and they touch on the potential for generative audio and the integration of third-party AI models. The video then covers Microsoft's entry into the AI avatar market with VASA-1, a lifelike, audio-driven talking-face technology. Finally, the host shares his experience as a judge in the first AI art esports competition, highlighting the potential for community events centered around AI technology.

Takeaways

  • 📈 Stable Diffusion 3 has been released with improved prompt understanding and the ability to generate legible text within images.
  • 🔍 The new version includes features like a creative upscaler up to 4K, in-painting, out-painting, search and replace, and background removal.
  • 🎨 Stable Diffusion 3 introduces image-to-image editing with a mask, allowing for text or image prompts to modify specific parts of an image.
  • 🖼️ A comparison between Stable Diffusion 3 and XL shows SD3's superior clarity and detail in image generation.
  • 🤖 Microsoft has entered the AI avatar market with VASA-1, an audio-driven, lifelike talking face generation system.
  • 🌟 Adobe is integrating generative AI video features into Adobe Premiere, including object removal, addition, and generative extend functionalities.
  • 📹 Adobe's Firefly video model will continue to be commercially safe, and third-party models used will have content credentials attached.
  • 🔗 Adobe is exploring collaborations with major video AI model providers like OpenAI, Runway, and Pika for Premiere Pro.
  • 🎥 The generative AI features will be available within the Adobe ecosystem, likely as clip extensions in Premiere.
  • 📢 Adobe is also working on generative audio to accompany clip extensions, offering more control and granularity in the future.
  • 🏆 The first AI art esports competition was held, showcasing the potential for community events centered around AI-generated content.

Q & A

  • What is the main focus of Adobe's presence at the NAB 2024 show?

    -The main focus of Adobe's presence at the NAB 2024 show is generative AI and collaboration. They are working on a video model within their AI models collection, Firefly, which will be announced later in the year.

  • What are some of the new features in Stable Diffusion 3?

    -Stable Diffusion 3 features improved prompt understanding and text generation within images. It also includes a creative upscaler that can upscale images up to 4K, in-painting and out-painting, search and replace without a mask, a built-in background remover, and an image-to-video feature.

  • How does Adobe's generative AI video feature for Adobe Premiere integrate with the platform?

    -The generative AI video features for Adobe Premiere will be integrated as part of the Adobe ecosystem, allowing for functionalities such as object removal, object addition, generative extend, and clip extensions with generative audio.

  • What is the difference between Stable Diffusion 3 and Stable Diffusion XL?

    -Stable Diffusion 3 offers better prompt understanding and text generation within images than Stable Diffusion XL, and its outputs are reported to be crisper and sharper, with more detailed and accurate representations in the generated images.

  • What is the Stable Assistant Beta platform?

    -Stable Assistant Beta is a platform announced by Stability AI. It is a friendly chatbot that allows paying subscribers to access the latest models, generate images, write content, and match photos to text through conversation.

  • How does Microsoft's VASA-1 differ from other AI avatar models?

    -VASA-1 differs from other AI avatar models in that it is audio-driven rather than text-based. It generates lifelike talking faces in real time, with impressive lip sync and a wide range of facial nuances and natural head motions.

  • What is the significance of the AI art esports competition?

    -The AI art esports competition signifies the potential for community events based around generative AI technology. It involved contestants using Leonardo's real-time drawing tool to create images based on prompts, showcasing the exciting possibilities of AI in creative and competitive settings.

  • What is the role of Adobe's Firefly image model in the development of their video model?

    -Firefly is Adobe's collection of AI models, and it plays a crucial role in the development of the video model by providing the foundation for integrating generative AI features into Adobe's video editing software.

  • What are the potential commercial implications of Adobe's generative AI video features?

    -The commercial implications of Adobe's generative AI video features include the ability to offer more efficient and creative editing options to users, potentially enhancing the quality of video content and providing a competitive edge in the market.

  • How does the AI art esports competition demonstrate the potential of AI in creative fields?

    -The AI art esports competition demonstrates the potential of AI in creative fields by showcasing how generative AI can be used in a timed, competitive environment to create unique and responsive art pieces, indicating its utility in various creative applications.

  • What are the benefits of the 'search and replace' feature in Stable Diffusion 3?

    -The 'search and replace' feature in Stable Diffusion 3 allows users to replace objects in images using simple language prompts without the need for complex masking tools, making the image editing process more accessible and efficient.

  • How does the integration of generative AI into Adobe Premiere Pro affect the workflow of video editors?

    -The integration of generative AI into Adobe Premiere Pro is expected to streamline the workflow of video editors by providing smart masking, object removal, and other AI-driven enhancements, allowing for more precise and automated editing processes.

Outlines

00:00

🚀 Stable Diffusion 3 Release and Features

The video begins with the host returning from the NAB convention in Las Vegas and jumping straight into the latest developments in AI. A significant focus is Stable Diffusion 3, which has been released with improved prompt understanding and the ability to generate text within images. The host mentions an exclusive look at the new features, including a creative upscaler that can reach up to 4K resolution, inpainting and outpainting, and a search and replace feature that works without a mask. The discussion also covers the built-in background removal and the image-to-video feature. Comparisons between Stable Diffusion 3 and Stable Diffusion XL showcase the advancements in image quality and detail. The host also mentions Stable Assistant Beta, a new platform offering a friendly chatbot for generating images and content through conversation, with pricing plans available for subscribers.

05:00

🎥 Adobe Premiere's AI Video Generation Features

The host details his conversation with Kyle from Adobe regarding the upcoming AI video generation features in Adobe Premiere. They discuss the potential object removal, generative fill, object addition, and generative extend functionalities that will be powered by Adobe's AI models. The conversation explores the possibility of integrating third-party AI models into Premiere and the intention to give users a choice of models that suit their specific needs. The host also asks about the commercial safety of third-party models and the content credentials that Adobe will attach to media generated through its toolset. The discussion touches on the potential for generative audio and the future of AI tools in After Effects, with a focus on smart masking and manual masking improvements in Premiere.

10:00

🤖 Microsoft's VASA-1 AI Avatars and Real-Time Lip Sync

The host introduces Microsoft's entry into the AI avatar space with VASA-1, a technology that generates lifelike, audio-driven talking faces in real time. Unlike text-based models, VASA-1 is driven by actual recorded audio, resulting in highly realistic lip sync and facial expressions. The host shares examples of VASA-1 in action, noting the impressively natural head motions and the ability to control camera angles and text prompts. Despite some minor issues with head movement and hair tracking, the host is overall impressed with the technology and its potential applications, hinting at its use in platforms like Microsoft Teams.

15:04

🎨 First AI Art Esports Competition and Community Engagement

The video concludes with the host's experience as a judge in the first AI art esports competition, organized by Creative Refuge. The event involved contestants using Leonardo's real-time drawing tool to generate images based on prompts given by the audience. The host describes the competition as more exciting than anticipated, highlighting the potential for community events centered around AI technology. He encourages viewers to check out Creative Refuge's channel for a full rundown of the event and thanks them for watching before signing off.

Keywords

💡Stable Diffusion 3

Stable Diffusion 3 is an advanced AI model for image generation that is reported to excel at understanding prompts and generating text within images. It is part of the discussion as the speaker shares insights about its capabilities, such as the 'creative upscaler' that can upscale images up to 4K resolution and features like 'search and replace' that allow for object replacement without needing a mask. This technology is central to the video's theme of exploring new advancements in AI.
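
The feature list above maps onto Stability AI's hosted API, so a rough sketch may help make it concrete. The endpoint path, field names, and the "sd3" / "sd3-turbo" model identifiers below are assumptions based on the publicly documented v2beta API around the time of this video and may have changed since; treat this as a sketch, not the definitive interface.

```python
import requests

# Minimal sketch: text-to-image with the hosted Stable Diffusion 3 API.
# Endpoint and parameter names are assumed from the v2beta docs (April 2024).
response = requests.post(
    "https://api.stability.ai/v2beta/stable-image/generate/sd3",
    headers={
        "authorization": "Bearer YOUR_STABILITY_API_KEY",  # placeholder key
        "accept": "image/*",  # request raw image bytes instead of JSON
    },
    files={"none": ""},  # forces multipart/form-data, which this endpoint expects
    data={
        "prompt": "a neon sign that reads 'OPEN LATE' above a rain-slick street",
        "model": "sd3",          # or "sd3-turbo" for the faster variant
        "aspect_ratio": "16:9",
        "output_format": "png",
    },
)
response.raise_for_status()

with open("sd3_output.png", "wb") as f:
    f.write(response.content)
```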

💡Adobe Premiere

Adobe Premiere is a professional video editing software that the speaker discusses in the context of new AI video generation features being integrated into it. The video touches on the upcoming capabilities such as object removal and addition, which are significant for video editing and post-production workflows. These features are integral to the narrative of AI enhancing creative processes.

💡AI Avatar

The term AI avatar refers to technology that generates realistic, human-like virtual characters. The video mentions Microsoft's entry into this space with VASA-1, an audio-driven AI avatar system that can produce lifelike talking faces in real time. This technology is showcased as an impressive example of AI's ability to mimic human expressions and movements.

💡Search and Replace

In the context of the video, 'search and replace' is a feature of Stable Diffusion 3 that allows users to replace objects within an image using simple language prompts, without the need for detailed masking. This feature is significant as it simplifies the image editing process and is highlighted as a time-saving innovation for designers and editors.
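
To make "without the need for detailed masking" concrete, here is a hedged sketch against Stability AI's search-and-replace edit endpoint: the object to swap out is described in a `search_prompt` and its replacement in `prompt`, with no mask image uploaded. The endpoint path and parameter names are assumptions drawn from the public v2beta API and may differ in current releases.

```python
import requests

# Minimal sketch: swap an object by describing it in plain language, no mask required.
# Endpoint and field names assume the v2beta Stability API as of April 2024.
response = requests.post(
    "https://api.stability.ai/v2beta/stable-image/edit/search-and-replace",
    headers={
        "authorization": "Bearer YOUR_STABILITY_API_KEY",  # placeholder key
        "accept": "image/*",
    },
    files={"image": open("living_room.png", "rb")},  # hypothetical source image
    data={
        "search_prompt": "the red armchair",      # what to find in the image
        "prompt": "a mid-century leather sofa",   # what to put in its place
        "output_format": "png",
    },
)
response.raise_for_status()

with open("living_room_edited.png", "wb") as f:
    f.write(response.content)
```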

💡Image to Video

The 'image to video' feature mentioned in the video connects to Stable Video Diffusion, allowing users to turn still images into video clips. This is an important aspect of the evolving AI technology, as it expands the possibilities for content creation and storytelling.
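
On the hosted API this hookup appears to be asynchronous: you submit a still frame, receive a generation ID, and poll for the finished clip. In the sketch below, treat the endpoint paths, the example parameter values, and the 202-while-rendering convention as assumptions based on the public v2beta API, not as something confirmed in the video.

```python
import time
import requests

API_KEY = "YOUR_STABILITY_API_KEY"  # placeholder
headers = {"authorization": f"Bearer {API_KEY}"}

# 1. Submit a still image; the service returns a generation ID, not the video itself.
start = requests.post(
    "https://api.stability.ai/v2beta/image-to-video",
    headers=headers,
    files={"image": open("keyframe.png", "rb")},       # hypothetical input frame
    data={"cfg_scale": 1.8, "motion_bucket_id": 127},  # assumed example values
)
start.raise_for_status()
generation_id = start.json()["id"]

# 2. Poll until the clip is ready: 202 means still rendering, 200 returns the MP4 bytes.
while True:
    result = requests.get(
        f"https://api.stability.ai/v2beta/image-to-video/result/{generation_id}",
        headers={**headers, "accept": "video/*"},
    )
    if result.status_code == 202:
        time.sleep(10)
        continue
    result.raise_for_status()
    break

with open("clip.mp4", "wb") as f:
    f.write(result.content)
```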

💡Masking Tools

Masking tools are a part of image and video editing that allow for the selection and modification of specific parts of an image or video. The video discusses how AI advancements are reducing the need for manual masking by providing smarter, more automated ways to select and edit portions of visual content.

💡Content Credentials

Content credentials are likened to a 'nutrition label' for AI-generated media. They provide information about whether a piece of media is entirely AI-generated or just modified, and which model was used in its creation. This concept is important for transparency and traceability in the era of synthetic media.

💡Firefly Image Model

Firefly is Adobe's collection of generative AI models, of which the Firefly Image Model is the image-generation component the company is using to enhance its creative software offerings. The video discusses how Adobe is working on integrating these models into its ecosystem, starting with Adobe Premiere for video editing, to provide more intuitive and automated editing capabilities.

💡VASA-1

VASA-1 is Microsoft's AI avatar technology, distinguished by being audio-driven: it generates realistic facial expressions and lip movements in response to audio input. This technology is highlighted in the video as a significant leap toward more natural and believable virtual characters.

💡AI Art Esports Competition

The AI Art Esports Competition is an event where contestants use real-time drawing tools to generate images based on prompts, in a game similar to Pictionary. The video describes this event as a fun and engaging way to showcase the potential of AI in creative community events.

💡Smart Masking

Smart Masking is a feature that is being improved and integrated into Adobe Premiere, which uses AI to assist with the process of masking, or isolating specific parts of an image or video for editing. The video emphasizes the importance of this feature in streamlining the editing process and enhancing user control.

Highlights

Stable Diffusion 3 has been released with improved prompt understanding and the ability to generate text within images.

Two versions of Stable Diffusion 3 are available: Stable Diffusion 3 and Stable Diffusion 3 Turbo.

Stable Diffusion 3 features a creative upscaler that can upscale images up to 4K resolution.

The new platform includes a search and replace feature that allows users to replace objects in images using simple language prompts.

Background removal is a built-in feature of Stable Diffusion 3, saving time for users.

The image-to-video feature connects directly to Stable Video Diffusion, offering new creative possibilities.

Image-to-image with a mask allows selective modification of specific parts of an image, guided by a text or image prompt.

Early examples of Stable Diffusion 3 showcase a significant improvement over Stable Diffusion XL in image quality and detail.

Adobe has announced generative AI video features to be integrated into Adobe Premiere.

Adobe's Firefly is a collection of AI models that will include a video model to be announced later this year.

Adobe Premiere will include object removal and addition, generative extend, and generative audio features.

Adobe is working on providing more control and granularity in their AI models over time.

Content credentials will be attached to media generated through Adobe tools, providing transparency on the generation process.

Stable Assistant Beta is a new platform offering a friendly chatbot for generating images, writing content, and matching photos to text.

Microsoft has entered the AI avatar game with VASA-1, an audio-driven, lifelike talking-face generation system.

VASA-1 captures a wide spectrum of facial nuances and natural head motions for a realistic and lively character.

The first AI art esports competition was held, showcasing the potential for community events based on AI technology.

The competition used Leonardo's real-time drawing tool, where contestants had one minute to generate an image based on a prompt.