Dall-E 3, Sora, & ChatGPT Plus: Stable Audio vs Suno v3 & New Video Generator!

Theoretically Media
4 Apr 2024 · 11:16

TLDR: In this AI news update, OpenAI introduces in-painting for DALL-E 3, a feature that was long overdue. Stability AI releases Stable Audio 2.0, which offers free music generation, though it lags behind Suno in quality. ChatGPT 3.5 is now accessible for free without a login. Sora's first music video, 'Worldweight,' showcases its potential but faces competition from Haiper. A new portrait animator, AniPortrait, emerges, and an upcoming video model from Higgsfield AI is announced with a focus on video editing and generation improvements.

Takeaways

  • 🎨 OpenAI has introduced an in-painting feature in DALL-E 3, allowing users to edit images by adding or changing elements within the picture.
  • 🖌️ The in-painting process is not as intuitive as one might expect and requires users to manually select and edit areas of the image using a selection brush.
  • 🍞 An example given in the script involved adding butter to a piece of toast in an image, which resulted in an overly exaggerated depiction.
  • 🎶 Stability AI released Stable Audio 2.0, which can create full musical tracks up to 3 minutes long from a single prompt and offers 20 free credits per month.
  • 🥇 Suno, another AI music generator, is considered superior to Stable Audio in terms of audio fidelity and genre-appropriate instrumentation.
  • 🎵 Users can add their own audio into Stable Audio as a reference, which is a unique feature not found in other AI music generators.
  • 🆓 OpenAI now allows users to access ChatGPT 3.5 for free without needing to log in, expanding accessibility to its AI chat model.
  • 🎬 The first music video created with Sora has been released, showcasing its capabilities in generating visuals for music tracks.
  • 🌟 The video created with Sora, titled 'Worldweight' by August Kamp, has an ambient electronic feel, reminiscent of artists like Aphex Twin.
  • 👤 A new portrait animator called AniPortrait has been introduced, which uses a combination of a reference photo and video to generate realistic animated portraits.
  • 🚀 Higgsfield, a new video generator with a focus on editing characters and objects within videos, has been announced with a beta sign-up available.

Q & A

  • What new feature has been introduced in DALL-E 3 that was long overdue?

    -The new feature introduced in DALL-E 3 is the in-painting capability, which allows users to edit images by adding or changing elements within the image, such as adding butter to a piece of toast.

  • What was the speaker's initial impression of DALL-E 3's output aesthetic?

    -The speaker was not a huge fan of DALL-E 3's output from an aesthetic standpoint; it did not jibe with their personal preferences.

  • How does the integration of DALL-E 3 with ChatGPT enhance its functionality?

    -The integration of DALL-E 3 with ChatGPT allows users to chat with their image generator, providing a more interactive and dynamic experience in creating and editing images.

  • What is the main advantage of Stable Audio 2.0 compared to its previous version?

    -Stable Audio 2.0 creates full musical tracks up to 3 minutes in length from a single prompt and offers 20 free credits per month, making it more accessible and cost-effective for users.

  • How does the speaker describe the audio quality of Suno compared to Stable Audio?

    -The speaker believes that Suno has better audio quality, offering higher fidelity, more accurate instrumentation, and composition choices that align better with the genre prompted.

  • What unique feature does Stable Audio offer that Suno does not?

    -Stable Audio allows users to add their own audio as a reference, providing more personalized and creative uses for the generated music.

  • How can one access and use ChatGPT 3.5 for free without logging in?

    -By visiting the homepage and clicking on 'try it out,' users can start using ChatGPT 3.5 for free without the need to log in.

  • What is the significance of the first music video created with Sora?

    -The first music video created with Sora, titled 'Worldweight' by August Kamp, showcases the capabilities of Sora in generating visuals for music, offering a new tool for creative expression.

  • How does the speaker view the public's perception of Sora?

    -The speaker feels there has been a shift in public opinion regarding Sora, with some seeing it as exclusive or part of a 'cool kids table' due to its limited access and promotional events.

  • What is the main goal of Higgsfield, the new video generator mentioned in the script?

    -Higgsfield aims to build an improved video editor that enables users to modify characters and objects in videos, and to train a more powerful video generation model.

  • What is the speaker's upcoming event related to AI and filmmaking?

    -The speaker will be attending the Curious Refuge AI filmmaking Mega party and will be judging the world's first AI Esports tournament alongside other notable figures.

Outlines

00:00

🚀 OpenAI Updates and New Tools

This paragraph discusses recent updates from OpenAI, including the introduction of in-painting in DALL-E 3, a feature users had long anticipated. The speaker expresses mixed feelings about DALL-E 3's aesthetic outputs but appreciates its integration with ChatGPT. The paragraph also previews a new portrait animator and an upcoming video model. Additionally, the speaker reviews a significant audio update from Stability AI: Stable Audio 2.0, which generates full musical tracks from a single prompt and is available for free with a limited credit system.

05:01

🎶 AI Music Generation and Sora News

The focus of this paragraph is on AI-generated music and the capabilities of different platforms. It compares the output of Stability AI's Stable Audio 2.0 with that of Suno, noting that while Suno offers better audio fidelity and more accurate genre representation, Stable Audio's ability to incorporate a user's own audio as a reference is a unique feature. The paragraph also touches on the public's shifting perception of Sora and the release of the first music video created with Sora, comparing its results with what free tools like Haiper can produce.

10:02

🎨 New Portrait Animator and Upcoming Video Model

This paragraph introduces AniPortrait, a new portrait animator inspired by the emotive avatar talker. It uses a combination of reference photos and videos to generate realistic animated portraits. The speaker also discusses an upcoming video model from Higgsfield AI, led by Alex Mashrabov, former head of AI at Snap. Higgsfield aims to improve video editing by allowing modifications to characters and objects in videos, and to train a more powerful video generation model. The speaker expresses excitement about the potential of these new tools and provides links for further exploration.

Keywords

💡AI news

AI news refers to the latest updates and developments in the field of artificial intelligence. In the context of the video, it sets the stage for discussing various AI-related topics, such as software updates, new AI models, and their applications.

💡OpenAI

OpenAI is an organization dedicated to ensuring that artificial general intelligence (AGI) benefits all of humanity. In the video, OpenAI is mentioned as a source of updates and new features, such as the in-painting feature in DALL-E 3, which allows users to edit images interactively.

💡DALL-E 3

DALL-E 3 is OpenAI's AI image generation tool, integrated with ChatGPT. It generates images from textual prompts. The video talks about the addition of an in-painting feature, which enhances its utility for users.

💡Stability AI

Stability AI is a company focused on developing AI models for various applications, including audio generation. In the video, Stability AI is mentioned in relation to its audio update, Stable Audio 2.0, which allows the creation of full musical tracks from a single prompt.

💡Suno

Suno is an AI music generation platform that is recognized for its high-quality audio output and the ability to generate music in line with specific genres. The video compares Suno's music generation capabilities with those of Stability AI, highlighting Suno's superior audio fidelity and instrumentation choices.

💡Sora

Sora is OpenAI's video generation model, which at the time of the video was available only to a limited group of users. In the video, the first music video created with Sora is discussed, which showcases the model's capabilities in generating visuals that can be edited into a cohesive aesthetic.

💡AniPortrait

AniPortrait is an AI tool for animating realistic portraits, inspired by the emotive avatar talker but taking a different approach. It uses a combination of reference photos and videos to generate high-quality animated portraits.

💡Higgsfield

Higgsfield is an upcoming video generation platform led by Alex Mashrabov, the former head of AI at Snap. It aims to build an improved video editor and a more powerful video generation model, focusing on character and object modifications in videos.

💡AI-generated music

AI-generated music refers to the process of using artificial intelligence to create original musical compositions. In the video, this concept is explored through the comparison of different AI platforms, such as Stability AI's Stable Audio and Suno, which can generate music based on textual prompts.

💡AI filmmaking

AI filmmaking involves the use of artificial intelligence in the creation and production of films, including the generation of scripts, characters, dialogues, and visual effects. The video touches on this concept by discussing the use of AI in creating a music video with Sora and the potential for AI to revolutionize the filmmaking process.

💡AI tools

AI tools refer to software applications and platforms that utilize artificial intelligence to perform various tasks, such as image editing, music composition, and video generation. The video provides an overview of several AI tools and their features, highlighting the versatility and capabilities of these technologies.

Highlights

AI news continues to advance at a rapid pace, even during a slow week.

OpenAI introduces an in-painting feature in DALL-E 3, allowing users to edit images interactively.

DALL-E 3's in-painting is not as intuitive as expected, requiring manual selection and editing.

The new portrait animator, AniPortrait, shows promise by combining reference photos and videos for realistic outputs.

Stability AI's drama is not discussed, but Stable Audio 2.0 is highlighted for creating full musical tracks from a single prompt.

Stable Audio 2.0 offers 20 free credits per month, enabling users to explore AI-generated music at no cost.

Suno's AI-generated music is praised for its higher quality and adherence to the prompted genre.

OpenAI now allows free access to ChatGPT 3.5 without the need for login, expanding accessibility to its AI model.

A new video model from Higgsfield AI is on the horizon, promising an improved video editor and a more powerful video generation model.

Sora's first music video, 'Worldweight', showcases the potential of AI in creating visually appealing content.

The comparison between Sora and Haiper highlights the potential for similar results with free tools.

AniPortrait builds on the emotive avatar talker concept, using a combination of reference materials for enhanced realism.

AI-generated content is increasingly being used in various creative applications, such as music videos and filmmaking.

The AI community is abuzz with new developments, indicating a surge of innovative tools and applications in the near future.

The NAB show and Curious Refuge AI filmmaking Mega party are upcoming events showcasing the latest in AI technology.

AI advancements are not limited to large corporations; lean startups like Higgsfield are also contributing to the field.

The speaker, Tim, emphasizes the continuous evolution and expansion of AI capabilities and their practical applications.