NEW Stable AUDIO Released!

Sebastian Kamph
14 Sept 2023 · 06:23

TLDR: The video introduces Stable Audio, an open-source tool for generating music from prompts. The speaker expresses amazement at the variety of music styles produced by short text inputs, such as epic trailers, Lo-Fi beats, and bluegrass. They discuss the capabilities of the tool, including creating full tracks with instruments and sound effects, and mention the expected release of open-source models and of training code for building one's own audio-generation models. Despite some technical difficulties due to high traffic, the video encourages viewers to explore Stable Audio on its website.

Takeaways

  • 🚀 Stability AI has launched Stable Audio, an open-source tool for generating music based on prompts.
  • 🎵 The tool can create a variety of music styles, such as epic trailer music, Lo-Fi hip-hop, and bluegrass.
  • 🎶 Prompts can be as simple as a genre or as detailed as specifying tempo and instruments.
  • 🎧 The generated music includes full tracks with instruments and sound effects.
  • 🎷 Even without deep knowledge of music theory, users can appreciate the quality of the generated music.
  • 🖼️ The creators of Stable Audio are also behind the popular AI art generator, Stable Diffusion.
  • 🌐 Open-source models based on Stable Audio are expected to be released, allowing users to train their own models.
  • 💻 Currently, users can test Stable Audio on its website, although it may be experiencing high traffic.
  • 🎼 The tool can generate sound effects, such as a restaurant ambiance or an airplane pilot's announcement.
  • 📈 There is a tiered pricing model for Stable Audio, with a basic free option and a paid plan for more tracks.
  • 🔥 The video encourages viewers to try Stable Audio for themselves, despite the current traffic issues.

Q & A

  • What is the main topic of the video transcript?

    -The main topic of the video transcript is the introduction and exploration of Stable Audio, an open-source tool for audio generation developed by Stability AI.

  • What is the significance of Stable Audio for musicians and audio creators?

    -Stable Audio is significant because it allows users to generate music tracks using simple text prompts, which can greatly streamline the music creation process and inspire new ideas without the need for extensive musical knowledge or equipment.

  • How does the speaker describe their experience with the 'Epic trailer music' prompt?

    -The speaker describes their experience as quite cool and amazing, noting how the tool was able to generate intense tribal percussion and brass music from a short text prompt.

  • What is the drummer's unique naming convention for his twin girls?

    -The drummer named his twin girls 'One' and 'Two', which the speaker uses as an opportunity to make a dad joke in the video.

  • What is the speaker's opinion on the 'Lo-Fi hip-hop beat' example?

    -The speaker is impressed by the 'Lo-Fi hip-hop beat' example, appreciating the melodic and chill hop at 85 BPM that the AI generated from a brief prompt.

  • How does the speaker react to the 'Bluegrass' music example?

    -Although the speaker mentions that 'Bluegrass' is not personally their style, they express amazement at the quality and authenticity of the music generated by the AI from a simple prompt.

  • What is the speaker's experience with the 'piano solo chord' example?

    -The speaker is impressed by the 'piano solo chord' example, noting that the AI seems to have understood and applied a chord progression in a major key, which they associate with a happy and uplifting sound.

  • What other types of audio, besides music tracks, can the Stable Audio tool generate?

    -Besides music tracks, the Stable Audio tool can also generate sound effects, such as the voices of people talking in a restaurant or an airplane pilot speaking over the intercom.

  • What is the speaker's anticipation regarding future releases from Stability AI?

    -The speaker is looking forward to upcoming releases from Stability AI, including open-source models based on Stable Audio and training code that will allow users to train their own audio-generation models.

  • What is the current status of the speaker's attempt to generate 'Epic Funk rap beats'?

    -The speaker's attempt to generate 'Epic Funk rap beats' takes a long time because the service is overwhelmed by high traffic. The speaker suggests that viewers might have better luck, or should try it themselves later.

  • How does the speaker feel about the pricing model of Stable Audio?

    -The speaker notes that the service is not entirely free: up to 20 tracks of up to 45 seconds each are available per month at no cost, with a $12 per month fee beyond that. The speaker considers this fairly affordable and appreciates that open-source models will also be available.

Outlines

00:00

🎶 Introduction to Stable Audio and its Features

The paragraph introduces Stable Audio, an open-source tool for generating music, developed by the team behind Stable Diffusion. The speaker expresses excitement about the capabilities of Stable Audio and shares a personal anecdote involving a drummer with twin girls named 'One' and 'Two'. The main features of Stable Audio are discussed, including its ability to create music based on text prompts, such as 'Epic trailer music' and 'Lo-Fi hip-hop beat', and the variety of genres it can produce. The speaker also mentions the technical description 'fast timing-conditioned latent audio diffusion' and the incorporation of instruments and sound effects. The paragraph concludes with information about the availability of open-source models and the pricing structure of Stable Audio's services.

05:00

🎵 Exploring Different Music Genres and Sound Effects

This paragraph delves into the diverse genres of music that can be generated using Stable Audio, including 'Bluegrass', 'synth pop', and 'ambient techno with Scandinavian Forest'. The speaker shares their personal preferences and experiences with the generated music, highlighting the impressive quality despite the simplicity of the prompts. The paragraph also touches on the ability of Stable Audio to create realistic sound effects, such as a restaurant ambiance and an airplane pilot's announcement. The speaker recommends that listeners try Stable Audio for themselves, acknowledging that the service might be experiencing high traffic and potential delays.

Keywords

💡Open source

Open source refers to something that can be freely used, modified, and shared because its design is publicly accessible. In the context of the video, it refers to the open-source models and training code expected for Stable Audio, which would allow community contributions and local experimentation with the technology.

💡Stability AI

Stability AI is the company behind the Stable Audio tool; the same team is behind the AI art generator Stable Diffusion. In the video, it is the developer whose upcoming open-source models and training code the speaker is looking forward to.

💡Audio diffusion

Audio diffusion refers to AI models that generate audio by learning from data and iteratively refining their outputs. In the video, the term comes from the description 'fast timing-conditioned latent audio diffusion' and refers to the process by which Stable Audio creates music based on user prompts.

💡Music generation

Music generation is the process of creating music using artificial intelligence, where the AI learns patterns and styles from existing music to produce new compositions. In the video, it is the core functionality of Stable Audio, demonstrating the tool's ability to create various music styles based on user input.

💡Sound effects

Sound effects are audio elements used to enhance the atmosphere, mood, or narrative of a production, such as a film, video game, or virtual environment. In the video, sound effects are part of Stable Audio's capabilities, allowing users to generate not just music but also ambient sounds and specific audio cues.

💡Pricing model

A pricing model is the structure a company uses to charge for its products or services. In the video, it is mentioned in relation to Stable Audio's business model, where users can generate a limited number of tracks for free but need to pay a subscription fee for more extensive use.

💡Community interest

Community interest refers to the level of engagement and enthusiasm from a group of people, typically those who share common interests or goals. In the video, it reflects the anticipation from the user base regarding the potential release of open-source models for Stable Audio, indicating a strong desire for accessible and customizable AI tools.

💡User prompts

User prompts are inputs provided by users to guide an AI in generating specific outputs. In the video, these prompts are textual descriptions of the desired audio, such as genre, mood, tempo, or specific instruments, which Stable Audio uses to create custom music or sound effects.

💡BPM

BPM stands for beats per minute and is a measure of the tempo of music. It is a crucial aspect of music composition and performance, as it establishes the rhythm and pace of a piece. In the video, BPM specifies the tempo in a prompt, as in 'Lo-Fi hip-hop beat, melodic chill hop 85 BPM.'
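To make the tempo figure concrete, the short Python sketch below converts a BPM value into seconds per beat and counts how many beats fit in a clip. The function names are illustrative only, and the 4/4 time signature used to count bars is an assumption; the video does not state one. The 85 BPM and 45-second values are the ones mentioned in the video.

```python
def seconds_per_beat(bpm):
    """Length of one beat in seconds: 60 seconds divided by beats per minute."""
    return 60.0 / bpm


def beats_in_clip(bpm, clip_seconds):
    """Number of beats that fit in a clip of the given length."""
    return bpm * clip_seconds / 60.0


def bars_in_clip(bpm, clip_seconds, beats_per_bar=4):
    """Number of bars in the clip, assuming 4 beats per bar (4/4) by default."""
    return beats_in_clip(bpm, clip_seconds) / beats_per_bar


if __name__ == "__main__":
    bpm = 85           # tempo from the Lo-Fi hip-hop prompt
    clip_seconds = 45  # maximum free-tier track length mentioned in the video
    print(round(seconds_per_beat(bpm), 3))   # about 0.706 seconds per beat
    print(beats_in_clip(bpm, clip_seconds))  # 63.75 beats
    print(bars_in_clip(bpm, clip_seconds))   # just under 16 bars in 4/4
```

At 85 BPM, a 45-second track therefore holds roughly 16 bars if the piece is in 4/4.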

💡Chord progression

A chord progression is a series of chords played in a particular order, forming the harmonic foundation of a piece of music. It is a fundamental aspect of music theory and composition, as it creates tension and resolution, contributing to the overall mood and structure of a song. In the video, the mention of a 'solo chord progression major key' suggests an understanding of music theory in the AI's output.
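As a rough illustration of what a chord progression in a major key looks like, the Python sketch below builds a major scale and stacks thirds on chosen scale degrees. It is only a music-theory example and says nothing about how Stable Audio works internally. The I-IV-V-I progression and the key of C are assumptions chosen for the example, and notes are spelled with sharps only, so flat keys would be named unconventionally.

```python
# Twelve pitch classes, spelled with sharps only (a simplification).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Whole and half steps (in semitones) between successive major-scale degrees.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def major_scale(root):
    """Return the seven notes of the major scale starting on root."""
    index = NOTE_NAMES.index(root)
    scale = [root]
    for step in MAJOR_STEPS[:-1]:  # the last step only returns to the root
        index = (index + step) % 12
        scale.append(NOTE_NAMES[index])
    return scale


def triad(scale, degree):
    """Build a triad on a 1-based scale degree by stacking thirds in the scale."""
    start = degree - 1
    return [scale[(start + offset) % 7] for offset in (0, 2, 4)]


def progression(root, degrees):
    """Return the triads for a list of scale degrees in the given major key."""
    scale = major_scale(root)
    return [triad(scale, degree) for degree in degrees]


if __name__ == "__main__":
    # I-IV-V-I in C major: C major, F major, G major, C major
    for chord in progression("C", [1, 4, 5, 1]):
        print(chord)
```

The chords on degrees 1, 4, and 5 of a major scale are all major triads, which is part of why such progressions tend to sound bright and uplifting.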

💡Traffic overload

Traffic overload occurs when a service or platform experiences an unusually high volume of users or requests, leading to slower performance or even downtime. In the video, it refers to the strain on Stable Audio's servers from a surge of users trying the tool, highlighting the demand for AI-generated audio content.

Highlights

Stability AI launched Stable Audio, a tool for music generation.

Stable Audio allows users to create music through text prompts.

The tool is described as using fast timing-conditioned latent audio diffusion.

An example prompt generated epic trailer music with intense tribal percussion and brass.

Lo-Fi hip-hop beat and melodic chill hop at 85 BPM were demonstrated.

Bluegrass genre was created from a simple prompt, showcasing the tool's versatility.

Full music tracks with instruments and sound effects can be generated.

A piano solo chord progression in a major key was played, impressing the speaker.

Sound effects like people talking in a restaurant and an airplane pilot speaking were accurately generated.

Stable Diffusion is a leading AI art generator, and the team behind it is now working on audio generation.

Open source models based on Stable Audio are expected to be released.

Stable Audio has a tiered pricing model: 20 free tracks per month of up to 45 seconds each, and a $12 per month plan for more.

The tool is currently overwhelmed with traffic, indicating high demand.

An epic Funk rap beat with piano and violin was attempted but took a long time to generate.

Synth pop with a big reverb synthesizer pad chord was played, showcasing the tool's capabilities.

Calm meditation music suitable for a spa lobby was generated.

An electric guitar top line solo instrumental was created, reminiscent of classic rock.

Ambient techno with Scandinavian forest sounds was also demonstrated.

The speaker recommends that users try Stable Audio for themselves despite the current traffic overload.