How to Use Generative Audio | Runway Academy

Runway
8 May 2024 | 03:07

TLDR: In this Runway Academy tutorial, we explore the generative audio tool, which supports text-to-speech conversion, custom voice model training with clean audio, and lip-sync video creation. The process includes generating audio from text, saving it in the assets folder, and training custom voice models for unique outputs. We also learn to sync audio with images or videos, with tips for making the reversing effect less noticeable when the audio is longer than the video.

Takeaways

  • 🎙️ Use the generative audio tool in Runway to convert text into spoken audio.
  • 🔍 Preview and select from a list of default voices to generate audio.
  • ⏱ Generation time varies based on script length but is generally quick.
  • 📂 Audio files are automatically saved in the generative audio folder within assets.
  • 📁 Custom save locations can be chosen via a drop-down menu.
  • 🎧 Train a custom voice model with a few minutes of clean audio.
  • 📝 Ensure the audio for custom voice models is as clear as possible.
  • 🖼️ Create lip-sync videos using an image or video with a full viewable face.
  • 🔄 Lip-sync can accommodate generated, recorded, or uploaded audio.
  • 🎥 Convert images to video using Gen 2 for video-based lip-sync.
  • 🔁 If audio is longer than video, the video reverses at its end and plays back to the beginning for the duration of the audio.
  • 🎨 Use motion brush for subject motion to minimize the reversing effect in videos.
  • 💡 Join the Runway community on Discord for more resources and assistance.

Q & A

  • What is the main topic of the Runway Academy video?

    -The main topic of the video is generative audio, which includes text to speech, custom voice models, and creating lip sync videos in Runway.

  • How do you access the generative audio tool in Runway?

    -You access the generative audio tool by clicking it at the top of the Runway dashboard.

  • What is the first step after typing in the text for the generative audio tool?

    -The first step is to preview the available voices and choose one from the default voice list.

  • What is the name of the default voice used in the provided example?

    -The default voice provided in the example is named James.

  • How long does it usually take for the audio generation to complete?

    -The generation times depend on the total script length, but they usually go pretty quickly.

  • Where are the audio generations saved by default in Runway?

    -By default, audio generations are saved to the generative audio folder inside the main assets folder in Runway.

  • What is required to train a custom voice model in Runway?

    -To train a custom voice model, you need a few minutes of clean audio which can be imported or recorded within the generative audio tool.

  • What should be ensured while recording the audio for a custom voice model?

    -The audio should be as clean as possible to ensure the best results for the custom voice model.

  • What is needed to create a lip sync video in Runway?

    -To create a lip sync video, you need an image or video of a person with their full face viewable within the frame.

  • How can you add new text to speech for a lip sync video?

    -You can add new text to speech by typing the text, choosing your voice, and clicking on the generate button.

  • What happens if the audio is longer than the video in a lip sync video?

    -If the audio is longer than the video, once the video reaches its end, it will reverse and go back to the beginning for the duration of the audio.

  • What is a pro tip for using the video workflow in Runway to avoid a noticeable reversing effect?

    -A pro tip is to avoid using camera motion parameters and just add subject motion with the motion brush to make the reversing effect less noticeable.

  • How can viewers find more helpful resources and join the community for Runway?

    -Viewers can join the community on Discord for more information and experimentation using Runway, or find specific answers using the button on the dashboard at any time.

Outlines

00:00

🎙️ Introduction to Generative Audio

This paragraph introduces the topic of the video, which is generative audio in Runway. It covers text-to-speech, custom voice models, and creating lip-sync videos. The speaker explains how to access the generative audio tool from the dashboard, input text, and select a voice from the default list. The process of generating audio from the script is described, including the automatic saving of audio files to the generative audio folder inside the assets folder and the option to save them elsewhere through a drop-down menu. The paragraph also mentions the possibility of training a custom voice model using a few minutes of clean audio.

🔊 Custom Voice Model Training

The second paragraph delves into the process of training a custom voice model within the generative audio tool. It details the requirement of having a few minutes of clean audio, which can be imported or recorded directly in Runway. The speaker suggests reading from the provided script or using one's own, emphasizing the importance of audio clarity. Once the audio is ready, the user is instructed to name the voice model, and it will be quickly ready for use with text-to-speech functionality.
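
Before importing a recording for voice training, it can help to confirm that it really is "a few minutes" long. The sketch below is not part of Runway; it is a local pre-check using Python's standard `wave` module on an uncompressed WAV file. The function names and the 120-second threshold are assumptions made for illustration, since the video does not state an exact minimum, and the check says nothing about how clean the audio is.

```python
import wave

# Hypothetical threshold: the video only says "a few minutes" of clean audio.
MIN_SECONDS = 120.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """Check whether a recording meets the assumed minimum length."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, `is_long_enough("my_voice.wav")` (a hypothetical file name) returns `False` for a 30-second take, which suggests recording more material before training.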

🎥 Creating Lip-sync Videos

This paragraph explains how to create lip-sync videos in the generative audio tool. Lip sync requires an image or video of a person whose full face is visible within the frame. The user can upload their own media or choose from preset characters. The audio can come from text-to-speech, a recording, or an uploaded file; for text-to-speech, the user types the text, chooses a voice, and clicks the generate button. For the video workflow, an image can first be turned into a video with Gen 2 and then lip-synced in the generative audio tool. If the audio is longer than the video, the video reverses at its end and plays back to the beginning for the duration of the audio; avoiding camera motion parameters and adding only subject motion with the motion brush makes this reversing effect less noticeable.
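
The forward-then-reverse playback that occurs when the audio is longer than the video can be pictured as a "ping-pong" mapping from output frames to source frames. The sketch below is only an illustration of that idea; the video does not describe how Runway implements it, and the function names, the frame-rate parameter, and the choice not to repeat the first and last frames at each turn are all assumptions.

```python
import math
from typing import List


def pingpong_frame(t: int, n_frames: int) -> int:
    """Map output frame t to a source frame for forward-then-reverse playback."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    if n_frames == 1:
        return 0
    # One full cycle runs forward to the last frame and back to the first,
    # without repeating the end frames at the turnaround (an assumption).
    period = 2 * (n_frames - 1)
    pos = t % period
    return pos if pos < n_frames else period - pos


def frames_for_audio(audio_seconds: float, video_seconds: float, fps: int) -> List[int]:
    """List the source frame shown at each output frame for the audio's duration."""
    n_frames = max(1, round(video_seconds * fps))
    total = math.ceil(audio_seconds * fps)
    return [pingpong_frame(t, n_frames) for t in range(total)]
```

With a 4-frame clip, the first eight output frames map to source frames 0, 1, 2, 3, 2, 1, 0, 1. This also shows why camera motion makes the effect obvious: a pan or zoom visibly runs backwards on the return pass, while small subject motion reads more like natural back-and-forth movement.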

📚 Conclusion and Additional Resources

The final paragraph wraps up the video with a conclusion, thanking viewers for their time and encouraging them to engage with the community on Discord for more information and experimentation with Runway. It also mentions the availability of a button on the dashboard for finding specific answers to questions. The speaker reiterates the invitation to get started with the work at hand and ends the video with a warm appreciation for the viewers' attention.

Keywords

💡Generative Audio

Generative audio refers to the process of creating new audio content using artificial intelligence. In the context of the video, it encompasses text-to-speech conversion, custom voice models, and lip-sync videos. The main theme revolves around using Runway's generative audio tool to produce spoken audio from text, which is a significant aspect of the video's educational content.

💡Text to Speech

Text to speech (TTS) is a technology that converts written text into audible speech. The video script explains how to use Runway's generative audio tool to transform typed text into spoken audio files, choosing from a list of default voices, which is a core feature demonstrated in the tutorial.

💡Custom Voice Models

A custom voice model in the video refers to the creation of a unique voice profile using a few minutes of clean audio. This process allows users to train the tool with their own voice, which can then be used for text-to-speech purposes. The script illustrates this by guiding users through the process of recording and naming their voice model.

💡Lip Sync

Lip sync is the process of matching an audio track, especially speech, with the movements of the lips in a video. The video describes how to create a lip-sync video using an image or video of a person, ensuring that the generated or uploaded audio aligns with the character's lip movements for a realistic effect.

💡Runway Dashboard

The Runway dashboard is the central interface for accessing and managing the tools and features within the Runway platform. In the script, it is mentioned as the starting point for using the generative audio tool, indicating its importance in navigating the platform's functionalities.

💡Generative Audio Tool

The generative audio tool is a specific feature within Runway that facilitates the creation of audio content. The video script details its use for text-to-speech conversion, training custom voice models, and creating lip-sync videos, highlighting its versatility in audio generation.

💡Audio Generation

Audio generation in the video refers to the output produced by the generative audio tool, which includes the spoken audio files created from text input. The script mentions that these generations are automatically saved in a specific folder within the user's assets, showcasing the tool's convenience.

💡Clean Audio

Clean audio is a term used to describe high-quality audio recordings without background noise or distortion. The video emphasizes the importance of clean audio when training a custom voice model, as it ensures the accuracy and clarity of the generated voice.

💡Preset Characters

Preset characters are pre-designed characters available within the tool for users to utilize in their projects. In the context of lip-sync videos, the script suggests using these characters to demonstrate the lip-sync feature without requiring users to upload their own media.

💡Gen 2

Gen 2 is Runway's video generation model, used in the tutorial to convert an image into a video. This is the first step of the video-based lip-sync workflow: a still image is turned into a video with Gen 2, and lip sync is then added to that video in the generative audio tool.

💡Motion Brush

The motion brush is a tool within Runway that lets users add subject motion to their videos. The script's pro tip is to skip the camera motion parameters and add only subject motion with the motion brush when creating a video for lip sync, which makes the reversing effect less noticeable.

💡Discord Community

The Discord community mentioned in the video is a platform where users can join for more information, experimentation, and support related to using Runway. It serves as a social hub for users to interact, share experiences, and seek help regarding the generative audio tool and other features.

Highlights

Introduction to generative audio in Runway Academy.

Generative audio includes text to speech, custom voice models, and creating lip sync videos.

Accessing the generative audio tool from the Runway dashboard.

Type in text to convert it into a spoken audio file.

Preview and select a voice from the default voice list.

Generation times vary based on script length but are usually quick.

Audio generations are automatically saved to the generative audio folder.

Option to save audio to a different location through the drop-down menu.

Training a custom voice model with clean audio.

Recording or importing audio for custom voice model training.

Naming the voice model and its readiness for use with text to speech.

Creating a lip sync video with an image or video of a person.

Ensuring the full face is viewable within the frame for lip sync.

Uploading custom media or using preset characters for lip sync.

Using lip sync with generated, recorded, or uploaded audio.

Adding text to speech and generating audio for lip sync.

Turning an image into a video using Gen 2 for adding lip sync.

Handling audio longer than video duration with a reversing effect.

Pro tip for avoiding camera motion parameters in the video workflow.

Using motion brush for subject motion to reduce reversing effect visibility.

Invitation to join the Runway community on Discord for more resources.

Using the dashboard button for finding specific answers to questions.