How to Create Voiceover Using Google Cloud Text to Speech

LearnWoo
28 May 2022, 03:27

TLDR: In this video, Nian Roby from LearnWoo demonstrates two methods for creating text-to-speech voiceovers with Google Cloud. The first involves setting up a Google Cloud account, enabling the Cloud Text-to-Speech API, and using an API key with the Wavenet extension for Chrome to convert text into speech. The second is simpler: install an audio capture extension, enter text in Google Cloud's interface, select a language and voice, and record the speech. Both methods are quick to use and draw on over 200 voices in more than 40 languages.

Takeaways

  • 📘 Create a Google Cloud account to access the Text-to-Speech API.
  • 🔍 Use the search bar on the Google Cloud homepage to find the Text-to-Speech API.
  • 💳 Provide credit card information for verification, but no charges will be made until after the complimentary credit is used up.
  • 🔑 Generate and restrict an API key for security reasons to prevent unauthorized use.
  • 🌐 Install the Wavenet extension from the browser's web store to facilitate the text-to-speech process.
  • 📝 Copy and paste text into the character counter to ensure it meets the API's character limit.
  • 📑 Right-click selected text to access the Wavenet extension for converting text to speech.
  • 🔄 Repeat the process for the entire article until completion.
  • 🎧 For an alternative method, install an audio capture extension to record the speech.
  • 🔉 Choose your desired language, voice, speed, and pitch settings in the Text-to-Speech interface.
  • 📱 Use the audio capture extension to record the speech after selecting the speak option.
  • 💾 Save the final audio file after the recording is complete.
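
The copy, check, and repeat steps above can also be sketched in code. The following is a minimal sketch, not the method shown in the video: it splits a long article into pieces that each stay under a size limit, breaking at sentence ends. The 5,000-byte default is an assumption about the API's per-request limit, and `split_for_tts` is a hypothetical helper name.

```python
import re


def split_for_tts(text, max_bytes=5000):
    """Split text into chunks of at most max_bytes (UTF-8), breaking at sentence ends.

    max_bytes=5000 is an assumed per-request limit; check the current quota.
    A single sentence longer than max_bytes is kept whole and would still
    need to be shortened by hand.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate  # sentence still fits in the current chunk
        else:
            if current:
                chunks.append(current)
            current = sentence  # start a new chunk with this sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be converted on its own, which mirrors repeating the process until the whole article is done.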

Q & A

  • What are the two methods presented in the video for creating a text-to-speech voiceover using Google Cloud?

    -The video presents two methods: Method one involves setting up a Google Cloud account, enabling the Cloud Text-to-Speech API, and using the Wavenet extension for Chrome to convert text to speech. Method two is simpler, using an audio capture extension to record speech generated by typing or pasting text into the Google Cloud Text-to-Speech interface.

  • How many voices and languages can one choose from using Google Cloud's Text-to-Speech tool?

    -With Google Cloud's Text-to-Speech tool, one can choose from over 200 voices in more than 40 languages.

  • What is the complimentary credit amount provided by Google Cloud when signing up for a new account?

    -Google Cloud provides a complimentary credit of $300 when signing up for a new account.

  • What is the cost of processing more than one million characters with WaveNet voices, beyond the free tier of the API?

    -The free tier covers up to one million characters with WaveNet voices. If you want to process more than that, you will have to pay $16.

  • Why is it important to restrict the API key when creating credentials in Google Cloud?

    -It is important to restrict the API key because it functions like a password and should not be shared with anyone. Sharing it could allow others to use your account resources without authorization.

  • How can one download an audio file using the Wavenet Chrome extension?

    -After pasting your API key into the Wavenet Chrome extension and checking the text in the character counter, you select the text, right-click it, choose the Wavenet option from the context menu, and select 'Download as MP3' to download the audio file.

  • What is the purpose of the audio capture extension used in the second method?

    -The audio capture extension is used to record the speech generated by the Google Cloud Text-to-Speech tool. It allows users to capture and save the audio output directly from the browser.

  • How can one change the language, voice, speed, and pitch of the speech in the Google Cloud Text-to-Speech interface?

    -In the Google Cloud Text-to-Speech interface, you can choose your desired language and voice from the provided options. Additionally, you can toggle the speed and pitch to customize the speech output according to your preferences.

  • What is the process of recording speech using the audio capture extension after generating it with Google Cloud Text-to-Speech?

    -After entering the text and starting the speech in Google Cloud Text-to-Speech, you click the audio capture extension that you installed earlier to start recording. Once the speech has finished playing, you go back to the extension and click 'finish' to save the audio file.

  • What is the prerequisite for using the first method of creating a text-to-speech voiceover using Google Cloud?

    -The prerequisite for using the first method is having a Google Cloud account. You need to sign up, verify your phone number, and provide credit card information for the final step of the account setup.

  • How does the complimentary credit of $300 work in the context of Google Cloud's Text-to-Speech API?

    -The complimentary credit of $300 allows new users to try out the Google Cloud Text-to-Speech API without incurring charges. The video also notes that the free tier covers up to one million characters with WaveNet voices before users need to consider paying for additional usage.

  • What should one do if they have doubts or questions about the methods presented in the video?

    -If viewers have doubts or questions about the methods or anything else, they are encouraged to leave a comment below the video. The presenter, Nian Roby from LearnWoo, promises to reach out and help as much as possible.

Outlines

00:00

📢 Introduction to Google Cloud Text-to-Speech

The video begins with an introduction by Nian Roby from LearnWoo, who outlines two methods for creating text-to-speech voiceovers using Google Cloud. The tool offers a wide selection of over 200 voices in more than 40 languages. The presenter emphasizes the need for a Google Cloud account and guides viewers through the sign-up process, including setting up billing information and enabling the Cloud Text-to-Speech API. The free tier of the API allows processing up to one million characters for WaveNet voices, with an option to pay for additional usage. The video then instructs viewers on how to create and restrict an API key for security purposes and concludes the first method by showing how to use the WaveNet Chrome extension to convert text into speech and download the audio files.

🎧 Method Two: Using Audio Capture Extension

The second method described in the video is a straightforward approach to creating text-to-speech voiceovers. It involves installing an audio capture extension from the browser's web store and using the Google Cloud website without signing in. The presenter demonstrates how to input text into the 'Put text to speech into action' box on the Google Cloud site, choose the desired language and voice, and adjust the speed and pitch. To record the speech, the viewer is instructed to use the audio capture extension to start recording when the text-to-speech function is initiated. Once the recording is complete, the viewer can save the audio file. The video concludes with an invitation for viewers to subscribe for more content, ask questions, or seek help in the comments section.

Keywords

💡Google Cloud

Google Cloud is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search, Google Maps, and YouTube. In the video, it is the platform where the Text-to-Speech voiceover is created, and it is central to both methods described for generating voiceovers.

💡Text-to-Speech

Text-to-Speech (TTS) is a technology that converts written text into audible speech. In the context of the video, TTS is the process by which the user creates voiceovers using Google Cloud's technology, which is the main focus of the tutorial.

💡Voiceover

A voiceover is a production technique where a voice—that is not part of the main action of a movie, TV show, or video game—is used to narrate or provide additional information. In the video, the host is teaching viewers how to create voiceovers using Google Cloud's Text-to-Speech API.

💡API Key

An API key is a unique identifier used in the context of software development to authenticate a user, developer, or calling program to an API. In the video, the API key is necessary to access and use the Google Cloud Text-to-Speech API, and it must be kept secure to prevent unauthorized use.
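
To make the role of the API key concrete, here is a hedged sketch of how a program could call the Cloud Text-to-Speech REST endpoint directly instead of going through a browser extension. This is not what the video demonstrates. The helper names `build_request` and `synthesize_to_mp3` are hypothetical, the default voice name is only an example, and the key is sent in the `X-Goog-Api-Key` header so that it stays out of the URL.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request(text, api_key, voice_name="en-US-Wavenet-D", language_code="en-US"):
    """Build (but do not send) a synthesize request authenticated with an API key."""
    body = {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Goog-Api-Key": api_key},
        method="POST",
    )


def synthesize_to_mp3(text, api_key, out_path):
    """Send the request and save the base64-encoded audio from the response as an MP3 file."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        audio_b64 = json.load(resp)["audioContent"]
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(audio_b64))
```

Calling `synthesize_to_mp3("Hello there.", api_key, "hello.mp3")` with a real, restricted key would write the audio to disk. Because anyone holding the key can make the same call, it should be read from an environment variable or secret store rather than hard-coded.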

💡Wavenet

WaveNet is a neural network-based speech synthesis model, originally developed by DeepMind, that Google Cloud uses for its more natural-sounding voices. In the video, the Wavenet extension for Chrome uses the API key to carry out the text-to-speech conversion.

💡Complimentary Credit

Complimentary credit is a promotional offer where a service provider gives users a certain amount of credit for free to use their services. In the video, Google Cloud offers a complimentary credit of $300 to new users, which can be used to process characters for the Wavenet voiceover.
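
The figures quoted in the video can be turned into a rough cost estimate. The sketch below assumes the $16 figure is a rate per one million WaveNet characters beyond the free one million, which is an interpretation of the video rather than a confirmed price, and `estimate_wavenet_cost` is a hypothetical helper. Current pricing should be checked before relying on it.

```python
FREE_WAVENET_CHARS = 1_000_000   # free allowance mentioned in the video
PRICE_PER_MILLION_USD = 16.0     # assumed rate per million characters beyond it


def estimate_wavenet_cost(total_chars):
    """Return the estimated charge in USD for total_chars WaveNet characters."""
    billable = max(0, total_chars - FREE_WAVENET_CHARS)
    return billable / 1_000_000 * PRICE_PER_MILLION_USD
```

Under these assumptions, anything up to one million characters costs nothing, and the $300 credit would absorb any charge beyond that until it runs out.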

💡Audio Capture Extension

An audio capture extension is a browser plugin that allows users to record audio from their computer. In the second method described in the video, the audio capture extension is used to record the voiceover generated by the Google Cloud Text-to-Speech API.

💡Language Selection

Language selection refers to the process of choosing the language in which the text will be converted to speech. The video mentions the ability to select from over 200 voices in more than 40 languages, showcasing the versatility of the Google Cloud Text-to-Speech service.

💡Voice Selection

Voice selection is the process of choosing the specific voice that will be used to narrate the text in a text-to-speech application. The video emphasizes that users can select from a wide range of voices, allowing for customization of the voiceover.
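
With hundreds of voices on offer, filtering them in code can be quicker than scrolling a menu. The sketch below filters a list of voice records by language and voice family. The records in `SAMPLE_VOICES` are illustrative stand-ins shaped like the entries the API's voice listing returns (a `name` plus `languageCodes`), and `pick_voices` is a hypothetical helper, not part of any Google library.

```python
# Illustrative sample data only; a real list would come from the API's voice listing.
SAMPLE_VOICES = [
    {"name": "en-US-Wavenet-D", "languageCodes": ["en-US"]},
    {"name": "en-US-Standard-B", "languageCodes": ["en-US"]},
    {"name": "fr-FR-Wavenet-A", "languageCodes": ["fr-FR"]},
]


def pick_voices(voices, language_code, name_contains="Wavenet"):
    """Return the names of voices that support language_code and match name_contains."""
    return [
        voice["name"]
        for voice in voices
        if language_code in voice.get("languageCodes", [])
        and name_contains in voice["name"]
    ]
```

For example, `pick_voices(SAMPLE_VOICES, "en-US")` keeps only the US English WaveNet entry from the sample list.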

💡Speed and Pitch

Speed and pitch refer to the rate at which the speech is delivered and the frequency of the sound waves that make up the voice, respectively. In the video, the host demonstrates how users can adjust the speed and pitch of the voiceover to suit their preferences.
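
The same two settings exist when calling the API from code, as the `speakingRate` and `pitch` fields of a request's `audioConfig`. The sketch below builds that fragment and clamps both values. The limits used here (0.25 to 4.0 for rate, -20 to 20 semitones for pitch) are assumptions recalled from the API reference and may differ in the current documentation; `make_audio_config` is a hypothetical helper.

```python
# Assumed limits; verify against the current API reference before relying on them.
MIN_RATE, MAX_RATE = 0.25, 4.0
MIN_PITCH, MAX_PITCH = -20.0, 20.0


def make_audio_config(speaking_rate=1.0, pitch=0.0):
    """Build an audioConfig dict, clamping rate and pitch to the assumed ranges."""
    rate = min(max(float(speaking_rate), MIN_RATE), MAX_RATE)
    semitones = min(max(float(pitch), MIN_PITCH), MAX_PITCH)
    return {"audioEncoding": "MP3", "speakingRate": rate, "pitch": semitones}
```

A rate of 1.0 and a pitch of 0.0 correspond to the voice's normal delivery, so small adjustments around those defaults are usually enough for a voiceover.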

💡MP3

MP3 is a widely used compressed audio file format. In the video, the host instructs viewers to download the voiceover as an MP3 file using the Wavenet extension in method one of the voiceover creation process.

Highlights

Two methods are demonstrated for creating text-to-speech voiceovers using Google Cloud.

Google Cloud offers over 200 voices in more than 40 languages.

To get started, create a Google Cloud account and enable the Cloud Text-to-Speech API.

The free version of the API allows processing up to one million characters for WaveNet voices.

For the first method, an API key is required and should be kept secure.

Install the WaveNet extension from the browser's web store and paste your API key.

Copy text into a character counter and use the WaveNet extension to download the audio.

For the second method, use the Audio Capture extension to record speech from the Google Cloud website.

Select your language, voice, and adjust speed and pitch as desired.

Record the speech using the Audio Capture extension and save the audio file.

Both methods are quick and simple to use for creating voiceovers.

The video provides a step-by-step guide for both methods.

The presenter is Nian Roby from LearnWoo.

Complimentary credit of $300 is available upon sign-up.

Restricting the API key helps prevent unauthorized use of account resources.

The video includes a demonstration of how to use the WaveNet extension.

The Audio Capture extension allows for easy recording of speech from the Google Cloud Text-to-Speech service.

The process is explained in a user-friendly manner suitable for beginners.

Subscribers can get additional help and support through comments.