[IFA 2023 Berlin] Speech Enhancement with AI - ai | coustics

World Trade Show TV / Germany
15 Sept 2023 · 04:39

TLDR

AI Coustics, a Berlin-based startup, specializes in speech enhancement using AI. They can isolate and improve speech from any audio content, making distant or poor-quality recordings sound as if captured by a high-quality microphone. Their technology works in real time and is browser-based, with applications in live TV and software/hardware integration. They envision a future where their technology is integrated into hearing aids and headphones, allowing people with hearing disabilities to focus on speech. Their AI works by analyzing voice spectrograms to differentiate human voices from other sounds, using a custom inference engine built in Rust for efficiency. The company offers an API and SDK for product integration and invites users to test their service on their website.

Takeaways

  • AI Coustics specializes in speech enhancement using AI technology.
  • They can filter and enhance speech from any audio content, improving voice clarity even from poor-quality sources.
  • The technology works both in the browser and on real-time audio, such as live TV broadcasts.
  • The company is small, with only four employees, and is looking to collaborate with manufacturers for integration.
  • They offer an API and an SDK so that other products can integrate speech enhancement features.
  • Users can test the service by uploading files on their website to hear the difference in audio quality.
  • The technology has potential applications in hearing aids and could improve the experience for those with hearing disabilities.
  • It can also be used to adjust audio levels in soundbars and TVs, making dialogue clearer without increasing overall volume.
  • AI Coustics envisions a future where every headphone has built-in speech transparency, making hearing aids less necessary.
  • Their AI works by analyzing the spectrogram to differentiate human voices from other sounds, providing a more accurate filter.
  • They have developed their own inference engine for machine learning models, optimized for speed and efficiency.
  • More information and access to the SDK can be found on their website: https://ai-coustics.com/

Q & A

  • What is the primary focus of AI Coustics?

    -AI Coustics specializes in speech enhancement with AI: isolating speech and voices within audio content and improving their quality.

  • How does AI Coustics' technology improve audio quality?

    -AI Coustics' technology isolates speech and voices from the rest of the audio and makes them sound as if they were recorded with a high-quality microphone, even when the original recording is of poor quality or the microphone is far away.

  • Is AI Coustics' technology limited to browser use?

    -No, AI Coustics' technology is not limited to browser use. It works with pre-recorded files and in real-time, making it suitable for live content and microphone input.

  • What is the size of AI Coustics as a company?

    -AI Coustics is a small company with only four employees at the time of the transcript.

  • How can other manufacturers integrate AI Coustics' technology?

    -Manufacturers can integrate AI Coustics' technology into their devices by using their API or SDK, which allows for software or hardware solutions to incorporate speech enhancement features.

  • What is the process for private individuals to use AI Coustics' service?

    -Private individuals can use AI Coustics' service by visiting their website, uploading their audio files, and experiencing the enhanced audio quality for themselves.

  • What is a potential application of AI Coustics' technology in hearing aids?

    -AI Coustics' technology could be built into hearing aids to help individuals with hearing disabilities focus on speech more clearly. Built into ordinary headphones, it could also reduce the need for expensive hearing aids.

  • How does AI Coustics' technology work with sound bars or TVs?

    -The technology can adjust the audio levels on sound bars or TVs, reducing loud music and enhancing voice levels, making it easier to understand dialogue without needing to increase overall volume.

  • What is unique about the AI used by AI Coustics?

    -The AI used by AI Coustics works on the spectrogram, identifying human voices and differentiating them from other sounds, allowing for precise filtering and enhancement of speech.

  • How does AI Coustics' inference engine for machine learning models differ from open-source solutions?

    -AI Coustics has developed its own inference engine for machine learning models, which is faster and more efficient than commonly found open-source solutions, allowing it to run on smaller devices.

  • What programming language did AI Coustics use to build their model?

    -AI Coustics built their model and inference engine in Rust, a language known for its performance and safety and well suited to systems programming.

  • How can interested parties access AI Coustics' SDK and test the model?

    -Interested parties can access AI Coustics' SDK and test the model by visiting their website at https://ai-coustics.com/.

Outlines

00:00

Speech Enhancement with AI

AI Coustics, a Berlin-based company, specializes in speech enhancement using artificial intelligence. They can process any audio content to isolate and improve the clarity of speech and voices, making them sound as if recorded with a high-quality microphone, even when the original audio is of poor quality or the microphone is distant. Their technology works both in real-time and with pre-recorded files, and can be applied to various devices and platforms, including live content and television. The company, currently consisting of four employees, offers an API or SDK for integration into other manufacturers' products and also provides a website for individual users to upload and enhance their audio files. They envision their technology being integrated into hearing aids and headphones to assist those with hearing disabilities, as well as improving the audio experience on soundbars and TVs by adjusting the balance between voice and background noise. The AI works by analyzing the spectrogram to differentiate human voices from other sounds, and the company has developed its own inference engine for machine learning models, which is faster than open-source solutions and can be used on small devices. They have built this technology in a few months using Rust, and they hope to see it integrated into headphones in the future. More information and access to their SDK can be found on their website, https://ai-coustics.com/.

Keywords

Speech Enhancement

Speech enhancement refers to the process of improving the quality of speech signals, typically by reducing noise and other interferences. In the context of the video, AI Coustics uses artificial intelligence to filter out non-speech elements from audio content, making the speech clearer and more intelligible. This is exemplified when they mention that even if a microphone is far away or the quality is poor, their technology can make it sound like a high-quality microphone is being used.

AI Coustics

AI Coustics is the name of the company featured in the video, which is based in Berlin and specializes in speech enhancement using artificial intelligence. The name itself suggests a combination of 'AI' (Artificial Intelligence) and 'acoustics,' indicating their focus on sound technology. They aim to provide solutions that can be integrated into various devices to improve audio quality.

Real-time processing

Real-time processing is the ability to process data as it is being received, without significant delay. In the video, AI Coustics highlights that their technology works not only on pre-recorded files but also in real-time, which is crucial for live applications like TV broadcasts. This capability allows for immediate enhancement of speech quality during live events.
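
As a rough illustration of what block-based real-time processing involves, the sketch below splits audio into fixed-size blocks and processes each one as it arrives. The block size, the sample rate, and the `enhance_block` function are assumptions made for this example; the video does not describe AI Coustics' actual pipeline.

```python
def block_latency_ms(block_size, sample_rate):
    """Minimum buffering delay, in milliseconds, caused by waiting for one block."""
    return 1000.0 * block_size / sample_rate


def enhance_block(block):
    """Hypothetical stand-in for a speech-enhancement model (it only halves the level)."""
    return [0.5 * sample for sample in block]


def process_stream(samples, block_size):
    """Process audio one complete block at a time, as a live pipeline would."""
    output = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        output.extend(enhance_block(block))
    return output


# Assumed values: 480-sample blocks at 48 kHz.
latency = block_latency_ms(480, 48000)
processed = process_stream([1.0] * 1000, 480)
```

Smaller blocks reduce the delay but leave less time to process each block, which is one reason inference speed matters for live use.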

API/SDK

An API (Application Programming Interface) and SDK (Software Development Kit) are tools provided by AI Coustics that allow other manufacturers to integrate their speech enhancement technology into their products. The API is a set of rules and protocols for building software applications, while the SDK includes a collection of tools and libraries that facilitate the development process. The video script mentions that customers can get the API or SDK to integrate AI Coustics' technology into their solutions.

Spectrogram

A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. In the context of the video, AI Coustics uses AI to analyze the spectrogram of audio signals, distinguishing between human voices and other sounds. This allows their system to selectively enhance the voice frequencies while reducing or eliminating background noise.
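
To make the idea of a spectrogram concrete, here is a minimal sketch that computes one with a plain short-time Fourier transform using only the Python standard library. The frame size, hop size, and test tone are assumptions chosen for illustration, and the direct DFT loop is written for clarity, not speed; it is not AI Coustics' implementation.

```python
import cmath
import math


def stft_magnitudes(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_size//2) per overlapping frame."""
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Assumed test signal: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
frame_size = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spectrogram = stft_magnitudes(tone, frame_size=frame_size, hop=32)

# The strongest bin of the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spectrogram[0])), key=lambda k: spectrogram[0][k])
peak_hz = peak_bin * sample_rate / frame_size
```

Each row of `spectrogram` is one moment in time and each column is one frequency band. A model working on this representation can decide, cell by cell, how much of the energy belongs to a voice.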

Inference Engine

An inference engine is a component of a machine learning system that applies the trained model to new data to make predictions or decisions. AI Coustics has developed their own inference engine for machine learning models, which is mentioned as being faster and more efficient than open-source solutions. This engine enables their technology to run on small devices, making it suitable for various applications.
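
In its simplest form, inference means running fixed, already-trained weights over new input. The sketch below applies a single dense layer with a sigmoid to a frame of spectrogram magnitudes and uses the result as a per-bin gain. The layer shape, the weights, and the function names are hypothetical and only illustrate the concept; the video does not describe the architecture of AI Coustics' model or engine.

```python
import math


def dense(weights, bias, inputs):
    """One fully connected layer: a matrix-vector product plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def sigmoid(value):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def infer_mask(weights, bias, magnitudes):
    """Apply the (hypothetical) trained layer to get one gain per frequency bin."""
    return [sigmoid(v) for v in dense(weights, bias, magnitudes)]


# Placeholder weights: all zeros, so every bin receives a gain of 0.5.
weights = [[0.0, 0.0], [0.0, 0.0]]
bias = [0.0, 0.0]
magnitudes = [2.0, 4.0]

mask = infer_mask(weights, bias, magnitudes)
enhanced = [m * g for m, g in zip(magnitudes, mask)]
```

A real engine runs many such layers per audio frame, so the cost of these multiply-and-add loops determines whether the model fits on a small device.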

Hearing Aids

Hearing aids are electronic devices designed to amplify sound and help individuals with hearing impairments to hear more clearly. The video suggests that AI Coustics' technology could potentially be integrated into hearing aids to help people with hearing disabilities focus on speech more effectively.

Transparency Mode

Transparency mode, in the context of the video, refers to a feature in headphones that allows the user to hear both the audio from the device and the surrounding environment. AI Coustics envisions a future where their speech enhancement technology could be used in transparency mode headphones to help users focus on speech while filtering out irrelevant sounds.

Sound Bars and TVs

Sound bars and TVs are electronic devices used for audio and video entertainment. The video discusses the potential application of AI Coustics' technology in these devices to improve the audio experience, especially in scenarios where the voice levels are low, and the background music is too loud. Their technology could adjust the audio balance to enhance speech clarity.
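
Once the voice has been separated from the rest of the audio, rebalancing is simple arithmetic. The sketch below assumes the separation has already produced two aligned signals, `voice` and `background`, and remixes them with different gains. The function name and gain values are assumptions for illustration only.

```python
def rebalance(voice, background, voice_gain=1.5, background_gain=0.5):
    """Remix separated voice and background with independent gains."""
    mixed = []
    for v, b in zip(voice, background):
        sample = voice_gain * v + background_gain * b
        # Clamp to the valid range so boosted speech cannot overflow.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


# Assumed example samples in the range [-1.0, 1.0].
mix = rebalance([0.2, -0.4, 0.8], [0.4, 0.4, 0.4])
```

Raising the voice gain while lowering the background gain makes dialogue easier to follow without turning up the overall volume.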

High Frequencies

High frequencies refer to the higher end of the audible spectrum. In the context of speech enhancement, preserving high frequencies is important because they carry essential details of speech sounds. The video explains that traditional filtering methods often cut off these frequencies, making speech less intelligible. AI Coustics' AI-based approach is designed to maintain the integrity of high frequencies in the enhanced speech.
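
The effect of a fixed filter on high frequencies can be shown with a small experiment. The sketch below uses a four-sample moving average as a stand-in for a traditional low-pass filter and measures how much of a low tone and a high tone survives. The filter, sample rate, and tone frequencies are assumptions chosen for illustration; the video does not name a specific traditional method.

```python
import math


def moving_average(samples, width=4):
    """A simple fixed low-pass filter: the mean of each run of `width` samples."""
    return [sum(samples[i:i + width]) / width
            for i in range(len(samples) - width + 1)]


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(frequency_hz, sample_rate, length):
    """Generate a sine tone as a list of samples."""
    return [math.sin(2 * math.pi * frequency_hz * t / sample_rate)
            for t in range(length)]


sample_rate = 16000
low = tone(250, sample_rate, 256)    # low-frequency content
high = tone(4000, sample_rate, 256)  # high-frequency content

# Fraction of each tone's level that remains after filtering.
low_kept = rms(moving_average(low)) / rms(low)
high_kept = rms(moving_average(high)) / rms(high)
```

The low tone passes almost unchanged while the 4 kHz tone is removed entirely, whether it was noise or part of a consonant. A fixed filter cannot tell the two apart, which is the limitation a voice-aware approach is meant to avoid.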

Rust

Rust is a systems programming language that focuses on safety, speed, and concurrency. The video mentions that AI Coustics used Rust to build their inference engine, which is a testament to the language's capabilities in creating efficient and high-performance systems, especially for real-time processing of audio signals.

Highlights

AI Coustics specializes in speech enhancement with AI.

It can isolate speech and voices in any audio content to improve sound quality.

Poor quality audio can be made to sound like it was recorded with a high-quality microphone.

The technology works in the browser and can enhance pre-recorded files or real-time audio.

AI Coustics is a small Berlin-based company with four employees.

They are in talks with manufacturers to integrate their technology into devices.

Customers can use their website to upload files and experience the enhanced audio.

The technology can filter out distracting noises, leaving only speech.

AI Coustics hopes to integrate their technology into hearing aids.

It could enable hearing aid users to focus on speech more clearly.

The technology could also be used in headphones for a transparency mode.

It can adjust audio levels in sound bars or TVs for better voice clarity.

AI Coustics envisions a future where every headphone has built-in speech transparency.

Their AI works on the spectrogram to distinguish human voices from other sounds.

They have developed their own inference engine for machine learning models.

Their technology is faster than open-source solutions and can run on small devices.

AI Coustics built their own model and inference engine in Rust.

The SDK and model can be tested on their website: https://ai-coustics.com/