AI News: Everything You Missed This Week!

Matt Wolfe
5 Jul 2024 · 19:28

TLDR

This week in AI news, Runway made its Gen 3 text-to-video generator publicly available to users on its Pro plan. ElevenLabs added iconic voices to its Reader app and introduced a Voice Isolator feature. Suno launched a mobile app for music creation, while OpenAI and Microsoft faced a new copyright lawsuit over content use. New tools like Meta's 3D Gen and open-source models like InternLM 2.5 and Kyutai's Moshi were unveiled, hinting at AI's rapid advancement. Developments in AI chat systems, text-to-3D generation, and immersive teleoperation of robots showcase the diverse applications of AI technology.

Takeaways

  • 🎉 Runway made Gen 3 publicly available, but only to users on its Pro plan.
  • 🦅 The host tried to create a 4th of July-themed video with Gen 3, showing both its capabilities and its limitations.
  • 🔊 ElevenLabs added famous voices to its Reader app, including Judy Garland and James Dean, with permission from their estates.
  • 🎙️ ElevenLabs introduced a new Voice Isolator feature that removes background noise from audio.
  • 🎵 Suno released a mobile app similar to its web version, making music creation more accessible on mobile devices.
  • 📈 Meta released research on '3D Gen', a text-to-3D generator that could speed up game development and 3D asset creation.
  • 🤖 Kyutai, an open-source AI research lab, released a new voice model, Moshi, to compete with GPT-4o's advanced voice assistant.
  • 📚 InternLM 2.5, an open-source large language model with a 1 million token context window, became available on Hugging Face.
  • 🌐 The Brave browser updated to allow users to integrate their own AI models into the browser for a customized experience.
  • 🔍 Perplexity introduced Pro Search with multi-step reasoning, improved math and programming capabilities, and integration with Wolfram Alpha.
  • 📹 YouTube and Instagram introduced features to protect creators' likeness and voice from unauthorized AI usage, allowing content takedown requests.

Q & A

  • What major event in the United States affected the number of AI announcements this week?

    -The 4th of July holiday in the United States led to a slowdown in major AI announcements and big releases, as many companies reduce their activities during this holiday week.

  • What is Gen 3 and how can it be accessed by users?

    -Gen 3 is a text-to-video generator that has been made publicly available. Users who are on a pro plan for Runway can access it by visiting the 'generate videos' section and entering their prompt.

  • What is the limitation of Gen 3 compared to Luma AI?

    -The host considers Gen 3 the best text-to-video generator currently available, but it does not support image-to-video generation. Luma AI, on the other hand, excels at image-to-video generation.

  • Which famous voices were added to the ElevenLabs Reader app?

    -ElevenLabs added the voices of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier to its Reader app, with permission obtained from their estates.

  • What new feature did ElevenLabs release to improve audio quality?

    -ElevenLabs released a new Voice Isolator feature that removes background noise from audio, making it sound clearer and more professional.

  • What is the difference between the new voice model released by Kyutai and GPT-4o's advanced voice assistant?

    -The new voice model by Kyutai is open-source and currently less expressive than GPT-4o's advanced voice assistant, but it has the potential to improve as other companies build upon its technology.

  • What is the significance of the open-source large language model 'InternLM 2.5' released on Hugging Face?

    -'InternLM 2.5' is significant because it offers a 1 million token context window, making it a powerful tool for developers and researchers, and it is available for anyone to use and build upon.

  • What update did the Brave browser receive regarding AI models?

    -The Brave browser updated to allow users to bring their own AI models into the browser, in addition to the existing models like Mixtral, Claude, and Llama.

  • What is the controversy surrounding OpenAI and its use of content from the open web?

    -OpenAI is facing lawsuits for alleged copyright infringement. Plaintiffs claim it used content from the open web without permission or compensation, even though other organizations have signed licensing deals with OpenAI.

  • What new feature did YouTube introduce to protect the likeness of individuals?

    -YouTube introduced a feature allowing individuals to request the removal of content that simulates their likeness or voice, expanding on the existing copyright claim system.

  • What is the rumored partnership between Apple and Google regarding Gemini?

    -There are rumors that Apple might partner with Google to offer Gemini as an alternative to OpenAI's technology, but this is still speculative and not confirmed.

Outlines

00:00

📹 AI in Video Generation and Updates from ElevenLabs

This week in AI, the focus is on advancements in video generation technology. Gen 3, a text-to-video generator, was made publicly available, although only to users on Runway's Pro plan. The script discusses the capabilities of Gen 3 and compares it with Luma AI, which excels in image-to-video generation. Additionally, ElevenLabs updates its Reader app with famous voices and introduces a Voice Isolator feature for audio cleaning. The Reader app now includes voices of iconic figures like Judy Garland and Burt Reynolds, with permission from their estates. The Voice Isolator is showcased in a demo that highlights its ability to remove background noise from audio, which could be beneficial for music production and other audio-related tasks.

05:00

🤖 New Developments in AI: Text-to-3D and Open Source Models

The script covers Meta's new research in text-to-3D generation, which could revolutionize fields like game development. It also mentions the release of a new voice model by Kyutai, an open-source AI research lab, which aims to compete with GPT-4o's advanced voice assistant. This model is available for public use and is open-source, allowing others to build upon its technology. Furthermore, an open-source large language model called InternLM 2.5 is introduced, featuring a 1 million token context window, and the Brave browser's update that allows users to integrate custom AI models is discussed. Perplexity's Pro Search is also highlighted for its multi-step reasoning and integration with Wolfram Alpha for enhanced math and programming capabilities.

10:01

📚 AI and Content Rights: Lawsuits and Ethical Considerations

The script delves into the ongoing legal battles surrounding AI and content rights, with the Center for Investigative Reporting suing OpenAI and Microsoft for copyright infringement. It raises questions about the use of publicly available content for AI training and the social contract of content on the open web. Additionally, Mustafa Suleyman's views on content use and copyright are discussed, along with Cloudflare's new feature that allows website owners to prevent their content from being scraped by AI. Figma's approach of training its AI on user designs by default, and the controversy over its AI-generated weather app design resembling Apple's, are also covered.

15:03

🕶️ AI Integration in AR Glasses and Other Tech Innovations

The final segment touches on various AI-related updates, including YouTube's new policy on AI-generated content, Instagram's tweak to its AI labeling, and the anticipated release of Grok 2. It also covers rumors of a potential partnership between Apple and Google to use Gemini. WhatsApp's new feature that generates alternate versions of users' appearances is mentioned, along with the competition Meta's Ray-Ban Stories face from a new company using GPT-4o in its AR glasses. The script concludes with an exciting project called Open-TeleVision, which allows immersive remote operation of a robot, drawing parallels to the movie Avatar.

Keywords

💡AI News

AI News refers to the latest developments, updates, and trends in the field of artificial intelligence. In the context of the video, it is the central theme around which all the discussed topics revolve, highlighting the significance of staying informed about advancements in AI.

💡Gen 3

Gen 3 is Runway's third-generation text-to-video model. The script notes that access to it was made publicly available this week, marking an upgrade over the previous iteration and indicating progress in AI video capabilities.

💡Runway

Runway is the platform that offers Gen 3. Access currently requires a Pro plan, a paid subscription tier that unlocks more advanced features, reflecting a business model commonly used in the tech and AI industries.

💡Text-to-Video Generator

A text-to-video generator is an AI tool that creates videos based on textual descriptions. The script mentions Gen 3's ability to generate videos with prompts, illustrating the evolving sophistication of AI in content creation and the potential for more dynamic and personalized media.

💡Luma AI

Luma AI is an example of an image-to-video generator mentioned in the script, which is considered superior to Gen 3 in this context. It represents the competitive landscape of AI tools and the ongoing development of more advanced image processing and video generation technologies.

💡ElevenLabs

ElevenLabs is highlighted in the script for updates to its Reader app, including the addition of famous voices. This signifies the integration of AI with multimedia content, enhancing user experience through personalized and iconic audio narrations.

💡Voice Isolator

Voice Isolator, introduced by ElevenLabs, is an AI-driven feature that cleans up audio by removing background noise. The script demonstrates the effectiveness of this technology with a demo, emphasizing the practical applications of AI in improving audio quality.

💡Suno

Suno is an AI music creation service that now has an iOS app for easier access on mobile devices. It represents the intersection of AI and creative arts, allowing users to generate music, reflecting the expanding horizons of AI applications.

💡3D Gen

3D Gen is a research release by Meta for generating 3D assets from text prompts. It is an example of AI's role in accelerating content creation processes, particularly in industries like gaming and 3D animation, as discussed in the script.

💡Open Source

The term 'open source' is used in the script to describe AI technologies that are publicly accessible, allowing anyone to use, modify, and build upon them. This concept is highlighted with the introduction of new voice models and large language models, emphasizing community collaboration and innovation in AI development.

💡Brave Browser

The Brave browser's update to allow 'bring your own model' functionality is mentioned, indicating a shift towards customizable AI experiences within web browsing. This feature represents the growing demand for personalization and the integration of AI into everyday digital tools.

💡Perplexity Pro

Perplexity Pro is a service that offers advanced AI search capabilities with multi-step reasoning and improved math and programming features. The script discusses its benefits, such as increased efficiency and access to Wolfram Alpha, illustrating the depth of AI integration in enhancing search and analysis.

💡Grok 2

Grok 2, teased for release in August, is an upgrade to xAI's Grok model that promises improvements in training data handling. The script suggests that it will address issues related to the reuse of data in AI training, indicating ongoing efforts to refine and ethically develop AI technologies.

💡Open-TeleVision

Open-TeleVision is a project that allows remote operation of a robot, creating an immersive experience similar to the movie 'Avatar.' The script describes a demonstration of this technology, highlighting the potential for AI to revolutionize remote operation and telepresence.

Highlights

Gen 3, a text-to-video generator, was made publicly available, but only for pro users of Runway.

Comparison between Gen 3 and Luma AI for text-to-video generation, with Luma AI being rated higher for image-to-video conversion.

ElevenLabs updates its Reader app with famous voices like Judy Garland and James Dean, with permission obtained from their estates.

ElevenLabs introduces a Voice Isolator feature that removes background noise from audio.

Suno releases an iOS app that mirrors the functionality of its web app for music creation.

Meta's new research, 3D Gen, generates 3D assets from text prompts, potentially accelerating game development and 3D asset creation.

Kyutai, an open-source AI research lab, releases a new voice model to compete with GPT-4o's advanced voice assistant, with real-time response capabilities.

InternLM 2.5, an open-source large language model with a 1 million token context window, is available on Hugging Face for public use.

Brave browser updates to allow users to bring their own AI models into the browser for a customized experience.

Perplexity's Pro Search feature enhances multi-step reasoning, math, and programming capabilities with the integration of Wolfram Alpha.

Apple potentially securing a board observer seat at OpenAI, highlighting the competitive dynamics between Microsoft and Apple.

OpenAI faces another lawsuit for copyright infringement, accused of using news stories to train their models without permission or compensation.

Mustafa Suleyman's controversial statements on the fair use of content on the open web and the implications for AI training.

Cloudflare introduces a feature to prevent AI scraping of websites, offering a solution for content creators concerned about their work being used in AI training.

Figma's official statement on using user designs to train its AI models, and the subsequent controversy over its AI-generated weather app design resembling Apple's.

YouTube's new feature allowing users to request the removal of content that simulates their likeness or voice.

Instagram's update to clarify AI usage in images, changing the metadata label from 'Made with AI' to 'AI Info'.

Rumors of a potential partnership between Apple and Google's Gemini for AI-powered chat features.

WhatsApp's leaked feature that generates cartoon or alternate versions of a user's uploaded image.

Meta's competition in the smart glasses market from a new company using GPT-4o for its AI capabilities.

Open-TeleVision, a platform that allows remote operation of a robot with immersive controls, reminiscent of the movie Avatar.