AI News: Everything You Missed This Week!
TLDR
This week in AI news, Gen 3 text-to-video generation was released to Runway Pro users. ElevenLabs added iconic voices to its reader app and introduced a voice isolator feature. Suno launched a mobile app for music creation, while OpenAI and Microsoft faced a copyright lawsuit over content use. New research like Meta's 3D Gen and open-source models like InternLM 2.5 and Moshi were unveiled, hinting at AI's rapid advancement. Developments in AI chat systems, text-to-3D generation, and immersive teleoperation of robots showcase the diverse applications of AI technology.
Takeaways
- 🎉 Gen 3 access was made publicly available, but only for pro users of Runway.
- 🦅 An attempt was made to create a 4th of July themed video with Gen 3, showcasing its capabilities and limitations.
- 🔊 ElevenLabs added famous voices to its reader app, including Judy Garland and James Dean, with permission from their estates.
- 🎙️ ElevenLabs introduced a new voice isolator feature to clean up audio with background noise.
- 🎵 Suno released a mobile app similar to its web version, making music creation more accessible on mobile devices.
- 📈 Meta released research on '3D Gen', a text-to-3D generator that could impact game development and 3D video asset creation.
- 🤖 An open-source AI research lab released a new voice model, Moshi, to compete with GPT-4o's advanced voice assistant.
- 📚 An open-source large language model, InternLM 2.5, was made available on Hugging Face with a 1 million token context window.
- 🌐 The Brave browser updated to allow users to integrate their own AI models into the browser for a customized experience.
- 🔍 Perplexity introduced Pro Search with multi-step reasoning, improved math and programming capabilities, and integration with Wolfram Alpha.
- 📹 YouTube and Instagram introduced features to protect creators' likeness and voice from unauthorized AI usage, allowing content takedown requests.
Q & A
What major event in the United States affected the number of AI announcements this week?
-The 4th of July holiday in the United States led to a slowdown in major AI announcements and big releases, as many companies reduce their activities during this holiday week.
What is Gen 3 and how can it be accessed by users?
-Gen 3 is a text-to-video generator that has been made publicly available. Users who are on a pro plan for Runway can access it by visiting the 'generate videos' section and entering their prompt.
What is the limitation of Gen 3 compared to Luma AI?
-While the video rates Gen 3 as the best text-to-video generator currently available, it does not support image-to-video generation. Luma AI, on the other hand, excels in image-to-video generation.
Which famous voices were added to the ElevenLabs reader app?
-ElevenLabs added the voices of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier to its reader app, with permissions obtained from their estates.
What new feature did ElevenLabs release to improve audio quality?
-ElevenLabs released a new voice isolator feature that can clean up audio with background noise, making it sound clearer and more professional.
What is the difference between the new voice model released by Kyutai and GPT-4o's advanced voice assistant?
-The new voice model by Kyutai is open-source and currently less expressive than GPT-4o's advanced voice assistant, but it has the potential to improve as other companies build upon its technology.
What is the significance of the open-source large language model 'InternLM 2.5' released on Hugging Face?
-'InternLM 2.5' is significant because it offers a 1 million token context window, making it a powerful tool for developers and researchers, and it is available for anyone to use and build upon.
What update did the Brave browser receive regarding AI models?
-The Brave browser was updated to allow users to bring their own AI models into the browser, in addition to existing models like Mixtral, Claude, and Llama.
What is the controversy surrounding OpenAI and its use of content from the open web?
-OpenAI is facing lawsuits for alleged copyright infringement, as some claim it used content from the open web without permission or compensation, even though other organizations have licensing deals with OpenAI.
What new feature did YouTube introduce to protect the likeness of individuals?
-YouTube introduced a feature allowing individuals to request the removal of content that simulates their likeness or voice, expanding on the existing copyright claim system.
What is the rumored partnership between Apple and Google's Gemini for AI technology?
-There are rumors that Apple might partner with Google's Gemini for AI technology, potentially offering an alternative to OpenAI's technology, but this is still speculative and not confirmed.
Outlines
📹 AI in Video Generation and Updates from ElevenLabs
This week in AI, the focus is on advancements in video generation technology. Gen 3, a text-to-video generator, was made publicly available, although it's accessible only to Runway's pro users. The script discusses the capabilities of Gen 3 and compares it with Luma AI, which excels in image-to-video generation. Additionally, ElevenLabs updates its reader app with famous voices and introduces a voice isolator feature for audio cleaning. The reader app now includes voices of iconic figures like Judy Garland and Burt Reynolds, with permissions from their estates. The voice isolator is showcased in a demo that highlights its ability to remove background noise from audio, which could be beneficial for music production and other audio-related tasks.
🤖 New Developments in AI: Text-to-3D and Open Source Models
The script covers Meta's new research in text-to-3D generation, which could accelerate work in fields like game development. It also mentions the release of a new voice model by an open-source AI research lab, Kyutai, which aims to compete with GPT-4o's advanced voice assistant. This model is available for public use and is open-source, allowing others to build upon its technology. Furthermore, an open-source large language model called InternLM 2.5 is introduced, featuring a 1 million token context window, and the Brave browser's update that allows users to integrate custom AI models is discussed. Perplexity's Pro Search is also highlighted for its multi-step reasoning and integration with Wolfram Alpha for enhanced math and programming capabilities.
📚 AI and Content Rights: Lawsuits and Ethical Considerations
The script delves into the ongoing legal battles surrounding AI and content rights, with the Center for Investigative Reporting suing OpenAI and Microsoft for copyright infringement. It raises questions about the use of publicly available content for AI training and the social contract of content on the open web. Additionally, Mustafa Suleyman's views on content use and copyright are discussed, along with Cloudflare's new feature that allows website owners to prevent their content from being scraped by AI. Figma's approach to training its AI on user designs by default, and the controversy over its weather app design, are also covered.
🕶️ AI Integration in AR Glasses and Other Tech Innovations
The final paragraph touches on various AI-related updates, including YouTube's new policy on AI-generated content, Instagram's tweak to its AI labeling, and the anticipated release of Grok 2. It also speculates on a potential partnership between Apple and Google's Gemini for AI technology. WhatsApp's new feature that generates alternate versions of users' appearances is mentioned, along with the competition Meta's Ray-Ban Stories face from a new company using GPT-4o in its AR glasses. The script concludes with an exciting project called Open-TeleVision, which allows remote operation of a robot, drawing parallels to the movie Avatar.
Keywords
💡AI News
💡Gen 3
💡Runway
💡Text-to-Video Generator
💡Luma AI
💡ElevenLabs
💡Voice Isolator
💡Suno
💡3D Gen
💡Open Source
💡Brave Browser
💡Perplexity Pro
💡Grok 2
💡Open-TeleVision
Highlights
Gen 3, a text-to-video generator, was made publicly available, but only for pro users of Runway.
Comparison between Gen 3 and Luma AI for text-to-video generation, with Luma AI being rated higher for image-to-video conversion.
ElevenLabs updates its reader app with famous voices like Judy Garland and James Dean, with permission obtained from their estates.
ElevenLabs introduces a voice isolator feature to clean up audio with background noise.
Suno releases an app for iOS that mirrors the functionality of its web app for music creation.
Meta's new research, 3D Gen, allows text prompts to generate 3D assets, potentially accelerating game development and 3D video asset creation.
Kyutai, an open-source AI research lab, releases a new voice model to compete with GPT-4o's advanced voice assistant, with real-time response capabilities.
An open-source large language model, InternLM 2.5, with a 1 million token context window, is made available on Hugging Face for public use.
Brave browser updates to allow users to bring their own AI models into the browser for a customized experience.
Perplexity's Pro Search feature enhances multi-step reasoning, math, and programming capabilities with the integration of Wolfram Alpha.
Apple potentially securing a board observer seat at OpenAI, highlighting the competitive dynamics between Microsoft and Apple.
OpenAI faces another lawsuit for copyright infringement, accused of using news stories to train their models without permission or compensation.
Mustafa Suleyman's controversial statements on the fair use of content on the open web and the implications for AI training.
Cloudflare introduces a feature to prevent AI scraping of websites, offering a solution for content creators concerned about their work being used in AI training.
Figma's official statement on using user designs to train their AI models, and the subsequent controversy over their weather app's design similarities to Apple's.
YouTube's new feature allowing users to request the removal of content that simulates their likeness or voice.
Instagram's update to clarify AI usage in images, changing the metadata label from 'Made with AI' to 'AI Info'.
Rumors of a potential partnership between Apple and Google's Gemini for AI-powered chat features.
WhatsApp's leaked feature that generates cartoon or alternate versions of a user's uploaded image.
Meta's competition in the smart glasses market from a new company using GPT-4o for its AI capabilities.
Open-TeleVision, a platform that allows remote operation of a robot with immersive controls, inspired by the movie Avatar.