Why voice computers always fail
TLDR
The video script discusses the struggles of voice-based computing platforms like Microsoft's Cortana and Amazon's Alexa, which have failed to meet expectations both financially and in consumer adoption. It introduces new devices from companies like Humane and Meta, highlighting their advanced features such as generative AI, built-in cameras, and wearable design. However, the script critically examines the claim that these devices could replace smartphones, pointing out the limitations of voice interfaces for complex tasks and privacy concerns. The video suggests that voice AI is better suited as a supplementary feature rather than a primary computing interface.
Takeaways
- 📉 The voice-based computing industry has faced significant challenges, with platforms like Microsoft's Cortana shutting down and Amazon's Alexa incurring substantial losses.
- 🚫 Despite massive investments and a decade of development, tech giants have yet to monetize voice-based platforms effectively or convince consumers to use them for complex tasks.
- 🌟 New companies like Humane and Meta are emerging with claims that they will revolutionize voice computing and deliver on its long-awaited potential.
- 🔄 The new generation of voice AI includes generative AI for smarter responses, built-in cameras for computer vision interactions, and wearable designs for reduced friction in use.
- 📱 Humane's AI pin and Meta's Ray-Ban glasses are pitched as potential replacements for traditional smartphones, but the concept faces skepticism due to practical limitations.
- 🖥️ Setting up and managing complex tasks on voice-only devices is currently impractical, necessitating real screens and precise input methods.
- 🔍 Voice computing lacks privacy, is inefficient for complex data input or review, and is generally less precise than text-based inputs.
- 🚷 Public use of voice interfaces raises concerns about annoyance and privacy invasion as others may overhear personal information.
- 🛍️ Shopping, finance, and productivity apps that rely on visuals and precise inputs are not well-suited for voice-only interactions.
- 🎶 Existing voice assistants are already adept at simpler tasks like music playback and smart home control, which calls into question the need for a complete voice-based platform.
- 💡 Alternative solutions for voice computing include integrating it as an additional feature to existing devices or enhancing it with screens and precise input methods.
Q & A
What happened to Microsoft's Cortana and Amazon's Alexa in terms of their performance in the voice-based computing market?
-Microsoft's Cortana flopped and was shut down completely without a direct replacement, while Amazon's Alexa faced significant financial losses, amounting to $1 billion per year.
What was Jeff Bezos' vision for Alexa in the voice computing segment?
-Jeff Bezos personally insisted on winning the voice computing segment and turning Alexa into a major platform to rival smartphones and computers.
How has Google's platform fared in the voice assistant market?
-Google has publicly announced only smaller layoffs, but its platform has lost much of its initial momentum: neither Google nor its hardware partners have released a new device dedicated to the Google Assistant in almost three years.
What are the three major upgrades that the new generation of voice-based computers, like the Humane AI pin and Meta Ray-Bans, have over previous voice assistants?
-The new generation has real generative AI built in, making them smarter and more flexible; they have a camera for advanced computer vision to analyze what the camera sees; and they are wearable and always on, reducing the friction of interaction.
How does the AI in the Humane AI pin and Meta Ray-Bans analyze visual information?
-The AI can analyze visual information through the built-in camera, enabling interactions like estimating the sugar content of fruit for diabetics or checking the protein content in food items.
What is the main criticism against the idea of voice-based computing platforms replacing smartphones, as proposed by companies like Humane?
-The main criticism is that voice AI is not suitable as the primary interface for general computing needs, as it is not practical for many tasks such as managing emails, using social media, handling finance, and more, which require precise inputs and visual interfaces.
What are the three fundamental shortcomings of voice as an interface for computing?
-Voice interfaces raise privacy and annoyance problems in public, voice is a slow, one-way channel for exchanging information with a computer, and human speech is often too incoherent and imprecise for the exact inputs computers require.
What are the two potential solutions proposed for the problems faced by voice-first interfaces?
-One solution is to accept voice as a cool addition to existing devices rather than a replacement. The other is to add a good screen and precise input methods to voice-first devices, essentially reinventing the smartphone with voice capabilities.
What is the main argument for voice AI being integrated as an accessory rather than a primary interface?
-Voice AI is best suited as an accessory because it is not practical for the majority of computing tasks due to the nature of voice communication being less precise and more cumbersome for complex tasks compared to visual and touch interfaces.
How does the speaker suggest voice AI should be used, in the context of the Meta Ray-Bans and Microsoft's HoloLens demo?
-The speaker suggests that voice AI is well-suited as an additional capability for devices like the Meta Ray-Bans and Microsoft's HoloLens, which serve as extensions of smartphones and provide new functionalities rather than trying to replace them entirely.
What is the alternative recommendation provided by the speaker for the holiday shopping list instead of the Humane AI pin?
-The speaker recommends an iFixit set for the holiday shopping list, which allows people to fix their existing devices instead of buying new ones, reducing e-waste and giving control over their gadgets.
Outlines
📉 The Struggles of Voice Computing Platforms
This paragraph discusses the challenges faced by major tech companies in the voice computing sector. It highlights the failure of Microsoft's Cortana, which led to its complete shutdown, and Amazon's Alexa, which suffered significant financial losses. The paragraph also mentions Google's less publicized layoffs and Apple's Siri, which has seen minimal updates. Despite improvements over time, the voice assistant category has been a major disappointment, with no clear path to profitability or complex consumer interaction.
🚀 The Next Wave of Voice Computing
The paragraph introduces a new generation of voice-based computing devices, such as those from Humane and Meta, which promise significant upgrades over old voice assistants. These devices feature generative AI for smarter responses, built-in cameras for advanced computer vision, and a wearable, always-on design for constant accessibility. The speaker expresses excitement about the potential benefits these advancements could bring, especially for the elderly and people with vision impairments. However, concerns are raised about the companies' marketing strategies, which seem disconnected from practical realities.
📱 The AI Pin: Hype vs. Reality
This section critiques the AI pin by Humane, which is marketed as a revolutionary device and a potential replacement for smartphones. The paragraph points out the impracticality of using voice AI as the primary interface for complex computing tasks. It argues that setting up the device, managing privacy, and performing tasks like controlling cameras, using social media, viewing photos, handling finance, and productivity work are either impossible or highly impractical with voice commands alone. The speaker asserts that voice computing is not suitable for most smartphone needs and suggests that the AI pin is not a practical addition to one's holiday shopping list.
🛠️ Empowering Repair and Sustainability
The final paragraph shifts focus from voice computing to promoting repair and sustainability through iFixit's Black Friday and holiday deals. It emphasizes the value of repairing existing devices rather than purchasing new ones, reducing e-waste, and empowering consumers to maintain control over their gadgets. The speaker recommends iFixit's repair kits, which include high-quality tools and resources for a wide range of devices. The paragraph highlights iFixit's commitment to the right to repair and its role in providing a practical and environmentally friendly alternative to discarding broken electronics.
Keywords
💡Voice-based Computing
💡Generative AI
💡Computer Vision
💡Wearable Technology
💡Smart Home Devices
💡Privacy Concerns
💡User Interface
💡Productivity
💡Smartphones
💡iFixit
💡Right to Repair
Highlights
Voice-based computing has faced challenges, with platforms like Microsoft's Cortana shutting down and Amazon's Alexa incurring significant losses.
Despite massive investments, tech giants have struggled to monetize voice-based platforms and expand their use beyond basic tasks.
New companies like Humane and Meta are emerging with claims of revolutionizing voice computing and creating the interface of the future.
The new generation of voice AI includes generative AI, allowing for more flexible and less specific commands.
Devices now incorporate advanced computer vision to analyze and interact with the environment through built-in cameras.
Wearable voice AI devices are designed to be always on, reducing friction and enabling constant interaction.
The AI pin from Humane aims to replace smartphones, running on an Android-based OS and having its own cellular connection.
The AI pin's interface relies heavily on voice, with a very basic projector for visual interaction.
Voice AI has limitations, such as being unsuitable for tasks requiring privacy, complex inputs, or visual elements.
The idea of voice being the primary interface for general computing is considered unrealistic and impractical.
Voice is not suitable for public use due to privacy concerns and the annoyance it may cause to others.
Voice input is a slow, one-way communication method compared to the efficiency of visual interfaces.
Human speech is often incoherent and imprecise, making it a poor method for precise computer inputs.
Voice-first interfaces may work for simple tasks but are not practical for complex computing needs.
Ray-Ban glasses from Meta serve as an accessory to smartphones, offering voice capabilities without replacing the device.
Microsoft's HoloLens demo showcased voice-controlled interfaces in industrial scenarios, providing additional capabilities rather than replacing existing tools.
The concept of voice AI is promising, but its application as a replacement for smartphones is misguided.
iFixit offers Black Friday deals for repairing existing devices, promoting sustainability and self-reliance in tech repair.
iFixit provides high-quality tools and repair guides, empowering users to fix their gadgets and reduce e-waste.