AI Frontiers: Jesper Hvirring Henriksen (OpenAI DevDay)

OpenAI
15 Nov 2023 · 09:25

TLDR

Jesper Hvirring Henriksen from Be My Eyes introduces Be My AI, a new feature built on GPT-4V in partnership with OpenAI. The AI assistant gives blind and low-vision users 24/7 visual assistance, enhancing their independence. By describing images and making digital content navigable, GPT-4 Vision has opened up a new level of accessibility, and user feedback has been overwhelmingly positive. The technology's impact on assistive technologies is projected to be profound as it continues to improve accessibility for all.

Takeaways

  • 🌟 Be My Eyes, in partnership with OpenAI, has launched a new feature called Be My AI, built on GPT-4V.
  • 👥 The app aims to provide visual assistance to blind and low-vision individuals through a community of over 7 million volunteers.
  • 📱 Users can request help through video calls, allowing volunteers to lend their eyes to assist with tasks and navigation.
  • 🚀 The introduction of Be My AI offers users an alternative to human assistance, providing an AI assistant available 24/7.
  • 💡 GPT-4 Vision enables computers to 'see' and describe the physical and digital world, enhancing accessibility for visually impaired users.
  • 🌐 The AI can provide detailed descriptions of images, including those in group chats, websites, and other online platforms.
  • 📈 An example showcased was the AI's ability to describe a graph from a US government website, which was previously inaccessible to screen reader users.
  • 🎥 Beta testers like Caroline and Lucy have given positive feedback, reporting significant gains in independence and quality of life.
  • 📊 Since the beta launch, the service has provided around a million image descriptions per month, with a 95% satisfaction rate.
  • 🗣️ The AI supports 36 languages, responding in the user's own language, and has been integrated into enterprise customer support products such as Microsoft's Disability Answer Desk.
  • 🌈 AI models that can see and understand human speech are expected to profoundly improve accessibility and assistive technologies in the future.

Q & A

  • What is Be My Eyes and what does it aim to provide for blind and low-vision individuals?

    -Be My Eyes is an application that provides visual assistance to blind and low-vision individuals through a community of volunteers. It allows these individuals to request help via video calls, giving them access to sighted volunteers who can assist them in real-time.

  • How many users and volunteers does Be My Eyes have currently?

    -Be My Eyes currently has over half a million blind and low-vision users supported by more than 7 million volunteers.

  • What was the motivation behind developing the Be My AI feature on GPT-4V?

    -The Be My AI feature was developed to provide users with an alternative to human assistance. It aims to offer blind and low-vision individuals the ability to maintain their independence and receive assistance from an AI assistant that is available 24/7.

  • What are some of the real-world applications of GPT-4 Vision?

    -GPT-4 Vision has a wide range of real-world applications. It can be used for utility-based tasks, such as describing objects or assisting with navigation. It can also enhance digital accessibility by providing meaningful alt text for images on websites and apps, making media and photos understandable to those who can't see them.

  • How does Be My AI assist users with images they encounter online or in apps?

    -Be My AI provides thorough descriptions of any image users encounter online or in apps, including photos in group chats, images on websites, and other digital media that typically lack accessible alt text. A rough sketch of how such a request might look is shown below.
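
Be My Eyes has not published its implementation, but the workflow described above maps naturally onto OpenAI's Chat Completions API. The following is a minimal sketch, assuming the openai Python SDK and the gpt-4-vision-preview model that was current at the time; the system prompt wording and the describe_image helper are illustrative assumptions, not Be My Eyes' actual code.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical system prompt; Be My Eyes has not published its actual prompts.
SYSTEM_PROMPT = (
    "You are a visual assistant for blind and low-vision users. "
    "Describe images thoroughly, including any text, colors, people, and layout."
)

def describe_image(image_url: str, question: str = "Describe this image in detail.") -> str:
    """Ask a vision-capable GPT-4 model for a thorough image description."""
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # the vision model available in late 2023
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        max_tokens=500,
    )
    return response.choices[0].message.content

print(describe_image("https://example.com/photo.jpg"))
```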

  • What was the reaction of one of the beta testers, Caroline, to the Be My AI feature?

    -Caroline, a long-time Be My Eyes user who previously made about two volunteer calls a year, became a Be My AI beta tester and has since used the feature to describe more than 700 images. This dramatic increase in engagement suggests a very positive reception of the feature.

  • How does Lucy Edwards incorporate Be My AI into her daily life?

    -Lucy Edwards, another beta tester, uses Be My AI for various daily tasks such as checking eggs for shells, reading expiry dates on products, sorting laundry, identifying food on her plate, and even describing images on social media. This shows how Be My AI can assist with both practical tasks and enhance social inclusion.

  • What was the outcome of deploying Be My AI to a small beta group and later to all iOS users?

    -After deploying Be My AI to a small beta group and later to all iOS users, the service has handled about a million image descriptions per month. User satisfaction is extremely high, with a 95% approval rate when downtime and system errors are excluded.

  • How many languages does Be My AI currently support and how was this achieved?

    -Be My AI now supports 36 languages. This was achieved simply by instructing the model to respond in the same language the user writes in, as sketched below, demonstrating the flexibility and adaptability of the technology.
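
The talk presents this as a prompt-level change rather than per-language engineering. As a hedged illustration (the exact wording Be My Eyes uses is not public), the instruction amounts to one extra line in the system prompt from the sketch above:

```python
# Hypothetical wording; the talk only says the model was instructed to
# answer in the user's language.
SYSTEM_PROMPT = (
    "You are a visual assistant for blind and low-vision users. "
    "Describe images thoroughly, including any text, colors, people, and layout. "
    # The single instruction that unlocks multilingual support:
    "Always respond in the same language the user writes in."
)
```

Because GPT-4 is natively multilingual, this one-line instruction suffices; no translation layer or per-language model is required.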

  • How has Be My AI been integrated into enterprise customer support?

    -Be My AI has been deployed in Be My Eyes' enterprise customer support product; Microsoft's Disability Answer Desk is one example. Users can now choose to start with a chatbot instead of a call, and in roughly nine out of ten cases the chatbot resolves their issue without the need for a call.

  • What is the long-term vision for AI models that can see and understand human language?

    -The long-term vision is that AI models capable of seeing and understanding human language will profoundly improve accessibility and assistive technologies in the future. They will not only enhance the lives of those with visual impairments but also contribute to making digital interactions more human-like and intuitive for everyone.

Outlines

00:00

🌟 Introduction to Be My AI and Its Impact

This paragraph introduces Jesper from Be My Eyes, a platform that connects blind and low-vision individuals with sighted volunteers for visual assistance through video calls. The introduction of Be My AI, built on GPT-4V in partnership with OpenAI, marks a significant advancement. With over half a million users and more than seven million volunteers, the service aims to provide independence and reduce the burden on others. The new AI feature offers a 24/7 visual assistant, addressing user feedback about scenarios where human interaction might not be desired or possible. GPT-4 Vision allows computers to 'see' and has a wide range of applications, from practical tasks to digital accessibility, transforming the way blind and low-vision users interact with the world around them.

05:01

📸 Real-Life Applications and User Experiences with Be My AI

This paragraph delves into the practical applications of Be My AI, showcasing its utility in everyday life. Users can now receive detailed descriptions of images they encounter online or in apps, improving their understanding and participation in social interactions. The narrative follows Lucy Edwards, a beta tester, as she demonstrates the use of Be My AI in her daily routine, from cooking and laundry to reading product labels and managing her social media. The feature's success is evident in the overwhelmingly positive feedback and high satisfaction ratings. The integration of GPT-4 Vision into Be My Eyes has also expanded language support and been implemented in enterprise customer support products, significantly enhancing accessibility and user experience.

Keywords

💡Be My Eyes

Be My Eyes is an innovative mobile application designed to assist blind and low-vision individuals by connecting them with sighted volunteers through video calls. The app has over half a million users and more than 7 million volunteers worldwide, facilitating a community-driven support system. In the context of the video, the introduction of Be My AI on GPT-4V represents a significant advancement, offering users an AI-powered alternative to human assistance for visual tasks, thereby enhancing independence and accessibility.

💡GPT-4V

GPT-4V (GPT-4 with Vision) is the version of OpenAI's GPT-4 language model that accepts images as input in addition to text. In the context of the video, GPT-4V is integrated with Be My Eyes to create Be My AI, an AI assistant capable of visual perception and description. This integration marks a significant leap in technology, enabling computers to 'see' and understand visual content, thus expanding the possibilities for assistive technologies and improving accessibility for users with visual impairments.

💡Independence

Independence, in the context of the video, refers to the ability of blind and low-vision individuals to perform tasks and navigate the world without relying on others. The introduction of Be My AI aims to empower these users by providing an AI assistant that can assist with visual tasks at any time, thus reducing the need for human intervention and enhancing the users' sense of self-reliance and autonomy.

💡Visual Assistance

Visual assistance refers to the support provided to individuals with visual impairments to help them understand and interact with the visual world. In the video, this is achieved through the Be My Eyes app, which connects blind and low-vision users with sighted volunteers for real-time visual assistance via video calls. The new feature, Be My AI, extends this concept by offering AI-powered visual descriptions, further enhancing the accessibility of visual information.

💡Alt Text

Alt text, or alternative text, is a descriptive text that provides information about an image for those who cannot see it. It is a crucial accessibility feature for websites and apps, ensuring that visual content is understandable for users with visual impairments. In the video, the speaker mentions that many images lack meaningful alt text, making them inaccessible. Be My AI aims to solve this issue by providing thorough descriptions of any image encountered online or in an app, thus improving digital accessibility.

💡Navigation

Navigation, in the context of the video, refers to the process of moving through physical or digital spaces, which can be challenging for individuals with visual impairments. The Be My Eyes app and its AI-powered feature assist with navigation by providing descriptions of environments, landscapes, and objects, aiding users in understanding their surroundings and making informed decisions about their movements.

💡Digital Accessibility

Digital accessibility is the design and development of digital products, such as websites, apps, and software, in a way that is accessible to all users, including those with disabilities. In the video, the speaker discusses the limitations of current digital accessibility, particularly the lack of meaningful alt text for images. The introduction of Be My AI with GPT-4V addresses this issue by providing detailed image descriptions, thus enhancing the accessibility of digital content for users with visual impairments.

💡Beta Tester

A beta tester is an individual who uses a product, typically software or an app, before its official release to identify and report any issues or bugs. In the context of the video, beta testers for Be My AI are users who have tried the AI-powered visual assistance feature before its full launch. Their feedback is invaluable in refining the product and ensuring it meets the needs of the target community.

💡Enterprise Customer Support

Enterprise customer support refers to the assistance provided to businesses or organizations that use a particular product or service. In the video, the integration of Be My AI into Microsoft's Disability Answer Desk is an example, where users can opt for an AI-powered chatbot instead of making a call. This approach has proven highly effective, with the majority of users not needing to escalate to a phone call.

💡Language Support

Language support in the context of software or applications refers to the ability of the product to function in multiple languages, catering to a diverse user base. In the video, Be My AI's capability to respond in up to 36 languages demonstrates the app's commitment to inclusivity and accessibility for users worldwide, regardless of their linguistic background.

💡Assistive Technologies

Assistive technologies are devices, equipment, or software that help individuals with disabilities to perform tasks that might be difficult or impossible for them to do on their own. In the video, Be My AI is an example of an assistive technology that leverages AI to improve accessibility for blind and low-vision users by providing visual descriptions and enhancing their interaction with the digital world.

Highlights

Jesper Hvirring Henriksen introduces Be My Eyes and its new feature, Be My AI, in partnership with OpenAI.

Be My Eyes is an app that provides visual assistance to blind and low-vision individuals through a community of volunteers.

The app has over half a million blind and low-vision users supported by more than 7 million volunteers.

Volunteers assist users through video calls, but the new AI feature offers an alternative for users seeking independence.

GPT-4V enables computers to 'see' and assist with tasks that were previously challenging for visually impaired users.

The AI can describe images, including those on websites and in apps, making content more accessible.

GPT-4V provides detailed and accurate descriptions, enhancing the experience for visually impaired users.

The AI's responses are not only accurate but also exhibit a level of wit and human-like interaction.

Beta tester Caroline increased her usage from two calls a year to over 700 image descriptions with Be My AI.

Lucy Edwards, another beta tester, demonstrates how Be My AI integrates into daily life in a video.

Be My AI helps users with tasks like checking eggs for shells, reading expiry dates, and identifying laundry colors.

The AI can also assist with meal identification, whether dining in or at a buffet.

Users can receive descriptions of perfume scents from images, enhancing their understanding of products.

Social media photo descriptions are provided by Be My AI, improving engagement for visually impaired users.

Feedback from beta testers has been overwhelmingly positive, with high satisfaction ratings reported.

The feature was rolled out to a small beta group in March and later to all iOS users, resulting in about a million image descriptions monthly.

Language support has been expanded to 36 languages by instructing the model to respond in the user's language.

Be My AI has been integrated into enterprise customer support products, such as Microsoft's Disability Answer Desk.

Nine out of ten users who start with the chatbot do not escalate to a call, showcasing the AI's effectiveness in resolving issues.

The development and integration of AI models that can see and hear are expected to greatly improve accessibility and assistive technologies in the future.