* This blog post is a summary of this video.

Google Gemini and DeepMind Lead the Way to Major AI Breakthroughs

The Evolution from Google Bard to Gemini

It looks like Google will be rolling out some big updates soon, perhaps as early as next week. Google appears ready to transition from Google Bard to a new AI assistant named Gemini. Google Bard itself was running on an underlying AI model called Gemini, so in a way we are simply dropping the Bard name and referring to everything as Gemini moving forward.

There will be multiple tiers of Gemini released, including something called Gemini Advanced which seems to be equivalent to Gemini Ultra. This will likely be the most capable premium version of Gemini, meant for more complex tasks.

Capabilities of Gemini Advanced

According to an early product description, Gemini Advanced 'is far more capable at highly complex tasks like coding, logical reasoning, following nuanced instructions, and creative collaboration.' It can respond to queries in multiple languages, offers multimodal capabilities that let users upload and analyze files and data, and includes improved coding assistance features. Gemini Advanced will be a paid subscription plan focused on English-language queries.

Gemini Advanced Pricing

No official pricing details were provided, but Gemini Advanced is described as a 'paid plan' by Google. More details should emerge next week during the anticipated launch.

Gemini Pro Ranks High Against GPT-4 Models

According to recent AI benchmark results from Chatbot Arena, Gemini Pro ranks near the top, close to the leading GPT-4 models from OpenAI. Out of all the language models tested, only GPT-4 Turbo scored higher.

Gemini Pro appears to be neck-and-neck with GPT-4, which is an impressive achievement. The comparison is not quite like-for-like, however: GPT-4 does not have access to the internet, whereas Gemini Pro is an online model. This likely gives Gemini Pro an edge when responding to complex real-world queries.

Chatbot Arena Benchmarks

The benchmarks from Chatbot Arena provide one perspective on how Gemini Pro compares to models from other AI companies. However, these benchmarks focus solely on conversational ability. We still need to see broader, rigorous testing across areas like reasoning and creativity.

Demis Hassabis Interview on the Future of AI

Demis Hassabis, CEO and co-founder of DeepMind, may be interviewed on February 7th by AI podcaster Dwarkesh Patel. This coincides with the rumored release date for Gemini Advanced and a closer integration between Google's and DeepMind's AI projects.

As one of the leading figures in artificial general intelligence (AGI) research, Hassabis' perspective provides a valuable window into the future of AI. DeepMind has been responsible for major innovations like AlphaGo and AlphaFold, so this interview is likely to generate significant interest.

Background on Demis Hassabis and DeepMind

Demis Hassabis co-founded DeepMind in 2010 with the goal of building AGI that can master a wide range of intellectual tasks. DeepMind was acquired by Google in 2014 but operated fairly independently for years before recently aligning more closely with Google's own AI efforts. Hassabis himself is a former child chess prodigy with degrees in computer science and neuroscience. His diverse expertise and 'first principles' approach to AI research make him well placed to speculate on the future progress of the field.

DeepMind Progress Across Robotics and More

In addition to conversational AI, DeepMind has been advancing state-of-the-art capabilities in areas like robotics, drug discovery, games, and more. They take an ambitious, multi-disciplinary approach toward developing generally intelligent agents that can learn skills and apply knowledge flexibly.

Robotics and Language Models

DeepMind's new vision-language-action model shows promise at following verbal commands to manipulate real-world objects. This could significantly improve robots' ability to understand instructions from humans and build contextual knowledge.

Drug Discovery Applications

DeepMind is also applying AI to accelerate the drug discovery process and better predict the folding structure of proteins. They partnered with several pharmaceutical companies to test real-world medical applications.

Google's Generative AI - Image, Video, and More

While DeepMind focuses on fundamental research, Google is rapidly expanding access to powerful generative AI across text, image, and video domains. They are integrating these capabilities into developer platforms like Vertex AI to enable enterprise applications.

Vertex AI Enterprise Solutions

Google Cloud's Vertex AI platform allows companies to use generative models such as Imagen at scale while addressing concerns around copyright and content moderation.

Text-to-Image with Imagen

Imagen harnesses diffusion models to generate striking, photorealistic images from text prompts. Early benchmarks show Imagen leading in terms of image accuracy and fidelity to prompt details compared to models from other providers.

Text-to-Video Capabilities

Google's Lumiere model can produce video footage from natural language descriptions. It utilizes techniques like video prediction, inpainting, and style transfer to render high-quality results.

The Future of AI Search

Google also aims to integrate AI more deeply into search, providing auto-generated overviews in response to queries rather than just blue links. Features like the Search Generative Experience (SGE) explore how language models can summarize information and answer questions directly on the search results page.

Generative AI for Answers

Leveraging advances in generative language models, Google search now auto-populates structured information panels to help directly answer common queries with key facts, overview summaries, and more.

SGE and Perplexity Solutions

Google's Search Generative Experience (SGE) faces stiff competition from startups like Perplexity AI, which can ingest broader information from the web and concisely summarize details on virtually any topic in seconds using a chatbot-like interface.

Conclusion

The next few weeks promise major AI announcements out of Google and DeepMind, most notably the launch of the Gemini Advanced assistant and the Hassabis interview. While Google plays catch-up on conversational AI, DeepMind continues pushing boundaries in AGI research across disciplines.

Powerful generative models are reaching commercial viability, but integrating these capabilities into useful applications poses the next great challenge. As AI becomes increasingly able to synthesize data, media, code and more, it may augur shifts across industries and call for updated policies.

FAQ

Q: When will Gemini Advanced launch?
A: It looks like Gemini Advanced may launch around February 7, 2024 based on leaked details.

Q: How much will Gemini Advanced cost?
A: Pricing details for Gemini Advanced have not been officially announced yet.

Q: What is the Ultra 1.0 model?
A: Ultra 1.0 is the latest AI model from Google and DeepMind powering Gemini Advanced capabilities.

Q: How does Gemini Pro compare to GPT-4?
A: According to Chatbot Arena benchmarks, Gemini Pro is nearly equal to the top GPT-4 models, trailing only GPT-4 Turbo.

Q: Who is Demis Hassabis?
A: Demis Hassabis is the CEO and co-founder of DeepMind, leading development of advanced AI like Gemini.

Q: What breakthroughs has DeepMind made?
A: DeepMind has made progress in robotics, drug discovery, games like Go, and more using advanced AI.

Q: What generative AI does Google offer?
A: Google provides text-to-image with Imagen, text-to-video with Lumiere, and more.

Q: How does generative AI improve search?
A: Google's Search Generative Experience (SGE) and tools like Perplexity use generative AI to provide informative answers without requiring users to click through links.

Q: What's next for AI after Gemini?
A: Experts expect OpenAI will release the next version of GPT to compete with Gemini and DeepMind.

Q: When is the Demis Hassabis interview releasing?
A: Demis Hassabis may appear on a February 7th podcast aligned with the Gemini launch.