Hello GPT-4o mini & Mistral NeMo!!!
TL;DR: OpenAI introduces GPT-4o mini, a cost-effective multimodal model scoring 82% on MMLU and outperforming GPT-4 on chat preferences in the LMSYS arena. Priced at 15 cents per million input tokens and 60 cents per million output tokens, it's set to replace GPT-3.5 as the go-to model for developers. Additionally, Mistral and Nvidia collaborate on a 12-billion-parameter model, NeMo, with strong multilingual capabilities and a novel tokenizer, promising efficiency gains for AI applications across many languages.
Takeaways
- 😀 OpenAI has released a new model called GPT-4o mini, part of the GPT-4o family, designed to be a cost-effective multimodal system.
- 💬 GPT-4o mini is said to outperform GPT-4 on chat preferences in the LMSYS arena and scores 82% on MMLU, although the speaker expresses some skepticism about the MMLU metric.
- 💰 The model is priced at 15 cents per million input tokens and 60 cents per million output tokens, significantly cheaper than previous OpenAI models and competitors like Gemini 1.5 Flash.
- 🔍 GPT-4o mini supports a broad range of tasks with low cost and latency, making it well suited for running agents and chaining multiple model calls.
- 🌐 The model has a context window of 128,000 tokens and currently supports text and vision, with support for image, video, and audio inputs and outputs planned for the future.
- 📚 GPT-4o mini has a knowledge cutoff of October 2023 and features an improved tokenizer that handles non-English text more efficiently.
- 🔍 The model has been evaluated across key benchmarks, including reasoning tasks (MMLU), math, and coding, outperforming other small models in these areas.
- 💡 GPT-4o mini is expected to replace GPT-3.5 Turbo in many developers' codebases, as it offers a more cost-effective solution.
- 🌐 A new model called Mistral NeMo has been released in collaboration with Nvidia: a 12-billion-parameter model with strong multilingual capabilities.
- 🔢 Mistral NeMo uses a new tokenizer called Tekken, which is claimed to be more efficient at compressing natural language and source code, especially for languages like Chinese, Italian, French, German, Spanish, Russian, Korean, and Arabic.
Q & A
What is the name of the new model released by OpenAI?
-The new model released by OpenAI is called GPT-4o mini.
What is the name of the new model released by OpenAI? How does GPT-4o mini compare to previous models in terms of cost?
-GPT-4o mini is significantly cheaper than previous models, priced at 15 cents per million input tokens and 60 cents per million output tokens, a 99% reduction in price since text-davinci-003.
What are the key features of GPT-4o mini?
-GPT-4o mini is a multimodal model that supports text and vision, has a 128,000-token context window, and features an improved tokenizer that handles non-English text more efficiently.
How does GPT-4o mini perform in benchmarks?
-GPT-4o mini scores 82% on MMLU and outperforms GPT-4 on chat preferences in the LMSYS arena, excelling in reasoning tasks, math, and coding.
What is the significance of the 128,000-token context window in GPT-4o mini?
-The 128,000-token context window allows the model to handle large volumes of context, which is beneficial for tasks that require chaining prompts and multiple model calls.
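The chaining pattern described above can be sketched with a stubbed model call. A minimal sketch, assuming a placeholder `call_model` function rather than a real API client:

```python
# Sketch of chaining multiple model calls, where each step's output
# feeds the next prompt. `call_model` is a hypothetical stand-in for
# a real API client; here it just wraps its input for illustration.

def call_model(prompt: str) -> str:
    # Placeholder: a real implementation would send `prompt` to a
    # model endpoint and return the completion text.
    return f"summary({prompt})"

def chain_prompts(document: str) -> str:
    # Step 1: summarize the raw document.
    summary = call_model(document)
    # Step 2: feed the first output into a follow-up prompt.
    answer = call_model(f"Extract action items from: {summary}")
    return answer

result = chain_prompts("quarterly report text")
print(result)
```

With a cheap, low-latency model, running two or three such sequential calls per user request stays affordable, which is the point the answer above is making.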
What is the relationship between GPT-4o mini and Gemini 1.5 Flash in terms of pricing?
-GPT-4o mini is less than half the price of Gemini 1.5 Flash for 1 million tokens with a 128,000-token context window, making it the more cost-effective option.
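This comparison can be checked with a quick back-of-the-envelope calculation. The prices below are the per-million-token figures quoted in this summary; verify them against the providers' current pricing pages before relying on them:

```python
# Per-million-token prices in USD, as quoted in the summary above.
PRICES = {
    "gpt-4o-mini":      {"input": 0.15, "output": 0.60},
    "gemini-1.5-flash": {"input": 0.35, "output": 1.00},
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    # Total cost = tokens consumed in each direction times the
    # per-million rate for that direction.
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example workload: 10M input tokens and 2M output tokens per month.
mini = cost("gpt-4o-mini", 10_000_000, 2_000_000)
flash = cost("gemini-1.5-flash", 10_000_000, 2_000_000)
print(f"gpt-4o-mini: ${mini:.2f}/mo, gemini-1.5-flash: ${flash:.2f}/mo")
```

For this example workload the quoted rates work out to $2.70 versus $5.50 per month, consistent with the "less than half the price" claim.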
What is the new model released in collaboration with Mistral and Nvidia?
-The new model released in collaboration with Mistral and Nvidia is called Mistral NeMo.
How does Mistral NeMo differ from GPT-4o mini in terms of capabilities?
-Mistral NeMo is a 12-billion-parameter model with strong multilingual capabilities, supporting a wide range of languages, and uses a new tokenizer called Tekken, which is more efficient at compressing natural language and source code.
What is special about the tokenizer used in Mistral NeMo?
-The Tekken tokenizer used in Mistral NeMo is based on tiktoken and is reported to be about 30% more efficient at compressing source code and certain languages, and two to three times more efficient for Korean and Arabic, compared to the Llama 3 tokenizer.
How does the release of Mistral NeMo impact the multilingual model landscape?
-The release of Mistral NeMo introduces a powerful multilingual model that can efficiently handle a wide range of languages, including those not usually covered, making it a significant addition to the multilingual model landscape.
What is the licensing situation for the two models released, GPT-4o mini and Mistral NeMo?
-Mistral NeMo is released under an Apache 2.0 license, while GPT-4o mini is a proprietary model that will likely not be open-sourced in the foreseeable future.
Outlines
🚀 Launch of GPT-4o mini: Affordable Multimodal AI Model
OpenAI has unveiled a new AI model, GPT-4o mini, which is set to become the go-to choice for developers due to its cost-effectiveness. The model is part of the GPT-4o family, a multimodal system that integrates language and vision capabilities, unlike previous models that required separate systems. GPT-4o mini is likened to an 'iPhone Mini': it posts an impressive 82% on MMLU and outperforms GPT-4 on chat preferences in the LMSYS arena, despite the speaker's skepticism towards MMLU scores. Priced at 15 cents per million input tokens and 60 cents per million output tokens, it is significantly cheaper than Google's Gemini 1.5 Flash and marks a 99% reduction in price since text-davinci-003. The model supports a broad range of tasks with low cost and latency, making it well suited for running agents, chaining prompts, and handling large volumes of context with its 128,000-token context window. It also boasts an improved tokenizer for non-English text. The model's knowledge cutoff is October 2023, and it excels in reasoning tasks, math, and coding, positioning it to replace GPT-3.5 Turbo in many developers' codebases.
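For developers, switching to the new model mostly amounts to changing the model name in an API request. A minimal sketch of a Chat Completions-style request body for gpt-4o-mini, built as plain JSON here rather than with the official SDK so it needs no API key or network access (the system prompt text is an illustrative placeholder):

```python
import json

def build_chat_request(user_message: str, model: str = "gpt-4o-mini") -> str:
    # Shape follows the Chat Completions request body: a model name
    # plus a list of role-tagged messages.
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }
    return json.dumps(payload)

body = build_chat_request("Summarize this meeting transcript.")
print(body)
```

Code that previously sent `"model": "gpt-3.5-turbo"` would only need that one field changed, which is why the summary expects the migration to be widespread.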
💰 Cost Comparison and Impact of GPT-4o mini on Developers
GPT-4o mini's pricing compares favorably with Google's Gemini 1.5 Flash, which charges 35 cents per million input tokens and $1 per million output tokens. GPT-4o mini offers a more attractive rate, especially considering its 128,000-token context window, making it a strong contender for developers looking to minimize costs. The release is a significant win for developers, prompting many to consider updating their codebases to leverage the new model's capabilities and cost-efficiency. Additionally, ChatGPT users, including those on the free plan, will have access to GPT-4o mini, further democratizing access to advanced AI models. The section also introduces a new model, Mistral NeMo, a collaboration between Mistral and Nvidia: a 12-billion-parameter model with enhanced multilingual support and a new tokenizer that promises better efficiency across a wide range of languages.
🌐 Introducing Mistral NeMo: A Multilingual Large Language Model
The script introduces Mistral NeMo, a new model developed in collaboration with Nvidia, distinguished by its 12 billion parameters and strong multilingual capabilities. NeMo supports a wide array of languages, including but not limited to English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. It also features a 128,000-token context window, which, while beneficial, raises questions about the RAM requirements for deployment. The model utilizes a novel tokenizer called Tekken, which is based on tiktoken and is claimed to be more efficient than previous tokenizers, particularly at compressing natural language and source code. Tekken is reported to be about 30% more efficient for certain languages and up to three times better for others, compared to the Llama 3 tokenizer. Mistral NeMo is available on Hugging Face and can be used with Nvidia's NIM inference microservice, showcasing its potential for efficient multilingual applications.
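The tokenizer efficiency claims above boil down to a simple ratio: fewer tokens for the same text means better compression, and, since APIs bill per token, cheaper calls. A toy calculation of that ratio, using made-up illustrative token counts rather than measurements of the real tokenizers:

```python
# Tokenizer efficiency as percentage reduction in token count:
# how many fewer tokens one tokenizer needs than another for the
# same input text.

def efficiency_gain(baseline_tokens: int, new_tokens: int) -> float:
    # Percentage reduction relative to the baseline tokenizer.
    return (baseline_tokens - new_tokens) / baseline_tokens * 100

# Hypothetical token counts for the same source file under two
# tokenizers (illustrative numbers only).
baseline = 1000   # e.g., an older tokenizer
new = 700         # e.g., a more compact one

gain = efficiency_gain(baseline, new)
print(f"{gain:.0f}% fewer tokens")
```

A "30% more efficient" claim in this sense means the same text fits in roughly 70% of the tokens, which also stretches a fixed context window further.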
Mindmap
Keywords
💡GPT-4o mini
💡MMLU
💡API
💡Cost Efficiency
💡Context Window
💡Tokenizer
💡Function Calling
💡Benchmarks
💡Mistral NeMo
💡Tekken Tokenizer
Highlights
OpenAI has released a new model called GPT-4o mini.
GPT-4o mini is a multimodal system, unlike previous models.
GPT-4o mini is considered to be the 'iPhone Mini' of the GPT-4o series.
GPT-4o mini scores 82% on MMLU, but the metric's trustworthiness is questioned.
GPT-4o mini outperforms GPT-4 on chat preferences in the LMSYS arena.
GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens.
The price of GPT-4o mini represents a 99% reduction since text-davinci-003.
GPT-4o mini enables a broad range of tasks with low cost and latency.
GPT-4o mini supports text and vision, with other modalities coming in the future.
GPT-4o mini has a 128,000-token context window.
GPT-4o mini has a knowledge cutoff of October 2023.
GPT-4o mini has an improved tokenizer for handling non-English text.
GPT-4o mini demonstrates stronger performance in function calling.
GPT-4o mini has been evaluated across key benchmarks: reasoning tasks, math, and coding.
GPT-4o mini is expected to replace GPT-3.5 Turbo in many developers' codebases.
Mistral NeMo is a new 12-billion-parameter model built in collaboration with Nvidia.
Mistral NeMo has excellent multilingual capabilities, including many languages not usually covered.
Mistral NeMo uses a new tokenizer called Tekken, which is more efficient than SentencePiece.
Mistral NeMo is released on Hugging Face and can be used in Nvidia's NIM inference microservice.