【Artificial Intelligence】OpenAI Releases New Small Model GPT-4o mini | Replaces GPT-3.5 | Outperforms GPT-4 | Price Down Over 60% | Mistral Releases Small Model NeMo | Surpasses Mistral 7B
TLDR: OpenAI has released GPT-4o mini, a model that outperforms GPT-4 on chat benchmarks at a price more than 60% lower, immediately replacing GPT-3.5 Turbo. It scores 82% on MMLU, supports a 128K-token context window, and is efficient enough for a wide range of tasks. Commercial pricing is highly competitive: 15 cents per million input tokens and 60 cents per million output tokens. Separately, Mistral AI and NVIDIA released Mistral NeMo, a 12B small model that handles multilingual tasks with markedly improved performance and ships with a new tokenizer, Tekken, offering significantly better compression efficiency.
Takeaways
- 🆕 OpenAI released the GPT-4o mini model, outperforming GPT-4 at a sharply lower price.
- 🚀 GPT-4o mini scores 82% on the MMLU benchmark and beats GPT-4 in chat performance.
- 💰 Commercial pricing drops significantly: GPT-4o mini's input and output token costs are more than 60% below GPT-3.5 Turbo's.
- 📈 GPT-4o mini supports a 128K-token context window, with a knowledge cutoff of October 2023.
- 🌐 An improved tokenizer makes GPT-4o mini more economical at handling non-English text.
- 🔍 GPT-4o mini surpasses other small models in multimodal reasoning and textual intelligence.
- 📊 In math and coding, GPT-4o mini scores highly on the MGSM and HumanEval benchmarks.
- 🔒 OpenAI built safety measures in from the start of development, using automated and human evaluations to verify model safety.
- 🔧 Mistral released the NeMo small model, which supports multilingual tasks and compresses text efficiently across many languages.
- 📝 The releases of GPT-4o mini and Mistral NeMo point toward smaller, more efficient, and safer AI models.
- 🔮 Industry observers predict very small yet capable, reliable AI models ahead; the current bulk of large models reflects waste in the training process.
Q & A
What is the significance of the release of GPT-4o mini by OpenAI?
-The release of GPT-4o mini is significant as it immediately replaces the previous GPT-3.5 Turbo model, offering improved performance and a substantial reduction in cost, making it more accessible for a wider range of applications.
What is the MMLU score of GPT-4o mini, and how does it compare to GPT-4 in terms of chat performance?
-GPT-4o mini scores 82% on MMLU, and it outperforms GPT-4 on chat performance benchmarks, indicating enhanced capabilities in language understanding and interaction.
How much cheaper is GPT-4o mini compared to GPT-3.5 Turbo and GPT-4 in terms of commercial use pricing?
-GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo and 96%-97% cheaper than GPT-4o, with a commercial price of 15 cents per million input tokens and 60 cents per million output tokens.
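As a quick sanity check on these rates, here is a minimal cost calculator. The per-million-token prices are the ones quoted above; the workload (request count and token counts) is hypothetical, chosen only for illustration:

```python
# Per-million-token rates quoted above (USD).
GPT_4O_MINI_INPUT = 0.15   # $ per 1M input tokens
GPT_4O_MINI_OUTPUT = 0.60  # $ per 1M output tokens

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the GPT-4o mini bill for a given token volume."""
    return (input_tokens * GPT_4O_MINI_INPUT
            + output_tokens * GPT_4O_MINI_OUTPUT) / 1_000_000

# Hypothetical workload: 10,000 requests, each with a 2,000-token
# prompt and a 500-token reply.
total = cost_usd(10_000 * 2_000, 10_000 * 500)
print(f"${total:.2f}")  # → $6.00
```

At these rates, 20 million input tokens plus 5 million output tokens cost six dollars, which is what makes high-volume applications like multiple chained model calls economically practical.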
What are the key features of GPT-4o mini in terms of context window and tokenizer improvements?
-GPT-4o mini features a 128K token context window and an improved tokenizer based on GPT-4o, which makes it more efficient and cost-effective in handling non-English texts.
When is OpenAI expected to release the voice modality for GPT-4o mini?
-OpenAI plans to release a voice modality test version for GPT-4o mini in late July, with public access permissions to be opened at a later date.
How does GPT-4o mini perform in comparison to other small models in terms of multimodal reasoning and text intelligence?
-GPT-4o mini outperforms other small models in multimodal reasoning and text intelligence, as evidenced by its higher scores in benchmarks like MMLU, MGSM, and MMMU.
What is the token generation speed of GPT-4o mini, and how does it compare to other models in terms of inference efficiency?
-GPT-4o mini generates 183 tokens per second, making it the fastest model on the list and 18 tokens per second faster than the second-fastest model, Gemini 1.5 Flash.
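To make the throughput comparison concrete, a small sketch: the 183 and 165 tokens-per-second figures come from the comparison above, while the response length is a hypothetical example.

```python
# Output speeds quoted above (tokens per second).
SPEEDS = {"GPT-4o mini": 183, "Gemini 1.5 Flash": 165}

def generation_time(tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to stream a response of the given length."""
    return tokens / tokens_per_sec

# Hypothetical 1,000-token response.
for model, speed in SPEEDS.items():
    print(f"{model}: {generation_time(1000, speed):.1f}s")
# GPT-4o mini: 5.5s
# Gemini 1.5 Flash: 6.1s
```

For a long response the 18 token/s gap translates to roughly half a second saved per thousand output tokens.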
What safety measures has OpenAI implemented in GPT-4o mini from the beginning of its development?
-OpenAI has implemented safety measures from the start of GPT-4o mini's development, including filtering out unwanted information during pre-training and using techniques like RLHF to align model behavior with desired strategies.
What is the significance of the release of Mistral NeMo 12B by MistralAI and NVIDIA?
-The release of Mistral NeMo 12B signifies a new small model that is customizable and deployable for various tasks, with improved performance and efficiency due to its consideration of quantization and the use of a new tokenizer, Tekken.
How does the Tekken tokenizer compare to previous tokenizers in terms of compression efficiency for different languages?
-The Tekken tokenizer shows higher compression efficiency in about 85% of languages compared to Llama 3's tokenizer and offers significant improvements in languages like Korean and Arabic, with efficiency increases of up to 2x and 3x respectively.
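Compression efficiency here can be expressed as the ratio of tokens each tokenizer needs for the same text. The token counts below are hypothetical, sized only to mirror the roughly 2x (Korean) and 3x (Arabic) gains cited above:

```python
def compression_gain(baseline_tokens: int, tekken_tokens: int) -> float:
    """How many times fewer tokens the new tokenizer needs for the
    same text (higher is better)."""
    return baseline_tokens / tekken_tokens

# Hypothetical token counts for one passage under an older tokenizer
# vs. Tekken, chosen to illustrate the cited gains.
korean_gain = compression_gain(baseline_tokens=600, tekken_tokens=300)
arabic_gain = compression_gain(baseline_tokens=900, tekken_tokens=300)
print(korean_gain, arabic_gain)  # → 2.0 3.0
```

Fewer tokens per passage means both lower per-request cost and more text fitting inside the same context window.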
What is Andrej Karpathy's perspective on the future of large language model development?
-Andrej Karpathy predicts that we will see very small but highly capable and reliable models in the future. He suggests that the current large size of language models is due to wasteful training processes, and that models must first become large before they can be effectively reduced in size through automated assistance and improved training data.
Outlines
🚀 Launch of GPT-4o Mini Model
OpenAI has announced the 'Mini' version of the GPT-4o model, which is now available to replace the previous GPT-3.5 Turbo. This new model is accessible to free users and offers significant cost reductions compared to previous models, with a commercial price of 15 cents per million input tokens and 60 cents per million output tokens. The GPT-4o mini model achieves an 82% score on the MMLU benchmark, outperforming GPT-4 in chat-related tasks. It features a 128K token context window and improved tokenizer for handling non-English text. The model is capable of performing a wide range of tasks, including applications requiring multiple model calls, large context transfers, and real-time text interactions. Currently, the API supports text and visual inputs, with plans to introduce voice and other modalities later. The model also excels in text intelligence, multimodal reasoning, and function calls, scoring higher than other small models in various benchmarks. OpenAI CEO Sam Altman emphasizes the model's affordability and the company's commitment to safety and reliability, with built-in security measures and continuous improvements.
🌐 Emergence of Smaller AI Models and Their Impact
The script discusses the release of the GPT-4o mini model by OpenAI and its implications for the AI industry. It highlights the model's cost-effectiveness and its potential to replace larger models in various applications. Additionally, MistralAI and NVIDIA have introduced the Mistral NeMo 12B model, which is customizable and supports tasks like chatbots, multilingual tasks, programming, and summarization. The Mistral NeMo model is optimized for FP8 inference and shows significant performance improvements over other models like Gemma 2 9B and Llama 3 8B. It also uses a new tokenizer, Tekken, which is more efficient in compressing natural language text and source code across multiple languages. The script concludes with a discussion on the future of AI models, with predictions about the development of smaller but highly capable models. It also mentions the upcoming release of Meta's 400B parameter Llama 3 model and the anticipation for GPT-5, indicating a competitive landscape in the AI industry.
Keywords
💡GPT-4o mini
💡MMLU
💡Context Window
💡Commercial Pricing
💡RLHF
💡Multimodal Reasoning
💡Tekken Tokenizer
💡FP8 Inference
💡Safety Mitigations
💡Mistral NeMo
Highlights
OpenAI has released a new 'Mini' version of the GPT-4o model, immediately replacing the previous GPT-3.5 Turbo.
GPT-4o mini is now available for free users and has scored 82% on the MMLU benchmark, outperforming GPT-4 in chat capabilities.
The commercial price for GPT-4o mini is significantly lower than previous models, with a 60% reduction compared to GPT-3.5 Turbo and a 96%-97% decrease from GPT-4o.
OpenAI's CEO, Sam Altman, described the cost of accessing intelligence as 'too cheap to meter'.
GPT-4o mini has a context window of 128K tokens and knowledge up to October 2023, with improved tokenization for non-English texts.
The model is capable of performing a wide range of tasks with low cost and latency, such as handling multiple model calls or real-time text interactions.
GPT-4o mini's API currently only supports text and vision, with plans to include voice and other modalities later in July.
GPT-4o mini has shown superior performance in text intelligence and multimodal reasoning compared to other small models.
In mathematical and coding capabilities, GPT-4o mini scores higher than previous models on the MGSM and HumanEval benchmarks.
The model demonstrates strong performance in multimodal reasoning with a score of 59.4% on the MMMU benchmark.
GPT-4o mini's inference efficiency is the highest among its peers, generating 183 tokens per second.
OpenAI has partnered with collaborators to test GPT-4o mini, finding it superior in tasks such as data extraction from receipts or generating email responses.
The model includes built-in safety measures from the beginning of its development, with continuous reinforcement throughout the process.
GPT-4o mini is the first model to apply OpenAI's instruction hierarchy method, enhancing its resistance to jailbreaks, prompt injections, and other security threats.
OpenAI plans to release a fine-tuned version of GPT-4o mini in the coming days, making it available to ChatGPT users.
Mistral AI and NVIDIA have also released a new small model, Mistral NeMo 12B, which is customizable and supports various tasks.
Mistral NeMo supports 128k Tokens context window and can replace systems using Mistral 7B, with improved performance in multiple benchmarks.
The new tokenizer Tekken, used in Mistral NeMo, offers higher compression efficiency for natural language text and source code across many languages.
Andrej Karpathy predicts a future of very small but highly capable and reliable models, as the competition in model scale moves in the opposite direction.
The steep price drop makes it economically feasible to run inference over every word spoken or heard in the US in a 24-hour period for under $200,000.