A New Era of NovelAI Begins Now
TLDR
The video introduces a new era for NovelAI with the addition of inpainting to image generation and the launch of Clio, a custom-made AI model developed in-house. Clio, trained on 1.5 trillion tokens, achieves a LAMBADA score of 73%, higher than other models of similar size. With an 8192-token context and a compact 3 billion parameters, Clio serves as a proof of concept for the company's in-house training. While still experimental, Clio is available for Opus subscribers to test, with wider availability expected in two weeks. The team thanks its audience for their patience and teases more developments to come.
Takeaways
- 🎨 Inpainting is being integrated into image generation, offering a new creative tool for users.
- 🐶 A cute dog appears on screen as a lighthearted distraction while the speaker shifts from text to image generation.
- 🖼️ The image-to-image interface allows users to modify images by adding or replacing elements.
- 📅 The inpainting feature is scheduled for official release two days after the announcement, on Thursday.
- 🚀 Brand new V2 modules are being introduced for Sigurd and Euterpe, enhancing text generation capabilities.
- 🤖 Clio, a custom-made model developed in-house, is introduced as a significant step forward for NovelAI.
- 🧠 Clio has been trained from scratch with a custom tokenizer and a 6-terabyte pretraining dataset.
- 📚 Trained on 1.5 trillion tokens, Clio has strong general knowledge and outperforms similarly sized models.
- 🏆 Clio achieved a LAMBADA score of 73 percent, higher than similarly sized models, and reached 74 during fine-tuning.
- 🔢 With an 8192-token context, Clio can process considerably more context than previous models.
- 🔧 Clio is a 3-billion-parameter model and a proof of concept for the company's ability to train large language models.
- 🔍 While Clio is still experimental, Opus subscribers will have early access, with a wider release planned for two weeks later.
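The figures in the takeaways can be put in proportion with a little arithmetic. The sketch below is illustrative only: it uses the three numbers quoted in the video (1.5 trillion training tokens, 3 billion parameters, an 8192-token context), while the bytes-per-parameter value is an assumption (16-bit weights), since the video does not say what precision the model is stored or served at. The function names are hypothetical.

```python
# Figures quoted in the video.
TRAINING_TOKENS = 1.5e12  # 1.5 trillion training tokens
PARAMETERS = 3e9          # 3 billion parameters
CONTEXT_TOKENS = 8192     # context length in tokens

# Assumption, not from the video: weights stored as 16-bit floats.
ASSUMED_BYTES_PER_PARAMETER = 2


def tokens_per_parameter(tokens: float, parameters: float) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / parameters


def weight_size_gb(parameters: float, bytes_per_parameter: int) -> float:
    """Approximate size of the raw weights in gigabytes (10**9 bytes)."""
    return parameters * bytes_per_parameter / 1e9


if __name__ == "__main__":
    ratio = tokens_per_parameter(TRAINING_TOKENS, PARAMETERS)
    size = weight_size_gb(PARAMETERS, ASSUMED_BYTES_PER_PARAMETER)
    print(f"Training tokens per parameter: {ratio:.0f}")
    print(f"Weights at the assumed precision: about {size:.0f} GB")
    print(f"Context length is 2**13: {CONTEXT_TOKENS == 2**13}")
```

The tokens-per-parameter ratio is the part that follows directly from the video; the weight size changes with whatever precision is actually used.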
Q & A
What is the main announcement regarding image generation?
-The main announcement is the introduction of inpainting to image generation, which allows users to modify and enhance images through an image-to-image interface.
What new modules are being introduced for text generation?
-Sigurd and Euterpe are getting brand new V2 modules, which are complete replacements for the original modules, with several new ones added.
What is significant about the new model Clio?
-Clio is the first custom-made model created entirely in-house, trained from scratch with a custom tokenizer, a 6-terabyte pretraining dataset, a custom fine-tune, and a custom pretrained model. It is designed to excel at storytelling.
How many tokens of data has Clio been trained on?
-Clio has been trained on 1.5 trillion tokens of data.
What is Clio's LAMBADA score and how does it compare to other models?
-Clio's LAMBADA score is 73 percent, which is better than any other similarly sized model and surpasses the evaluations of Krake.
What is the token context length of Clio?
-Clio features an 8192-token context length.
How many parameters does Clio have?
-Clio has 3 billion parameters.
What is the purpose of Clio as a proof of concept model?
-Clio serves as a proof of concept to demonstrate the team's ability to train a large language model, finalize training processes, fix dataset issues, and smooth out any challenges before moving on to larger models.
Who will have access to Clio first and why?
-Opus subscribers will have access to Clio first so they can try it and help iron out any last issues, as Clio is still somewhat experimental.
When will Clio be available to the general public?
-Clio will be available to the general public in two weeks.
What does the team have planned for the future?
-The team has more exciting developments planned for the year and is currently training much larger models, with more details to be shared in the future.
How can the audience stay updated with the latest developments?
-The audience can stay updated by following the team's announcements and subscribing to their services, such as Opus.
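For readers unfamiliar with the metric mentioned in the answers above: LAMBADA asks a model to predict the final word of a passage, and the score is the fraction of passages where the prediction matches. A minimal sketch of that calculation follows. The helper name, the case-insensitive comparison, and the toy predictions and targets are all assumptions made for illustration; none of them come from NovelAI's evaluation.

```python
def last_word_accuracy(predictions: list[str], targets: list[str]) -> float:
    """Fraction of passages whose predicted final word matches the target.

    Comparison ignores case and surrounding whitespace, which is an
    assumption of this sketch rather than a rule of the benchmark.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    correct = sum(
        p.strip().lower() == t.strip().lower()
        for p, t in zip(predictions, targets)
    )
    return correct / len(targets)


# Made-up toy data: three of the four predictions match their targets.
toy_predictions = ["door", "Anna", "river", "sword"]
toy_targets = ["door", "anna", "bridge", "sword"]

if __name__ == "__main__":
    score = last_word_accuracy(toy_predictions, toy_targets)
    print(f"Toy last-word accuracy: {score:.0%}")
```

On this scale, a score of 73 percent means the model predicted the correct final word for roughly 73 out of every 100 passages.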
Outlines
🎨 New Developments in AI Image and Text Generation
The speaker greets the audience and moves quickly to the announcements. The first is the introduction of inpainting to image generation, a change of pace from the usual text generation news, and the audience is invited to look at a cute dog as a distraction while the speaker teases the upcoming features. Next come brand new V2 modules for Sigurd and Euterpe, which replace the original modules and add several new ones. The speaker then introduces Clio, a custom-made AI model developed in-house. Clio was trained on 1.5 trillion tokens of data, which gives it strong general knowledge and better performance than similarly sized models, as shown by its LAMBADA score of 73 percent. The model also offers an 8192-token context in a compact 3-billion-parameter package. Clio serves as a proof of concept, showing the team's capability to train large language models. While still experimental, Clio is available to Opus subscribers for testing, with a wider release expected in two weeks. The speaker thanks the audience for their patience and support.
🚀 Upcoming Features and Subscriber Engagement
The speaker teases more features and developments planned for the year and hopes the audience shares the team's excitement. Giving Opus subscribers first access to new features suggests a tiered release approach, likely intended to gather early feedback and ensure a smooth rollout for the broader user base. The video concludes with a musical interlude, adding a light-hearted touch to the anticipation of future announcements.
Keywords
💡Image Generation
💡Text Generation
💡Modules
💡V2 Variety
💡Tokenizer
💡Pre-trained Data Set
💡Fine Tune
💡Parameter Count
💡LAMBADA Score
💡Context Length
💡Proof of Concept
💡Opus Subscribers
Highlights
Introduction of inpainting to image generation, a new feature in NovelAI.
The feature allows users to modify and replace elements in images.
Announcement of the official release of inpainting in two days, on Thursday.
Sigurd and Euterpe are receiving brand new V2 modules.
The new modules are complete replacements to the original ones with additional features.
Introduction of Clio, the first custom-made model created in-house at NovelAI.
Clio has been trained from scratch with a custom tokenizer and dataset.
Clio was trained on 1.5 trillion tokens, giving it better general knowledge.
Clio achieved a LAMBADA score of 73 percent, surpassing other similarly sized models.
During fine-tuning, Clio reached a LAMBADA score of 74.
Clio features an 8192-token context, a significant increase from previous models.
Clio is compact, with only 3 billion parameters.
Clio serves as a proof of concept for NovelAI's capability to train large language models.
Training Clio helped finalize the training process and resolve dataset issues.
NovelAI has begun training even larger models following the success with Clio.
Clio is currently experimental and available for Opus subscribers to test.
General availability of Clio for all users is expected in two weeks.
The NovelAI team expresses gratitude for the patience and support of the community.
More exciting developments are planned for the year ahead.