Chinese Company Unveils SORA Competitor - "Vidu" AI Video Generator
TLDR
A Chinese company named Shengshu has announced a new AI video generator, Vidu, which is positioned as a competitor to Sora. Vidu is built on a proprietary architecture called Universal Vision Transformer (UViT), which combines diffusion and Transformer models with the aim of producing more coherent and accurate video. The company claims that Vidu can generate a 16-second, 1080p video clip with a single click. Vidu's show reel looks impressive and more realistic than output from current competitors such as Runway and Pika, but the comparison with Sora is not definitive because Sora has not yet been released. UViT's core technology was reportedly proposed by Vidu's team before Sora's model architecture. Interested users can apply to use Vidu through the company's website, shanguai.com. The emergence of Vidu and other recent advancements from China highlight the global competition in the AI space, with countries like China making significant strides.
Takeaways
- 🎉 A Chinese company named Shengshu has announced a new AI video generator called Vidu, which it claims is a competitor to Sora.
- 🚀 Vidu can generate a 16-second 1080p video clip with a single click and is built on a self-developed architecture called Universal Vision Transformer (UViT).
- 🤖 The architecture combines two kinds of AI model, diffusion and Transformer, which is described as an advancement in generative AI.
- 📈 Vidu's core technology was first proposed by its research team in September 2022, predating Sora's model architecture.
- 👀 The Transformer model is known for its ability to understand context, which should theoretically improve the coherence of generated videos.
- 🆚 A side-by-side comparison with Sora shows that Vidu produces high-quality videos, but with some inconsistencies and at a lower resolution than Sora.
- 🌟 Vidu's video generation capabilities are showcased in its show reel, which includes realistic hand movements and detailed imagery.
- 📹 Vidu's videos, while impressive, do have some noticeable flaws, such as inconsistent transformations and inaccuracies in object representation.
- 📝 The article from Global Times suggests that Vidu can output 1080p videos, which were not fully showcased in the provided examples.
- 🌐 Interested users can apply to use Vidu through the website shanguai.com, where they can leave their contact information for further assistance.
- 📈 There has been a surge of AI advancements from China recently, with new language models and robots being unveiled, indicating a competitive edge in the global AI race.
- 💬 The video concludes by encouraging viewers to share their thoughts on Vidu and whether they plan to apply for access, fostering a sense of community and engagement.
Q & A
What is the name of the Chinese company that announced the AI video generator?
-The Chinese company that announced the AI video generator is called Shengshu.
What is the name of the AI video generator Shengshu announced?
-The name of the AI video generator is 'Vidu'.
What is the Universal Vision Transformer (UViT)?
-The Universal Vision Transformer (UViT) is a self-developed visual transformation model architecture that integrates two text-to-video AI models, diffusion and Transformer, and is presented as the next step in generative AI.
How does the Vidu AI video generator compare to Sora in terms of video quality?
-While Vidu generates high-quality videos and has some advantages such as generating hands well, it is not yet on par with Sora in terms of video quality and consistency, as shown in the side-by-side comparisons.
What are some limitations of the Stable Diffusion model?
-Limitations of the Stable Diffusion model include difficulty rendering legible text and difficulty understanding context or following more complicated prompts.
How does the Transformer model contribute to the improvement of the diffusion model?
-The Transformer model, known for its ability to understand context, is merged with the diffusion model to create more coherent and accurate videos or images.
What is the significance of the merger between the Transformer model and the diffusion model?
-The merger is significant because it is considered the next step in generative AI, potentially overcoming the limitations of the diffusion model alone.
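To make the diffusion-plus-Transformer idea above more concrete, here is a minimal, untrained sketch in plain Python. It is not Vidu's or UViT's actual implementation: the function names, the toy patch values, and the use of identity projections in place of learned weights are all assumptions made for illustration. It shows only the general shape of the approach: noise is added to image patches (the diffusion part), and a self-attention layer that sees every patch plus a timestep token produces a per-patch noise estimate (the Transformer part).

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections (no learned weights)."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        # Every token attends to every other token: this is where "context" comes from.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(dim)])
    return out


def add_noise(patches, alpha_bar, rng):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    noisy, noise = [], []
    for patch in patches:
        eps = [rng.gauss(0.0, 1.0) for _ in patch]
        noise.append(eps)
        noisy.append([
            math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
            for x, e in zip(patch, eps)
        ])
    return noisy, noise


def predict_noise(noisy_patches, t):
    """Toy Transformer-style noise predictor: the timestep enters as an extra token."""
    dim = len(noisy_patches[0])
    time_token = [float(t)] * dim          # hypothetical timestep embedding
    tokens = [time_token] + noisy_patches  # time token and patch tokens share one sequence
    mixed = self_attention(tokens)
    return mixed[1:]                       # drop the time token, keep one vector per patch


rng = random.Random(0)
patches = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
noisy, noise = add_noise(patches, alpha_bar=0.5, rng=rng)
predicted = predict_noise(noisy, t=0.5)
print(len(predicted), len(predicted[0]))  # 4 3
```

In a real system the attention layers would have trained weights and the predicted noise would be compared against `noise` during training; the sketch leaves both out and only illustrates how the two components fit together.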
What is the role of the Institute of AI at Tsinghua University in the development of UViT?
-The Institute of AI at Tsinghua University played a role in the research and development of UViT. Zhu Jun, the institute's vice dean, is also chief scientist at Shengshu.
How can one apply to use the Vidu AI video generator?
-To apply to use the Vidu AI video generator, one can visit the website shanguai.com and fill out the application form with a name, phone number, and company name. Applicants can then expect to be contacted by a marketing consultant.
What are some other recent advancements in AI from China?
-Other recent advancements from China include the launch of SenseNova 5.0 by the Chinese company SenseTime, which claims to outperform GPT on nearly all benchmarks, and the unveiling of the S1 robot by the company Astribot.
How does the global AI community view the competition between Vidu and Sora?
-The global AI community generally views competition positively as it drives innovation and improvement in AI technology. The emergence of Vidu as a potential close competitor to Sora is seen as a positive development.
What is the current resolution capability of the Vidu AI video generator?
-The Vidu AI video generator is capable of outputting videos in 1080p resolution, although the examples shown in the script were in 720p.
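As a rough illustration of what the gap between the claimed 1080p output and the 720p show reel means, the snippet below compares pixel counts per frame using the standard 16:9 dimensions for each label. The 24 frames-per-second figure is purely an assumption for the sake of the arithmetic; the source does not state Vidu's frame rate.

```python
# Standard 16:9 frame sizes for the two resolutions mentioned above.
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}


def pixels_per_frame(label):
    """Return the number of pixels in one frame at the given resolution label."""
    width, height = RESOLUTIONS[label]
    return width * height


def total_frames(seconds, fps):
    """Return the number of frames in a clip of the given length."""
    return seconds * fps


ratio = pixels_per_frame("1080p") / pixels_per_frame("720p")
print(ratio)  # 2.25

ASSUMED_FPS = 24  # assumption: the frame rate is not given in the source
print(total_frames(16, ASSUMED_FPS))  # 384
```

So each 1080p frame carries 2.25 times as many pixels as a 720p frame, which is one reason detail in a 720p reel can look softer than a full HD sample.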
Outlines
🚀 Introduction to Shengshu's AI Video Generator
The video introduces a new AI video generator developed by a Chinese company called Shengshu. The generator, named Vidu, is claimed to be a competitor to the AI tool Sora. The video showcases a show reel from Shengshu and discusses the capabilities of Vidu, which is built on a self-developed architecture called Universal Vision Transformer (UViT). UViT combines the strengths of the diffusion model and the Transformer model, which is known for its context understanding capabilities. The video also compares Vidu with Sora, highlighting that while Vidu has some advantages, it is not yet on par with Sora, which is yet to be released.
📊 Comparative Analysis of Vidu and Sora
The second paragraph provides a side-by-side comparison of Vidu and Sora's video generation capabilities. It discusses the quality and realism of the videos produced by both AI tools. The video points out some flaws in Vidu's output, such as inconsistencies in hair transformation and the disappearance of elements like a green leaf. It also contrasts Vidu's generated videos with Sora's, noting that Vidu's videos are not in full HD resolution and that their details are less crisp than Sora's. The paragraph also mentions the accessibility of Vidu through the website shanguai.com and the process of applying for its use.
🌏 Global AI Competition and Recent Chinese Innovations
The final paragraph of the script shifts the focus to the broader context of global AI competition. It emphasizes the recent advancements in AI from China, including a new language model and a fast S1 robot by a company called Astribot. The speaker expresses excitement about the unveiling of Vidu and the potential for it to be a close competitor to Sora. The paragraph also encourages viewers to share their thoughts on Vidu and whether they will apply for access. It concludes with a call to action for viewers to like, share, subscribe, and stay tuned for more content.
Keywords
💡AI Video Generator
💡Universal Vision Transformer (UViT)
💡Diffusion Model
💡Transformer Model
💡Stable Diffusion
💡Shengshu AI
💡Generative AI
💡Runway and Pika
💡Resolution
💡Competition in AI
Highlights
Shengshu, a Chinese company, has announced a new AI video generator called Vidu, which is a competitor to Sora.
Vidu claims to be on par with Sora and can generate a 16-second 1080p video clip with a single click.
The technology behind Vidu is based on a self-developed visual transformation model architecture called Universal Vision Transformer (UViT).
UViT merges the diffusion and Transformer models, which is considered the next step in generative AI.
The Transformer model is known for its ability to understand context, which should make the generated content more coherent.
Zhu Jun, Vice Dean of the Institute of AI at Tsinghua University, claims that Vidu's core technology was proposed before Sora's model architecture.
Vidu's show reel demonstrates the AI's ability to generate realistic videos, including detailed elements like hands.
Comparisons between Vidu and Sora's videos show that while Vidu is impressive, Sora's results appear higher quality and more realistic.
Vidu's video generation has some noticeable flaws, such as inconsistencies in elements like hair and leaves.
The resolution of Vidu's videos is lower than Sora's, with the Vidu show reel only in 720p.
Vidu can output 1080p videos, as mentioned in the Global Times article.
To apply for access to use Vidu, one can fill out a form on the website shanguai.com.
The application process does not specify eligibility and requires basic contact information.
China has been making significant strides in the AI space with recent advancements in language models and robotics.
The unveiling of Vidu adds competition to the AI video generation market, which is beneficial for innovation.
The presenter expresses enthusiasm for competition and the potential of Vidu as a close competitor to Sora.
The presenter encourages viewers to share their thoughts on Vidu and whether they will apply for access.