In a bold move to compete in the rapidly growing artificial intelligence (AI) industry, Chinese tech company Alibaba on Wednesday launched a new version of its AI model, Qwen 2.5-Max, claiming it surpassed the performance of well-known models such as DeepSeek-V3, OpenAI’s GPT-4o and Meta’s Llama.
The release of Qwen 2.5-Max on the first day of the Lunar New Year, a time when many Chinese people are traditionally off work and celebrating with their families, underscores the pressure that DeepSeek’s meteoric rise over the past three weeks has placed not only on its overseas rivals but also on its domestic competitors, such as Tencent Holdings Ltd. and Baidu Inc.
The company’s new model was reportedly pretrained on more than 20 trillion tokens and then post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
“Qwen 2.5-Max outperforms … almost across the board GPT-4o, DeepSeek-V3 and Llama-3.1-405B,” Alibaba’s Cloud unit said in an announcement posted on its official WeChat account, referring to international giants like OpenAI and Meta.
Alibaba announced that its Qwen2.5-Max outperforms DeepSeek V3 in multiple benchmarks, including Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond.
It also demonstrated impressive results in other evaluations, including MMLU-Pro.
The company’s base models have shown substantial improvements across the majority of benchmarks, and it is confident that advancements in post-training methods will raise the next version of Qwen2.5-Max to even greater levels of performance.
“When comparing base models, we are unable to access the proprietary models such as GPT-4o and Claude-3.5-Sonnet. Therefore, we evaluate Qwen2.5-Max against DeepSeek V3, a leading open-weight MoE model, Llama-3.1-405B, the largest open-weight dense model, and Qwen2.5-72B, which is also among the top open-weight dense models,” the company said in a blog post.
Further, Alibaba’s Qwen team added: “Now Qwen2.5-Max is available in Qwen Chat, and you can directly chat with the model, or play with artifacts, search, etc.”
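Alibaba’s Qwen models are typically also exposed to developers through an OpenAI-compatible API on Alibaba Cloud. The snippet below is a minimal sketch of what such a call might look like; the endpoint URL, the model identifier “qwen-max-2025-01-25” and the environment-variable name are illustrative assumptions rather than details confirmed in the article.

```python
# Minimal sketch: calling Qwen2.5-Max via an assumed OpenAI-compatible endpoint.
import os
from openai import OpenAI

client = OpenAI(
    # Hypothetical Alibaba Cloud endpoint for the OpenAI-compatible API (assumption)
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # assumed environment variable
)

response = client.chat.completions.create(
    model="qwen-max-2025-01-25",  # assumed identifier for Qwen2.5-Max
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what makes Qwen2.5-Max notable."},
    ],
)
print(response.choices[0].message.content)
```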
All of this activity traces back to DeepSeek, a Chinese start-up founded in 2023 in Alibaba’s hometown of Hangzhou.
With the release of its DeepSeek-V3 and R1 models, DeepSeek has sent shockwaves across the U.S. AI landscape.
It has also quickly gained global attention for its models’ remarkably low training cost and computing requirements, prompting investors to question the viability of costly AI projects at U.S.-based companies.
The start-up’s success in China has sparked intense competition among the country’s tech giants.
For instance, just two days after DeepSeek launched its R1 model, TikTok’s parent company ByteDance responded with an update to its flagship AI model, claiming it outperformed OpenAI’s o1 on AIME, a crucial benchmark test that evaluates AI performance in understanding and executing complex instructions.
DeepSeek, however, had already made its own bold claim: that its R1 model could rival or surpass OpenAI’s o1 on multiple performance benchmarks.