302.AI 基准实验室 | 阿里千问发布数学模型Qwen2-Math，最好的数学模型出现了？！

302.AI • 2024 年 8 月 19 日下午7:09 • 基准实验室 • 1925 意见

8月9日，阿里通义团队发布新一代数学模型Qwen2-Math，据官方称，Qwen2-Math 是一系列基于 Qwen2 LLM 构建的专门用于数学解题的语言模型，其数学能力显著超越了开源模型，甚至超过了闭源模型（如 GPT-4o），Qwen2-Math包含1.5B、7B、72B三个参数的基础模型和指令微调模型。

在一系列数学基准评测上，Qwen2-Math-72B-Instruct 超越了最先进的模型，包括 GPT-4o、Claude-3.5-Sonnet、Gemini-1.5-Pro 和 Llama-3.1-405B。

Qwen2-Math 的基础模型使用 Qwen2-1.5B、7B、72B 进行初始化，然后在精心设计的数学专用语料库上进行预训练。在三个广泛使用的英语数学基准 GSM8K、Math 和 MMLU-STEM 上评估了 Qwen2-Math 基础模型。同时，还评估了三个中国数学基准 CMATH、高考数学完形填空和高考数学问答。所有评估都使用少量的思路链提示进行测试。

阿里通义团队基于 Qwen2-Math-72B 训练了数学专用奖励模型，并结合二进制信号通过 GRPO 进行强化学习。对 Qwen2-Math-Instruct 在英语和中文的数学基准评测上进行了评估。除了常用的基准评测，如 GSM8K 和 MATH 之外，还加入了更具挑战性的考试以全面检测 Qwen2-Math-Instruct 的能力。其中，Qwen2-Math-Instruct 在基准测试中表现最佳，证明了数学奖励模型的有效性。

在更复杂的数学竞赛评估（例如 AIME 2024 和 AMC 2023）中，Qwen2-Math-Instruct 在各种设置中也表现良好，包括 Greedy、Maj@64、RM@64 和 RM@256。

在官方文档中，千问团队也展示了一些竞赛题的示例，比如：

据了解，Qwen2-Math目前主要针对英文场景，中英双语和多语言模型正在开发中。另外，根据Qwen2-Math的许可协议，对于72B版本，如果每月活跃用户数超过1亿，是需要向千问团队申请许可。

然而，在302.AI的API超市中，已经更新了Qwen2-Math-72B的API。302.AI提供按需付费的付费方式，支持在线调试，通过302.AI的API超市，用户可以通过简单的API调用来集成复杂的功能，而且提供技术支持和帮助文档，帮助用户解决集成过程中遇到的问题。

值得一提的是，302.AI的聊天机器人也同步更新了Qwen2-Math-72B模型，为用户提供了一个更为便捷的使用途径，对于不熟悉API使用的AI爱好者，可以直接通过302.AI的聊天机器人来使用这一模型，同样是按需付费的模式，无需月费或捆绑套餐，使用户能够灵活地体验和应用这一先进的数学模型。

最后，用一个常用的数学问题来测试下Qwen2-Math-72B模型。结果显示，这道曾经让多个模型蒙圈的题目，不仅没有难倒Qwen2-Math-72B模型，且每一步的解释都比较清楚：

随着Qwen2-Math数学模型的推出，它不仅为数学教育和研究领域带来了新的发展机遇，更标志着人工智能技术的进一步融入我们的日常生活。数学模型的出现，其意义远超解决单一深奥数学题目的范畴，它为解题者提供了一种全新的思路和方法，通过展示解题过程，帮助用户逐步深入理解数学概念和原理，从而培养用户的逻辑思维和问题解决能力。未来，我们可以期待支持多语言的数学模型出现。

👉立即注册免费试用302.AI，开启你的AI之旅！👈

为什么选择302.AI？

● 灵活付费：无需月费，按需付费，成本可控
● 丰富功能：从文字、图片到视频，应有尽有，满足多种场景需求
● 开源生态：支持开发者深度定制，打造专属AI应用
● 易用性：界面友好，操作简单，快速上手

LLM Qwen2-Math 阿里通义302.AI 基准实验室 | 模型测评

喜欢 (0)

302.AI

302.AI 新品发布 | 当FLUX结合LoRA技术，你还分得清现实和AI吗？

上一页 2024 年 8 月 19 日下午6:44

302.AI 基准实验室 | OpenAI更新模型ChatGPT-4o-latest，与GPT-4o对比不同在哪里？

下一页 2024 年 8 月 20 日下午6:33

从文本助手到生产力智能体——2025大模型年度测评：多模态、强推理与真交付 | 302.AI 基准实验室

导读：2025年，大语言模型完成从“文本助手”到“生产力智能体”的关键跃迁。本报告深度实测Gemini 3 Pro、Claude Opus 4.5、GPT-5.2、Grok 4.1、GLM-4.7、DeepSeek-V3.2六大旗舰模型，覆盖模型幻觉控制、复杂逻辑推理、多模态融合理解、创意生成与人类直觉、编程与工程化交付五大高难度真实场景。评测结果显示：G…
2026 年 1 月 14 日 • 基准实验室
1.8K00
懂交付，更懂质感：MiniMax M2.1 Vs. GLM 4.7 国产开源顶流对决丨302.AI 基准实验室

12 月 23 日，MiniMax 正式对外发布其新一代旗舰级 Coding & Agent 模型 MiniMax M2.1。与许多大模型发布会执着于罗列通用知识得分不同，M2.1 这次把所有的聚光灯都打在了“编程”与“智能体”这两个关键词上，官方定位直言不讳：为真实世界的复杂任务而生。显然，这不仅仅是一次常规的版本迭代，更像是 MiniMax 在…
2025 年 12 月 31 日 • 基准实验室
3.1K01
302.AI客户端：零配置，支持任意模型，最适合新手的Vibe Coding工具 | 新品发布

在AI行业飞速发展的2025 年，最炙手可热的关键词之一绝对少不了 “Vibe Coding” 。所谓 Vibe Coding，即“氛围感编程”——你只需使用自然语言描述需求，AI 便会为你生成代码。这一变革彻底粉碎了编程的技术高墙，让每一位普通人都能跳过晦涩的编程语言，亲手打造专属应用。为Vibe Coding打造的工具也层出不穷，在 Cursor、L…
2025 年 12 月 26 日 • 新品发布
1.5K00
智谱压轴力作 GLM-4.7 实测：从基准刷榜到任务交付，稳坐开源第一丨302.AI 基准实验室

随着2025年接近尾声，大模型领域的竞争未见放缓，反而迎来了一波重磅更新。今日凌晨，智谱突袭发布了其新一代旗舰模型——GLM-4.7，以一系列 SOTA 表现，为今年的开源战场献上了堪称“压轴”的力作。此次更新将核心焦点投向了编码能力、长程任务规划与智能体协作，不仅在多项国际主流基准测试中横扫开源榜单，更以任务交付为核心，致力于成为开发者手中真正高效、可靠…
2025 年 12 月 23 日 • 基准实验室
5.5K00

发表回复

tlovertonet 2025 年 5 月 23 日上午9:28
There is noticeably a bundle to know about this. I assume you made certain nice points in features also.
回复
Tom Wenger 2025 年 6 月 4 日下午5:34
Thank you for the sensible critique. Me & my neighbor were just preparing to do a little research on this. We got a grab a book from our area library but I think I learned more clear from this post. I’m very glad to see such great info being shared freely out there.
回复
Elnora Mamaril 2025 年 6 月 16 日下午5:45
This web site is my breathing in, rattling great design and style and perfect content material.
回复
Davis Audas 2025 年 6 月 30 日下午3:39
You made some good points there. I did a search on the topic and found most folks will agree with your blog.
回复
Zachery Burnside 2025 年 7 月 1 日上午8:32
I have to show my passion for your generosity for men and women who really need help on this particular subject matter. Your real commitment to getting the message up and down had been wonderfully significant and have really permitted some individuals much like me to get to their dreams. Your own invaluable recommendations means a lot a person like me and even more to my office workers. Best wishes; from each one of us.
回复
Hawaii medical malpractice lawyer 2025 年 7 月 24 日上午9:41
I’m writing to let you be aware of of the nice discovery my wife’s princess gained checking yuor web blog. She figured out so many pieces, most notably how it is like to have an ideal giving style to get other folks without problems fully understand various extremely tough subject matter. You undoubtedly surpassed our expected results. Thanks for rendering those helpful, dependable, informative and also unique tips on that topic to Jane.
回复
Ronna Lemaitre 2025 年 7 月 28 日下午7:26
Simply wanna say that this is very useful, Thanks for taking your time to write this.
回复
short play script 2025 年 7 月 30 日上午4:03
You made some clear points there. I did a search on the subject matter and found most people will consent with your site.
回复
cabling technician near me san Antonio,tx 2025 年 7 月 30 日上午4:43
Hello! I could have sworn I’ve been to this blog before but after browsing through some of the post I realized it’s new to me. Anyways, I’m definitely happy I found it and I’ll be book-marking and checking back frequently!
回复
industrial workshop cloth 2025 年 7 月 31 日上午12:49
I enjoy examining and I believe this website got some really useful stuff on it! .
回复
hosting services 2025 年 8 月 7 日上午11:19
I like what you guys are up too. Such clever work and reporting! Carry on the excellent works guys I have incorporated you guys to my blogroll. I think it’ll improve the value of my site :)
回复
hptoto 2025 年 8 月 16 日上午3:28
Thank you, I have recently been searching for information about this topic for ages and yours is the greatest I’ve discovered till now. But, what about the bottom line? Are you sure about the source?
回复
toto macau 2025 年 8 月 17 日下午8:54
I got what you mean , appreciate it for putting up.Woh I am pleased to find this website through google. “Food is the most primitive form of comfort.” by Sheila Graham.
回复
casino games online 2025 年 8 月 19 日下午1:53
I’ve been exploring for a little for any high quality articles or blog posts on this kind of area . Exploring in Yahoo I at last stumbled upon this site. Reading this information So i am happy to convey that I’ve a very good uncanny feeling I discovered exactly what I needed. I most certainly will make certain to do not forget this site and give it a look on a constant basis.
回复
toto macau 2025 年 8 月 21 日上午12:20
I carry on listening to the newscast speak about receiving boundless online grant applications so I have been looking around for the top site to get one. Could you advise me please, where could i acquire some?
回复
shorts deportivos mujer 2025 年 8 月 26 日上午12:19
I like this post, enjoyed this one regards for putting up.
回复