
上周,OpenAI在直播中发布了 o 系列新模型:o4-mini 和 o3。
OpenAI表示,o3是他们目前最强大的推理模型,在分析图像、图表和图形等视觉任务中表现尤为出色。而 o4-mini 则是一个较小的模型,专注于快速且经济高效的推理,特别在数学、编码和视觉任务中实现了优异的性能。
接下来,我们将在 302.AI 平台上分别对 o4-mini 和 o3 进行实测对比,以评估这两大新模型的性能表现。
OpenAI o4-mini & o3模型实测
I. o4-mini实测
(对比模型:DeepSeek R1、o1-mini)
1、简单推理
提示词:
分析下列序列的规律,并填写后续三个元素: 3, 5, 6, 10, 9, 17, 12, 26, 15, ___, ___, ___
题目分析:序列中的规律是交替进行,正确答案为:37, 18, 50。
o4-mini:解析过程较为简洁,答案正确。
o1-mini:奇偶列规律分析正确,但是数字所在位数数错了,导致最后答案是错误的。
DeepSeek R1:分析规律正确,答案正确。
2、模型幻觉测试
提示词:“独在异乡为异客”的前一句是什么?
题目分析:“独在异乡为异客”就是古诗《九月九日忆山东兄弟》的第一句,没有前一句。
o4-mini:答案正确。
o1-mini:答案错误,存在明显的幻觉。
DeepSeek R1:回答正确。
3、编程测试
提示词:请生成一个跑酷游戏,界面必须包含游戏操作说明,开始游戏按钮
o4-mini:游戏界面比较简洁,跳跃正常,观察到右上角的分数随着开始时间一直在增加,不过部分障碍物设置太高,多次尝试仍然是无法越过,这不太合理。
o1-mini:根据操作说明按下空格键可跳跃,但实操发现空格键并未响应,存在明显逻辑问题。
DeepSeek R1:按照操作说明可进行跳跃,但是发现障碍物设置并不合理,完全未起到阻碍的作用,游戏存在明显问题。
其他模型效果:
来看下 o3 的效果,整体还不错。完整度较高,障碍设置合理,分数是根据成功跳过障碍物实时增加的。
II. o3实测
对比模型:Gemini 2.5 pro、Doubao-1.5-Thinking-Pro-Vision
1、地点识别
提示词:图片是在哪拍摄的?
题目解析:对于地标建筑不是特别明显的图片,模型要正确识别难度还是比较大的,图片正确的位置为:位于广州市白云区的麓湖公园。
o3:答案错误。
Gemini 2.5 pro:答案错误。
Doubao-1.5-Thinking-Pro-Vision:回答正确。
2、图片推理
提示词:杯子有多高?
题目分析:根据图片可知存在两个未知数:一个是杯子的高度(题目所问),另一个是杯子叠加的高度。通过设定未知数可以列出方程,根据两个等式求解,以得出杯子的高度。正确答案为 14 厘米。
o3:回答正确。
Gemini 2.5 pro:回答正确。
Doubao-1.5-Thinking-Pro-Vision:回答正确。
3、图片找不同
提示词:图片中共有6处不同,请指出具体在哪里
(右侧为答案)
o3:未能准确找出不同之处,描述不对。
(红色圈出的部分是错误的)
Gemini 2.5 pro:正确指出了三处不同。
(红色圈出的部分是完全错误的)
Doubao-1.5-Thinking-Pro-Vision:正确指出了五处不同。
(红色圈出的部分是完全错误的)
III. 实测总结
1、实测结果整理:
o4-mini & DeepSeek R1 & o1-mini | |||
简单推理 | 模型幻觉 | 编程测试 | |
o4-mini | 正确 | 正确 | 部分障碍物设置过高 |
o1-min | 错误 | 错误 | 存在逻辑问题 |
DeepSeek R1 | 正确 | 正确 | 障碍物设置过于简单 |
o3 & Gemini 2.5 pro & Doubao-1.5-Thinking-Pro-Vision | |||
地点识别 | 图片推理 | 图片找不同 | |
o3 | 错误 | 正确 | 未能准确找出 |
Gemini 2.5 pro | 错误 | 正确 | 正确找出3处,有3处错误 |
Doubao-1.5-Thinking-Pro-Vision | 正确 | 正确 | 正确找出5处,有1处错误 |
2、实测总结:
通过以上实测,可初步得出以下结论:
o4-mini & DeepSeek R1 & o1-mini
(1)o4-mini 较于 o1-mini 有明显的能力提升:在简单推理与模型幻觉测试中,o4-mini 和 DeepSeek R1 在简单推理和模型幻觉测试中均表现出色,o1-mini 则是表现较差。
(2)轻量级模型在编程能力上还有待提升:三个对比模型在编程任务中均存在不足,o4-mini 在障碍物设置方面存在不合理之处,如障碍物过高以至于无法越过,o1-mini存在明显的逻辑问题,而DeepSeek R1则因障碍物设置过于简单而未能有效发挥作用。
o3 & Gemini 2.5 pro & Doubao-1.5-Thinking-Pro-Vision
(1)o3 模型地点识别任务未达到网络预期水平:地点识别任务中处理随手拍摄且缺乏显著地标的图片时,仅Doubao-1.5模型能够提供准确答案。
(2)各模型在常规图片推理方面具备一定能力,但在复杂视觉任务中仍有较大提升空间:在简单图片推理任务中,各模型均能给出正确答案,但在难度较高的找不同测试中,所有模型均未能准确指出所有不同之处。
如何在302.AI中使用
302.AI的聊天机器人和API超市提供了按需付费无订阅的服务方式,企业和个人用户可按需灵活选用。
1、使用模型对话
使用路径:依次点击使用机器人→聊天机器人→ 选择模型 →创建聊天机器人;
o3/o4-mini:
2、使用模型API
企业用户可以通过302.AI的API超市快速、便捷地调用模型,还能够根据特定项目需求进行定制化开发。
相关文档:使用API→API超市→语言大模型→OpenAI→查看文档;
API名称:
o4-mini:o4-mini
o3:o3
👉立即注册免费试用302.AI,开启你的AI之旅!👈
为什么选择302.AI?
● 灵活付费:无需月费,按需付费,成本可控
● 丰富功能:从文字、图片到视频,应有尽有,满足多种场景需求
● 开源生态:支持开发者深度定制,打造专属AI应用
● 易用性:界面友好,操作简单,快速上手

Comments(26)
This website online can be a stroll-by way of for all the information you needed about this and didn’t know who to ask. Glimpse right here, and you’ll positively discover it.
whoah this weblog is fantastic i really like studying your posts. Stay up the great work! You realize, a lot of people are searching around for this information, you could aid them greatly.
Wohh exactly what I was looking for, thankyou for putting up.
You are a very clever person!
Great info and right to the point. I don’t know if this is truly the best place to ask but do you people have any ideea where to hire some professional writers? Thank you :)
Hiya, I am really glad I have found this information. Today bloggers publish only about gossips and internet and this is actually irritating. A good website with interesting content, this is what I need. Thanks for keeping this site, I will be visiting it. Do you do newsletters? Cant find it.
Just wanna remark on few general things, The website design and style is perfect, the content is real good : D.
Your place is valueble for me. Thanks!…
I just could not depart your website prior to suggesting that I extremely enjoyed the standard information a person provide for your visitors? Is going to be back often in order to check up on new posts
You made some decent points there. I did a search on the topic and found most individuals will consent with your blog.
Very interesting topic, thanks for putting up.
[…] 4的实测对比,参与此次对决的选手包括:Gemini 2.5 Pro、Claude-opus-4、o3以及DeepSeek-R1。究竟Grok […]
F*ckin’ remarkable things here. I’m very happy to peer your article. Thank you a lot and i am looking ahead to contact you. Will you kindly drop me a e-mail?
you might have a great weblog right here! would you like to make some invite posts on my weblog?
Youre so cool! I dont suppose Ive read something like this before. So good to seek out someone with some original ideas on this subject. realy thanks for beginning this up. this web site is something that’s needed on the web, someone with a bit of originality. helpful job for bringing one thing new to the internet!
of course like your web-site however you need to test the spelling on several of your posts. Several of them are rife with spelling problems and I in finding it very troublesome to tell the reality however I’ll definitely come again again.
I discovered your weblog website on google and examine just a few of your early posts. Continue to maintain up the excellent operate. I just extra up your RSS feed to my MSN News Reader. Looking for ahead to reading more from you afterward!…
I’ll immediately grasp your rss as I can’t in finding your email subscription hyperlink or newsletter service. Do you have any? Kindly let me recognize so that I may just subscribe. Thanks.
Thanks for a marvelous posting! I actually enjoyed reading it, you will be a great author.I will be sure to bookmark your blog and will eventually come back someday. I want to encourage that you continue your great work, have a nice morning!
You really make it seem really easy with your presentation however I to find this matter to be really one thing that I believe I might never understand. It seems too complex and extremely broad for me. I am taking a look ahead in your next post, I will attempt to get the hang of it!
You can certainly see your enthusiasm within the paintings you write. The arena hopes for more passionate writers like you who aren’t afraid to mention how they believe. At all times go after your heart. “Until you walk a mile in another man’s moccasins you can’t imagine the smell.” by Robert Byrne.
Very interesting info !Perfect just what I was searching for! “It is our choices…that show what we truly are, far more than our abilities.” by J. K. Rowling.
Great post. I used to be checking continuously this blog and I am inspired! Extremely helpful info specially the final phase :) I maintain such information much. I was seeking this particular information for a long time. Thank you and good luck.
I am impressed with this site, real I am a fan.
Thanks for another informative website. Where else could I get that kind of information written in such an ideal way? I have a project that I’m just now working on, and I’ve been on the look out for such information.
Very well written article. It will be supportive to everyone who usess it, including yours truly :). Keep up the good work – i will definitely read more posts.