from Hacker News

OpenAI o3-pro

by mfiguiere on 6/10/25, 8:15 PM with 199 comments

by DanMcInerney on 6/10/25, 9:09 PM
I'm really hoping GPT-5 is a larger jump in metrics than the last several releases we've seen, like Claude 3.5 to Claude 4 or o3-mini-high to o3-pro. I will qualify that, though: I've been building agents for about a year now, and despite the benchmarks showing only slight improvement, each new generation feels noticeably better at exactly the same tasks I gave the previous generation.
It would be interesting if there were a model that was specifically trained on task-oriented data. It's my understanding they're trained on all data available, but I wonder if they can be fine-tuned or given some kind of reinforcement learning on breaking down general tasks into specific implementations. Essentially an agent-specific model.
by chad1n on 6/10/25, 9:03 PM
The guys in the other thread who said that OpenAI might have quantized o3 and that's how they reduced the price might be right. This o3-pro might be the actual o3-preview from the beginning and the o3 might be just a quantized version. I wish someone would benchmark all of these models to check for drops in quality.
by manmal on 6/10/25, 8:23 PM
The benchmarks don’t look _that_ much better than o3. Does that mean Pro models are just incrementally better than base models, or are we approaching the higher end of a sigmoid function, with performance gains leveling off?
by mark_l_watson on 6/11/25, 4:18 AM
I am still not willing to upgrade to a Pro account. I pay $20 a month for both Gemini and ChatGPT, and for what I need this is currently enough.
I have dreamed of having powerful AI ever since I read Bertram Raphael's great book Mind Inside Matter around 1978, getting hooked on AI research and sometimes practical applications for my life since then.
I can easily afford $200 for a Pro account but I get this nagging feeling that LLMs are not the final path to the powerful AI I have always dreamed of and I don't want to support this level of hype.
I have lived through a few AI winters and I worry that accountants will tally up the costs, environmental and monetary, versus the benefits and that we collectively have an 'oh shit' moment.
by swyx on 6/10/25, 8:40 PM
here's a nice user review we published: https://www.latent.space/p/o3-pro
sama's highlight[0]:
> "The plan o3 gave us was plausible, reasonable; but the plan o3 Pro gave us was specific and rooted enough that it actually changed how we are thinking about our future."
I kept nudging the team to go the whole way to just let o3 be their CEO but they didn't bite yet haha
0: https://x.com/sama/status/1932533208366608568
by WhitneyLand on 6/10/25, 9:03 PM
So, we currently have o4-mini and o4-mini-high, which represent medium and high usage of “thinking” or use of reasoning tokens.
This announcement adds o3-pro, which pairs with o3 in the same way the o4 models go together.
It should be called o3-high, but to align with the $200 pro membership it’s called pro instead.
That said, o3 is already an incredibly powerful model. I prefer it over the new Anthropic 4 models and Gemini 2.5. Its raw power seems similar to those others, but it’s so good at inline tool use it usually comes out ahead overall.
Any non-trivial code generation/editing should be using an advanced reasoning model, or else you’re losing time fixing more glitches or missing out on better quality solutions.
Of course the caveat is cost, but there’s value on the frontier.
by eru on 6/12/25, 9:29 AM
I'm trying out o3-pro now with some algorithmic questions. It seems to be doing alright, but it's taking an awfully long time (as expected) and the UIs seem to time out a lot, especially the Android app and the macOS desktop app. The web interface seems the least flaky, but that's not saying much.
by ChrisArchitect on 6/10/25, 8:39 PM
Related:
OpenAI dropped the price of o3 by 80%
https://news.ycombinator.com/item?id=44239359
by tiahura on 6/10/25, 8:23 PM
So, upgrade to Teams and pay the $50? Plus more usage of o3. Seems like it might be a shot at the $100 Claude Max?
by honeybadger1 on 6/12/25, 9:23 AM
Gemini still, for me, feels like the king for speed and accuracy.
by nickandbro on 6/11/25, 1:27 AM
"create a svg of a pelican riding on a bicycle"
https://www.svgviewer.dev/s/c3j6TEAP
in case anyone is interested
by vintagedave on 6/12/25, 9:05 AM
> Update to o4-mini (June 6, 2025)
> We are rolling back an o4-mini snapshot, that we deployed less than a week ago and intended to improve the length of model responses, because our automated monitoring tools detected an increase in content flags.
Does anyone know what it did or returned? I had not seen anything, nor have I read anything, about issues here.
by ikerino on 6/11/25, 2:33 AM
https://www.latent.space/p/o3-pro
Have completed around a dozen chats with o3-pro so far. Can't say I'm impressed, output feels qualitatively very similar to regular o3.
Tried feeding in loads of context as suggested in the article but generally feels like a miss.
by conradfr on 6/12/25, 12:17 PM
Nitpicking, but this page is not practical to share as there's no individual URL per post (AFAIK) (the # part is not picked up by Slack etc. to generate a preview).
by paul7986 on 6/12/25, 2:36 AM
GPT needs way better image creation! Today I asked it to create a full image of a 2025 calendar highlighting all weekday workdays, excluding federal holidays, with a legend at the bottom telling me how many weekday work hours are available within those criteria.
It created the image showing each month but when you looked at each month it was so janky ... February 31st and other huge errors!
I'm not using image creation to create 3D art for fun or for art's sake; I'm trying to use it to create utility images to share for discussion with friends & co-workers. The above is just one of many ways it fails when creating utility images!
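For reference, the number the legend was supposed to show can be checked with a few lines of Python. This is only a rough sketch under stated assumptions: the holiday dates are hard-coded from the standard list of 11 US federal holidays for 2025, the 8-hour workday is a guess at what "work hours" means, and the names `FEDERAL_HOLIDAYS_2025`, `HOURS_PER_WORKDAY` and `workdays_in_year` are made up for illustration.

```python
import calendar
from datetime import date, timedelta

# Assumption: the 11 US federal holidays for 2025, hard-coded by hand.
# None of them falls on a weekend in 2025, so no "observed" shifts apply.
FEDERAL_HOLIDAYS_2025 = {
    date(2025, 1, 1),    # New Year's Day
    date(2025, 1, 20),   # Martin Luther King Jr. Day
    date(2025, 2, 17),   # Washington's Birthday
    date(2025, 5, 26),   # Memorial Day
    date(2025, 6, 19),   # Juneteenth
    date(2025, 7, 4),    # Independence Day
    date(2025, 9, 1),    # Labor Day
    date(2025, 10, 13),  # Columbus Day
    date(2025, 11, 11),  # Veterans Day
    date(2025, 11, 27),  # Thanksgiving Day
    date(2025, 12, 25),  # Christmas Day
}

# Assumption: a standard 8-hour workday.
HOURS_PER_WORKDAY = 8


def workdays_in_year(year, holidays):
    """Count Monday-Friday dates in `year` that are not in `holidays`."""
    day = date(year, 1, 1)
    count = 0
    while day.year == year:
        # weekday() returns 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count


workdays = workdays_in_year(2025, FEDERAL_HOLIDAYS_2025)
work_hours = workdays * HOURS_PER_WORKDAY

# Number of days in February 2025, as a sanity check on the calendar itself.
feb_days = calendar.monthrange(2025, 2)[1]

print(f"Workdays in 2025: {workdays}")
print(f"Work hours in 2025: {work_hours}")
print(f"Days in February 2025: {feb_days}")
```

Under those assumptions this gives 250 workdays and 2,000 work hours, and February 2025 has 28 days, so a "February 31st" cell is wrong on its face.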
by mmsc on 6/10/25, 8:22 PM
I understand that things are moving fast and all, but surely the.. 8? models which are currently available are a bit .. overwhelming for users that just want to get answers to their questions of life? What's the end goal with having so many models available?
by carmelion on 6/10/25, 8:33 PM
Jl App