by decide1000 on 3/16/25, 10:23 AM with 257 comments
by GavCo on 3/16/25, 1:07 PM
The source for this claim is apparently a chart in the second tweet in the thread, which compares ERNIE-4.5 to GPT-4.5 across 15 benchmarks and shows that ERNIE-4.5 scores an average of 79.6 vs 79.14 for GPT-4.5.
The problem is that the benchmarks they included in the average are cherry-picked.
They included benchmarks on 6 Chinese language datasets (C-Eval, CMMLU, Chinese SimpleQA, CNMO2024, CMath, and CLUEWSC) along with many of the standard datasets that all of the labs report results for. On 4 of these Chinese benchmarks, ERNIE-4.5 outperforms GPT-4.5 by a big margin, which skews the whole average.
This is not how results are normally reported and (together with the name) seems like a deliberate attempt to misrepresent how strong the model is.
Bottom line, ERNIE-4.5 is substantially worse than GPT-4.5 on most of the difficult benchmarks, matches GPT-4.5 and other top models on saturated benchmarks, and is better only on (some) Chinese datasets.
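To make the averaging point concrete, here is a minimal sketch with invented numbers (they are not the scores from the chart, and the benchmark names are placeholders). It shows how a few large wins on one subset can flip a headline average even when the other model leads on every remaining benchmark.

```python
from statistics import mean

# Hypothetical scores for illustration only, NOT taken from the chart.
# Each entry is (model_a, model_b, belongs_to_regional_subset).
SCORES = {
    "general_1": (70.0, 75.0, False),
    "general_2": (60.0, 66.0, False),
    "general_3": (80.0, 84.0, False),
    "regional_1": (90.0, 70.0, True),
    "regional_2": (88.0, 72.0, True),
}


def averages(scores, include_regional=True):
    """Return (mean_a, mean_b) over the selected benchmarks."""
    rows = [
        (a, b)
        for a, b, regional in scores.values()
        if include_regional or not regional
    ]
    return mean(a for a, _ in rows), mean(b for _, b in rows)


overall_a, overall_b = averages(SCORES)
general_a, general_b = averages(SCORES, include_regional=False)

print(f"all benchmarks: A={overall_a:.1f} B={overall_b:.1f}")
print(f"general only:   A={general_a:.1f} B={general_b:.1f}")
```

With these made-up inputs, model A wins the overall average while trailing on every general benchmark, which is why per-benchmark results say more than a single mean.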
by ksec on 3/16/25, 11:43 AM
This is just like everything in China. They will find ways to drive costs down below what anyone previously imagined, subsidised or not. And even just competing among themselves, with DeepSeek vs ERNIE and open-sourcing them, means there is little to no space left for most others.
Both the DRAM and NAND businesses of Samsung / Micron may soon be gone. I thought this was going to happen sooner, but it seems to be finally happening. GPU and CPU designs are already in the pipeline with RISC-V, IMG and ARM-China. OLED is catching up, and LCD has already been taken over. Batteries we know. The only thing left is foundries.
Huawei may release its own open-source PC OS soon. We are slowly but surely witnessing the collapse of the Western tech scene.
by patrickhogan1 on 3/16/25, 11:39 AM
https://research.baidu.com/Blog/index-view?id=89
I am excited to see Baidu catch up. It feels like they have earned it, being very early.
by jampekka on 3/16/25, 11:08 AM
by pacifika on 3/16/25, 11:26 AM
by decide1000 on 3/16/25, 10:26 AM
Comparison models: https://x.com/Baidu_Inc/status/1901094083508220035/photo/1
by simonw on 3/16/25, 11:26 AM
by Logge on 3/16/25, 12:20 PM
by colesantiago on 3/16/25, 12:21 PM
OpenAI, Anthropic, et al, are getting sucked into a vortex of competition with China that is ultimately going to zero.
AI is the ultimate race to zero.
There is no moat. AI and intelligence are becoming a commodity, with nobody (except Nvidia) making money. This has been known for a while now.
The acceleration and adoption will only leave those in the middle, who aren't aware of the change happening, without a job and unable to get one.
The US-China competition in addition to Jevons Paradox will be so viciously fierce that jobs will be removed as soon as they are created.
by jamesblonde on 3/16/25, 1:21 PM
https://github.com/PaddlePaddle/Paddle
They have pedigree.
by kleiba on 3/16/25, 12:02 PM
China: Thanks, already on it.
by curl-up on 3/16/25, 12:46 PM
Being able to clearly and correctly discuss science topics, to write about art, to understand nuances in (previously unseen) literature, etc. is impossible simply through powerful-reasoning + RAG, and so many advanced use cases would be enabled by this. Sonnet 3.5+ and GPT 4.5 are still unparalleled here, and it's not even close.
by pera on 3/16/25, 12:34 PM
by cubefox on 3/16/25, 1:39 PM
by ohso4 on 3/16/25, 2:15 PM
by gitfan86 on 3/16/25, 12:25 PM
BUT new use cases are now realistic. The question is how long until demand for the new use cases shows up.
by logicchains on 3/16/25, 11:34 AM
by unhappy_meaning on 3/16/25, 2:39 PM
by infrawhispers on 3/16/25, 11:31 AM
by itsTyrion on 3/24/25, 4:40 PM
by hjgjhyuhy on 3/16/25, 11:28 AM
by camillomiller on 3/16/25, 11:17 AM
by buyucu on 3/16/25, 12:35 PM
OpenAI is increasingly irrelevant. They no longer push the boundaries of technology.
by folli on 3/16/25, 1:36 PM
I assume there's some reasonable tool out there to convert PDFs to Markup and then feed it to some LLM API with okay costs (Gemini? DeepSeek?). Any suggestions?