by oli5679 on 1/30/25, 11:47 AM with 59 comments
by ben30 on 1/30/25, 2:15 PM
What's particularly striking is their deliberate choice to stay lean and problem-focused, avoiding the bureaucratic bloat that often plagues AI departments at larger companies. By hiring people driven primarily by curiosity and technical challenges rather than career advancement, they've created an environment where genuine innovation can flourish.
AI development doesn't necessarily require massive resources - it's more about fostering the right culture of open collaboration and maintaining focus on the core technical challenges.
by walterbell on 1/30/25, 6:51 PM
> Liang Wenfeng is a very rare person in China's AI industry who has abilities in “strong infrastructure engineering, model research, and also resource mobilization”, and “can make accurate high-level judgments, and can also be stronger than a frontline researcher in the technical details”. He has a “terrifying ability to learn” and at the same time is “less like a boss and more like a geek”.
by infecto on 1/30/25, 2:11 PM
by eduction on 1/30/25, 2:18 PM
“In disruptive tech, closed-source moats are fleeting. Even OpenAI’s closed-source model can’t prevent others from catching up.
“Therefore, our real moat lies in our team’s growth—accumulating know-how, fostering an innovative culture. Open-sourcing and publishing papers don’t result in significant losses. For technologists, being followed is rewarding. Open-source is cultural, not just commercial. Giving back is an honor, and it attracts talent.”
by falcor84 on 1/30/25, 2:40 PM
> An Yong: What do you envision as the endgame for large AI models?
I don't know if it has a different meaning or connotation in Chinese, but reading this metaphor with a chess connotation scared me. If there is a game, who are the players? What is the victory condition? Will there be a static stalemate or a definitive win? And, most importantly, will there be an opportunity for future games after it, or is this the final game we get to play?
by oli5679 on 1/30/25, 11:54 AM
Researchers at Meta or OpenAI spend hundreds of millions on compute, and are themselves paid millions, whilst not publishing any of their learnings openly. Here, a bunch of very smart, young Chinese researchers have had some great ideas, proved they work, and published details that allow everyone else to replicate them.
"No “inscrutable wizards” here—just fresh graduates from top universities, PhD candidates (even fourth- or fifth-year interns), and young talents with a few years of experience."
"If someone has an idea, they can tap into our training clusters anytime without approval. Additionally, since we don’t have rigid hierarchical structures or departmental barriers, people can collaborate freely as long as there’s mutual interest."
by cchance on 1/30/25, 2:24 PM
by jgord on 1/30/25, 2:39 PM
Maybe DeepSeek's creative use of RL within LLMs will open up founder and VC interest in using RL to solve real problems. I expect to see a Cambrian explosion of high-growth applied RL startups in engineering, logistics, finance, and medicine.
by newbie578 on 1/30/25, 2:14 PM
by wouldbecouldbe on 1/30/25, 2:35 PM
by mythz on 1/30/25, 2:42 PM
We're fortunate that not all SOTA AI models are controlled by US tech corps. Right now they're in the "maximum market share at all costs" stage, but they'll be looking for their ROI after achieving a dominant share. I trust OpenAI the least; it's still early in the AI age, and they already look like the company they were formed to prevent.
Can only hope that DeepSeek, Facebook, Qwen and Mistral continue to release open models. Unfortunately, if a company's motivation is ROI from cloud hosting, they're incentivised to stop releasing their models as OSS to prevent competition, which we've seen with Mistral's best models. Although with their latest model, released today under Apache 2.0, the CEO is saying they're renewing their commitment to open source [1], so we'll have to see how long that holds. We're also starting to see this from Alibaba, whose latest Qwen2.5-Max model is only available through their Alibaba Cloud. Luckily, Facebook's business model isn't reliant on cloud hosting, so we should continue to expect open models from them. So far, efficiency seems to be DeepSeek's competitive advantage: despite being OSS, they're still the cheapest hosting provider [2], even though other hosting providers don't have to recoup any R&D or training costs.