from Hacker News

Intellect-2 Release: The First 32B Model Trained Through Globally Distributed RL

by Philpax on 5/12/25, 1:46 AM with 65 comments

  • by throwanem on 5/12/25, 3:04 AM

    There's a name and a logo. "Hubris" feels slightly beggared. https://en.m.wikipedia.org/wiki/The_Metamorphosis_of_Prime_I...
  • by refulgentis on 5/12/25, 3:02 AM

    I guess I'm bearish?

    It's not that they trained a new model; they took an existing model and RL'd it a bit, no?

    The scores are very close to QwQ-32B, and at the end:

    "Overall, as QwQ-32B was already extensively trained with RL, it was difficult to obtain huge amounts of generalized improvement on benchmarks beyond our improvements on the training dataset. To see stronger improvements, it is likely that better base models such as the now available Qwen3, or higher quality datasets and RL environments are needed."

  • by iTokio on 5/12/25, 5:00 AM

    It’s interesting that it does something useful (training an LLM) without trust and in a decentralized way.

    Maybe this could be used as proof of work? To stop wasting computing resources in crypto currencies and get something useful as a byproduct.

  • by 3abiton on 5/12/25, 4:55 AM

    This is rather exciting! I can see a future of co-op models, built by communities of experts in a specific field, that would still be competitive with "AI monopolies". Maybe not all hope is lost!
  • by Thomashuet on 5/12/25, 8:08 AM

    Summary: We've used the most complexest, buzzwordiest training infrastructure to increase the performance of our base model by a whopping 0.5% (±1%).
  • by danielhanchen on 5/12/25, 4:52 AM

    I made some GGUFs at https://huggingface.co/unsloth/INTELLECT-2-GGUF

    ./llama.cpp/llama-cli -hf unsloth/INTELLECT-2-GGUF:Q4_K_XL -ngl 99

    Also it's best to read https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-e... on sampling issues for QwQ based models.

    Or TLDR, use the below settings:

    ./llama.cpp/llama-cli -hf unsloth/INTELLECT-2-GGUF:Q4_K_XL -ngl 99 --temp 0.6 --repeat-penalty 1.1 --dry-multiplier 0.5 --min-p 0.00 --top-k 40 --top-p 0.95 --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc"

  • by abtinf on 5/12/25, 3:46 AM

    Does this have anything to do with The Metamorphosis Of Prime Intellect, or did they just abuse the name and the cover art?
  • by esafak on 5/12/25, 2:41 AM

    How are they ensuring robustness against adversarial responses?
  • by schneehertz on 5/12/25, 3:40 AM

    I used to have a science-fiction-style idea that artificial intelligence could aggregate computing power over the network to perform ultra-large-scale calculations, thereby achieving strong AI. It's interesting that reality is developing in that direction too.
  • by mountainriver on 5/12/25, 3:01 AM

    Awesome work this team is doing. Globally distributed MoE could have real legs
  • by quantumwoke on 5/12/25, 2:47 AM

    Wonder what the privacy story is like. Enterprises don't usually like broadcasting their private data across a freely accessible network.
  • by bwfan123 on 5/12/25, 3:32 PM

    The most interesting thing I see is the productization of the DiLoCo work done here [1]. If someone can make this scale, then we can say goodbye to expensive backend networking and mainframe-like AI training machinery.

    [1] https://arxiv.org/abs/2311.08105
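
    For readers unfamiliar with [1]: below is a minimal single-process sketch of the DiLoCo inner/outer loop. The linear model, the simulated in-process "workers", and the hyperparameters are all illustrative stand-ins, not the actual INTELLECT-2 setup; the real system runs each replica on separate hardware with its own data shard. The point is the communication pattern: each replica takes H local AdamW steps, then only the averaged parameter deltas ("pseudo-gradients") cross the network and get applied by an outer Nesterov-SGD step.

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    NUM_WORKERS = 4    # replicas that would normally live in different datacenters
    INNER_STEPS = 50   # H: local steps between syncs (illustrative; the paper uses larger values)
    OUTER_STEPS = 10   # number of synchronization rounds
    OUTER_LR = 0.7     # outer learning rate (illustrative)

    global_model = nn.Linear(32, 1)  # toy stand-in for the LLM
    outer_opt = torch.optim.SGD(global_model.parameters(), lr=OUTER_LR,
                                momentum=0.9, nesterov=True)

    def local_train(worker_model: nn.Module) -> None:
        """H inner AdamW steps on this worker's (toy, random) data shard."""
        inner_opt = torch.optim.AdamW(worker_model.parameters(), lr=1e-3)
        for _ in range(INNER_STEPS):
            x, y = torch.randn(16, 32), torch.randn(16, 1)
            loss = nn.functional.mse_loss(worker_model(x), y)
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()

    for round_idx in range(OUTER_STEPS):
        # Every worker starts the round from the current global weights.
        workers = [copy.deepcopy(global_model) for _ in range(NUM_WORKERS)]
        for w in workers:
            local_train(w)

        # The only cross-site traffic: average the per-worker parameter deltas
        # and treat the result as the gradient for the outer optimizer.
        for p_global, *p_locals in zip(global_model.parameters(),
                                       *(w.parameters() for w in workers)):
            p_global.grad = torch.stack(
                [p_global.data - p_l.data for p_l in p_locals]).mean(dim=0)

        outer_opt.step()   # Nesterov-SGD update of the global weights
        outer_opt.zero_grad()
        print(f"round {round_idx}: synced {NUM_WORKERS} workers")

    Communication cost drops by roughly a factor of H compared to per-step all-reduce, which is what makes training over commodity links plausible.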

  • by ikeashark on 5/12/25, 3:53 PM

    I wonder why they noted, seemingly in passing, a torch.compile vs. non-torch.compile figure where torch.compile degraded model performance. What made it degrade? It appears in only one figure and nowhere else.
  • by ndgold on 5/12/25, 2:46 AM

    Pretty badass
  • by Mougatine on 5/12/25, 11:14 AM

    very cool work!
  • by jumploops on 5/12/25, 3:04 AM

    Congrats to the team on the launch!

    Personal story time: I met a couple of their engineers at an event a few months back. They mentioned they were building a distributed training system for LLMs.

    I asked them how they were building it and they mentioned Python. I said something along the lines of “not to be the typical internet commenter guy, but why aren’t you using something like Rust for the distributed system parts?”

    They mumbled something about Python as the base for all current LLMs, and then kinda just walked away…

    From their article: > “Rust-based orchestrator and discovery service coordinate permissionless workers”

    Glad to see that I wasn’t entirely off-base :)