from Hacker News

GLM-4-9B: open-source model with superior performance to Llama-3-8B

by marcelsalathe on 6/5/24, 6:26 PM with 17 comments

  • by ilaksh on 6/6/24, 1:14 AM

    Looks like terrific technology. However, the translation says that it's an "irrevocable revocable" non-commercial license with a form to apply for commercial use.
  • by great_psy on 6/5/24, 11:55 PM

    I’m excited to hear work is being done on models that support function calling natively (a sketch of what that looks like follows this comment).

    Does anybody know if performance could be greatly increased if only a single language were supported?

    I suspect there’s a high demand for models that are maybe smaller and can run faster if the tradeoff is support for only English.

    Is this available in Ollama?
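
    A minimal sketch of what native function calling with GLM-4-9B might look like, assuming the Hugging Face `apply_chat_template(tools=...)` convention (transformers >= 4.42); the tool schema and prompt here are illustrative, and whether the model's own chat template honors the `tools` argument depends on the repo version:

    ```python
    # Sketch: OpenAI-style function calling with GLM-4-9B via transformers.
    # Model ID is the published THUDM repo; the tool definition is hypothetical.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        "THUDM/glm-4-9b-chat",
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,
    )

    # Hypothetical tool; a model with native function calling should emit a
    # structured call to it rather than free-form prose.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    messages = [{"role": "user", "content": "What's the weather in Zurich?"}]
    inputs = tokenizer.apply_chat_template(
        messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
    ```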

  • by abrichr on 6/6/24, 4:38 PM

    > GLM-4V-9B possesses dialogue capabilities in both Chinese and English at a high resolution of 1120*1120. In various multimodal evaluations, including comprehensive abilities in Chinese and English, perception & reasoning, text recognition, and chart understanding, GLM-4V-9B demonstrates superior performance compared to GPT-4-turbo-2024-04-09, Gemini 1.0 Pro, Qwen-VL-Max, and Claude 3 Opus.

    But according to their own evaluation further down, gpt-4o-2024-05-13 outperforms GLM-4V-9B on every task except OCRBench.

  • by norwalkbear on 6/6/24, 12:37 PM

    Isn't Llama-3-70B so good that Reddit llamaers are saying people should buy hardware to run it?

    Llama-3-8B was garbage for me, but damn, 70B is good enough.

  • by oarth on 6/6/24, 7:17 AM

    If those numbers are true, then it's very impressive. Hoping for llama.cpp support.
  • by nubinetwork on 6/6/24, 2:47 PM

    1M context, but does it really deliver? I've been hit with 32K models that crap out after 10K before...
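
    One way to check a context-length claim like this is a needle-in-a-haystack test: bury a unique fact at several depths in filler text and ask the model to retrieve it. A minimal sketch, assuming an OpenAI-compatible local endpoint; the base URL, model name, and passphrase are illustrative:

    ```python
    # Needle-in-a-haystack probe of effective context length.
    # Assumes an OpenAI-compatible server is running locally (illustrative URL).
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    NEEDLE = "The secret passphrase is 'teal-7-harbor'."
    FILLER = "The quick brown fox jumps over the lazy dog. " * 4000  # very roughly 40K tokens

    for depth in (0.1, 0.5, 0.9):  # where in the haystack the needle is buried
        cut = int(len(FILLER) * depth)
        haystack = FILLER[:cut] + NEEDLE + " " + FILLER[cut:]
        resp = client.chat.completions.create(
            model="glm-4-9b-chat",
            messages=[{"role": "user",
                       "content": haystack + "\n\nWhat is the secret passphrase?"}],
            max_tokens=32,
        )
        answer = resp.choices[0].message.content
        print(f"depth={depth:.0%}: {'PASS' if 'teal-7-harbor' in answer else 'FAIL'}")
    ```

    A model that "craps out after 10K" will typically pass at shallow depths and fail once the needle sits past its effective window.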
  • by fragmede on 6/6/24, 6:23 AM

    model available, not open source.
  • by refulgentis on 6/6/24, 10:31 PM

    Ehhh man, this is frustrating; 7B was a real sweet spot for hobbyists. 8B... doable. I've been joking to myself, and simultaneously worrying, that Llama 3 8B and Phi-3 "3B" (3.8B) would start an "ehhh, +1, might as well be a rounding error" thing. It's a big deal! I measure a 33% decrease in speed just going from 3B to 3.8B when inferencing on CPU.
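
    For context on that figure, a back-of-envelope check, under the assumption (not from the thread) that CPU decoding is memory-bandwidth bound, so tokens/s scales roughly with 1/params:

    ```python
    # If decoding speed ~ 1/parameter_count, the predicted throughput drop when
    # growing a model is 1 - (old_params / new_params). Assumption for
    # illustration only; the 33% above is the commenter's own measurement.
    def predicted_slowdown(params_from_b: float, params_to_b: float) -> float:
        """Fractional tokens/s drop when growing from params_from_b to params_to_b."""
        return 1.0 - params_from_b / params_to_b

    print(f"3B -> 3.8B: {predicted_slowdown(3.0, 3.8):.0%} predicted drop")  # ~21%
    print(f"7B -> 8B:   {predicted_slowdown(7.0, 8.0):.0%} predicted drop")  # ~12%
    ```

    A measured 33% against a naive ~21% estimate would suggest overheads beyond raw weight streaming, but again, this is only a sketch under that bandwidth-bound assumption.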