from Hacker News

Open source LLM with 32k Context Length

by shubham_saboo on 8/24/23, 5:53 AM with 28 comments

  • by alsodumb on 8/24/23, 7:00 AM

    Abacus always seemed to me like a 'we got a lot of VC money with inflated claims, now we gotta show we do everything' company. I don't really understand what they do; they seem to offer everything, but I don't see anyone talking about using their offerings in the real world. Ever. The only time I see mentions of the company is when I am targeted with ads or promoted posts from the founder.
  • by weinzierl on 8/24/23, 12:52 PM

    This is just another fine-tuned LLaMA / Llama 2, of which there are already several. I doubt it will give genuinely meaningful results for long-context inference.

    32k context length sounds nice of course, but it seems to be common to advertise merely fine-tuned models that way. I think it is more of a marketing thing; we really should distinguish between the context length of the pre-trained model and that of the fine-tuned model, with the latter currently being the default meaning of "context length".
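
    For context, a common way such fine-tunes stretch a LLaMA-style model's window (not necessarily what this particular model does; the article doesn't spell out its method) is RoPE position interpolation: squeeze the longer range of positions back into the range the base model was pre-trained on, then fine-tune on long text. A minimal illustrative sketch, where the 4096/32768 lengths and function names are assumptions:

      # Sketch of RoPE position interpolation, one common way a LLaMA-style
      # model pre-trained at 4k tokens gets fine-tuned out to 32k.
      # Constants and names here are illustrative, not from the article.
      import torch

      def rope_angles(seq_len, head_dim, base=10000.0,
                      pretrained_len=4096, extended_len=32768):
          # Standard RoPE inverse frequencies.
          inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
          # Position interpolation: rescale positions so the extended range
          # maps back onto the range seen during pre-training.
          scale = pretrained_len / extended_len      # e.g. 4096 / 32768 = 0.125
          positions = torch.arange(seq_len).float() * scale
          return torch.outer(positions, inv_freq)    # (seq_len, head_dim // 2)

    The point is that the advertised 32k figure comes from this fine-tuning step, not from the context length the base model was actually pre-trained with.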

  • by supermatt on 8/24/23, 7:03 AM

    It seems this is built on LLaMA. Did Meta change the license to make it open source now? The repo still seems to show otherwise.

    Edit: No mention of it being open source in the linked article. Maybe the title here is just wrong? @dang

  • by vekker on 8/24/23, 9:13 AM

    It's probably too new for anyone to have integrated this into text-generation-webui / Gradio? I've been looking for a large-context LLM (self-hosted or not) for a project, and as a European I unfortunately don't have access to Anthropic's Claude API yet.
  • by Havoc on 8/24/23, 9:47 AM

    Does anyone know if larger context lengths are inherently worse at other tasks?

    i.e. all other things being equal, is an 8k model better at math than a 32k model?