by kateklink on 9/4/23, 4:13 PM with 100 comments
by vikp on 9/4/23, 6:18 PM
- They compare the performance of this model to the weakest 7B Code Llama variant. The base Code Llama 7B Python model scores 38.4% on HumanEval, versus the non-Python model, which scores only 33%.
- They compare their instruct-tuned model to non-instruct-tuned models. Instruction tuning can add 20% or more to HumanEval performance. For example, WizardLM 7B scores 55% on HumanEval [1], and I've trained a 7B model that scores 62% [2].
- For another example of instruction tuning: the instruct-tuned StableCode benchmarks at 26%, not the 20% they cite for the base model [3].
- StarCoder, when prompted properly, scores 40% on HumanEval [4].
- They do not report their base model's performance (as far as I can tell).
This is interesting work, and a good contribution, but it's important to compare similar models.

[1] https://github.com/nlpxucan/WizardLM
[2] https://huggingface.co/vikp/llama_coder
[3] https://stability.ai/blog/stablecode-llm-generative-ai-codin...
[4] https://github.com/huggingface/blog/blob/main/starcoder.md
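For anyone who wants to reproduce numbers like these, here's a minimal sketch of scoring a model with OpenAI's human-eval harness (pip install human-eval), the standard scorer behind most of the figures above. generate_completion is a hypothetical stand-in for whichever model you're benchmarking:

    # Score a model on HumanEval with OpenAI's harness.
    from human_eval.data import read_problems, write_jsonl

    def generate_completion(prompt: str) -> str:
        # Call your model here; return only the code that continues the prompt.
        raise NotImplementedError

    problems = read_problems()
    samples = [
        {"task_id": tid, "completion": generate_completion(p["prompt"])}
        for tid, p in problems.items()
    ]
    write_jsonl("samples.jsonl", samples)
    # Then score with: evaluate_functional_correctness samples.jsonl

pass@1 is also sensitive to sampling temperature and prompt format, which is part of why reported numbers vary so much.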
by Havoc on 9/4/23, 4:51 PM
The OpenRAIL license seems to reference some limitations on safety and unethical use, but I can't find where in the repo it spells out precisely what the authors have in mind.
by brucethemoose2 on 9/4/23, 6:35 PM
This is not really true. Llama 7B runs with Vulkan/llama.cpp on ~8GB smartphones and ~12GB laptops, and that will only get easier over time as lower-RAM hardware drops out of the market and Vulkan implementations become more widespread.
For users trying to run LLMs on machines with 8GB or less, the AI Horde approach of distributed models seems much more practical anyway.
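To make that concrete, here's a rough sketch of running a 4-bit quantized 7B model locally through the llama-cpp-python bindings (pip install llama-cpp-python); the model path is hypothetical, and a q4_0 GGUF of a 7B model is roughly 4GB, which is what makes 8GB devices workable:

    from llama_cpp import Llama

    llm = Llama(
        model_path="./llama-2-7b.Q4_0.gguf",  # hypothetical local quantized file
        n_ctx=2048,  # context window; larger values need more RAM
    )
    out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
    print(out["choices"][0]["text"])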
by mholubowski on 9/4/23, 5:29 PM
What is the point of a new model that isn’t better than the best possible model (example: OpenAI GPT-4)?
What’s the point in having a smaller model? Who cares?
---
This is a real, genuine question that I don’t have a clear answer to. Excuse my ignorance, plz enlighten your boi.
by smcleod on 9/5/23, 9:29 AM
The web interface for the LLM server is especially nice and clean compared to many of the others I've tried - and it "just works". Very interested to see how this evolves.
by glutamate on 9/4/23, 5:10 PM
See the last page for restrictions.
by palmer_fox on 9/4/23, 7:20 PM
E.g. a model specializing in chemistry doesn't need to include data on world history or be able to write poetry.
by kateklink on 9/4/23, 4:13 PM
It has much better performance than all code models of similar size, and almost matches StarCoder on HumanEval while being 10x smaller.
Thanks to its small size, it works with most modern GPUs, requiring just 3GB of RAM.
You can try self-hosting it in Refact https://github.com/smallcloudai/refact/ to get a fast local Copilot alternative with decent suggestions.
Weights and model card: https://huggingface.co/smallcloudai/Refact-1_6B-fim
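If you just want to poke at the model directly, here's a sketch along the lines of the model card: loading it with transformers and doing a fill-in-the-middle completion. I'm assuming StarCoder-style FIM tokens (<fim_prefix>/<fim_suffix>/<fim_middle>) and that trust_remote_code is needed; check the model card for the exact format:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    ckpt = "smallcloudai/Refact-1_6B-fim"
    tok = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForCausalLM.from_pretrained(
        ckpt, trust_remote_code=True, torch_dtype=torch.float16
    ).to("cuda")  # ~3GB of VRAM at fp16 for 1.6B params

    # Fill-in-the-middle: the model completes the docstring between prefix and suffix.
    prompt = '<fim_prefix>def print_hello():\n    """<fim_suffix>\n    print("Hello")<fim_middle>'
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0][inputs["input_ids"].shape[1]:]))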
We would love to hear your feedback!
by zcesur on 9/4/23, 7:15 PM
https://algora.io/org/smallcloudai/bounties
Disclaimer: I'm a cofounder of Algora, the platform enabling these bounties.
by iFire on 9/4/23, 5:04 PM
bigscience-openrail-m
https://huggingface.co/smallcloudai/Refact-1_6B-fim/blob/mai...