by Freddie111 on 3/20/23, 8:38 PM with 166 comments
by superkuh on 3/20/23, 9:05 PM
For now my IRC bots run Alpaca 7B in 4-bit; 13B was not a significant improvement for twice the computational time. But it's best to learn these models now, because as soon as OpenAI gets sued for the first time, all the Turing-test-passing older models without the legal butt-covering bolted on will be removed.
by freediver on 3/20/23, 9:17 PM
It does not matter what the current capabilities of open-source models are, because this opens the door to tremendous democratization of the ability to train and self-deploy these models.
In less than six months we will have open-source models with GPT-3-like capabilities running locally on laptops, and potentially on phones and in web browsers.
by doctoboggan on 3/20/23, 9:41 PM
Reading through the README and issues on the llama.cpp project, there is some speculation that there is a bug in the quantization, or possibly a bug in the inference (less likely, I think).
I hope this is true, and that once it's fixed the models can perform at or past the ChatGPT level. If it's not true and these models are performing correctly, then either the metrics used to compare them to GPT are garbage and don't capture real-world use, or the instruction tuning done by the Stanford team is not up to par.
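The quantization-bug speculation is easier to reason about with a toy model of what blockwise 4-bit quantization does. This is an illustrative sketch, not llama.cpp's actual Q4 format (which packs nibbles and stores per-block scales differently); it just shows the round-trip error the scheme inherently introduces:

```python
import numpy as np

# Symmetric 4-bit quantization over a small block of weights.
# Signed 4-bit range is -8..7; we scale so the largest magnitude maps to +/-7.
def quantize_4bit(block):
    scale = np.max(np.abs(block)) / 7.0
    if scale == 0:
        return np.zeros_like(block, dtype=np.int8), 0.0
    q = np.clip(np.round(block / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(32).astype(np.float32)  # one 32-weight block
q, s = quantize_4bit(w)
err = np.abs(w - dequantize(q, s)).max()
# Rounding bounds the per-weight error by scale/2; a buggy scale or
# off-by-one in the packed range would blow past that bound.
print(f"scale={s:.4f}, max round-trip error={err:.4f}")
```

A quantization bug would show up as errors far larger than scale/2 on such a round trip, which is the kind of check the llama.cpp issue threads were discussing.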
by Waterluvian on 3/20/23, 10:15 PM
This is what I think comparing these bots is like. You can argue that they’re very close. But the delta makes a very big difference for any practical purposes because we’re looking for nuanced capability.
by braingenious on 3/20/23, 9:19 PM
I gave it the prompt "cats aren't always fuzzy" and it wrote a lengthy, LiveJournal-esque rambling journal entry about a woman and her husband having money issues. It was funny, but light-years away from ChatGPT.
It does sometimes produce some really funny hallucinations, though, like inventing prefectures in Japan that don't exist.
by awinter-py on 3/20/23, 9:39 PM
hmm, I wonder if this is essentially a probing[1] technique that relies on ChatGPT already having been extensively trained
like did they basically exfiltrate the weights
1. probing per https://arxiv.org/abs/2102.12452
by gaogao on 3/20/23, 8:54 PM
by dang on 3/20/23, 10:52 PM
Stanford Alpaca web demo suspended “until further notice” - https://news.ycombinator.com/item?id=35200557 - March 2023 (77 comments)
Stanford Alpaca, and the acceleration of on-device LLM development - https://news.ycombinator.com/item?id=35141531 - March 2023 (66 comments)
Alpaca: An Instruct Tuned LLaMA 7B – Responses on par with txt-DaVinci-3 - https://news.ycombinator.com/item?id=35139450 - March 2023 (11 comments)
Alpaca: A strong open-source instruction-following model - https://news.ycombinator.com/item?id=35136624 - March 2023 (296 comments)
by simonw on 3/20/23, 10:29 PM
I think you can train LLaMA 7B (the model underlying Alpaca) for around $82,000, based on the Meta Research paper about it. Then you can fine-tune it à la Alpaca for a few hundred dollars more.
My wilder speculation is that, if you can shrink the model down to 4GB with llama.cpp's 4-bit quantization, it may be possible to run it entirely in the browser (à la Stable Diffusion from the other day).
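The ~4GB figure is consistent with back-of-the-envelope arithmetic, assuming 7 billion weights at 4 bits each (real quantized files also carry per-block scale metadata, so they come out somewhat larger than the raw payload):

```python
# Raw 4-bit payload for a 7B-parameter model, ignoring scale metadata.
params = 7_000_000_000
raw_bytes = params * 4 / 8          # 4 bits per weight, 8 bits per byte
gib = raw_bytes / 2**30
print(f"raw 4-bit payload: {gib:.2f} GiB")
```

That lands around 3.3 GiB, comfortably under the ~4GB figure once metadata overhead is added back in.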
by satvikpendem on 3/20/23, 11:01 PM
I've found OA to be better than Alpaca, but I'll wait until the 65B 3-bit quantization efforts for Alpaca have landed before comparing them.
by neilellis on 3/20/23, 11:42 PM
We've got big names like OpenAI, Google, Apple, Meta, Baidu, and Amazon putting in serious time and money to ensure their language models are safe and ethical. However, now that we know it's possible to build powerful AI models on a budget, it's crucial to think about what this means for the future of AI regulation and safety.
This Alpaca AI project is a stark reminder that we need to have a serious conversation about the possible repercussions of AI proliferation. We can't just sit back and assume the big companies will take care of everything. The genie is out of the bottle, and it's time for everyone in the tech community to face the music and take responsibility for the AI revolution.
by raydiatian on 3/20/23, 9:10 PM
Who writes this shit?
by cjohnson318 on 3/20/23, 10:19 PM
"godlike"? Really? I'm not religious, but this seems like an overreaction for something that has no agency.
by welly34h on 3/20/23, 9:13 PM
OpenAI's approach is an initial brute-force one that will eventually be obsoleted, abstracted over and over into a simpler code implementation with rules to recreate the old state.
kkrieger is a simple example of a tiny data model that can be deterministically rehydrated. It's not unrealistic for AI models to become a seed value for a normalized code base to deterministically unpack into the necessary electron state.
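The seed-value idea is essentially procedural generation: a tiny seed plus deterministic generator code reproduces the same large state on every run. A minimal sketch (illustrative only, nothing to do with kkrieger's actual tooling):

```python
import random

# "Rehydrate" a large byte sequence from a small seed: the seed plus this
# deterministic generator stands in for storing the bytes themselves.
def rehydrate(seed: int, n_bytes: int) -> bytes:
    rng = random.Random(seed)  # seeded PRNG -> fully deterministic stream
    return bytes(rng.randrange(256) for _ in range(n_bytes))

a = rehydrate(42, 1024)
b = rehydrate(42, 1024)
assert a == b  # same seed, same generator -> identical "unpacked" state
```

The catch for AI models is that the "generator" here is trivial; compressing trained weights down to a seed would require the unpacking code to embody everything the training run learned.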
by starik36 on 3/20/23, 9:07 PM
Doesn't 7B indicate that it was trained on 7 billion tokens? Or am I misunderstanding the nomenclature?
by jakedata on 3/20/23, 9:55 PM
I don't remember many books where this ends particularly well. Perhaps the Culture universe could be a survivable outcome. Hopefully we don't get Berserkers first.
by xwdv on 3/20/23, 8:50 PM
by amrb on 3/21/23, 12:14 AM
by alecco on 3/20/23, 9:02 PM
by earthboundkid on 3/20/23, 9:41 PM
by UncleOxidant on 3/20/23, 10:24 PM
by twblalock on 3/20/23, 9:04 PM
by EGreg on 3/20/23, 10:06 PM
Everyone will soon have the equivalent of online nuclear weapons: bot swarms that infiltrate every forum, including this one.