from Hacker News

The genie escapes: Stanford copies the ChatGPT AI for less than $600

by Freddie111 on 3/20/23, 8:38 PM with 166 comments

  • by superkuh on 3/20/23, 9:05 PM

    Hardly. I've played a lot with the 7, 13, and 30B llamas, as well as the 7 and 13B alpacas fine-tuned by Stanford. They do not have emergent abilities like being able to generate rhymes or, say, represent a movie plot as emoji. Even OpenAI's old text-davinci-003 (GPT-3.5, but text completion, not the chat models) far outperforms them. That said, I have hopes for a 3-bit-quantized, alpaca-fine-tuned 65B. We'll see when someone spends the money to do the (more costly) 65B training. The alpacas are also much more likely to go off the rails and start regurgitating their fine-tuning inputs. Either that, or OpenAI is doing a lot of post-processing on their end to hide the same problems in their LLM.

    For now my IRC bots run the alpaca 7B 4-bit. 13B was not a significant improvement for twice the computational time. But it's best to learn them now, because as soon as OpenAI gets sued for the first time, all the Turing-test-passing older models without the legal butt-covering bolted on will be removed.

  • by freediver on 3/20/23, 9:17 PM

    The incredible contribution of Alpaca is showing the world how to efficiently train an LLM on instructions. The fact that it did so on 52k instructions generated by GPT is poetic.

    It does not matter what current capabilities of open source models are, because this opens the door to tremendous democratization of the ability to train and self-deploy these models.

    In less than 6 months we will have open source models with gpt3-like capabilities, running locally on laptops, and potentially in phones and web browsers.

  • by doctoboggan on 3/20/23, 9:41 PM

    I've used both the 7B and 13B instruction-tuned llama weights (quantized using the llama.cpp scripts). Either I am doing something wrong, or these two models are nowhere near the level of ChatGPT. Many times they return something totally irrelevant to my question, stop responding, use a different language, or otherwise return the wrong answer. ChatGPT does none of this (other than the occasional wrong answer due to hallucination).

    Reading through the README and issues on the llama.cpp project, there is some speculation that there is a bug in the quantization, or possibly a bug in the inference (less likely I think).

    I hope this is true, and that once fixed the models can perform at or past the ChatGPT level. If it's not true and these models are performing correctly, then either the metrics used to compare them to GPT are garbage and don't capture real-world uses, or the instruction tuning done by the Stanford team is not up to par.

  • by Waterluvian on 3/20/23, 10:15 PM

    If you use consciousness as a baseline, the intellectual difference between a grade schooler and a PhD is tiny.

    This is what I think comparing these bots is like. You can argue that they’re very close. But the delta makes a very big difference for any practical purposes because we’re looking for nuanced capability.

  • by braingenious on 3/20/23, 9:19 PM

    I have not found alpaca to be comparable to ChatGPT, but it could be because of bugs in the version I installed through dalai. I might try reinstalling it, because I suspect there might be some sort of file corruption issue or whatever.

    I gave it the prompt “cats aren’t always fuzzy” and it wrote a lengthy livejournal-esque rambling journal entry about a woman and her husband having money issues. It was funny, but light-years away from ChatGPT.

    It does sometimes create some really funny hallucinations though, like inventing prefectures in Japan that don’t exist etc.

  • by awinter-py on 3/20/23, 9:39 PM

    > asked GPT to take 175 human-written instruction/output pairs, and start generating more in the same style and format ... through one of OpenAI's helpfully provided APIs, and ... the team had some 52,000 sample conversations to use in post-training the LLaMA model

    hmm I wonder if this is essentially a probe[1] technique + relies on chatgpt already having been extensively trained

    like did they basically exfiltrate the weights

    1. probing per https://arxiv.org/abs/2102.12452
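
    The pipeline the quote describes can be sketched roughly like this — a few-shot prompt seeded with human-written pairs, with the model's completions parsed and fed back into the pool. The template and parsing below are illustrative, not the Stanford team's actual code:

```python
import random

# In the real pipeline these would be the 175 human-written seed pairs.
SEED_PAIRS = [
    ("Give three tips for staying healthy.", "1. Eat a balanced diet. 2. Exercise. 3. Sleep well."),
    ("What is the capital of France?", "The capital of France is Paris."),
]

def build_prompt(seed_pairs, n_examples=2):
    """Build a few-shot prompt asking the model to continue the format."""
    shots = random.sample(seed_pairs, min(n_examples, len(seed_pairs)))
    lines = ["Below are instruction/output pairs. Continue with new, diverse pairs."]
    for instruction, output in shots:
        lines.append(f"Instruction: {instruction}\nOutput: {output}")
    lines.append("Instruction:")  # the model completes from here
    return "\n\n".join(lines)

def parse_completion(text):
    """Split a raw completion back into (instruction, output) pairs."""
    pairs = []
    for chunk in text.split("Instruction:"):
        if "Output:" in chunk:
            instruction, output = chunk.split("Output:", 1)
            pairs.append((instruction.strip(), output.strip()))
    return pairs
```

    Each round of completions gets parsed and appended to the pool, and the loop repeats (with a completion-API call in the middle) until ~52,000 pairs accumulate — no access to the teacher's weights required.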

  • by gaogao on 3/20/23, 8:54 PM

    Has anyone tried this yet on the 65B version? I'm curious whether it can rhyme and shows other emergent behavior, as alpaca-7B does not.

  • by dang on 3/20/23, 10:52 PM

    Recent and related:

    Stanford Alpaca web demo suspended “until further notice” - https://news.ycombinator.com/item?id=35200557 - March 2023 (77 comments)

    Stanford Alpaca, and the acceleration of on-device LLM development - https://news.ycombinator.com/item?id=35141531 - March 2023 (66 comments)

    Alpaca: An Instruct Tuned LLaMA 7B – Responses on par with txt-DaVinci-3 - https://news.ycombinator.com/item?id=35139450 - March 2023 (11 comments)

    Alpaca: A strong open-source instruction-following model - https://news.ycombinator.com/item?id=35136624 - March 2023 (296 comments)

  • by simonw on 3/20/23, 10:29 PM

    Related, my post "Could you train a ChatGPT-beating model for $85,000 and run it in a browser?" https://simonwillison.net/2023/Mar/17/beat-chatgpt-in-a-brow...

    I think you can train LLaMA 7B (the model underlying Alpaca) for around $82,000, based on the Meta Research paper about it. Then you can fine-tune it à la Alpaca for a few hundred dollars more.

    My wilder speculation is that, if you can shrink the model down to 4GB with llama.cpp 4-bit quantization, it may be possible to run it entirely in the browser (à la Stable Diffusion from the other day).
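
    The 4GB figure is consistent with back-of-envelope arithmetic — 7B parameters at 4 bits each, ignoring the per-block scale factors that real quantization formats add on top (my rough numbers, not simonw's):

```python
params = 7_000_000_000       # LLaMA 7B weight count
bits_per_weight = 4          # llama.cpp-style 4-bit quantization
size_bytes = params * bits_per_weight / 8
size_gib = size_bytes / 2**30
print(f"{size_gib:.2f} GiB")  # about 3.26 GiB before quantization overhead
```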

  • by satvikpendem on 3/20/23, 11:01 PM

    Alpaca is cool, but it's also not technically allowed by OpenAI's TOS, and LLaMA is certainly not allowed to be used for commercial purposes. With that in mind, OpenAssistant is an Apache 2.0-licensed, fully open source alternative that's pretty good (the model is OpenAssistant/oasst-sft-1-pythia-12b): https://huggingface.co/spaces/olivierdehaene/chat-llm-stream....

    I've found OA to be better than Alpaca but I'll wait until the 65B 3-bit quantization efforts for Alpaca are underway to compare them.

  • by neilellis on 3/20/23, 11:42 PM

    Wow, Stanford's Alpaca AI project is a real game-changer. The fact that it performs on par with ChatGPT but costs less than $600 to build is both exciting and terrifying. Sure, it's great to see AI becoming more accessible, but it's also a massive wake-up call about the potential misuse of these technologies.

    We've got big names like OpenAI, Google, Apple, Meta, Baidu, and Amazon putting in serious time and money to ensure their language models are safe and ethical. However, now that we know it's possible to build powerful AI models on a budget, it's crucial to think about what this means for the future of AI regulation and safety.

    This Alpaca AI project is a stark reminder that we need to have a serious conversation about the possible repercussions of AI proliferation. We can't just sit back and assume the big companies will take care of everything. The genie is out of the bottle, and it's time for everyone in the tech community to face the music and take responsibility for the AI revolution.

  • by raydiatian on 3/20/23, 9:10 PM

    > It seems these godlike AIs are already frighteningly cheap and easy to replicate.

    Who writes this shit?

  • by cjohnson318 on 3/20/23, 10:19 PM

    > It seems these godlike AIs are already frighteningly cheap and easy to replicate.

    "godlike"? Really? I'm not religious, but this seems like an overreaction for something that has no agency.

  • by welly34h on 3/20/23, 9:13 PM

    Code can be abstracted into a simpler code model that deterministically recreates the old code model.

    OpenAI's initial brute-force approach will eventually be obsoleted, abstracted over and over into simpler implementations with rules to recreate the old state.

    kkrieger is a simple example of a tiny data model that can be deterministically rehydrated. It's not unrealistic for AI models to become a seed value that a normalized code base deterministically unpacks into the necessary electron state.

  • by starik36 on 3/20/23, 9:07 PM

    From the article: Pre-trained on a trillion "tokens"...

    Doesn't 7B indicate that it was trained on 7 billion tokens? Or am I misunderstanding the nomenclature?

  • by jakedata on 3/20/23, 9:55 PM

    AI bootstrapping AI is a sci-fi trope that goes back decades. I first encountered it in The Cybernetic Samurai while in high school. While the details differ, the reality is that AI is a catalyst for more of itself.

    I don't remember many books where this ends particularly well. Perhaps the Culture universe could be a survivable outcome. Hopefully we don't get Berserkers first.

  • by xwdv on 3/20/23, 8:50 PM

    Given the high prices of OpenAI offerings it seems it’s better to pirate an AI model before resorting to paying for anything.

  • by amrb on 3/21/23, 12:14 AM

    Anything open will need training and attention to be an OpenAI competitor, though I'm happy to see the function of this one: https://huggingface.co/spaces/togethercomputer/OpenChatKit

  • by alecco on 3/20/23, 9:02 PM

  • by earthboundkid on 3/20/23, 9:41 PM

    I bet you could “exfiltrate” an LLM relatively cheaply by using LLM A to generate training data for LLM B.
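
    The mechanism is plain distillation: query model A on lots of inputs, then fit model B to its outputs. A deliberately tiny sketch of the idea, with a linear function standing in for the teacher (my toy example, not anything from the article):

```python
import random

random.seed(0)

# "LLM A": a black-box teacher we can query but not inspect.
def teacher(x):
    return 3.0 * x - 1.0

# Step 1: build a training set purely by querying the teacher.
xs = [random.uniform(-5, 5) for _ in range(1000)]
ys = [teacher(x) for x in xs]

# Step 2: fit "LLM B" to the teacher's outputs (closed-form least squares).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x
# slope comes out ~3.0 and intercept ~-1.0: the student recovers the
# teacher without ever seeing its parameters.
```

    A black-box query interface is all the student needs — which is why an API-only teacher is enough for the Alpaca-style recipe.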

  • by UncleOxidant on 3/20/23, 10:24 PM

    Is it accurate to say they were trained for less than $600? Wouldn't that just be the fine-tuning of the already existing LLaMA parameters, which likely cost far more than $600 to train?

  • by twblalock on 3/20/23, 9:04 PM

    This is why it's not possible to slow down or "stop" AI: once the problems are solved the solutions turn out to be trivial to replicate. All it takes is compute.

  • by EGreg on 3/20/23, 10:06 PM

    I've warned about this for years. Finally an article gets it right.

    Everyone will soon have the equivalent of online nuclear weapons: bot swarms that infiltrate every forum, including this one.