by lucaspauker on 7/27/23, 5:08 PM with 90 comments
by jph00 on 7/27/23, 10:27 PM
The article refers to the BERT and GPT papers as the source of the fine-tuning idea. However, we actually first demonstrated it for universal models in 2017 and published the ULMFiT (Howard and Ruder) paper in early 2018. Prior to that, Dai and Le demonstrated the technique for in-corpus datasets. So it would be more accurate to say the approach can be traced back to those two papers, rather than to BERT and GPT.
BERT and GPT showed the effectiveness of scaling up the amount of data and compute, and switching the model architecture to Transformers (amongst other things).
by LASR on 7/27/23, 11:28 PM
We have some 100k context models too that can ingest entire documents.
So right now, I would say fine-tuning is probably only useful for a very narrow set of use cases.
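For concreteness, the "skip fine-tuning, just put the whole document in the context window" pattern could look roughly like the minimal sketch below. It assumes the anthropic Python client and a 100k-context model; the model name, file name, and question are placeholders, not anything from the thread.

    # Minimal sketch: in-context document Q&A with a long-context model.
    # Assumes ANTHROPIC_API_KEY is set; model/file names are placeholders.
    import anthropic

    client = anthropic.Anthropic()

    with open("contract.txt") as f:   # hypothetical long document
        document = f.read()

    response = client.messages.create(
        model="claude-2",             # placeholder long-context model
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": document + "\n\nSummarize the termination clauses in the document above.",
        }],
    )
    print(response.content[0].text)

No gradient updates happen here; the document is simply supplied at inference time, which is why long contexts reduce the need for fine-tuning in retrieval-style use cases.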
by Animats on 7/27/23, 11:58 PM
Can anyone offer an example of a free public-facing LLM which has been fine-tuned by adding much specific info about some narrow area? Say, one that knows all the PR about some car brand or fandom? Somebody must have tried that by now.
by nullc on 7/27/23, 10:31 PM
Uhhh. I understand what was intended there, but while fine-tuning may reduce the rate of hallucinations (and make the remaining ones sound more plausible), it's not magic dust that makes a model accurate and trustworthy.
Unfortunately many people think this stuff is magic and care should be taken to not encourage people to confuse improvements with resolving the issue.
One way of characterizing the LLM accuracy problem is that it often looks very accurate and convincing even when it is emitting nonsense. If you cast the problem in those terms-- as a problem of looking more trustworthy than it actually is-- fine tuning actually exacerbates the problem.
by treprinum on 7/27/23, 10:24 PM
by mickeyfrac on 7/27/23, 8:50 PM
by SpaceManNabs on 7/27/23, 10:02 PM
You should try a post on parameter efficient tuning next!
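(For anyone unfamiliar, parameter-efficient tuning usually means something like LoRA: freeze the base model and train only small adapter matrices. A minimal sketch with the Hugging Face peft library follows; the checkpoint and hyperparameters are illustrative, not from the article.)

    # Minimal LoRA sketch with Hugging Face peft; checkpoint and
    # hyperparameters are illustrative only.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")   # placeholder model

    config = LoraConfig(
        r=8,                        # rank of the low-rank update
        lora_alpha=16,              # scaling applied to the update
        lora_dropout=0.05,
        target_modules=["c_attn"],  # GPT-2's fused attention projection
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # only the small adapters are trainable
    # `model` can then be trained with a normal Trainer/optimizer loop while
    # the original weights stay frozen.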
by bugglebeetle on 7/27/23, 10:34 PM
by coffee_am on 7/28/23, 6:43 AM
by zmmmmm on 7/28/23, 3:42 AM
The narrative goes, "look how awesome ChatGPT is, imagine how good it would be trained on just your company's documents".
Which 1000% misses the point. ChatGPT is what it is because (a) it is trained on almost nothing short of the entire corpus of human language ever created. At > 1 trillion parameters, it has well over 100 parameters for every human on the planet. Let that sink in. And then (b) because it has been subjected to an unknown but likely massive amount of human reinforcement feedback.
The idea that you can meaningfully impact the output of the model towards factual accuracy or logical correctness just by doing a small amount of fully automated training using a tiny corpus of company documents is seductive, but super far from robustly demonstrated as far as I'm aware. Yet this is the pitch being sold very often.
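To make the pitch concrete, "a small amount of fully automated training using a tiny corpus of company documents" usually amounts to a short continued-pretraining run, roughly like the sketch below (Hugging Face transformers/datasets; the checkpoint, file name, and hyperparameters are placeholders). Note that nothing in this loop optimizes for factual accuracy or logical correctness, which is the comment's point.

    # Minimal sketch of continued causal-LM training on a small text corpus.
    # Checkpoint, file name, and hyperparameters are placeholders.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    model_name = "gpt2"                                   # placeholder model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # "company_docs.txt": one document per line (hypothetical file)
    dataset = load_dataset("text", data_files={"train": "company_docs.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    train_set = dataset["train"].map(tokenize, batched=True,
                                     remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="ft-out",
                               num_train_epochs=3,
                               per_device_train_batch_size=4),
        train_dataset=train_set,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()   # the model now imitates the corpus; it is not "grounded"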
by phas0ruk on 7/27/23, 10:43 PM
by autokad on 7/27/23, 11:47 PM
by marcopicentini on 7/27/23, 11:19 PM
by ramesh31 on 7/27/23, 11:26 PM