by smokel on 3/19/25, 8:51 PM
I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.
RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.
How much effort is required to turn code into something one can use for fine-tuning?
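The data-prep side can be fairly mechanical. A minimal sketch of turning a codebase into JSONL records for supervised fine-tuning, assuming a hypothetical repo directory and a made-up prompt template (real pipelines often derive prompts from docstrings, commit messages, or issue/PR text instead):

```python
import json
from pathlib import Path

def code_to_sft_records(repo_dir, exts=(".py",), max_chars=4000):
    """Walk a repo and emit prompt/completion pairs for supervised
    fine-tuning. The prompt string here is a placeholder, not a
    recommended template."""
    records = []
    for path in Path(repo_dir).rglob("*"):
        if not path.is_file() or path.suffix not in exts:
            continue
        source = path.read_text(encoding="utf-8", errors="ignore")
        # Naive fixed-size chunking; function- or AST-level splitting
        # usually gives cleaner training examples.
        for i in range(0, len(source), max_chars):
            records.append({
                "prompt": f"Show the implementation in {path.name}:",
                "completion": source[i:i + max_chars],
            })
    return records

def write_jsonl(records, out_path):
    """Write records in the one-JSON-object-per-line format most
    fine-tuning tooling accepts."""
    with open(out_path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
```

The expensive part is usually not this conversion but curating the pairs so the completions are actually what you want the model to imitate.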
by zk on 3/20/25, 5:24 AM
Is there a version of Gemma 3 that has tool calling? Google's blog claimed it supports tools but it doesn't seem like it actually does.
by bryan0 on 3/19/25, 7:39 PM
Are people fine-tuning LLMs on their local machines with a single GPU? What are people using to scale their training to multiple nodes / GPUs? I've been playing around with Hugging Face Estimators in sagemaker.huggingface, but I'm not sure whether there are better options for this.
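For scaling without SageMaker, plain torchrun (bundled with PyTorch) covers the single-node and multi-node cases; a minimal launch sketch, assuming a hypothetical train.py that reads the standard torch.distributed environment variables (RANK, WORLD_SIZE, LOCAL_RANK):

```shell
# Single node, 4 GPUs: torchrun spawns one process per GPU
# and sets the distributed env vars for you.
torchrun --nproc_per_node=4 train.py

# Two nodes, 4 GPUs each: run once per node, changing --node_rank
# to 0 on the first node and 1 on the second.
torchrun --nnodes=2 --node_rank=0 \
         --master_addr=10.0.0.1 --master_port=29500 \
         --nproc_per_node=4 train.py
```

Hugging Face accelerate wraps the same machinery with a config file if you'd rather not manage the flags by hand.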
by rockwotj on 3/19/25, 6:44 PM
Is anyone outside of the research labs fine-tuning models for production use cases? I have been seeing more people just using foundation models off the shelf, especially given that a new advance seems to arrive every few months.
by yieldcrv on 3/19/25, 7:07 PM
Instead of version numbers, these things should be labeled by their release date, since this kind of training starts from a dataset snapshot in time, colloquially called the knowledge-cutoff date, which isn't really accurate.
We are optimizing these along several dimensions at once, with multiple branches of evolution from each model,
so a successor version name doesn't really convey that.
by huqedato on 3/19/25, 11:14 PM
Great article, but I didn't see anything about the costs.
I'm particularly interested in this aspect because we're considering fine-tuning Gemma 3, but our budget is tight. We're looking into (real-world) cost estimates for this approach.
by siliconc0w on 3/19/25, 6:54 PM
It likely makes sense to use more expensive frontier models as teachers or architects for smaller fine-tuned ones that generate the majority of tokens (though possibly against the ToS).
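A sketch of that teacher/student split at the data level, with a stubbed-out ask_teacher standing in for a real frontier-model API call (the function name and JSONL format here are illustrative, not any particular library's API):

```python
import json

def ask_teacher(prompt):
    """Placeholder for a frontier-model API call; swap in a real
    client here. Note that some providers' terms of service restrict
    using outputs to train competing models."""
    return f"[teacher answer to: {prompt}]"

def build_distillation_set(prompts, out_path):
    """Have the expensive teacher answer a batch of prompts, then save
    prompt/completion pairs as JSONL for fine-tuning a smaller,
    cheaper student model that will generate most production tokens."""
    records = [{"prompt": p, "completion": ask_teacher(p)} for p in prompts]
    with open(out_path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return records
```

The same shape works for an architect/worker split at inference time: the frontier model writes the plan, the fine-tuned model fills in the bulk of the output.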
by admiralrohan on 3/19/25, 9:44 PM
Has anyone used these small models in a production environment?
If so, what are they good and bad at?
by dhooper on 3/20/25, 11:03 AM
Please try to enjoy each Gemma tuning equally, and not show preference for any over the others