by prvnsmpth on 3/17/25, 5:40 PM with 0 comments
Just wanted to share an interesting experiment I ran to see what kind of performance gains can be achieved by fine-tuning a model on code from a single repo.
Results: the fine-tuned model achieves a 47% relative improvement on the code completion task (tab autocomplete). Accuracy, measured as exact match against the ground-truth completion, goes from 25% to 36% after a short training run of only 500 iterations on a single RTX 4090 GPU.
This is an interesting result because it shows that there are significant gains to be had by fine-tuning to your own code.
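To make the metric concrete, here is a minimal sketch of how an exact-match evaluation like this can be scored. It assumes held-out examples are stored as (prefix, suffix, expected completion) triples; the file name, field names, and the generate_completion wrapper are illustrative assumptions, not details taken from the write-up:

    import json

    def exact_match_accuracy(examples, generate_completion):
        """Score tab-autocomplete predictions by exact match against ground truth."""
        correct = 0
        for ex in examples:
            # generate_completion wraps the (fine-tuned) model, e.g. building a
            # fill-in-the-middle prompt from the code before and after the cursor.
            prediction = generate_completion(ex["prefix"], ex["suffix"])
            if prediction.strip() == ex["expected"].strip():
                correct += 1
        return correct / len(examples)

    # Hypothetical usage: eval.jsonl holds one {"prefix", "suffix", "expected"} object per line.
    # examples = [json.loads(line) for line in open("eval.jsonl")]
    # print(exact_match_accuracy(examples, my_completion_fn))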
Full details of the experiment: https://prvn.sh/build-your-own-github-copilot/
Highlights of the experiment:

- Model: qwen2.5-coder 14b, 4-bit quantized
- Training data: Svelte source files from this repo: https://github.com/hcengineering/platform
- Fine-tuning: LoRA via Unsloth, rank 16, 4096 sequence length
- GPU: single RTX 4090
- Training run: 500 iterations with effective batch size 8
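For anyone curious what a setup like this looks like in code, below is a minimal Unsloth LoRA sketch under the stated settings (rank 16, 4096 sequence length, 500 steps, effective batch size 8 via gradient accumulation). The checkpoint name, target modules, learning rate, and dataset formatting are my assumptions, not copied from the write-up:

    from unsloth import FastLanguageModel
    from trl import SFTTrainer
    from transformers import TrainingArguments
    from datasets import load_dataset

    # 4-bit quantized Qwen2.5-Coder 14B (checkpoint name is an assumption)
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Qwen2.5-Coder-14B-bnb-4bit",
        max_seq_length=4096,
        load_in_4bit=True,
    )

    # Attach LoRA adapters with rank 16
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
        use_gradient_checkpointing="unsloth",
    )

    # Assumed: Svelte files from the repo already rendered into a "text" column
    # (e.g. fill-in-the-middle style prompts built from each source file).
    dataset = load_dataset("json", data_files="svelte_train.jsonl", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=4096,
        args=TrainingArguments(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,   # 2 x 4 = effective batch size 8
            max_steps=500,
            learning_rate=2e-4,
            bf16=True,
            output_dir="qwen2.5-coder-svelte-lora",
        ),
    )
    trainer.train()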
A fine-tuned open-source model could be a real alternative to commercial coding assistants like GitHub Copilot, Cursor, etc., especially for organizations with a ton of legacy code that these LLMs have never seen, and for those wary of exposing their code to external systems.