by LukeEF on 5/5/23, 7:10 PM with 3 comments
by LukeEF on 5/5/23, 7:12 PM
'LoRA is an incredibly powerful technique we should probably be paying more attention to
LoRA works by representing model updates as low-rank factorizations, which reduces the size of the update matrices by a factor of up to several thousand. This allows model fine-tuning at a fraction of the cost and time. Being able to personalize a language model in a few hours on consumer hardware is a big deal, particularly for aspirations that involve incorporating new and diverse knowledge in near real-time. The fact that this technology exists is underexploited inside Google, even though it directly impacts some of our most ambitious projects.' [1]
[1] https://www.semianalysis.com/p/google-we-have-no-moat-and-ne...
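To make the "low-rank factorization" part of the quote concrete: instead of learning a full d x k update matrix, LoRA freezes the pretrained weight W and learns two small factors B (d x r) and A (r x k) with r much smaller than d and k, so the forward pass computes W x + (alpha/r) * B A x. A minimal PyTorch sketch; the class name, initialization scale, and hyperparameters here are illustrative, not taken from the memo:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen linear layer plus a trainable low-rank update: W + (alpha/r) * B @ A."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # pretrained weights stay frozen
            d, k = base.out_features, base.in_features
            # A starts small-random, B starts at zero, so training begins
            # exactly at the base model's behavior.
            self.A = nn.Parameter(torch.randn(r, k) * 0.01)
            self.B = nn.Parameter(torch.zeros(d, r))
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    # A 4096x4096 layer has ~16.8M weights; with r=8 the update needs only
    # 2 * 8 * 4096 = 65,536 trainable parameters, a ~256x reduction.
    layer = LoRALinear(nn.Linear(4096, 4096), r=8)

The reduction factor grows with layer size and shrinks with rank, which is where the "up to several thousand" figure comes from: at r=1 on large layers it reaches into the thousands.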
by gdiamos on 5/6/23, 10:56 AM
More recently, some good libraries have emerged that make these techniques easier to use. For example PEFT, which implements LoRA and several other related methods.
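A sketch of what that workflow looks like with PEFT; the base model name and LoRA hyperparameters below are placeholders, not recommendations:

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    # Model and hyperparameters are illustrative only.
    model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
    config = LoraConfig(
        r=8,                                   # rank of the update matrices
        lora_alpha=16,                         # scaling applied to the update
        target_modules=["q_proj", "v_proj"],   # attention projections to adapt
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # reports the (typically sub-1%) trainable fraction

After wrapping, only the injected low-rank matrices train; the resulting adapter can be saved on its own as a small file or merged back into the base weights for inference.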