from Hacker News

Ask HN: Have you fine-tuned LLMs to know the contents of a specific code base?

by andreyk on 4/25/23, 1:13 AM with 1 comment

I am interested in making an LLM aware of the contents of my project, so that it knows which classes, functions, and variables exist outside the current file or prompt. My first idea for "adding" knowledge of the code base (assuming it is too large to fit into the prompt) is to fine-tune the LLM on the code. Has anyone tried this, or does anyone know of any work on it?
  • by TroyZ on 4/25/23, 4:26 AM

    Fine-tuning is probably not the way to do it.

    Instead, try embeddings with semantic search: retrieve the relevant parts of the code base and plug them into the prompt.

    You may need:
    - a summarizer prompt to summarize your project structure, main functions, and methods
    - a vector store/database to store and retrieve the relevant code from your code base
    - a coder prompt to write code based on the retrieved parts

    Check out langchain: https://langchain.readthedocs.io/
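
The retrieval approach suggested in this reply can be sketched in a few lines of plain Python. This is only a rough illustration under loud assumptions: the `CODE_CHUNKS` entries are made-up stand-ins for a real code base, `embed` is a toy bag-of-words counter rather than a learned embedding model, and the in-memory list returned by `build_index` stands in for a real vector store. The summarizer step and the actual LLM call are left out; the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter

# Hypothetical code chunks standing in for a real code base (assumption).
CODE_CHUNKS = {
    "billing/invoice.py::Invoice.total": (
        "class Invoice:\n"
        "    def total(self):\n"
        "        return sum(item.price for item in self.items)"
    ),
    "auth/session.py::login_user": (
        "def login_user(username, password):\n"
        "    return check_password(username, password)"
    ),
    "utils/strings.py::slugify": (
        "def slugify(title):\n"
        "    return title.lower().replace(' ', '-')"
    ),
}


def embed(text):
    """Toy embedding: lowercase word counts. A real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """In-memory stand-in for a vector store: (name, source, vector) triples."""
    return [(name, src, embed(name + " " + src)) for name, src in chunks.items()]


def retrieve(index, query, k=2):
    """Return the k chunks whose vectors are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[2]), reverse=True)
    return [(name, src) for name, src, _ in ranked[:k]]


def build_prompt(index, task, k=2):
    """Plug the retrieved chunks into a prompt for the code-writing model."""
    context = "\n\n".join(f"# {name}\n{src}" for name, src in retrieve(index, task, k))
    return f"Relevant code from the project:\n\n{context}\n\nTask: {task}\n"


index = build_index(CODE_CHUNKS)
prompt = build_prompt(index, "compute the invoice total price", k=1)
print(prompt)
```

The point of the design is that the model never has to memorize the code base: only the chunks that score highest against the current task are placed in the prompt, so the index can grow well past the context window.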