from Hacker News

DeepSeek Coder: Let the Code Write Itself

by fintechie on 1/31/24, 9:43 PM with 61 comments

  • by rickstanley on 2/1/24, 1:42 AM

    Hello, I would like to take this opportunity to ask for help here about using A.I. with my own codebase.

    Context: I missed [almost] the entire A.I. wave, but I knew that one day I would have to learn something about it and/or use it. That day has come. I'm allocated to a team that is migrating to another engine, let's say "engine A → engine B". From A's perspective, we map the entries for B (inbound), and after the request to B returns, we map back to A's model (outbound). This is a chore, and much of the work is repetitive, but it comes with edge cases we need to look out for, and unfortunately there isn't a solid foundation of patterns apart from the Domain-driven design (DDD) thing. It seemed like a good use case for an A.I.

    Attempts: I began by asking ChatGPT and Bard questions similar to: "how to train LLM on own codebase" and "how to get started with prompt engineering using own codebase".

    I concluded that fine-tuning large models is expensive and unrealistic for my RTX 3060 with 6 GB of VRAM, no surprise there; so I searched here on Hacker News for keywords like "llama", "fine-tuning", "local machine", etc., and found out about ollama and DeepSeek.

    I tried both ollama and DeepSeek; the former was slow, but not as slow as the latter, which was dead slow using a 13B model. I tried a 6/7B model (I think it was codellama) and got reasonable results and speed. After feeding it some data, I was about to try training on the codebase when a friend of mine suggested I use Retrieval-Augmented Generation (RAG) instead. I have yet to try it, with a LangChain + Ollama setup.

    Any thoughts, suggestions or experiences to share?

    I'd appreciate it.
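    To make the RAG suggestion above concrete, here is a minimal sketch of the retrieval half: split the codebase into chunks, score them against a query, and prepend the best matches to the prompt. The bag-of-words scorer is a deliberately toy stand-in for embedding search (which a real LangChain + Ollama setup would provide), and the `codebase` snippets are made-up examples, not anyone's real code.

    ```python
    import re
    from collections import Counter

    def tokens(text):
        """Toy tokenizer: lowercase alphanumeric runs (splits map_inbound into map, inbound)."""
        return Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def score(query, doc):
        """Toy relevance score: number of word tokens shared between query and doc."""
        return sum((tokens(query) & tokens(doc)).values())

    def retrieve(query, chunks, k=2):
        """Return the top-k chunks most relevant to the query."""
        return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

    def build_prompt(query, chunks):
        """Stuff the retrieved chunks into the prompt as context for the LLM."""
        context = "\n---\n".join(retrieve(query, chunks))
        return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

    # Hypothetical code chunks; in practice you would chunk real source files.
    codebase = [
        "def map_inbound(a): return {'engine_b_id': a.id, 'qty': a.quantity}",
        "def map_outbound(b): return ModelA(id=b['engine_b_id'])",
        "def unrelated_logging_helper(msg): print(msg)",
    ]
    prompt = build_prompt("how do we map inbound entries to engine B?", codebase)
    ```

    The payoff is that nothing is trained: the model stays frozen, and relevance comes entirely from what you retrieve at query time, which is why RAG is usually the first thing to try before fine-tuning.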

  • by _boffin_ on 1/31/24, 11:37 PM

    Been using DeepSeek Coder 33B Q8 on my work laptop for a bit now. I like it, but am still finding myself going to GPT-4's API for the more nuanced things.

    They just released a v1.5 (https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruc...), but for some reason, they reduced the context length from ~16k to ~4k.

  • by sestinj on 2/1/24, 1:18 AM

    We've been playing with the 1.3b model for continue.dev's autocomplete, and it's quite impressive. One unclear part is whether the license really permits commercial usage, but regardless, it's exciting to see the construction of more complex datasets. They mention that training on multiple tasks (FIM + normal completion) improves performance... I wonder whether training to output diffs would be equally helpful (this is the holy grail needed to generate changes in O(diff length) time).
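    The fill-in-the-middle (FIM) task mentioned above is just prompt formatting: the model sees the code before and after a hole and is trained to emit what goes in between. A minimal sketch, with plain-ASCII sentinel strings as placeholders -- the real DeepSeek Coder sentinels differ, so check the model card rather than copying these:

    ```python
    # Hypothetical sentinel tokens; the actual model uses its own special
    # tokens, so these stand-ins are for illustration only.
    FIM_BEGIN = "<|fim_begin|>"
    FIM_HOLE = "<|fim_hole|>"
    FIM_END = "<|fim_end|>"

    def fim_prompt(prefix, suffix):
        """Build a FIM prompt asking the model to fill the gap between prefix and suffix."""
        return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

    # The model is expected to complete the body of add() given both sides.
    prompt = fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
    ```

    This is what makes editor autocomplete work mid-file: the suffix gives the model the code below the cursor, not just above it.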
  • by elwebmaster on 2/1/24, 12:16 AM

    Mixtral > Codellama > DeepSeek Coder. Very weird model, writes super long comments on one line, definitely not at the level of Codellama, benchmarks be damned.
  • by Havoc on 2/1/24, 1:26 AM

    I’ve been using their 7B with tabbyML.

    Works well, but it's closer to a very smart code complete than something that generates novel blocks of code.

  • by chii on 2/1/24, 5:57 AM

    Just tried it by asking how to create a game that is turn based, using an ECS system, and how to add a decision tree, and a save/load system, in the language Haxe.

    It outputs relatively correct Haxe code, but it did hallucinate that there is a library called 'haxe-tiled' to read TMX map files...

  • by hackerlight on 1/31/24, 10:04 PM

    In the benchmarks, are they using base GPT-4, or are they using a GPT like Grimoire, which will be better at coding? If they aren't using Grimoire, isn't it unfair to compare their fine-tuned model to base GPT-4?
  • by byyoung3 on 1/31/24, 11:23 PM

    looks like Code Llama 70B outperforms it on HumanEval, I believe