by lewq on 2/6/24, 7:12 AM with 47 comments
by isaacfrond on 2/8/24, 8:42 AM
Users tend to ask broad, vague questions of the document in order to test that the system is working. We want those queries to work well. For example, a user would ask "what are the doctors going to do?" of a document that is about a junior doctors' strike. Take this into account when generating the questions - in particular, refer to noun phrases by less specific descriptions, so for example instead of "junior doctors", say "doctors" in your questions.
[1]: https://github.com/helixml/helix/blob/main/api/pkg/dataprep/...
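For anyone curious what that looks like in practice, here's a rough sketch of such a question-generation prompt (the wording is mine, not Helix's actual dataprep prompt [1]):

    # Sketch of a Q/A-generation prompt that nudges the model toward the broad,
    # less specific questions users actually ask. Wording is illustrative only.
    QA_GEN_PROMPT = """You are generating question/answer pairs for fine-tuning.
    Given the document below, write {n} questions a user might plausibly ask,
    each with an answer grounded in the document. Users tend to ask broad, vague
    questions, so prefer general noun phrases: say "doctors" rather than
    "junior doctors".

    Document:
    {document}
    """

    def build_qa_prompt(document: str, n: int = 5) -> str:
        return QA_GEN_PROMPT.format(document=document, n=n)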
by bugglebeetle on 2/8/24, 1:40 AM
https://github.com/unslothai/unsloth
It’s my default now for experimenting and basic training. If I want to get into the weeds, I use axolotl, but 9/10, it’s not really necessary.
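For anyone who hasn't tried it, a minimal unsloth LoRA run looks roughly like this (the model name, hyperparameters, and toy dataset are placeholders, and the trl API shown is the early-2024 one):

    from unsloth import FastLanguageModel
    from datasets import Dataset
    from trl import SFTTrainer
    from transformers import TrainingArguments

    # Load a 4-bit quantized base model; the model name is a placeholder.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/mistral-7b-bnb-4bit",
        max_seq_length=2048,
        load_in_4bit=True,
    )

    # Attach LoRA adapters. r/alpha are common starting points, not tuned values.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    # Toy dataset; in practice this is your generated Q/A pairs.
    dataset = Dataset.from_list([
        {"text": "### Question: What are the doctors going to do?\n### Answer: Strike."},
    ])

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=2048,
        args=TrainingArguments(output_dir="outputs", max_steps=60,
                               per_device_train_batch_size=2, learning_rate=2e-4),
    )
    trainer.train()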
by nl on 2/8/24, 12:36 PM
People way underestimate what RAG can do, even if in general people don't talk about the right things. For example, LlamaIndex spends a lot of time talking about various extractors, which is the easy part. The hard part is deciding what you are actually searching for given a chat context.
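The usual answer to that hard part is a question-condensing step: have the LLM rewrite the chat history plus the follow-up into a standalone retrieval query before it ever touches the vector store. A sketch (prompt wording is mine):

    # Sketch: condense chat history + a follow-up into a standalone search query,
    # resolving pronouns and references, before retrieval. Wording is mine.
    CONDENSE_PROMPT = """Given the conversation and a follow-up message, rewrite
    the follow-up as a standalone search query. Resolve pronouns and references.

    Conversation:
    {history}

    Follow-up: {question}
    Standalone query:"""

    def make_retrieval_query(llm, history: list[str], question: str) -> str:
        prompt = CONDENSE_PROMPT.format(history="\n".join(history), question=question)
        return llm(prompt).strip()  # llm: any text-completion callable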
RAG is a horrible hack (and the more you understand about it, the more it seems so!), but it does work.
I am (and I'm sure everyone else is) experimenting with surgery on an LLM so it takes a vector representation of the docs directly, alongside a text input, so you don't have to do the lossy doc vector -> text -> LLM context -> vector round trip. Not sure why no one has shipped this yet, though!
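The plumbing for that kind of surgery does exist in HuggingFace via inputs_embeds, even if nobody has trained and shipped it. A toy sketch (the projector here is random and would need to be learned, a la prefix tuning or LLaVA-style projectors; gpt2 is just a stand-in):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Toy sketch: feed a document vector to the LLM as a "soft token" by
    # projecting it into token-embedding space and prepending it to the text
    # embeddings. The projector is untrained here, purely for illustration.
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    doc_vec = torch.randn(1, 768)                 # stand-in retriever doc embedding
    hidden = model.get_input_embeddings().embedding_dim
    projector = torch.nn.Linear(doc_vec.shape[-1], hidden)  # would be learned

    soft_token = projector(doc_vec).unsqueeze(1)  # (batch, 1, hidden)
    ids = tokenizer("What are the doctors going to do?", return_tensors="pt").input_ids
    text_embeds = model.get_input_embeddings()(ids)

    inputs_embeds = torch.cat([soft_token, text_embeds], dim=1)
    out = model(inputs_embeds=inputs_embeds)      # logits conditioned on doc vector + text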
by gdiamos on 2/8/24, 11:01 AM
I think many users get put off finetuning because just pushing a button doesn't work, and the whole thing seems like a black box that you don't know how to fix when it breaks.
It turns out that finetuning can be debugged, but the methods aren't well documented (yet), e.g. by generating Q/A pairs, oversampling them, etc.
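For example, the oversampling trick is just repeating the generated pairs so the model sees each fact many times per epoch (the factor is illustrative, not a tuned value):

    import random

    # Sketch: oversample generated Q/A pairs so each fact is seen repeatedly
    # during finetuning. The factor of 10 is illustrative only.
    def oversample(qa_pairs: list[dict], factor: int = 10) -> list[dict]:
        upsampled = qa_pairs * factor
        random.shuffle(upsampled)
        return upsampled

    pairs = [{"question": "What are the doctors going to do?", "answer": "Strike."}]
    train_set = oversample(pairs)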
When you get it to work it’s powerful - new abilities emerge beyond memorization.
Just like how llama2/claude2/gpt4 learned reasoning by memorizing sentences from Reddit posts :P
Also, I don’t get the comparison of RAG vs. finetuning in articles like this - why not do both? RAG is easy to set up - it’s push-button. Just do it on all models (including finetuned models).
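Concretely, "both" just means pointing the usual retrieval step at the finetuned model instead of the base one. A sketch (the retriever and model here are stand-ins, not any particular library's API):

    # Sketch: RAG on top of a finetuned model. `retriever` and `finetuned_llm`
    # are stand-ins for whatever vector store and model callable you already use.
    def answer(question: str, retriever, finetuned_llm, k: int = 4) -> str:
        docs = retriever.search(question, k=k)       # hypothetical retriever API
        context = "\n\n".join(d.text for d in docs)  # assumes docs expose .text
        prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        return finetuned_llm(prompt)                 # same call as for the base model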
by joshka on 2/8/24, 7:43 AM
I often wonder how you'd go about organizing training data for a full historic GitHub repo in a way that makes sense for training (or RAG). The vast majority of the data is previous changes to the repo, and I think that would generally outweigh the current information and cause problems (i.e. old method names from before refactoring, etc.).
Also, perhaps being able to expand that out to doing the same thing for a bunch of consumers of the library that I'm maintaining would be neat.
Sprinkle in the PR and Issue history, docs website, API docs, and discord history and I think you'd have a helluva model.
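One way to keep the history from drowning out the current code is to sample full files only from HEAD and let history contribute just recent commit messages or diffs. A sketch using plain git (the cutoff is arbitrary):

    import subprocess

    # Sketch: bias repo training data toward the current state of the code.
    # Full file contents come from HEAD only; history contributes recent commit
    # messages, so pre-refactor method names are underrepresented.
    def head_files(repo: str) -> list[str]:
        out = subprocess.run(["git", "-C", repo, "ls-files"],
                             capture_output=True, text=True, check=True)
        return out.stdout.splitlines()

    def recent_commit_messages(repo: str, since: str = "6 months ago") -> list[str]:
        out = subprocess.run(["git", "-C", repo, "log", f"--since={since}", "--pretty=%s"],
                             capture_output=True, text=True, check=True)
        return out.stdout.splitlines()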
by cuuupid on 2/8/24, 4:27 AM
My only gripe with Helix would be that it's smaller than the above and my org would be peeved about data security. The ability to self host is cool, but too much can go wrong too quickly with plain Docker ML. Would love to see, for example, a `cog` version of the images that we can deploy distributed with more confidence/bravado.
[1] https://replicate.com/mistralai/mistral-7b-instruct-v0.2 [2] https://modal.com [3] https://llm-engine.scale.com/
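For what it's worth, wrapping a model for cog is mostly a predict.py like the sketch below, plus a cog.yaml declaring GPU and Python deps (the gpt2 model here is a placeholder, not Helix's actual image):

    from cog import BasePredictor, Input
    from transformers import pipeline

    # Sketch of a cog predictor wrapping a model; the model name is a
    # placeholder, not what Helix actually ships.
    class Predictor(BasePredictor):
        def setup(self):
            # Runs once per container start; load weights here.
            self.pipe = pipeline("text-generation", model="gpt2")

        def predict(self, prompt: str = Input(description="User prompt")) -> str:
            return self.pipe(prompt, max_new_tokens=128)[0]["generated_text"]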