by NewUser76312 on 2/14/25, 4:38 PM with 1 comment
Bit of background - I have a CS undergrad, have caught up on the academics of attention, transformers, etc., have worked in the software industry for ~10 years, and have even built some small deep learning networks in PyTorch for various POCs (including a small transformer). I really want to get into researching, fine-tuning, and testing new architectures for LLMs, because I find that more interesting than building a wrapper SaaS product (my interest is more research than product).
But I'm not sure how to get into foundation model LLMs in a meaningful way as an individual. I'm not part of a university research group, haven't gotten anywhere applying to the big AI companies (I'm just some dev dude without a PhD or a name-brand school), and I don't have the scale of compute/GPUs to do my own experimentation and research. I have a single 12 GB VRAM GPU, but I doubt that gets me anywhere interesting.
So what exactly could I do? Open to any creative and practical ideas.
by jonahbenton on 2/23/25, 10:49 PM
You need to do work that you can do. Deep and low-level. A single GPU is fine. There are lots of projects and repos out there, especially for inference.
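For concreteness, a minimal sketch of the kind of single-GPU experiment that fits comfortably in 12 GB of VRAM: loading a small open model with 4-bit quantized weights for inference using transformers and bitsandbytes. The model name and memory figures here are illustrative assumptions; swap in whatever ~7B checkpoint you want to poke at.

    # Sketch: run a ~7B open model in 4-bit on a single 12 GB GPU.
    # Model choice and memory numbers are assumptions, not recommendations.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mistral-7B-v0.1"  # placeholder; any ~7B model works

    # Quantize weights to 4-bit at load time; a 7B model then takes roughly
    # 4-5 GB, leaving headroom on a 12 GB card for KV cache and activations.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # place layers on the GPU automatically
    )

    inputs = tokenizer("Attention is", return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

Once something like this runs, you have a base to dig into: swap attention implementations, profile the KV cache, try LoRA fine-tunes, quantize further. That is real, low-level work on one card.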
It is difficult, in terms of motivation, to apply low-level focus to problems your social brain thinks are irrelevant. You have to learn to ignore the social brain and listen only to the technical-curiosity brain. Keep a journal. Document your findings. Find the motivation train every researcher has inside.
And look, if you wind up not having the talent or the time, then this is just a hobby. And that is ok.