by ganisgan on 5/24/24, 2:36 PM with 10 comments
by sebg on 5/24/24, 2:55 PM
This question was recently asked in the /r/machinelearning subreddit.
You can find the link here: https://www.reddit.com/r/learnmachinelearning/comments/18yzv...
Generally the first question I ask when someone asks me that question is:
What are you trying to do?
If you are learning machine learning to do research, your answer will be very different from the one you'd give if you are learning it to start a business or to build something with the ML tools/applications available today.
General knowledge ML book:
* Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems 3rd Edition by Aurélien Géron
General knowledge data applications book:
* Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems 1st Edition by Martin Kleppmann
General knowledge statistical machine learning book:
* The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition (Springer Series in Statistics) by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Tooling-wise, the rabbit hole goes very deep depending on which technologies and areas you are exploring: anything from PyTorch all the way down to programming GPUs, how to store and work with massive data, and the general algorithms for working with all of it.
by dongobread on 5/24/24, 3:17 PM
Stats & ML - https://www.statlearning.com/
Deep Learning - https://udlbook.github.io/udlbook/
Reinforcement Learning - https://web.stanford.edu/class/psych209/Readings/SuttonBarto...
As with anything else, people usually fail to learn ML not because of content quality but because of lack of effort/time/consistency. Take handwritten notes, solve exercises, etc., and expect to spend at least a hundred hours on each book.
by cellis on 5/24/24, 3:32 PM
ChatGPT for clarification, especially for intricate APIs, e.g. how automatic differentiation works.
I found the following YouTube channels very useful, in this order:
* FastAI
* Serrano Academy
* Karpathy
by sujayk_33 on 5/25/24, 10:27 PM
And don't compare yourself with what others are building.
by nextos on 5/24/24, 6:33 PM
Excellent background knowledge in the appendix. Easy to get bootstrapped.
Then move to Murphy's two tomes.
by megapoliss on 5/24/24, 2:56 PM
by ActorNightly on 5/24/24, 9:00 PM
That isn't to say that you can't watch courses to supplement your knowledge. But generally, this should be done after you have familiarized yourself with the material.
Start with Karpathy's micrograd project. It basically shows you how stuff works under the hood. Get the repo, run the examples, play around with it. Try to figure out how to approximate a mathematical function, i.e. you give it one input and the output is a single value.
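For a first taste, here is a minimal sketch of what micrograd gives you (assuming the repo is on your path or you pip install micrograd; the expression itself is just an illustration):

    # Build a tiny expression with micrograd's Value and backprop through it.
    from micrograd.engine import Value

    x = Value(3.0)
    y = x * x + 2 * x + 1    # y = x^2 + 2x + 1
    y.backward()             # reverse-mode autodiff fills in gradients

    print(y.data)            # 16.0
    print(x.grad)            # dy/dx = 2x + 2 = 8.0

The entire engine behind this is about a hundred lines of Python, which is why it is such a good place to see how autograd actually works.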
Then familiarize yourself with Pytorch. You want to start here: https://pytorch.org/tutorials/beginner/introyt/tensors_deepe...
and read through the tutorials and try some stuff.
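As a quick warm-up along the lines of that tutorial (a sketch; assumes PyTorch is installed):

    import torch

    a = torch.randn(3, 4)    # random 3x4 tensor
    b = torch.ones(4, 2)
    c = a @ b                # matrix multiply -> shape (3, 2)

    x = torch.tensor(2.0, requires_grad=True)
    y = x ** 3
    y.backward()             # autograd: dy/dx = 3x^2 = 12
    print(c.shape, x.grad)   # torch.Size([3, 2]) tensor(12.)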
This is also a good read.
https://pytorch.org/blog/inside-the-matrix/
Then you basically have to grind out the following steps:
1. Read a paper about some model. Start with small papers like digit recognition on MNIST. Then move on to things like image recognition, etc. Generally you want to cover basic classification models, image detection models that draw bounding boxes around things, and things like autoencoders.
2. Implement the model in PyTorch (or a model application, for example recreating the deepfake face swap with autoencoders); see the sketch after this list. This is going to be the hardest part to grind out, but it gets much easier once you start, because you will realize that a good portion of the stuff is boilerplate code.
3. Load the weights downloaded from the internet
4. Run the model and verify it works.
5. Retrain the model. If you don't want to generate a dataset, you can basically just shuffle up the labels of an existing one. Otherwise, different datasets are available for different models.
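To make steps 2-5 concrete, here is a rough PyTorch skeleton for an MNIST classifier (a sketch, not a reference implementation; the layer sizes and the weights filename are made up):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # Step 2: implement the model (an arbitrary small architecture).
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128), nn.ReLU(),
        nn.Linear(128, 10),
    )

    # Step 3: load weights from disk (hypothetical filename).
    # model.load_state_dict(torch.load("mnist_weights.pt"))

    train = datasets.MNIST(".", train=True, download=True,
                           transform=transforms.ToTensor())
    loader = DataLoader(train, batch_size=64, shuffle=True)

    # Steps 4-5: run the model and (re)train it.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        break    # one batch is enough to verify the plumbing works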
As a cheat, the tinygrad repo has some of these implemented in tinygrad, so you can look at the code and reimplement it in PyTorch: https://github.com/tinygrad/tinygrad/tree/master/examples
Then familiarize yourself with LLMs. Karpathy's nanoGPT is a good start. Then it's the same idea, read -> recreate. For example, read about FlashAttention and then try to recreate it.
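The core piece you end up recreating there is scaled dot-product attention; a bare-bones version (shapes are illustrative) looks roughly like this:

    import math
    import torch

    def attention(q, k, v):
        # q, k, v: (batch, heads, seq_len, head_dim)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        return torch.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(1, 4, 16, 32)
    print(attention(q, k, v).shape)    # torch.Size([1, 4, 16, 32])

FlashAttention computes the same result but in tiles, so the full seq_len x seq_len scores matrix is never materialized in GPU memory; recreating that is a good exercise.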
Also, separately, you want to look at the Hugging Face Accelerate library and learn how to work with the available hardware for both inference and training, i.e. try to use all the compute resources (CPU/GPU/disk/RAM) on your box to run stuff.
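The basic Accelerate pattern is small; here is a sketch (the toy model and random data are stand-ins, not part of the library):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from accelerate import Accelerator

    model = nn.Linear(8, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loader = DataLoader(TensorDataset(torch.randn(64, 8),
                                      torch.randint(0, 2, (64,))),
                        batch_size=16)
    loss_fn = nn.CrossEntropyLoss()

    accelerator = Accelerator()    # detects the CPU/GPU/multi-GPU setup
    model, opt, loader = accelerator.prepare(model, opt, loader)

    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        accelerator.backward(loss)    # replaces loss.backward()
        opt.step()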
If you can do all of this, you will be significantly more skilled than a lot of ML people in Big Tech.
As a bonus, you can also explore tinygrad and see how things actually happen on the GPU when you run a model.
by fsndz on 5/24/24, 6:22 PM