by ganisgan on 5/24/24, 2:36 PM with 10 comments
by sebg on 5/24/24, 2:55 PM
This question was recently asked in the /r/machinelearning subreddit.
You can find the link here: https://www.reddit.com/r/learnmachinelearning/comments/18yzv...
Generally the first question I ask when someone asks me that question is:
What are you trying to do?
If you are learning machine learning to do research, your answer will be very different from the one you'd give if you are learning it to start a business or to build something with the ML tools/applications available today.
General knowledge ML book:
* Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems 3rd Edition by Aurélien Géron
General knowledge data applications book:
* Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems 1st Edition by Martin Kleppmann
General knowledge statistical machine learning book:
* The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition (Springer Series in Statistics) by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Tooling-wise, the rabbit hole goes very deep depending on which technologies and areas you are exploring: anything from PyTorch all the way down to programming GPUs, how to store and work with massive data, and the general algorithms for working with all of it.
by dongobread on 5/24/24, 3:17 PM
Stats & ML - https://www.statlearning.com/
Deep Learning - https://udlbook.github.io/udlbook/
Reinforcement Learning - https://web.stanford.edu/class/psych209/Readings/SuttonBarto...
As with anything else, people usually fail to learn ML not because of content quality but because of lack of effort/time/consistency. Take handwritten notes, solve exercises, etc., and expect to spend at least a hundred hours on each book.
by cellis on 5/24/24, 3:32 PM
ChatGPT for clarification, especially for intricate APIs, e.g. how automatic differentiation works.
I found the following YouTube channels very useful, in this order:
* FastAI
* Serrano Academy
* Karpathy
by sujayk_33 on 5/25/24, 10:27 PM
And don't compare yourself with what others are building.
by nextos on 5/24/24, 6:33 PM
Excellent background knowledge in the appendix. Easy to get bootstrapped.
Then move to Murphy's two tomes.
by megapoliss on 5/24/24, 2:56 PM
by ActorNightly on 5/24/24, 9:00 PM
That isn't to say that you can't watch courses to supplement your knowledge. But generally, this should be done after you have familiarized yourself with the material.
Start with Karpathy's micrograd project. It basically shows you how stuff works under the hood. Get the repo, run the examples, play around with it. Try to figure out how to approximate a mathematical function, i.e. you give it one input and the output is a single value.
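For a first taste, here is a minimal sketch of what micrograd gives you (assuming the repo is on your path or you pip install micrograd; the expression itself is just an illustration):

    # Build a tiny expression with micrograd's Value and backprop through it.
    from micrograd.engine import Value

    x = Value(3.0)
    y = x * x + 2 * x + 1    # y = x^2 + 2x + 1
    y.backward()             # reverse-mode autodiff fills in gradients

    print(y.data)            # 16.0
    print(x.grad)            # dy/dx = 2x + 2 = 8.0

The entire engine behind this is about a hundred lines of Python, which is why it is such a good place to see how autograd actually works.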
Then familiarize yourself with Pytorch. You want to start here: https://pytorch.org/tutorials/beginner/introyt/tensors_deepe...
and read through the tutorials and try some stuff.
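As a quick warm-up along the lines of that tutorial (a sketch; assumes PyTorch is installed):

    import torch

    a = torch.randn(3, 4)    # random 3x4 tensor
    b = torch.ones(4, 2)
    c = a @ b                # matrix multiply -> shape (3, 2)

    x = torch.tensor(2.0, requires_grad=True)
    y = x ** 3
    y.backward()             # autograd: dy/dx = 3x^2 = 12
    print(c.shape, x.grad)   # torch.Size([3, 2]) tensor(12.)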
This is also a good read.
https://pytorch.org/blog/inside-the-matrix/
Then you basically have to grind out the following steps:
1. Read a paper about some model. Start with small papers like digit recognition on MNIST. Then move on to things like image recognition, etc. Generally you want to cover basic classification models, image detection models that draw bounding boxes around things, and things like autoencoders.
2. Implement the model in PyTorch (or a model application, for example recreating the deepfake face swap with autoencoders); see the sketch after this list. This is going to be the hardest part to grind out, but it gets much easier once you start, because you will realize that a good portion of the stuff is boilerplate code.
3. Load the weights downloaded from the internet
4. Run the model and verify it works.
5. Retrain the model. If you don't want to generate a dataset, you can basically just shuffle up the labels of an existing one. Otherwise, different datasets are available for different models.
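To make steps 2-5 concrete, here is a rough PyTorch skeleton for an MNIST classifier (a sketch, not a reference implementation; the layer sizes and the weights filename are made up):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # Step 2: implement the model (an arbitrary small architecture).
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128), nn.ReLU(),
        nn.Linear(128, 10),
    )

    # Step 3: load weights from disk (hypothetical filename).
    # model.load_state_dict(torch.load("mnist_weights.pt"))

    train = datasets.MNIST(".", train=True, download=True,
                           transform=transforms.ToTensor())
    loader = DataLoader(train, batch_size=64, shuffle=True)

    # Steps 4-5: run the model and (re)train it.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        break    # one batch is enough to verify the plumbing works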
As a cheat, the tinygrad repo has some of these implemented in tinygrad, so you can look at the code and reimplement it in PyTorch: https://github.com/tinygrad/tinygrad/tree/master/examples
Then familiarize yourself with LLMs. Karpathy's nanoGPT is a good start. Then it's the same idea, read -> recreate. For example, read about FlashAttention and then try to recreate it.
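The core piece you end up recreating there is scaled dot-product attention; a bare-bones version (shapes are illustrative) looks roughly like this:

    import math
    import torch

    def attention(q, k, v):
        # q, k, v: (batch, heads, seq_len, head_dim)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        return torch.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(1, 4, 16, 32)
    print(attention(q, k, v).shape)    # torch.Size([1, 4, 16, 32])

FlashAttention computes the same result but in tiles, so the full seq_len x seq_len scores matrix is never materialized in GPU memory; recreating that is a good exercise.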
Also, separately, you want to look at the Hugging Face Accelerate library and learn how to work with the available hardware for both inference and training, i.e. try to use all the compute resources (CPU/GPU/disk/RAM) on your box to run stuff.
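The basic Accelerate pattern is small; here is a sketch (the toy model and random data are stand-ins, not part of the library):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from accelerate import Accelerator

    model = nn.Linear(8, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loader = DataLoader(TensorDataset(torch.randn(64, 8),
                                      torch.randint(0, 2, (64,))),
                        batch_size=16)
    loss_fn = nn.CrossEntropyLoss()

    accelerator = Accelerator()    # detects the CPU/GPU/multi-GPU setup
    model, opt, loader = accelerator.prepare(model, opt, loader)

    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        accelerator.backward(loss)    # replaces loss.backward()
        opt.step()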
If you can do all of this, you will be significantly more skilled than a lot of ML people in Big Tech.
As a bonus, you can also explore tinygrad and see how things actually happen on the GPU when you run a model.
by fsndz on 5/24/24, 6:22 PM