by HardikVala on 1/26/25, 5:20 AM with 56 comments
I'm not interested in hands-on guides (e.g. how to train a DNN classifier in TensorFlow) or LLM-centric resources.
So far, I've put together the following curriculum:
1. Artificial Intelligence: A Modern Approach (https://aima.cs.berkeley.edu/) - Great for learning the breadth of foundational concepts, e.g. local search algorithms, building up to modern AI.
2. Probabilistic Machine Learning: An Introduction (https://probml.github.io/pml-book/book1.html) - Going more in-depth into ML.
3. Dive into Deep Learning (https://d2l.ai/) - Going deep into DL, including contemporary ideas like Transformers and diffusion models.
4. Neural Networks and Deep Learning (http://neuralnetworksanddeeplearning.com/) could also be a great resource, but the content probably overlaps significantly with 3.
Would anybody add/update/remove anything? (Don't have to limit recommendations to textbooks. Also open to courses, papers, etc.)
Sorry for the semi-redundant post.
by noduerme on 1/26/25, 7:15 AM
Without reading about how it's done now, just think about how you think a neural network should function. It ostensibly has input, output, and something in the middle. Maybe its input is a 64x64 pixel handwritten character, and its output is a unicode number. In between the input pixels (a 64x64 array) and the output, are a bunch of neurons. Layers of neurons. That talk to each other and learn or un-learn (are rewarded or punished).
Build that. Build a cube where one side is a pixel grid and the other side delivers a number. Decide how the neurons influence each other and how they train their weights to deliver the result at the other end. However you think it should go. Just raw-code it with arrays in whatever dimensions you want and make it work; you can do it in JavaScript or BASIC. Link them however you want. Don't worry about performance, because you can assume that whatever marginally works can be tested on a massive scale and show "impressive" results.
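For instance, a minimal sketch of that "cube" in Python (toy sizes, random weights, layer shapes of my own choosing, and no training yet):

    import numpy as np

    # Toy "cube": a 64x64 pixel grid in, one score per possible character out.
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))           # stand-in for a handwritten character

    x = image.reshape(-1)                  # flatten to a 4096-long input vector
    W1 = rng.normal(0, 0.01, (128, 4096))  # input -> hidden-layer weights
    W2 = rng.normal(0, 0.01, (10, 128))    # hidden -> output weights (10 classes here)

    h = np.maximum(0, W1 @ x)              # each hidden "neuron" sums its inputs, then a nonlinearity
    scores = W2 @ h                        # one score per output class
    print("predicted class:", scores.argmax())

    # "Learning" would mean nudging W1 and W2 whenever the argmax is wrong
    # (reward/punish); everything else is deciding how to do that nudging.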
by InkCanon on 1/26/25, 6:14 AM
The other big question is why you want to learn it. If you want to learn ML in itself, then anything you mentioned, including the search algorithms (which used to be considered core to ML a long time ago), is part of that. But if you want to learn ML to contribute to modern developments like LLMs, then search algorithms are virtually useless. If you aren't going to be engineering any ML or ML products, what you want is to gain some insight into its future and the business of it. So learning things like the transformer architecture is going to be far less helpful than, say, reading about the economics of compute clusters.
Given the empirical/engineering nature of current ML, I'd say building it from scratch is really good for getting at the handful of first principles there are (the fundamental functions involved, data cleaning, training, etc.).
by CamperBob2 on 1/27/25, 1:00 AM
If you want a historical perspective, which is very worthwhile, start by reading about the mid-century work of McCulloch and Pitts, and Minsky, Papert, and their colleagues at the MIT AI Lab after that.
There will be a dry spell after Minsky and Papert because of their conclusion that the OG neural-network topology that everyone was familiar with, the so-called "perceptron", was a dead end. That conclusion was premature to say the least, but in any event the hardware and training techniques weren't available to support any serious progress.
Adding hidden layers and nonlinear activation functions to the perceptron network seemed promising, in that they worked around some of Minsky's technical objections. The multi-layer perceptron was now a "universal approximator," capable of modeling essentially any linear or nonlinear function. In retrospect that should have been considered a bigger deal than it was, but the MLP was still a pain to train, and it didn't seem very useful at the scales achievable in hardware at the time. Anything a neural net could do, specialized code could usually do better and cheaper.
Then, in the circa-2012 timeframe, AlexNet dusted off some of the older ideas and used them to win image-recognition benchmark competitions, not by a small margin but by blowing everybody else into the weeds. That brought the multi-layer perceptron back into vogue, and almost everything that has happened since can be traced back to that work.
The Karpathy videos are the best intro to the MLP concept I've run across. Understanding the MLP is the key prereq if you want to understand current-gen AI from first principles.
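To make the hidden-layer point concrete, here is a rough sketch (plain numpy, with hyperparameters of my own choosing) of a one-hidden-layer MLP learning XOR, the classic function a single-layer perceptron cannot represent:

    import numpy as np

    # XOR: not linearly separable, so no single-layer perceptron can fit it.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    rng = np.random.default_rng(1)
    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # one hidden layer of 4 units
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

    sigmoid = lambda z: 1 / (1 + np.exp(-z))
    lr = 1.0
    for _ in range(10000):
        h = np.tanh(X @ W1 + b1)              # nonlinear hidden layer
        p = sigmoid(h @ W2 + b2)              # output probability
        d_out = (p - y) / len(X)              # cross-entropy gradient at the output logit
        d_hid = (d_out @ W2.T) * (1 - h ** 2) # chain rule back through tanh
        W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(0)
        W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(0)

    print(np.round(p.ravel(), 2))  # approaches [0, 1, 1, 0]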
by grepLeigh on 1/27/25, 4:14 PM
There's also a world of statistics and machine learning outside of deep learning. I think the best way to get started on that end is an undergrad survey course like CS189: https://people.eecs.berkeley.edu/~jrs/189/
by jmholla on 1/27/25, 6:50 PM
by andyjohnson0 on 1/27/25, 9:45 AM
Looks like the course has turned into a multi-course "specialization" and I have no idea if any of it is the same as the course I did. But it might be a place to start.
by riwsky on 1/27/25, 2:04 AM
by Maro on 1/27/25, 8:06 AM
If you don't have a solid background in math, then that's what you should improve upon (calculus, linear algebra, discrete math, probability theory, information theory). Some of the books you mention do cover this at the beginning, but most people take separate courses on these topics at University, with lots of homework, etc.
Also, the first book on your list is the classic textbook by Russell and Norvig, but I don't think it's actually very good. I remember reading it in my college AI course 25 years ago and it was painful even back then (anybody remember "wumpus"?). It's a big book that covers too much; it's like printing out a lot of Wikipedia pages. You're better off finding books with a smaller scope that focus on something you actually care about / that's relevant to the way the field has developed.
by ipnon on 1/27/25, 2:09 AM
I prefer the a16z AI canon for this purpose. It’s useful and historical. It’s structured to begin with no prerequisites and work up to cutting edge research papers. And best of all it’s free and open source.
by talles on 1/27/25, 12:38 PM
1. Linear algebra. Be comfortable with vector transformations in a vector space. This is the framework for understanding how data is represented and what is going on inside the model.
2. Calculus. Specifically derivatives, up to partial derivatives and the chain rule. This is needed later to understand backpropagation, i.e. the learning. It's fine to skip integrals.
3. Vanilla neural network. Study how a simple feed-forward, fully connected neural network works, in detail. Every single bit of it.
I wouldn't worry or plan anything ahead until having those. After number 3 you'll have different branches to follow and will be better equipped to pick a path.
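As a rough illustration of how 1-3 fit together (a toy example of my own, not from any particular resource): one neuron's forward pass is linear algebra plus a nonlinearity, and the gradient that drives learning is just the chain rule, which you can sanity-check numerically:

    import numpy as np

    # One neuron: y = sigmoid(w . x + b), squared-error loss against a target t.
    x = np.array([0.5, -1.0, 2.0])
    w = np.array([0.1, 0.4, -0.3])
    b, t = 0.2, 1.0

    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    def loss(w, b):
        return 0.5 * (sigmoid(w @ x + b) - t) ** 2

    # Chain rule, step by step: dL/dw = dL/dy * dy/dz * dz/dw
    y = sigmoid(w @ x + b)
    dL_dy = y - t          # derivative of the squared error
    dy_dz = y * (1 - y)    # derivative of the sigmoid
    dz_dw = x              # derivative of the dot product w.r.t. w
    grad_w = dL_dy * dy_dz * dz_dw

    # Sanity check against a finite-difference estimate
    eps = 1e-6
    numeric = np.array([(loss(w + eps * np.eye(3)[i], b) - loss(w, b)) / eps
                        for i in range(3)])
    print(grad_w, numeric)  # the two should agree closely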
by __alexander on 1/26/25, 11:00 PM
How AI Works - https://nostarch.com/how-ai-works
+
Why Machines Learn: The Elegant Math Behind Modern AI - https://www.penguinrandomhouse.com/books/677608/why-machines...
by cdicelico on 1/27/25, 1:36 PM
by Bjartr on 1/27/25, 1:34 PM
by gamblor956 on 1/27/25, 7:37 PM
Step 2: The steps above are a good plan for learning about traditional AI, and the traditional approaches, which were based on an attempt to model human thought processes. Machine learning was what the industry turned to in the early 2000s because we didn't have the hardware capabilities then to meaningfully model neural networks. We do now, but machine learning has taken over so there's very little research into modeling neural networks...about the same as there was when I was an undergrad.
by 3abiton on 1/26/25, 11:59 PM
by nosioptar on 2/2/25, 6:41 PM
by meltyness on 1/28/25, 3:13 AM
by null_investor on 1/30/25, 10:05 AM
It will show you the maths; you'll build simple neural nets (from maths!) that can read digits, and scale up from there.
While doing it, you may struggle with some of the maths. Just take a deep breath, invest some time to fill the gaps you need, and continue.
Learn about CNNs and everything else step-by-step. It's awesome.
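For the CNN step, a rough sketch (plain numpy, a toy example of my own) of the core operation, a convolution written as a sliding dot product:

    import numpy as np

    def conv2d(image, kernel):
        # "Valid" 2D cross-correlation: slide the kernel and take dot products.
        kh, kw = kernel.shape
        oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.zeros((8, 8))
    image[:, 4:] = 1.0                    # a vertical edge
    kernel = np.array([[1.0, -1.0]])      # responds to horizontal changes
    print(conv2d(image, kernel))          # nonzero response exactly at the edge column

A CNN layer is just many such kernels, learned rather than hand-written, followed by a nonlinearity.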
by exe34 on 1/27/25, 7:29 AM
by syxp on 1/26/25, 11:08 PM
It covers ground on getting from theory to practice that I don't know where else to find.
by virginwidow on 2/5/25, 3:34 AM
When a notion is disturbing, before dismissing it I must know why (MY why). 90% of the time it boils down to fear (mine) from lacking information, and hence failing to understand.
The discussion & references found here aided my understanding a great deal.
by Mr-Frog on 1/26/25, 5:40 AM
by wodenokoto on 1/27/25, 6:26 AM
I think that is "first principles of AI". Like, what does it even mean when we ask an algorithm to "learn" from data?
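One minimal way to make that concrete (a toy example of my own): "learning" here means choosing parameters that minimize a loss over data, e.g. recovering a line's slope and intercept by gradient descent:

    import numpy as np

    # Data from a hidden rule y = 3x + 1, plus noise; "learning" = recovering w and b.
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 100)
    y = 3 * x + 1 + rng.normal(0, 0.1, 100)

    w, b, lr = 0.0, 0.0, 0.1
    for _ in range(500):
        pred = w * x + b
        # gradient of the mean squared error with respect to w and b
        dw = 2 * np.mean((pred - y) * x)
        db = 2 * np.mean(pred - y)
        w, b = w - lr * dw, b - lr * db

    print(w, b)  # close to 3 and 1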
by romperstomper on 1/29/25, 8:45 AM
by hnarayanan on 1/26/25, 10:20 PM
by animesh on 1/27/25, 8:25 AM
by charlieyu1 on 1/27/25, 3:44 AM
by rcarr on 1/27/25, 5:13 PM
by crimsoneer on 1/27/25, 4:07 PM
by markus_zhang on 1/27/25, 2:37 PM
by swah on 1/28/25, 11:34 AM