from Hacker News

Hacker's Guide to Neural Networks

by bernatfp on 11/3/14, 8:35 PM with 34 comments

  • by karpathy on 11/3/14, 11:34 PM

    Thanks for the upvotes! I'm a little conflicted about this being linked around too much because it is still very much a work in progress. I work on this guide on the side because I think there's a lot of interest in these models out there, and not enough explanations.

    On a related note, some of you might also be interested in a Stanford CS class I'll be co-teaching with my adviser next quarter. It will be focused primarily on Convolutional Networks (but a lot of the learning machinery is generic):

    CS231n: Convolutional Neural Networks for Visual Recognition http://vision.stanford.edu/teaching/cs231n/

    I hope to make a lot of the materials/code freely available so everyone can follow along, and I will continue my work on this guide in parallel whenever I can squeeze in time. (And I'd be happy to hear any feedback on the content/style/presentation)

  • by antimora on 11/3/14, 9:54 PM

    I wanted to share this resource as well.

    I started reading this free online book, "Neural Networks and Deep Learning" (http://neuralnetworksanddeeplearning.com/). I think it has pretty good explanations and illustrations.

  • by jaza on 11/4/14, 5:46 AM

    Great guide - the only material I've ever read on the subject that hasn't made my head hurt or left my brain unable to grok it. I'm not a maths or ML guy, just a regular programmer. I've dabbled in NNs before, but only to the extent of using some libraries as a "black box" to pass parameters / training data to. No doubt there are many more people like me (too many!).

    My understanding after reading this guide is that a neural network is essentially just a formula for guessing the ideal parameter(s) of another formula, and for successively refining those parameters as more training data is passed in. I already knew this "in theory" before, but now I think most of the magic and mystery has been brushed off. Thanks karpathy!
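
    The "formula that refines another formula's parameters" idea can be sketched in a few lines of JavaScript. This is a toy gradient-descent example, not code from the guide; the model, data, and step size are all made up for illustration:

    ```javascript
    // Toy model: predict y = w * x; learn the parameter w from (x, y) pairs.
    // The "outer formula" (gradient descent) refines the parameter w
    // of the "inner formula" (the prediction w * x).
    const data = [[1, 2], [2, 4], [3, 6]]; // pairs generated with true w = 2

    let w = 0.5;            // initial guess
    const stepSize = 0.01;  // learning rate

    for (let iter = 0; iter < 500; iter++) {
      for (const [x, y] of data) {
        const pred = w * x;
        const gradient = 2 * (pred - y) * x; // d/dw of the squared error (pred - y)^2
        w -= stepSize * gradient;            // nudge w to reduce the error
      }
    }
    console.log(w); // converges to roughly 2
    ```

    Each pass nudges w a little in the direction that reduces the prediction error, which is exactly the "successive refinement" described above.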

  • by zackmorris on 11/4/14, 12:58 AM

    This is the first explanation of neural nets that has really clicked for me. That's because it's written in the same explanatory tone that I would use in person while trying to convey the concepts. I wish more people would write like this.

    I think of these things like lectures in college. Assuming that a) you have all the context up to that point, b) you grok every step, and c) it doesn't run beyond the length of your attention span - call it an hour - then you can learn astonishing things in a matter of a few sessions. But if you miss a step, and don't raise your hand because you think your question is stupid, then you might as well pack up and go home.

    I find that even though the time element of articles is removed, I still trip over certain sections because the author was too lazy to anticipate my thinking process, which forces me to search other sites for the answers, and before you know it the impact of the writing has been completely diluted.

  • by xanderjanz on 11/4/14, 12:23 AM

    https://class.coursera.org/ml-007

    Andrew Ng's Stanford course has been a godsend in laying out the mathematics of Machine Learning. It would be a good next step for anybody who was intrigued by this article.

  • by bribri on 11/4/14, 5:13 AM

    Absolutely amazing. I wish I could be taught more math in terms of programming. I hope more people make "Hacker's Guide to _" guides.

  • by hoprocker on 11/4/14, 5:01 AM

    This is great. Always appreciate different approaches to NN and ML. I'm amazed that a Stanford PhD candidate has the time to put this together (I won't tell your advisor :-), but still, thank you!

  • by ilyaeck on 11/4/14, 3:58 AM

    I saw the JS demos a few months ago and was blown away. It seems like the community is really thirsty for NN tools and materials, and the way you are going about this (interactive JS) is right on the money. Why not engage the community to keep building the site up, then? You may want to look into open-sourcing the entire site + the JS library; it may really pick up steam. Worst case, it breaks apart (you can always reclaim it), but the exercise could be a very interesting one.

  • by sireat on 11/4/14, 11:31 AM

    This was a very good read.

    My naive question: are the gates always so trivial, or are they usually black boxes with an unknown function underneath (instead of clearly defined *, +, etc.)?

    In other words, do we usually/always know what stands behind f(x,y)?

    Otherwise, if f(x,y) = xy, then obviously you can hardcode a heuristic that just cranks x and y up to their max and gets max f(x,y).

    That is, the question is: why should we tweak the inputs slightly rather than go all out?
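
    On the "go all out" point: the gradient of a gate is only a local statement, so small steps are taken and the gradient is re-evaluated after each one. A minimal sketch of the multiply-gate example from the guide (variable names and step size are illustrative):

    ```javascript
    // A multiply gate f(x, y) = x * y, with known analytic gradients.
    function forwardMultiplyGate(x, y) {
      return x * y;
    }

    let x = -2, y = 3;
    const out = forwardMultiplyGate(x, y); // -6

    // Analytic gradients of f(x, y) = x * y:
    const dx = y; // df/dx = y = 3
    const dy = x; // df/dy = x = -2

    // The gradient only says which *local* direction increases f.
    // Once x and y move, the gradients themselves change, so we take
    // a small step and would then re-evaluate, rather than jumping
    // straight to extreme values.
    const stepSize = 0.01;
    x += stepSize * dx; // -1.97
    y += stepSize * dy; //  2.98

    const outNew = forwardMultiplyGate(x, y); // -5.8706, slightly higher than -6
    console.log(out, outNew);
    ```

    If x and y were simply cranked to their maxima in one jump, the local gradient information would no longer apply; repeated small steps are what make the procedure work on circuits of many chained gates.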

  • by breflabb on 11/4/14, 9:41 AM

    Inspiring! I have always been interested in AI and neural nets, but I have been discouraged by the math. Many educators who try to teach these topics assume that their readers have the same math skills as themselves and omit the "boring" parts. It's actually the first time I've enjoyed reading about AI while learning calculus at the same time! :) Keep up the great work!

  • by infinitone on 11/4/14, 12:23 AM

    I was wondering the other day, since I have a CV project that could use neural networks to solve some problems: how big of a training dataset does one need? Is there any analysis of the relationship between the size of the training data and accuracy?

  • by jdminhbg on 11/4/14, 6:15 PM

    I worked through the first few chapters on a flight yesterday and found this to be exactly what I need -- my ability to read code and sort of intuit algorithms vastly exceeds my ability to read and understand math. Thanks for putting this together.

  • by jimmaswell on 11/4/14, 3:29 AM

    "it is important to note that in the left-hand side of the equation above, the horizontal line does not indicate division."

    It pretty much is, though.
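
    The notation df/dx isn't literal division, but the numerical approximation behind it is a genuine ratio. A small sketch (the function and values here are illustrative, echoing the guide's style):

    ```javascript
    // The derivative is the limit of an actual division:
    // df/dx is approximately (f(x + h, y) - f(x, y)) / h for a tiny h.
    function f(x, y) {
      return x * y;
    }

    const x = -2, y = 3, h = 0.0001;
    const numericalGrad = (f(x + h, y) - f(x, y)) / h; // a ratio, close to 3
    const analyticGrad = y;                            // exactly 3, since df/dx = y

    console.log(numericalGrad, analyticGrad);
    ```

    So the commenter has a point: the horizontal bar stops being division only in the limit as h goes to zero, where it becomes a single symbol for the whole limiting ratio.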

  • by JoelHobson on 11/4/14, 2:07 AM

    I haven't finished it yet, but I find this much easier to understand than any other article on neural networks that I've read.

  • by ajtulloch on 11/3/14, 11:02 PM

    This is a great piece of work - thanks @karpathy.

  • by thewarrior on 11/4/14, 9:44 AM

    This seems less like a biological neuron and more like a multi-level table lookup.

  • by midgetjones on 11/4/14, 1:13 PM

    This was a fantastic read, thanks so much!

    Can anyone define 'perturbations' for me?

  • by Elzair on 11/4/14, 6:36 PM

    Thank you karpathy. This is a great introduction.

  • by sathya_vj on 11/4/14, 1:19 PM

    Thanks a lot for this!

  • by therobot24 on 11/3/14, 11:02 PM

    HN loves Neural Nets & Deep Learning - it's all I ever see in my RSS feed (with regard to ML methods).