by bernatfp on 11/3/14, 8:35 PM with 34 comments
by karpathy on 11/3/14, 11:34 PM
On a related note, some of you might also be interested in a Stanford CS class I'll be co-teaching with my adviser next quarter. It will be focused primarily on Convolutional Networks (but a lot of the learning machinery is generic):
CS231n: Convolutional Neural Networks for Visual Recognition http://vision.stanford.edu/teaching/cs231n/
I hope to make a lot of the materials/code freely available so everyone can follow along, and I will continue my work on this guide in parallel whenever I can squeeze in time. (And I'd be happy to hear any feedback on the content/style/presentation.)
by antimora on 11/3/14, 9:54 PM
I started reading this free online book, "Neural Networks and Deep Learning" (http://neuralnetworksanddeeplearning.com/). I think it has pretty good explanations and illustrations.
by jaza on 11/4/14, 5:46 AM
My understanding after reading this guide is that a neural network is essentially just a formula for guessing the ideal parameter(s) of another formula, and for successively refining those parameters as more training data is passed in. I already knew this "in theory" before, but now I think most of the magic and mystery has been brushed away. Thanks karpathy!
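To make that concrete, here is a minimal sketch of the idea in Python (the guide itself uses JavaScript). Everything below (the inner formula y = w * x, the toy data, the learning rate) is made up for illustration, not taken from the guide: a training loop that successively refines the parameter w of the inner formula as data passes through.

    # Minimal sketch of "a formula that refines another formula's parameters".
    # The model, data, and learning rate are illustrative assumptions.

    # Toy training data generated by the "true" formula y = 3 * x.
    data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

    w = 0.0              # initial guess for the parameter of the inner formula
    learning_rate = 0.01

    for step in range(1000):
        for x, y_true in data:
            y_pred = w * x                  # evaluate the inner formula
            error = y_pred - y_true         # how far off the guess is
            grad = error * x                # d(0.5 * error**2) / dw
            w -= learning_rate * grad       # refine the parameter slightly

    print(w)  # approaches 3.0 as more training data is passed in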
by zackmorris on 11/4/14, 12:58 AM
I think of these things like lectures in college. Assuming that a) you have all the context up to that point, b) you grok every step, and c) it doesn't run beyond the length of your attention span - call it an hour - then you can learn astonishing things in a matter of a few sessions. But if you miss a step, and don't raise your hand because you think your question is stupid, then you might as well pack up and go home.
I find that even though articles remove the time element, I still trip over certain sections because the author was too lazy to anticipate my thinking process. That forces me to search other sites for the answers, and before you know it the impact of the writing has been completely diluted.
by xanderjanz on 11/4/14, 12:23 AM
Andrew Ng's Stanford course has been a godsend in laying out the mathematics of Machine Learning. It would be a good next step for anybody who was intrigued by this article.
by bribri on 11/4/14, 5:13 AM
by hoprocker on 11/4/14, 5:01 AM
by ilyaeck on 11/4/14, 3:58 AM
by sireat on 11/4/14, 11:31 AM
My naive question: are the gates always so trivial, or are they usually black boxes with an unknown function underneath (instead of clearly defined *, +, etc.)?
In other words, do we usually/always know what stands behind f(x,y)?
Otherwise, if f(x,y) = xy, then obviously you can hardcode a heuristic that just cranks x and y up to their maximums and gets max f(x,y).
That is, the question is: why should we tweak the inputs slightly instead of going all out?
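For context, the gates in the guide are simple, known functions; that is what makes computing gradients possible at all. The reason for small tweaks rather than "going all out" is that the gradient is only valid locally, and in a real circuit x and y are usually outputs of other gates (or fixed input data) rather than free knobs you can crank to max. Here is a minimal Python sketch of the perturbation idea, assuming f(x, y) = x * y with the small starting values from the guide's running example as I recall them:

    # "Tweak slightly" strategy for f(x, y) = x * y. The gradient says which
    # direction helps *near the current inputs*, which is all you can use
    # when the inputs are not free to be set arbitrarily.

    def f(x, y):
        return x * y

    x, y = -2.0, 3.0
    h = 0.0001  # size of the perturbation used to probe f

    # Numerical gradient: perturb each input a tiny bit, measure the effect.
    df_dx = (f(x + h, y) - f(x, y)) / h   # ~ 3.0, i.e. y
    df_dy = (f(x, y + h) - f(x, y)) / h   # ~ -2.0, i.e. x

    step_size = 0.01
    x += step_size * df_dx   # nudge x in the helpful direction
    y += step_size * df_dy   # nudge y in the helpful direction

    print(f(x, y))  # about -5.87, up from -6.0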
by breflabb on 11/4/14, 9:41 AM
by infinitone on 11/4/14, 12:23 AM
by jdminhbg on 11/4/14, 6:15 PM
by jimmaswell on 11/4/14, 3:29 AM
It pretty much is, though.
by JoelHobson on 11/4/14, 2:07 AM
by ajtulloch on 11/3/14, 11:02 PM
by thewarrior on 11/4/14, 9:44 AM
by midgetjones on 11/4/14, 1:13 PM
Can anyone define 'perturbations' for me?
by Elzair on 11/4/14, 6:36 PM
by sathya_vj on 11/4/14, 1:19 PM
by therobot24 on 11/3/14, 11:02 PM