from Hacker News

List of data science and machine learning resources

by seats on 12/17/12, 5:32 AM with 15 comments

  • by clarle on 12/17/12, 6:36 AM

    Great write-up, and awesome list of resources!

    The only thing I'd probably add is that there's a pretty significant gap going from learning linear algebra to more advanced topics such as LDA.

    For people who are just getting started with machine learning, it's probably best to begin by implementing some of the more "intuitive" algorithms such as decision trees, k-means, and naive Bayes before moving on to some of the more recent academic work.

    Other things are pretty useful but often forgotten, such as feature selection, data normalization, and even data visualization. Algorithms are usually just one part of machine learning: even the best algorithm can't do much unless you've identified the best features of your data.

    Still, it's a great list of more advanced topics, and definitely something I'll keep bookmarked for future reference.

  • by antman on 12/17/12, 1:56 PM

    Google is your friend. You will usually find something about those things by altering the following query:

    best machine learning site:stackoverflow.com "closed as not"

  • by eli_awry on 12/17/12, 6:30 AM

    I've spent the last 1.5 years as a machine learning PhD student slowly discovering many of these resources and topics, and I wish I had had this list at the beginning. It contains most of the gems I've found. I'd add that the PGM course on Coursera clearly explains fundamental topics in probabilistic graphical models.

    It's important to understand individual algorithms, but in many ways it's more important to have a broad overview of the field and its more modern methods, so that given a problem it's possible to think about the best way to solve it, and to share a common language with others who may have ideas. Beyond this list and various online courses, I've found that talking to people about their work and having them explain the high-level concepts of every black-box classifier, similarity metric, or whatever else they use has been quite educational.

  • by RaSoJo on 12/17/12, 7:17 AM

    Awesome post. It has been bookmarked, Evernoted, printed and stuck up on my wall.

    I did note the absence of Andrew Ng's oft-quoted Coursera course on ML. I assume the author has put it under "disruptive educational sites".

    But I genuinely want to know how Ng's course measures up to the other resources mentioned in this post.

  • by conductrics on 12/17/12, 6:32 PM

    I reached out to Yann LeCun and he emailed me a couple more recent links. I updated the deep learning section of the post to include them. Feel free to check them out.

  • by paulgb on 12/17/12, 3:50 PM

    Great list. Does anyone have a recommendation for good, rigorous coverage of Bayesian statistics?

  • by icedin on 12/19/12, 3:07 PM

    Great list, well done Matt. Clear and concise as always.

  • by tgwilson on 12/17/12, 3:14 PM

    Fantastic list of resources!

  • by simplerichard on 12/17/12, 5:25 PM

    Great list.