by madisonmay on 7/23/18, 12:22 PM with 8 comments
by ovi256 on 7/25/18, 1:05 PM
Transfer learning works great for vision problems (just reuse one of the big state-of-the-art networks trained on ImageNet - I like ResNet50). This was enabled by the large amount of structure that vision problems share. There was nothing similar for NLP, besides pre-trained first layers like word2vec. If you want to learn more, check out the fast.ai DL course; it features transfer learning a lot.
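To make the vision recipe concrete, here is a minimal sketch with torchvision (num_classes and the hyperparameters are placeholders for whatever your target task needs):

    import torch
    import torch.nn as nn
    import torchvision.models as models

    num_classes = 10                           # placeholder: label count of your target task

    model = models.resnet50(pretrained=True)   # backbone trained on ImageNet
    for param in model.parameters():           # freeze the pre-trained weights
        param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task-specific head

    # Only the new head gets trained; the rest of the network is reused as-is.
    optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)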
But this model and ULMFiT (nlp.fast.ai) show that deeper nets can be pre-trained for NLP too, and achieve good results when transferred to other datasets and problems.
This enables not just the obvious use case of "I don't have N GPUs to train a deep net from scratch, but I can now fine-tune a pre-trained model", but also more subtle and interesting cases like fine-tuning on a very small dataset (compared to ImageNet or 100,000-sample NLP datasets) and cheap training on demand. Training a new model for every user was way too expensive when training from scratch, but if fine-tuning a pre-trained net takes just a few minutes, why not?
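With this library, the per-user fine-tune is a handful of lines (a sketch based on its scikit-learn-style Classifier interface; the tiny dataset below is a placeholder):

    from finetune import Classifier

    # Placeholder per-user data -- in practice a few hundred labelled examples.
    user_texts = ["loved it", "terrible support", "works as advertised"]
    user_labels = ["positive", "negative", "positive"]

    model = Classifier()                # load the pre-trained transformer base model
    model.fit(user_texts, user_labels)  # fine-tune on the user's small dataset
    print(model.predict(["would buy again"]))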
by Tarq0n on 7/25/18, 10:41 AM
Not that this library isn't promising, but the name and presentation make it seem far more general than it really is.
by stared on 7/25/18, 11:12 AM
See also https://pytoune.org/ (a Keras-like interface for PyTorch) and https://github.com/dnouri/skorch (a scikit-learn interface for PyTorch).
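For instance, skorch wraps a plain PyTorch module in a scikit-learn-style fit/predict (rough sketch on toy data; the module and hyperparameters are just examples):

    import numpy as np
    import torch.nn as nn
    from skorch import NeuralNetClassifier

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.layers = nn.Sequential(
                nn.Linear(20, 10), nn.ReLU(), nn.Linear(10, 2))

        def forward(self, X):
            return self.layers(X)          # raw logits, paired with CrossEntropyLoss below

    X = np.random.randn(100, 20).astype(np.float32)
    y = np.random.randint(0, 2, 100).astype(np.int64)

    net = NeuralNetClassifier(Net, max_epochs=5, lr=0.1, criterion=nn.CrossEntropyLoss)
    net.fit(X, y)                          # scikit-learn-style training loop
    print(net.predict(X[:5]))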
As a side note, a project of mine: super-simple Jupyter Notebook training plots for Keras and PyToune: https://github.com/stared/livelossplot (with a bare API, so you can connect it to anything you wish)
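The Keras usage is just a callback (sketch with a toy model; the data and architecture are throwaway):

    import numpy as np
    from tensorflow import keras
    from livelossplot import PlotLossesKeras

    # Toy data and model, only to demonstrate the callback.
    X = np.random.randn(256, 20)
    y = np.random.randint(0, 2, 256)
    model = keras.models.Sequential([
        keras.layers.Dense(16, activation="relu", input_shape=(20,)),
        keras.layers.Dense(1, activation="sigmoid")])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

    # The callback redraws the loss/metric plots in the notebook after every epoch.
    model.fit(X, y, validation_split=0.2, epochs=10, callbacks=[PlotLossesKeras()])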