from Hacker News

An MNIST-like fashion product dataset

by kashifr on 8/28/17, 6:04 PM with 21 comments

  • by jph00 on 8/28/17, 11:21 PM

    I don't understand why this seems to be getting so much attention. There are plenty of small image datasets around, and wide recognition of the issues with MNIST.

    I see no evidence at all that this particular dataset is better than MNIST. None of the issues they themselves list with MNIST are discussed with relation to their proposed replacement.

    The benchmarks they provide are entirely useless - sklearn does not claim to be a platform for computer vision models. A quick WRN model gets 96% on this dataset (h/t @ajmooch on Twitter), suggesting that it doesn't deal with the "too easy" issue.

    The images also clearly don't address the lack of translation invariance.

    On the downside, they lack the immediate legibility of hand-drawn digits, which is extremely helpful for teaching, debugging, and visualization.

  • by nip on 8/28/17, 8:11 PM

    How would you go about generating such a dataset?

    1. Scrape images and store as png

    2. Downscale to 28px

    3. Convert each image to grayscale

    4. Convert to matrices and add label (additional row?)

    5. Normalize pixel values to the [0, 1] range for faster computation

    6. Vectorize said matrices

    7. Concatenate into one big vector

    Did I miss something / Am I fooling myself?

    I plan on working on my first ML side project and would love to gain some insights from HN. (A rough sketch of the steps above follows.)
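    Assuming step 1 is done (images scraped to disk), a minimal sketch of steps 2-7 in Python/numpy; the file names and label mapping here are hypothetical:

        import numpy as np
        from PIL import Image

        def image_to_row(path, label):
            img = Image.open(path).convert("L")       # step 3: grayscale
            img = img.resize((28, 28), Image.LANCZOS) # step 2: downscale to 28x28
            # steps 4 & 5: 28x28 matrix, scaled to [0, 1]
            pixels = np.asarray(img, dtype=np.float32) / 255.0
            # steps 4 & 6: prepend the label, then flatten the matrix
            return np.concatenate(([label], pixels.ravel()))

        # step 7: stack one row per image into a single array
        files = [("shirt_001.png", 0), ("sneaker_042.png", 7)]  # hypothetical paths/labels
        dataset = np.stack([image_to_row(path, label) for path, label in files])

    One common tweak: store the labels in a separate vector rather than as an extra element per row, which keeps the pixel array a clean (n, 784) float matrix.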

  • by eggie5 on 8/28/17, 8:01 PM

    Looks like this was sourced in-house at the German online retailer zalando.de. There is a similar dataset from Amazon, sourced by UCSD: http://jmcauley.ucsd.edu/data/amazon/

    And our research on recommenders using it: http://sharknado.eggie5.com

    In particular, the 2D t-SNE scatter of the CNN features: http://sharknado.eggie5.com/tsne
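    For context, a minimal sketch of how such a t-SNE scatter of CNN features can be produced with sklearn; it assumes the feature vectors and labels were already extracted, and the .npy file names are hypothetical:

        import numpy as np
        import matplotlib.pyplot as plt
        from sklearn.manifold import TSNE

        features = np.load("cnn_features.npy")  # (n_samples, n_dims), hypothetical
        labels = np.load("labels.npy")          # product categories, hypothetical

        # Project the high-dimensional CNN features down to 2D for plotting.
        xy = TSNE(n_components=2, perplexity=30).fit_transform(features)
        plt.scatter(xy[:, 0], xy[:, 1], c=labels, s=2, cmap="tab10")
        plt.savefig("tsne_scatter.png", dpi=150)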

  • by edshiro on 8/28/17, 8:10 PM

    I'd love to play around with this dataset! It certainly seems richer than MNIST, and would most likely force the network to extract more features.

    But just like MNIST, it seems to lack variety in the positioning of the important elements: everything is centered, which means the network isn't pushed to become translation invariant. I presume this issue can be tackled with data augmentation techniques such as affine transformations (sketched below).
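    A minimal sketch of that kind of augmentation using Keras' ImageDataGenerator; the transform ranges are illustrative guesses, and x_train/y_train are assumed to be the (n, 28, 28, 1) images and labels:

        from keras.preprocessing.image import ImageDataGenerator

        # Randomly applies small affine transforms to each training batch.
        augmenter = ImageDataGenerator(
            rotation_range=15,        # small random rotations (degrees)
            width_shift_range=0.15,   # random horizontal translations
            height_shift_range=0.15,  # random vertical translations
            shear_range=0.1,          # mild shear
            zoom_range=0.1,           # random zoom in/out
        )

        # model.fit_generator(augmenter.flow(x_train, y_train, batch_size=64))

    The translation ranges matter most here, since shifting is exactly the variation a centered dataset lacks.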

  • by stared on 8/29/17, 9:45 AM

    For a MNIST-like dataset, I often use notMNIST (http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html), which is more difficult than the original (see examples of misclassified characters here: https://docs.neptune.ml/get-started/character-recognition/).

    However, I am not sure we need more MNIST-like datasets. At this small size, many things make much less sense (data augmentation, even convnets, since the images are centered anyway), and multi-channel input is typical (in real life I rarely work with grayscale images). So I am curious: in what way is this dataset better than CIFAR-10?

    See my note on datasets in Learning Deep Learning, http://p.migdal.pl/2017/04/30/teaching-deep-learning.html#da....

  • by a3864 on 8/28/17, 7:41 PM

    If I am understanding the side-by-side comparison correctly, performance on this dataset is highly correlated with performance on MNIST (at least for high-accuracy methods).

    https://i.imgur.com/viV7gFB.png (x-axis: Fashion, y-axis: MNIST)

  • by ntenenz on 8/29/17, 4:03 PM

    One of the reasons people have shifted away from MNIST is that it's simply too easy. Single channel, small image size, few classes, etc. Unfortunately, this does not address any of these concerns.

  • by singularity2001 on 8/28/17, 9:57 PM

    How is this 'better' than CIFAR-10 / CIFAR-100?