by felixhandte on 12/19/18, 9:04 PM with 87 comments
by karavelov on 12/19/18, 10:41 PM
Bullshit, Zstd was open-source from the very beginning; they just hired Yann and moved the project under the Facebook org. How do I know? I have written the JVM bindings [1] since v0.1 that are now used by Spark, Kafka, etc.
EDIT: Actually, my initial bindings were against v0.0.2 [2]
Kudos to FB for hiring him and helping Zstd get production-ready. This is just a false PR claim.
[1] https://github.com/luben/zstd-jni
[2] https://github.com/luben/zstd-jni/commit/3dfe760cbb8cc46da32...
by IvanK_net on 12/19/18, 10:07 PM
The response had a header: content-encoding: gzip
Zstandard looks like an improvement on DEFLATE (= gzip = zlib), and its specification is only 3x longer (even though it was introduced 22 years later): https://tools.ietf.org/html/rfc8478
Since Zstandard is so simple and efficient, I thought it would get into browsers very quickly. Then it could make sense to compress even PNG or JPG images, which are usually impossible to compress further with DEFLATE.
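(The point about DEFLATE and image files is easy to check with Python's stdlib; random bytes are used here as a stand-in for already-compressed image data, since JPG/PNG payloads are statistically close to noise:)

```python
import os
import zlib

# Random bytes stand in for already-compressed image payloads:
# DEFLATE finds no redundancy to exploit in them.
data = os.urandom(100_000)

compressed = zlib.compress(data, 9)

# DEFLATE falls back to "stored" blocks plus framing overhead,
# so the output is slightly larger than the input.
print(len(data), len(compressed))

# Round-trip still works, of course.
assert zlib.decompress(compressed) == data
```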
by valarauca1 on 12/19/18, 10:24 PM
It is nice to have disk IO be the limiting factor on decompression even when you are using NVMe drives.
by stochastic_monk on 12/19/18, 10:01 PM
by golergka on 12/20/18, 9:34 AM
by josephg on 12/19/18, 9:57 PM
by koolba on 12/19/18, 10:30 PM
What happens if your dictionary, trained on user data, ends up storing fragments of that user data and you receive a GDPR destruction request?
by jclay on 12/19/18, 9:30 PM
This is a resource I've found helpful: https://www3.nd.edu/~pkamat/pdf/graphs.pdf
"Consider readers with color blindness or deficiencies"
"Avoid colors that are difficult to distinguish"
by m0zg on 12/19/18, 10:11 PM
Right now it's impressive "in the middle", but I find myself in a lot of situations where I care about the extremes. E.g., for something that will be transferred a lot, or cold-stored, I want maximum compression, CPU/RAM usage be damned, within reason. So I tend to use LZMA there if the files aren't too large. For realtime/network RPC scenarios I want minimum RAM/CPU usage and Pareto-optimality on multi-GbE networks. That's where I use LZ4 (and used to use Snappy/Zippy).
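(A minimal sketch of that ratio-vs-speed trade-off using only Python's stdlib: zlib for the DEFLATE class and lzma for the maximum-ratio end. Zstd and LZ4 bindings are third-party packages, so they're omitted here; the sample data is invented:)

```python
import lzma
import time
import zlib

# Highly redundant sample data; real logs/JSON behave similarly.
data = b"GET /index.html HTTP/1.1 200 OK\n" * 10_000

for name, compress in [("DEFLATE (zlib -9)", lambda d: zlib.compress(d, 9)),
                       ("LZMA (xz)", lzma.compress)]:
    t0 = time.perf_counter()
    out = compress(data)
    dt = time.perf_counter() - t0
    print(f"{name:18s} {len(out):6d} bytes  {dt * 1000:7.2f} ms")
```

On redundant input like this, LZMA typically produces the smaller output while taking noticeably longer, which is exactly the extreme-end trade-off described above.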
At their scale, though, FB is surely saving many millions of dollars thanks to deploying this, both in human/machine time savings and storage savings.
by jzawodn on 12/19/18, 10:19 PM