by sprachspiel on 6/20/21, 5:39 PM with 158 comments
by bob1029 on 6/20/21, 8:13 PM
If you want to go fast & save NAND lifetime, use append-only log structures.
If you want to go even faster & save even more NAND lifetime, batch your writes in software (e.g. a ring buffer with a natural back-pressure mechanism) and then serialize them with a single writer into an append-only log structure. Many newer devices have something like this at the hardware level, but your block size is still a constraint when working in hardware. If you batch in software, you can hypothetically write multiple logical business transactions per block I/O. When your physical block size is 4k and your logical transactions average 512 bytes of data, you would otherwise be leaving a lot of throughput on the table.
Going down one level of abstraction seems important if you want to extract the most performance from an SSD. Unsurprisingly, the above ideas also make ordinary magnetic disk drives faster & potentially longer-lived.
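The batching idea above can be sketched in a few lines. This is a minimal illustration, not the commenter's actual design: a bounded queue provides the back-pressure, a single writer thread packs many small records into block-sized appends, and the `BLOCK_SIZE` of 4096 is an assumption matching the example in the comment.

```python
import os
import queue
import threading

BLOCK_SIZE = 4096  # assumed physical block size, per the comment's example


class LogBatcher:
    """Single writer drains a bounded queue (natural back-pressure)
    and packs many small transactions into block-sized appends."""

    def __init__(self, path, maxsize=1024):
        self.q = queue.Queue(maxsize=maxsize)  # blocks producers when full
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.writer = threading.Thread(target=self._drain, daemon=True)
        self.writer.start()

    def submit(self, record: bytes):
        # Back-pressure: this blocks if the writer falls behind.
        self.q.put(record)

    def _drain(self):
        buf = bytearray()
        while True:
            rec = self.q.get()
            if rec is None:  # shutdown sentinel
                break
            # Length-prefix each record so the log can be replayed later.
            buf += len(rec).to_bytes(4, "little") + rec
            # Flush only whole blocks: many logical transactions per block I/O.
            while len(buf) >= BLOCK_SIZE:
                os.write(self.fd, bytes(buf[:BLOCK_SIZE]))
                del buf[:BLOCK_SIZE]
        if buf:  # pad the tail out to a full block on shutdown
            buf += b"\0" * (BLOCK_SIZE - len(buf))
            os.write(self.fd, bytes(buf))
        os.close(self.fd)

    def close(self):
        self.q.put(None)
        self.writer.join()
```

With 512-byte transactions, roughly seven fit in each 4k block instead of one per write, which is exactly the throughput the comment says is otherwise left on the table.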
by jedberg on 6/20/21, 7:46 PM
I've always been told, "just treat SSDs like slow, permanent memory".
by klodolph on 6/20/21, 9:18 PM
https://www.usenix.org/system/files/conference/inflow14/infl...
> Log-structured applications and file systems have been used to achieve high write throughput by sequentializing writes. Flash-based storage systems, due to flash memory’s out-of-place update characteristic, have also relied on log-structured approaches. Our work investigates the impacts to performance and endurance in flash when multiple layers of log-structured applications and file systems are layered on top of a log-structured flash device. We show that multiple log layers affects sequentiality and increases write pressure to flash devices through randomization of workloads, unaligned segment sizes, and uncoordinated multi-log garbage collection. All of these effects can combine to negate the intended positive affects of using a log. In this paper we characterize the interactions between multiple levels of independent logs, identify issues that must be considered, and describe design choices to mitigate negative behaviors in multi-log configurations.
by andrewmcwatters on 6/20/21, 7:40 PM
This is pure speculation, but there must have been a period during the mass transition to SSDs when engineers asked: how do we make this hardware compatible with software that, for the most part, expects hard disk drives, and just have it behave like a really fast HDD?
So, there's almost certainly some non-zero amount of code out there in the wild that is or was running some very specific write-optimized routine that one day just started performing 10 to 100 times faster, and maybe, given the nature of software, is still out there today doing that same routine.
I don't know what that would look like, but my guess would be that it has something to do with average-sized write caches, and those caches look entirely different today, or something.
And today, there's probably some SSD specific code doing something out there now, too.
by rossdavidh on 6/20/21, 9:36 PM
But, fun to read and think about.
by dang on 6/20/21, 6:48 PM
What every programmer should know about solid-state drives - https://news.ycombinator.com/item?id=9049630 - Feb 2015 (31 comments)
by FpUser on 6/20/21, 7:59 PM
So I think that unless this "every programmer" is a database storage engine developer (not too many of them, I guess), their main concern would mostly be: how close is my SSD to that magical point where it has to be cloned and replaced before shit hits the fan?
by rabuse on 6/20/21, 8:33 PM
by kortilla on 6/20/21, 7:52 PM
These are all reasons SSDs are much more pleasant to work with than old platter disks.
by teddyh on 6/20/21, 7:52 PM
by dataflow on 6/20/21, 7:04 PM
by riobard on 6/21/21, 1:56 AM
> A drive can be over-provisioned simply by formatting it to a logical partition capacity smaller than the maximum physical capacity. The remaining space, invisible to the user, will still be visible and used by the SSD controller.
Does the controller read the partition table to decide that the space beyond the logical partition is safe to use as spare area?
by dan-robertson on 6/20/21, 8:58 PM
by Agentlien on 6/21/21, 7:20 AM
Near the beginning they talk about how targeting the PlayStation 5, which has an SSD, drastically changed how they went about making the game.
In short, the quick data transfer meant they were CPU bound rather than disk bound and could afford to have a lot of uncompressed data streamed directly into memory with no extra processing before use.
by 1_player on 6/20/21, 7:35 PM
by 2OEH8eoCRo0 on 6/21/21, 11:15 AM
And where did the word "drive" come from? I thought it referred to motors that spin the media, which SSDs also do not have.
by DrNuke on 6/20/21, 8:15 PM
by personjerry on 6/20/21, 7:00 PM
by mikewarot on 6/21/21, 4:18 AM
by ropeladder on 6/21/21, 1:19 AM
by rectang on 6/20/21, 8:40 PM
by CoolGuySteve on 6/20/21, 7:18 PM
However, random read performance is still only somewhere between a third and half as fast as sequential reads, versus a magnetic disk, where random reads are often a tenth as fast.
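The ratio the comment describes is easy to measure yourself. Below is a rough sketch (not a rigorous benchmark: the OS page cache will inflate numbers unless the file is much larger than RAM or the cache is dropped first). It times block-sized reads at sequential versus random offsets; `os.pread` is POSIX-only.

```python
import os
import random
import time


def read_throughput(path, block=4096, n=2048, sequential=True):
    """Time n block-sized reads at sequential or random aligned offsets.

    Returns approximate bytes per second. POSIX-only (uses os.pread).
    """
    size = os.path.getsize(path)
    if sequential:
        offsets = [i * block for i in range(n)]
    else:
        # Random block-aligned offsets within the file.
        offsets = [random.randrange(0, size - block) // block * block
                   for _ in range(n)]
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    for off in offsets:
        os.pread(fd, block, off)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return n * block / elapsed


# Usage sketch ("/tmp/bigfile" is a placeholder for a large existing file):
# seq = read_throughput("/tmp/bigfile")
# rnd = read_throughput("/tmp/bigfile", sequential=False)
# print(f"random/sequential ratio: {rnd / seq:.2f}")
```

On a typical SSD the ratio lands well above what a spinning disk manages, since there is no seek penalty, only the loss of readahead and internal parallelism.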
by wly_cdgr on 6/20/21, 10:37 PM
by BatteryMountain on 6/21/21, 9:23 AM
The numbers involved were insane, and I played with various scenarios: with/without compression (MessagePack feature), with/without the typeless serializer (MessagePack feature), with/without async, and the difference between sync vs async and forcing disk flushes. I also weighed the difference between writing one fat file (append only) versus millions of small files, and the difference between using .NET streams versus File.WriteAllBytes (a C# feature; an all-in-memory operation, good for small writes, bad for bigger files or async serialization + writing). I also varied the number of objects involved (100K, 1M, 10M, 50M).
I cannot remember all the numbers involved, but I still have the code for all of it somewhere, so maybe I can write a blog post about it. But I do remember being utterly stunned by how fast it actually was to freeze my application state to disk and to thaw it again (the class name was Freezer :p).
The whole reason was that I started using ZFS and read up a bit on how it works. I also have some idea of how SSDs work, how serialization and writing to disk work (streams, etc.), and a rough idea of how MySQL, Postgres, and SQL Server save their data files to disk and what kinds of compromises they make.
So one day, frustrated with my data access layers, it dawned on me to try building my own storage engine for fun. I started by generating millions of objects in memory, which I then serialized with MessagePack using a Parallel.ForEach (C# feature) to a Samsung 970 EVO Plus to see how fast it would be. It blew my mind, and I still don't trust that code enough to use it in production, but it does work.
Another reason I tried it was that at work we have some Postgres tables with 60M+ rows that are getting slow, and I'm convinced we have a bad data model plus too many indexes, and that 60M rows is not too much. (Since then we've partitioned the hell out of it in multiple ways, but that is a nightmare of its own, since I still think we sliced the data the wrong way, according to my intuition about where the data has natural boundaries; time will tell who was right.)
So I do believe there is a space in the industry where SSDs, paired with certain file systems and certain file sizes and chunking, will completely leave SQL databases in the dust, purely because of how those things work together. I haven't put my code out in public yet and have only told one other dev about it, mostly because it is basically sacrilege to go against the grain in our community, and saying "I'm going to write my own database engine" sounds nuts even to me.
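The freeze/thaw idea the commenter describes is in C# with MessagePack; as a hedged illustration only, here is the same shape in Python with the standard library's `pickle` standing in for MessagePack, and the class name `Freezer` borrowed from the comment. One fat append-only file, one sequential write pass, which is the SSD-friendly pattern the thread keeps coming back to.

```python
import pickle


class Freezer:
    """Toy sketch of the comment's 'Freezer': dump application state to
    one fat file sequentially and load it back. pickle stands in for
    MessagePack; this is an illustration, not the commenter's code."""

    def __init__(self, path):
        self.path = path

    def freeze(self, objects):
        # One sequential write pass with a large buffer: the kind of
        # append-only, big-chunk I/O that SSDs (and ZFS) handle well.
        with open(self.path, "wb", buffering=1024 * 1024) as f:
            for obj in objects:
                pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)

    def thaw(self):
        # Stream the objects back one at a time.
        with open(self.path, "rb", buffering=1024 * 1024) as f:
            while True:
                try:
                    yield pickle.load(f)
                except EOFError:
                    break
```

A real engine would need checksums, crash recovery, and an index, which is roughly where the compromises the commenter mentions (MySQL/Postgres/SQL Server) come from.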
by BrissyCoder on 6/21/21, 12:00 AM