by arjunnarayan on 12/12/20, 6:34 PM with 36 comments
by bob1029 on 12/12/20, 7:29 PM
I am starting to think you can handle all storage requests for a single logical node on just one core/thread. I have been pushing 5~10 million JSON-serialized entities to disk per second with a single managed thread in .NET Core (using a Samsung 970 Pro for testing). This includes indexing and sequential integer key assignment. This testing completely saturates the drive (over 1 gigabyte per second steady-state). By contrast, just incrementing a shared 64-bit integer more than a million times per second across an arbitrary number of threads is a big ask. This is the difference you see when you double down on single-threaded ideology for this type of problem domain.
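For illustration only (none of this code comes from the comment): a toy Java sketch of the contention point, comparing a plain single-threaded counter against one AtomicLong shared by every core. The absolute numbers are machine-dependent and this is not a rigorous benchmark; the gap between the two cases is what matters.

    import java.util.concurrent.atomic.AtomicLong;

    // Toy comparison: bump a 64-bit counter ten million times, first on one thread
    // with a plain long, then from every core through a shared AtomicLong. The
    // shared counter pays for cache-line ping-pong on every increment.
    public class CounterContention {
        static final long INCREMENTS = 10_000_000L;

        public static void main(String[] args) throws InterruptedException {
            // Single-threaded, uncontended increment.
            long start = System.nanoTime();
            long local = 0;
            for (long i = 0; i < INCREMENTS; i++) local++;
            System.out.printf("single thread: %,d increments in %d ms%n",
                    local, (System.nanoTime() - start) / 1_000_000);

            // Every available core hammering one shared counter.
            AtomicLong shared = new AtomicLong();
            int threads = Runtime.getRuntime().availableProcessors();
            Thread[] workers = new Thread[threads];
            start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                workers[t] = new Thread(() -> {
                    for (long i = 0; i < INCREMENTS / threads; i++) shared.incrementAndGet();
                });
                workers[t].start();
            }
            for (Thread w : workers) w.join();
            System.out.printf("%d threads, shared AtomicLong: %,d increments in %d ms%n",
                    threads, shared.get(), (System.nanoTime() - start) / 1_000_000);
        }
    }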
The technical trick to my success is to run all of the database operations in micro-batches (10~1000 microseconds each). I use LMAX Disruptor, so the batches form naturally based on throughput conditions. Selecting data structures and algorithms that work well in this type of setup is critical. Append-only is a must with flash and makes an orders-of-magnitude difference in performance. Everything else (B-tree algorithms, etc.) follows from this realization.
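The commenter's implementation is .NET (presumably Disruptor-net); as a rough sketch of the same batching mechanism, here is the canonical Java LMAX Disruptor with a single consumer thread doing append-only writes. Writes are staged in a buffer and the durable flush happens only when the handler sees endOfBatch, so batch size adapts to load. The file name, buffer sizes, and payload format are made up for illustration.

    import com.lmax.disruptor.EventHandler;
    import com.lmax.disruptor.RingBuffer;
    import com.lmax.disruptor.dsl.Disruptor;
    import com.lmax.disruptor.util.DaemonThreadFactory;

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // A single writer thread consumes events off the ring buffer and appends them to
    // a log file. Individual events are only staged; the write + fsync happens once
    // per batch, when the Disruptor reports endOfBatch.
    public class AppendOnlyWriter implements EventHandler<AppendOnlyWriter.WriteEvent>, AutoCloseable {

        public static class WriteEvent {
            byte[] payload;   // serialized entity
        }

        private final FileChannel log;
        // Staging buffer; assumes individual payloads fit within it.
        private final ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);

        public AppendOnlyWriter(Path path) throws IOException {
            this.log = FileChannel.open(path,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        }

        @Override
        public void onEvent(WriteEvent event, long sequence, boolean endOfBatch) throws IOException {
            if (buffer.remaining() < event.payload.length) {
                drain();                          // staging buffer full: write it out early
            }
            buffer.put(event.payload);
            if (endOfBatch) {
                drain();
                log.force(false);                 // one fsync per batch, not per event
            }
        }

        private void drain() throws IOException {
            buffer.flip();
            while (buffer.hasRemaining()) {
                log.write(buffer);
            }
            buffer.clear();
        }

        @Override
        public void close() throws IOException {
            log.close();
        }

        public static void main(String[] args) throws Exception {
            AppendOnlyWriter writer = new AppendOnlyWriter(Path.of("entities.log")); // hypothetical path
            Disruptor<WriteEvent> disruptor = new Disruptor<>(
                    WriteEvent::new, 1 << 16, DaemonThreadFactory.INSTANCE);
            disruptor.handleEventsWith(writer);
            RingBuffer<WriteEvent> ring = disruptor.start();

            // Producers publish serialized entities; the single consumer thread batches them.
            for (int i = 0; i < 1_000_000; i++) {
                byte[] payload = ("{\"id\":" + i + "}").getBytes();
                ring.publishEvent((event, seq, p) -> event.payload = p, payload);
            }
            disruptor.shutdown();
            writer.close();
        }
    }

At low load a batch is often a single event and latency stays low; under pressure the consumer falls a few events behind and the same flush is amortized over all of them, which is the "batches form naturally" behavior described above.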
Put another way, if you find yourself reaching for Task or async/await primitives when talking to something as fast as NVMe flash, you need to rethink your approach. The overhead of multiple threads, task-parallel abstractions, et al. is going to cripple any notion of high throughput in a synchronous storage domain.
by zinclozenge on 12/12/20, 7:24 PM
by lrossi on 12/13/20, 9:32 AM
Correct me if I’m wrong, but the only number I can find is a guarantee that you do not exceed 500 µs of latency when handling a request. And it’s not clear whether this is a guarantee at all, since you only say that the system will throw a traceback in the case of latency spikes.
I would have liked to see how latency varies under load, how much throughput you can achieve, what the latency long tail looks like on a long-running production load, and comparisons with off-the-shelf systems tuned reasonably.
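Not from the thread, but for concreteness: a minimal sketch of how those tail-latency numbers are typically gathered, assuming Gil Tene's HdrHistogram library, a fixed offered rate, and a placeholder issueRequest() standing in for the system under test.

    import org.HdrHistogram.Histogram;

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;

    // Drive the system at a fixed offered rate for a while and record per-request
    // latency into an HdrHistogram, correcting for coordinated omission so stalls
    // still show up in the tail percentiles.
    public class LatencyProbe {
        public static void main(String[] args) {
            long targetRatePerSec = 100_000;                   // hypothetical offered load
            long expectedIntervalNs = TimeUnit.SECONDS.toNanos(1) / targetRatePerSec;
            long durationNs = TimeUnit.SECONDS.toNanos(30);

            // Track values up to 10 s at 3 significant digits; recorded in nanoseconds.
            Histogram histogram = new Histogram(TimeUnit.SECONDS.toNanos(10), 3);

            long next = System.nanoTime();
            long end = next + durationNs;
            while (System.nanoTime() < end) {
                // Pace the load: one request per expected interval.
                while (System.nanoTime() < next) {
                    LockSupport.parkNanos(1_000);
                }
                long start = System.nanoTime();
                issueRequest();                                // placeholder for the system under test
                long latency = System.nanoTime() - start;
                // Coordinated-omission correction: back-fill latencies for requests
                // that should have been issued while this one was stalled.
                histogram.recordValueWithExpectedInterval(latency, expectedIntervalNs);
                next += expectedIntervalNs;
            }

            System.out.printf("p50   %,d ns%n", histogram.getValueAtPercentile(50.0));
            System.out.printf("p99   %,d ns%n", histogram.getValueAtPercentile(99.0));
            System.out.printf("p99.9 %,d ns%n", histogram.getValueAtPercentile(99.9));
            System.out.printf("max   %,d ns%n", histogram.getMaxValue());
        }

        private static void issueRequest() {
            // Stand-in for a real request against the system under test.
        }
    }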
by dotnwat on 12/12/20, 7:34 PM
by eis on 12/13/20, 5:50 AM
by mirekrusin on 12/12/20, 9:02 PM
by matthewtovbin on 12/13/20, 7:36 AM