by milar on 3/13/25, 4:46 PM with 153 comments
by bddicken on 3/13/25, 5:19 PM
by bob1029 on 3/13/25, 6:05 PM
Latency is king in all performance matters, especially those where items must be processed serially. Running SQLite on NVMe provides a latency advantage that no other provider can offer. I don't think running in memory is even a substantial uplift over NVMe persistence for most real-world use cases.
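To put a rough number on this, here is a minimal sketch of measuring serial point-read latency with SQLite. The path and schema are invented, and unless the database is much larger than RAM the result mostly reflects SQLite plus page-cache overhead rather than the NVMe device itself:

    import random, sqlite3, time

    DB_PATH = "/mnt/nvme/app.db"   # hypothetical file on an NVMe-backed filesystem
    N_ROWS, N_READS = 100_000, 10_000

    conn = sqlite3.connect(DB_PATH)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (id INTEGER PRIMARY KEY, val BLOB)")
    conn.executemany("INSERT OR IGNORE INTO kv (id, val) VALUES (?, ?)",
                     ((i, b"x" * 256) for i in range(N_ROWS)))
    conn.commit()

    # Serial point reads: each query must finish before the next one starts,
    # so per-read latency directly bounds throughput.
    start = time.perf_counter()
    for _ in range(N_READS):
        conn.execute("SELECT val FROM kv WHERE id = ?",
                     (random.randrange(N_ROWS),)).fetchone()
    elapsed = time.perf_counter() - start
    conn.close()

    print(f"{elapsed / N_READS * 1e6:.1f} µs per point read")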
by magicmicah85 on 3/13/25, 6:23 PM
by robotguy on 3/13/25, 7:52 PM
Mel never wrote time-delay loops, either, even when the balky Flexowriter
required a delay between output characters to work right.
He just located instructions on the drum
so each successive one was just past the read head when it was needed;
the drum had to execute another complete revolution to find the next instruction.
[0] https://pages.cs.wisc.edu/~markhill/cs354/Fall2008/notes/The...
by jhgg on 3/13/25, 6:07 PM
Our workaround was this: https://discord.com/blog/how-discord-supercharges-network-di...
by gz09 on 3/13/25, 7:07 PM
Having recently added support for storing our incremental indexes in https://github.com/feldera/feldera on S3/object storage (we have had NVMe for longer, due to the obvious performance advantages mentioned in the previous article), we'd be happy for someone to disrupt this space with a better offering ;).
by __turbobrew__ on 3/13/25, 7:47 PM
1. Some systems do not support replication out of the box. Sure, your Cassandra cluster and MySQL can do master-slave replication, but lots of systems cannot.
2. Your life becomes much harder with NVMe storage in the cloud, since you need to respect maintenance intervals and cloud-initiated drains. If you do not hook into those systems and drain your data to a different node, the data goes poof. Separating storage from compute lets the cloud operator drain and move compute around as needed: because the data is independent of the compute, and the operator manages that data system and its draining as well, workload placements can be handled without the customer needing to be involved.
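On point 2, those hooks do exist on AWS: an instance can poll the instance metadata service for spot interruption notices and scheduled maintenance events and start draining its local data in response. A rough sketch, assuming IMDSv2 on an EC2 instance; drain_to_peer is a hypothetical hook, and the exact endpoint behaviour should be checked against the AWS docs:

    import json, time, urllib.error, urllib.request

    IMDS = "http://169.254.169.254"

    def imds_token():
        req = urllib.request.Request(
            IMDS + "/latest/api/token", method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "300"})
        with urllib.request.urlopen(req, timeout=2) as resp:
            return resp.read().decode()

    def imds_get(path, token):
        req = urllib.request.Request(
            IMDS + path, headers={"X-aws-ec2-metadata-token": token})
        try:
            with urllib.request.urlopen(req, timeout=2) as resp:
                return resp.read().decode()
        except urllib.error.HTTPError:
            return None  # 404: no event pending

    def drain_to_peer():
        """Hypothetical: re-replicate local NVMe data to another node."""
        ...

    while True:
        token = imds_token()
        spot = imds_get("/latest/meta-data/spot/instance-action", token)
        maint = imds_get("/latest/meta-data/events/maintenance/scheduled", token)
        if spot or (maint and json.loads(maint)):
            drain_to_peer()
            break
        time.sleep(5)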
by tonyhb on 3/13/25, 5:47 PM
by CSDude on 3/14/25, 6:02 AM
I get that local disks are finite, but I think the core/memory/disk ratio would be good enough for most use cases, no? There are plenty of local-disk instance types with different ratios as well, so a good balance could be found. You could even use instances with 20TB+ local hard disks to implement hot/cold storage.
Big kudos to the PlanetScale team for finally doing what makes sense. Even AWS themselves don't run Elasticsearch on local disks! Imagine running ClickHouse, Cassandra, all of that on local disks.
by ucarion on 3/13/25, 6:34 PM
On:
> Another issue with network-attached storage in the cloud comes in the form of limiting IOPS. Many cloud providers that use this model, including AWS and Google Cloud, limit the amount of IO operations you can send over the wire. [...]
> If instead you have your storage attached directly to your compute instance, there are no artificial limits placed on IO operations. You can read and write as fast as the hardware will allow for.
I feel like this might be a dumb series of questions, but:
1. The rate limit on "IOPS" is precisely a rate limit on a particular kind of network traffic, right? Namely, traffic to/from an EBS volume? "IOPS" really means "EBS volume network traffic"?
2. Does this save me money? And if yes, is it from some weird AWS arbitrage? Or is it more because of an efficiency win from doing less EBS networking?
I can see pretty clearly that putting storage and compute on the same machine is strictly a latency win, because you structurally have one less hop every time. But is it also a throughput-per-dollar win?
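On 1 and 2: with a local NVMe device there is no provisioned per-volume IOPS quota to pay for, only what the hardware delivers, so whether it is a throughput-per-dollar win depends on how much provisioned IOPS you would otherwise be buying. If you want to measure it yourself, a rough random-read probe (Linux-only; the path is made up, and the target file should be pre-created and much larger than RAM):

    import mmap, os, random, time

    PATH = "/mnt/nvme/testfile"   # hypothetical pre-created file, ideally >> RAM
    BLOCK, N_READS = 4096, 10_000

    # O_DIRECT (Linux-only) bypasses the page cache so we measure the device.
    fd = os.open(PATH, os.O_RDONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, BLOCK)    # anonymous mmap gives the page-aligned buffer O_DIRECT needs
    blocks = os.path.getsize(PATH) // BLOCK

    start = time.perf_counter()
    for _ in range(N_READS):
        os.preadv(fd, [buf], random.randrange(blocks) * BLOCK)
    elapsed = time.perf_counter() - start
    os.close(fd)

    print(f"{N_READS / elapsed:,.0f} IOPS at queue depth 1, "
          f"{elapsed / N_READS * 1e6:.1f} µs mean latency")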
by myflash13 on 3/14/25, 8:15 AM
edit: apparently they've built a Kafkaesque layer of caching. No thank you, I'll just keep my data on locally attached NVMe.
by vessenes on 3/13/25, 5:48 PM
by pjdesno on 3/13/25, 9:01 PM
One small nit:
> A typical random read can be performed in 1-3 milliseconds.
Um, no. A 7200 RPM platter completes a rotation in 8.33 milliseconds, so rotational delay for a random read is uniformly distributed between 0 and 8.33ms, i.e. mean 4.16ms.
> a single disk will often have well over 100,000 tracks
By my calculations, a Seagate IronWolf 18TB has about 615K tracks per surface, given that it has 9 platters (18 surfaces) and an outer-diameter read speed of about 260MB/s (or 557K tracks/inch, given typical inner and outer track diameters).
For more than you ever wanted to know about hard drive performance and the mechanical/geometrical considerations that go into it, see https://www.msstconference.org/MSST-history/2024/Papers/msst...
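For anyone curious how the 615K figure falls out, here is a back-of-the-envelope version of that calculation. The inner/outer diameter ratio is an assumed value, and bytes per track are taken to scale linearly with circumference:

    RPM          = 7200
    CAPACITY_B   = 18e12      # 18 TB drive
    SURFACES     = 18         # 9 platters, 2 surfaces each
    OD_SPEED_BPS = 260e6      # outer-diameter sequential read, bytes/s
    ID_OD_RATIO  = 0.5        # assumption: inner track diameter ~ half the outer

    revs_per_sec       = RPM / 60
    bytes_outer_track  = OD_SPEED_BPS / revs_per_sec                # ~2.17 MB
    bytes_avg_track    = bytes_outer_track * (1 + ID_OD_RATIO) / 2  # ~1.63 MB
    tracks_per_surface = (CAPACITY_B / SURFACES) / bytes_avg_track

    print(f"{tracks_per_surface:,.0f} tracks per surface")          # ~615,000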
by jgalt212 on 3/13/25, 9:01 PM
by rsanheim on 3/13/25, 8:04 PM
by cmurf on 3/13/25, 5:04 PM
by carderne on 3/14/25, 12:24 PM
It seems like they don't emphasise strongly enough: _make sure you colocate your server in the same cloud/AZ/region/DC as our DB_. I suspect a large fraction of their users don't realise this, and have loads of server-DB traffic happening very slowly over the public internet. It won't take many slow DB reads (get session, get a thing, get one more) to trash your server's response latency.
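The arithmetic is brutal for chatty request handlers; with made-up but plausible numbers:

    # Assumed numbers, for illustration only.
    QUERIES_PER_REQUEST = 3        # get session, get a thing, get one more
    QUERY_TIME_MS = 0.5            # time the DB itself spends per query

    def db_wait_ms(rtt_ms):
        # Sequential (non-pipelined) queries pay the round trip every time.
        return QUERIES_PER_REQUEST * (rtt_ms + QUERY_TIME_MS)

    print(db_wait_ms(0.5))   # same AZ: ~3 ms of DB waiting per request
    print(db_wait_ms(40.0))  # cross-region over the internet: ~121 ms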
by cynicalsecurity on 3/13/25, 8:17 PM
by anonymousDan on 3/14/25, 12:03 AM
by bloopernova on 3/13/25, 6:20 PM
by SAI_Peregrinus on 3/14/25, 2:54 PM
There were a few storage methods in between tape & HDDs, notably core memory & magnetic drum memory.
by samwho on 3/13/25, 7:54 PM
by gozzoo on 3/13/25, 6:25 PM
by aftbit on 3/13/25, 6:29 PM
by TheAnkurTyagi on 3/14/25, 11:17 AM
by r3tr0 on 3/13/25, 7:50 PM
You can check out our sandbox here:
by liweixin on 3/14/25, 10:11 AM
by dangoodmanUT on 3/13/25, 9:44 PM