by synthc on 2/2/25, 9:37 AM with 47 comments
by jedberg on 2/2/25, 1:13 PM
In other words, this solves the problem of getting "one-hit wonders" out of the cache quickly, but that basically already happened with the reddit workload.
The exception was Google, which would scrape old pages; that's why we shunted them to their own infrastructure and didn't cache their requests. Maybe with this algo, we wouldn't have had to do that.
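A minimal sketch of the one-hit-wonder idea, in the spirit of TinyLFU-style admission: a new key only displaces the LRU victim if it has been requested more often, so a one-off request never pushes a popular entry out. The class and the plain Counter sketch are illustrative assumptions, not reddit's setup or Caffeine's actual implementation.

    from collections import Counter, OrderedDict

    class AdmissionCache:
        """Toy LRU cache with a frequency-based admission filter (illustrative only)."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.data = OrderedDict()   # key -> value, kept in LRU order
            self.freq = Counter()       # how often each key has been requested

        def get(self, key):
            self.freq[key] += 1
            if key in self.data:
                self.data.move_to_end(key)      # mark as recently used
                return self.data[key]
            return None

        def put(self, key, value):
            self.freq[key] += 1
            if key in self.data or len(self.data) < self.capacity:
                self.data[key] = value
                self.data.move_to_end(key)
                return
            victim = next(iter(self.data))          # LRU eviction candidate
            if self.freq[key] > self.freq[victim]:  # admit only if more popular
                del self.data[victim]
                self.data[key] = value
            # otherwise the one-hit wonder never enters the cache at all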
by jbellis on 2/2/25, 1:58 PM
Shoutout to author Ben Manes if he sees this -- thanks for the great work!
by hinkley on 2/2/25, 8:45 PM
It had a clever caching algorithm that favored latency over bandwidth. It weighted hit count against size, so that given limited space it would rather keep two small records with more hits than one large record, letting it serve more records from cache overall.
For some workloads, the payload size is roughly proportional to the cost of the request to the system of record. But latency and request setup costs do tend to shift that a bit.
But the bigger problem with LRU is that some workloads eventually resemble table scans, and the moment the data set no longer fits into cache, performance falls off a very tall cliff. And not just for that query, but for all subsequent ones, since it causes cache misses for everyone else by evicting large quantities of recently used records. So you need to count frequency, not just recency.
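A minimal sketch of the size-vs-hits weighting described above, assuming a hits-per-byte score (the scoring formula and class are assumptions, not the original system): the lowest-scoring entry is evicted first, so two small, frequently hit records outlive one large, rarely hit one.

    class SizeAwareCache:
        """Evicts by hits/size so small, popular records outlive big, cold ones."""
        def __init__(self, max_bytes):
            self.max_bytes = max_bytes
            self.used = 0
            self.entries = {}  # key -> {"value": v, "size": s, "hits": h}, size in bytes (> 0)

        def get(self, key):
            entry = self.entries.get(key)
            if entry is None:
                return None
            entry["hits"] += 1
            return entry["value"]

        def put(self, key, value, size):
            if key in self.entries:                      # replacing an existing entry
                self.used -= self.entries[key]["size"]
                del self.entries[key]
            while self.entries and self.used + size > self.max_bytes:
                # Evict the entry with the worst hits-per-byte score.
                victim = min(self.entries,
                             key=lambda k: self.entries[k]["hits"] / self.entries[k]["size"])
                self.used -= self.entries[victim]["size"]
                del self.entries[victim]
            if size <= self.max_bytes:                   # skip payloads larger than the cache
                self.entries[key] = {"value": value, "size": size, "hits": 1}
                self.used += size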
by thomastay on 2/2/25, 7:38 PM
Love love love this - I really enjoy reading articles where people analyze existing high performance systems instead of just going for the new and shiny thing
by dan-robertson on 2/2/25, 8:23 PM
> Caching is all about maximizing the hit ratio
A thing I worry about a lot is discontinuities in cache behaviour. Simple example: a client polls a list of entries and downloads each entry from the list one at a time to see if it has changed. Obviously this feels like a bit of a silly way for a client to behave, but if you have a small LRU cache (e.g. it is partitioned such that partitions are small and all the requests from this client go to the same partition), then there is some threshold size where the client transitions from ~all requests hitting the cache to ~none hitting the cache.
This is a bit different from behaviours that are always bad for the cache (e.g. a search crawler fetching lots of entries once).
Am I wrong to worry about these kinds of ‘phase transitions’? Should the focus just be on optimising hit rate in the average case?
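A quick simulation of that phase transition, assuming a plain LRU cache of capacity 100 and a client that polls its N entries sequentially (the function and parameters are illustrative): the hit rate stays near 100% while N fits in the cache and collapses to 0% the moment N exceeds it.

    from collections import OrderedDict

    def lru_hit_rate(n_keys, capacity=100, rounds=50):
        cache, hits, total = OrderedDict(), 0, 0
        for _ in range(rounds):
            for key in range(n_keys):       # sequential poll of every entry
                total += 1
                if key in cache:
                    hits += 1
                    cache.move_to_end(key)
                else:
                    cache[key] = True
                    if len(cache) > capacity:
                        cache.popitem(last=False)   # evict least recently used
        return hits / total

    for n in (90, 100, 101, 110, 200):
        print(n, round(lru_hit_rate(n), 3))   # ~0.98 up to 100 keys, 0.0 beyond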
by quotemstr on 2/2/25, 6:08 PM
by nighthawk454 on 2/2/25, 7:16 PM
https://web.archive.org/web/20250202094451/https://adriacabe... (images are cached better here)
by dstroot on 2/2/25, 3:58 PM
by jupiterroom on 2/2/25, 11:26 AM
by synthc on 2/2/25, 9:37 AM
by urbandw311er on 2/2/25, 10:37 PM