by trojanalert on 8/21/23, 5:58 AM with 2 comments
by thamer on 8/21/23, 9:31 PM
Some will add a hash function around the key, but this only addresses part of the problem. For example, if you started with N shards placed by hash(key) mod N and ever need to add one, you will need to move all but roughly 1/N of the keys to new shards. And it's not just about permanently growing the number of shards: other maintenance operations, such as replacing a host, can require you to redistribute the data depending on how replication is set up.
Consistent hashing can often drastically reduce the number of keys to shuffle, but in any case key placement is worth spending some time on early and getting right, rather than paying later for having overlooked its impact.
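
To make the difference concrete, here is a rough sketch comparing how many keys change shards when going from N to N+1 shards under plain modulo placement versus a simple consistent-hash ring. Everything in it is an assumption for illustration, not something from the comment above: the `HashRing` class, the 100 virtual nodes per shard, the SHA-256 based hash, and the synthetic `user:<i>` keys. A production ring would also have to deal with replication and weighting.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def modulo_shard(key: str, num_shards: int) -> int:
    """Naive placement: hash the key, then take it modulo the shard count."""
    return h(key) % num_shards


class HashRing:
    """Hypothetical minimal consistent-hash ring with virtual nodes."""

    def __init__(self, shards, vnodes=100):
        # Each shard owns `vnodes` pseudo-random points on the ring.
        self._points = sorted(
            (h(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._hashes = [point for point, _ in self._points]

    def shard_for(self, key: str):
        # A key belongs to the first point clockwise from its hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._hashes, h(key)) % len(self._hashes)
        return self._points[idx][1]


def moved_fraction(keys, before, after) -> float:
    """Fraction of keys whose shard differs between two placement functions."""
    moved = sum(1 for k in keys if before(k) != after(k))
    return moved / len(keys)


N = 8  # assumed starting shard count
keys = [f"user:{i}" for i in range(20000)]  # synthetic keys

modulo_moved = moved_fraction(
    keys,
    lambda k: modulo_shard(k, N),
    lambda k: modulo_shard(k, N + 1),
)

old_ring = HashRing(range(N))
new_ring = HashRing(range(N + 1))
ring_moved = moved_fraction(keys, old_ring.shard_for, new_ring.shard_for)

print(f"modulo:          {modulo_moved:.1%} of keys moved")
print(f"consistent ring: {ring_moved:.1%} of keys moved")
```

With modulo placement the moved fraction should land near N/(N+1), while the ring should move something close to 1/(N+1) of the keys, all of them onto the newly added shard. The exact ring figure varies with the number of virtual nodes and with the hash function.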