by adam_ellsworth on 6/26/22, 6:51 AM with 5 comments
While for GH it seems easier, since commit hashes are sufficiently long/unique, for other services whose hashes only span ~6 characters or so, there are bound to be instances where hashes collide, from a statistical standpoint. (IIRC I read about a hash collision in GH a few years ago, here on HN.)
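Back-of-the-envelope, assuming 6 hex characters (the "~6 chars" above; hex is my assumption), that's only 16^6 ~= 16.7M possible values, and the birthday bound says you reach a ~50% collision chance after only ~4,800 random hashes. A quick sketch:

    import math

    N = 16 ** 6  # assumed: 6 hex chars => ~16.7M possible short hashes

    def collision_probability(k: int) -> float:
        # Birthday approximation: P(at least one collision among k random hashes)
        return 1 - math.exp(-k * (k - 1) / (2 * N))

    print(math.sqrt(2 * N * math.log(2)))  # ~4823 items for a ~50% chance
    print(collision_probability(5000))     # ~0.53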
To my mind, these orgs have to have a suite of tools/algorithms requesting information from multiple services, checking whether or not a hash has been taken, and those processes have to optimize for time. (E.g. when a user makes a post, what's a reasonable time budget for the lookup?)
So, what considerations need to be made algorithmically to check for such collisions while keeping runtime to an acceptable minimum?
by verdverm on 6/26/22, 1:06 PM
1. The hashes are computed by git itself, outside of GitHub's control.
2. They store these in a database.
3. Conflicts are so rare that they can be ignored. A conflict only matters within the same repository; identical short hashes in different repos are not a conflict. (See the sketch below.)
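A minimal sketch of that per-repository resolution, the way git itself handles abbreviated hashes: find every full hash with the given prefix and refuse to resolve if more than one matches. The function and the in-memory "repo" here are made up for illustration, not GitHub's actual code:

    def resolve_short_hash(prefix: str, full_hashes: set[str]) -> str:
        matches = [h for h in full_hashes if h.startswith(prefix)]
        if not matches:
            raise KeyError(f"unknown revision {prefix!r}")
        if len(matches) > 1:
            # Like git: an ambiguous abbreviation is an error,
            # never silently resolved to one of the candidates.
            raise ValueError(f"short hash {prefix!r} is ambiguous: {matches}")
        return matches[0]

    repo = {
        "deadbeef0a1b2c3d4e5f60718293a4b5c6d7e8f9",
        "deadbe11aa22bb33cc44dd55ee66ff0011223344",
    }
    print(resolve_short_hash("deadbeef", repo))  # unique prefix -> full hash
    # resolve_short_hash("deadbe", repo)         # raises: ambiguous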
by frogger8 on 6/26/22, 1:53 PM
by manx on 6/26/22, 3:39 PM
by compressedgas on 6/26/22, 8:06 AM
Most use a combination of precoordination and a database with a uniqueness constraint: when an insert fails, they just generate a new value and try again.
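A minimal sketch of that insert-then-retry pattern, using SQLite's PRIMARY KEY constraint for the uniqueness check (table and column names are made up):

    import secrets
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (short_id TEXT PRIMARY KEY, body TEXT)")

    def create_post(body: str) -> str:
        while True:
            short_id = secrets.token_hex(3)  # 6 hex characters
            try:
                db.execute(
                    "INSERT INTO posts (short_id, body) VALUES (?, ?)",
                    (short_id, body),
                )
                return short_id  # insert succeeded: the id was free
            except sqlite3.IntegrityError:
                continue  # id already taken: generate a new one and retry

    print(create_post("hello world"))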
by ev1 on 6/26/22, 7:33 AM