by donmcc on 10/13/21, 5:15 PM with 11 comments
by jhallenworld on 10/13/21, 7:10 PM
The main difference is that it computes the tables at startup from the sorted Unicode intervals. So the construction code has to be fast. The same code is also used for user character classes in regular expressions.
Anyway, it builds them in two passes. In the first pass it de-duplicates nodes, but only the previously constructed node is a candidate for de-duplication. This keeps memory usage low during construction. De-duplicated nodes can still be modified during construction, so they may need to be re-duplicated (a reference counter determines when this happens).
In the second pass (after all data is loaded and no more changes are allowed), it globally de-duplicates the leaf nodes using a hash table. Many of the leaf nodes are duplicates (and not just the all-zero ones).
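To make the two passes concrete, here is a rough sketch in Python. It is not the actual implementation: it assumes a simple two-level table with 256-entry leaves holding a 0/1 membership flag, the names `build_pass1`, `dedup_pass2`, and `contains` are made up for illustration, and it skips the reference counting and re-duplication of modified nodes entirely.

```python
BLOCK = 256        # code points per leaf (assumed size, for illustration)
MAX_CP = 0x110000  # one past the last Unicode code point


def build_pass1(intervals):
    """Pass 1: build leaves from sorted, non-overlapping (first, last)
    inclusive intervals. Only the previously built leaf is a candidate
    for de-duplication."""
    leaves = []  # leaf contents, each a bytes object of length BLOCK
    index = []   # index[block number] -> position in leaves
    it = iter(intervals)
    cur = next(it, None)
    for block in range(MAX_CP // BLOCK):
        lo = block * BLOCK
        hi = lo + BLOCK - 1
        leaf = bytearray(BLOCK)
        while cur is not None and cur[0] <= hi:
            first, last = cur
            for cp in range(max(first, lo), min(last, hi) + 1):
                leaf[cp - lo] = 1
            if last > hi:
                break  # interval continues into the next block
            cur = next(it, None)
        leaf = bytes(leaf)
        if leaves and leaves[-1] == leaf:
            index.append(len(leaves) - 1)  # reuse the previous leaf
        else:
            leaves.append(leaf)
            index.append(len(leaves) - 1)
    return index, leaves


def dedup_pass2(index, leaves):
    """Pass 2: globally de-duplicate leaves with a hash table (a dict)."""
    seen = {}    # leaf contents -> position in unique
    unique = []
    remap = []   # old leaf position -> new leaf position
    for leaf in leaves:
        pos = seen.get(leaf)
        if pos is None:
            pos = len(unique)
            seen[leaf] = pos
            unique.append(leaf)
        remap.append(pos)
    return [remap[i] for i in index], unique


def contains(index, leaves, cp):
    """Look up one code point in the two-level table."""
    return leaves[index[cp // BLOCK]][cp % BLOCK] == 1
```

With a few scattered full blocks in the input, pass 1 leaves several identical all-zero and all-one leaves behind, because each one is separated from its twin by a different leaf. Pass 2 collapses them.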
by willvarfar on 10/13/21, 6:09 PM
by edflsafoiewq on 10/13/21, 6:05 PM
by kaetemi on 10/14/21, 9:59 AM
https://github.com/ryzom/ryzomcore/blob/core4/nel/src/misc/s...
by edflsafoiewq on 10/13/21, 7:03 PM
https://github.com/bellard/quickjs/blob/b5e62895c619d4ffc75c...
by dhsysusbsjsi on 10/13/21, 9:25 PM