by czx111331 on 7/29/24, 12:24 PM with 29 comments
by wim on 7/29/24, 1:57 PM
There is one operation to change the tree, called insmov (move-or-insert). Whenever a client is online it can sync changes C to a server. Whenever the server has remote changes for us, it will send us back a list R of all changes since our last sync in a global linear order. We then undo any of the insmovs in our changeset C, and (re)apply all changes in R + any new changes we didn't sync yet.
We don't use any fractional indices though. Instead, our insmov tuple not only contains a parent P, but also a previous sibling guid A. Because all tree ops will eventually be applied in the global linear order as determined by the server, "sorting" is handled by just using the insmov operation.
Most of the time the undo'ing of operations is not needed though. Only when the server has insmov changes we don't know about while we are sending new insmovs ourselves do we need to ensure we replay the operations in the correct order. That's likely to happen when you reconnect to wifi after a long flight, but not so likely when updates are pushed in real-time over websocket when you're online (plus it's not needed for non-insmov operations, like updating text).
by rwieruch on 7/29/24, 6:12 PM
After I finished the project, I kinda knew why Google Drive only allows to display and modify on the same hierarchical level. There are so many constraints that you have to consider when implementing this in a nested view with many nodes.
by koromak on 7/29/24, 1:48 PM
Making safe updates via regular REST is difficult, as you have to make sure someone with two tabs open isn't going to make an update on tab 1, then another on tab 2 which puts the overall profile into an invalid state. And in general, ordering matters. Skipping an update serverside that was properly applied clientside could break things.
The dumb-as-rocks solution I came up with is to just send the minimal amount of data over that can completely overwrite a particular chunk of state, and place it behind a queue. Usually thats fine, but sometimes thats a lot of wasted data, 50KB when the actual change was only a couple bytes.
I don't need CRDTs for any of the regular reasons, but it seems like it would make state management a million times easier, even for a single user. For one, I'd get syncing between a user's browser tabs, which is good. But moreover, I can just make simple changes to frontend state, and trust that the CRDT is going to negotiate it properly with the server. I no longer have to deal with it myself.
Does this make sense? Or is the overhead required to make something like Yjs work not worth it when I don't even need multiplayer and local-first.
by lewisjoe on 7/29/24, 1:30 PM
Concurrent conflicts in such cases are notoriously hard to converge without contextual special handling [1]. Does this implementation generalize a solution for such use-cases?
I guess it should be possible to combine a list(or string) CRDT for leaf nodes (i.e text blocks) and use this tree CRDT for structural nodes (lists & tables).
But that will make augmenting every op with two-dimensional address (parent-id, index_offset_into_that_parent)
by billconan on 7/29/24, 1:56 PM
by curtisblaine on 7/29/24, 5:11 PM