by code_monk666 on 3/9/25, 6:31 AM with 6 comments
by pwnna on 3/9/25, 6:06 PM
https://github.com/Shopify/ghostferry/blob/main/tlaplus/ghos...
I also was able to find an concrrency bug before a single line of code was written with the TLC which saved a lot of time. It took about 4 weeks to design and verify the system in spec and about 2 weeks to write the initial code version, which mostly survived to this day and reasonably resembles the TLA+ spec. To my knowledge (I no longer work there) the correctness of the system was never violated and it never had any sort of data corruption. Would be a much harder feat without TLA+.
by agentultra on 3/11/25, 2:25 PM
Since then I've used it for similar purposes as this. I try to share with my team when I do this kind of work. But often folks are highly resistant to it so I use it on my own when I'm working on a hard project.
Spending the time up-front to work out the design is often more cost effective in terms of time and money than iterating in production towards a, "more correct design." Getting it right first sounds hard, and it's challenging, but it's worth it for the kinds of projects where mistakes are costly, difficult to diagnose and to fix.
And I think more software developers are starting to realize this [0] (even if they're using different methods).
[0]: https://www.youtube.com/watch?v=w3WYdYyjek4&t=1333s
Update: grammar/spelling
by bugarela on 3/12/25, 11:42 AM
This looks like a great example, I'll try to find some time to write a version of it in Quint. I have mentioned DB migration as an example of two phase commit usage before, but never as a standalone spec like this. It's definitely the kind of problem that made me anxious in the past, which means it's a good fit for formal verification :)
by asah on 3/11/25, 2:11 PM
https://chatgpt.com/share/67d0390c-4208-8007-a39d-8d9bfa7886...