from Hacker News

Why Payments Engineers Should Avoid State Machines

by ohduran on 9/25/24, 9:24 AM with 14 comments

by agentultra on 9/25/24, 1:11 PM
I didn’t get the analogy. I built a card payment processing stack that works on multiple partner bank backends and of course it’s built on event sourcing because accounting is.
The usual analogy here is the double-entry ledger. I’m not sure what state machines, Marco Polo, and Google Maps have to do with it.
Although I suppose you could look at your aggregates as state machines if you’re using them to validate commands before adding new events to the log.
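To make the aggregate-as-state-machine idea concrete, here is a minimal sketch, not the commenter's actual stack. The event names, the `ALLOWED` table, and the `PaymentAggregate` class are all hypothetical, and deriving the state from the last event's type is a simplification.

```python
from dataclasses import dataclass, field

# Hypothetical event/state names, chosen only for illustration.
AUTHORIZED, CAPTURED, REFUNDED = "authorized", "captured", "refunded"

# Which event types may follow which state (None = nothing has happened yet).
ALLOWED = {
    None: {AUTHORIZED},
    AUTHORIZED: {CAPTURED},
    CAPTURED: {REFUNDED},
    REFUNDED: set(),
}


@dataclass
class PaymentAggregate:
    log: list = field(default_factory=list)  # append-only event log

    @property
    def state(self):
        # The current state is derived from the log, never stored separately.
        return self.log[-1]["type"] if self.log else None

    def handle(self, command_type, amount):
        # Validate the command against the current state before appending.
        if command_type not in ALLOWED[self.state]:
            raise ValueError(f"cannot apply {command_type!r} in state {self.state!r}")
        event = {"type": command_type, "amount": amount}
        self.log.append(event)
        return event


payment = PaymentAggregate()
payment.handle(AUTHORIZED, 1000)
payment.handle(CAPTURED, 1000)
try:
    payment.handle(AUTHORIZED, 1000)  # rejected: the payment is already captured
except ValueError as err:
    rejected = str(err)
```

The state machine only guards what may be appended. The log stays the source of truth.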
by pjkundert on 9/25/24, 11:56 AM
I read this because (of course), payments engineers should only use state machines: so why would someone claim this?
It says not to hide the events that drive the state machinery: make both the events and state machines available (to both servers and clients), so everyone can reconstruct history when necessary, using the agreed-upon events and state machinery.
Which makes sense.
As it turns out, this is what Holochain does to build reliable distributed systems at scale out of unreliable components (payment systems, or anything else).
Every event stored / cached by every App DHT participant is verified by ensuring it’s valid according to the verified shared “DNA” of the App (any state machines and other arbitrary validation code). Misbehaving clients / servers self-incriminate (all events are signed and non-repudiable), and are ejected from the App by all correctly operating nodes.
By induction, a node can then reason that every prior event in the App DHT is valid, regardless of system scale — because an attacker would have to subvert all nodes to avoid having just one valid node notice and report their error to everyone.
by pjc50 on 9/25/24, 1:17 PM
This is kind of awkwardly phrased, especially because state machines or things that look like them are unavoidable in payments. You have a UI flow with state transitions.
But it also has a very valid point about events. Or rather accounting records.
Because ultimately it's a payment system, and what you're doing is updating the accounting records across at least three parties: the user, the payment service, and your own records. You do so by sending messages, and it's a good idea to keep track of the content of those messages. At which point it starts to look a lot more like an event system.
In some cases recording may be a legal requirement. French NF525, for example, mandates securely recording every receipt.
https://www2.ikosoft.com/en-gb/all-knowledge-about-the-nf525...
by hyperman1 on 9/25/24, 1:26 PM
There is something wrong with the terminology here. State machine = push based working?
I've found bitemporal databases a good way of thinking about most CRUD-style applications, including payment systems. Basically, you keep two timestamps: when something supposedly happened, and when you learned about it. This means you leave a logbook of every action around. By replaying the logbook ordered by the happened timestamp, you arrive at the correct current state. You can only append to the logbook, never modify an existing entry. You use a state machine to gather the state from the logbook, and probably cache it in another table.
So, e.g., on day 1 you make a payment: happened=day 1, learned=day 1, paid. On day 2 you find out this was an error: happened=day 1, learned=day 2, undo payment.
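The day 1 / day 2 example can be sketched roughly as follows. This is an illustration under assumptions, not a real bitemporal database: the `Entry` fields, the `"pay"` / `"undo"` action names, the amounts, and the calendar dates are all made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entry:
    happened: date  # when the action supposedly took place
    learned: date   # when the system found out about it
    action: str     # "pay" or "undo" (hypothetical action names)
    amount: int     # minor units, e.g. cents


def balance_as_known_on(logbook, known_on):
    """Replay the entries learned on or before known_on, ordered by happened date."""
    visible = [e for e in logbook if e.learned <= known_on]
    total = 0
    # Ties on the happened date are broken by learned date, so corrections apply last.
    for e in sorted(visible, key=lambda e: (e.happened, e.learned)):
        if e.action == "pay":
            total += e.amount
        elif e.action == "undo":
            total -= e.amount
    return total


day1, day2 = date(2024, 9, 1), date(2024, 9, 2)  # arbitrary example dates
logbook = []  # append-only: existing entries are never modified
logbook.append(Entry(happened=day1, learned=day1, action="pay", amount=500))
logbook.append(Entry(happened=day1, learned=day2, action="undo", amount=500))

print(balance_as_known_on(logbook, day1))  # what we believed on day 1
print(balance_as_known_on(logbook, day2))  # what we believe on day 2
```

Filtering on `learned` lets you ask what the system believed at any earlier point, which a single timestamp cannot answer.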
by blenderob on 9/25/24, 1:25 PM
> A state machine cannot reconstruct the past. It can only move forward. Payments Engineers must avoid state machines.
I don't follow. Of course a state machine cannot reconstruct the past. But you can keep a little auditing framework/logic alongside the state machine to keep track of the event history.
So I don't really follow how you can leap from "A state machine cannot reconstruct the past" to "must avoid state machines." There is no logical connection between the premise and the conclusion!
by anonzzzies on 9/25/24, 1:13 PM
I have been building payment systems for banks and other companies for 3 decades and I am quite unsure what the big revelation is here. My state machines know their history and that's that then?
by taylodl on 9/25/24, 1:29 PM
Meanwhile, this was posted here on HN just yesterday:
https://news.ycombinator.com/item?id=41639763
Also, state machines are very good at storing state transitions, AKA events. So I'm not sure what the author is going on about saying that you can't reconstruct how you got into a particular state. People do it all the time.
As far as scalability is concerned, you can't push it out to the client because the client has no legal authority. Payments, which is part of the banking system, relies on centralized services and processes having legal authority and authorization to make withdrawals and deposits.
Since someone having 10 years of experience in payments should know this stuff, I'm thinking maybe the author poorly conveyed their thoughts on the matter.
by phoe-krk on 9/25/24, 1:22 PM
> A state machine cannot reconstruct the past. It can only move forward.
Does keeping a log of state changes, in addition to the current state, transform a state machine into an event-driven system?
> You’re getting scalability by forcing the client to accept more responsibility.
How? Believing that a client is never buggy or malicious is just asking for problems. Alternatively, if "responsibility" here means simply that a client should be able to replay a history of events received from the server on its own, then this problem reduces to the question I asked in the paragraph above.
by malfist on 9/25/24, 12:49 PM
This article is a whole bunch of strawman arguments if you ask me. Author's main point seems to be that state machines _can't_ know how they got to their current state and event driven systems will _always_ know how they arrived at their current state.
But that's just nonsense. Nothing intrinsic prevents a state machine from knowing how it arrived at a state. In fact, if you use an AWS state machine with Lambda step functions, you can look up any past execution of a state machine and see all the steps it took to arrive where it did.
Furthermore, there's nothing intrinsic about event pipelines that makes them know all the actions taken to arrive somewhere. Not only that, but it's an extremely common problem in large event architectures that teams and actions are so distributed you don't actually know who all your consumers are.
by rendall on 9/25/24, 1:33 PM
We all have incoherent or incorrect notions that sometimes we even commit to writing. They do not usually reach the front page of HN, however. I'm rather embarrassed for the author.
The problem and the solution discussed are both about maintaining history, which has nothing to do with either state machines or event-driven systems, which in turn are not mutually exclusive.
by zeckalpha on 9/25/24, 1:15 PM
Hmm. The line between the two is blurry in practice.
I like to use nouns and verbs to illustrate a difference here. I think people initially approach state machines with noun nodes and verb edges. In practice, verbs are where most of the time is spent ("waiting for", "attempting to"), so you can also have a state machine where the nodes and edges are inverted. Some systems work better one way or the other.
by akie on 9/25/24, 1:28 PM
What state machines have going for them is the absolute knowledge, nay proof, that “if you are in this state, then all the previous states must have been completed”. When it’s about money, that’s kinda important.
I would use a state machine, and log the transitions (“events”, if you insist).
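A state machine that logs its transitions might look something like the sketch below. The `TRANSITIONS` table, the state and event names, and the `LoggedStateMachine` class are hypothetical. A real payment flow would have more branches (failures, refunds, chargebacks).

```python
# Hypothetical linear payment flow, for illustration only.
TRANSITIONS = {
    ("created", "authorize"): "authorized",
    ("authorized", "capture"): "captured",
    ("captured", "settle"): "settled",
}


class LoggedStateMachine:
    def __init__(self, initial="created"):
        self.state = initial
        self.transitions = []  # append-only record of (from_state, event, to_state)

    def fire(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {self.state!r}")
        new_state = TRANSITIONS[key]
        self.transitions.append((self.state, event, new_state))
        self.state = new_state

    def history(self):
        # States visited so far, reconstructed from the transition log.
        if not self.transitions:
            return [self.state]
        return [self.transitions[0][0]] + [to for _, _, to in self.transitions]


machine = LoggedStateMachine()
for event in ("authorize", "capture", "settle"):
    machine.fire(event)

print(machine.state)      # current state
print(machine.history())  # every state passed through on the way
```

Because `fire` rejects any event not in the table, "settled" can only be reached through "authorized" and "captured", and the log records that path.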
by ARothfusz on 9/25/24, 12:59 PM
Obviously, of course, a client would never _cheat_ or lie about an event. No, no, no. We can totally trust them.