by rafaelferreira on 12/27/24, 10:42 AM with 21 comments
by xrd on 12/27/24, 12:40 PM
Many of the APIs and LLM extensions provided by AI companies are written by ML engineers who do not have Phil's decades of experience in distributed systems, databases, and networking. That is evident after reading this article, which is the first time I've seen a coherent discussion of the tools and tradeoffs involved in building agentic systems.
I've struggled to actually build something useful with the "agentic" systems and tools out there (and I've tried a lot). Deep down I've felt intimidated by the dozens of new terms the docs use, and on reflection, those tech marketing pieces give the vibe that they were written primarily by AI and prompted to be colorful rather than clear and precise. These solutions from billion-dollar-valued companies must present "brand new" ideas to justify their valuations. We should know better: everything builds on decades of research and discovery. If you see something flying high in the clouds (and not standing on the shoulders of giants), it is sure to fall back to earth soon.
A great read. I'm very excited about Outropy.
by svilen_dobrev on 12/27/24, 12:23 PM
This is what the long-running transactions of the past have become... and they are slowly covering all their old ground (initially Cadence by Uber, then Temporal). Zillions of little flows that can step through their FSMs at any speed: (milli)seconds, or days, or months, or whenever.
I wonder, though, how much further developments like Cloudflare's Durable Objects, or the similar recently announced Rivet actions [1], would simplify (or complicate) matters, especially in this "agentic" case?
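The durable-flow idea above can be sketched in plain Python: a tiny FSM whose state is persisted after every transition, so the "flow" can be resumed by a later process milliseconds or months later. All names here (Flow, STATES, advance) are illustrative, not from Cadence, Temporal, or Durable Objects:

```python
import json
import os
import tempfile

# A hypothetical four-state flow; real workflow engines persist much richer
# state (event history, timers), but the resumability idea is the same.
STATES = ["created", "enriched", "summarized", "done"]

class Flow:
    def __init__(self, path):
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)["state"]  # resume where we left off
        else:
            self.state = "created"
            self._save()

    def _save(self):
        with open(self.path, "w") as f:
            json.dump({"state": self.state}, f)

    def advance(self):
        i = STATES.index(self.state)
        if i < len(STATES) - 1:
            self.state = STATES[i + 1]
            self._save()  # durable: a crash after this point loses no progress
        return self.state

path = os.path.join(tempfile.mkdtemp(), "flow.json")
f1 = Flow(path)
f1.advance()            # "created" -> "enriched", persisted
f2 = Flow(path)         # a separate, later process resumes from disk
print(f2.advance())     # prints "summarized"
```

Because every transition hits storage before returning, nothing about the process's uptime constrains how slowly the flow moves.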
by nikolayasdf123 on 12/27/24, 1:32 PM
> Agents naturally align with OOP principles: they maintain encapsulated state (their memory), expose methods (their tools and decision-making capabilities via inference pipelines), and communicate through message passing
It does sound like a service (memory = db, methods + messages = api); it is just a question of the level of isolation/deployment you need.
UPD: also, how come your services share a database layer? Maybe the scaling problems are not due to agents at all. Do you have scaling issues even without agents? I would not be surprised! Classic rule from Bezos's 2002 Amazon API mandate: "no shared db between services; all communication happens over exposed interfaces and over the network".
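The service analogy in the quoted passage can be made concrete with a minimal sketch: encapsulated state standing in for memory, and message passing as the only exposed interface. Everything here (AgentService, the message shapes) is hypothetical, not from the article:

```python
# An "agent" viewed as a service: private state (memory ~ db) plus an
# exposed interface (methods + messages ~ api). No other code touches
# _memory directly, mirroring the "no shared db" rule.

class AgentService:
    def __init__(self):
        self._memory = []                     # encapsulated state, never shared

    def handle(self, message: dict) -> dict:  # the only way in: message passing
        self._memory.append(message)
        if message["type"] == "ask":
            return {"type": "answer", "seen": len(self._memory)}
        return {"type": "ack"}

agent = AgentService()
agent.handle({"type": "observe", "text": "user joined"})
reply = agent.handle({"type": "ask", "text": "how many events?"})
print(reply)  # prints {'type': 'answer', 'seen': 2}
```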
by theptip on 12/27/24, 6:43 PM
> This model broke down when we added backpressure and resilience patterns to our agents. We faced new challenges: what happens when the third of five LLM calls fails during an agent’s decision process? Should we retry everything? Save partial results and retry just the failed call? When do we give up and error out?
> We first looked at ETL tools like Apache Airflow. While great for data engineering, Airflow’s focus on stateless, scheduled tasks wasn’t a good fit for our agents’ stateful, event-driven operations.
> I’d heard great things about Temporal from my previous teams at DigitalOcean. It’s built for long-running, stateful workflows, offering the durability and resilience we needed out of the box.
I would also have reached for workflow engines here. But I wonder if Actor frameworks might actually be the sweet spot; something like Erlang's distributed actor model could be a good fit. I'm not familiar with a good distributed Actor framework for Python but there's of course Elixir, Actix, Akka in other stacks.
Coming from the other direction, I'm not surprised that Airflow isn't fit for this purpose, but I wonder if one of the newer generation of ETL engines like Dagster would work? Maybe the workflow here just involves too many pipelines (one per customer per Agent, I suppose), and too many Sensor events (each Slack message would get materialized, not sure if that's excessive). Could be a fairly substantial overhaul to the architecture vs. Temporal, but I'd be interested to know if anyone has experimented with this option for AI workflows.
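The actor idea mentioned above can be sketched in a few lines of stdlib Python: each actor owns its state and processes one mailbox message at a time, so no locks are needed. This is a single-process toy (Erlang/Akka add distribution, supervision, and restarts on top); all names are illustrative:

```python
import queue
import threading

class Actor:
    """A minimal actor: private state plus a mailbox drained by one thread."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.count = 0                        # private state, never shared
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:                   # poison pill: shut down cleanly
                break
            self.count += 1                   # serialized access: no locking

    def send(self, msg):
        self.mailbox.put(msg)                 # the only way to reach the actor

a = Actor()
for word in ["hello", "agent", "world"]:
    a.send(word)
a.send(None)
a._thread.join()                              # wait for the mailbox to drain
print(a.count)  # prints 3
```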
by karmasimida on 12/27/24, 1:21 PM
But considering how limited the RPM/TPM quotas are for mainstream LLMs, state saving/loading is hardly the bottleneck, I feel.
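A back-of-envelope version of this point: under a requests-per-minute cap, the forced spacing between calls dwarfs a millisecond-scale state checkpoint. Both figures below are assumptions for illustration, not any vendor's actual quota or any measured write latency:

```python
rpm_limit = 500                      # hypothetical API quota (requests/minute)
min_interval_s = 60 / rpm_limit      # minimum spacing the quota forces per call
state_write_s = 0.001                # assumed ~1 ms to checkpoint agent state

print(f"per-call budget from rate limit: {min_interval_s * 1000:.0f} ms")
print(f"state save/load overhead:        {state_write_s * 1000:.0f} ms")
print(f"overhead share of the budget:    {state_write_s / min_interval_s:.1%}")
```

At these (assumed) numbers the quota allows one call every 120 ms while the checkpoint costs about 1 ms, i.e. under 1% of the per-call budget.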
by upghost on 12/27/24, 12:53 PM
It's extremely refreshing to hear an actual engineering conversation around LLMs that doesn't sound like it came out of the pages of an undergraduate alchemy notebook.
by AYBABTME on 12/27/24, 2:15 PM
Orchestrating a bunch of LLM calls is a perfect fit for Temporal.
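The "third of five LLM calls fails" question quoted earlier in the thread has a simple shape once each step's result is checkpointed: retry only the failed step, never the whole sequence. A plain-Python sketch, with flaky_llm_call simulating a transient API failure (both it and the checkpoint dict are made up; Temporal gives you durable versions of the same idea):

```python
import random

def flaky_llm_call(step: int) -> str:
    """Stand-in for an LLM API call that sometimes fails transiently."""
    if random.random() < 0.3:
        raise RuntimeError(f"step {step} failed")
    return f"result-{step}"

def run_pipeline(steps=5, max_retries=20):
    checkpoint = {}                       # step index -> saved partial result
    for step in range(steps):
        for _attempt in range(max_retries):
            try:
                checkpoint[step] = flaky_llm_call(step)
                break                     # partial result saved; move on
            except RuntimeError:
                continue                  # retry ONLY this step, not 0..step-1
        else:
            raise RuntimeError(f"gave up on step {step}")  # error-out policy
    return [checkpoint[s] for s in range(steps)]

print(run_pipeline())
```

The three choices from the quoted passage map directly onto this structure: "retry everything" would drop the checkpoint dict, "retry just the failed call" is the inner loop, and "give up and error out" is the retry budget.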
by iandanforth on 12/27/24, 2:29 PM
by jarbus on 12/27/24, 12:59 PM
by xwowsersx on 12/27/24, 9:48 PM
by asah on 12/27/24, 2:33 PM