by stephencoyner on 2/10/25, 7:45 PM with 82 comments
by Magmalgebra on 2/10/25, 8:44 PM
My team formed the same hypothesis as this doc ~2 years ago and built a proof of concept. It was pretty magical: we had Fortune 500 execs asking for reports on internal metrics, and they'd generate in a couple of minutes. The first week we got rave reviews - followed by an immediate round of negative feedback as we realized that ~90% of the reports were deeply wrong.
Why were they wrong? It had nothing to do with the LLMs per se; o3-mini doesn't do much better on our suite than GPT-3.5. The problem was that knowing which data to use for which query was deeply contextual.
Digging into use cases, you'd find that for a particular question you couldn't just pull all the rows from a column; you needed some obscure JOIN operation. This fact was known only by the 2 data scientists in charge of writing the report. This flavor of problem - data being messy, with the messiness documented only in a few people's brains - repeated over and over.
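To make that concrete, here's a toy version of the failure mode (hypothetical schema and numbers, nothing like our real warehouse): the query that looks right and the query the report actually needed give different answers.

    import sqlite3

    # Hypothetical schema -- illustrative only
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE deals (id INTEGER, amount REAL);
    CREATE TABLE deal_corrections (deal_id INTEGER, corrected_amount REAL);
    INSERT INTO deals VALUES (1, 100000), (2, 50000);
    -- finance re-states deal 1 after quarter close; only two analysts know this table exists
    INSERT INTO deal_corrections VALUES (1, 80000);
    """)

    # What an LLM (or a new hire) writes -- looks right, silently overstates revenue:
    naive = conn.execute("SELECT SUM(amount) FROM deals").fetchone()[0]

    # What the report actually needs -- prefer the correction when one exists:
    correct = conn.execute("""
        SELECT SUM(COALESCE(c.corrected_amount, d.amount))
        FROM deals d LEFT JOIN deal_corrections c ON c.deal_id = d.id
    """).fetchone()[0]

    print(naive, correct)  # 150000.0 vs 130000.0 -- same question, different answers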
I still work on AI powered products and I don’t see even a little line of sight on this problem. Everyone’s data is immensely messy and likely to remain so. AI has introduced a number of tools to manage that mess, but so far it appears they’ll need to be exposed via fairly traditional UIs.
by bashtoni on 2/10/25, 9:12 PM
In reality, as always, I suspect the truth will be somewhere in between. The SaaS products that succeed will be those that have a good UI _and_ a good API that LLMs can use.
An LLM is not always the best interface, particularly for data access. For most people, clicking a few times in the right places is preferable to having to type out (or even speak aloud) "Show me all the calls I did today", waiting for the result, having to follow up with "include the time per call and the expected deal value", etc etc.
There is undoubtedly an opportunity for disruption here, but I think an LLM only SaaS platform is going to be a very tough sell for at least the next decade.
by Bjorkbat on 2/10/25, 10:09 PM
It never panned out, arguably because the technology wasn't quite there yet (this was well before ChatGPT came out), but I thought the bigger problem was that people thought that a chat UI was the ultimate user interface. Just didn't feel right to me. For simple tasks, sure, but otherwise it felt like for "exploratory" tasks it made more sense to have a graphical user interface of some kind.
Same sentiments apply to the hype around agents. Even in a hypothetical world where agents work as well as any human I don't think an agent/chatbot UI is necessarily the ultimate user interface. If I'm asking an agent questions, it makes sense for it to show rather than tell in many contexts. Even in a world where agents capture much of the way we interact with computers, it might make more sense for them to show us using 3rd party SaaS apps.
by bushido on 2/10/25, 10:23 PM
This writeup seems to be authored by a senior designer at Salesforce, and I can see the motivation from their perspective. Their challenges are different from those a new SaaS product will encounter.
Like all incumbents of their era, they are at their core a database that depends on a plethora of point solutions from vendors and partners to fill the gaps their product leaves when constructing workflows. If they don't take an approach like the one discussed here – or in the linked OpenAI/SoftBank video – they risk alienating their vendors/partners, or worse, seeing them become competitors in their own right.
Disclaimer – I'm biased too, I'm building one of the upstarts that aims to compete with Salesforce.
by egypturnash on 2/10/25, 10:56 PM
You Will.
by vosper on 2/10/25, 8:47 PM
Many SaaS products (especially the complex ones, which are also the most important ones) have a tonne of UI that imposes a huge amount of non-work work onto users - all the clicking you have to do as part of entering or retrieving data, especially if the UI flow doesn't fit exactly what you're trying to do at that moment. An example might be quickly creating an epic and a bunch of related tickets in Jira, and having them all share some common components.
A generative UI would be able to construct a custom UI for the particular thing the user is trying to do at any point in time. I think it's a really powerful idea, and it could probably be done today by smartly using eg Jira's APIs.
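As a rough picture of what that glue could look like: the calls below use Jira Cloud's REST issue-creation endpoint as I remember it, but the site URL, credentials, and the epic/parent linkage are placeholders that vary by instance - treat it as a sketch, not a recipe.

    import requests

    JIRA = "https://your-site.atlassian.net/rest/api/3/issue"  # placeholder site
    AUTH = ("you@example.com", "api-token")                    # placeholder credentials

    def create_issue(fields: dict) -> str:
        resp = requests.post(JIRA, json={"fields": fields}, auth=AUTH)
        resp.raise_for_status()
        return resp.json()["key"]

    shared = {"project": {"key": "PROJ"}, "labels": ["q3-launch"]}  # the shared bits

    epic_key = create_issue({**shared, "summary": "Q3 launch",
                             "issuetype": {"name": "Epic"}})

    for title in ["Landing page", "Billing changes", "Docs update"]:
        create_issue({**shared, "summary": title,
                      "issuetype": {"name": "Task"},
                      "parent": {"key": epic_key}})  # or the Epic Link custom field, depending on setup

A generative UI could assemble and fire this kind of sequence from a single request instead of walking the user through each create screen.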
The ability to span applications would be even more powerful. Done well, it might even kill the need to maintain complex integrations between related SaaS products (eg how some product development application might need to sync data to/from Jira or ADO) by having the AI just keep track of changes and move them from one system to another.
Once it gets to the point where the Gen UI is the go-to system for interactions, you have to wonder what all the designers and UI builders at the myriad SaaS companies will be doing...
by pragmatic on 2/10/25, 10:32 PM
Who's going to bet millions of dollars that these agents are going to get it right? Based on what evidence?
by nitwit005 on 2/11/25, 12:18 AM
I'm sure some of those adjustments are reasonable, but I'm also sure this gets used to create a stack of lies to please upper management.
There are some obvious issues with putting some sort of AI in such an environment. Do you train the AI to tell the right sorts of lies?
by TranquilMarmot on 2/10/25, 11:10 PM
You can have Agents run behaviors async by attaching triggers to them, for example when you get a specific email or something gets updated in a CRM. You can also give the agent access to basically any third-party action you can think of.
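Purely as an illustration of what "attach a trigger to an agent" means (this is hypothetical wiring, not our actual API): the trigger is a deterministic predicate over incoming events, the agent decides what to do, and the third-party actions are plain functions it can call.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Trigger:
        name: str
        matches: Callable[[dict], bool]  # deterministic predicate over an event

    def send_slack_message(channel: str, text: str) -> None:
        print(f"[slack:{channel}] {text}")  # stand-in for a real integration

    def run_agent(event: dict) -> None:
        # In a real product this would be an LLM call with tools; stubbed here.
        send_slack_message("deals", f"CRM update on {event['record_id']}: follow-up drafted")

    triggers = [Trigger("crm_stage_change", lambda e: e.get("type") == "crm.update")]

    def handle(event: dict) -> None:
        if any(t.matches(event) for t in triggers):
            run_agent(event)

    handle({"type": "crm.update", "record_id": "ACME-42"})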
Like others in this thread have pointed out, there's a nice middle-ground here between an LLM-only interface and some nice UI around it, as well as ways to introduce determinism where it makes sense.
The product is still in its early days and we're iterating rapidly, but feel free to check it out and give us some feedback. There's a decent free plan.
by aeromusek on 2/10/25, 11:42 PM
There's a reason we're still using apps instead of talking to Siri…for a huge number of tasks, visual UIs are so much more efficient than long-form text.
by guybedo on 2/10/25, 11:57 PM
It's gonna be: reusable SaaS components + AI orchestrator + specialized UI
On a related note, there's probably gonna be an extinction-level event in the software industry, as there's no software moat anymore.
When every application, every feature, every function can be replicated/reproduced by another company in a matter of minutes / hours using AI tools, you don't have a moat anymore.
by alex_young on 2/11/25, 2:05 AM
Why will businesses trust a black box that claims to make good decisions (most of the time) when they have existing human relationships they have vetted, measured, and know the ongoing costs and benefits of?
If the reason is humans are expensive, I have news for you. We've had robotics for around 100 years and the humans are still much cheaper than the robots. Adding a bunch of graphics cards and power plants to the mix doesn't seem to change that equation in a positive direction.
by caspper69 on 2/10/25, 9:03 PM
So let me get this straight: we are going to train AI models to perform screen recognition of some kind (so they can ascertain layout and detect the "important" UI elements), additionally ask that AI to OCR all the text on the screen so it has some hope of following natural-language instructions (OCR being a task which, as an HN thread a day or two ago pointed out, AI is exceedingly bad at), and then we're going to tell this non-deterministic prediction engine what we want to do with our software, and it's just going to do it?
Like Homer Simpson's button pressing birdie toy? :smackshead:
Why do I have reservations about letting a non-deterministic AI agent run my software?
Why not expose hooks in some common format for our software to perform common tasks? We could call it an "application programming interface". We might even insist on some kind of common data interchange format. I hear all the cool people are into EBCDIC nowadays.
Then we could build a robust and deterministic tool to automate our workflows. It could even pass structured data between unrelated applications in a secure manner. Then we could be sure that the AI Agent will hit the "save the world" button instead of the "kill all humans" button 100% of the time.
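For contrast, here's the boring deterministic version: documented endpoints and structured data, no screen OCR, no prediction. The URLs and field names below are made up for illustration.

    import json
    import urllib.request

    def get_json(url: str) -> dict:
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)

    def post_json(url: str, payload: dict) -> dict:
        req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                     headers={"Content-Type": "application/json"},
                                     method="POST")
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    # The "save the world" button, hit 100% of the time:
    calls = get_json("https://crm.example.com/api/calls?date=today")
    report = {"rows": [{"id": c["id"], "duration": c["duration"]} for c in calls["items"]]}
    post_json("https://reports.example.com/api/daily", report)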
On a serious note, we should study the various macro-recording implementations, to at least have a baseline of what people have been successfully doing for 40-odd years to automate their workflows, and then come up with an idea that doesn't involve buying a new computer and a GPU and slowly boiling the oceans.
This reeks of a solution in search of a problem. And the solution has the added benefit of being inefficient and unreliable. But, people don't get billion dollar valuations for macro recorders.
Is this what they meant by "worse is better"?
Edit: and for the love of FSM, please do not expose any new automation APIs to the network.
by deepsquirrelnet on 2/11/25, 3:10 AM
Autonomy is just more sexy, but in my opinion, it’s a poor design direction for a lot of applications.
by turnsout on 2/10/25, 8:32 PM
I fundamentally believe that human-oriented web apps are not the answer, and neither is REST. We need something purpose-built.
The challenge is, it has to be SIMPLE enough for people to easily implement in one day. And it needs to be open source to avoid the obvious problems with it being a for-profit enterprise.
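One way to picture "simple enough to implement in a day" (a hypothetical shape, not a proposal): a static manifest describing what the app can do, plus a single handler that accepts a named action with structured arguments.

    # Hypothetical shape, nothing standardized -- a manifest plus one action handler.
    MANIFEST = {
        "name": "invoice-app",
        "actions": {
            "list_invoices": {"args": {"status": "string"}, "returns": "invoice[]"},
            "create_invoice": {"args": {"customer_id": "string", "amount_cents": "int"},
                               "returns": "invoice"},
        },
    }

    def handle_action(name: str, args: dict) -> dict:
        # An LLM (or anything else) reads MANIFEST, then calls this with structured args.
        if name == "list_invoices":
            return {"invoices": [], "filtered_by": args.get("status")}
        if name == "create_invoice":
            return {"id": "inv_123", **args}
        raise ValueError(f"unknown action: {name}")

    print(handle_action("create_invoice", {"customer_id": "c_9", "amount_cents": 5000}))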