by billyp-rva on 3/18/25, 12:09 PM with 68 comments
by diggan on 3/20/25, 4:27 PM
Instead of doing what the author does here, sending messages back and forth in an ever-longer conversation where each message gets a lower-quality reply until the LLM seems like a dumb rock, rewrite your initial message to include everything that went wrong or was misunderstood, and aim to have your problem solved in the first message; you'll get much higher-quality answers. If the LLM misunderstood, don't reply "No, what I meant was..."; instead, rewrite the first message so it's clearer.
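In API terms the difference is just which message list you send. A minimal sketch of the two patterns (`call_llm` is a hypothetical stand-in for any chat-completion client; the message shapes follow the common OpenAI-style convention):

```python
def call_llm(messages):
    # Placeholder: a real implementation would call the provider's API
    # (OpenAI, Anthropic, etc.) with this message list.
    return "model reply"

# Appending: the context grows with every correction, and reply quality
# tends to degrade as the thread gets longer.
history = [{"role": "user", "content": "Draw a diagram of my auth flow"}]
history.append({"role": "assistant", "content": call_llm(history)})
history.append({"role": "user", "content": "No, what I meant was..."})

# Rewriting: fold every correction back into a single, clearer first
# message and start a fresh conversation each time.
revised = [{"role": "user", "content":
            "Draw a diagram of my auth flow. Note: the login service "
            "calls the token service directly, not via the gateway."}]
reply = call_llm(revised)

assert len(revised) == 1  # the context stays one self-contained prompt
```

The second pattern keeps every request short and self-contained, which is exactly what the comment argues for.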
This is true at least for the ChatGPT, Claude, and DeepSeek models; YMMV with other models.
by LASR on 3/20/25, 7:48 AM
Then we injected the generated mermaid diagrams back into subsequent requests. Reasoning performance improves for a whole variety of applications.
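The injection step can be as simple as prepending the previously generated diagram to the next request. A sketch under that assumption (the `generate_diagram` call is a stand-in for a real LLM call, and the mermaid snippet is illustrative):

```python
def generate_diagram(task_description):
    # Stand-in for an LLM call that returns a mermaid diagram.
    return "graph TD\n  A[Request] --> B[Service]\n  B --> C[Database]"

def build_followup_prompt(diagram, question):
    # Inject the previously generated diagram back into the next request
    # so the model reasons against an explicit structural summary
    # instead of re-deriving the architecture from scratch.
    return (
        "Here is the current system diagram:\n"
        "```mermaid\n" + diagram + "\n```\n\n" + question
    )

diagram = generate_diagram("summarize the request path")
prompt = build_followup_prompt(diagram, "Where should we add caching?")
```

Because mermaid is compact plain text, the diagram costs few tokens relative to the structure it encodes.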
by graphviz on 3/20/25, 12:02 PM
Sketching backed by automated cleanup can be good for entering small diagrams. There used to be an iOS app based on graphviz: http://instaviz.com
Constraint-based interactive layout may be underinvested, as a consequence of too many disappointments and false starts in the 1980s.
LLMs seem ill-suited to solving the optimization of combinatorial and geometric constraints and objectives required for good diagram layout. Overall, one has to admire the directness and simplicity of mermaid. Also, it would be great to someday see a practical tool with the quality and generality of the ultra-compact grid layout prototype from the Monash group, https://ialab.it.monash.edu/~dwyer/papers/gridlayout2015.pdf (2015!!)
by vunderba on 3/20/25, 5:50 AM
by 30minAdayHN on 3/20/25, 4:49 PM
LLMs are great at compressing information, so I thought I'd put that to good use by compressing a large codebase into a single diagram. Since the entire codebase doesn't fit in the context window, I built a recursive LLM tool that calls itself.
It takes two params: the current diagram state, and the new files needed to expand the diagram.
The seed is an empty diagram and an entry point into the source code. I also extended it to complexity analysis.
It worked magically well. Here are a couple of diagrams it generated: * https://gist.github.com/priyankc/27eb786e50e41c32d332390a42e... * https://gist.github.com/priyankc/0ca04f09a32f6d91c6b42bd8b18...
If you are interested in trying out, I've blogged here: https://updates.priyank.ch/projects/2025/03/12/complexity-an...
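A rough sketch of how such a recursive expansion loop might work. Everything LLM-facing is a stand-in (the real tool's interface may differ): `expand_diagram` would actually send the current diagram plus file contents to a model, and `files_referenced` would discover which files the updated diagram points at next.

```python
def expand_diagram(diagram, files):
    # Stand-in for the LLM call: merge each file into the diagram.
    for path in files:
        diagram += f"\n  {path} --> core"
    return diagram

def files_referenced(diagram, codebase):
    # Stand-in for discovering which files the diagram should expand next.
    return [p for p in codebase if p not in diagram]

def summarize(codebase, entry_point, batch_size=2):
    diagram = "graph TD"       # seed: an empty diagram
    frontier = [entry_point]   # seed: the entry point of the source
    while frontier:
        batch, frontier = frontier[:batch_size], frontier[batch_size:]
        diagram = expand_diagram(diagram, batch)
        frontier += files_referenced(diagram, codebase)  # recurse outward
        frontier = list(dict.fromkeys(frontier))         # dedupe, keep order
    return diagram

codebase = ["main.py", "auth.py", "db.py"]
result = summarize(codebase, "main.py")
```

Batching keeps each individual call within the context window; the diagram itself acts as the compressed state carried between calls.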
by stared on 3/20/25, 9:53 AM
Make sure it is allowed to think before doing (not necessarily in a dedicated thinking mode; a regular prompt asking it to design the graph before implementing it works). Also say in the prompt who the graph is for (e.g. "a clean graph, suitable for a blog post for a technical audience").
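A minimal "plan first, then draw" prompt along those lines; the wording is illustrative, not a canonical template:

```python
audience = "a clean graph, suitable for a blog post for a technical audience"

# Ask the model to reason about structure before emitting diagram code,
# and state the target audience up front.
prompt = f"""Before writing any diagram code, briefly describe:
1. the nodes and edges the graph needs,
2. how they should be grouped and laid out.

The target is {audience}.

Then output the final diagram as a mermaid code block."""
```

The explicit design step tends to help regardless of whether the model has a native reasoning mode.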
by McNutty on 3/20/25, 5:45 PM
I do like the idea of another commenter here who takes a photo of their whiteboard and instructs the AI tool to turn it into a structured diagram. That seems to be well within reach of these tools.
by larodi on 3/20/25, 1:47 PM
It also really depends on the printing.
by RKFADU_UOFCCLEL on 3/20/25, 3:44 PM
by victorbjorklund on 3/20/25, 9:48 AM
They have icons for common things like cloud services.
by cadamsdotcom on 3/20/25, 6:09 AM
Interesting perspective, but it's a bit incomplete without a comparison of how various models perform.
Kind of like Simon Willison’s now-famous “pelican on a bicycle” test, these diagrams might be done better by some models than others.
Second, this presents a static picture of things, but AI moves really fast! It’d also be great to understand how this capability is improving over time.
by submeta on 3/20/25, 6:52 AM
I also experimented with BPMN markup (XML), then realized there are already repos on GitHub that create BPMN diagrams from prompts.
You can also ask LLMs to create SVG.
by trash_cat on 3/20/25, 11:58 AM
by giberson on 3/21/25, 2:50 PM
I'm mainly speaking to the ability to read IaC code (probably any library, but at least in my case CDK, Pulumi, Terraform, CloudFormation, Serverless) and infer architectural flow from it. That code really isn't conducive to that use case.
I could also, kidding/not kidding, be speaking to the range of abilities for "mid" and "senior" developers to know and convey such flows in diagrams.
But really my point is this feels like more validation that AI doesn't provide increased ability, it provides existing (and demonstrated) ability faster with less formalized context. The "less formalized context" is what distinguishes it from programs/code.
by ndr_ on 3/20/25, 2:57 PM
Rather than relying on end-user products like ChatGPT or Claude.ai, this article is based on the "pure" model offerings via API and frontends that build on them. While the Ilograph blog ponders "AI's ability to create generic diagrams", I'd conclude: do it, but avoid the "open" models and low-cost offerings.
by enoeht on 3/20/25, 1:54 PM
by james-bcn on 3/20/25, 10:01 AM
Simon Willison has shown that current models aren't very good at creating an SVG of a pelican on a bicycle, but drawing a box diagram in SVG is a much simpler task.
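A box diagram really is a far simpler SVG target than a pelican: just rectangles, text labels, and connecting lines. A hand-rolled sketch of the kind of output involved (names and layout are illustrative):

```python
def box(x, y, w, h, label):
    # One labeled rectangle: an outlined rect plus centered text.
    return (
        f'<rect x="{x}" y="{y}" width="{w}" height="{h}" '
        f'fill="none" stroke="black"/>'
        f'<text x="{x + w / 2}" y="{y + h / 2}" '
        f'text-anchor="middle">{label}</text>'
    )

svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="320" height="120">'
    + box(10, 30, 100, 50, "Client")
    + box(200, 30, 100, 50, "Server")
    # A single straight connector between the two boxes.
    + '<line x1="110" y1="55" x2="200" y2="55" stroke="black"/>'
    + "</svg>"
)
```

Three primitives cover the whole diagram, which is why models that fail at freeform drawing can still manage box-and-arrow output.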
by peter_retief on 3/20/25, 8:14 AM
by notTooFarGone on 3/20/25, 8:24 AM
It was a well-defined domain, so I guess the training-data argument doesn't apply to stuff within a "natural" domain like graphs. LLMs can infer behavior from naming quite well.
by mulmboy on 3/20/25, 8:11 AM
It's disingenuous to conclude that AI is no good at diagramming after using an impotent prompt AND refusing to iterate with it. A human would do no better with the same instructions, LLMs aren't magic.
This is the same as my previous comment https://news.ycombinator.com/item?id=42524125
by melagonster on 3/21/25, 2:00 AM
by WesleyLivesay on 3/20/25, 12:09 PM
by jbverschoor on 3/20/25, 1:13 PM