from Hacker News

Show HN: ChainForge, a visual tool for prompt engineering and LLM evaluation

by fatso784 on 8/7/23, 5:54 PM with 29 comments

Hi HN! We’ve been working hard on this low-code tool for rapid prompt discovery, robustness testing and LLM evaluation. We’ve just released documentation to help new users learn how to use it and what it can already do. Let us know what you think! :)
  • by _puk on 8/7/23, 6:31 PM

    I think you should probably mention that its source is available! [0]

    I don't personally have a need for this right now, but I can really see the use for the parameterised queries, as well as comparisons across models.

    Thanks for your efforts!

    0: https://github.com/ianarawjo/ChainForge

  • by mabcat on 8/7/23, 10:56 PM

    This looks excellent! It's a great interface for two things I'm struggling to make LlamaIndex do: explain and debug multi-step responses for agent flows, and cache queries aggressively. If I can work out how to hook it into my LlamaIndex-based pile, happy days.

    Feature/guidance request: how to actually call functions, how to loop on responses to resolve multiple function calls. I've managed to mock a response to get_current_weather using this contraption: https://pasteboard.co/aO9BmHG5qsFt.png . But it's messy and I can't see a way to actually evaluate function calls. And if I involve the Chat Turn node, the message sequences seem to get tangled with each other. Probably I'm holding it wrong!

  • by ericskiff on 8/7/23, 7:02 PM

    We just used this on a project and it was very helpful! Cool to see it here on HN
  • by koryk on 8/7/23, 7:50 PM

    I like it! Any plans to add Google Vertex AI support?
  • by maxlamb on 8/7/23, 10:11 PM

    What exactly is prompt discovery?
  • by trentearl on 8/7/23, 7:15 PM

    Cool project