from Hacker News

fatso784

joined 11/2/10, 4:46 AM; has 317 karma

EvalGen: Helping Developers Create LLM Evals Aligned to Their Preferences
by fatso784 on 5/14/25, 11:28 PM, with 0 comments
Semantic Commit: Helping Users Update Intent Specifications for AI Memory
by fatso784 on 4/15/25, 1:04 PM, with 0 comments
What AI Engineers Can Learn from Qualitative Research Methods
by fatso784 on 1/9/25, 8:53 PM, with 0 comments
DocETL: A tool for creating LLM-powered data processing pipelines
by fatso784 on 9/26/24, 10:30 PM, with 0 comments
Aligning LLM-as-a-Judge with Human Preferences
by fatso784 on 6/26/24, 8:51 PM, with 0 comments
LLM Wrapper Papers Are Hurting HCI Research
by fatso784 on 6/6/24, 2:37 PM, with 0 comments
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs
by fatso784 on 4/22/24, 4:21 PM, with 0 comments
If in a Crowdsourced Data Annotation Pipeline, a GPT-4
by fatso784 on 3/5/24, 7:25 PM, with 0 comments
Antagonistic AI
by fatso784 on 3/1/24, 12:32 AM, with 0 comments
How to Compare Prompts with ChainForge [video]
by fatso784 on 1/2/24, 5:33 PM, with 0 comments
AI for ChainForge Beta
by fatso784 on 12/13/23, 8:10 PM, with 0 comments
ChatGPT does not have seasonal affective disorder
by fatso784 on 12/12/23, 5:46 PM, with 0 comments
There is no "seasonal affective disorder" of ChatGPT
by fatso784 on 12/12/23, 5:06 PM, with 1 comment
There will never be fully automated prompt engineering
by fatso784 on 9/28/23, 1:10 PM, with 0 comments
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing
by fatso784 on 9/19/23, 12:45 PM, with 1 comment
Ask HN: Have LLM API Updates or Deprecations Impacted You?
by fatso784 on 8/17/23, 2:44 PM, with 1 comment
Apple’s ML model and dataset introspection API
by fatso784 on 8/9/23, 7:08 PM, with 0 comments
Show HN: ChainForge, a visual tool for prompt engineering and LLM evaluation
by fatso784 on 8/7/23, 5:54 PM, with 29 comments
Continue multiple conversations simultaneously across multiple LLMs
by fatso784 on 7/28/23, 4:25 PM, with 0 comments
ChainForge now supports chat evaluation
by fatso784 on 7/26/23, 4:39 PM, with 0 comments
ChainForge: A visual tool for prompt engineering in the browser
by fatso784 on 7/5/23, 4:16 PM, with 0 comments
Show HN: Evaluate LLMs, right in the browser. Share your experiments as links
by fatso784 on 7/5/23, 1:23 PM, with 0 comments
You can now run OpenAI evals in ChainForge
by fatso784 on 6/16/23, 6:51 PM, with 0 comments
Pen-Based Computing: Still Looking for the Write App?
by fatso784 on 6/6/23, 4:36 PM, with 0 comments
ChainForge: A visual programming environment for prompt engineering
by fatso784 on 5/24/23, 1:41 AM, with 0 comments
Show HN: ChainForge, a visual tool for evaluating LLM responses
by fatso784 on 5/23/23, 8:04 PM, with 0 comments
Early demo of ChainForge, a data flow environment for prompt engineering
by fatso784 on 4/30/23, 4:17 PM, with 0 comments