from Hacker News

Saving $750K on AI inference with one line of code and no quality loss

by t5-notdiamond on 10/17/24, 3:38 PM with 2 comments

  • by pinkbeanz on 10/17/24, 4:25 PM

    This is neat -- how would you think about evaluating the quality loss as you change to more efficient models? I saw you did an analysis on the number of messages, but I'm wondering if there are more robust methods?