from Hacker News

Pulze AI Evals

by fbnbr on 1/10/25, 3:32 AM with 1 comment

  • by fbnbr on 1/10/25, 3:32 AM

    Benchmark AI models on standard datasets like FinanceBench and MMLU.