from Hacker News

Can LLMs earn $1M from real freelance coding work?

by nickwritesit on 4/16/25, 3:26 PM with 18 comments

  • by kirktrue on 4/16/25, 4:57 PM

    Unless I’m repeatedly missing it, the article never mentions how much money the researchers spent running the tests. What was the budget for the AI execution? If they spent only $10,000 to “earn” $400,000, that’s amazing; if they spent $500,000 for the same result, that’s obviously far less exciting.
  • by josefresco on 4/16/25, 3:51 PM

    This resonates with my recent experience using Claude to help me code. I almost gave up, but after 7-10 failed tries I rephrased the initial request and it finally nailed it.

    > 3. Performance improves with multiple attempts: allowing the o1 model 7 attempts instead of 1 nearly tripled its success rate, from 16.5% to 46.5%. This hints that current models may have the knowledge to solve many more problems but struggle with execution on the first try.

    https://newsletter.getdx.com/i/160797867/performance-improve...
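
    A quick back-of-the-envelope check (my own sketch, not from the article or the paper): if each of the 7 attempts succeeded independently at the 16.5% single-attempt rate, you'd expect roughly 72% of tasks to be solved at least once, so the reported 46.5% suggests the extra attempts are far from independent retries.

      # Illustrative pass@k estimate under an independence assumption.
      # The 16.5% rate and k=7 come from the quoted excerpt; the
      # independence model is my own assumption, not a claim in the article.
      def pass_at_k(single_attempt_rate: float, k: int) -> float:
          """Probability of at least one success in k independent attempts."""
          return 1.0 - (1.0 - single_attempt_rate) ** k

      print(f"Independence baseline pass@7: {pass_at_k(0.165, 7):.1%}")  # ~71.7%
      print("Reported pass@7: 46.5% -> failures look correlated across attempts")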

  • by dboreham on 4/16/25, 4:51 PM

    How do they know the tasks were "solved"? Wouldn't that require the customer to be happy, and pay the bounty?
  • by fxtentacle on 4/16/25, 10:40 PM

    It's an OpenAI ad... And BTW the actual paper says: "we [..] find that frontier models are still unable to solve the majority of tasks"
  • by jsnell on 4/16/25, 4:59 PM

    Honestly, this reads like an AI-generated summary.

    Discussion on original paper: https://news.ycombinator.com/item?id=43086347

  • by amelius on 4/16/25, 4:58 PM

    There goes all the low-hanging fruit ...
  • by tempire on 4/16/25, 8:01 PM

    No
  • by cmsj on 4/16/25, 4:19 PM

    tl;dr, and as Betteridge's Law would lead you to believe, the answer is no.