from Hacker News

Evaluating Multimodal LLMs Using the Google IO 2024 Puzzle

by simonbutt on 3/15/24, 7:10 PM with 2 comments

  • by malet on 3/15/24, 8:02 PM

    Surprising to see these models stumble on what at first glance seems like a simple task. It would be interesting to see how the non-vision models fare if you convert the problems to ASCII art.
  • by simonbutt on 3/15/24, 7:10 PM

    GPT-4V, Claude 3 Opus, and Gemini Ultra go head to head in solving the Google IO 2024 Puzzle.