from Hacker News

Evaluating Multimodal LLMs Using the Google IO 2024 Puzzle

by simonbutt on 3/15/24, 7:10 PM with 2 comments

by malet on 3/15/24, 8:02 PM
Surprising to see these models stumbling on what at first glance seems like a simple task. It would be interesting to see how the non-vision models fare if you convert the problems to ASCII art.
by simonbutt on 3/15/24, 7:10 PM
GPT-4V, Claude 3 Opus and Gemini Ultra go head to head in solving the Google IO 2024 Puzzle.