by ckcheng on 8/4/24, 1:40 AM with 2 comments
Why doesn't this work for IMO combinatorics problems?
DeepMind's AI solved 100% of the IMO algebra problems using AlphaProof, and the geometry problem using AlphaGeometry 2, but neither of the two combinatorics problems.
I don’t see much commentary on why it failed on the combinatorics problems. Why is that?
by ckcheng on 8/4/24, 11:08 PM
It's because algebra-type problems are typically of the form "prove algebraic statement E", where E is relatively easy to translate into Lean.
Combinatorics problems, by contrast, are of the form "prove some open-ended statement P": even once P is translated into Lean, it's not clear where to start connecting it to the axioms or to the eventual solution.
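For a rough illustration (a toy Lean 4 / Mathlib sketch, not the actual IMO statements; the combinatorial half is only schematic):

  -- An algebraic claim maps almost directly onto a Lean statement,
  -- and a known lemma can close it:
  import Mathlib

  theorem easy_to_state (a b : ℝ) : 0 ≤ (a - b) ^ 2 :=
    sq_nonneg (a - b)

  -- A combinatorial claim first needs its whole setup (boards, moves,
  -- strategies) encoded as definitions before the statement can even
  -- be written down; schematically:
  --   def Strategy : Type := ...
  --   theorem min_attempts (s : Strategy) : ... := ...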
If that’s an okay way to describe it, then AlphaProof’s language model, likely trained only on Lean statements (?), would have a tough time generating plausible next steps for combinatorics problems that MCTS could explore?
That doesn’t mean the LM + MCTS architecture is bad, though. It means the language model needs to get better at generating plausible next steps (sequences of tokens), maybe with a less axiomatic way of evaluating the fitness of each step (less reliance on Lean?).
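To make the LM + MCTS loop concrete, here's a minimal toy sketch in Python; propose_steps and evaluate are hypothetical stand-ins for the language model and the Lean checker, not DeepMind's actual interfaces:

  # Toy sketch of LM-guided tree search: the LM proposes candidate next
  # proof steps, a checker scores them, and MCTS allocates the search
  # budget. An illustration of the idea, not DeepMind's implementation.
  import math
  import random

  def propose_steps(state, k=3):
      # Stand-in for the language model: propose k candidate next tactics.
      return [state + [f"tac{i}"] for i in range(k)]

  def evaluate(state):
      # Stand-in for the Lean checker: 1.0 when the "proof" closes,
      # otherwise a weak heuristic score.
      return 1.0 if state[-2:] == ["tac0", "tac0"] else random.random() * 0.3

  class Node:
      def __init__(self, state, parent=None):
          self.state, self.parent = state, parent
          self.children, self.visits, self.value = [], 0, 0.0

      def uct(self, c=1.4):
          # Unvisited nodes get explored first; otherwise balance the
          # average reward against an exploration bonus.
          if self.visits == 0:
              return float("inf")
          return (self.value / self.visits
                  + c * math.sqrt(math.log(self.parent.visits) / self.visits))

  def mcts(root_state, iterations=200):
      root = Node(root_state)
      for _ in range(iterations):
          node = root
          # Selection: descend by UCT until reaching a leaf.
          while node.children:
              node = max(node.children, key=Node.uct)
          # Expansion: ask the LM for plausible next steps.
          if node.visits > 0:
              node.children = [Node(s, node) for s in propose_steps(node.state)]
              node = node.children[0]
          # Evaluation and backpropagation.
          reward = evaluate(node.state)
          while node is not None:
              node.visits += 1
              node.value += reward
              node = node.parent
      return max(root.children, key=lambda n: n.visits).state

  print(mcts(["goal"]))

Everything hinges on propose_steps: if the LM can't suggest sensible candidate tactics for a combinatorics goal, the search has nothing useful to explore, no matter how good the search itself is.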