from Hacker News

DeepSeek: Advancing theorem proving in LLMs through large-scale synthetic data

by hhs on 10/14/24, 3:44 PM with 54 comments

  • by youoy on 10/15/24, 2:21 AM

    From the abstract:

    > Proof assistants like Lean have revolutionized mathematical proof verification...

    > Both the synthetic dataset and the model will be made available to facilitate further research in this promising field.

    Please don't use "revolutionised" or "promising" in a scientific abstract... If it has revolutionised something, tell me instead what made that happen. If it is promising, tell me instead why.

    Apart from that, the second sentence says:

    > Although large language models (LLMs) show promise in mathematical reasoning, their advancement in formal theorem proving is hindered by a lack of training data.

    Is it really? Or hindered by an unsuitable model architecture? Because you could perfectly well say that Lean itself is an AI capable of theorem proving... I feel that sentence should be the potential conclusion of the paper, not its first observation. Something like: "we used more training data and it works better; if we scale up theorem-proving training data, we can get the performance we want, so training data really is what hinders the advancement of formal theorem proving in LLMs".

  • by maxrmk on 10/14/24, 9:59 PM

    There's a newer version of this model that takes a really cool RL based approach: https://arxiv.org/pdf/2408.08152
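
    The linked paper (DeepSeek-Prover-V1.5) draws its reward signal from proof-assistant feedback. A rough sketch of just the reward side of that idea, assuming binary verifier outcomes turned into group-relative advantages in the GRPO style - the names and numbers below are illustrative, not the paper's API:

        import statistics

        def group_advantages(verified: list[bool]) -> list[float]:
            """Binary reward per sampled proof (1.0 iff the verifier accepted
            it), turned into group-relative advantages: center on the group
            mean and scale by its standard deviation. Verified proofs get
            pushed up, failures get pushed down."""
            rewards = [1.0 if ok else 0.0 for ok in verified]
            mean = statistics.fmean(rewards)
            std = statistics.pstdev(rewards) or 1.0  # guard all-same groups
            return [(r - mean) / std for r in rewards]

        # Outcomes for 16 proofs sampled for one theorem; 3 verified.
        outcomes = [True, False, False, True] + [False] * 11 + [True]
        print(group_advantages(outcomes))
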
  • by _flux on 10/14/24, 5:10 PM

    This must be one of the best applications for LLMs, as you can always verify the results automatically and reject anything that fails, right?
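
    A minimal sketch of that filter, assuming a local Lean 4 toolchain on PATH - the file layout and the example theorem are my own illustration, not from the paper:

        import pathlib
        import subprocess
        import tempfile

        def lean_accepts(source: str) -> bool:
            """Write the candidate proof to a .lean file and keep it
            iff the `lean` checker exits 0."""
            with tempfile.TemporaryDirectory() as tmp:
                path = pathlib.Path(tmp) / "Candidate.lean"
                path.write_text(source)
                result = subprocess.run(["lean", str(path)], capture_output=True)
                return result.returncode == 0

        # Accept machine-verified samples, reject everything else.
        candidate = "theorem my_comm (a b : Nat) : a + b = b + a := Nat.add_comm a b"
        print("verified" if lean_accepts(candidate) else "rejected")
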
  • by aabhay on 10/14/24, 5:57 PM

    The ability to use automatic verification + synthetic data is basically common knowledge among practitioners (an expert-iteration style loop, sketched after this comment). But all these organizations have also endlessly explored the different ways to overfit on such data, and the conclusion is the same -- the current model architecture seems to plateau when it comes to multi-step logical reasoning. You either drift too far from your common-knowledge pre-training, or you never come up with the right steps in instances where there's a vast design space.

    Think -- why has nobody been able to make an LLM play Go better than AlphaZero while still retaining language capabilities? Such a model would certainly have orders of magnitude more parameters than AlphaZero did.
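
    For concreteness, the "automatic verification + synthetic data" recipe from the first paragraph, written as a bare-bones expert-iteration loop. Every function here is an illustrative stand-in, not an API from the paper:

        import random

        def sample_proofs(statement: str, n: int) -> list[str]:
            # Stand-in for sampling candidate proofs from the prover model.
            return [f"proof attempt {i} for {statement}" for i in range(n)]

        def verifier_accepts(statement: str, proof: str) -> bool:
            # Stand-in for the Lean check (see the subprocess sketch above).
            return random.random() < 0.05

        def finetune(dataset: list[tuple[str, str]]) -> None:
            # Stand-in for one fine-tuning pass over the verified corpus.
            print(f"fine-tuning on {len(dataset)} verified proofs")

        def expert_iteration(statements: list[str], rounds: int = 3) -> None:
            for _ in range(rounds):
                verified = [(s, p)
                            for s in statements
                            for p in sample_proofs(s, n=32)
                            if verifier_accepts(s, p)]
                finetune(verified)  # only machine-checked proofs enter training

        expert_iteration(["a + b = b + a", "n < n + 1"])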

  • by uptownfunk on 10/15/24, 4:12 AM

    Solutions to major mathematics problems generally require the creation of entirely new objects and the appropriate machinery to analyze them. The behavior of said objects generally ends up providing the solution to the original question as a special case. A new architecture is required but one that is probably not far off from the original transformer in nature.
  • by whyowhy3484939 on 10/14/24, 6:18 PM

    "Suppose you try to construct a coherent, ordered, natural world with no resource other than repeated exposure to things, and the formation of certain associative bonds. Oh, please!"

    This is Prof. Robinson on Kantian philosophy - check out the Oxford podcasts by the way - and the quote is meant to imply that building a coherent world out of raw sensory data and statistics alone is completely and utterly impractical, if not outright impossible. While I don't think he meant to refer to any kind of AI, in my mind this description also aptly describes the general method of DL neural networks: repeated exposure to find correlations.

    How does one find order through association alone? With AI this is no longer an academic problem; it has become practical. Kant says it is impossible, not just unlikely.

    The Kantian project and the various core issues it tries to address seem readily applicable to AI research, yet I see very little mention of them. Perhaps I am just dumb though. Building a mind capable of taming the tremendous sensory flux needs, at the very least, to take note of the (many) fundamental issues he raised - issues I feel are not at all trivial to set aside. I feel we are stuck in Hume's empiricist reasoning and have yet to graduate to Kant and beyond.

    Are we now somehow convinced, yet again, that causality and reasoning will after all spontaneously emerge out of pure chaos? Didn't we settle the impossibility of this a few hundred years ago?

  • by Havoc on 10/14/24, 6:50 PM

    Surprised by the negativity here. A 7B-class model roughly doubles GPT-4's score and everyone goes “meh”?!?
  • by tikkun on 10/14/24, 4:28 PM

    [Submitted on 23 May 2024]