from Hacker News

Can reinforcement learning with self-play on language models do math?

by grondilu on 1/15/23, 9:48 AM with 0 comments

As far as I know, the large language models that made the news lately were trained with supervised learning. None used the reinforcement learning techniques DeepMind used a few years ago to learn board games.

DeepMind recently used the same technique to improve the efficiency of matrix multiplication algorithms. Yet, if I understand correctly, that work didn't use a language model.

A mathematical expression is a sequence of symbols of finite, yet arbitrary length. As such, it needs an attention mechanism to be processed by a neural network, just like sentences in natural or programming languages.
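
For illustration, here is a minimal sketch of what I mean (mine, using SymPy purely as an example): an algebraic expression can be flattened into the kind of variable-length token sequence that a transformer with attention consumes.

    # Illustrative only: serialize a SymPy expression into a flat token
    # sequence, analogous to the token stream a language model reads.
    import sympy as sp

    x = sp.symbols('x')
    expr = sp.sin(x)**2 + sp.cos(x)**2

    def to_prefix_tokens(e):
        """Prefix (Polish) notation: an unambiguous, variable-length symbol sequence."""
        if e.is_Atom:
            return [str(e)]
        tokens = [e.func.__name__]
        for arg in e.args:
            tokens.extend(to_prefix_tokens(arg))
        return tokens

    print(to_prefix_tokens(expr))
    # e.g. ['Add', 'Pow', 'sin', 'x', '2', 'Pow', 'cos', 'x', '2']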

One task one might assign to a Computer Algebra System is simplifying an algebraic expression, that is, translating it into a shorter one, or rather a "simpler" one, provided one can define a useful complexity measure on the input domain.
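
A toy example of such a measure (again just a sketch with SymPy; what the right measure is remains part of the question): count operations and atoms, and let the drop in that count after a rewrite serve as the reward an RL agent would receive.

    # Illustrative only: a crude complexity measure for expressions, and the
    # reward an RL agent might get for a single simplification "move".
    import sympy as sp

    x = sp.symbols('x')

    def complexity(expr):
        # operation count plus number of atoms (symbols and constants)
        return sp.count_ops(expr) + len(expr.atoms())

    before = (x**2 - 1) / (x - 1)
    after = sp.cancel(before)                          # one candidate rewrite step
    reward = complexity(before) - complexity(after)    # positive if it got simpler
    print(before, "->", after, "reward:", reward)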

Is there any work done on this?