from Hacker News

Discovering algorithms by enumerating terms in Haskell

by agomez314 on 8/2/24, 6:57 PM with 48 comments

  • by bmc7505 on 8/3/24, 8:11 PM

    Maciej Bendkowski has some related work [1] on generating random lambda terms, but was unable to overcome what he calls the asymptotic sparsity problem:

      Sampling simply-typed terms seems notoriously more challenging than sampling closed ones. Even rejection sampling, whenever applicable, admits serious limitations due to the imminent asymptotic sparsity problem — asymptotically almost no term, be it either plain or closed, is at the same time (simply) typeable. [...] Asymptotic sparsity of simply-typed λ-terms is an impenetrable barrier to rejection sampling techniques. As the term size tends to infinity, so does the induced rejection overhead. In order to postpone this inevitable obstacle, it is possible to use dedicated mechanisms interrupting the sampler as soon as it is clear that the partially generated term cannot be extended to a typeable one. The current state-of-the-art samplers take this approach, combining Boltzmann models with modern logic programming execution engines backed by highly-optimised unification algorithms. Nonetheless, even with these sophisticated optimisations, such samplers are not likely to generate terms of sizes larger than one hundred.
    
    I would be curious to see a more rigorous analysis of the sample complexity of generating well-typed expressions in, e.g., the STLC. Maybe there is a way to avoid or reduce the rejection rate before evaluation.

    [1]: https://arxiv.org/pdf/2005.08856
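
    To make the sparsity concrete, here is a toy sketch (illustrative only; this is not Bendkowski's Boltzmann sampler): enumerate all closed de Bruijn terms of each size and count how many pass a minimal algorithm-W-style typeability check written for this comment. The typeable fraction shrinks as the size bound grows.

      import Data.Maybe (isJust)

      -- Untyped lambda terms in de Bruijn form.
      data Term = Var Int | Lam Term | App Term Term

      data Ty = TV Int | Arr Ty Ty deriving Eq
      type Sub = [(Int, Ty)]

      apply :: Sub -> Ty -> Ty
      apply s (Arr a b) = Arr (apply s a) (apply s b)
      apply s (TV i) = case lookup i s of
        Just t  -> apply s t  -- terminates: the occurs check keeps bindings acyclic
        Nothing -> TV i

      occurs :: Int -> Ty -> Bool
      occurs i (TV j)    = i == j
      occurs i (Arr a b) = occurs i a || occurs i b

      unify :: Ty -> Ty -> Maybe Sub
      unify (TV i) t
        | t == TV i  = Just []
        | occurs i t = Nothing  -- rejects infinite types, e.g. for \x. x x
        | otherwise  = Just [(i, t)]
      unify t (TV i) = unify (TV i) t
      unify (Arr a b) (Arr c d) = do
        s1 <- unify a c
        s2 <- unify (apply s1 b) (apply s1 d)
        Just (s2 ++ s1)

      -- Algorithm-W-style inference; the Int threads a fresh-variable counter.
      infer :: [Ty] -> Int -> Term -> Maybe (Sub, Ty, Int)
      infer env n (Var i) = Just ([], env !! i, n)  -- enumerated terms are well-scoped
      infer env n (Lam b) = do
        (s, t, n') <- infer (TV n : env) (n + 1) b
        Just (s, Arr (apply s (TV n)) t, n')
      infer env n (App f x) = do
        (s1, tf, n1) <- infer env n f
        (s2, tx, n2) <- infer (map (apply s1) env) n1 x
        s3 <- unify (apply s2 tf) (Arr tx (TV n2))
        Just (s3 ++ s2 ++ s1, apply s3 (TV n2), n2 + 1)

      typeable :: Term -> Bool
      typeable = isJust . infer [] 0

      -- All closed terms of exactly the given size, with n variables in scope.
      closed :: Int -> Int -> [Term]
      closed 1 n = [Var i | i <- [0 .. n - 1]]
      closed s n = [Lam b | b <- closed (s - 1) (n + 1)]
                ++ [App f x | k <- [1 .. s - 2], f <- closed k n, x <- closed (s - 1 - k) n]

      main :: IO ()
      main = mapM_ report [2 .. 9]
        where
          report s =
            let ts = closed s 0
            in putStrLn (show s ++ ": " ++ show (length (filter typeable ts))
                                ++ " typeable of " ++ show (length ts) ++ " closed terms")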

  • by grnnja on 8/3/24, 10:54 PM

    What the author is getting at is a pretty cool research area called program synthesis, where the goal is to create a program that satisfies a specification.

    Most techniques are essentially brute-force enumeration with tricks to improve performance, so they tend to struggle to find larger programs; a lot of active research goes into scaling them up.

    Compared to asking an LLM to write a program, program synthesis approaches guarantee that a solution will satisfy the specification, which can be very powerful.

    In particular, as the author has discovered, one area where program synthesis excels is finding small, intricate, bitwise-operator-heavy programs that can be hard for a human to reason about.
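
    For a feel of that enumerate-and-check loop, here is a toy sketch (not EUSolver's or FlashFill's actual algorithms; the grammar and examples are made up for illustration): enumerate expressions over one integer input in order of size and return the first one consistent with all input/output pairs.

      import Data.Bits (complement, shiftL, shiftR, xor, (.&.), (.|.))

      -- A tiny expression grammar over one integer input x.
      data E = X | Lit Int | Not E | Shl E | Shr E
             | And E E | Or E E | Xor E E | Add E E
        deriving Show

      eval :: Int -> E -> Int
      eval x X         = x
      eval _ (Lit n)   = n
      eval x (Not a)   = complement (eval x a)
      eval x (Shl a)   = eval x a `shiftL` 1
      eval x (Shr a)   = eval x a `shiftR` 1
      eval x (And a b) = eval x a .&. eval x b
      eval x (Or a b)  = eval x a .|. eval x b
      eval x (Xor a b) = eval x a `xor` eval x b
      eval x (Add a b) = eval x a + eval x b

      -- All expressions of exactly size s, built bottom-up.
      exprs :: Int -> [E]
      exprs 1 = X : map Lit [0, 1]
      exprs s = [u e | e <- exprs (s - 1), u <- [Not, Shl, Shr]]
             ++ [b l r | k <- [1 .. s - 2]
                       , l <- exprs k, r <- exprs (s - 1 - k)
                       , b <- [And, Or, Xor, Add]]

      -- Smallest expression consistent with every input/output pair.
      synth :: [(Int, Int)] -> E
      synth ios = head [ e | s <- [1 ..], e <- exprs s
                           , all (\(i, o) -> eval i e == o) ios ]

      -- Intended behavior: clear the lowest set bit, i.e. x .&. (x - 1).
      -- The result is only guaranteed to agree on these examples; as with
      -- FlashFill, adding examples pins the behavior down further.
      main :: IO ()
      main = print (synth [(0, 0), (1, 0), (5, 4), (6, 4), (7, 6), (12, 8)])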

    The most famous example of program synthesis is Microsoft's FlashFill, which is used in Excel. You give it a few input/output examples, and FlashFill will try to create a small program that generalizes them; you can then apply the program to more inputs, which saves you a bunch of time. An example from the paper is:

      Input -> Output
    
      International Business Machines -> IBM
      Principles Of Programming Languages -> POPL
      International Conference on Software Engineering -> ICSE
    
      String Program: Loop(\w : Concatenate(SubStr2(v_1, UpperTok, w)))
    
    
    Here are a few papers:

    EUSOLVER: https://www.cis.upenn.edu/~alur/Tacas17.pdf

    FlashFill: https://www.microsoft.com/en-us/research/wp-content/uploads/...

    BlinkFill: https://www.vldb.org/pvldb/vol9/p816-singh.pdf

    Synquid: https://cseweb.ucsd.edu/~npolikarpova/publications/pldi16.pd...

  • by dleslie on 8/3/24, 5:08 PM

    Might help to read some of the author's earlier work, like:

    https://medium.com/@maiavictor/solving-the-mystery-behind-ab...

    And the Omega documentation:

    https://hackage.haskell.org/package/control-monad-omega-0.3/...

    But I'll admit, I want to see a paper or code.

  • by kccqzy on 8/3/24, 5:19 PM

    I don't think the original submission has enough details for us to reproduce or even understand what's being done. The Omega monad is just a diagonal search monad that works on infinite lists. I don't understand the syntax in the screenshot. What's the type of the terms being enumerated? I can see lambdas, but what's @inc or the pluses and minuses?
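
    For reference, here is what Omega's diagonal search looks like on a toy term language (a sketch; the author's actual term type is unknown from the screenshot): enumerate every closed untyped lambda term, with join/each interleaving the infinite Lam and App branches so each term shows up at some finite position.

      import Control.Monad (join)
      import Control.Monad.Omega (Omega, each, runOmega)

      -- Untyped lambda terms in de Bruijn form.
      data Term = Var Int | Lam Term | App Term Term
        deriving Show

      -- All terms with n variable indices in scope; join-over-each is
      -- Omega's diagonal interleaving, so the two infinite branches
      -- (Lam, App) both keep contributing terms forever.
      terms :: Int -> Omega Term
      terms n = join $ each
        [ each [Var i | i <- [0 .. n - 1]]
        , Lam <$> terms (n + 1)
        , App <$> terms n <*> terms n
        ]

      -- Print the first few closed terms; the identity Lam (Var 0),
      -- i.e. \x. x, is the first one out.
      main :: IO ()
      main = mapM_ print (take 12 (runOmega (terms 0)))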

  • by mordechai9000 on 8/3/24, 5:13 PM

    "Perhaps I'm being really naive and an exponential wall is about to humble (and embarrass) me."

    There is something fascinating and awe-inspiring about searching very large, enumerative datasets for useful results. Like wandering Borges's Library of Babel in search of not just meaning but something true and useful.

  • by tadfisher on 8/3/24, 4:55 PM

    I'm confused: what is the search space? All possible functions from Int -> Int? And how do you verify a result is optimal across inputs?

  • by LightMachine on 8/3/24, 7:53 PM

    Uhm, author here. Not sure why this tweet is on Hacker News, as it is just a non-technical "blog post". But I've posted a follow-up today with some code and details, if you're curious:

    https://x.com/VictorTaelin/status/1819774880130158663

    That's as much as I can share for now, though.

  • by cvoss on 8/4/24, 1:40 AM

    Did you know that there is an efficient algorithm for learning (guessing) a regular language from a series of examples/non-examples (more precisely, from membership queries plus counterexamples)? Basically, you directly construct the smallest automaton that correctly discriminates the training inputs. It's Dana Angluin's L* algorithm.

    The technique generalizes to finite tree automata, meaning you can discover generators of simple expression grammars, given examples/non-examples.

    So if you can construe your problem as a labeled tree recognition problem, assuming it's solvable using a finite tree automaton, then you can discover the algorithm for it efficiently.

    For example, if you have three strings of bits (2 inputs and 1 output) and line them up as a single string of bit triples, it does not surprise me that it is easy to discover the automaton's state transition rules: they track the carry bit across triples and reject any triple whose output bit is inconsistent with the two input bits and the current carry (sketched below).
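
    Concretely, the target machine is tiny when written by hand; the point of L* is that it can be recovered from queries alone. A sketch, with the carry bit as the only state:

      -- State = carry bit; input alphabet = (a, b, out) bit triples,
      -- least significant bit first.
      step :: Int -> (Int, Int, Int) -> Maybe Int
      step c (a, b, out)
        | (a + b + c) `mod` 2 == out = Just ((a + b + c) `div` 2)  -- next carry
        | otherwise                  = Nothing  -- reject: wrong output bit

      -- Accept iff every triple is consistent and no carry remains.
      accepts :: [(Int, Int, Int)] -> Bool
      accepts = go 0
        where
          go c []       = c == 0
          go c (t : ts) = maybe False (\c' -> go c' ts) (step c t)

      -- 1 + 1 = 2, LSB first: accepts [(1,1,0), (0,0,1)] == True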

    The author has arranged the problem this way when sketching the ADC template and has also jump-started the search by assuming the solution space includes exactly one output bit for each pair of input bits. (That may seem like an obvious necessity, but that is a constraint which is not required by the tree automaton formulation, which need not differentiate "inputs" and "outputs".)

  • by treyd on 8/3/24, 4:56 PM

    I'm confused about what this could be: a title this technical, but posted somewhere with the signal-to-noise ratio of Twitter, where I also can't view it.

  • by sporkl on 8/3/24, 8:12 PM

    This kind of reminds me of what happens when you implement an interpreter in a relational programming language, which lets you do cool stuff like generating quines by specifying that the program and its output should be the same.

    A quick search led to this paper, which is exactly what I'm talking about: http://webyrd.net/quines/quines.pdf

  • by verdverm on 8/3/24, 8:25 PM

    Super interesting. Sounds like the algorithm I created, Prioritized Grammar Enumeration (PGE).

    https://seminars.math.binghamton.edu/ComboSem/worm-chiu.pge_...

    PGE is essentially a BFS/DFS traversal of the space of all formulas, using local enumeration of the AST. The biggest gains were from eliminating duplicate work (commutativity / associativity) and not going down bad branches (too much complexity added for no meaningful change to output). There is a lot of overlap in ideas here, and a lot of open research questions that could be worked on (like whether RL could be used to guide the search, A*-style). There's definitely an exponential explosion or wall as the AST gets wider / deeper.
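
    A toy version of that duplicate-elimination step (commutativity only; handling associativity as well would additionally require flattening + and * into n-ary nodes): canonicalize each AST by ordering the operands of commutative operators, then count how much each size level collapses.

      import qualified Data.Set as Set

      data F = X | One | Add F F | Mul F F
        deriving (Eq, Ord, Show)

      -- Sort the operands of commutative nodes so that, e.g.,
      -- Add X One and Add One X share one canonical form.
      canon :: F -> F
      canon (Add a b) = comm Add (canon a) (canon b)
      canon (Mul a b) = comm Mul (canon a) (canon b)
      canon f         = f

      comm :: (F -> F -> F) -> F -> F -> F
      comm op a b = if a <= b then op a b else op b a

      -- All formulas of exactly the given size.
      formulas :: Int -> [F]
      formulas 1 = [X, One]
      formulas s = [ op l r | k <- [1 .. s - 2]
                            , l <- formulas k
                            , r <- formulas (s - 1 - k)
                            , op <- [Add, Mul] ]

      main :: IO ()
      main = mapM_ report [3, 5, 7, 9]
        where
          report s =
            let fs   = formulas s
                uniq = Set.size (Set.fromList (map canon fs))
            in putStrLn ("size " ++ show s ++ ": " ++ show (length fs)
                          ++ " raw, " ++ show uniq ++ " canonical")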

    At one point I wrote the core algorithm in Haskell, which made it so much more concise and beautiful, but I eventually landed on Python: https://github.com/verdverm/pypge

    In all of Genetic Programming / Symbolic Regression, everyone starts by trying to generate computer code and then switches to just math formulas. They are different classes of problems, because code has more "genes" and is order-sensitive, whereas math is not.

  • by garyrob on 8/3/24, 5:28 PM

    Maybe this could be used with the Evolutionary Algorithm subset known as Genetic Programming, originally popularized by John Koza in the 80s?

  • by rowanG077 on 8/3/24, 11:44 PM

    My intuition tells me you will quickly run out of memory on anything that isn't just a small function. Your state space will just be so big that your superpositions will become too large.

  • by jmull on 8/3/24, 10:44 PM

    It's kind of vague, but... it sounds like this is essentially a breadth-first search (where the graph being searched is represented by a grammar)?

    Hard to tell.