by kshitij_libra on 4/30/24, 3:16 PM with 11 comments
Main reasons: a) to find the right code, I have to keep resampling, and b) they don't seem able to solve the larger / more complex problems where I actually need the most help.
I found some interesting research on combining LLMs with planning algorithms for complex problems, and some ideas on guiding the LLM's decoding process towards correctness by optimizing it via reward functions and reducing the search space. I've detailed and summarised the main points in the post below.
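The reward-function idea can be sketched, in a very reduced form, as best-of-n rejection sampling: draw several candidates and keep the one a reward function scores highest. This is a toy stand-in for the decoding-time optimization the post discusses, and the reward used here is a made-up heuristic; in practice it would be a compiler check, a test suite, or a learned model.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

public class BestOfN {

    // Pick the candidate with the highest reward -- a minimal stand-in for
    // steering generation with a reward function instead of resampling blindly.
    static String bestOf(List<String> candidates, ToDoubleFunction<String> reward) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String c : candidates) {
            double score = reward.applyAsDouble(c);
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy reward: prefer short candidates that actually return a value.
        ToDoubleFunction<String> reward =
                c -> (c.contains("return") ? 10.0 : 0.0) - c.length() * 0.1;
        List<String> samples = List.of(
                "int add(int a, int b) { return a + b; }",
                "int add(int a, int b) { int s = a + b; System.out.println(s); return s; }",
                "void add(int a, int b) { }");
        System.out.println(bestOf(samples, reward));
    }
}
```

The interesting research question is what happens when the reward is applied *during* decoding rather than after the fact, which is where the search-space-reduction ideas come in.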
Questions:
1) Do you find code LLMs really useful? Please share some stories / examples of where they helped vs. where they didn't. I'm trying to form a better understanding of how they're used.
2) Any other research ideas being pursued in this field? / What are you trying?
Full post with details here: https://kshitij-banerjee.github.io/2024/04/30/can-llms-produce-better-code/
by mindcrime on 4/30/24, 4:48 PM
The couple of times I've done these things, the task involved something like calling some REST API using Apache HttpClient and doing some processing on the response. I never have the exact API details of HttpClient cached "top of mind" since I do this just infrequently enough to not bother remembering the fiddly details. And the LLM did a credible job of giving me the basic structure of what I was trying to do, and then I just had to edit some of the details - mostly in the "process the response" part. Possibly if I spent more time fiddling with better prompting strategies, etc. I'd get more from the things, but I haven't really invested a lot of time on that front yet.
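The task described has roughly this shape. A minimal sketch using the JDK's built-in java.net.http client (the commenter used Apache HttpClient, whose API differs); the endpoint and the JSON field name are hypothetical, and a real project would use a JSON library rather than the naive string extraction shown for the "process the response" step:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestCallSketch {

    // The "process the response" step -- a naive extraction of a top-level
    // string field from a JSON body. (Use Jackson or Gson in real code.)
    static String extractField(String json, String key) {
        String needle = "\"" + key + "\":\"";
        int start = json.indexOf(needle);
        if (start < 0) return null;
        start += needle.length();
        int end = json.indexOf('"', start);
        return end < 0 ? null : json.substring(start, end);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No URL supplied; skip the network call.
            System.out.println("usage: java RestCallSketch <url>");
            return;
        }
        // Build and send the GET request, then process the body.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(args[0]))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(extractField(response.body(), "status"));
    }
}
```

This is exactly the kind of boilerplate-heavy, fiddly-details task where an LLM producing the basic structure saves real time, even if the processing logic still needs hand-editing.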
by 8organicbits on 4/30/24, 3:29 PM
LLMs solve the easy problem (writing code) but don't help with the hard problem (domain knowledge).
by gnabgib on 4/30/24, 4:24 PM
by PaulHoule on 4/30/24, 3:38 PM
It could do some impressive things. I was working on a codebase that used jOOQ to generate SQL code. The agent did not have access to the database or SQL scripts, but it could figure out what the SQL schema was by looking at the jOOQ stubs.
I tried to use it to write a somewhat complex query that involved CTEs in Postgres; there is a chicken-and-egg element of circularity that makes these queries tricky to write. I was able to get it to write very simple jOOQ queries, but it never really understood the problem I had, and it went through quite a few cycles of solutions that weren't right, even after I'd tell it "this didn't compile" or "that won't work because...", with a lot of polite apologizing along the way.
I found it very tiresome to cut and paste code snippets, add imports, fix little things, have it not compile, cut and paste compilation errors, then undo all the changes. With close integration into the IDE, it might be less painful to cycle through a large number of wrong answers.
My take is that LLMs are very strong at tasks that are basically linear transformations from one end to the other. For instance, language translation is like that, at least at the entry level, since roughly every sentence in one text corresponds to a sentence in the other language. It is like translating jOOQ stubs into a SQL script: you don't need to really understand very much, just replace one pattern with another pattern.
Other tasks have an element of looping, which is really fundamental in computer science:
https://en.wikipedia.org/wiki/Halting_problem
I've found that people often get really offended when you point out that LLMs cannot repeal the fundamentals of computer science, but because they invest a finite computing budget into a problem, an LLM just can't do anything that requires running a program that might never terminate. The old book
https://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach
has a running story about a conflict between the Tortoise and Achilles, who struggle to solve a problem isomorphic to the great logical paradoxes and struggle deliciously for a long time before finally understanding the impossibility of what they are doing. Many people misinterpret this book as a critique of the symbolic AI of the 1970s, but it will give you some insight into how "let's just write a loop with an LLM in it" will get you into problems that are just as intractable as symbolic AI seemed to be in the late 1980s.
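The practical consequence of the finite-budget point is that any "loop with an LLM in it" has to be capped explicitly, because no checker can decide in general whether the loop would ever succeed on its own. A minimal sketch of that pattern, with a canned stand-in for the model (the generate/accept interfaces and the toy outputs are hypothetical):

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class BoundedAgentLoop {

    // Generate-and-check under an explicit budget: the only way to guarantee
    // termination, since acceptance might never occur.
    static String runWithBudget(Supplier<String> generate,
                                Predicate<String> accept,
                                int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            String candidate = generate.get();
            if (accept.test(candidate)) return candidate;
        }
        return null;  // budget exhausted: report failure instead of spinning forever
    }

    public static void main(String[] args) {
        // Stand-in "model": cycles through canned outputs; the third passes.
        String[] outputs = {"oops", "nope", "return 42;"};
        int[] i = {0};
        String result = runWithBudget(
                () -> outputs[i[0]++ % outputs.length],
                s -> s.contains("return"),
                5);
        System.out.println(result);
    }
}
```

Nothing here makes the underlying search tractable; the budget just converts "might never halt" into "fails after N tries", which is the honest best an agent loop can offer.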