by funfunfunction on 3/25/23, 12:44 AM with 86 comments
by SilverBirch on 3/25/23, 10:10 AM
It doesn't matter how good your LLM is: the information it needs in order to document the code simply isn't there. You're never going to get a comment out of this that says "This interface is meant to be backwards compatible with the interface Bob once wrote on a napkin in the pub on a particularly quiet Friday afternoon when he decided to reinvent Kafka".
by divan on 3/25/23, 8:10 AM
Main impression: it hallucinates like crazy. I asked "How does authorization of HTTP request work?" and it started spitting out an explanation of how the user's bcrypt hash is stored in a SQLite database and the token is stored in a Redis cache. There is no sign of SQLite or Redis whatsoever in this project.
In another query it started confidently explaining how the `getTeam` and `createTeam` functions work. There is no such entity, or even the word "team", anywhere in the codebase. To add insult to injury, it said that this whole team-handling logic is stored in `/assets/sbadmin2/scss/_mixins.scss`.
Another time it offered an extremely detailed explanation of a business-logic question, linking to a lot of existing files from the project, but the explanation was completely off.
Sometimes it offered meaningful explanations but ignored the question. For example, I asked it to explain the relation between two entities and it started showing how to display that entity in an HTML template.
But I guess it's just a question of time before tools like this become a daily assistant. They seem invaluable for newcomers to a codebase.
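One cheap guard against the invented names described above (a `getTeam` that exists nowhere in the codebase) is to check every backticked identifier in a generated answer against the actual source text. The following is only a minimal sketch: the function names, the `sources` dictionary, and the sample answer are all hypothetical, and a plain substring search will miss plenty of subtler errors.

```python
import re


def backticked_names(explanation: str) -> set[str]:
    """Collect identifier-like tokens that the explanation wraps in backticks."""
    return set(re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", explanation))


def unsupported_names(explanation: str, sources: dict[str, str]) -> set[str]:
    """Return the backticked names that appear in none of the source texts."""
    corpus = "\n".join(sources.values())
    return {name for name in backticked_names(explanation) if name not in corpus}


# Hypothetical stand-in for a project's files (file name -> contents).
sources = {
    "auth.py": "def check_token(request):\n"
               "    return request.headers.get('Authorization')\n",
}
# Hypothetical generated answer that mixes one real name with two invented ones.
answer = "The `getTeam` and `createTeam` functions call `check_token`."

print(sorted(unsupported_names(answer, sources)))  # ['createTeam', 'getTeam']
```

A check like this only catches names that do not exist at all. It says nothing about an explanation that links to real files but describes them wrongly.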
by ch33zer on 3/25/23, 2:38 AM
https://github.com/context-labs/autodoc/blob/83f03a3cee62d6e...
> You are acting as a code documentation expert for a project called ${projectName}. Below is the code from a file located at \`${filePath}\`. Write a detailed technical explanation of what this code does. Focus on the high-level purpose of the code and how it may be used in the larger project. Include code examples where appropriate. Keep you response between 100 and 300 words. DO NOT RETURN MORE THAN 300 WORDS. Output should be in markdown format. Do not say "this file is a part of the ${projectName} project". Do not just list the methods and classes in this file. Code: ${fileContents} Response:
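For anyone who wants to experiment with that prompt outside the tool, here is a rough sketch of filling in its placeholders. It is written in Python with `string.Template`, which happens to accept the same `${name}` placeholder syntax. This is not the tool's own code: `build_prompt` and the sample arguments are hypothetical, and the middle of the prompt is abbreviated.

```python
from string import Template

# Abbreviated copy of the quoted prompt; only the first and last sentences
# are reproduced, the instructions in between are omitted for brevity.
PROMPT = Template(
    "You are acting as a code documentation expert for a project called "
    "${projectName}. Below is the code from a file located at `${filePath}`. "
    "Write a detailed technical explanation of what this code does. "
    "Output should be in markdown format. "
    "Code: ${fileContents} Response:"
)


def build_prompt(project_name: str, file_path: str, file_contents: str) -> str:
    """Substitute the three placeholders; raises KeyError if one is missing."""
    return PROMPT.substitute(
        projectName=project_name,
        filePath=file_path,
        fileContents=file_contents,
    )


# Hypothetical inputs, purely for illustration.
prompt = build_prompt("demo", "src/index.ts", "export const answer = 42;")
print(prompt)
```

Note that the prompt only ever sees one file's contents and its path, so anything that lives outside that file is invisible to the model.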
by hedora on 3/25/23, 2:27 AM
Spamming your own code base with comments that are either obvious, misleading or wrong was previously unfathomable to me.
Most people think I’m unrealistically pessimistic.
Well done.
by andrewmcwatters on 3/25/23, 5:22 AM
Autogenerating function documentation seems like such a low bar by comparison. It's like taking limited creativity and applying it with high-powered tools.
Literally like asking for a faster horse.
Tell me how WebKit generates tiles for rasterizing a document tree. Show me specifically where it takes virtualized rendering commands and translates them into port specific graphics calls.
Show me the specific binary format and where it is written for Unreal Engine 5 .umaps so that I can understand the embedded information for working between different types of software or porting to different engines.
Some codebases are so large that it literally doesn't matter if individual functions are documented when you have to build a mental model of several layers of abstraction to understand how something works.
by golem14 on 3/25/23, 7:19 AM
I believe that in the future, training sets will contain more and more automatically generated material that is not well curated, leading to a spiral of ever-declining quality.
by ch33zer on 3/25/23, 4:29 AM
It would be really cool if we could take code + docs, feed it into an LLM and get a determination of whether the code matches what's in the docs. It could also be a good way to evaluate the correctness of the generated docs from the linked tool (assuming it works).
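A rough sketch of what that check could look like, assuming you already have some way to call a model: `ask_llm` below is a hypothetical callable (prompt in, text out), not a real library API, and the MATCH/MISMATCH reply convention is an assumption of this sketch.

```python
from typing import Callable


def build_check_prompt(code: str, docs: str) -> str:
    """Ask for a one-word verdict on the first line so the reply is easy to parse."""
    return (
        "Compare the documentation with the code.\n"
        "Answer with the single word MATCH or MISMATCH on the first line, "
        "then explain briefly.\n\n"
        f"Documentation:\n{docs}\n\nCode:\n{code}\n"
    )


def docs_match_code(code: str, docs: str, ask_llm: Callable[[str], str]) -> bool:
    """Return True only if the model's first reply line is exactly MATCH."""
    reply = ask_llm(build_check_prompt(code, docs)).strip()
    first_line = reply.splitlines()[0].strip().upper() if reply else ""
    return first_line == "MATCH"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, used only for this demo."""
    return "MISMATCH\nThe docs mention a cache that the code never touches."


print(docs_match_code("def add(a, b): return a + b", "Adds and caches.", fake_llm))
```

The verdict is only as trustworthy as the model behind `ask_llm`, so this is better treated as a flag for human review than as proof of correctness.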
by userbinator on 3/25/23, 3:11 AM
by rkagerer on 3/25/23, 3:54 AM
by zx8080 on 3/25/23, 2:27 AM
It would be hell to lose trust in API docs because of those risks.
by petesergeant on 3/25/23, 3:38 AM
by verdverm on 3/25/23, 2:31 AM
by smrtinsert on 3/25/23, 3:26 PM
by splatzone on 3/25/23, 2:40 AM
The thing I’m wondering about is the cost. How much would it cost to run this on the entire WordPress source, for example?
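A back-of-the-envelope way to answer that for any repository: count the characters in the source files, convert to tokens, and multiply by a per-token price. Everything numeric below is an assumption of this sketch, not a real figure: roughly 4 characters per token is only a common rule of thumb, and the price and character count are placeholders to replace with your provider's rate and your own measurement. It also covers input tokens only, so the generated documentation would cost extra.

```python
from pathlib import Path


def count_chars(root: str, suffixes: tuple[str, ...] = (".php",)) -> int:
    """Total characters in all files under root with one of the given suffixes."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += len(path.read_text(errors="ignore"))
    return total


def estimate_cost(total_chars: int, price_per_1k_tokens: float,
                  chars_per_token: float = 4.0) -> float:
    """Rough input-side cost: characters -> estimated tokens -> price."""
    tokens = total_chars / chars_per_token
    return tokens / 1000 * price_per_1k_tokens


# Placeholder numbers: 4,000,000 characters at a hypothetical 0.002 per 1K tokens.
print(estimate_cost(4_000_000, price_per_1k_tokens=0.002))
```

For a real checkout you would pass `count_chars("path/to/source")` as `total_chars`.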
by whiplash451 on 3/25/23, 11:10 AM
I think people who dismiss this kind of tool because it can hallucinate stuff are off topic.
The AI will get better and better, but more importantly we will evolve and learn to work with this kind of tool.
by ImageDeeply on 3/25/23, 3:01 AM