by cjlovett on 5/20/20, 6:52 PM with 152 comments
by neil_s on 5/20/20, 7:56 PM
You can view the demo at https://twitter.com/i/broadcasts/1OyKAYWPRrWKb starting around 29:00.
It's Sam Altman demoing a massive OpenAI model that was trained on GitHub OSS repos using a Microsoft supercomputer. It's not IntelliCode, but the host says they're working on compressing the models down to a size that would be feasible for IntelliCode. The code model uses English-language comments, or simply function signatures, to generate entire functions. Pretty cool.
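For illustration, the demo's input/output shape is a signature plus an English docstring, with the model filling in the body. This is a hypothetical reconstruction of that interaction, not actual output from the model:

```python
# Prompt given to the model: the signature and docstring below.
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards,
    ignoring case."""
    # Everything from here down is the kind of body the model generates.
    s = s.lower()
    return s == s[::-1]
```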
by YeGoblynQueenne on 5/20/20, 11:43 PM
I can see this being a useful tool [1]. However, I don't expect it to have any capacity for innovation. At best this is like having an exceptionally smart autocomplete that can look up code snippets on SO for you (provided those code snippets are no longer than one line).
That's not to say that it can't write new code that nobody has quite written before in the same way. But for a tool like this to be useful, it must stick as close as possible to what is expected, or it will slow development down rather than help it. Which means it can only do what has already been done before.
For instance, don't expect this to come up with a new sorting algorithm out of the blue, or to write good code for a problem when the majority of code solving that problem on GitHub happens to be pretty bad.
In other words: everyone can relax. This will not take your job. Or mine.
____________
[1] I apologise to the people who know me and who will now be falling off their chairs. OK down there?
by tanilama on 5/20/20, 9:23 PM
But here's the thing: the natural-language description of a function is not always this unambiguous.
When you tell a function to 'compute XYZ', what you actually mean is 'check whether X.a exists; if so, execute branch 1), else branch 2)'.
If the logic gets really complicated, then describing it accurately in human language isn't necessarily faster than writing it in code directly. Otherwise we wouldn't need to invent programming languages at all; we could just write compilers that interpret and execute human language.
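A minimal sketch of the gap being described here: the English spec says "compute XYZ", but the actual contract hides branching that the code must make explicit. The names and branch bodies below are hypothetical, chosen only to mirror the comment's example:

```python
def compute_xyz(x):
    # The English description is just "compute XYZ", but the real
    # contract is: if x has attribute `a`, take branch 1, else branch 2.
    if hasattr(x, "a"):
        return x.a * 2   # branch 1
    return 0             # branch 2
```

Any natural-language prompt precise enough to pin down both branches starts to look a lot like the code itself.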
And I'm interested in whether the model itself is conditioned on type constraints. It's neat that they picked Python in this case. But if it were Java or another statically typed language, would this system condition its generation not only on the natural-language text but also on the resulting type system? My bet, from my understanding of the language-modeling approach they use, is that they are not doing this, due to the very high complexity and cost of training and domain adaptation.
Overall, this is again an interesting demo. But I think for code generation from human language to be useful, you really need to be around 99% accurate for it to be remotely practical.
by IdiocyInAction on 5/20/20, 9:01 PM
The thing is, I'd really need to see a live demo to see how good this is. Making mistakes is actually kind of a big issue; as most people know, debugging code is harder than writing it. And a lot of the language models which can write impressive-seeming text also generate masses of garbage. There's no way to know whether this was cherrypicked or not.
The mere fact that it can extract meaning from text like this is already really impressive though.
by parksy on 5/21/20, 9:04 AM
This way developers just write unit tests or functional tests, and the AI generates code and retrains itself until the code passes for all tests. This could happen silently in the background as the developer defines the tests.
A number of natural-language test frameworks exist; Behat, for example, lets you define tests such as:
  Feature: Multiple site support

    Background:
      Given a global administrator named "Greg"
      And a blog named "Greg's anti-tax rants"
      And a customer named "Wilson"
      And a blog named "Expensive Therapy" owned by "Wilson"

    Scenario: Wilson posts to his own blog
      Given I am logged in as Wilson
      When I try to post to "Expensive Therapy"
      Then I should see "Your article was published."

    Scenario: Greg posts to a client's blog
      Given I am logged in as Greg
      When I try to post to "Expensive Therapy"
      Then I should see "Your article was published."
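The generate-until-green loop imagined above could be sketched like this. Everything here is hypothetical: `generate_candidate` stands in for a code-generating model (here just a toy that varies a constant), and in reality the model would be resampled or fine-tuned between attempts:

```python
import random

def generate_candidate(seed):
    """Stand-in for a code-generating model: returns a candidate function."""
    rng = random.Random(seed)
    k = rng.choice([0, 1, 2])
    return lambda a, b: a + b + k  # only k == 0 satisfies the tests

def passes(candidate, tests):
    """Run the developer-written tests against a candidate."""
    return all(candidate(*args) == expected for args, expected in tests)

# The developer only writes these (input, expected-output) test cases.
tests = [((1, 2), 3), ((0, 0), 0)]

candidate = None
for seed in range(100):          # keep sampling until the tests pass
    c = generate_candidate(seed)
    if passes(c, tests):
        candidate = c
        break
```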
It could still fit the dream of describing to a computer what kind of program you want and having it figure out the plumbing. Anyway, interesting work. Very interesting. I remember a few colleagues laughing at me no more than 5 years ago when I suggested that AI would eventually write code. And here it is, in an early version, flawed surely, but only set to improve.
Edit to add: This subject, while insanely interesting to me, is well out of my wheelhouse. I'm guessing there's possibly semantic structure to the above that the type of model being used in the demo can't deal with? Like, this one use-case has to co-exist in an entire ecosystem of dependencies and related entities... Could the model cope with that, or is it just calculating the likelihood of the next character like other models I've seen, but with insane accuracy when it comes to code?
by Voloskaya on 5/20/20, 9:48 PM
Are those two entirely separate and yet exactly similar initiatives?
by grensley on 5/20/20, 7:57 PM
by swalsh on 5/20/20, 8:16 PM
by corbins on 5/20/20, 7:36 PM
by gradys on 5/21/20, 5:17 PM
You'd be surprised how easy it is to get a model that performs as well as what you see in the video. And it's even easier now that people have built great libraries for fine-tuning generative language models.
I encourage you to try it yourself! There are many interesting extensions for people to explore:
- Use bi-directional context (vanilla GPT-2 only sees backward context)
- Integrate with semantic analysis tools.
- Experiment with different context representations. You condition the model on an arbitrary sequence of N tokens. It's not necessarily the case that you should spend that whole budget on the N tokens that came immediately before. What about including the imports at the top of the file? What about the docstrings for functions that were just used? What about the filepath of the current file?
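The third suggestion above, spending the token budget on more than just the immediately preceding lines, might look roughly like this. The helper names, the whitespace "tokenizer", and the budget are all hypothetical; a real system would use the model's actual tokenizer:

```python
def build_context(imports, docstrings, filepath, preceding, budget=2048):
    """Assemble a model prompt from several signals, trimming the
    oldest preceding code first when the token budget is exceeded."""
    def tokens(text):
        return text.split()  # crude stand-in for a real tokenizer

    # Always-included signals: file path, imports, relevant docstrings.
    fixed = [f"# file: {filepath}"] + imports + docstrings
    remaining = budget - sum(len(tokens(line)) for line in fixed)

    kept = []
    for line in reversed(preceding):  # keep the most recent lines first
        cost = len(tokens(line))
        if remaining - cost < 0:
            break
        kept.append(line)
        remaining -= cost
    return "\n".join(fixed + list(reversed(kept)))
```

The interesting design question is exactly the one raised above: which signals earn their share of the budget.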
Don't look at something like this as though watching your job be automated away. Look at it as a tool that you can master and use to move up the stack.
by mring33621 on 5/20/20, 7:59 PM
So the developer's role will shift to:
1) writing good enough descriptions of the code to be generated by the AI model
2) fixing any little issues in the generated code
by simonhughes22 on 5/20/20, 8:35 PM
by jfoster on 5/21/20, 2:56 AM
by symplee on 5/20/20, 8:49 PM
Or, for TDD, generate the unit tests first based on the function name and description. Then, if the dev updates any of those tests, or adds more tests, use that information in auto generating the appropriate code.
by Jach on 5/21/20, 12:41 AM
by cjlovett on 5/20/20, 8:34 PM
by f47il on 5/20/20, 9:51 PM
by rpiguy on 5/20/20, 8:07 PM
by chrisco255 on 5/20/20, 7:31 PM
by imranq on 5/27/20, 6:21 PM
This is a gamechanger for ensuring the reliability of software. Many more people can be involved in the software development process, and inject their domain knowledge into it.
Are there any plans to open source the model? I would love to play around with it.
by Debonnys on 5/20/20, 9:39 PM
In all seriousness, the demo really looks amazing. I'm curious to see more elaborate, real world examples though.
by raghavgoyal14 on 5/21/20, 2:07 AM
by AJRF on 5/21/20, 8:36 AM
However; I fear this moves software engineering closer to the role of something like plumbing.
I've despaired at the state of most software I've used since as far back as I can remember, except when it comes to tools that have the maturity of something like linux, git, emacs, vim and the unix tools.
For software to get good, it needs to be deeply understood by at least one person working on it. If you train an army of warrior drones who get full-line autocompletion, first they'll start forgetting what types a method takes as its parameters, then they'll be less likely to explore codebases, instead plugging in the first autocompletion that comes to their editor.
Their bosses will of course want this in the name of "Getting Shit Done". We already have this sort of divide between developers: those who lean heavily on their tools and those who use minimal editor help. Once you're forced to actually learn the code because your tooling isn't spoon-feeding you, you have a chance to reason from first principles using the code you have available. I don't think it's a shock that a very high percentage of the very best developers use emacs or vim with minimal tooling.
I am aware that this whole comment has subtle tones of superiority and elitism, and I'm genuinely sorry for that, but in my experience it's just true that people who lean really hard on their IDEs to do everything for them are less able to develop creative solutions, and you can tell from conversations with them that they don't really understand what they're doing.
by random32840 on 5/21/20, 6:29 AM
That seems like it would be considerably more effective, because you're removing the noise/overhead of parsing the text and giving the AI a much clearer model of what's being manipulated.
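One concrete reading of this idea, sketched under the assumption that "a clearer model" means a structured view of the code: Python's `ast` module already exposes such a view, naming every construct explicitly so a consumer never has to re-learn parsing from character statistics. This shows only the representation, not any model:

```python
import ast

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)

# The AST labels each construct: a FunctionDef node with named
# arguments, rather than a flat stream of characters.
func = tree.body[0]
print(func.name)                             # add
print([arg.arg for arg in func.args.args])   # ['a', 'b']
```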
by yeldarb on 5/21/20, 1:56 AM
by Avi-D-coder on 5/21/20, 4:40 PM
This will end up being a better TabNine. Models like GPT-2 are still just approximating intelligence; they are not rationally cognizing.
by unixhero on 5/21/20, 6:04 AM
by brenden2 on 5/20/20, 9:20 PM
by neatze on 5/21/20, 12:02 AM
by Bjorkbat on 5/21/20, 4:10 AM
by woile on 5/21/20, 9:58 AM
Is this one in particular open source?
by monkeydust on 5/20/20, 9:41 PM
by sabujp on 5/20/20, 11:12 PM
by boolcow on 5/21/20, 1:12 AM
Creating flashy AI demos is relatively easy. Creating important AI products that actually operate in the real world is the hard part.
by debbiedowner on 5/21/20, 3:52 AM
by mirekrusin on 5/21/20, 7:53 AM
by master_yoda_1 on 5/20/20, 10:52 PM
by pdeligia on 5/20/20, 8:39 PM
by bobly_today on 5/20/20, 7:35 PM
by darepublic on 5/21/20, 3:41 AM
by testeur on 5/27/20, 3:45 PM
by rauf11 on 5/24/20, 6:28 PM
by alpb on 5/20/20, 7:28 PM
by consultutah on 5/20/20, 7:26 PM
by datlife on 5/20/20, 7:24 PM
by ipsum2 on 5/20/20, 7:25 PM
by Vysero on 5/20/20, 7:26 PM
Build me a class which computes the larger of two integers.
The AI is smart enough to write it.
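What the model might plausibly emit for that prompt; this is a hypothetical completion, not actual demo output (and part of the joke is that the "class" is overkill for a one-liner):

```python
class LargerOfTwo:
    """Computes the larger of two integers."""

    def compute(self, a: int, b: int) -> int:
        # Return whichever argument is larger (a wins ties).
        return a if a >= b else b
```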