from Hacker News

Hey, GitHub – Waiting list signup

by rcshubhadeep on 11/10/22, 9:00 AM with 241 comments

  • by dang on 11/10/22, 5:20 PM

    This is the third such thread in the last 24 hours which consists of nothing but an elaborate waiting list signup. I've changed the titles to make that clear.

    GitHub Blocks – waiting list signup - https://news.ycombinator.com/item?id=33537706 - Nov 2022 (41 comments)

    GitHub code search – waiting list signup - https://news.ycombinator.com/item?id=33537614 - Nov 2022 (48 comments)

    A good HN discussion needs more than a waiting list signup. A good time to have a thread would be when something is actually available.

  • by dustedcodes on 11/10/22, 9:44 AM

    When I talk to my Google Home, 50% of my brain power is engaged in predicting and working out how best to phrase something so that the "AI" understands what I mean, and the other 50% is used to actually think about what I want to accomplish in the first place. This is just about okay for things like switching lights on/off or requesting a nice song I want to listen to, but I could never be productive programming like this. When I'm in the zone I don't want to waste any mental capacity supplementing an imperfect AI; I want to be thinking 100% about what I want to code and just let my fingers do the work.

    For that reason I think this will be less appealing to developers than GitHub may think, otherwise I think it's a cool idea.

  • by ggerganov on 11/10/22, 9:11 AM

    Very interesting - I was sort of expecting it to happen soon.

    I have been playing with using Whisper + Github Copilot in Vim [0]. The Whisper text transcription runs offline with a custom C/C++ inference and I use Copilot through the copilot.nvim plugin for Neovim. The results were very satisfying.

    Edit: And just in case there is interest in this, the code is available [1]. It would be awesome if someone helped wrap this functionality in a proper Vim plugin.

    [0] https://youtu.be/3flN9kTcZJY

    [1] https://github.com/ggerganov/whisper.cpp/tree/master/example...

  • by Sheeny96 on 11/10/22, 11:16 AM

    I feel like this is being misunderstood - the long-term view of this wouldn't be code scribing, it'd be for non-technical people to be able to instantly create things. Imagine being able to say out loud to your phone "hey, create me a view of all the weather data from the past year correlated against x and y in z view format". The code is the means, not the product.
  • by tweetle_beetle on 11/10/22, 9:39 AM

    From memory, there was a time (end of the millennium?) when using voice recognition to write documents was the next big thing. There was a pricey bit of software for Windows that was popular with power users, and they would spend hours training it to their voice.

    Then it seemed to just die off. I don't think it was bad technology, because I don't think novelty value was enough to account for its popularity - you had to put hours in to get it to work well, it wasn't a casual toy.

    What's changed since then in terms of technology? Unless it's very significant, I suspect it will go the same way. Apart from an assistive technology viewpoint, my gut instinct is that it's not that satisfying or rewarding talking to a computer all day.

  • by chocolatkey on 11/10/22, 9:13 AM

    If this works well, I would pay a seriously high amount of money. My daily coding time is currently limited by the pain in my hand/fingers that eventually becomes too uncomfortable, and I have to wait for a "cooldown" period of days to "reset" my hands back to normal. I can't even code on a normal keyboard or trackpad for a long time anymore.

    The problem with current voice programming systems is that they're just too slow, so I end up getting impatient and using my fingers anyway.

  • by FloatArtifact on 11/10/22, 1:20 PM

    The challenge that remains in speech coding is not generating code so much as navigating through existing code or an application.

    There are only two ways to do this effectively, and unfortunately no one has taken the true path to accessibility. The more common way is plugins/extensions that grab information from the editor.

    Accessibility is more than just one editor. It's the OS and all the applications. Microsoft needs to take the hard route to make an accessibility UI automation server to grab that information and only make up the difference through plugins as needed.

    It's all about grabbing information from the application and generating on the fly commands, not just parsing free dictation in order to get the best accuracy.

    It takes a lot of expertise to make any sort of UI automation fast and efficient for navigating and selecting text or out-of-focus menu items.

    I've fussed around and managed to get tree-sitter to navigate across code, with generic commands like 'next function'. Code simply isn't pronounceable when it's written by others, so navigating across generic tokens is really the best method. Other methods can then be used for fine navigation if needed.

    My hope is that they develop a grammar system that is open source and integrates with accessibility frameworks focused on performance.

    I wish I could have a phone call with the development team.

  • by Cort3z on 11/10/22, 10:33 AM

    However weird and seemingly useless this might appear to the normal programmer on here, I see this as a huge accomplishment and an incredibly important tool. Why? Accessibility.

    Let’s hope that I never get in a serious accident or get a disabling disease, but if I do, I am not planning on giving up coding. What would you do if you lost your hands, or became permanently paralyzed? This is the tool we need to combat that. Hats off to GitHub on this one.

  • by MauranKilom on 11/10/22, 9:38 AM

    Related: https://www.youtube.com/watch?v=MzJ0CytAsec

    It does look like we've made some progress in the 15 years since. I do wonder how this would work in an office setting though - so much noise, so much distraction, and so much crosstalk between programmers...

  • by geewee on 11/10/22, 3:14 PM

    Having programmed and navigated my PC via voice exclusively for about 6 months, done a ton of research and written several articles about it and what options are out there [0][1], I think this might be pretty ground-breaking stuff.

    Inputting code with voice is generally difficult, often due to variable names, casing, punctuation etc being hard to get right in voice-to-text. I think this might help quite a lot with that.

    _However_, some of the hardest things in voice coding aren't necessarily the input. Navigating large codebases is hard, and editing existing code in particular can be extremely difficult, probably much more so than inputting new code.

    I have my doubts that the demonstration shown here can make complex editing tasks simple, but if it does - I cannot overstate how huge a leap forward it is.

    [0]: https://www.gustavwengel.dk/state-of-voice-coding-2017/

    [1]: https://www.gustavwengel.dk/state-of-voice-coding-2019/

  • by birriel on 11/10/22, 10:56 AM

    In the meantime, Talon is pretty good. You can use Vim motions and commands as you normally would, except using your voice (this applies to any editor, really):

    https://talonvoice.com/

  • by pfd1986 on 11/10/22, 11:22 AM

    I think commenters here are -- as usual -- missing the point. This is the training ground (literally) for better models able to respond to commands like "take the CSV from my desktop, plot columns A and D and check if the KL divergence is close to zero". And from that to more complex tasks. You always need the first step and this is it.
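    For concreteness, the kind of code such a command would have to produce might look roughly like this (a sketch only; inline sample data stands in for the desktop CSV, and the plotting step is omitted):

```python
import numpy as np
import pandas as pd

# Stand-in for "take the CSV from my desktop" -- replace with pd.read_csv(...)
df = pd.DataFrame({"A": [1, 2, 3, 4], "D": [1.1, 2.1, 2.9, 4.2]})

# "check if the KL divergence is close to zero": normalize both columns
# into probability distributions, then compute D(p || q)
p = (df["A"] / df["A"].sum()).to_numpy()
q = (df["D"] / df["D"].sum()).to_numpy()
kl = float(np.sum(p * np.log(p / q)))

print(f"KL divergence: {kl:.4f}")  # small and non-negative here, i.e. close to zero
```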

    I'm bullish.

  • by onion2k on 11/10/22, 9:45 AM

    I've tried writing documentation and fiction using speech-to-text and, for me, it doesn't work, because apparently the part of my brain I use to think about what I'm going to say is the same part I use to actually say it, so I can't do both things at once. I end up writing far more slowly than I can type.
  • by singularity2001 on 11/10/22, 9:55 AM

    In case anyone else stopped after watching the video, if you scroll down a bit further you see the list of

    FEATURES

    Write/edit code

    Just state your intent in natural language and let Hey, GitHub! do the heavy lifting of suggesting a code snippet. And if you don't like what was generated, ask for a change in plain English. Go to the next method

    Code navigation

    No more using mouse and arrow keys. Ask Hey, GitHub! to...

        go to line 34
        go to method X
        go to next block
    
    Control the IDE

    “Toggle zen mode”, “run the program”, or use any other Visual Studio Code command.

    Code Summarization

    Don’t know what a piece of code does? No problem! Ask Hey, GitHub! to explain lines 3-10 and get a summary of what the code does.

    Explain lines 3 - 10

  • by susrev on 11/10/22, 9:42 AM

    All i could think of while looking at this was having to tell Siri where every comma and period should go while texting with it.

    "insert curly brace", "insert semicolon", "insert insertion", etc. does not sound too fun.

  • by pmontra on 11/10/22, 10:10 AM

    My reactions to the demo (when all is good there is no reaction, so here are only the problematic ones, sorry)

    1) import matplotlib.pyplot as plt

    Why "as plt"?! Leave the import alone. But this is a matter of style.

    2) Get titanic csv data from the web [...]

    Surprise: it turns out that "the web" is a URL on raw.githubusercontent.com. Hopefully I'll be able to spell a URL of my choice.

    3) clean records from titanic data where age is null

    Somehow I already know that there is an Age field and somehow it knows that it must capitalize age into Age

    4) fill null values of column Fare with average column values

    The generated code looks great but somehow I managed to spell a capitalized Fare this time :-) (this is probably a typo in the demo)

    5) Hey,Github! New line

    Inserting a new line can't take so many words. We're going to do without new lines or rely on a formatter or something equivalent.

    6) plot line graph of age vs fare column

    This is where it becomes evident that there was no need to import as plt because I'm not pressing those keys anyway. But this is style and it's going to be uniform across all the users of these tools.

    7) Hey, Github! Run program

    Good.
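    Pieced together, the demo's commands seem to produce roughly this (my reconstruction: an inline sample stands in for the raw.githubusercontent.com CSV, and the column names are taken from the demo):

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; the demo shows the plot inside VS Code
import matplotlib.pyplot as plt

# "Get titanic csv data from the web" -- inline sample instead of the demo's URL
titanic = pd.DataFrame({
    "Age":  [22.0, None, 38.0, 26.0, None],
    "Fare": [7.25, 71.28, None, 7.92, 8.05],
})

# "clean records from titanic data where age is null"
titanic = titanic[titanic["Age"].notnull()].copy()

# "fill null values of column Fare with average column values"
titanic["Fare"] = titanic["Fare"].fillna(titanic["Fare"].mean())

# "plot line graph of age vs fare column"
plt.plot(titanic["Age"], titanic["Fare"])
plt.savefig("age_vs_fare.png")
```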

    Considerations:

    A) Why do commands (new line, run) need "Hey, GitHub!", which is pretty long and terrible to repeat all day long (just imagine having to say "Hey Joe" every time we say a sentence to Joe, within a long conversation with Joe), while text-to-code doesn't?

    B) We got a graph at the end. Now what should I do to edit the code in those 99% of cases where I got the graph wrong? An acceptable answer could be mouse and keyboard. It's a little underwhelming but voice to code already gave me the structure of the code.

    C) Does that mean that Microsoft and GitHub are going to know all the closed source code we'll write for our customers (there might be contractual implications) or is this something that will be self hosted in our machines?

  • by hcnews on 11/10/22, 9:28 AM

    To note, there's a class-action lawsuit against GitHub Copilot, since it learns from a bunch of open source code with very specific licenses. It's very interesting from the perspective of establishing copyright in AI training. Hopefully it goes the distance and some nuanced arguments come out in the court case.

    https://www.theverge.com/2022/11/8/23446821/microsoft-openai...

  • by nightski on 11/10/22, 9:34 AM

    Spoken language is incredibly ambiguous. It's one thing to generate a drawing which can vary wildly in output and still be acceptable. It's another to specify something precisely to a computer. Working with non-programmers on a daily basis it is incredible how difficult it is to communicate even relatively simple things without confusion.

    So all the more power to them, but I am very skeptical. Especially since co-pilot has zero knowledge of the formal semantics of programming languages.

    This is a lot different from the half-assed autocomplete it already does, since that at least has some context.

  • by jasonlfunk on 11/10/22, 9:33 AM

    I probably wouldn’t use this to write code, but I could see it being really useful for navigating around a project.

    “Go to line 35” “Open the model controller” “Show the get method and set method side by side”

  • by amarant on 11/10/22, 12:42 PM

    Oh cool, my brother used to wish out loud that something like this existed a few years back when his wrists were really killing him. His wrists were so far gone he couldn't even type on an ergonomic keyboard for any length of time, so he used to wish he could just talk instead.

    For me, I got an ergonomic keyboard before my wrists went bad, and so far they seem to be holding up!

    Moral of the story: get a good keyboard early, or you might need a tool like this one someday!

  • by wooptoo on 11/10/22, 10:17 AM

    Hey Github what did the previous developer actually _mean_ with this piece of legacy code?
  • by evnix on 11/10/22, 9:55 AM

    Eye strain is one reason I have been waiting for something like this. If I could close my eyes and just navigate the codebase through a mental model and some voice commands, I really wouldn't mind paying!

    I have looked at some tools for the blind, but you need just way too much dedication for it to work for you and since you have working eyes it is usually easier to just open your eyes.

  • by glenjamin on 11/10/22, 9:43 AM

    There was an excellent talk at Strange Loop a few years ago by Emily Shea about how she'd learned to code in vim using her voice to combat RSI.

    https://www.youtube.com/watch?v=YKuRkGkf5HU

    The demos are in Ruby, but I could imagine that languages with strong type-aware auto-completion could be easier to do.

  • by philmander on 11/10/22, 11:31 AM

    This is effectively a new, higher-level programming language without a fixed syntax: describing the "what", not the "how", and much closer to natural language than to computer language.

    The voice part seems like an (albeit important) accessibility add on.

    I'm sure it won't be perfect, but it's an amazing step forward in the evolution of programming languages.

  • by silverlake on 11/10/22, 1:57 PM

    I’m working on something similar. The target market is the 99% of people who want to program ad-hoc domain-specific problems. For example, generating charts w/o having to dig through all the data sources (Wolfram Alpha does a simple version of this). Building a financial risk model for a client’s specific request (you have to be a whiz at Excel, python or some internal ide). Even for home automation, my mom can’t use Alexa’s awful app to customize routines.

    I don’t think the voice part is necessary. It’s easy enough to slap ASR on the front. But going from natural language -> full problem spec -> code is hard in the general case, but doable in well-understood domains. Why can’t Scotty talk to a computer? (https://youtube.com/watch?v=hShY6xZWVGE&feature=share)

  • by lakomen on 11/10/22, 10:14 AM

    Imagine sitting there, talking to your computer, and trying to get the notations right.

    If err unequal nil opening bracket, no no don't open the racket opening bracket... BRACKET, do you know what a bracket is No don't do a do while, delete delete. Don't delete everything... sigh

    Well something like that, I imagine it being a very painful experience.

  • by Quequau on 11/10/22, 9:29 AM

    I remember a talk given some years ago by a man who was using voice to text for creating source code. The key point I remember from his talk & demonstration is that it was not casual ordinary speech but instead a very weird mashup of sounds intended to represent the various symbols which we use in source code.
  • by nxpnsv on 11/10/22, 10:14 AM

    GitHub is doing a whole lot. I think I prefer to edit my code in an editor, not on the website where it's hosted. And I think I don't want fancy AI driven code editor features using my code either. But I guess it is nice they are considering solutions for vision impaired users.
  • by raidicy on 11/10/22, 11:04 AM

    I really hope this is very easy to use. I have severe RSI and can barely surf the web. I tried using other voice to code stuff and it just hurt my voice so I'm hoping I can speak very naturally. I'm really looking forward to seeing if this can help me code again.
  • by pcj-github on 11/10/22, 10:28 AM

    I could see it being useful for things like "goto line 42" or "rename this file as...", or very simple things like that, otherwise, I don't want the cognitive overhead of having to translate coding intent through a voice interpreter.
  • by falcor84 on 11/10/22, 12:28 PM

    I think this, or a future version of this, would have real potential.

    I'm thinking about this in terms of the navigator-pilot pair programming approach, and believe that, as a senior, if it's even half as good as working with a fresh-out-of-uni hire, then it could have real value. When there's a piece of code that I would like written, and I have good test cases in mind but would prefer to offload it to someone, I could perhaps write the test cases and function signatures (maybe with the bot's help), get the bot to fill in the blanks until it passes the tests, and then give it direct feedback on how to refactor the code.

    I've signed up for the waiting list and am excited to try this out.

  • by kgrax01 on 11/10/22, 9:29 AM

    People can’t seriously believe this is going to be useful at all?

    I can see this helping as an accessibility tool, but beyond that I don’t think it will be useful. This kind of assumes you know everything about what you’re doing, most of the time you don’t.

  • by boredumb on 11/10/22, 1:37 PM

    As someone who works remotely from home, the last thing I need is to start babbling to myself in code for 8 hours a day. I imagine that's a one way ticket to developing some sort of disorder.
  • by ddevault on 11/10/22, 10:02 AM

    Someone emailed me the other day to share their FOSS voice control system. I was really impressed. It seems to map syllables onto actions in a modal sense ala vim. If I were to build a voice control system, it would look much like this.

    https://numen.johngebbie.com/index.html

    It's free software, it's local to your machine, you don't have to sign up for it, and it works today.

  • by okasaki on 11/10/22, 10:06 AM

    Great for accessibility, but I don't see how this would work well in an open office, or even at home if other people are around. Seems really annoying.
  • by tempodox on 11/10/22, 12:25 PM

    Imagine using this in a setting where you're not alone in the room. Imagine using this surrounded by other developers who do the same.
  • by danwee on 11/10/22, 4:00 PM

    Curious: in "Clean records from titanic data where age is null", how does it know that the age field is exactly `Age` and not just `age`? You cannot know this without examining the data set (the headers), so is the software inspecting the loaded CSV "on the fly" before we tell it to actually execute the code?
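    One plausible mechanism (purely a guess, not anything GitHub has confirmed) is that the tool, or the code it generates, normalizes the spoken name against the headers of the already-loaded frame, case-insensitively:

```python
import pandas as pd

def resolve_column(df: pd.DataFrame, spoken: str) -> str:
    """Map a spoken column name ("age") to the real header ("Age")
    by case-insensitive comparison against the loaded headers."""
    for col in df.columns:
        if col.lower() == spoken.lower():
            return col
    raise KeyError(f"no column matching {spoken!r}")

df = pd.DataFrame({"Age": [22, None], "Fare": [7.25, 8.05]})
print(resolve_column(df, "age"))  # -> Age
```

    That would only work after the CSV has been loaded at least once, which is exactly the "on the fly" inspection the question raises.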
  • by kevmo314 on 11/10/22, 1:58 PM

    Why are all the comments here so negative? Maybe typing is a hard sell, but some of the navigation stuff seems quite useful. Even being able to invoke VS Code's command palette would be really cool with this. Something like "Open Dockerfile" would be useful and maybe faster than typing.
  • by lkrubner on 11/10/22, 4:17 PM

    My worst prediction ever was at the end of my book, when I struck a positive note about voice interfaces. The startup I was at in 2015 had the pitch "Let your sales people talk directly to Salesforce" and we pushed the limits of what we could do with NLP. That particular startup had spectacularly bad management and so it flamed out in a series of screaming, raging fights, which I documented here:

    https://www.amazon.com/Destroy-Tech-Startup-Easy-Steps/dp/09...

    But at the end of the book I struck an upbeat note, about how the technology was advancing quickly and within 3 or 4 years someone would achieve something much greater than our own limited successes.

    But I was wrong. 7 years later I'm surprised at how little progress there has been. I don't see any startup that's done much better than what we did in 2015. Voice interfaces remain limited in accuracy and use.

  • by hintymad on 11/11/22, 12:31 AM

    So this is a frontend of Copilot. The example of "import pandas" getting translated into "import pandas as pd" is pretty convincing, as the tool helps developers to state their intentions. On the other hand, "hey, github, a new line" kills me.
  • by lleontop on 11/10/22, 12:40 PM

    We have come a long way. I remember when announcements like this one were done by companies on April 1st!
  • by squarefoot on 11/10/22, 12:25 PM

    If translation is semantic and not literally identical, chances are that the user asks for a piece of code and it outputs something that is 100% identical to code that is copyrighted elsewhere. Big "blame the AI" legal loophole waiting to happen?
  • by karmasimida on 11/10/22, 10:41 AM

    Actually would be useful.

    If this is reliable I would pay to use it to some capacity, like add an argument.

  • by crucialfelix on 11/10/22, 9:36 AM

    I spent half an hour today trying to convince the O2 voice agent to get me a real person. Conversational AI is a special kind of hell filled with unhappy paths.

    But for a glimpse of the future watch The Expanse or read William Gibson's Agency.

  • by darepublic on 11/10/22, 1:18 PM

    Execution is everything with this. I've wanted something like this so I could actually code while performing other activities or in various states of intoxication. Don't code and drive. Don't drink and code
  • by Tade0 on 11/10/22, 9:57 AM

    I hope to see the click consonant "‖" adopted as "||" one day.
  • by tabasselejambon on 11/10/22, 9:55 AM

    Let's try to picture the noise in an open space full of people using that... focusing is going to be difficult, at least for people like me who are easily distracted by background noise/conversations.
  • by troelsSteegin on 11/10/22, 1:24 PM

    What if the code in question is a DSL? Something, say, that is syntactically Python but with a namespace defined through a narrow set of imports. This would be interesting to explore for end-user scripting.
  • by mtkhaos on 11/10/22, 10:02 AM

    Nice attempt, and an interesting workflow using a prompt-based transformer. I would prefer being able to spawn a command palette and skip the voice, along with having a choice between different variations.
  • by gopheryourshelf on 11/10/22, 9:52 AM

    Imagine an office where everyone is sitting screaming at their computer.
  • by P5fRxh5kUvp2th on 11/10/22, 12:44 PM

    Programming Perl with speech recognition (an oldie but goodie)

    https://www.youtube.com/watch?v=vPXEDW30qBA

  • by mindvirus on 11/10/22, 12:23 PM

    This is awesome. I could see using this to write code on my phone even.
  • by manesioz on 11/10/22, 2:15 PM

    Interesting. I would find this annoying because it's so different from what I'm used to, but the potential it has for people with disabilities is huge.
  • by kdmytro on 11/10/22, 1:20 PM

    This is not going to play well with open-space offices.
  • by WormholeCreator on 11/10/22, 11:43 AM

    It is not practical if we have to describe each and every line.

    Also, imagine you are sitting in an office with other teammates: what happens if all of them talk at once while working on different projects? It will disturb others in terms of noise pollution.

    But it will definitely be a fun project, and it might work perfectly when you are working alone from home.

  • by iillexial on 11/10/22, 9:37 AM

    Those who say it's useless: what do you think about blind people using this, or those who can't type?
  • by dimazhlobo on 11/10/22, 9:39 AM

    Why does the OAuth scope require permission to “operate on your behalf” when the app is “not owned or operated by GitHub”?

    :/

  • by karmasimida on 11/10/22, 11:09 AM

    One concern: in an office space, saying things aloud is... awkward, to say the least.
  • by teratron27 on 11/10/22, 9:34 AM

    I'm sure this will work well with my Scottish accent... (or any non-US accent)
  • by ausudhz on 11/10/22, 9:26 AM

    Next is thoughts-to-code. Just read my mind; I'm gonna sit there and think.
  • by hdjjhhvvhga on 11/10/22, 10:59 AM

    I'd like to see how they do with my creative variable and function names.
  • by jenscow on 11/10/22, 11:59 AM

        bool success equals user dot no i mean ah fuck stop stop quit
  • by polishdude20 on 11/10/22, 10:12 AM

    Thank god we're remote. An open office space with this would suck.
  • by v3ss0n on 11/11/22, 3:29 AM

    So software development houses will become call centers.
  • by akuji1993 on 11/10/22, 11:30 AM

    export const ButtonComponent; FunctionComponent no Github no semicolon i meant colon Github backspace 5 times no backspace delete delete Github Arrrgh goddammit
  • by danjc on 11/10/22, 11:59 AM

    And you thought open plan offices were bad already!
  • by polyterative on 11/10/22, 4:47 PM

    I have RSI. GitHub, please make it work well.
  • by nicolas_lorenzi on 11/10/22, 3:27 PM

    I imagine happiness in the open space
  • by hbarka on 11/10/22, 10:15 AM

    How does it do with SQL?
  • by eurasiantiger on 11/10/22, 10:39 AM

    I do not want this.
  • by mezobeli on 11/10/22, 9:18 AM

    Copilot -> Pilot
  • by univue on 11/10/22, 11:16 AM

    Add some comments
  • by anshumankmr on 11/10/22, 9:40 AM

    No. Thanks.
  • by kashanjunaid on 11/10/22, 2:03 PM

    Very interesting!
  • by singularity2001 on 11/10/22, 9:17 AM

    Drop the "Hey Github" nonsense (hopefully it's only for illustration purposes anyways) and … this will be a generational paradigm change in how to write code… if it works. The hard part will be editing code with your voice too. Like "no, I meant …" etc.

    VERY PROMISING, in any case you can just manually fill the gaps with the keyboard!

  • by kajaktum on 11/10/22, 10:11 AM

    This feels like GitHub expanding because it can't find anything else to do... Being a for-profit organization means it's unable to say "you know what, we pretty much have everything we wanted, so we're just going into maintenance/optimization mode". This happens all the time in open source projects, where they simply tell their users to move elsewhere for better alternatives, but it will never happen in a for-profit organization.