from Hacker News

Tips for Building Voice User Interfaces

by MatthewPhillips on 1/17/20, 12:28 PM with 10 comments

  • by nmstoker on 1/17/20, 7:01 PM

    I like that the article formalises this, but one concern is that the "whys" seem defined by currently available uses and are thus limiting.

    I don't doubt that there are plenty of good practical/psychological reasons that these narrowly defined cases will work best and I respect that, but it feels like it will hardly push the envelope if we mimic rather than exceed.

    Clearly it's good to walk before you try to run, but as the quality of NLP and speech recognition improves and people become more comfortable, I would hope to see growth around more open ended uses. Voice could ditch the specificity that apps and GUIs demand - you need to be in the right app to do a task, but voice could open up a vast range of access (continuing how Alexa and Google Home already aggregate skills). This opens up more challenges around ambiguity (it's well known that the ubiquitous "play X for me" command could often be handled by many skills), but that's part of what makes it interesting.

    Lastly I totally agree with those pointing out the speed issue - the right tool for the right job applies as always.

  • by jraines on 1/17/20, 2:58 PM

    Aside from the well known problem of discoverability, there are two things that kill me about VUIs:

    1) Slowness. Not just the lag before you start talking, which I presume will get reduced to basically nothing, but the speed of the assistant's speech and its verbosity. I wish you could make them talk faster and also enable some sort of “terse mode”.

    2) Chaining. I wish they’d listen for a few seconds after the last action and, in that state, be able to act on things like “do that again” or on references to the main subject of the last answer, like “...and what movies was she in last year?”
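The chaining idea above amounts to keeping short-lived session state: remember the last action and the last answer's subject so a follow-up can refer back to them. A minimal sketch (all names here, like `Session` and `handle`, are hypothetical and not any real assistant's API):

```python
# Sketch of "chaining": keep the last action and the last answer's subject
# in short-lived session state so follow-ups can refer back to them.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Session:
    last_action: Optional[Callable[[], str]] = None
    last_subject: Optional[str] = None

def handle(utterance: str, session: Session) -> str:
    text = utterance.lower().strip()
    if text == "do that again":
        # Replay the previously stored action, if any.
        if session.last_action is None:
            return "Nothing to repeat."
        return session.last_action()
    if text.startswith("who starred in "):
        movie = utterance[len("who starred in "):].strip()
        def action(movie=movie):
            return f"Looking up the cast of {movie}."
        # Remember both the action (for "do that again") and the
        # subject (for pronoun follow-ups).
        session.last_action = action
        session.last_subject = movie
        return action()
    if "she" in text.split() or "that" in text.split():
        # Resolve the pronoun against the last answer's subject.
        if session.last_subject:
            return f"(Interpreting the question as being about {session.last_subject}.)"
    return "Sorry, I didn't catch that."
```

Real assistants would resolve references with far more context than a single stored subject, but even this much state makes "do that again" and one-step pronoun follow-ups work.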

  • by rkagerer on 1/17/20, 5:01 PM

    I'd love to see a voice platform which decouples functions from phrases, allowing a community of "command makers" to create their own deeply customized commands.

    I'm not convinced the developer of an app or service is necessarily best suited to craft the voice experience. Give power users who actually interact with the thing daily a decent tool to do some tailoring between your API and your audience, and they'll fix your crappy edge cases and produce more useful interactions. Then add a mechanism to discover and proliferate the best-of-breed results.

    It's similar to how Apple or Google create the mobile platform, but 3rd party developers are the ones who really understand the needs of a particular group of users.
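Decoupling phrases from functions, as described above, could look like a registry where community "command makers" bind their own phrase patterns to a stable API surface. A rough sketch (the registry, decorator, and `set_thermostat` API are all illustrative, not a real platform):

```python
# Sketch: community-authored phrase patterns mapped onto a fixed API,
# so wording can be tailored without touching the underlying service.
import re
from typing import Callable, Dict

# The stable API surface the service developer ships.
def set_thermostat(degrees: int) -> str:
    return f"Thermostat set to {degrees} degrees."

# Community-maintained mapping from phrase patterns to handlers.
commands: Dict[str, Callable[..., str]] = {}

def command(pattern: str):
    """Register a phrase pattern (with named groups) for a handler."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        commands[pattern] = fn
        return fn
    return register

# Two phrasings from two different command makers, one function.
@command(r"set (?:the )?thermostat to (?P<degrees>\d+)")
@command(r"make it (?P<degrees>\d+) in here")
def _thermostat(degrees: str) -> str:
    return set_thermostat(int(degrees))

def dispatch(utterance: str) -> str:
    # Try each community pattern against the normalized utterance.
    for pattern, fn in commands.items():
        m = re.fullmatch(pattern, utterance.lower())
        if m:
            return fn(**m.groupdict())
    return "No command matched."
```

The point of the separation is that the `commands` table can be edited, shared, and ranked by users, while `set_thermostat` stays under the service developer's control.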

  • by sys_64738 on 1/18/20, 5:26 PM

    If I can't speak then how should I interact with a voice interface?