from Hacker News

ONNX Runtime merges WebGPU backend

by b_mc2 on 4/24/23, 11:18 PM with 35 comments

  • by minimaxir on 4/25/23, 12:43 AM

    Some context for those who aren't in the loop: ONNX is a standardized format for AI models, and ONNX Runtime (https://onnxruntime.ai/) is the engine that runs them. Nowadays, it's extremely easy to export models to the ONNX format, especially language models, with tools like Hugging Face transformers, which has special workflows for it.

    ONNX support in the browser was lacking and limited to CPU, but with a WebGPU backend it may now finally be feasible to run models in the browser on a GPU, which opens up interesting opportunities. From this PR, though, it looks like only a few operations are implemented, so no browser-based GPT yet.
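To make the operator-coverage point concrete: a model in a graph format like ONNX is essentially a list of nodes, each naming an operator, and a backend can only run the model if it implements every operator the graph uses. What follows is a toy sketch in plain Python, not the ONNX Runtime API. The `Node` type, the `missing_ops` and `run_graph` helpers, and both operator tables are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    op_type: str          # operator name, e.g. "Add"
    inputs: List[str]     # names of the values this node reads
    output: str           # name of the value this node produces


# Hypothetical operator tables: a "full" backend and a "partial" one
# that, like an early backend, implements only a few operators.
FULL_BACKEND: Dict[str, Callable[..., float]] = {
    "Add": lambda a, b: a + b,
    "Mul": lambda a, b: a * b,
    "Relu": lambda a: max(a, 0.0),
}
PARTIAL_BACKEND: Dict[str, Callable[..., float]] = {
    "Add": FULL_BACKEND["Add"],
}


def missing_ops(graph: List[Node], backend: Dict[str, Callable[..., float]]) -> List[str]:
    """Return the sorted operator names the backend cannot run."""
    return sorted({node.op_type for node in graph} - set(backend))


def run_graph(
    graph: List[Node],
    backend: Dict[str, Callable[..., float]],
    feeds: Dict[str, float],
) -> Dict[str, float]:
    """Evaluate nodes in order, looking up each operator in the backend."""
    unsupported = missing_ops(graph, backend)
    if unsupported:
        raise NotImplementedError(f"unsupported operators: {unsupported}")
    values = dict(feeds)
    for node in graph:
        args = [values[name] for name in node.inputs]
        values[node.output] = backend[node.op_type](*args)
    return values


# A three-node graph computing y = relu(x * w + b).
GRAPH = [
    Node("Mul", ["x", "w"], "xw"),
    Node("Add", ["xw", "b"], "z"),
    Node("Relu", ["z"], "y"),
]
```

In this sketch, `missing_ops(GRAPH, PARTIAL_BACKEND)` reports `Mul` and `Relu`, so the partial backend rejects the whole graph even though it can run `Add`. That is the sense in which a backend with only a few operations cannot yet run a large language model.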

  • by pumanoir on 4/25/23, 2:52 AM

    A great option, but there is also wonnx, which seems to be more complete and mature. As a bonus, it's implemented in Rust (if you are into it).

    https://github.com/webonnx/wonnx

  • by CSSer on 4/25/23, 12:30 AM

    Okay, I’ve just got to know: what’s up with the commit messages? I’ve never seen anyone just straight up use numbers before.

  • by b_mc2 on 4/25/23, 1:27 AM

    A pretty cool library that uses ONNX is transformers.js [1], and they're already working to add WebGPU support [2].

    [1] https://xenova.github.io/transformers.js/

    [2] https://twitter.com/xenovacom/status/1650634015060156420

  • by jarym on 4/25/23, 1:45 PM

    As an aside, I love ONNX, and it's the main reason I'm sticking with PyTorch. I was able to develop and train an RL model in Python and then convert it to ONNX and call it from C# production code.

    It still took a lot of effort but the final version is very performant and reliable.

  • by Culonavirus on 4/25/23, 3:15 AM

    It would be great if, one lovely day, we could just slap together an Electron app and run inference through WebGPU hassle-free and cross-platform.

  • by WorldPeas on 4/25/23, 2:07 AM

    Quite the uh... interesting commit strategy...

  • by synergy20 on 4/25/23, 12:43 AM

    Interesting. Is there a hello-world example or tutorial somewhere to see how this works in practice?