by lewq on 12/21/23, 8:13 PM with 3 comments
It includes a GPU scheduler that does fine-grained GPU memory scheduling. Kubernetes can only schedule whole GPUs; we schedule per GB of GPU memory, which lets us pack both inference and fine-tuning jobs into the same fleet. The scheduler fits model instances into GPU memory to trade off user-facing latency against GPU memory utilization.
It's a pretty simple stack: a control plane plus a fat container that runs anywhere you can get hold of a GPU (e.g. runpod).
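To make the per-GB packing idea concrete, here is a minimal sketch in Python. It is not Helix's actual scheduler: the `Gpu` class, the `pack` function, the job names, and the memory figures are all hypothetical, and it uses a plain first-fit-decreasing heuristic that ignores the latency side of the trade-off.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Hypothetical GPU record: tracks which jobs hold how many GB."""
    name: str
    total_gb: int
    jobs: list = field(default_factory=list)  # list of (job_name, gb)

    @property
    def free_gb(self) -> int:
        return self.total_gb - sum(gb for _, gb in self.jobs)


def pack(jobs, gpus):
    """Place (job_name, gb) pairs largest first on the first GPU with room.

    Returns the names of jobs that did not fit anywhere.
    """
    unplaced = []
    for name, gb in sorted(jobs, key=lambda j: j[1], reverse=True):
        target = next((g for g in gpus if g.free_gb >= gb), None)
        if target is None:
            unplaced.append(name)
        else:
            target.jobs.append((name, gb))
    return unplaced


# Illustrative numbers only; real model footprints will differ.
gpus = [Gpu("gpu0", 24), Gpu("gpu1", 24)]
jobs = [
    ("sdxl-inference", 8),
    ("llm-inference", 4),
    ("llm-finetune", 20),
    ("sdxl-finetune", 16),
]
unplaced = pack(jobs, gpus)

for g in gpus:
    print(g.name, g.jobs, "free:", g.free_gb)
print("unplaced:", unplaced)
```

With these made-up sizes, all four jobs fit on two 24 GB GPUs, whereas whole-GPU allocation would need one GPU per job.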
Architecture: https://docs.helix.ml/docs/architecture
Demo walkthrough showing runner dashboard: https://docs.helix.ml/docs/overview
Run it yourself: https://docs.helix.ml/docs/controlplane
Discord: https://discord.gg/VJftd844GE
Please roast me!
by leblancfg on 12/22/23, 3:56 AM
If I can be cheeky, be sure to repost over the coming days at different hours – you're likely to spawn more traffic that way =)
by lewq on 12/21/23, 11:29 PM
Demo: https://www.youtube.com/watch?v=Ym4nPSzfer0
About Helix
Helix is a generative AI platform that you can run on our cloud or deploy in your own data center or cloud account. It provides an easy-to-use interface for open source AI that's accessible to everyone.
Under the hood, it uses the best open source models and includes a GPU scheduler that fits model instances into GPU memory to trade off user-facing latency against GPU memory utilization.
If you think this is cool, please vote for us on https://www.producthunt.com/posts/helix-5 today.
Docs: https://docs.helix.ml/docs/overview
Architecture: https://docs.helix.ml/docs/architecture
Things to try with LLM fine-tuning using Helix:
- https://docs.helix.ml/docs/papers
- https://docs.helix.ml/docs/engaging-content
- https://docs.helix.ml/docs/insights-data
- https://docs.helix.ml/docs/website-content
Sample sessions for SDXL:
- https://app.tryhelix.ai/session/e1b50789-a209-46c8-aa60-4d09...
- https://app.tryhelix.ai/session/cc6004cd-111b-48ae-9a8c-d651...
- https://app.tryhelix.ai/session/d50db369-4ffa-4a49-88dd-1cff...