from Hacker News

Groq surpasses 1,200 tokens/sec with Llama 3 8B

by YourCupOTea on 5/30/24, 6:52 PM with 31 comments

  • by LorenDB on 5/30/24, 6:57 PM

    Groq is an insane company. SambaNova (discussed yesterday[0]) is also very promising. However, what I really want to see is local AI accelerator chips a la Tenstorrent Grayskull that can boost local generation to hundreds of tokens per second while being more efficient than GPUs.

    [0]: https://news.ycombinator.com/item?id=40508797

  • by windowshopping on 5/30/24, 7:07 PM

    Is Groq related to Twitter's Grok, or is that just a very unfortunate naming coincidence?

  • by andy_xor_andrew on 5/30/24, 7:14 PM

    When reading Hacker News you develop a signal/noise filter, where lots of headlines make bold claims but you filter them out as embellishment or exaggeration.

    My bullshit detector went off when I first saw Groq posted on HN - a startup is making their own chips (doubt) that perform faster than anything Nvidia has for inference (doubt) and accelerate LLMs to hundreds/thousands of tokens per second?? Mega doubt.

    But... then I tried their demo, and... yeah, it's that good. Such an amazing company of talented individuals.

  • by behnamoh on 5/30/24, 7:18 PM

    They're not responsive to my questions on Twitter, so I'm asking here:

        When will Groq support a real API (not experimental beta preview)?
    
        When will Groq support logprobs?!
    
        When will Groq actually tell us what their rate limit is?!
    
    
    Until these are answered, many of us can't actually build on Groq.

    Edit: It seems I'm getting downvoted by Groq employees...