by gok on 4/5/25, 8:13 PM with 48 comments
by Game_Ender on 4/5/25, 9:01 PM
EDIT - Seems that Groq has stopped selling their chips and now will only partner to fund large build outs of their cloud [2].
0 - https://groq.com/the-groq-lpu-explained/
2 - https://www.eetimes.com/groq-ceo-we-no-longer-sell-hardware
by simonw on 4/5/25, 9:02 PM
All three of those can also be accessed via OpenRouter - with both a chat interface and an API:
- Scout: https://openrouter.ai/meta-llama/llama-4-scout
- Maverick: https://openrouter.ai/meta-llama/llama-4-maverick
Scout claims a 10 million input token length but the available providers currently seem to limit to 128,000 (Groq and Fireworks) or 328,000 (Together) - I wonder who will win the race to get that full sized 10 million token window running?
Maverick claims 1 million and Fireworks offers 1.05M while Together offers 524,000. Groq isn't offering Maverick yet
by parhamn on 4/5/25, 9:20 PM
Very few of the models supported on Groq/Together/Fireworks support function calling. And rarely the interesting ones (DeepSeek V3, large llamas, etc)
by minimaxir on 4/5/25, 9:24 PM
$0.11 per 1M tokens, a 10 million content window (not yet implemented in Groq), and faster inference due to fewer activated parameters allows for some specific applications that were not cost-feasible to be done with GPT-4o/Claude 3.7 Sonnet. That's all dependent on whether the quality of Llama 4 is as advertised, of course, particularly around that 10M context window.
by greeneggs on 4/5/25, 8:49 PM
by vessenes on 4/5/25, 8:54 PM
by sinab on 4/6/25, 12:40 AM
Is there some technical limitation on the context window size with LPUs or is this a temporary stop-gap measure to avoid overloading groq's resources? Or something else?
by jasonjmcghee on 4/5/25, 9:13 PM
Out of curiosity, the console is letting me set max output tokens to 131k but errors above 8192. what's the max intended to be? (8192 max output tokens would be rough after getting spoiled with 128K output of Claude 3.7 Sonnet and 64K of gemini models.)
by growdark on 4/5/25, 9:12 PM
by geor9e on 4/6/25, 12:31 AM
by imcritic on 4/5/25, 9:12 PM