by arealaccount on 4/20/25, 9:56 PM with 14 comments
by jsnell on 4/20/25, 10:42 PM
Back of the envelope:
OpenAI's inference costs last year were $4B. "Tens of millions" would be at least $20M, i.e. 0.5% of that.
That $4B is not just the electricity cost. It also has to cover the amortized cost of the hardware, the cloud provider's margin, etc.
Let's say an H100 costs $30k and has a lifetime of 5 years. I make that about $16 / day in depreciation. An H100 run at 100% utilization will use about 17 kWh of electricity in a day. What does that cost? $2-$3 / day? Let's assume the cloud provider's margin is 0. That still means power consumption is maybe 1/5th of the total inference cost.
So the comparison is $800M of electricity vs. $20M of pleasantries (2.5%).
Can 2.5% of their tokens be pleasantries? Seems impossible. A "please" is a single token, which will be totally swamped by the output, which will typically be 1000x that.
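For anyone who wants to poke at the numbers, here is a rough sketch of the same back-of-the-envelope arithmetic in Python. Every input is one of the guesses from the comment above, not measured data; the variable names and the $2.50 midpoint for daily power cost are assumptions made for illustration.

```python
# Back-of-the-envelope estimate; all inputs are rough guesses from the comment.
INFERENCE_COST_USD = 4e9        # stated annual inference cost
PLEASANTRY_COST_USD = 20e6      # lower bound for "tens of millions"
H100_PRICE_USD = 30_000         # assumed purchase price
H100_LIFETIME_YEARS = 5         # assumed useful life
POWER_COST_PER_DAY_USD = 2.5    # assumed midpoint of the $2-$3 / day guess
POWER_SHARE_ASSUMED = 1 / 5     # the comment's rounded share of cost that is power

# Straight-line depreciation per day.
depreciation_per_day = H100_PRICE_USD / (H100_LIFETIME_YEARS * 365)

# Power share implied by the per-day figures (zero provider margin).
# This comes out below 1/5, so the 1/5 used above is a generous rounding.
power_share_computed = POWER_COST_PER_DAY_USD / (
    depreciation_per_day + POWER_COST_PER_DAY_USD
)

# Electricity budget and the two ratios quoted in the comment.
electricity_budget = INFERENCE_COST_USD * POWER_SHARE_ASSUMED
share_of_total = PLEASANTRY_COST_USD / INFERENCE_COST_USD
share_of_electricity = PLEASANTRY_COST_USD / electricity_budget

print(f"depreciation: ${depreciation_per_day:.2f} / day")
print(f"computed power share: {power_share_computed:.1%}")
print(f"electricity budget: ${electricity_budget / 1e6:.0f}M")
print(f"pleasantries vs total: {share_of_total:.1%}")
print(f"pleasantries vs electricity: {share_of_electricity:.1%}")
```

Changing the lifetime, price, or power cost shifts the power share, but the pleasantry figure still has to be a percent-level slice of the electricity bill for the claim to hold.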
by pupppet on 4/20/25, 10:20 PM
by gblargg on 4/20/25, 10:16 PM
by jowea on 4/20/25, 10:12 PM
I wonder if they mean it or if it's just a joke answer.
by marklyon on 4/21/25, 3:55 PM