from Hacker News

Ask HN: Hardware for 1k RPS?

by gsky on 5/31/25, 12:03 AM with 3 comments

I ran an uncensored model on a CPU server. As expected, it's dead slow (a minute or two per query).

What kind of hardware (GPU) do I need to serve 1k RPS?

I could not find APIs for uncensored models, which kinda forced me to run it locally.

  • by eddythompson80 on 5/31/25, 5:00 AM

    Depends on your model size and how many copies of it can fit in memory. Multiply the size by 1k and divide by the memory capacity of the hardware for a rough ballpark of how many devices you need.
  • by barnabee on 5/31/25, 12:04 AM

    https://venice.ai claim to offer uncensored models (I’ve not tested that claim)
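
For anyone who wants to plug numbers into the ballpark from eddythompson80's reply, here is a minimal sketch of that arithmetic. The function name and the example figures (a 14 GB model, 80 GB of memory per GPU) are made up for illustration and do not come from the thread. It takes the heuristic literally, i.e. one model copy per request per second, so treat the result as a rough upper bound; servers that batch requests onto a single copy would likely need far fewer devices.

```python
import math


def ballpark_gpu_count(model_size_gb: float, target_rps: int, gpu_memory_gb: float) -> int:
    """Rough device count: (model size * target RPS) / memory per device, rounded up.

    Hypothetical helper; it only encodes the rule of thumb from the reply
    and ignores batching, KV cache, and per-request latency.
    """
    if model_size_gb <= 0 or target_rps <= 0 or gpu_memory_gb <= 0:
        raise ValueError("all inputs must be positive")
    # Total memory if every request per second had its own copy of the model.
    total_memory_gb = model_size_gb * target_rps
    # Round up, since you cannot provision a fraction of a device.
    return math.ceil(total_memory_gb / gpu_memory_gb)


# Illustrative numbers only (assumed, not from the thread).
print(ballpark_gpu_count(model_size_gb=14, target_rps=1000, gpu_memory_gb=80))
```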