from Hacker News

Why Vercel overhauled its serverless infrastructure for the AI era

by sylvainkalache on 2/5/25, 8:44 PM with 1 comment

  • by nadis on 2/11/25, 3:02 AM

    Interesting shift from speed to cost savings, and from web app development to AI-native apps.

    "But as Vercel's customers started using the serverless platform to build AI apps, they realized they were wasting computing resources while awaiting a response from the model. Traditional servers understand how to manage idle resources, but in serverless platforms like Vercel's, 'the problem is that you have that computer just waiting for a very long time, and while you're claiming that space of memory, the customer is indeed paying,' Rauch said.

    Fluid Compute gets around this problem by introducing what the company is calling "in-function concurrency," which "allows a single instance to handle multiple invocations by utilizing idle time spent waiting for backend responses," Vercel said in a blog post last October announcing a beta version of the technology. "Basically, you're treating it more like a server when you need it," Rauch said.

    Suno was one of Fluid Compute's beta testers, and saw "upwards of 40% cost savings on function workloads," Rauch said. Depending on the app, other customers could see even greater savings without having to change their app's configuration, he said."
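    The idea behind in-function concurrency can be illustrated with a minimal sketch. This is not Vercel's implementation, just a hypothetical Python/asyncio model of the described behavior: a single instance interleaves many invocations during the idle time spent awaiting a slow backend (like an AI model), instead of dedicating one instance per invocation.

    ```python
    import asyncio
    import time

    # Hypothetical stand-in for a slow model call: the function mostly
    # waits on network I/O, holding memory but doing no CPU work.
    async def call_model(prompt: str) -> str:
        await asyncio.sleep(0.2)  # simulated model latency
        return f"response to {prompt!r}"

    async def handle_invocation(prompt: str) -> str:
        # One invocation: just awaits the backend response.
        return await call_model(prompt)

    async def serve_concurrently(n: int) -> float:
        # One "instance" handling n invocations concurrently: the event
        # loop reuses the idle await time rather than leaving it wasted.
        start = time.monotonic()
        await asyncio.gather(*(handle_invocation(f"p{i}") for i in range(n)))
        return time.monotonic() - start

    if __name__ == "__main__":
        elapsed = asyncio.run(serve_concurrently(10))
        # Serially these 10 invocations would take ~2.0s; interleaved
        # on one instance they finish in roughly one round-trip (~0.2s).
        print(f"10 invocations in {elapsed:.2f}s")
    ```

    The billing implication the article describes follows directly: if ten invocations share one instance's idle time, the platform holds (and the customer pays for) one slot of memory instead of ten.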