by StratusBen on 11/1/22, 4:03 PM with 50 comments
by jzelinskie on 11/1/22, 5:01 PM
Folks typically only consider memory usage for database connections, but we've also had to consider the p99 latency of establishing a connection. For SpiceDB[0], one place we've struggled with our MySQL backend (originally contributed by GitHub, who are big Vitess users) is preemptively establishing connections in the pool so that it's always full. PGX[1] has been fantastic for Postgres and CockroachDB, but I haven't found anything with enough control for MySQL.
PS: Lots of love to all my friends at PlanetScale! SpiceDB is also a big user of vtprotobuf[2] -- a great contribution to the Go gRPC ecosystem.
[0]: https://github.com/authzed/spicedb
[1]: https://github.com/jackc/pgx
[2]: https://github.com/planetscale/vtprotobuf
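Go's standard database/sql pool has no built-in way to keep itself pre-filled, but a one-shot warm-up is straightforward: open N connections up front, then release them back to the pool as idle. Here is a minimal sketch, assuming the go-sql-driver/mysql driver and a hypothetical DSN; keeping the pool continuously topped up (the harder problem described above) would need a background goroutine repeating this periodically:

```go
package main

import (
	"context"
	"database/sql"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

// warmPool eagerly dials n connections so that the first real queries
// don't pay connection-establishment latency at the tail (p99).
func warmPool(ctx context.Context, db *sql.DB, n int) error {
	conns := make([]*sql.Conn, 0, n)
	var dialErr error
	for i := 0; i < n; i++ {
		c, err := db.Conn(ctx) // dials a fresh connection when none are idle
		if err != nil {
			dialErr = err
			break
		}
		conns = append(conns, c)
	}
	// Close returns each connection to the pool as idle rather than
	// tearing it down, so the pool stays warm for real queries.
	for _, c := range conns {
		c.Close()
	}
	return dialErr
}

func main() {
	// DSN is a hypothetical placeholder.
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/app")
	if err != nil {
		panic(err)
	}
	db.SetMaxOpenConns(20)
	db.SetMaxIdleConns(20) // keep every warmed connection around
	db.SetConnMaxIdleTime(5 * time.Minute)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := warmPool(ctx, db, 20); err != nil {
		panic(err)
	}
}
```

The trick is that Conn.Close returns the connection to the idle pool; as long as SetMaxIdleConns is at least N, the warmed connections stick around until ConnMaxIdleTime expires.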
by gsanderson on 11/1/22, 4:51 PM
by sulam on 11/1/22, 7:39 PM
Did you run into this? If not, I'm curious why not. And if so, how did you manage it?
by unilynx on 11/1/22, 4:50 PM
Data corruption? How?
I'm no MySQL fan, but is this FUD or is it referring to a real issue?
by themenomen on 11/1/22, 5:28 PM
by prithvi24 on 11/1/22, 6:28 PM
Hosted Vitess sounds amazing - love this. Zero-downtime migrations w/ Percona on RDS still suck and waste a lot of time.
by twawaaay on 11/1/22, 5:05 PM
Scaling your database up should only be attempted once you can no longer improve the efficiency of your application. It is always better to put effort into improving efficiency before scaling up.
For example, one trick that allowed me to improve the throughput of one application using MongoDB as a backend by a factor of 50 was capturing queries from multiple requests arriving at the same time and sending them to the database as a single request (statement), then fanning the results out to the respective business logic that needs them. The application was written with Reactor, which makes this much easier than normal thread-based request processing.
For example, if you have 500 people logging in at the same time and fetching their user details, batch those requests (say, every 100ms, up to 100 users) and fetch 100 records with a single query.
You will notice that executing a simple fetch-by-id query, even for hundreds of ids, only costs a couple of times more than fetching a single record.
The application in question was able to fetch 2-3 GB of small documents per second during normal traffic (not an idealised performance test) with just a couple dozen connections.
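A minimal sketch of this request-coalescing pattern in Go, independent of any particular driver or framework: the batcher type and the window, maxBatch, and fetchMany names are hypothetical stand-ins, and fetchMany represents the single $in / IN (...) query that actually hits the database:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// request is a single caller asking for the record with the given id.
type request struct {
	id    int
	reply chan string
}

// batcher coalesces concurrent lookups into one query per batch window.
type batcher struct {
	reqs chan request
}

func newBatcher(window time.Duration, maxBatch int, fetch func([]int) map[int]string) *batcher {
	b := &batcher{reqs: make(chan request)}
	go func() {
		for first := range b.reqs {
			// Start a batch with the first request, then collect more
			// until the window elapses or the batch is full.
			batch := []request{first}
			timer := time.NewTimer(window)
		collect:
			for len(batch) < maxBatch {
				select {
				case r := <-b.reqs:
					batch = append(batch, r)
				case <-timer.C:
					break collect
				}
			}
			timer.Stop()
			ids := make([]int, len(batch))
			for i, r := range batch {
				ids[i] = r.id
			}
			results := fetch(ids) // one round trip for the whole batch
			for _, r := range batch {
				r.reply <- results[r.id] // fan results back out to callers
			}
		}
	}()
	return b
}

// Get blocks until the batched query that includes id has completed.
func (b *batcher) Get(id int) string {
	reply := make(chan string, 1)
	b.reqs <- request{id: id, reply: reply}
	return <-reply
}

func main() {
	// fetchMany stands in for the real batched database query; hypothetical.
	fetchMany := func(ids []int) map[int]string {
		out := make(map[int]string, len(ids))
		for _, id := range ids {
			out[id] = fmt.Sprintf("user-%d", id)
		}
		return out
	}
	b := newBatcher(100*time.Millisecond, 100, fetchMany)

	var wg sync.WaitGroup
	for i := 0; i < 500; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			_ = b.Get(id) // 500 concurrent callers, only a handful of queries
		}(i)
	}
	wg.Wait()
}
```

With a 100ms window and a batch cap of 100, the 500 concurrent callers above collapse into roughly five round trips instead of 500, which is where the throughput multiplier comes from.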