from Hacker News

Ask HN: Do we have evidence that green threading is faster than OS threads?

by roetlich on 2/3/24, 2:36 AM with 2 comments

I understand that green threads and fibers are much cheaper to spawn and probably make context-switching cheaper. But how much impact do they have in practice?

Let's take backend web dev as an example. The Redbean project[0] claims to be able to do 1 million requests per second using good old fork. That number is huge! That's an order of magnitude more than the number of Google searches per second. Of course that only works when you're not doing any actual work, but it would mean the overhead of forking on every request is very low.

But I have no idea if the cost of context switching actually matters when you are running a lot of threads.

AFAIK, for running a web backend handling HTTP requests, the options are:
- running it all sequentially
- fork()
- multiple OS processes
- one native thread per connection
- a pool of native threads
- fibers and single-threaded async/await
- green threads

So my question is: Does anyone actually have benchmarks of an apples-to-apples comparison?
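To make the question concrete, here is a minimal sketch of what the smallest such measurement could look like in Python. It only times spawning and joining no-op units of work, using OS threads via `threading` and coroutines via `asyncio` as a stand-in for user-space scheduling (Python has no built-in green threads). The function names and the count of 1000 are arbitrary choices for illustration, and the coroutine timing includes event-loop startup, so this is nowhere near a realistic web benchmark.

```python
import asyncio
import threading
import time


def noop():
    """Work done by each OS thread: nothing."""


async def anoop():
    """Work done by each coroutine: nothing."""


def time_os_threads(n):
    """Seconds to start and join n OS threads that do nothing."""
    start = time.perf_counter()
    threads = [threading.Thread(target=noop) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def time_coroutines(n):
    """Seconds to schedule and await n coroutines (includes loop startup)."""
    async def run():
        await asyncio.gather(*(anoop() for _ in range(n)))

    start = time.perf_counter()
    asyncio.run(run())
    return time.perf_counter() - start


thread_secs = time_os_threads(1000)
coro_secs = time_coroutines(1000)
print(f"1000 OS threads: {thread_secs:.4f}s")
print(f"1000 coroutines: {coro_secs:.4f}s")
```

The numbers depend heavily on the machine, OS and Python version, and they only cover spawn cost, not context switching under load.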

I'm aware of the recent .NET experiment, replacing the current async/await over System.Threading.Tasks with green threads[1]. In that test, green threads are slightly slower.

The Wikipedia article[2] is hilariously out of date, using benchmarks from Linux kernel 2.2. It was flagged as out of date in 2014.

I found an old HN discussion[3], but by now that's also probably out of date. It also doesn't have realistic benchmarks.

The answer is probably "it depends", but I'd still like to know if there is any good general data to guide the development of new programming languages, virtual machines, and concurrency libraries.

[0] https://redbean.dev/
[1] https://github.com/dotnet/runtimelab/issues/2398
[2] https://en.m.wikipedia.org/wiki/Green_thread#Performance
[3] https://news.ycombinator.com/item?id=10229704

by GianFabien on 2/3/24, 4:01 AM
The performance of green threads depends on their implementation within the VM or runtime. Some VMs implement a form of cooperative multitasking, which avoids OS-level syscalls and kernel context switches. That is why green threads appear to perform better in some benchmarks.
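As a rough illustration of cooperative switching in user space, here is a minimal sketch using Python generators. It is not how any particular VM implements green threads; `worker` and `run_round_robin` are made-up names. Each `yield` is a switch point handled entirely by the scheduler loop, with no syscall or kernel context switch involved.

```python
from collections import deque


def worker(name, steps, log):
    """A toy 'green thread': does one step of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # cooperative switch point, handled in user space


def run_round_robin(tasks):
    """Run generator tasks until all finish; return the number of switches."""
    ready = deque(tasks)
    switches = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # task yielded, put it at the back of the queue
        switches += 1
    return switches


log = []
switches = run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)       # [('a', 0), ('b', 0), ('a', 1), ('b', 1), ('b', 2)]
print(switches)  # 5
```

The catch with this model is that a task that never yields blocks everything else, which is one reason benchmark results vary so much between runtimes.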
On physical hardware, OS threading performance is influenced by the architecture, cache sizes, cache coherence, inter-processor locks, out-of-order execution, hyper-threading (implemented in silicon), memory size, OS version and, of course, the application mix.
The reason you can't find any benchmarks or performance data is that there are just too many permutations to consider. For example, comparing the .NET test you mentioned with Java green threads running on Linux would be totally meaningless.
by toast0 on 2/3/24, 5:46 AM
Does redbean fork on every request, or does it fork on start and then have each process read and handle one request at a time? It gets weird with keep-alive and TLS, but traditional HTTP with one request per connection can be really nice in a prefork environment with proper accept filters and large socket buffers. Each process just blocks on accept, gets a fully formed GET request, does the work, responds, and then blocks on accept again. The OS kernel takes care of the socket while the client is sending the request and while the client is acking the response.
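The prefork loop described above can be sketched in Python roughly as follows. This is not how redbean works; it is a toy under several assumptions: a POSIX system with `os.fork`, no accept filters, and a request small enough to arrive in a single `recv`. The names `worker_loop`, `prefork` and `fetch`, and the `max_requests` cap (there only so the demo terminates), are made up for illustration.

```python
import os
import socket

RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"


def worker_loop(listener, max_requests):
    """Block on accept, handle one request per connection, repeat."""
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        with conn:
            conn.recv(65536)      # assumption: whole request fits in one read
            conn.sendall(RESPONSE)


def prefork(listener, num_workers, max_requests):
    """Fork the workers once at startup and return their PIDs."""
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:              # child: serve, then exit without cleanup
            try:
                worker_loop(listener, max_requests)
            finally:
                os._exit(0)
        pids.append(pid)
    return pids


def fetch(port):
    """Tiny HTTP/1.0 client: send a GET and read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            data = c.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(16)
port = listener.getsockname()[1]

pids = prefork(listener, num_workers=2, max_requests=2)
responses = [fetch(port) for _ in range(4)]
for pid in pids:
    os.waitpid(pid, 0)            # reap the workers
listener.close()
```

All workers share the same listening socket, and the kernel decides which blocked `accept` gets each connection, so no userland scheduler is involved.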
Green threads can be nice, because some software is easier to write as "read from socket, do work, write, loop", where the context of the socket is right there. And with green threads, you can have a million of them, and so handle a million sockets, on a single machine. I haven't tried, but I don't think you can have a million OS threads or a million OS processes on a single machine.
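The "read from socket, do work, write, loop" shape looks roughly like this in Python. Python has no built-in green threads, so this sketch uses `asyncio` coroutines as a stand-in: one lightweight task per connection, scheduled in user space. The handler name, the upper-casing "work" and the count of 50 clients are arbitrary choices for illustration.

```python
import asyncio


async def handle(reader, writer):
    """One lightweight task per socket: read, do work, write, loop."""
    while True:
        line = await reader.readline()
        if not line:                 # client closed the connection
            break
        writer.write(line.upper())   # the "work" is just upper-casing
        await writer.drain()
    writer.close()
    await writer.wait_closed()


async def client(port, i):
    """Send one line and return the server's reply."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(f"hello {i}\n".encode())
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply


async def main(num_clients):
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        # All clients run concurrently on one OS thread.
        return await asyncio.gather(*(client(port, i) for i in range(num_clients)))


replies = asyncio.run(main(50))
print(replies[0])  # b'HELLO 0\n'
```

Each handler keeps its socket state in ordinary local variables across the awaits, which is the convenience being described. Scaling the client count toward a million would hit file descriptor and memory limits long before the scheduler itself becomes the problem.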
Maybe you don't want that many sockets on one machine, or don't have that kind of traffic, or userland processing of the requests is so significant that you can't service that many sockets on a single machine anyway...