by axelfontaine on 9/29/22, 6:03 PM with 174 comments
by ccooffee on 9/29/22, 7:30 PM
I haven't been able to figure out how the "unmount" of a virtual thread works. As stated in this article:
> Nearly all blocking points in the JDK have been adapted so that when encountering a blocking operation on a virtual thread, the virtual thread is unmounted from its carrier instead of blocking.
How would I implement this logic in my own libraries? The underlying JEP 425[0] doesn't seem to list any explicit APIs for that, but it does give other details not in the OP writeup.
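My reading of JEP 425 (an assumption on my part, not an official API): libraries never unmount a virtual thread explicitly. Any code that parks via java.util.concurrent.locks.LockSupport — which all java.util.concurrent primitives ultimately bottom out in — unmounts automatically when the current thread is virtual. A hand-rolled one-shot blocking cell built only on park/unpark, for illustration:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// A minimal hand-rolled "blocking" cell built only on LockSupport.
// The point for library authors: there is no explicit "unmount" call.
// LockSupport.park() itself does the right thing -- on a virtual thread
// it releases the carrier; on a platform thread it blocks as usual.
final class OneShotCell<T> {
    private final AtomicReference<T> value = new AtomicReference<>();
    private volatile Thread waiter;

    T take() {
        waiter = Thread.currentThread();
        T v;
        while ((v = value.get()) == null) {
            LockSupport.park(); // unmounts the carrier when run on a virtual thread
        }
        return v;
    }

    void put(T v) {
        value.set(v);
        Thread w = waiter;
        if (w != null) LockSupport.unpark(w);
    }
}
```

So a library that already blocks through j.u.c primitives (locks, queues, semaphores) should be virtual-thread-friendly with no changes; the adaptations mentioned in the article are for the JDK's own native blocking points (sockets, file I/O).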
by geodel on 9/29/22, 7:36 PM
I tried moving a plain old Tomcat-based service to a scalable Netty-based reactive stack, but it turned out to be too much work and an alien programming model. With Loom/virtual threads, the only thing I will be looking for is a server supporting virtual threads natively. Helidon Nima would fit the bill here, as all other frameworks/app servers have so far just been slapping virtual threads onto their thread-pool-based systems. And unsurprisingly, that is not delivering the great perf expected from a virtual-thread-based system.
by anonymousDan on 9/29/22, 10:39 PM
Having said all that, this sounds super cool and I think is 100% the way to go for Java. Would be interesting to revisit the implementation of something like Akka in light of this.
by thom on 9/29/22, 8:16 PM
by samsquire on 9/29/22, 7:27 PM
I implemented a userspace 1:M:N timeslicing thread multiplexer — kernel threads to lightweight threads — in Java, Rust and C.
I preempt hot for and while loops by setting the looping variable to its limit from the kernel multiplexing thread.
It means threads cannot suffer resource starvation.
https://github.com/samsquire/preemptible-thread
The design is simple. But having native support as in Loom is really useful.
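A rough Java sketch of the preemption trick as I read the description above (assumed from the repo's summary, not its actual code): the loop variable lives in shared memory, so a scheduler thread can force the loop to exit by jumping the variable to its limit.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of cooperative preemption via the loop variable: the hot loop
// counts through shared state, and a supervising thread "preempts" it by
// setting the counter to the loop's exit condition.
final class PreemptibleLoop {
    static final long LIMIT = 1_000_000_000L;
    final AtomicLong i = new AtomicLong();
    volatile long iterationsDone;

    void run() {
        long n;
        while ((n = i.getAndIncrement()) < LIMIT) {
            iterationsDone = n + 1; // hot work would go here
        }
    }

    void preempt() {
        i.set(LIMIT); // scheduler thread: force the loop to its exit condition
    }
}
```

The real implementations linked above patch the loop variable directly rather than paying for an atomic per iteration; this version just makes the mechanism visible.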
by Blackthorn on 9/29/22, 8:58 PM
by gigatexal on 9/29/22, 8:53 PM
by smasher164 on 9/29/22, 9:14 PM
How long until OS vendors introduce abstractions to make this easier? Why aren't there OS-native green threads, or at the very least user-space scheduling affordances for runtimes that want to implement them without overhead in calling blocking code?
by rr808 on 9/30/22, 2:13 AM
by mikece on 9/29/22, 8:11 PM
by jeffbee on 9/29/22, 7:49 PM
This extremely common misconception is not true of Linux or Windows. Both Windows and Linux have demand-paged thread stacks whose real size ("committed memory" in Windows) is minimal initially and grows when needed.
by lenkite on 9/29/22, 9:20 PM
by polskibus on 9/30/22, 10:56 AM
by stefs on 9/29/22, 11:37 PM
first question: so, as the article states, the ONLY performance upside of virtual threads (versus os threads) is the number of inactive threads you can keep around, thanks to lower per-thread memory overhead.
for some reason i was expecting to read something about context switching cost too.
as far as i understand, virtual thread context switches are most likely anywhere between a lot cheaper than and roughly as expensive as their carrier thread context switches, depending on how much memory has to be copied around and how the next thread to execute is found.
the problem here is that virtual context switches may be cheaper, but have to be executed in addition to the os thread context switches, so the overall efficiency is actually lower because more work is spent scheduling (os vs. os+virtual).
to minimize this it might be possible for privileged applications to disable os thread context switching for the carrier threads as long as there are active virtual threads. that way, the context switching and scheduling overhead is reduced from "os vs. os+virt" to "os vs. virt". i.e. as soon as there are active virtual threads the carrier thread is excluded for os scheduler until there aren't any active virtual threads anymore (or, alternatively, the virtual thread pool is empty).
is this a thing? does this make sense? would it be worth it? do operating systems even support "manual" (i.e. by the app) thread scheduling hints? or are the carrier threads only rarely taken out of schedule because they're not really put to sleep as long as there are active virtual threads anyway, making this a non-issue?
second question: as far as i understand blocking os threads, the scheduler stores which thread is waiting on which io resource, and the appropriate thread gets woken up once a waited-on io resource is available. this is not much of a problem with a few hundred or thousand os threads, but with virtual threads, the io resource must now be linked by the os to the os thread running the virtual thread executor's scheduler, and then by the virtual thread scheduler to the virtual thread waiting on the resource. so for example, if there are 100,000 inactive virtual threads waiting for a network response and one arrives, the os scheduler has to match it to an os thread first (the one the vt scheduler runs on), and then the vt scheduler has to match it to one of the virtual threads. i.e. two lookups in hashtables with 100,000 entries each (one mapping io to os threads, the other io to virtual threads). is this how it works, or do i misunderstand this? as async models have the same issue but work fine, i guess this isn't really a problem in practice.
also, as far as i understand, the os thread woken up is given a kind of resource id it's been woken up for, instead of "well, you went to sleep for a certain resource id, so it's obvious which one you've been woken up for" as in blocking IO.
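A toy model of the lookup described above — a registry mapping resource ids to parked threads, with delivery as a hash-map probe plus an unpark. The names are illustrative, not the JDK's internal poller, but the shape shows why 100,000 waiters cost no more per event than 100: both lookups are O(1).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.LockSupport;

// Toy waiter registry: a thread parks itself under a resource id, and an
// event loop wakes exactly that thread with one hash-map removal. With
// virtual threads, LockSupport.park() in awaitEvent() also unmounts the
// carrier, so a parked waiter costs no os thread at all.
final class WaiterRegistry {
    private final Map<Integer, Thread> waiting = new ConcurrentHashMap<>();

    void awaitEvent(int resourceId) {
        waiting.put(resourceId, Thread.currentThread());
        while (waiting.containsKey(resourceId)) {
            LockSupport.park(); // loop guards against spurious wakeups
        }
    }

    void deliver(int resourceId) { // called by the poller / event loop
        Thread t = waiting.remove(resourceId);
        if (t != null) LockSupport.unpark(t);
    }
}
```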
by bheadmaster on 9/30/22, 4:46 AM
by mgraczyk on 9/29/22, 8:35 PM
The first objection in the article is that with async/await you may forget to use an async operation and could instead use a synchronous one. This is not a real problem. Languages like JavaScript do not have any synchronous operations, so you can't use them by mistake. Languages like Python and C# solve this with simple lint rules that tell you when you make this mistake.
The second objection is that you have to reimplement all library functions to support await. This is a bad objection because you also have to do this for virtual threads. Based on how long it took to add virtual threads to Java versus adding async/await to other languages, it seems like virtual threads were much more complicated to implement.
The programming model here sounds analogous to using gevent with python vs python async/await. My opinion is that the gevent approach will die out completely as async/await becomes better supported and programmers become more familiar.
EDIT: Looking more at the "Related Work" section at the bottom, I think I understand the problem here. The "Structured Concurrency" examples are unergonomic versions of async/await. I'm not sure what I'm missing, but this seems like a strictly worse way to write structured concurrent code.
Java example:
Response handle() throws ExecutionException, InterruptedException {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        Future<String> user = scope.fork(() -> findUser());
        Future<Integer> order = scope.fork(() -> fetchOrder());
        scope.join();           // Join both forks
        scope.throwIfFailed();  // ... and propagate errors
        // Here, both forks have succeeded, so compose their results
        return new Response(user.resultNow(), order.resultNow());
    }
}
Python equivalent:
async def handle() -> Response:
    # scope is implicit, throwing on failure is implicit.
    user, order = await asyncio.gather(findUser(), findOrder())
    return Response(user, order)
You could probably implement a similar abstraction in Java, but you would need to pass around and manage the scope object, which seems cumbersome.
by jsyolo on 9/29/22, 11:20 PM
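A gather-style wrapper like the one the parent comment describes is doable in plain Java without exposing a scope object. A minimal sketch (the helper name is hypothetical, and StructuredTaskScope isn't needed — ExecutorService.invokeAll already joins and collects):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical gather() helper, analogous to asyncio.gather(): run the
// tasks concurrently, return results in submission order, and surface a
// task failure as an ExecutionException. Sketched with a plain
// ExecutorService; on a Loom JDK one could pass
// Executors.newVirtualThreadPerTaskExecutor() so each task gets its own
// virtual thread.
final class Gather {
    static <T> List<T> gather(ExecutorService pool, List<Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        List<Future<T>> futures = pool.invokeAll(tasks); // blocks until all complete
        List<T> results = new ArrayList<>(futures.size());
        for (Future<T> f : futures) {
            results.add(f.get()); // rethrows a failed task's exception, wrapped
        }
        return results;
    }
}
```

What this sketch gives up versus StructuredTaskScope.ShutdownOnFailure is early cancellation: invokeAll waits for every task even after one has failed, whereas the scope shuts down its siblings on the first failure.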