by r3tr0 on 5/13/25, 4:43 PM with 60 comments
by MobiusHorizons on 5/16/25, 5:48 AM
by Animats on 5/16/25, 8:08 AM
This has no unsafe code. It's all done with compare-and-swap. There is locking here, but it's down at the hardware level, within the compare-and-swap instruction. This is cleaner and more portable than relying on cross-CPU ordering of operations.
[1] https://github.com/John-Nagle/rust-vulkan-bindless/blob/main...
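A minimal sketch of the compare-and-swap pattern being described, a retry loop around an atomic counter (the `SlotCounter` type here is illustrative, not taken from the linked bindless code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Index allocator advanced with compare-and-swap.
/// No unsafe code: the only synchronization is the CAS itself.
struct SlotCounter {
    next: AtomicUsize,
}

impl SlotCounter {
    fn claim(&self) -> usize {
        loop {
            let current = self.next.load(Ordering::Relaxed);
            // Try to publish current + 1; if another thread won the race,
            // compare_exchange returns Err and we retry with a fresh read.
            match self.next.compare_exchange(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return current,
                Err(_) => continue,
            }
        }
    }
}
```

(For a plain counter, `fetch_add` does the same thing in one call; the explicit loop is what generalizes to updates that can't be expressed as a single arithmetic op.)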
by gpderetta on 5/16/25, 10:05 AM
Of course, if the mutex array is doing a linear scan to find the insertion point, that would explain the difference, but: a) I can't see the code for the alternative, and b) there is no reason why the mutex variant can't use a free list.
Remember:
- Lock-free doesn't automatically mean faster (it still has other properties that might be desirable even if it's slower)
- Never trust a benchmark you didn't falsify yourself.
[1] when uncontended; when contended, cache-coherence costs will dominate everything else, lock-free or not.
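For reference, a rough sketch of the free-list variant being suggested: a `Mutex`-protected slot store that reuses vacated indices, so inserts don't need a linear scan (hypothetical code, since the benchmarked alternative isn't shown):

```rust
use std::sync::Mutex;

/// Mutex-protected slots that recycle freed indices via a free list,
/// making insert O(1) instead of a scan for an empty slot.
struct Slots<T> {
    inner: Mutex<SlotsInner<T>>,
}

struct SlotsInner<T> {
    items: Vec<Option<T>>,
    free: Vec<usize>, // indices of vacated slots
}

impl<T> Slots<T> {
    fn new() -> Self {
        Slots {
            inner: Mutex::new(SlotsInner { items: Vec::new(), free: Vec::new() }),
        }
    }

    fn insert(&self, value: T) -> usize {
        let mut inner = self.inner.lock().unwrap();
        match inner.free.pop() {
            Some(idx) => {
                inner.items[idx] = Some(value);
                idx
            }
            None => {
                inner.items.push(Some(value));
                inner.items.len() - 1
            }
        }
    }

    fn remove(&self, idx: usize) -> Option<T> {
        let mut inner = self.inner.lock().unwrap();
        let value = inner.items[idx].take();
        if value.is_some() {
            inner.free.push(idx);
        }
        value
    }
}
```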
by 0x1ceb00da on 5/16/25, 3:35 AM
Is it though? Aren't atomic load/store instructions the thing that actually matters? I know the type system ensures that `AtomicUsize` can only be accessed using atomic instructions, but saying it's being watched by the CPU is inaccurate.
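Concretely, the kind of thing an `AtomicUsize` access compiles down to: a plain atomic load or store instruction with the ordering you ask for, not any runtime supervision (a minimal illustration):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static FLAG: AtomicUsize = AtomicUsize::new(0);

fn publish() {
    // Compiles to an atomic store (plus whatever fence the ordering needs);
    // nothing "watches" the variable at runtime.
    FLAG.store(1, Ordering::Release);
}

fn observe() -> usize {
    // Likewise a single atomic load; Acquire pairs with the Release above.
    FLAG.load(Ordering::Acquire)
}
```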
by tombert on 5/16/25, 2:51 AM
I have finally bitten the bullet and learned Rust in the last few months and ended up really liking it, but I have to admit that it's a bit lower level than I generally work in.
I have generally avoided locks by making very liberal use of Tokio channels, though that isn't for performance reasons or anything: I just find locks really hard to reason about for anything but extremely trivial use cases, and channels are a more natural abstraction for me.
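A minimal sketch of that channels-instead-of-locks pattern, assuming Tokio's `mpsc` channel: one task owns the state and everyone else just sends messages, so nothing shared needs a lock.

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<u64>(64);

    // The worker task owns the running total; no Mutex needed.
    let worker = tokio::spawn(async move {
        let mut total = 0u64;
        while let Some(n) = rx.recv().await {
            total += n;
        }
        total
    });

    for n in 1..=10 {
        tx.send(n).await.expect("worker stopped");
    }
    drop(tx); // closing the channel lets the worker loop end

    println!("total = {}", worker.await.unwrap());
}
```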
I've never really considered what goes into these lock-free structures, but that might be one of my next "unemployment projects" after I finish my current one.
by zero0529 on 5/16/25, 7:07 AM
by Fiahil on 5/16/25, 1:07 PM
- you don't reallocate the array
- you don't allow updating/removing previously inserted values
In essence it becomes a log, a Vec<OnceCell<T>> or a Vec<UnsafeCell<Option<T>>>. Works well, but only for a bounded array, so applications like messaging or inter-thread communication are not a perfect fit.
It's a fixed-size vector that can be read at the same time it's being written to. It's not a common need.
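A minimal sketch of that bounded, append-only shape, using std's `OnceLock` (the thread-safe cousin of `OnceCell`) plus an atomic write cursor; illustrative only, not the article's code:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Bounded append-only log: writers claim an index with fetch_add and fill
/// that slot exactly once; readers can look at filled slots concurrently.
struct BoundedLog<T> {
    slots: Vec<OnceLock<T>>,
    next: AtomicUsize,
}

impl<T> BoundedLog<T> {
    fn with_capacity(cap: usize) -> Self {
        BoundedLog {
            slots: (0..cap).map(|_| OnceLock::new()).collect(),
            next: AtomicUsize::new(0),
        }
    }

    /// Returns the index the value landed at, or gives the value back if full.
    fn push(&self, value: T) -> Result<usize, T> {
        let idx = self.next.fetch_add(1, Ordering::Relaxed);
        if idx >= self.slots.len() {
            return Err(value);
        }
        // Each index is claimed by exactly one writer, so set() can't fail.
        let _ = self.slots[idx].set(value);
        Ok(idx)
    }

    fn get(&self, idx: usize) -> Option<&T> {
        self.slots.get(idx).and_then(|slot| slot.get())
    }
}
```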
by gmm1990 on 5/16/25, 3:05 PM
by jillesvangurp on 5/16/25, 9:53 AM
How would this compare to the lock-free abstractions that come with e.g. the java.util.concurrent package? It has a lot of useful primitives and data structures. I expect the memory overhead is probably worse for those.
Support for this is one of the big reasons Java and the JVM have been a popular choice for companies building middleware and data-processing frameworks for the last few decades. Exactly the kind of stuff the author of this article is proposing you could build with this: things like Kafka, Lucene, Spark, Hadoop, Flink, Beam, etc.
by r3tr0 on 5/13/25, 4:43 PM
I used humor and analogies in the article not just to be entertaining, but to make difficult concepts like memory ordering and atomics more approachable and memorable.
by jonco217 on 5/16/25, 10:44 AM
The article doesn't go into details, but this is a subtle way to mess up writing lock-free data structures:
by IX-103 on 5/16/25, 11:59 PM
That's probably the scariest sentence in the whole article.
by lucraft on 5/16/25, 10:57 AM
by Havoc on 5/17/25, 12:36 PM
I wonder if this is connected to that Rust optimisation bounty post we saw the other day, where they couldn't get the safe Rust decoder closer than 5% to their C implementation. Maybe that's just the cost of safety.
by rurban on 5/16/25, 4:11 PM
by fefe23 on 5/16/25, 5:44 AM
by pjmlp on 5/16/25, 1:11 PM
by sph on 5/16/25, 2:13 PM
by ephemer_a on 5/16/25, 4:13 AM