by broken_broken_ on 11/6/24, 8:30 AM with 67 comments
by pwdisswordfishz on 11/6/24, 10:07 AM
No, you are not; simple as that. Miri is right. Rust using malloc/free behind the scenes is an internal implementation detail you are not supposed to rely on. Rust used to use a completely different memory allocator, and this code would have crashed at runtime if it were still the case. Since when is undocumented information obtained from strace a stable API?
It's not like you can rely on Rust references and C pointers being identical in the ABI either, but the sample in the post blithely conflates them.
> It might be a bit surprising to a pure Rust developer given the Vec guarantees, but since the C side could pass anything, we must be defensive.
This is just masking bugs that otherwise could have been caught by sanitizers. Better to leave it out.
by haileys on 11/6/24, 10:03 AM
struct MyForeignPtr(*mut c_void);
impl Drop for MyForeignPtr {
fn drop(&mut self) {
unsafe { my_free_func(self.0); }
}
}
Then wrap the foreign pointer with MyForeignPtr as soon as it crosses the FFI boundary into your Rust code, and only ever access the raw pointer via this wrapper object. Don't pass the raw pointer around.by klauserc on 11/6/24, 10:23 AM
1: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.int...
by pornel on 11/6/24, 11:34 AM
Option<Box<T>> is allowed too, and it's a nullable pointer. Similarly &mut T is supported.
Using C-compatible Rust types in FFI function declarations can remove a lot of boilerplate. Unfortunately, Vec and slices aren't one of them.
by namjh on 11/6/24, 11:37 AM
Unsafe Rust is indeed very hard to write correctly. Rustonomicon is a good start to learn unsafe Rust.
by jclulow on 11/6/24, 10:05 AM
If you've been doing C for five decades, it's a shame not to have noticed that it's totally fine to pass a NULL pointer to free().
FWIW, I don't otherwise agreed with the thesis. I've written probably ten FFI wrappers around C libraries now, and in every case I was able to store C pointers in some Rust struct or other, where I could free them in a Drop implementation.
I also think it's not actually that unusual for C allocators (other than the truly ancient malloc(3C) family) to require you to pass the allocation size back to the free routine. Often this size is static, so you just use sizeof on the thing you're freeing, but when it's not, you keep track of it yourself. This avoids the need for the allocator to do something like intersperse capacity hints before the pointers it returns.
by scotty79 on 11/6/24, 10:12 AM
by aapoalas on 11/6/24, 11:52 AM
In these cases I can imagine the caller passing in a pointer/reference to uninitialised stack memory, which is also UB in the last version if the allocating code! A `&mut T` must always point to a valid `T` and must not point to uninitialised memory.
It seems to me like it'd be best to take a `&mut MaybeUninit<T>` parameter instead, and write through that. A further upside is that now if the caller _is_ Rust code, you can use MaybeUninit to reserve the stack space for the `OwningArrayC<T>` and then after the FFI call you can use `assume_init` to move an owned, initialised `OwningArrayC<T>` out of the `MaybeUninit` and get all of Rust's usual automatic `Drop` guarantees: This is the defer you wanted all along.
by John23832 on 11/6/24, 10:55 AM
by jtrueb on 11/6/24, 12:41 PM
I have a lot of FFI code paths that end up easier to understand if you use the C-style instead of the Rusty approach.
Readability and comp time should be important even in FFI code paths
by kelnos on 11/8/24, 2:05 PM
A possible "defer" keyword has nothing to do with any of this, and will not help you.
Rust doesn't need "defer".
by marcodiego on 11/6/24, 12:48 PM
I can only hope it 'infects' other languages.
by cross on 11/7/24, 7:37 PM
But I do recognize that the code in the post was a simplified example, and it's possible that the flexibility of `Vec` is actually used; perhaps elements are pushed into the `Vec` dynamically or something, and it would be inconvenient to simulate that with `libc::malloc` et al. But even then, in an environment that's not inherently memory starved, a viable approach might be to build up the data in a `Vec`, and then allocate a properly-sized region using `libc::malloc` and copy the data into it.
Another option might be to maintain something like a BTreeMap indexed by pointer on the Rust side, keeping track of the capacity there so it can be recovered on free.
by lalaithion on 11/6/24, 2:24 PM
by LegionMammal978 on 11/8/24, 2:04 AM
by dathinab on 11/6/24, 12:01 PM
It _really_ isn't, it's actually exactly how C (or C++) works if you have library allocating something for you you also need to use that library to free it as especially in context of linked libraries you can't be sure how something was allocated, if it used some arena, if maybe some of you libraries use jemalloc and others do not etc. So it's IMHO a very basic 101 of using external C/C++ libraries fact fully unrelated to rust (through I guess it is common for this to be not thought well).
Also it's normal even if everything uses the same alloc because of the following point:
> So now I am confused, am I allowed to free() the Vec<T>'s pointer directly or not?
no and again not rust specific, free is always a flat freeing `Vec<T>'s` aren't guaranteed to be flat (depending on T), and even if, some languages have small vec optimizations (through rust I thing guarantees that it's not done with `Vec` even in the future for FFI compatibility reasons)
so the get to go solution for most FFI languages boundaries (not rust specific) is you create "C external" (here rust) types in their language, hand out pointer which sometimes are opaque (C doesn't know the layout) and then _hand them back for cleanup_ cleaning them up in their language.
i.e. you would have e.g. a `drop_vec_u8` extern C function which just does "create vec from ptr and drop it" (which should get compiled to just a free in case of e.g. `Vec<u8>` but will also properly work for `Vec<MyComplexType>`.
> Box::from_raw(foos);
:wut_emoji:??
in many languages memory objects are tagged and treating one type as another in context of allocations is always a hazard, this can even happen to you in C/C++ in some cases (e.g. certain arena allocators)
again this is more a missing knowledge in context of cross language C FFI in general then rust specific (maybe someone should write a "Generic Cross Language C FFI" knowledge web book, I mean while IMHO it is basic/foundational knowledge it is very often not thought well at all)
> OwningArrayC > defer!{ super::MYLIB_free_foos(&mut foos); }
the issue here isn't rust missing defer or goto or the borrow checker, but trying to write C in rust while OwningArrayC as used in the blog is a overlap of anti-patterns wanting to use but not use rust memory management at the same time in a inconsistent way
If you want to "free something except not if it has been moved" rust has a mechanic for it: `Drop`. I.e. the most fundamental parts of rust (memory) resource management.
If you want to attach drop behavior to an existing type there is a well known pattern called drop guard, i.e. a wrapper type impl Drop i.e. `struct Guard(OwnedArrayC); impl Drop for Guard {...} maybe also impl DerefMut for Guard`. (Or `Guard(Option<..>)` or `Guard(&mut ...)` etc. depending on needs, like e.g. wanting to be able to move it out conveniently).
In rust it is a huge anti pattern to have a guard for a resource and not needing to access the resource through the guard (through you will have to do it sometimes) as it often conflicts with borrow checker and for RAII like languages in general is more error prone. Which is also why `scopeguard` provides a guard which wrapps the data you need to cleanup. That is if you use `scopeguard::guard` and similar instead of `scopeguard::defer!` macro which is for convenience when the cleanup is on global state. I.e. you can use `guard(foos, |foos| super::MYLIB_free_foos(&mut foos))` instead of deferr and it would work just fin.
Through also there is a design issue with super::MYLIB_free_foos(&mut foos) itself. If you want `OwningArrayC` to actually (in rust terms) own the array then passing `&mut foos` is a problem as after the function returns you still have foos with a dangling pointer. So again it shows that there is a the way `OwningArrayC` is done is like trying to both use and not use rusts memory management mechanics at the same time (also who owns the allocation of OwningArrayC itself is not clear in this APIs).
I can give following recommendations (outside of using `guard`):
- if Vec doesn't get modified use `Box<[T]>` instead
- if vec is always accessed through rust consider passing a `Box<Vec<T>>` around instead and always converting to/from `Box<Vec<T>>`/`&Vec<T>`/`&mut Vec<T>`, Box/&/&mut have some defactor memory repr compatibilities with pointer so you can directly place them in a ffi boundary (I think it's guaranteed for Box/&/&mut T and de-facto for `Option<Box/&/&mut T>` (nullpointer == None)
- if that is performance wise non desirable and you can't pass something like `OnwingArrayC` by value either specify that the caller always should (stack) allocate the `OnwingArrayC` itself then only use `OnwingArrayC` at the boundary i.e. directly convert it to `Vec<T>` as needed (through this can easily be less clear about `Vec<T>`, and `&mut Vec<T>` and `&mut [T]` dereferenced)
- In general if `OwningArrayC` is just for passing parameter bundles with a convention of it always being stack allocated by the caller then you also really should only use it for the transfer of the parameters and not automatic resource management, i.e. you should directly convert it to `Vec` at the boundary (and maybe in some edge cases use scopeguard::guard, but then converting it to a Vec is likely faster/easier to do). Also specify exactly what you do with the callee owned pointer in `OwningArrayC` i.e. do we always treat it ass dangling even if there are errors, do we set it to empty vec /no capacity as part of conversion to Vec and it's only moved if that was done etc. Also write a `From<&mut OwningArrayC> for Vec` impl, I recommend setting cap+len to zero in it).
And yes FFI across languages is _always_ hard, good teaching material often missing and in Rust can be even harder as you have to comply with C soundness on one side and Rust soundness on the other (but I mean also true for Python,Java etc.). Through not necessary for any of the problems in the article IMHO. And even if we just speak about C to C FFI of programs build separately the amount of subtle potentially silent food guns is pretty high (like all the issues in this article and more + a bunch of other issues of ABI incompatibility risks and potentially also linker time optimization related risk).
by GrantMoyer on 11/6/24, 12:57 PM
by throwawa14223 on 11/7/24, 6:03 PM
by xanathar on 11/6/24, 10:48 AM
2) Do not use Vec<T> to pass arrays to C. Use boxed slices. Do not even try to allocate a Vec and free a Box... how can it even work?!
3) free(ptr) can be called even in ptr is NULL
4) ... I frankly stopped reading. I will never know if Rust actually needs Go's defer or not.
by zoezoezoezoe on 11/8/24, 8:22 PM
by seanhunter on 11/8/24, 5:11 AM
> Rust + FFI is nasty and has a lot of friction.
...then I would say you have been living a little bit of a charmed life. FFI in most languages has improved beyond all recognition in the last 15 years. When I look at FFI in rust I can't believe how ergonomic and clean it is compared to say PerlXS (which was the old way of doing FFI in perl).by neonsunset on 11/6/24, 11:14 AM