by ushakov on 12/17/24, 9:00 AM with 65 comments
by WalterBright on 12/21/24, 9:17 PM
struct str {
    char *dat;
    sz len;
};
It's the same solution D uses, except that in D it's a builtin type and works for all arrays. I proposed this solution for C: https://www.digitalmars.com/articles/C-biggest-mistake.html
It's hard to overstate what a huge win this is. D has had 23 years of experience with it, and the virtual elimination of array overflow bugs is just win, win, win.
I will never understand why C keeps adding extensions consisting of marginal features, and ignores this foundational fix. I guess they still aren't tired of buffer overflow bugs always being the #1 security vulnerability of shipped C code (and C++, too!).
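A minimal sketch of how a pointer-plus-length type like the one above can prevent out-of-bounds access in plain C. The `sz` typedef and the helpers `str_from_cstr`, `str_at` and `str_slice` are hypothetical names chosen for illustration; they come neither from the article nor from D.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef ptrdiff_t sz;   /* assumption: `sz` is a signed size type */

struct str {
    char *dat;
    sz    len;
};

/* Hypothetical helper: wrap a NUL-terminated C string without copying. */
static struct str str_from_cstr(char *s)
{
    assert(s != NULL);
    struct str r = { s, (sz)strlen(s) };
    return r;
}

/* Hypothetical helper: bounds-checked read.
   Returns false instead of reading past either end. */
static bool str_at(struct str s, sz i, char *out)
{
    if (i < 0 || i >= s.len)
        return false;
    *out = s.dat[i];
    return true;
}

/* Hypothetical helper: sub-slice [lo, hi), clamped to the valid range,
   so the result can never point outside the original string. */
static struct str str_slice(struct str s, sz lo, sz hi)
{
    if (lo < 0) lo = 0;
    if (hi < 0) hi = 0;
    if (hi > s.len) hi = s.len;
    if (lo > hi) lo = hi;
    struct str r = { s.dat + lo, hi - lo };
    return r;
}
```

Because the length travels with the pointer, every access can be checked locally, without a `strlen` scan and without trusting the caller to pass a matching size.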
by kevin_thibedeau on 12/21/24, 9:01 PM
GCC has the format attribute that lets you have printf type checking on your own variadic functions:
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Common-Functio...
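For illustration, a small sketch of the attribute on a user-defined variadic function. The function name `fmt_into` is hypothetical; the `format(printf, 3, 4)` attribute says that argument 3 is the format string and that the arguments to check start at position 4.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper around vsnprintf. The attribute asks GCC (Clang
   accepts it too) to type-check the variadic arguments against `fmt`,
   the same way it does for printf itself. */
__attribute__((format(printf, 3, 4)))
static int fmt_into(char *buf, size_t cap, const char *fmt, ...)
{
    assert(fmt != NULL);
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, cap, fmt, ap);   /* always NUL-terminates if cap > 0 */
    va_end(ap);
    return n;                               /* length the full output would need */
}
```

With `-Wformat` enabled (it is part of `-Wall`), a call such as `fmt_into(buf, sizeof buf, "%d", "text")` should produce a compile-time warning about the mismatched argument type.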
by simscitizen on 12/17/24, 11:56 PM
Another one to consider is e.g. https://github.com/antirez/sds (used by Redis), which instead stores the string contents in-line with the metadata.
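A rough sketch of the in-line idea: the metadata header and the bytes live in one allocation, and the caller holds a plain `char *` to the contents. The names `istr_hdr`, `istr_new`, `istr_len` and `istr_free` are hypothetical, and the layout is an assumption for illustration, not the actual sds header format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; NOT the real sds layout. */
struct istr_hdr {
    size_t len;
    size_t cap;
    char   buf[];   /* flexible array member: contents follow the header */
};

/* Step back from the contents pointer to the header that precedes it. */
static struct istr_hdr *istr_hdr_of(char *s)
{
    assert(s != NULL);
    return (struct istr_hdr *)(s - offsetof(struct istr_hdr, buf));
}

/* Allocate header and contents in one block; return a pointer to the
   contents, which stay NUL-terminated and usable with <string.h>. */
static char *istr_new(const char *init)
{
    size_t len = strlen(init);
    struct istr_hdr *h = malloc(sizeof *h + len + 1);
    if (h == NULL)
        return NULL;
    h->len = len;
    h->cap = len;
    memcpy(h->buf, init, len + 1);
    return h->buf;
}

static size_t istr_len(char *s)
{
    return istr_hdr_of(s)->len;   /* O(1), no scan for the terminator */
}

static void istr_free(char *s)
{
    if (s != NULL)
        free(istr_hdr_of(s));
}
```

The trade-off against the `{pointer, length}` struct is that such a string owns its storage: it can be handed to existing C APIs directly, but it cannot cheaply describe a substring of another buffer.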
by ropejumper on 12/18/24, 8:36 AM
Choosing between these trade-offs just depends on what you're doing. I'd definitely choose this pattern if I were to write a parser for instance.
by jdblair on 12/21/24, 9:08 PM
by cozzyd on 12/18/24, 3:04 AM
You could even do something crazy with packing a null byte with sz on 64-bit systems (since you will never have a string that long anyway...)
by up2isomorphism on 12/21/24, 9:34 PM
But I would say that in 95% of cases, a fixed-length char array with strncpy will work just fine.
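One caveat with that approach: `strncpy` does not write a terminating NUL when the source is as long as or longer than the count, so the terminator has to be added by hand. A minimal sketch, where `copy_fixed` is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `src` into a fixed-size array of `cap` bytes,
   always leaving `dst` NUL-terminated.
   Returns 1 if `src` fit entirely, 0 if it was truncated. */
static int copy_fixed(char *dst, size_t cap, const char *src)
{
    assert(cap > 0);
    strncpy(dst, src, cap - 1);   /* copies at most cap - 1 bytes */
    dst[cap - 1] = '\0';          /* strncpy alone may leave this unset */
    return strlen(src) < cap;
}
```

Usage would look like `char name[8]; copy_fixed(name, sizeof name, input);`, with the return value telling the caller whether the input was silently shortened.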
by superjared on 12/21/24, 11:50 PM
by codr7 on 12/21/24, 10:25 PM
by Levitating on 12/21/24, 10:30 PM
Where on OpenAI's site do I find a footer like that?
by Quis_sum on 12/22/24, 2:32 PM
Once you go down the route proposed by many of the comments here - why not enhance it to deal with UTF-8... Or rather implement a proper "array" type? What about the lack of multidimensional arrays instead of the pointer to pointer to ... approach? Idiosyncrasies such as "int a[2][3];" decaying to "int (*)[3]" and not "int **"?
C was never intended to shield you from mistakes, but rather to replace a macro assembler. ANSI C addressed some of the issues in the original K&R C, but that is about it.
If your use case would benefit from all of these protections, there are plenty of higher level language alternatives...
by teo_zero on 12/22/24, 8:58 AM
I see a problem with the separation between str and str_buf, though: you create new strings with the latter, but most functions take the former as arguments. Do you convert them every time? Isn't your code littered with str_from_buf()?
To put it another way, it's like the mess with const that you mention in your article. If str is the type you use for a const read-only string, and str_buf for a non-const mutable string, you would like to pass a non-const even to those functions that "only" require a const. (I say "only" because being const is a weaker requirement than being mutable; the fact that it's more wordy is another thing that C's syntax makes confusing, but this is an entirely different topic!)
It would be nice if the compiler could be instructed to automatically cast str_buf into str and not vice versa, just like it does for non-const to const.
The only way out I can think of would be to get rid of the two types and only use the one with the cap field, with the convention that if cap is zero, then the string is read-only. The drawback is that certain mistakes are only detected at run-time and not enforced by the compiler. For example, a function that takes a string s and replaces every substring s1 with s2 could have the following prototype in the two-type system:
replace(str_buf s, str s1, str s2);
And it would be immediately clear that you cannot pass a read-only string as the first argument. With a one-type system you lose this ability.
Oh well, I guess if a perfect solution existed, it would have been adopted by the C committee, wouldn't it? /s
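A sketch of the two-type arrangement discussed above, to make the compile-time distinction concrete. The struct layouts, the `sz` typedef, and the helpers `buf_append` and `str_eq` are assumptions for illustration; only the names `str`, `str_buf` and `str_from_buf` come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef ptrdiff_t sz;   /* assumption: signed size type */

typedef struct { char *dat; sz len; }         str;      /* read-only view */
typedef struct { char *dat; sz len; sz cap; } str_buf;  /* mutable, has capacity */

/* The one-way conversion: any buffer can be viewed as a str,
   but there is no function going the other way. */
static str str_from_buf(str_buf b)
{
    str s = { b.dat, b.len };
    return s;
}

/* Hypothetical helper: append `s` to `b`.
   Returns 0 on success, -1 if there is not enough capacity. */
static int buf_append(str_buf *b, str s)
{
    assert(s.len >= 0);
    if (s.len > b->cap - b->len)
        return -1;
    memcpy(b->dat + b->len, s.dat, (size_t)s.len);
    b->len += s.len;
    return 0;
}

/* Hypothetical helper: byte-wise equality of two views. */
static int str_eq(str a, str b)
{
    return a.len == b.len &&
           (a.len == 0 || memcmp(a.dat, b.dat, (size_t)a.len) == 0);
}
```

Passing a `str` where `buf_append` expects a `str_buf *` is rejected by the compiler as an incompatible type, which is the check the one-type convention (cap == 0 meaning read-only) would push to run-time. The cost is the explicit `str_from_buf(b)` at each call that only needs to read.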
by zwnow on 12/21/24, 10:36 PM
by zabzonk on 12/21/24, 8:59 PM