by fleaflicker on 4/11/14, 1:32 PM with 53 comments
by yongjik on 4/11/14, 5:35 PM
So, if we went down that path, what would we have? All the fun of having "legacy" APIs that seem to work but internally only accept strings up to 64kb length and mysteriously chop off excess bytes when you least expect it. It's the Y2K problem all over again.
And just when you finally think you're done with it, memory is cheaper again, size_t is 64bit, and someone invariably wants to store a binary blob >4G as a string. Fun times again.
Have we forgotten how much trouble we went through in the 90s to handle memory in x86 "640k is enough for everybody" architecture?
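A minimal sketch of the failure mode described above, assuming a hypothetical legacy record with a 16-bit length field. The names `legacy_str`, `legacy_wrap`, and `checked_wrap` are made up for illustration and do not come from any real library. In this sketch the narrowing wraps modulo 65536, which is one way such an API could quietly misreport a long string.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical legacy record: the length field is only 16 bits wide. */
struct legacy_str {
    uint16_t len;      /* can represent at most 65535 bytes */
    const char *data;  /* not owned; points at the caller's buffer */
};

/* Hypothetical legacy call: silently narrows the length to 16 bits. */
struct legacy_str legacy_wrap(const char *data, size_t len)
{
    struct legacy_str s;
    s.len = (uint16_t)len;  /* high bits are dropped without any warning */
    s.data = data;
    return s;
}

/* A more defensive variant: refuse lengths the field cannot represent. */
int checked_wrap(const char *data, size_t len, struct legacy_str *out)
{
    if (len > UINT16_MAX)
        return -1;          /* caller must handle the oversized string */
    out->len = (uint16_t)len;
    out->data = data;
    return 0;
}
```

The checked variant turns the silent truncation into an explicit error, but the 64 KB ceiling itself remains.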
by gcb0 on 4/11/14, 4:30 PM
In the week that str+len was abused left and right, someone surfaces an article on the front page about how str+NUL is wrong and everyone should use str+len.
by millstone on 4/11/14, 5:35 PM
Consider using a length field. How big should that field be? If it's fixed size, you introduce complications regarding how big a string you can represent, and differences in field sizes across architectures. If it's variable-sized (a la UTF-8), then you've added different complications: you would need library functions to read and write the length, to get access to the string contents, to calculate the amount of memory required to hold a string of a given size, etc. Very much not in the spirit of C.
Next, what endianness should that field have? NUL terminated strings have no endianness issues: they can be trivially written to files, embedded in network packets, whatever. But with a length field, we either need to remember to marshal the string, or allow for the length field to not be in native byte order. Neither is a pleasant prospect, especially for a 1970’s C programmer.
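To make the marshalling cost concrete, here is a rough sketch assuming a fixed 4-byte big-endian length prefix. The function names `put_counted` and `get_counted_len` and the choice of a 32-bit field are assumptions for illustration only; they show the kind of helper code a length field requires once the string leaves memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 4-byte big-endian length followed by the string bytes.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t put_counted(unsigned char *out, size_t cap, const char *s, uint32_t len)
{
    if (cap < 4 || cap - 4 < len)
        return 0;
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    memcpy(out + 4, s, len);
    return 4 + (size_t)len;
}

/* Read the length prefix back, independent of the host byte order.
 * Returns 0 on success, -1 if the input is shorter than it claims. */
int get_counted_len(const unsigned char *in, size_t avail, uint32_t *len)
{
    if (avail < 4)
        return -1;
    *len = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (avail - 4 < *len)
        return -1;  /* never trust a length that exceeds the data received */
    return 0;
}
```

A NUL-terminated string needs none of this to be written out, which is the trade-off being described.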
Also, consider C-style string parsing, e.g. strtok/strsep. These could not be implemented with length-field strings.
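As a small illustration of the style of parsing meant here, the following sketch uses the standard `strtok` to split a buffer in place. The wrapper name `split_commas` is hypothetical. Each token is just a pointer into the original buffer, and each delimiter is overwritten with a NUL, so no per-token length has to be stored anywhere.

```c
#include <stddef.h>
#include <string.h>

/* Split buf in place on commas, storing up to max token pointers.
 * strtok overwrites each delimiter with '\0', so every token is a
 * NUL-terminated string that lives inside the original buffer. */
size_t split_commas(char *buf, char **tokens, size_t max)
{
    size_t n = 0;
    for (char *t = strtok(buf, ","); t != NULL && n < max;
         t = strtok(NULL, ","))
        tokens[n++] = t;
    return n;
}
```

With a buffer holding "a,bb,c", this yields three tokens and leaves the buffer as "a", "bb", "c" separated by NUL bytes.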
Explicit length is better when you have an enforced abstraction, like std::string, but at that point you’re not writing in C. If you have to pick an exposed representation, NUL termination is much better than Pascal-style length fields.
So what was the “one-byte mistake?” The article says that it was saving a byte by using NUL termination instead of a two-byte length field. Had K&R not made that “mistake,” we would be unable to make a string longer than 65k - a far more serious limitation than anything NUL termination imposes!
K&R got it right.
by TomMasz on 4/11/14, 5:00 PM
What we've failed to do is ever revisit those decisions and change them where we've identified problems. Yes, you can probably compile (with warnings) files from UNIX v7, but we pay for that compatibility. But there's no question designing, building and maintaining a libc alternative is a colossal undertaking and not likely to happen on a whim. So here we are.
by radiospiel on 4/11/14, 2:27 PM
by gumby on 4/12/14, 4:06 AM
I wonder if you could do this compatibly in the compiler by adding another primitive type (counted string) which had the length in the bytes before the start of the null-terminated string. You'd need a new type because various routines in the standard library would have to invisibly have two versions for counted and non-counted strings (since if you incremented a string pointer, or used a function like strchr, you'd have to treat the result as a regular char *). "Safe" code would use a different call (say, cstrchr) that returned an index instead of a char *. The compiler could optionally warn on unsafe "legacy" calls as it can with strcpy instead of strncpy.
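A rough library-level sketch of that layout, without any compiler support: the length sits in a `size_t` header just before the NUL-terminated bytes. Only `cstrchr` is a name taken from the comment; `cstr_new`, `cstr_len`, and `cstr_free` are hypothetical helpers, and the `size_t` header width is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Layout: [size_t length][bytes...][NUL]. The returned pointer addresses
 * the bytes, so it can still be passed to ordinary C string functions. */
char *cstr_new(const char *src)
{
    size_t len = strlen(src);
    size_t *hdr = malloc(sizeof(size_t) + len + 1);
    if (hdr == NULL)
        return NULL;
    *hdr = len;
    char *s = (char *)(hdr + 1);
    memcpy(s, src, len + 1);  /* copy the bytes and the terminator */
    return s;
}

/* Only valid for the pointer returned by cstr_new, not an offset into it. */
size_t cstr_len(const char *s)
{
    size_t len;
    memcpy(&len, s - sizeof(size_t), sizeof len);
    return len;
}

/* Counted variant of strchr: returns an index, or -1 if c is absent. */
ptrdiff_t cstrchr(const char *s, int c)
{
    const char *p = memchr(s, c, cstr_len(s));
    return p ? p - s : -1;
}

void cstr_free(char *s)
{
    if (s != NULL)
        free(s - sizeof(size_t));
}
```

Returning an index keeps the caller holding the original pointer, so the header stays reachable. Incrementing the pointer loses the header, which is why the comment suggests a distinct type.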
by cliveowen on 4/11/14, 3:03 PM
by crashandburn4 on 4/11/14, 3:05 PM
[1] http://webcache.googleusercontent.com/search?q=cache:http://...
by orvado on 4/11/14, 5:29 PM
${lang} is the language of the future
This looks like a macro for substitution, but maybe it's some hip new term I've never encountered. Is it an actual language, or just a placeholder for a language that hasn't been chosen yet?
by bananas on 4/11/14, 6:16 PM
200,"STR"
We know where that got us... Programming 101, rules 1 & 2:
1 - never trust your inputs
2 - always check your invariants.
by ithinkso on 4/11/14, 10:57 PM
by rw_grim on 4/11/14, 3:25 PM