by bshanks on 1/22/22, 5:58 AM with 196 comments
by prirun on 1/22/22, 2:27 PM
This isn't a particularly hard problem. C just took a shitty shortcut to fake strings using byte arrays and the world glommed onto it. Now we're stuck with a crappy "standard" that people should have scoffed at when it first showed its ugly face.
by ErikCorry on 1/22/22, 8:47 AM
And if you are using base 16 then strtol will allow an optional "0x" prefix. So if you didn't want that you have to check for it manually.
Strtol also accepts leading whitespace so if you didn't want that you have to test manually for it.
Don't pass a zero base thinking it means base ten. This works almost all the time but misinterprets a leading zero to mean octal.
Good luck!
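A minimal sketch of the manual checks described above, assuming a hypothetical helper named parse_hex_strict (not part of any standard library). It rejects leading whitespace, a sign, and the "0x" prefix before handing the string to strtol:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not a standard function: accept only bare hex
   digits. Leading whitespace, a sign, or a "0x"/"0X" prefix is rejected
   before strtol gets a chance to accept it. */
bool parse_hex_strict(const char *s, long *out)
{
    assert(s != NULL && out != NULL);

    /* The first character must be a hex digit: rules out "", " 1F", "-1F". */
    if (!isxdigit((unsigned char)s[0]))
        return false;
    /* strtol with base 16 would silently skip this prefix. */
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        return false;

    char *end;
    errno = 0;
    long value = strtol(s, &end, 16);
    if (errno == ERANGE || *end != '\0')
        return false;   /* out of range, or trailing junk */

    *out = value;
    return true;
}
```

For the base-zero trap: strtol("010", NULL, 0) yields 8, while strtol("010", NULL, 10) yields 10.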
by WalterBright on 1/22/22, 8:35 AM
https://www.digitalmars.com/articles/C-biggest-mistake.html
and does not break existing code.
by dottedmag on 1/22/22, 7:41 AM
Not the pervasive undefined behaviour and compilers that become more aggressive every release about breaking previously-working code?
Not the reams of code that assume sizes of integers and signedness of char?
Not the wild build process that makes it awfully hard to actually build anything that has any dependencies whatsoever?
strtol. Damn, what a nuisance!
by stncls on 1/22/22, 6:23 PM
char *forty_two_bee = "42b";
char *end;
errno = 0; // remember errno?
long i = strtol(forty_two_bee, &end, 10);
> This will return 0
No, this will return 42. strtol() parses greedily until a character cannot be parsed, but then it returns the conversion of what it did parse.
I guess the fact the author got this wrong... kind of proves their point that strtol()'s API is not great?
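The greedy behaviour is easy to check by looking at where the end pointer lands. A small sketch, with strtol_consumed as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: report how many characters strtol consumed
   (base 10) and store the value it returned. */
size_t strtol_consumed(const char *s, long *value)
{
    assert(s != NULL && value != NULL);

    char *end;
    *value = strtol(s, &end, 10);
    return (size_t)(end - s);   /* 0 means no digits were found */
}
```

For "42b" this stores 42 and returns 2, so the end pointer is left on the 'b'. For "one" it stores 0 and returns 0, which is the only case where the 0 actually signals "nothing parsed".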
On the other hand, while the article purports to criticize a language, it then proceeds to only cover its standard library. Sure, C's stdlib is old-fashioned, but there are many things in C that are much worse than its standard library! (And I say that as someone who still likes the language.)
by dcposch on 1/22/22, 9:30 PM
Go does not have manual memory management. Despite (actually because of) that, it captures the spirit and design goal of the original C beautifully. It's a minimalist systems programming language.
One of the amazing things about Go is the standard library, the thing he complains about with C. The Go standard library is incredibly readable. It's night and day from C/C++, where opening glibc/STL etc. is an assault on the senses.
by skywhopper on 1/22/22, 1:01 PM
by tragomaskhalos on 1/22/22, 2:42 PM
by gumby on 1/22/22, 5:04 PM
The PDP-7 was long obsolete by the time the POSIX effort started. By then the most common Unix host was a VAX (32 bits), though it, or Unix-alikes, ran on a variety of 16 and 32 bit machines, hence a desire for standardization.
by creativemonkeys on 1/22/22, 11:00 AM
You drove a Corolla in college, then got a job and drove a cool BMW for several years and now you think you're hot shit, so you hop in an F1 car and not only does it take forever to learn how to drive it, it has to be driven on a special track and the gearbox is different, what a nuisance!
"If only we could add 4 doors, automatic transmission, snow tires, and a trunk to put our stuff in, people won't keep getting into accidents with this car", you say. Right, but then it becomes a BMW. If you want real speed, you need to first go slow and master the car because otherwise you'll crash and burn.
C is messy because real world hardware is very messy. You can't push bytes through the hardware at its speed limit without getting your hands dirty, and we all come out into the real world wearing "class Dog extends Animal" white gloves.
To use C effectively, you should not be coding in C in your mind. You should be thinking in assembly, but your fingers should be typing C code. It's not safe, but if you want to reach 230MPH and accelerate to 60MPH in 2.6 seconds, you better know exactly what you're doing when you hop behind the wheel of that car. It's not for the weak.
by alkonaut on 1/22/22, 3:13 PM
Even a person in the 60s would realize that that’s the api for conversion from a string to a number (or any conversion that might fail)! What happened? Why do these functions even exist?
by kazinator on 1/22/22, 7:48 AM
char *one = "one";
char *end;
errno = 0; // remember errno?
long i = strtol(one, &end, 10);
if (errno != 0) {
    perror("Error parsing integer from string: ");
} else if (i == 0 && end == one) {
    fprintf(stderr, "Error: invalid input: %s\n", one);
} else if (i == 0 && *end != '\0') {
    f__kMeGently(with_a_chainsaw);
}
It's actually like this:
errno = 0;
long i = strtol(input, &end, 10);
if (end == input) {
    // no digits were found
} else if (*end != 0 && no_ignore_trailing_junk) {
    // unwanted trailing junk
} else if ((i == LONG_MIN || i == LONG_MAX) && errno != 0) {
    // overflow case
} else {
    // good!
}
errno only needs to be checked in the LONG_MIN or LONG_MAX case. These cases are ambiguous: LONG_MIN and LONG_MAX are valid values of type long, and they are used for reporting an underflow or overflow. Therefore errno is reset to zero first. Otherwise what if errno contains a nonzero value, and LONG_MAX happens to be a valid, non-overflowing value out of the function?
Anyway, you cannot get away from handling these cases no matter how you implement integer scanning; they are inherent to the problem.
It's not strtol's fault that the string could be empty, or that it could have a valid number followed by junk.
Overflows stem from the use of a fixed-width integer. But even if you use bignums, and parse them from a stream (e.g. network), you may need to set a cutoff: what if a malicious user feeds you an endless stream of digits?
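For reference, the same case analysis can be packaged as one function. This is only a sketch: the enum and the name parse_long are hypothetical, and it always treats trailing junk as an error.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status codes, one per case discussed above. */
enum parse_status {
    PARSE_OK,
    PARSE_NO_DIGITS,
    PARSE_TRAILING_JUNK,
    PARSE_OUT_OF_RANGE
};

/* Parse a base-10 long from input; *out is written only on PARSE_OK. */
enum parse_status parse_long(const char *input, long *out)
{
    assert(input != NULL && out != NULL);

    char *end;
    errno = 0;   /* must be cleared so ERANGE can be told apart from stale values */
    long i = strtol(input, &end, 10);

    if (end == input)
        return PARSE_NO_DIGITS;        /* empty string or no leading number */
    if (*end != '\0')
        return PARSE_TRAILING_JUNK;    /* e.g. "42b" */
    if ((i == LONG_MIN || i == LONG_MAX) && errno == ERANGE)
        return PARSE_OUT_OF_RANGE;     /* value did not fit in a long */

    *out = i;
    return PARSE_OK;
}
```

Note that leading whitespace and a leading sign are still accepted here, because strtol accepts them.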
The bit with errno is a bit silly, given that the function has enough parameters that it could have been dispensed with. We could write a function which is invoked exactly like strtoul, but which, in the overflow case, sets the *end pointer to NULL:
// no assignment to errno before strtol
int i = my_strtoul(input, &end, 10);
if (end == 0) {
    // underflow or overflow, indicated by LONG_MIN or LONG_MAX value
} else if (end == input) {
    // no digits were found
} else if (*end != 0 && no_ignore_trailing_junk) {
    // unwanted trailing junk, but i is good
} else {
    // no trailing junk, value in i
}
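One way such a function might be layered on top of the existing strtol is sketched below. The name my_strtol and the choice to restore the caller's errno are assumptions for illustration, not an existing API; the signed variant is used so that it matches the LONG_MIN/LONG_MAX comment above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: called like strtol, but reports overflow or
   underflow by setting *end to NULL instead of through errno. */
long my_strtol(const char *s, char **end, int base)
{
    assert(s != NULL && end != NULL);

    int saved_errno = errno;   /* keep the caller's errno intact */
    errno = 0;
    long value = strtol(s, end, base);
    if (errno == ERANGE)
        *end = NULL;           /* value is LONG_MIN or LONG_MAX here */
    errno = saved_errno;

    return value;
}
```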
errno is a pig; under multiple threads, it has to access a thread local value. E.g #define errno (*__thread_specific_errno_location())
The designer of strtoul likely didn't do this because of the overriding requirement that the end pointer is advanced past whatever the function was able to recognize as a number, no matter what. This lets the programmer write a tokenizer which can diagnose the overflow error, and then keep going with the next token.
by PaulDavisThe1st on 1/22/22, 8:09 PM
char* forty_two = "42";
int i;
if (sscanf (forty_two, "%d", &i) != 1) {
    /* error */
}
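A caveat on the sscanf route: a bare "%d" also accepts "42b", and an out-of-range value is undefined behaviour according to the C standard. If trailing junk matters, the "%n" directive can report how far the scan got. A minimal sketch, with scan_int_exact as a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: succeed only if the whole string is one int.
   Leading whitespace is still skipped by %d. */
bool scan_int_exact(const char *s, int *out)
{
    assert(s != NULL && out != NULL);

    int value = 0;
    int consumed = 0;
    /* %n stores the number of characters read so far and does not
       count toward sscanf's return value. */
    if (sscanf(s, "%d%n", &value, &consumed) != 1)
        return false;          /* no digits, or empty input */
    if (s[consumed] != '\0')
        return false;          /* trailing junk such as the 'b' in "42b" */

    *out = value;
    return true;
}
```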
Sometimes, there's more than one way to skin a cat, and one of them is more suited to the task at hand.
by AtlasBarfed on 1/24/22, 12:56 AM
https://www.reddit.com/r/Zig/comments/9q3or3/how_to_deal_wit...
If that's true, Zig is NOT a modern language. Modern languages use international strings, and are unicode aware with a good unicode aware string library.
For crap's sake, the code example for comparing modern languages USES A STRING. The fact it is not unicode doesn't matter.
by EVa5I7bHFq9mnYK on 1/22/22, 7:32 PM
by futharkshill on 1/22/22, 10:40 AM
by treeshateorcs on 1/23/22, 6:14 PM
by gengiskush on 1/22/22, 12:27 PM