from Hacker News

Re: C as used/implemented in practice

by davidtgoldblatt on 7/13/15, 5:12 PM with 111 comments

  • by nathanb on 7/13/15, 11:44 PM

    I have difficulty accepting "let's replace C with X", where X is a memory-managed language. As a systems programmer (I write SCSI driver code in C), I can't overemphasize how important it is to be able to address memory as a flat range of bytes, regardless of how that memory was originally handed to me. I need to have uint8_t* pointers into the middle of buffers which I can then typecast into byte-aligned structs. If your memory manager would not allow this or would move this memory around, that's a non-starter.
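
    (To make the pattern concrete, a rough sketch of that kind of access; the struct, its fields, and the helper name are invented, and the packed attribute is a GCC/Clang extension:)

        #include <stddef.h>
        #include <stdint.h>

        /* A byte-aligned, wire-format-style header, viewed in place. */
        struct cdb_like {
            uint8_t  opcode;
            uint8_t  flags;
            uint32_t lba;
            uint16_t length;
        } __attribute__((packed));

        /* Point into the middle of a buffer someone else handed us and
         * reinterpret those bytes as the struct, with no copy. A collector
         * that moved or compacted the buffer would break this, and in
         * practice the cast also leans on the compiler being lenient
         * about aliasing (think -fno-strict-aliasing). */
        static inline const struct cdb_like *view_header(const uint8_t *buf,
                                                         size_t offset)
        {
            return (const struct cdb_like *)(buf + offset);
        }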

    I don't stick with C because I love it. If I'm writing something for my own purposes, I use Ruby. I've written some server code in Golang (non-production), and it's pretty nifty, even if the way it twists normal C syntax breaks my brain. I even dabble in the dark side (C++) personally and professionally from time to time. And in a previous life, I was reasonably proficient in C# (that's the CLR 2.0 timeframe; I'm completely useless at it in these crazy days of LINQ and the really nifty CLR 4 features...and there's probably even more stuff I haven't even become aware of).


    But none of those languages would let me do what I need to do: zero-copy writes from the network driver through to the RAID backend. And even if they did, the pain of rewriting our entire operating system in Go or Rust or whatever would be way more than the alleviated pain of using a "nicer" language.

    (We never use 'int', by the way. We use the C99 well-defined types in stdint.h. Could this value go greater than a uint32_t can represent? Make it a uint64_t. Does it need to be signed? No? Make sure it's unsigned. A lot of what he's complaining about is sloppy code. I don't care if your compiler isn't efficient when compiling sloppy code.)

  • by byuu on 7/14/15, 3:26 AM

    I understand that in some cases, these heroic compiler optimizations can offer significant performance increases. We should keep C around as it is for when said performance is critical.

    But surely, we can design a language that has no undefined behavior, without substantial deviations from C's syntax, and without massive performance penalties. This language would be great for things that prize security over performance.

    And the trick is, we don't need to rewrite all software in existence in a new language to get here! C can be this language; all we need is a special compilation flag that replaces undefined behavior with defined behavior. Functions called inside a function's arguments? Say they evaluate left-to-right. Shift right on signed types? Say it's arithmetic. Size of a byte? Say it's 8 bits. memset(0x00) on something going out of scope? If the developer said to do it, do it anyway. Underlying CPU doesn't support this? Emulate it. If it can't be emulated, then don't use code that requires the safe flag on said architecture. Yeah, screw the PDP-11. And yeah, it'll be slower in some cases. Yes, even twice as slow in some cases. But still far better than moving to a bytecode or VM language.
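
    (A rough sketch of two of those cases, with made-up function names; the flag itself is hypothetical:)

        #include <string.h>

        int halve(int x)
        {
            /* Right-shifting a negative value is implementation-defined in
             * C today; under the proposed flag it would always be an
             * arithmetic shift. */
            return x >> 1;
        }

        void wipe_key(void)
        {
            char key[32];
            /* ... pretend key held secret material ... */

            /* Compilers may delete this memset as a dead store because key
             * is about to go out of scope; under the proposed flag the
             * write would be kept because the developer asked for it. */
            memset(key, 0x00, sizeof key);
        }

    (GCC and Clang already have flags like -fwrapv and -fno-strict-aliasing that pin down a couple of these behaviors, but only a couple.)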

    And once we have guaranteed behavior in C, we can write new DSLs that transcode to C without carrying all of C's undefined behavior along with them.

    You want to talk about writing in higher-level languages like Python and then having C for the underlying performance-critical portions? Why not defined-behavior C for the security-critical and cold portions of code, and undefined-behavior C for the performance-critical portions?

    Maybe Google wouldn't accept the speed penalty; but I'd happily drop my personal VPS from ~8000 maximum simultaneous users to ~5000 if it greatly decreased the odds of being vulnerable to the next Heartbleed. But I'm not willing to completely abandon all C code, and drop down to ~200 simultaneous users, to write it in Ruby.

  • by jeffreyrogers on 7/14/15, 12:44 AM

    For those who don't know, Chris Lattner[1], who wrote this post, is the primary author of LLVM and, more recently, of Swift, so he knows a bit about what he's talking about :)

    [1]: https://en.wikipedia.org/wiki/Chris_Lattner

  • by carlosrg on 7/14/15, 10:32 AM

    Until I see really big open source projects like WebKit or Clang itself moving to Swift or whatever, anything I read about moving to "better systems languages" is like reading a letter to Santa Claus. I doubt C++ is going anywhere, especially when C++ itself is not standing still but keeps evolving (C++11, 14, 17...) while maintaining backwards compatibility.

  • by pjmlp on 7/14/15, 6:37 AM

    "My hope is that the industry will eventually move to better systems programming languages, but that will take a very very long time..."

    -- Chris Lattner

    Yes, a very long time. Modula-2 was born in 1978, and we can go back even further, to Algol and Lisp.

  • by mcguire on 7/14/15, 3:22 AM

    "In the first example above, it is that 'int' is the default type people generally reach for, not 'long', and that array indexing is expressed with integers instead of iterators. This isn’t something that we’re going to 'fix' in the C language, the C community, or the body of existing C code."

    The majority of that message is pretty well said, but this particular part leaves me cold. The problem isn't that 'int' is the default type rather than 'long', nor is it that array indexing isn't done with iterators. (Ever merged two arrays? It's pretty clear using int indexes or pointers, but iterators can get verbose. C++ does a very good job, though, by making iterators look like pointers.) The problem is that, in C, the primitive types don't specifically describe their sizes. If you want a 32-bit variable, you should be able to ask for an unsigned or signed 32-bit variable. If you want whatever is best on this machine, you should be able to ask for whatever is word-sized. Unfortunately, C went with char <= short <= int <= long (and long long, etc.); in an ideal world, 'int' would be the machine's word size, but when all the world's a VAX, 'int' means 32 bits.

    That is one of the major victories with Rust: most primitive types are sized, with an additional word-sized type.
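
    (In C today the closest equivalent is reaching past the primitive types into stdint.h; a small sketch of what asking for the size you mean looks like:)

        #include <stddef.h>
        #include <stdint.h>

        uint32_t  flags;    /* exactly 32 bits, unsigned */
        int64_t   offset;   /* exactly 64 bits, signed */
        size_t    count;    /* roughly "whatever is word-sized", for sizes and indexes */
        uintptr_t addr;     /* unsigned integer wide enough to hold a pointer */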

  • by mrpippy on 7/13/15, 11:59 PM

    For the for loop example, is there some reason why clang doesn't output a warning like "Does 'i' really need to be signed? If so, explicitly make it a 'signed int'. Otherwise, change it to be unsigned"?
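
    (A hypothetical sketch of the kind of loop at issue, not necessarily the exact code from the post; the signed version is where the compiler gets to assume no overflow:)

        #include <stddef.h>

        void scale(float *a, size_t n)
        {
            /* Signed 'int' index: the compiler may assume i never overflows
             * (signed overflow is undefined), which is exactly where the
             * surprising optimizations come from. */
            for (int i = 0; i < (int)n; ++i)
                a[i] *= 2.0f;
        }

        void scale_sized(float *a, size_t n)
        {
            /* Pointer-width index: no narrowing, no overflow question, and
             * nothing for the hypothetical warning to complain about. */
            for (size_t i = 0; i < n; ++i)
                a[i] *= 2.0f;
        }
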
  • by nikanj on 7/14/15, 10:50 AM

  • by ryanmarsh on 7/14/15, 2:24 AM

    Have we lost sight of the fact that when we talk about a programming language we're really talking about how to put bits into CPU registers?

  • by JustSomeNobody on 7/14/15, 2:45 AM

    What would be considered "security critical"? SSH? IPTables? Linux kernel?