by kbwt on 1/7/19, 11:17 PM with 109 comments
by AndyKelley on 1/8/19, 12:24 AM
/*
* The point of this file is to contain all the LLVM C++ API interaction so that:
* 1. The compile time of other files is kept under control.
* 2. Provide a C interface to the LLVM functions we need for self-hosting purposes.
* 3. Prevent C++ from infecting the rest of the project.
*/
// copied from include/llvm/ADT/Triple.h
enum ZigLLVM_ArchType {
ZigLLVM_UnknownArch,
ZigLLVM_arm, // ARM (little endian): arm, armv.*, xscale
ZigLLVM_armeb, // ARM (big endian): armeb
ZigLLVM_aarch64, // AArch64 (little endian): aarch64
...
and then in the .cpp file:
static_assert((Triple::ArchType)ZigLLVM_UnknownArch == Triple::UnknownArch, "");
static_assert((Triple::ArchType)ZigLLVM_arm == Triple::arm, "");
static_assert((Triple::ArchType)ZigLLVM_armeb == Triple::armeb, "");
static_assert((Triple::ArchType)ZigLLVM_aarch64 == Triple::aarch64, "");
static_assert((Triple::ArchType)ZigLLVM_aarch64_be == Triple::aarch64_be, "");
static_assert((Triple::ArchType)ZigLLVM_arc == Triple::arc, "");
...
I found it more convenient to redefine the enum and then static_assert that all the values are the same (which has to be updated with every LLVM upgrade) than to use the actual enum, which would pull in a bunch of other C++ headers. The file that has to use C++ headers takes about 3x as long to compile as Zig's ir.cpp file, which is nearing 30,000 lines of code but depends only on C-style header files.
by beached_whale on 1/8/19, 12:11 AM
by nanolith on 1/8/19, 12:03 AM
The compiler firewall strategy works fairly well in C++11 and even better in C++14. Create a public interface with minimal dependencies, and encapsulate the details for this interface in a pImpl (pointer to implementation). The latter can be defined in implementation source files, and it can use unique_ptr for simple resource management. C++14 added the missing make_unique, which eases the pImpl pattern.
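For concreteness, here is a minimal sketch of that firewall, using hypothetical names (Widget, Impl, frobnicate) that are not from the comment: the header forward-declares Impl and holds it through a unique_ptr, and only the .cpp that defines Impl has to see the expensive headers.
// widget.h (hypothetical) -- public interface with minimal dependencies
#include <memory>
class Widget {
public:
    Widget();
    ~Widget();                               // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;
    void frobnicate();
private:
    struct Impl;                             // forward declaration only
    std::unique_ptr<Impl> impl_;             // heavy headers never touch widget.h
};
// widget.cpp (hypothetical) -- the only file that pays for the expensive includes
struct Widget::Impl {
    int state = 0;                           // expensive-to-compile members live here
};
Widget::Widget() : impl_(std::make_unique<Impl>()) {}    // make_unique: C++14
Widget::~Widget() = default;                             // Impl is complete here
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;
void Widget::frobnicate() { ++impl_->state; }
Note that the destructor and move operations are defaulted in the .cpp, after Impl is complete; defaulting them in the header would force unique_ptr to instantiate the deleter against an incomplete type.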
That being said, compile times in C++ are typically going to be terrible if you are used to compiling C, Go, and other languages known for fast compilation. A build system with accurate dependency tracking and on-demand compilation (e.g. a directory watcher or, if you prefer IDEs, continuous compilation in the background) will eliminate a lot of this pain.
by AdieuToLogic on 1/8/19, 5:55 AM
Large-Scale C++ Software Design[0]
The techniques set forth therein are founded in real-world experience and can significantly reduce large-scale system build times. Granted, the book is dated and likely not entirely applicable to modern C++, yet it remains the best resource on insulating modules/subsystems and optimizing compilation times, IMHO.
0 - https://www.pearson.com/us/higher-education/program/Lakos-La...
by kazinator on 1/8/19, 2:40 AM
Recently, after ten years of not using ccache, I was playing with it again.
The speed-up you get from ccache today is quite a bit more than it was a decade ago; I was amazed.
ccache does not cache the result of preprocessing. Each time you build an object, ccache runs the source through the preprocessor to obtain the token-level translation unit, which is then hashed to see whether there is a hit (a ready-made .o file can be retrieved) or a miss (the preprocessed translation unit has to be compiled).
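Roughly, and not as ccache's actual code, that per-object decision could be sketched as a toy C++ class like this (the names and stub tools are invented for illustration; the real tool also hashes the compiler, flags, and more):
#include <string>
#include <unordered_map>
#include <functional>
#include <cstddef>
struct ToyCache {
    std::unordered_map<std::size_t, std::string> objects;   // hash of preprocessed TU -> object bytes
    std::string build(const std::string& source) {
        std::string tu = preprocess(source);                 // the preprocessor always runs
        std::size_t key = std::hash<std::string>{}(tu);      // hash the token-level translation unit
        auto it = objects.find(key);
        if (it != objects.end())
            return it->second;                               // hit: ready-made .o retrieved
        std::string obj = compile(tu);                       // miss: pay for the full compile
        objects.emplace(key, obj);
        return obj;
    }
    // Stand-ins for the real tools so the sketch is self-contained.
    static std::string preprocess(const std::string& s) { return "tokens:" + s; }
    static std::string compile(const std::string& tu) { return "obj:" + tu; }
};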
There is now more than a tenfold difference between preprocessing, hashing, and retrieving a .o file from the cache versus doing the compile job. I just timed one program: 750 milliseconds to rebuild with ccache (so everything is preprocessed and ready-made .o files are pulled out and linked), versus 18.2 seconds without ccache. A 24X difference! So, approximately speaking, preprocessing is less than 1/24th of the cost.
Ancient wisdom about C used to be that more than 50% of the compilation time is spent on preprocessing. That's the environment that motivated devices like precompiled headers, #pragma once, and compilers recognizing the #ifndef HEADER_H include-guard trick to avoid re-reading files.
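For reference, the include-guard idiom mentioned here looks like the following (HEADER_H is just the conventional placeholder name):
// header.h -- the classic include-guard trick; many compilers detect this
// exact pattern and skip re-reading the file on subsequent #includes.
#ifndef HEADER_H
#define HEADER_H
void some_function();   // declarations go here
#endif // HEADER_H
// The non-standard but widely supported alternative is simply:
// #pragma once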
Nowadays, those things hardly matter.
Nowadays, when you're building code, the rate at which .o's "pop out" of the build subjectively appears no faster than two decades ago, even though memory sizes, L1 and L2 cache sizes, CPU clock speeds, and disk capacities are vastly greater. Since not a lot of development has gone into preprocessing, it has more or less sped up with the hardware, but overall compilation hasn't.
Some of that compilation laggardness is probably due to the fact that some of the algorithms have tough asymptotic complexity: just extending their scope to do a slightly better job causes the time to rise dramatically. However, even compiling with -O0 (optimization off), though faster, is still shockingly slow given the hardware. If I build that 18.2-second program with -O2, it still takes 6 seconds: an 8X difference compared to preprocessing and linking cached .o files in 750 ms. A far cry from the ancient wisdom that character- and token-level processing of the source dominates the compile time.
by RcouF1uZ4gsC on 1/8/19, 12:27 AM
In my opinion, this makes any conclusion dubious. If you really care about compile times in C++, step 0 is to make sure you have an adequate machine (at least a quad-core CPU, plenty of RAM, and an SSD). If the choice is between spending programmer time trying to optimize compile times and spending a couple hundred dollars on an SSD, then 99% of the time spending the money on the SSD will be the correct solution.
by lbrandy on 1/8/19, 1:07 AM
We'll have them Soon™.
by _0w8t on 1/8/19, 8:04 AM
The drawback is that sources within the jumbo translation unit cannot be compiled in parallel. So if one has access to an extremely parallel compilation farm, as developers at Google do, it will slow things down.
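As a hypothetical illustration (file names invented), a jumbo/unity translation unit is just a source file that #includes other sources, so shared headers are parsed once, but the whole unit becomes a single compiler invocation that cannot be split across cores:
// jumbo_unit.cpp (hypothetical) -- many sources compiled as one
#include "lexer.cpp"
#include "parser.cpp"
#include "codegen.cpp"
#include "optimizer.cpp"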
by mcv on 1/8/19, 12:30 PM
Compilation at the time took over 2 hours.
At some point I wrote a macro that replaced all those automatically generated const uints with #defines, and that cut compilation time to half an hour. The project lead quickly declared it the biggest productivity boost of the project.
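The comment doesn't show the generated header, but a plausible before/after of that change, with invented names, would look something like this:
// Before (hypothetical reconstruction): a generated header full of object
// definitions, included nearly everywhere.
const unsigned int MSG_LOGIN  = 0x0001;
const unsigned int MSG_LOGOUT = 0x0002;
// ... thousands more ...
// After the macro pass: plain textual constants, which that era's compiler
// handled far more cheaply in every translation unit that included the header.
#define MSG_LOGIN  0x0001
#define MSG_LOGOUT 0x0002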
by fizwhiz on 1/7/19, 11:57 PM
by timvisee on 1/8/19, 9:36 AM