by kbwt on 1/7/19, 11:17 PM with 109 comments
by AndyKelley on 1/8/19, 12:24 AM
/*
* The point of this file is to contain all the LLVM C++ API interaction so that:
* 1. The compile time of other files is kept under control.
* 2. Provide a C interface to the LLVM functions we need for self-hosting purposes.
* 3. Prevent C++ from infecting the rest of the project.
*/
// copied from include/llvm/ADT/Triple.h
enum ZigLLVM_ArchType {
ZigLLVM_UnknownArch,
ZigLLVM_arm, // ARM (little endian): arm, armv.*, xscale
ZigLLVM_armeb, // ARM (big endian): armeb
ZigLLVM_aarch64, // AArch64 (little endian): aarch64
...
and then in the .cpp file:
static_assert((Triple::ArchType)ZigLLVM_UnknownArch == Triple::UnknownArch, "");
static_assert((Triple::ArchType)ZigLLVM_arm == Triple::arm, "");
static_assert((Triple::ArchType)ZigLLVM_armeb == Triple::armeb, "");
static_assert((Triple::ArchType)ZigLLVM_aarch64 == Triple::aarch64, "");
static_assert((Triple::ArchType)ZigLLVM_aarch64_be == Triple::aarch64_be, "");
static_assert((Triple::ArchType)ZigLLVM_arc == Triple::arc, "");
...
I found it more convenient to redefine the enum and then static_assert that all the values are the same (which has to be updated with every LLVM upgrade) than to use the actual enum, which would pull in a bunch of other C++ headers. The file that has to use C++ headers takes about 3x as long to compile as Zig's ir.cpp file, which is nearing 30,000 lines of code but depends only on C-style header files.
by beached_whale on 1/8/19, 12:11 AM
by nanolith on 1/8/19, 12:03 AM
The compiler firewall strategy works fairly well in C++11 and even better in C++14. Create a public interface with minimal dependencies, and encapsulate the details for this interface in a pImpl (pointer to implementation). The latter can be defined in implementation source files, and it can use unique_ptr for simple resource management. C++14 added the missing make_unique, which eases the pImpl pattern.
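For concreteness, here is a minimal sketch of that firewall, using hypothetical names (Widget, Impl, frobnicate) that are not from the comment: the header forward-declares Impl and holds it through a unique_ptr, and only the .cpp that defines Impl has to see the expensive headers.
// widget.h (hypothetical) -- public interface with minimal dependencies
#include <memory>
class Widget {
public:
    Widget();
    ~Widget();                               // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;
    void frobnicate();
private:
    struct Impl;                             // forward declaration only
    std::unique_ptr<Impl> impl_;             // heavy headers never touch widget.h
};
// widget.cpp (hypothetical) -- the only file that pays for the expensive includes
struct Widget::Impl {
    int state = 0;                           // expensive-to-compile members live here
};
Widget::Widget() : impl_(std::make_unique<Impl>()) {}    // make_unique: C++14
Widget::~Widget() = default;                             // Impl is complete here
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;
void Widget::frobnicate() { ++impl_->state; }
Note that the destructor and move operations are defaulted in the .cpp, after Impl is complete; defaulting them in the header would force unique_ptr to instantiate the deleter against an incomplete type.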
That being said, compile times in C++ are typically going to be terrible if you are used to compiling C, Go, and other languages known for fast compilation. A build system with accurate dependency tracking and on-demand compilation (e.g. a directory watcher or, if you prefer IDEs, continuous compilation in the background) will eliminate a lot of this pain.
by AdieuToLogic on 1/8/19, 5:55 AM
Large-Scale C++ Software Design[0]
The techniques set forth therein are founded in real-world experience and can significantly reduce large-scale system build times. Granted, the book is dated and likely not entirely applicable to modern C++, yet it remains the best resource on insulating modules/subsystems and optimizing compilation times, IMHO.
0 - https://www.pearson.com/us/higher-education/program/Lakos-La...
by kazinator on 1/8/19, 2:40 AM
Recently, after ten years of not using ccache, I was playing with it again.
The speed-up you get from ccache today is quite a bit more than it was a decade ago; I was amazed.
ccache does not cache the result of preprocessing. Each time you build an object, ccache runs the source through the preprocessor to obtain the token-level translation unit, which is then hashed to see whether there is a hit (a ready-made .o file can be retrieved) or a miss (the preprocessed translation unit has to be compiled).
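Roughly, and not as ccache's actual code, that per-object decision could be sketched as a toy C++ class like this (the names and stub tools are invented for illustration; the real tool also hashes the compiler, flags, and more):
#include <string>
#include <unordered_map>
#include <functional>
#include <cstddef>
struct ToyCache {
    std::unordered_map<std::size_t, std::string> objects;   // hash of preprocessed TU -> object bytes
    std::string build(const std::string& source) {
        std::string tu = preprocess(source);                 // the preprocessor always runs
        std::size_t key = std::hash<std::string>{}(tu);      // hash the token-level translation unit
        auto it = objects.find(key);
        if (it != objects.end())
            return it->second;                               // hit: ready-made .o retrieved
        std::string obj = compile(tu);                       // miss: pay for the full compile
        objects.emplace(key, obj);
        return obj;
    }
    // Stand-ins for the real tools so the sketch is self-contained.
    static std::string preprocess(const std::string& s) { return "tokens:" + s; }
    static std::string compile(const std::string& tu) { return "obj:" + tu; }
};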
There is now more than a tenfold difference between preprocessing, hashing, and retrieving a .o file from the cache versus doing the compile job. I just timed one program: 750 milliseconds to rebuild with ccache (so everything is preprocessed and ready-made .o files are pulled out and linked), versus 18.2 seconds without ccache. A 24X difference! So, approximately speaking, preprocessing is less than 1/24th of the cost.
Ancient wisdom about C used to be that more than 50% of the compilation time is spent on preprocessing. That's the environment that motivated devices like precompiled headers, #pragma once, and compilers recognizing the #ifndef HEADER_H include-guard trick to avoid re-reading files.
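For reference, the include-guard idiom mentioned here looks like the following (HEADER_H is just the conventional placeholder name):
// header.h -- the classic include-guard trick; many compilers detect this
// exact pattern and skip re-reading the file on subsequent #includes.
#ifndef HEADER_H
#define HEADER_H
void some_function();   // declarations go here
#endif // HEADER_H
// The non-standard but widely supported alternative is simply:
// #pragma once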
Nowadays, those things hardly matter.
Nowadays, when you're building code, the rate at which .o's "pop out" of the build subjectively appears no faster than two decades ago, even though memory sizes, L1 and L2 cache sizes, CPU clock speeds, and disk capacities are vastly greater. Since not a lot of development has gone into preprocessing, it has more or less sped up with the hardware, but overall compilation hasn't.
Some of that compilation laggardness is probably due to the fact that some of the algorithms have tough asymptotic complexity: just extending their scope to do a slightly better job causes the time to rise dramatically. However, even compiling with -O0 (optimization off), though faster, is still shockingly slow given the hardware. If I build that 18.2-second program with -O2, it still takes 6 seconds: an 8X difference compared to preprocessing and linking cached .o files in 750 ms. A far cry from the ancient wisdom that character- and token-level processing of the source dominates the compile time.
by RcouF1uZ4gsC on 1/8/19, 12:27 AM
In my opinion, this makes any conclusion dubious. If you really care about compile times in C++, step 0 is to make sure you have an adequate machine (at least a quad-core CPU, plenty of RAM, and an SSD). If the choice is between spending programmer time trying to optimize compile times and spending a couple hundred dollars on an SSD, then 99% of the time spending the money on the SSD will be the correct solution.
by lbrandy on 1/8/19, 1:07 AM
We'll have them Soon™.
by _0w8t on 1/8/19, 8:04 AM
The drawback is that sources within the jumbo translation unit cannot be compiled in parallel. So if one has access to an extremely parallel compilation farm, as developers at Google do, it will slow things down.
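As a hypothetical illustration (file names invented), a jumbo/unity translation unit is just a source file that #includes other sources, so shared headers are parsed once, but the whole unit becomes a single compiler invocation that cannot be split across cores:
// jumbo_unit.cpp (hypothetical) -- many sources compiled as one
#include "lexer.cpp"
#include "parser.cpp"
#include "codegen.cpp"
#include "optimizer.cpp"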
by mcv on 1/8/19, 12:30 PM
Compilation at the time took over 2 hours.
At some point I wrote a macro that replaced all those automatically generated const uints with #defines, and that cut compilation time to half an hour. The project lead quickly declared it the biggest productivity boost of the project.
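The comment doesn't show the generated header, but a plausible before/after of that change, with invented names, would look something like this:
// Before (hypothetical reconstruction): a generated header full of object
// definitions, included nearly everywhere.
const unsigned int MSG_LOGIN  = 0x0001;
const unsigned int MSG_LOGOUT = 0x0002;
// ... thousands more ...
// After the macro pass: plain textual constants, which that era's compiler
// handled far more cheaply in every translation unit that included the header.
#define MSG_LOGIN  0x0001
#define MSG_LOGOUT 0x0002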
by fizwhiz on 1/7/19, 11:57 PM
by timvisee on 1/8/19, 9:36 AM