CMake superbuild
A superbuild is an approach in which a build system not only builds the main project, but also its dependencies, as part of the same build process. This post builds on top of the previous one and explores ways to achieve this with CMake.
TL;DR: In my opinion, a superbuild should act as a wrapper around a project and not impose any specific dependency management approach on users. However, with CMake I find that it introduces unnecessary indirections and overhead, and therefore superbuilds should not be used. I provide a small example of a CMake superbuild here.
You may already have a superbuild
If you manage your dependencies with add_subdirectory, it is already a
superbuild. But it has downsides, as explained in the previous post.
What if a dependency does not support CMake?
What if the user wants to use a package manager instead of building the
dependencies? Not to mention that a clean build will require rebuilding all
the dependencies.
The find_package strategy, however, is not a superbuild itself (that’s the
point, after all: it should delegate the dependency management). But it can be
used in a superbuild; let’s see how.
Treat your project like a dependency
The idea is simple: write a script that will build the dependencies
(in the right order), and then the main project. At the point where the
main project is being built, CMake will find the dependencies that were
just built with find_package.
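To make this concrete, here is a minimal sketch of what the CMakeLists.txt of such a main project might contain. The package names, the imported target names (depA::depA, depB::depB) and the main.cpp source file are assumptions made for the illustration; the real names depend on what each dependency installs. Note that nothing in this file knows about the superbuild.

```cmake
# main/CMakeLists.txt: hypothetical sketch of the standalone main project
cmake_minimum_required(VERSION 3.10.2)
project(main CXX)

# Both calls search the system paths and every prefix listed in
# CMAKE_PREFIX_PATH.
# Assumption: depA and depB each install a package config file (or a Find
# module is available) under these names.
find_package(depA REQUIRED)
find_package(depB REQUIRED)

# Assumption: main.cpp exists and the packages export these imported targets.
add_executable(main main.cpp)
target_link_libraries(main PRIVATE depA::depA depB::depB)
```

Whether the dependencies come from the system or from a local install prefix is then decided entirely on the command line, through CMAKE_PREFIX_PATH.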
Because it stays completely out of the main project, it can be written
in any language: shell, Python, or CMake.
With a shell script
Let us consider a simple example where our main project (main) has
two dependencies: depA (using Automake) and depB (using CMake).
Our main project depends on both, and depB depends on depA.
The superbuild shell script (let’s call it build.sh) would live in the
parent directory of main, like so:
.
├── build.sh
└── main
    └── CMakeLists.txt
Note that main is a completely standard, standalone CMake project.
Assuming that the dependencies are installed on the system, it could
be built normally (from the parent directory) with:
cmake -Bmain/build -Smain
cmake --build main/build
For a superbuild, we want to build the dependencies instead of relying
on the system libraries. So we want build.sh to:
- Fetch and build the dependencies
- Build the main project
Let’s first fetch the dependencies in order to better visualize the directory tree. The script could start like this:
#!/bin/sh
git clone https://example.com/depA
git clone https://example.com/depB
Which would result in the following tree:
.
├── build.sh
├── depA
│   ├── configure
│   └── Makefile
├── depB
│   └── CMakeLists.txt
└── main
    └── CMakeLists.txt
Then it would need to build the projects in order, using
the right tooling (note how depA uses Automake):
#!/bin/sh
# Stop at the first failing command
set -e
git clone https://example.com/depA
git clone https://example.com/depB
# Absolute path of the local install directory (./install)
INSTALL_DIR="$(pwd)/install"
# Build depA and install it in ./install
# (configure expects an absolute prefix; the subshell replaces pushd/popd,
# which are not available in POSIX sh)
(cd depA && ./configure --prefix="$INSTALL_DIR" && make install)
# Build depB and install it in ./install, too
# Note that depB depends on depA, so we point CMAKE_PREFIX_PATH
# to ./install (where depA is installed)
cmake -DCMAKE_PREFIX_PATH="$INSTALL_DIR" -DCMAKE_INSTALL_PREFIX="$INSTALL_DIR" -BdepB/build -SdepB
cmake --build depB/build --target install
# Build our main project, also pointing CMAKE_PREFIX_PATH to ./install
cmake -DCMAKE_PREFIX_PATH="$INSTALL_DIR" -Bmain/build -Smain
cmake --build main/build
That’s it, we have our superbuild, without changing anything in
the CMakeLists.txt of the main project. Now the user can either build
./main/CMakeLists.txt like a normal CMake project (which will try to find the
dependencies on the system), or call ./build.sh, which will build and install
the dependencies locally (in ./install) and then build the main
project with them.
Though it works, this shell script is probably not cross-platform. A CMake script would be better in that regard[1]. Let’s see how we can leverage our CMake helpers from the previous post.
With CMake
We can achieve the same result as the build.sh above with a CMake
script. It will look like this:
.
├── CMakeLists.txt
└── main
    └── CMakeLists.txt
I find it important to realise that ./CMakeLists.txt and
./main/CMakeLists.txt are two different CMake projects (just like
build.sh and ./main/CMakeLists.txt were two separate things)!
In other words, ./CMakeLists.txt will not call add_subdirectory(main).
That is not the typical setup (with consequences that we will discuss later):
usually, there is a root CMakeLists that calls sub-CMakeLists with
add_subdirectory, making a tree where all CMakeLists belong to the same
project. Here we really have two different CMake projects: the superbuild, and
the main project. But the superbuild will build multiple separate projects
(i.e. the main project and its dependencies).
The superbuild CMakeLists (i.e. ./CMakeLists.txt) could look like this:
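cmake_minimum_required(VERSION 3.10.2)
project(superbuild)
# ExternalProject_Add() is provided by the ExternalProject module
include(ExternalProject)
# depA uses Automake: it is configured, built and installed with its own
# tooling (make), whatever generator the superbuild itself uses
ExternalProject_Add(depA
  PREFIX depA
  GIT_REPOSITORY https://example.com/depA
  CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=${CMAKE_INSTALL_PREFIX}
  BUILD_COMMAND make
  INSTALL_COMMAND make install
)
# depB uses CMake and depends on depA
# (CMAKE_CACHE_ARGS expects the -D<var>:<type>=<value> form)
ExternalProject_Add(depB
  PREFIX depB
  GIT_REPOSITORY https://example.com/depB
  CMAKE_CACHE_ARGS
    -DCMAKE_INSTALL_PREFIX:PATH=${CMAKE_INSTALL_PREFIX}
    -DCMAKE_PREFIX_PATH:PATH=${CMAKE_PREFIX_PATH}
  DEPENDS depA
)
# The main project is treated like any other external project
ExternalProject_Add(main
  PREFIX main
  URL ${CMAKE_CURRENT_SOURCE_DIR}/main
  CMAKE_CACHE_ARGS
    -DCMAKE_INSTALL_PREFIX:PATH=${CMAKE_INSTALL_PREFIX}
    -DCMAKE_PREFIX_PATH:PATH=${CMAKE_PREFIX_PATH}
  DEPENDS depA depB
)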
And it would be called like so:
cmake -DCMAKE_INSTALL_PREFIX=install -DCMAKE_PREFIX_PATH=$(pwd)/install -Bbuild -S.
cmake --build build
This is now equivalent to the superbuild setup we had with build.sh, except
that instead of a shell script, we now use a CMake script. Again, the user can
choose to run the superbuild or to directly build the main project (and use the
dependencies installed on the system).
A real-life example of a one-file superbuild can be found here in the gRPC examples.
With CMake and helper scripts
Both superbuild scripts above look quite simple, but they don’t involve many dependencies, platforms and options. In reality, that file will tend to grow and become harder to maintain. That is why I would advise using a structure similar to my previous post, but adapted for the superbuild. With the CMake superbuild script, it looks like this:
.
├── CMakeLists.txt
├── dependencies
│   ├── depA
│   │   ├── Makefile
│   │   └── configure
│   └── depB
│       └── CMakeLists.txt
└── main
    └── CMakeLists.txt
The superbuild CMakeLists may now look like this:
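cmake_minimum_required(VERSION 3.10.2)
project(superbuild)
# ExternalProject_Add() is provided by the ExternalProject module
include(ExternalProject)
# Declares the depA and depB external projects
add_subdirectory(dependencies)
# The main project is built once its dependencies are installed
# (CMAKE_CACHE_ARGS expects the -D<var>:<type>=<value> form)
ExternalProject_Add(main
  PREFIX main
  URL ${CMAKE_CURRENT_SOURCE_DIR}/main
  CMAKE_CACHE_ARGS
    -DCMAKE_INSTALL_PREFIX:PATH=${CMAKE_INSTALL_PREFIX}
    -DCMAKE_PREFIX_PATH:PATH=${CMAKE_PREFIX_PATH}
  DEPENDS depA depB
)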
Now the superbuild calls the dependencies project with add_subdirectory(),
which in turn will call each dependency (also with add_subdirectory()), which
will eventually call ExternalProject_Add() for each dependency. And finally it
will call ExternalProject_Add() for the main project.
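As an illustration of that chain, the helper files could be sketched as follows. This assumes a dependencies/CMakeLists.txt and one CMakeLists.txt per dependency, in the spirit of the helper scripts from the previous post; the exact file contents are hypothetical.

```cmake
# dependencies/CMakeLists.txt: hypothetical helper script
# It can be configured on its own, or pulled in by the superbuild.
cmake_minimum_required(VERSION 3.10.2)
project(dependencies)

include(ExternalProject)

# Each subdirectory declares exactly one external project.
# depA comes first because depB depends on it.
add_subdirectory(depA)
add_subdirectory(depB)
```

```cmake
# dependencies/depB/CMakeLists.txt: hypothetical helper script
# Declares the depB external project; depA is declared the same way
# in dependencies/depA/CMakeLists.txt.
ExternalProject_Add(depB
  PREFIX depB
  GIT_REPOSITORY https://example.com/depB
  CMAKE_CACHE_ARGS
    -DCMAKE_INSTALL_PREFIX:PATH=${CMAKE_INSTALL_PREFIX}
    -DCMAKE_PREFIX_PATH:PATH=${CMAKE_PREFIX_PATH}
  DEPENDS depA
)
```

Because the dependencies directory is a complete CMake project, it can also be configured directly, without going through the superbuild.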
I do believe that this is the best structure for a superbuild, because:
- It doesn’t pollute the main project, which remains a standalone CMake project that can be built manually with the system libraries. You can even have your CI test that building with the system libraries keeps working.
- It is just a wrapper around the dependencies, which can still be built without the superbuild scheme (again, the CI can test it).
- It doesn’t use any exotic/advanced CMake features (ExternalProject_Add has been supported for many years and is virtually always available).
- The superbuild script does not have to be written with CMake. It can be done with shell scripts (by adapting the build.sh example above) or any language you want.
At the beginning of this post, I advised against superbuilds. Let’s now look at the disadvantages of this design[2].
Disadvantages
A superbuild is an abstraction on top of the build, and the goal of an abstraction is to make something either easier or faster to use. In this case, it tries to make it easier for users to build the project. But it comes at a cost.
Abstractions reduce flexibility
By definition, an abstraction cannot offer all the features of the lower-level
interface. Therefore, choices have to be made for every feature that gets
abstracted away.
For instance with the helper scripts, the user can decide where the
built dependencies should be installed (with CMAKE_INSTALL_PREFIX). They can
choose to install the dependencies in some place, and the main project in
another place (for instance in order to cache the dependencies instead of
rebuilding them over and over).
If we want to keep this flexibility in the superbuild, we will probably have
to add an option like -DDEPENDENCIES_INSTALL_PREFIX and document it. It may
add complexity: when -DCMAKE_INSTALL_PREFIX is specified alone, should it
impact both the main project and the dependencies, or only the main project? Or
should we add another option, e.g. -DMAIN_INSTALL_PREFIX? But then what do we
do with -DCMAKE_INSTALL_PREFIX?
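To show what such an option costs, here is a sketch of one possible answer among several, not a recommendation. DEPENDENCIES_INSTALL_PREFIX is the hypothetical option discussed above; in this variant it defaults to CMAKE_INSTALL_PREFIX, so that specifying -DCMAKE_INSTALL_PREFIX alone impacts both the dependencies and the main project.

```cmake
cmake_minimum_required(VERSION 3.10.2)
project(superbuild)

include(ExternalProject)

# Hypothetical option: install prefix of the dependencies. It defaults to
# CMAKE_INSTALL_PREFIX, so -DCMAKE_INSTALL_PREFIX alone affects everything.
set(DEPENDENCIES_INSTALL_PREFIX "${CMAKE_INSTALL_PREFIX}"
    CACHE PATH "Where the dependencies are installed")

ExternalProject_Add(depA
  PREFIX depA
  GIT_REPOSITORY https://example.com/depA
  CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=${DEPENDENCIES_INSTALL_PREFIX}
  BUILD_COMMAND make
  INSTALL_COMMAND make install
)

ExternalProject_Add(depB
  PREFIX depB
  GIT_REPOSITORY https://example.com/depB
  CMAKE_CACHE_ARGS
    -DCMAKE_INSTALL_PREFIX:PATH=${DEPENDENCIES_INSTALL_PREFIX}
    -DCMAKE_PREFIX_PATH:PATH=${DEPENDENCIES_INSTALL_PREFIX}
  DEPENDS depA
)

# Only the main project is installed in CMAKE_INSTALL_PREFIX; it looks for
# its dependencies in DEPENDENCIES_INSTALL_PREFIX.
ExternalProject_Add(main
  PREFIX main
  URL ${CMAKE_CURRENT_SOURCE_DIR}/main
  CMAKE_CACHE_ARGS
    -DCMAKE_INSTALL_PREFIX:PATH=${CMAKE_INSTALL_PREFIX}
    -DCMAKE_PREFIX_PATH:PATH=${DEPENDENCIES_INSTALL_PREFIX}
  DEPENDS depA depB
)
```

Even this small variant raises new questions: the default is stored in the cache at the first configure, so changing CMAKE_INSTALL_PREFIX later does not move the dependencies, and a CMAKE_PREFIX_PATH passed by the user is no longer forwarded. Each of those behaviours would have to be decided and documented.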
For every option we may want to add to the superbuild, we will have to think about how it impacts the dependencies, how it impacts the main project, and how to forward the option properly. And it is likely that from time to time, some user of the superbuild will want to add an option in order to access a lower-level feature that got abstracted away. And as a maintainer, you will have to decide whether you want to keep adding features (which will progressively destroy your abstraction) or to refuse to do it (which may piss off the users). That is a situation I personally do not like to be in.
Abstractions need to be maintained
Adding an abstraction should - as much as possible - not prevent advanced users
from using the lower-level interface. That is why we use find_package instead
of add_subdirectory in the first place: a package maintainer (e.g. for a Linux
distribution) will always want to use the system libraries. And as we saw above,
a user may want to build the dependencies separately instead of using the
superbuild.
Our design allows all of that, but it means that we have to maintain three ways of building the project (system libraries, helper scripts, superbuild). This is extra work, and maintainers usually have enough to do already.
Abstractions keep users ignorant
It is not always a bad thing; I could not use my computer without abstractions.
But it is a tradeoff, and it should not be abused. In my experience with
superbuilds, I have seen users recompiling all the dependencies every time they
needed a clean build in the main project, and then complaining that the build
“took forever”. And I honestly could not blame them, because the build system
was not obvious to those unfamiliar with CMake. Whereas without the superbuild
script, it is suddenly obvious that the CMakeLists.txt at the root is the main
project, and that ./dependencies/CMakeLists.txt is for the dependencies.
Beginners still have to learn how to call CMake, but they are C++ developers: they will have to learn that anyway. I may as well teach them how to do it with my project rather than putting effort into maintaining a superbuild abstraction.
Conclusion
I have designed and used superbuilds myself, and I used to think that it was a good idea. But my experience is that they are usually[2] an unnecessary abstraction that requires maintenance work and risks making the build system convoluted.
I did my best to explain how I believe it should be done, but my opinion is that
it should not be done. Instead I would recommend delegating the dependency
management to the user (by using the system libraries with find_package),
possibly providing helper scripts if there is a need to build the dependencies
from source (I typically do that when I need to cross-compile for Android or
iOS).