Build environment setup for CMake

Warning

CatBoost has used a CMake-based build process since this commit. Previously, Ya Make (Yandex's build system) was used.

To build CatBoost using Ya Make, see here.

In the rest of this document, $CATBOOST_SRC_ROOT refers to the root directory of the local working copy of the source code cloned from the CatBoost GitHub repository.

Host platform refers to the operating system and CPU architecture you run the build on.

Target platform refers to the operating system and CPU architecture you run the build for (where you intend to run built artifacts such as the executable CLI application, the dynamic library, the Python extension library, etc.).

Possible host and target platform combinations:

| Host platform | Target platform |
|---|---|
| Linux x86_64 | Linux or Android |
| Linux non x86_64 | Linux |
| macOS x86_64 or arm64 | macOS x86_64 or arm64 or universal binaries |
| Windows x86_64 | Windows x86_64 |

Native artifacts build requirements

Python interpreter

A Python 3.x interpreter is required. Python is used by some auxiliary scripts and by the Conan package manager.

For revisions before 98df6bf, the Python installation also had to have the six package installed.

CMake

| Condition | Minimum version |
|---|---|
| Target OS is Windows, build without CUDA support | 3.24 |
| Target OS is Windows, build with CUDA support | 3.21 |
| Target OS is Android | 3.21 |
| CUDA support, target OS is not Windows | 3.18 |
| None of the above | 3.15 |
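
The lookup above can be sketched in Python. This is only an illustration of how the rows are applied from top to bottom; the function names and the string values accepted for target_os are assumptions made for this example and are not part of CatBoost.

```python
def minimum_cmake_version(target_os: str, with_cuda: bool) -> str:
    """Return the minimum CMake version for a build configuration.

    The conditions are checked in the same order as the rows of the table.
    """
    target_os = target_os.lower()
    if target_os == "windows":
        return "3.21" if with_cuda else "3.24"
    if target_os == "android":
        return "3.21"
    if with_cuda:
        return "3.18"
    return "3.15"


def version_tuple(version: str) -> tuple:
    """Convert a purely numeric dotted version such as '3.24.1' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, target_os: str, with_cuda: bool) -> bool:
    """Check an installed CMake version against the required minimum."""
    required = minimum_cmake_version(target_os, with_cuda)
    return version_tuple(installed) >= version_tuple(required)
```

For example, meets_minimum("3.22.1", "linux", True) compares the installed version with 3.18, while the same installed version fails the check for a Windows build without CUDA.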

Android NDK (only for Android target platform)

Depending on host OS:

Linux:

  • gcc compiler: not used to compile CatBoost code itself, but used to build dependencies as Conan packages.
  • clang compilers: version 14+, plus version 12 as a CUDA host compiler if you want to build with CUDA support.
  • lld linker, version 7+

For the Linux target, the default CMake toolchain assumes that clang and clang++ are available from the command line and uses them to compile CatBoost components. If the default versions of clang and clang++ are not the ones you intend to build with, modify the toolchain file $CATBOOST_SRC_ROOT/build/toolchains/clang.toolchain: replace all occurrences of clang and clang++ with clang-$CLANG_VERSION and clang++-$CLANG_VERSION respectively, where $CLANG_VERSION is the version of clang you want to use, for example 16 or 17 (it must already be installed).
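
The replacement described above can be scripted. The following is a minimal sketch, assuming the toolchain file refers to the compilers by the bare names clang and clang++; the function names are hypothetical, and the regular expression deliberately skips names that already carry a suffix (such as clang-12) so that running it twice does not stack suffixes.

```python
import re
from pathlib import Path

# Matches the bare names "clang" and "clang++", but not names that are
# already suffixed (clang-12, clang++-16) or embedded in longer identifiers.
_CLANG_NAME = re.compile(r"\bclang(\+\+)?(?![\w+-])")


def pin_clang_version(text: str, clang_version: int) -> str:
    """Append -<clang_version> to every bare clang / clang++ in the text."""
    return _CLANG_NAME.sub(lambda m: f"{m.group(0)}-{clang_version}", text)


def pin_clang_in_toolchain(src_root: str, clang_version: int) -> None:
    """Rewrite build/toolchains/clang.toolchain under src_root in place."""
    path = Path(src_root) / "build" / "toolchains" / "clang.toolchain"
    path.write_text(pin_clang_version(path.read_text(), clang_version))
```

Calling pin_clang_in_toolchain with the value of $CATBOOST_SRC_ROOT and 16 would rewrite the file to use clang-16 and clang++-16. Review the resulting file before building, since the actual contents of the toolchain file may contain other uses of the word that this sketch does not anticipate.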

For compilation with CUDA support the default CMake toolchain assumes that clang-12 is available from the command line.

For revisions before 136f14f, the minimum supported clang version was 12.

The Android target uses its own CMake toolchain, and all compiler tools specified there are provided by the NDK.

macOS:

  • Xcode command line tools (must contain clang version 14+, so the Xcode version must be 14.0 or later as well)

For revisions before 136f14f, the minimum supported clang version was 12 (so the Xcode version had to be 12.0+ as well).

Windows:

  • Windows 10 or Windows 11 SDK (usually installed as part of the Microsoft Visual Studio setup)

  • for builds without CUDA support:

    • Microsoft Visual Studio 2022 with the clang-cl compiler version 14+ installed (can be selected in the Individual components pane of the Visual Studio Installer for Visual Studio 2022). See details here
  • for builds with CUDA support:

    • Microsoft Visual Studio 2019 or 2022 with MSVC v142 - C++ x64/x86 build tools version v14.28 - 16.x or v14.29 - 16.x (can be selected in the Individual components pane of the Visual Studio Installer for a particular Visual Studio version)

For revisions before d5ac776, builds without CUDA also used MSVC v142 - Visual Studio 2019 C++ x64/x86 build tools version v14.28 - 16.8 or v14.28 - 16.9.

For revisions between d5ac776 and 136f14f, the minimum supported clang-cl version for builds without CUDA support was 12 (so Visual Studio 2019, which includes it, was also supported).

CUDA toolkit (only if CUDA support is needed)

Supported only for Linux and Windows host and target platforms.

CUDA toolkit needs to be installed.

CUDA version 11.8 is supported by default (because it covers the largest set of supported target CUDA compute architectures).

Other CUDA versions (11.4+) can also be used but require changing target compute architecture options in affected CMake targets.

For revisions before 45cc2e1, the minimum supported CUDA version was 11.0.

Conan

Versions 1.57.0 to 1.62.0 are supported. Version 1.62.0 is required if you use Python 3.12. Support for version 2.x is in progress.

Conan is used for some dependencies.

The conan command should be available from the command line.

Make sure that the path to the Conan cache does not contain spaces, because this causes issues with some projects. The default cache location can be overridden by setting the CONAN_USER_HOME environment variable.
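
The Conan constraints above can be expressed as a small pre-build check. This is a sketch only: the function names are hypothetical, and the fallback to the user's home directory when CONAN_USER_HOME is unset is an assumption about Conan 1.x defaults rather than something stated in this document.

```python
import os


def _version_tuple(version: str) -> tuple:
    """Convert a purely numeric dotted version such as '1.62.0' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def conan_version_supported(conan_version: str, python_version: tuple) -> bool:
    """Check a Conan version against the supported range.

    Python 3.12 requires exactly 1.62.0; other Python versions accept
    1.57.0 through 1.62.0. Python versions newer than 3.12 are not covered
    by the text and are treated like the general case here.
    """
    version = _version_tuple(conan_version)
    if tuple(python_version[:2]) == (3, 12):
        return version == (1, 62, 0)
    return (1, 57, 0) <= version <= (1, 62, 0)


def conan_home_has_spaces(env=os.environ) -> bool:
    """Report whether the Conan home directory path contains a space.

    Assumption: when CONAN_USER_HOME is unset, the user's home directory is used.
    """
    conan_home = env.get("CONAN_USER_HOME", os.path.expanduser("~"))
    return " " in conan_home
```

If conan_home_has_spaces() returns True, pointing CONAN_USER_HOME at a directory without spaces before running the build avoids the issue.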

Build system for CMake

Ninja is the preferred build system for CMake.

The ninja command should be available from the command line.

Alternatively, on Windows you could also use Visual Studio generators for CMake.

  • For builds with CUDA, use the Visual Studio 16 2019 generator and specify the required toolset version when calling CMake by adding -T version=14.28 to the command line.
  • For builds without CUDA, use the Visual Studio 17 2022 generator and specify the required ClangCL toolset when calling CMake by adding -T ClangCL to the command line.
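
The generator choices above can be summarized as a helper that assembles the corresponding CMake arguments. This is an illustrative sketch: the function name and its parameters are hypothetical, and only the -G and -T values come from the text.

```python
from typing import List


def cmake_generator_args(host_os: str, with_cuda: bool,
                         use_visual_studio: bool = False) -> List[str]:
    """Return the -G / -T arguments for the chosen CMake generator.

    Ninja is the preferred default; Visual Studio generators are a
    Windows-only alternative.
    """
    if not use_visual_studio:
        return ["-G", "Ninja"]
    if host_os.lower() != "windows":
        raise ValueError("Visual Studio generators are only available on Windows")
    if with_cuda:
        return ["-G", "Visual Studio 16 2019", "-T", "version=14.28"]
    return ["-G", "Visual Studio 17 2022", "-T", "ClangCL"]
```

The returned list is meant to be appended to the rest of a cmake command line, for example when passing arguments to subprocess.run.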

Using the Unix Makefiles CMake generator on macOS and Linux is possible but not recommended, because of some issues with properly taking dependencies into account.

JDK (only for components with JVM API)

You have to install a JDK to build components with a JVM API (JVM applier and CatBoost for Apache Spark).
The JDK version has to be 8+ for JVM applier and strictly 8 for CatBoost for Apache Spark.

Set the JAVA_HOME environment variable to the path of the JDK installation to be used during the build.
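
As a rough pre-build check, the JDK requirements above can be sketched as follows. The component labels "jvm-applier" and "spark" and the function names are hypothetical, chosen only for this example; the check on JAVA_HOME merely verifies that the directory has a bin subdirectory and does not inspect the JDK itself.

```python
import os
from pathlib import Path


def jdk_version_acceptable(major_version: int, component: str) -> bool:
    """Check a JDK major version against the requirement for a component.

    "jvm-applier" accepts 8 or newer; "spark" (CatBoost for Apache Spark)
    accepts exactly 8.
    """
    if component == "spark":
        return major_version == 8
    if component == "jvm-applier":
        return major_version >= 8
    raise ValueError(f"unknown component: {component!r}")


def java_home_looks_valid(env=os.environ) -> bool:
    """Return True if JAVA_HOME is set and contains a bin directory."""
    java_home = env.get("JAVA_HOME")
    return bool(java_home) and (Path(java_home) / "bin").is_dir()
```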

Python development artifacts (only for Python package)

You have to install Python development artifacts (Python headers in an include directory and the Python library for building modules).

Note that they are specific to the CPython implementation of Python. CatBoost does not currently support other Python implementations such as PyPy, Jython or IronPython.

One convenient way to install different Python versions with development artifacts in one step is to use pyenv (or its variant for Windows, pyenv-win).
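
To see whether the current interpreter is CPython and whether its headers are present, a check along these lines can help. It is a minimal sketch that uses only the standard library modules platform and sysconfig; the function name and the keys of the returned dictionary are hypothetical, and the check covers only the Python.h header, not the Python library itself.

```python
import platform
import sysconfig
from pathlib import Path


def python_dev_artifacts_report() -> dict:
    """Collect basic facts about the running interpreter's development files."""
    include_dir = Path(sysconfig.get_paths()["include"])
    implementation = platform.python_implementation()
    return {
        "implementation": implementation,
        "is_cpython": implementation == "CPython",
        "include_dir": str(include_dir),
        # Python.h is the main header needed to build extension modules.
        "has_python_h": (include_dir / "Python.h").is_file(),
    }


if __name__ == "__main__":
    for key, value in python_dev_artifacts_report().items():
        print(f"{key}: {value}")
```

Run it with the interpreter you intend to build the Python package for. If has_python_h is False, the development artifacts for that interpreter are probably not installed.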