Development and contributions

Build from source

The required steps for building CatBoost depend on the implementation: the command-line application, the Python package, or the R package.

Note

The Windows build currently requires Microsoft Visual Studio 2015.3 with the v140 toolset and Windows 10 SDK (10.0.17134.0).

To build the command-line application:

  1. Clone the repository:

    git clone https://github.com/catboost/catboost.git
    
  2. Open the catboost/catboost/app directory from the local copy of the CatBoost repository.

  3. Run the following command:

    ../../ya make -d [-o <output directory>]
    

    Use the -j <number of threads> option to change the number of threads used when building the project.
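
The command for this build can also be composed in a script, for example for a CI job. The following is a minimal sketch rather than an official tool: build_cli_cmd is a hypothetical helper that only prints the command (a dry run), using the -d, -o and -j options described above. It assumes the command is run from the catboost/app directory and that the arguments contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of CatBoost): compose the `ya make`
# command for the command-line build.
# Usage: build_cli_cmd [output_dir] [threads]
build_cli_cmd() {
    out_dir=$1
    threads=$2
    cmd="../../ya make -d"
    # Append -o and -j only when a value was supplied.
    if [ -n "$out_dir" ]; then
        cmd="$cmd -o $out_dir"
    fi
    if [ -n "$threads" ]; then
        cmd="$cmd -j $threads"
    fi
    printf '%s\n' "$cmd"
}

# Dry run: print the command instead of executing it.
build_cli_cmd ./build 4
```

To execute the build for real, run the printed command from the catboost/app directory.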

The following packages are required to build and install the Python package:

  • python3
  • python3-dev
  • numpy
  • pandas

To install the Python package:

  1. Clone the repository:

    git clone https://github.com/catboost/catboost.git
    
  2. Open the catboost/catboost/python-package/catboost directory from the local copy of the CatBoost repository.

  3. Compile the packages using one of the following methods:

    • Use one of the Python versions provided by the ya make utility:

      ../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DUSE_SYSTEM_PYTHON=<Python version> [optional parameters]
      
    • Use one of the Python versions installed on the machine:

      ../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DOS_SDK=local -DPYTHON_CONFIG=<path to the required python-config> [optional parameters]
      

    Parameters

    Parameters that define the Python version to use for compiling. Only one of the following blocks of options can be used at a time.

    Use one of the Python versions provided by the ya make utility:

    -DUSE_SYSTEM_PYTHON
      The version of Python to use for compiling the package on machines without Python installed.

      The following Python versions are supported and can be defined as values for this parameter:

      • 2.7
      • 3.4
      • 3.5
      • 3.6

    Use one of the Python versions installed on the machine:

    -DPYTHON_CONFIG
      The path to the configuration of an installed Python version to use for compiling the package.

      Value examples:

      • python2-config for Python 2
      • python3-config for Python 3
      • /usr/bin/python2.7-config

      Notes:

      • The configuration must be explicitly named python3-config to successfully build the package for Python 3.
      • If you encounter problems with the Python configuration, manually redefine the following variables:
        • -DPYTHON_INCLUDE
        • -DPYTHON_LIBRARIES
        • -DPYTHON_LDFLAGS
        • -DPYTHON_FLAGS
        • -DPYTHON_BIN

    Optional parameters:

    -DCUDA_ROOT
      The path to CUDA. This parameter is required to support training on GPU.

    -DHAVE_CUDA=no
      Disables CUDA support, which speeds up compilation. By default, the package is built with CUDA support if CUDA Toolkit is installed.

    -o
      The directory to output the compiled package to. By default, the current directory is used.

    For example, the following command builds the package for Python 3 with support for training on GPU:

    ../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DOS_SDK=local -DPYTHON_CONFIG=python3-config -DCUDA_ROOT=/usr/local/cuda
    
  4. To use the built module on macOS or Linux, go up one directory and add it to PYTHONPATH:

    cd ../; export PYTHONPATH=$PYTHONPATH:$(pwd)
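
The choice between the two Python options and the optional CUDA parameters can be captured in a small script. This is a sketch under stated assumptions, not an official tool: python_build_cmd is a hypothetical helper that prints the command without running it, assumes the catboost/python-package/catboost working directory, and uses only the parameters described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CatBoost): compose the `ya make`
# command for building the Python package.
# Usage: python_build_cmd MODE VALUE [CUDA]
#   MODE=system : VALUE is a Python version for -DUSE_SYSTEM_PYTHON
#   MODE=config : VALUE is a python-config path for -DPYTHON_CONFIG
#   CUDA        : empty (keep the default), "no" (disable CUDA),
#                 or a path passed as -DCUDA_ROOT
python_build_cmd() {
    mode=$1
    value=$2
    cuda=$3
    cmd="../../../ya make -r -DUSE_ARCADIA_PYTHON=no"
    # The two blocks of Python options are mutually exclusive.
    case "$mode" in
        system) cmd="$cmd -DUSE_SYSTEM_PYTHON=$value" ;;
        config) cmd="$cmd -DOS_SDK=local -DPYTHON_CONFIG=$value" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    if [ "$cuda" = "no" ]; then
        cmd="$cmd -DHAVE_CUDA=no"
    elif [ -n "$cuda" ]; then
        cmd="$cmd -DCUDA_ROOT=$cuda"
    fi
    printf '%s\n' "$cmd"
}

# Dry run: Python 3 from the local machine, with GPU training support.
python_build_cmd config python3-config /usr/local/cuda
# Dry run: Python 3.6 provided by ya make, CUDA disabled.
python_build_cmd system 3.6 no
```

Rejecting an unknown mode keeps the script from silently producing a command with neither block of Python options.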
    

To build the R extension module, run the ya make command in the catboost/R-package/src directory.

Run tests

CatBoost provides tests that compare the resulting data against canonical data.

The required steps for running these tests depend on the implementation.

  1. Execute common tests:

    1. Open the catboost/pytest directory from the local copy of the CatBoost repository.

    2. Run the following command:

    ../../ya make -t -A [-Z]
    

    -Z: Optional key that replaces the canonical files. Use it when a code change intentionally alters the test results.

  2. Execute tests for the GPU implementation:

    1. Open the catboost/pytest/cuda_tests directory from the local copy of the CatBoost repository.

    2. Run the following command:

    ../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    
    • path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory on Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.

    • -Z: Optional key that replaces the canonical files. Use it when a code change intentionally alters the test results.
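
The CPU and GPU test commands differ only in the relative path to ya and in the -DCUDA_ROOT parameter. The sketch below illustrates this with test_cmd, a hypothetical helper that prints the command without running it; the "yes" value of its third argument is an arbitrary convention chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of CatBoost): compose the `ya make`
# command for running tests.
# Usage: test_cmd YA_PATH [CUDA_ROOT] [CANONIZE]
#   YA_PATH   : relative path to the ya script, e.g. ../../ya
#   CUDA_ROOT : path to the CUDA SDK, empty for the common tests
#   CANONIZE  : "yes" adds -Z to replace the canonical files
test_cmd() {
    ya=$1
    cuda_root=$2
    canonize=$3
    cmd="$ya make"
    if [ -n "$cuda_root" ]; then
        cmd="$cmd -DCUDA_ROOT=$cuda_root"
    fi
    cmd="$cmd -t -A"
    if [ "$canonize" = "yes" ]; then
        cmd="$cmd -Z"
    fi
    printf '%s\n' "$cmd"
}

# Common tests, run from catboost/pytest.
test_cmd ../../ya
# GPU tests, run from catboost/pytest/cuda_tests (X.Y is a placeholder).
test_cmd ../../../ya /usr/local/cuda-X.Y
```

Keeping -Z behind an explicit argument makes it harder to overwrite the canonical files by accident.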

After replacing the canonical files, use the diff tool of your version control system to analyze the differences.

  1. Execute common tests:

    1. Open the catboost/python-package/ut/medium directory from the local copy of the CatBoost repository.

    2. Run the following command:

    ../../../../ya make -t -A [-Z]
    

    -Z: Optional key that replaces the canonical files. Use it when a code change intentionally alters the test results.

  2. Execute tests for the GPU implementation:

    1. Open the catboost/python-package/ut/medium/gpu directory from the local copy of the CatBoost repository.

    2. Run the following command:

    ../../../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    
    • path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory on Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.

    • -Z: Optional key that replaces the canonical files. Use it when a code change intentionally alters the test results.
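
When several CUDA SDK versions are installed side by side, the value for -DCUDA_ROOT can be looked up rather than typed by hand. The following sketch assumes the /usr/local/cuda-X.Y layout mentioned above and a sort implementation that supports version sorting (-V, available in GNU coreutils); find_cuda_root is a hypothetical helper, not part of CatBoost.

```shell
#!/bin/sh
# Hypothetical helper (not part of CatBoost): print the newest
# cuda-X.Y directory under BASE_DIR, or nothing if none exists.
# Usage: find_cuda_root [BASE_DIR]   (BASE_DIR defaults to /usr/local)
find_cuda_root() {
    base=${1:-/usr/local}
    for dir in "$base"/cuda-*; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done | sort -V | tail -n 1
}

cuda_root=$(find_cuda_root /usr/local)
if [ -n "$cuda_root" ]; then
    # Dry run: print the GPU test command for the Python package.
    printf '%s\n' "../../../../../ya make -DCUDA_ROOT=$cuda_root -t -A"
else
    echo "No CUDA SDK found under /usr/local"
fi
```

Version sorting matters here because a plain lexical sort would rank cuda-9.0 above cuda-10.2.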

After replacing the canonical files, use the diff tool of your version control system to analyze the differences.

  1. Install additional R packages that are required to run tests:

    • caret
    • dplyr
    • jsonlite
    • testthat
  2. Open the R-package directory from the local copy of the CatBoost repository.

  3. Run the following command:

    R CMD check .
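
Installing the additional R packages can be scripted from the shell. The sketch below only prints an Rscript command instead of running it. r_install_expr is a hypothetical helper, and the CRAN mirror URL is an assumption that can be replaced with any other mirror; install.packages needs an explicit repos value when no mirror is configured for a non-interactive session.

```shell
#!/bin/sh
# Hypothetical helper (not part of CatBoost): print an R expression
# that installs the given packages from a CRAN mirror.
# Usage: r_install_expr PKG...
r_install_expr() {
    list=""
    for pkg in "$@"; do
        # Separate quoted package names with commas.
        if [ -n "$list" ]; then
            list="$list, "
        fi
        list="$list\"$pkg\""
    done
    printf '%s\n' "install.packages(c($list), repos = \"https://cloud.r-project.org\")"
}

# Dry run: print the Rscript call for the packages listed above.
r_expr=$(r_install_expr caret dplyr jsonlite testthat)
printf '%s\n' "Rscript -e '$r_expr'"
```

After the packages are installed, R CMD check . can be run from the R-package directory as described above.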
    

To run tests using the devtools package:

  1. Install devtools.

  2. Run the following command from the R session:

    devtools::test()
    

Develop in Windows

A solution for Visual Studio is available in the CatBoost repository:

catboost/msvs/arcadia.sln

Coding conventions

The following coding conventions must be followed in order to successfully contribute to the CatBoost project:

Versioning conventions

Do not change the package version when submitting pull requests. Yandex uses an internal repository for this purpose.

Yandex Contributor License Agreement

To contribute to CatBoost, you need to read the Yandex CLA and indicate that you agree to its terms. Details on how to do that and the text of the CLA can be found in CONTRIBUTING.md.