Development and contributions
Build from source
The required steps for building CatBoost depend on the implementation.
Note
The Windows build currently requires the Microsoft Visual Studio 2015.3 toolset (v140) and the Windows 10 SDK (10.0.17134.0).
To build the command-line version:
- Clone the repository:
  git clone https://github.com/catboost/catboost.git
- Open the catboost/catboost/app directory from the local copy of the CatBoost repository.
- Run the following command:
  ../../ya make -d [-o <output directory>]
  Use the -j <number of threads> option to change the number of threads used when building the project.
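The build command above can also be composed from a script. The following is a minimal sketch, not part of CatBoost: the function name build_command and the choice of one thread per CPU core are assumptions made for illustration, and only the -d, -o and -j options described above are used.

```python
import os
import shlex


def build_command(output_dir=None, threads=None):
    """Compose the argument list for the command-line build.

    Hypothetical helper: it only assembles the options named in this
    section (-d, -o, -j) and does not run anything.
    """
    args = ["../../ya", "make", "-d"]
    if output_dir is not None:
        args += ["-o", output_dir]
    if threads is not None:
        args += ["-j", str(threads)]
    return args


if __name__ == "__main__":
    # Assumption: use one build thread per available CPU core.
    cmd = build_command(output_dir="build", threads=os.cpu_count() or 1)
    # Print a shell-quoted command line to run from catboost/catboost/app.
    print(shlex.join(cmd))
```

The printed line is meant to be run from the catboost/catboost/app directory, since the path to ya is relative to it.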
The following packages are required for installation:
python3
python3-dev
numpy
pandas
To install the Python package:
- Clone the repository:
  git clone https://github.com/catboost/catboost.git
- Open the catboost/catboost/python-package/catboost directory from the local copy of the CatBoost repository.
- Compile the package using one of the following methods:
  - Use one of the Python versions provided by the ya make utility:
    ../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DUSE_SYSTEM_PYTHON=<Python version> [optional parameters]
  - Use one of the Python versions installed on the machine:
    ../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DOS_SDK=local -DPYTHON_CONFIG=<path to the required python-config> [optional parameters]
Parameters and their descriptions:

Parameters that define the Python version to use for compiling. Only one of the following blocks of options can be used at a time.

Use one of the Python versions provided by the ya make utility:
-DUSE_SYSTEM_PYTHON
  The version of Python to use for compiling the package on machines without installed Python. The following Python versions are supported and can be defined as values for this parameter:
  - 2.7
  - 3.4
  - 3.5
  - 3.6

Use one of the Python versions installed on the machine:
-DPYTHON_CONFIG
  Defines the path to the configuration of an installed Python version to use for compiling the package. Value examples:
  - python2-config for Python 2
  - python3-config for Python 3
  - /usr/bin/python2.7-config
  Notes:
  - The configuration must be explicitly named python3-config to successfully build the package for Python 3.
  - Manually redefine the following variables when encountering problems with the Python configuration:
    - -DPYTHON_INCLUDE
    - -DPYTHON_LIBRARIES
    - -DPYTHON_LDFLAGS
    - -DPYTHON_FLAGS
    - -DPYTHON_BIN

Optional parameters:
-DCUDA_ROOT
  The path to CUDA. This parameter is required to support training on GPU.
-DHAVE_CUDA=no
  Disable CUDA support. This speeds up compilation. By default, the package is built with CUDA support if CUDA Toolkit is installed.
-o
  The directory to output the compiled package to. By default, the current directory is used.

For example, the following command builds the package for Python 3 with training on GPU support:
../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DOS_SDK=local -DPYTHON_CONFIG=python3-config -DCUDA_ROOT=/usr/local/cuda
- Add the current directory to PYTHONPATH to use the built module on macOS or Linux:
  cd ../; export PYTHONPATH=$PYTHONPATH:$(pwd)
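The rule that only one way of selecting the Python version can be used at a time can be made explicit in a small script. The sketch below is illustrative only: python_build_args and its keyword names are hypothetical, and it merely assembles the options listed above without invoking ya make.

```python
def python_build_args(system_python=None, python_config=None,
                      cuda_root=None, output_dir=None):
    """Compose the ya make arguments for building the Python package.

    Hypothetical helper: exactly one of system_python (a version such as
    "3.6") or python_config (a path or name such as "python3-config")
    must be given, mirroring the two mutually exclusive option blocks.
    """
    if (system_python is None) == (python_config is None):
        raise ValueError("set exactly one of system_python or python_config")

    args = ["../../../ya", "make", "-r", "-DUSE_ARCADIA_PYTHON=no"]
    if system_python is not None:
        # Python version provided by the ya make utility.
        args.append(f"-DUSE_SYSTEM_PYTHON={system_python}")
    else:
        # Python version installed on the machine.
        args += ["-DOS_SDK=local", f"-DPYTHON_CONFIG={python_config}"]

    # Optional parameters.
    if cuda_root is not None:
        args.append(f"-DCUDA_ROOT={cuda_root}")
    if output_dir is not None:
        args += ["-o", output_dir]
    return args


if __name__ == "__main__":
    # Reproduces the Python 3 with GPU example from this section.
    print(" ".join(python_build_args(python_config="python3-config",
                                     cuda_root="/usr/local/cuda")))
```

Calling the function with both or neither of the two selectors raises ValueError, which catches a mixed-up command before a long build starts.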
You can build the R extension module by running the ya make command in the catboost/R-package/src directory.
Run tests
CatBoost provides tests that compare the resulting data against the canonical data.
The required steps for running these tests depend on the implementation.
To test the command-line version:
- Execute common tests:
  - Open the catboost/pytest directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../ya make -t -A [-Z]
    -Z: Optional key to replace the canonical files if the code breaks tests intentionally.
- Execute tests for the GPU implementation:
  - Open the catboost/pytest/cuda_tests directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    - path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory for Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.
    - -Z: Optional key to replace the canonical files if the code breaks tests intentionally.
- Use the VCS diff tool to analyze the differences.
To test the Python package:
- Execute common tests:
  - Open the catboost/python-package/ut/medium directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../../../ya make -t -A [-Z]
    -Z: Optional key to replace the canonical files if the code breaks tests intentionally.
- Execute tests for the GPU implementation:
  - Open the catboost/python-package/ut/medium/gpu directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    - path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory for Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.
    - -Z: Optional key to replace the canonical files if the code breaks tests intentionally.
- Use the VCS diff tool to analyze the differences.
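The number of ../ segments in the test commands above follows from how deep each test directory sits below the repository root. As a rough sketch, assuming the ya script lives at the repository root and the directory is given relative to that root (as the test paths in this section are), the prefix can be derived as follows. The function name ya_relative_path is hypothetical.

```python
from pathlib import PurePosixPath


def ya_relative_path(test_dir):
    """Return the relative path to ya from a directory inside the repository.

    Assumption: test_dir is relative to the repository root, where the
    ya script is located, so one ".." is needed per path component.
    """
    depth = len(PurePosixPath(test_dir).parts)
    return "/".join([".."] * depth + ["ya"])


if __name__ == "__main__":
    for directory in ("catboost/pytest",
                      "catboost/pytest/cuda_tests",
                      "catboost/python-package/ut/medium",
                      "catboost/python-package/ut/medium/gpu"):
        print(directory, "->", ya_relative_path(directory))
```

This reproduces the prefixes used in the test commands of this section, for example ../../ya for catboost/pytest.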
To test the R package:
- Install additional R packages that are required to run tests:
  - caret
  - dplyr
  - jsonlite
  - testthat
- Open the R-package directory from the local copy of the CatBoost repository.
- Run the following command:
  R CMD check .
To run tests using the devtools package:
- Install devtools.
- Run the following command from the R session:
  devtools::test()
Develop in Windows
A solution for Visual Studio is available in the CatBoost repository:
catboost/msvs/arcadia.sln
Coding conventions
The following coding conventions must be followed in order to successfully contribute to the CatBoost project:
- C++ style guide
- pep8 for Python
Versioning conventions
Do not change the package version when submitting pull requests. Yandex uses an internal repository for this purpose.
Yandex Contributor License Agreement
To contribute to CatBoost, you need to read the Yandex CLA and indicate that you agree to its terms. Details of how to do that and the text of the CLA can be found in CONTRIBUTING.md.