Development and contributions
Build from source
Run tests
Warning
CatBoost has used a CMake-based build process since this commit. Previously, Ya Make
(Yandex's build system) was used.
CMake-based build tests
C/C++ libraries
C/C++ libraries keep their tests in ut subdirectories in the source tree. For a library in x/y/z, the corresponding test code is in x/y/z/ut and the target name is x-y-z-ut.
To run the tests, run CMake and then build the corresponding x-y-z-ut target. Building this target produces the executable ${CMAKE_BUILD_DIR}/x/y/z/x-y-z-ut. Run this executable to execute all the tests.
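The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper names ut_target_name and ut_executable are hypothetical, and the sketch assumes the library path components are joined with hyphens exactly as described, with no other normalization.

```python
from pathlib import PurePosixPath


def ut_target_name(library_dir: str) -> str:
    """Derive the test target name for a library directory such as 'x/y/z'."""
    # Join the path components with hyphens and append the 'ut' suffix.
    return "-".join(PurePosixPath(library_dir).parts) + "-ut"


def ut_executable(build_dir: str, library_dir: str) -> str:
    """Derive the path of the test executable inside the CMake build directory."""
    return str(PurePosixPath(build_dir) / library_dir / ut_target_name(library_dir))
```

For example, ut_target_name("x/y/z") gives the target to pass to the build tool, and ut_executable(build_dir, "x/y/z") gives the binary to run afterwards.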
R package
- Install the additional R packages that are required to run the tests:
  - caret
  - dplyr
  - jsonlite
  - testthat
- Open the R-package directory in the local copy of the CatBoost repository.
- Run the following command:
  R CMD check .
To run the tests using the devtools package:
- Install devtools.
- Run the following command from the R session:
  devtools::test()
CLI
- Install the testpath, pytest, pandas and catboost packages for the Python interpreter you intend to use. The catboost package is used for reading column description files with catboost.utils.read_cd.
  Optionally, install pytest-xdist and pytest-randomly to run the tests in parallel, which is faster.
- Build the CLI binary (target catboost for Ninja or another build tool) and the supplementary tool that compares the results generated by the tests with the canonical ones (target limited_precision_dsv_diff for Ninja or another build tool).
- Set the following environment variables:
  - CMAKE_SOURCE_DIR: the root of the local copy of the CatBoost repository.
  - CMAKE_BINARY_DIR: the root of the build directory that has been generated by CMake and where the aforementioned targets have been built.
  - TEST_OUTPUT_DIR: the root of the directory where the tests' temporary data will be generated.
  - PORT_SYNC_PATH: the path to the directory that will be used for synchronizing network port allocation. The directory is created if it does not exist.
  - HAVE_CUDA: set to 1 if you want to run the tests on GPU with CUDA, set to 0 otherwise.
- Open the catboost/pytest directory in the local copy of the CatBoost repository.
- Run python -m pytest or, if you use pytest-xdist, python -m pytest -n <parallel_worker_count> or python -m pytest -n auto (in the auto case the number of parallel workers equals the total number of detected CPU cores).
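The environment setup and the pytest invocation described in the steps above could be wrapped in a small launcher. The following is a minimal sketch, not part of CatBoost: the function names cli_test_env and pytest_command are hypothetical, and only the environment variable names and the pytest flags come from the steps above.

```python
import os
import sys


def cli_test_env(source_dir, binary_dir, output_dir, port_sync_dir,
                 have_cuda=False, base_env=None):
    """Return a copy of the environment with the variables the CLI tests expect."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "CMAKE_SOURCE_DIR": str(source_dir),   # root of the repository copy
        "CMAKE_BINARY_DIR": str(binary_dir),   # root of the CMake build directory
        "TEST_OUTPUT_DIR": str(output_dir),    # where temporary test data goes
        "PORT_SYNC_PATH": str(port_sync_dir),  # port allocation synchronization
        "HAVE_CUDA": "1" if have_cuda else "0",
    })
    return env


def pytest_command(workers=None):
    """Build the pytest command line; 'workers' needs pytest-xdist installed."""
    cmd = [sys.executable, "-m", "pytest"]
    if workers is not None:
        cmd += ["-n", str(workers)]  # a number, or "auto" for one per CPU core
    return cmd
```

A caller would then pass both results to subprocess.run, with cwd set to the catboost/pytest directory and env set to the dictionary returned by cli_test_env. Using sys.executable keeps the tests tied to the interpreter for which the packages were installed.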
Python package
The tests check the catboost module of the Python interpreter you run them with, so if you want to test a catboost Python package built from source, build and install it first.
- Install the testpath, pytest, pandas, ipywidgets and scikit-learn packages for the Python interpreter you intend to use.
  Optionally, install pytest-xdist and pytest-randomly to run the tests in parallel, which is faster.
- Build the supplementary tools that compare the results generated by the tests with the canonical ones (targets limited_precision_dsv_diff, limited_precision_json_diff, model_comparator for Ninja or another build tool).
- Set the following environment variables:
  - CMAKE_SOURCE_DIR: the root of the local copy of the CatBoost repository.
  - CMAKE_BINARY_DIR: the root of the build directory that has been generated by CMake and where the aforementioned targets have been built.
  - TEST_OUTPUT_DIR: the root of the directory where the tests' temporary data will be generated.
  - PORT_SYNC_PATH: the path to the directory that will be used for synchronizing network port allocation. The directory is created if it does not exist.
- Open the catboost/python-package/ut/medium directory in the local copy of the CatBoost repository.
- Run python -m pytest or, if you use pytest-xdist, python -m pytest -n <parallel_worker_count> or python -m pytest -n auto (in the auto case the number of parallel workers equals the total number of detected CPU cores).
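Because the tests exercise whichever catboost module the current interpreter imports, it can help to check where that module comes from before running them. The sketch below uses only the standard library; the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name="catboost"):
    """Return the file the interpreter would import for 'name', or None if absent."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Printing module_origin() with the interpreter that will run pytest shows whether the package built from source or some other installed copy would be tested. A result of None means the package is not installed for that interpreter.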
Warning
Tests on GPU with CUDA are run if and only if a GPU with installed CUDA drivers is present.
JVM applier
Open the catboost/jvm-packages/catboost4j-prediction directory in the local copy of the CatBoost repository. Run the standard mvn test command.
To run the tests on GPU as well, add the -DtestOnGPU=1 command line flag.
CatBoost for Apache Spark
See building CatBoost for Apache Spark from source. Use the standard mvn test command.
YaMake-based build tests
Warning
The following documentation describes running tests using Ya Make, which applies only to versions prior to this commit.
CatBoost provides tests that check whether the resulting data matches the canonical data.
The required steps for running these tests depend on the implementation.
CLI
- Execute common tests:
  - Open the catboost/pytest directory in the local copy of the CatBoost repository.
  - Run the following command:
    ../../ya make -t -A [-Z]
    -Z: optional key that replaces the canonical files if the code breaks the tests intentionally.
- Execute tests for the GPU implementation:
  - Open the catboost/pytest/cuda_tests directory in the local copy of the CatBoost repository.
  - Run the following command:
    ../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    - path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory on Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.
    - -Z: optional key that replaces the canonical files if the code breaks the tests intentionally.
- Use the VCS diff tool to analyze the differences.
Python package
- Execute common tests:
  - Open the catboost/python-package/ut/medium directory in the local copy of the CatBoost repository.
  - Run the following command:
    ../../../../ya make -t -A [-Z]
    -Z: optional key that replaces the canonical files if the code breaks the tests intentionally.
- Execute tests for the GPU implementation:
  - Open the catboost/python-package/ut/medium/gpu directory in the local copy of the CatBoost repository.
  - Run the following command:
    ../../../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    - path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory on Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.
    - -Z: optional key that replaces the canonical files if the code breaks the tests intentionally.
- Use the VCS diff tool to analyze the differences.
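The Ya Make commands in this section differ only in how many directory levels separate the test directory from the repository root, plus the optional CUDA and -Z arguments. A minimal sketch of that pattern follows. The function name ya_make_test_command is hypothetical, and the sketch assumes the ya script sits in the repository root, as the relative paths in the commands suggest.

```python
from pathlib import PurePosixPath


def ya_make_test_command(test_dir, cuda_root=None, canonize=False):
    """Build the ya make test command for a directory relative to the repo root."""
    # One '..' per path component leads from the test directory back to the root.
    depth = len(PurePosixPath(test_dir).parts)
    cmd = ["/".join([".."] * depth + ["ya"]), "make"]
    if cuda_root is not None:
        cmd.append(f"-DCUDA_ROOT={cuda_root}")  # only for the GPU test directories
    cmd += ["-t", "-A"]
    if canonize:
        cmd.append("-Z")  # replace canonical files after an intentional change
    return cmd
```

The returned list is meant to be run from inside test_dir, for example with subprocess.run(cmd, cwd=test_dir).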
R package
- Install the additional R packages that are required to run the tests:
  - caret
  - dplyr
  - jsonlite
  - testthat
- Open the R-package directory in the local copy of the CatBoost repository.
- Run the following command:
  R CMD check .
To run the tests using the devtools package:
- Install devtools.
- Run the following command from the R session:
  devtools::test()
Microsoft Visual Studio solution
Warning
A ready-made Microsoft Visual Studio solution was provided until this commit.
For versions after this commit, it is recommended to generate a Microsoft Visual Studio 2019 solution using the corresponding CMake generator.
A solution for Visual Studio is available in the CatBoost repository:
catboost/msvs/arcadia.sln
Coding conventions
The following coding conventions must be followed in order to successfully contribute to the CatBoost project:
- C++ style guide
- pep8 for Python
Versioning conventions
Do not change the package version when submitting pull requests. Yandex uses an internal repository for this purpose.
Yandex Contributor License Agreement
To contribute to CatBoost you need to read the Yandex CLA and indicate that you agree to its terms. Details of how to do that and the text of the CLA can be found in CONTRIBUTING.md.