Development and contributions
Build from source
Run tests
Warning
CatBoost has used a CMake-based build process since this commit. Previously, Ya Make (Yandex's build system) was used.
CMake-based build tests
C/C++ libraries
C/C++ libraries keep their tests in ut subdirectories in the source tree. For a library in x/y/z, the corresponding test code is in x/y/z/ut and the target name is x-y-z-ut.
To run the tests, run CMake and then build the corresponding x-y-z-ut target. Building this target produces the executable ${CMAKE_BUILD_DIR}/x/y/z/x-y-z-ut. Run this executable to execute all the tests.
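The naming rule for C/C++ test targets can be sketched in a few lines of Python. This is only an illustration: the helper names ut_target and ut_executable and the build directory value are hypothetical, and only the x/y/z to x-y-z-ut mapping is taken from the text.

```python
import posixpath


def ut_target(library_path: str) -> str:
    """Return the unit-test target name for a library path such as 'x/y/z'."""
    parts = [p for p in library_path.strip("/").split("/") if p]
    return "-".join(parts) + "-ut"


def ut_executable(build_dir: str, library_path: str) -> str:
    """Return the expected location of the test executable inside the build directory."""
    return posixpath.join(build_dir, library_path.strip("/"), ut_target(library_path))


# "build" is a hypothetical CMake build directory; substitute your own.
print(ut_target("x/y/z"))               # x-y-z-ut
print(ut_executable("build", "x/y/z"))  # build/x/y/z/x-y-z-ut
```

On Windows the produced executable would normally also carry an .exe suffix, which this sketch does not add.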
R package
- Install the additional R packages that are required to run tests:
  - caret
  - dplyr
  - jsonlite
  - testthat
- Open the R-package directory from the local copy of the CatBoost repository.
- Run the following command:
  R CMD check .
To run tests using the devtools package:
- Install devtools.
- Run the following command from the R session:
  devtools::test()
CLI
- Install the pytest, pandas and catboost packages for the Python interpreter you intend to use (catboost is used for reading column description files with catboost.utils.read_cd).
  Optionally install pytest-xdist and pytest-randomly to run tests in parallel (this is faster).
- Build the CLI binary (target catboost for Ninja or another build tool) and a supplementary tool that compares the results generated as test output with the canonical ones (target limited_precision_dsv_diff for Ninja or another build tool).
- Set the following environment variables:
  - CMAKE_SOURCE_DIR to the root of the local copy of the CatBoost repository.
  - CMAKE_BINARY_DIR to the root of the build directory that has been generated by CMake and where the aforementioned targets have been built.
  - TEST_OUTPUT_DIR to the root of the directory where temporary test data will be generated.
  - PORT_SYNC_PATH to the path of the directory that will be used for network port allocation synchronization. The directory will be created if it does not exist.
  - HAVE_CUDA to 1 if you want to run tests on GPU with CUDA, or to 0 otherwise.
- Open the catboost/pytest directory from the local copy of the CatBoost repository.
- Run python -m pytest or (if you use pytest-xdist) python -m pytest -n <parallel_worker_count> or python -m pytest -n auto (in the auto case the number of parallel workers equals the total number of detected CPU cores).
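As a rough sketch, the environment setup and pytest invocation for the CLI tests could be scripted as follows. The function names and all paths are hypothetical placeholders; only the variable names and the pytest arguments come from the steps above.

```python
import os
import subprocess
import sys


def cli_test_env(source_dir, binary_dir, output_dir, port_sync_dir, have_cuda=False):
    """Return a copy of the current environment with the variables the CLI tests read."""
    env = dict(os.environ)
    env.update({
        "CMAKE_SOURCE_DIR": source_dir,
        "CMAKE_BINARY_DIR": binary_dir,
        "TEST_OUTPUT_DIR": output_dir,
        "PORT_SYNC_PATH": port_sync_dir,
        "HAVE_CUDA": "1" if have_cuda else "0",
    })
    return env


def pytest_command(workers=None):
    """Build the pytest invocation; workers may be None, an int, or 'auto' (needs pytest-xdist)."""
    cmd = [sys.executable, "-m", "pytest"]
    if workers is not None:
        cmd += ["-n", str(workers)]
    return cmd


def run_cli_tests(env, workers=None):
    """Run pytest from catboost/pytest inside the checkout named by CMAKE_SOURCE_DIR."""
    test_dir = os.path.join(env["CMAKE_SOURCE_DIR"], "catboost", "pytest")
    return subprocess.run(pytest_command(workers), cwd=test_dir, env=env, check=False)


# Hypothetical paths; replace them with your own checkout and build directories.
env = cli_test_env("/path/to/catboost", "/path/to/build", "/tmp/catboost_tests", "/tmp/port_sync")
# run_cli_tests(env, workers="auto")  # uncomment once the targets have been built
```

Using sys.executable keeps the tests bound to the same interpreter that has pytest, pandas and catboost installed.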
Python package
Tests check the catboost module for the Python interpreter you run them with, so if you want to test a catboost Python package built from source, build and install it first.
- Install the pytest, pandas, ipywidgets and scikit-learn packages for the Python interpreter you intend to use.
  Optionally install pytest-xdist and pytest-randomly to run tests in parallel (this is faster).
- Build the supplementary tools that compare the results generated as test output with the canonical ones (targets limited_precision_dsv_diff, limited_precision_json_diff, model_comparator for Ninja or another build tool).
- Set the following environment variables:
  - CMAKE_SOURCE_DIR to the root of the local copy of the CatBoost repository.
  - CMAKE_BINARY_DIR to the root of the build directory that has been generated by CMake and where the aforementioned targets have been built.
  - TEST_OUTPUT_DIR to the root of the directory where temporary test data will be generated.
  - PORT_SYNC_PATH to the path of the directory that will be used for network port allocation synchronization. The directory will be created if it does not exist.
- Open the catboost/python-package/ut/medium directory from the local copy of the CatBoost repository.
- Run python -m pytest or (if you use pytest-xdist) python -m pytest -n <parallel_worker_count> or python -m pytest -n auto (in the auto case the number of parallel workers equals the total number of detected CPU cores).
Warning
Tests on GPU with CUDA will be run if and only if a GPU with CUDA drivers installed is present.
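Because the Python package tests exercise whichever catboost module the current interpreter resolves, it can help to check where that module would be loaded from before running them. The following is a minimal sketch using only the standard library; the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


origin = module_origin("catboost")
if origin is None:
    print("catboost is not installed for this interpreter")
else:
    print("tests would exercise catboost from:", origin)
```

If the printed location is not the package you built from source, install that build into this interpreter first.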
JVM applier
Open the catboost/jvm-packages/catboost4j-prediction directory from the local copy of the CatBoost repository. Run the standard mvn test command.
To run tests on GPU as well, add the -DtestOnGPU=1 command line flag.
CatBoost for Apache Spark
See building CatBoost for Apache Spark from source. Use the standard mvn test command.
YaMake-based build tests
Warning
The following documentation describes running tests using Ya Make, which applies only to versions prior to this commit.
CatBoost provides tests that check the compliance of the canonical data with the resulting data.
The required steps for running these tests depend on the implementation.
- Execute common tests:
  - Open the catboost/pytest directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../ya make -t -A [-Z]
    -Z is an optional key to replace the canonical files if the code breaks tests intentionally.
- Execute tests for the GPU implementation:
  - Open the catboost/pytest/cuda_tests directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    - path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory for Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.
    - -Z is an optional key to replace the canonical files if the code breaks tests intentionally.
- Use the VCS diff tool to analyze the differences.
- Execute common tests:
  - Open the catboost/python-package/ut/medium directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../../../ya make -t -A [-Z]
    -Z is an optional key to replace the canonical files if the code breaks tests intentionally.
- Execute tests for the GPU implementation:
  - Open the catboost/python-package/ut/medium/gpu directory from the local copy of the CatBoost repository.
  - Run the following command:
    ../../../../../ya make -DCUDA_ROOT=<path_to_CUDA_SDK> -t -A [-Z]
    - path_to_CUDA_SDK is the path to the directory where the CUDA SDK is installed. For example, the typical installation directory for Linux is /usr/local/cuda-X.Y, where X.Y is the installed CUDA SDK version.
    - -Z is an optional key to replace the canonical files if the code breaks tests intentionally.
- Use the VCS diff tool to analyze the differences.
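The Ya Make commands in this section differ only in how many ../ segments lead back to the ya script and in the optional flags. A small sketch of that pattern, assuming (as the relative paths imply) that ya sits at the repository root; the helper names are hypothetical:

```python
def ya_path(test_dir):
    """Relative path to the ya script from a test directory given relative to the repository root."""
    depth = len([p for p in test_dir.strip("/").split("/") if p])
    return "../" * depth + "ya"


def ya_test_command(test_dir, cuda_root=None, canonize=False):
    """Assemble the ya make test command; canonize adds -Z to replace the canonical files."""
    cmd = [ya_path(test_dir), "make"]
    if cuda_root is not None:
        cmd.append("-DCUDA_ROOT=" + cuda_root)
    cmd += ["-t", "-A"]
    if canonize:
        cmd.append("-Z")
    return cmd


print(" ".join(ya_test_command("catboost/pytest")))
# ../../ya make -t -A
print(" ".join(ya_test_command("catboost/python-package/ut/medium/gpu",
                               cuda_root="/usr/local/cuda-X.Y")))
# ../../../../../ya make -DCUDA_ROOT=/usr/local/cuda-X.Y -t -A
```

The /usr/local/cuda-X.Y value is the placeholder from the text, not a real path; substitute the installed CUDA SDK directory.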
- Install the additional R packages that are required to run tests:
  - caret
  - dplyr
  - jsonlite
  - testthat
- Open the R-package directory from the local copy of the CatBoost repository.
- Run the following command:
  R CMD check .
To run tests using the devtools package:
- Install devtools.
- Run the following command from the R session:
  devtools::test()
Microsoft Visual Studio solution
Warning
A ready-made Microsoft Visual Studio solution was provided until this commit.
For versions after this commit, it is recommended to generate a Microsoft Visual Studio 2019 solution using the corresponding CMake generator.
A solution for Visual Studio is available in the CatBoost repository:
catboost/msvs/arcadia.sln
Coding conventions
The following coding conventions must be followed in order to successfully contribute to the CatBoost project:
- C++ style guide
- pep8 for Python
Versioning conventions
Do not change the package version when submitting pull requests. Yandex uses an internal repository for this purpose.
License
By contributing to this project, you agree that your contributions will be licensed under the Apache 2.0 license.