Common parameters

loss_function
custom_metric
eval_metric
iterations
learning_rate
random_seed
l2_leaf_reg
bootstrap_type
bagging_temperature
subsample
sampling_frequency
sampling_unit
mvs_reg
random_strength
use_best_model
best_model_min_trees
depth
grow_policy
min_data_in_leaf
max_leaves
ignored_features
one_hot_max_size
has_time
rsm
nan_mode
input_borders
output_borders
fold_permutation_block
leaf_estimation_method
leaf_estimation_iterations
leaf_estimation_backtracking
fold_len_multiplier
approx_on_full_history
class_weights
class_names
auto_class_weights
scale_pos_weight
boosting_type
boost_from_average
langevin
diffusion_temperature
posterior_sampling
allow_const_label
score_function
monotone_constraints
feature_weights
first_feature_use_penalties
fixed_binary_splits
penalties_coefficient
per_object_feature_penalties
model_shrink_rate
model_shrink_mode

loss_function

Command-line: --loss-function

Alias: objective

Description

The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

RMSE
Logloss
MAE
CrossEntropy
Quantile
LogLinQuantile
Lq
MultiRMSE
MultiClass
MultiClassOneVsAll
MultiLogloss
MultiCrossEntropy
MAPE
Poisson
PairLogit
PairLogitPairwise
QueryRMSE
QuerySoftMax
GroupQuantile
Tweedie
YetiRank
YetiRankPairwise
StochasticFilter
StochasticRank

A custom Python object can also be set as the value of this parameter (see an example).

For example, use the following construction to calculate the value of Quantile with the coefficient $\alpha = 0.1$:

Quantile:alpha=0.1
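
The format above can also be assembled programmatically. The following is a minimal sketch; `format_metric` is a hypothetical helper written for illustration and is not part of CatBoost.

```python
def format_metric(name, **params):
    """Build a '<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]' string."""
    if not params:
        return name
    # Parameters are joined with ';' and attached to the metric name with ':'.
    joined = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{name}:{joined}"


print(format_metric("RMSE"))                 # RMSE
print(format_metric("Quantile", alpha=0.1))  # Quantile:alpha=0.1
```

The resulting string can be used wherever a metric description in this format is expected.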

Type

string
object

Default value

Python package

Depends on the class:

CatBoostClassifier: Logloss if the target_border parameter value differs from None. Otherwise, the default loss function depends on the number of unique target values and is either set to Logloss or MultiClass.
CatBoost and CatBoostRegressor: RMSE

R package, Command-line

RMSE

Supported processing units

CPU and GPU

custom_metric

Command-line: --custom-metric

Description

Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

Examples

Calculate the value of CrossEntropy:
```
CrossEntropy
```
Calculate the value of Quantile with the coefficient $\alpha = 0.1$:
```
Quantile:alpha=0.1
```
Calculate the values of Logloss and AUC:
```
['Logloss', 'AUC']
```

Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.

Use the visualization tools to see a live chart with the dynamics of the specified metrics.

Type

string
list of strings

Default value

Python package

None

R package

None

Command-line

None (do not output additional metric values)

Supported processing units

CPU and GPU

eval_metric

Command-line: --eval-metric

Description

The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

A user-defined function can also be set as the value (see an example).

Examples:

R2

Type

string
object

Default value

Optimized objective is used

Supported processing units

CPU and GPU

iterations

Command-line: -i, --iterations

Aliases: num_boost_round, n_estimators, num_trees

Description

The maximum number of trees that can be built when solving machine learning problems.

When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter.

Type

int

Default value

1000

Supported processing units

CPU and GPU

learning_rate

Command-line: -w, --learning-rate

Alias: eta

Description

The learning rate.

Used for reducing the gradient step.

Type

float

Default value

The default value is defined automatically for the Logloss, MultiClass, and RMSE loss functions depending on the number of iterations if none of the parameters leaf_estimation_iterations, leaf_estimation_method, or l2_leaf_reg is set. In this case, the selected learning rate is printed to stdout and saved in the model.

In other cases, the default value is 0.03.

Supported processing units

CPU and GPU

random_seed

Command-line: -r, --random-seed

Alias: random_state

Description

The random seed used for training.

Type

int

Default value

Python package

None (0)

R package, Command-line

Supported processing units

CPU and GPU

l2_leaf_reg

Command-line: --l2-leaf-reg, --l2-leaf-regularizer

Alias: reg_lambda

Description

Coefficient at the L2 regularization term of the cost function.

Any positive value is allowed.

Type

float

Default value

3.0

Supported processing units

CPU and GPU

bootstrap_type

Command-line: --bootstrap-type

Description

Bootstrap type. Defines the method for sampling the weights of objects.

Supported methods:

Bayesian
Bernoulli
MVS
Poisson (supported for GPU only)
No

Type

string

Default value

The default value depends on objective, task_type, bagging_temperature and sampling_unit:

When the objective parameter is QueryCrossEntropy, YetiRankPairwise, or PairLogitPairwise and the bagging_temperature parameter is not set: Bernoulli with the subsample parameter set to 0.5.
When the objective parameter is neither MultiClass nor MultiClassOneVsAll, task_type = CPU, and sampling_unit = Object: MVS with the subsample parameter set to 0.8.
Otherwise: Bayesian.
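
The rules above can be read as a short decision procedure. The sketch below is a literal transcription of the three rules in the listed order; `default_bootstrap` is a hypothetical helper, and the library's actual resolution logic may consider additional conditions.

```python
PAIRWISE_OBJECTIVES = {"QueryCrossEntropy", "YetiRankPairwise", "PairLogitPairwise"}
MULTICLASS_OBJECTIVES = {"MultiClass", "MultiClassOneVsAll"}


def default_bootstrap(objective, task_type="CPU", sampling_unit="Object",
                      bagging_temperature=None):
    """Return (bootstrap_type, subsample) following the documented rules."""
    # Rule 1: pairwise objectives without an explicit bagging_temperature.
    if objective in PAIRWISE_OBJECTIVES and bagging_temperature is None:
        return "Bernoulli", 0.5
    # Rule 2: non-multiclass objectives on CPU with object-level sampling.
    if (objective not in MULTICLASS_OBJECTIVES
            and task_type == "CPU" and sampling_unit == "Object"):
        return "MVS", 0.8
    # Rule 3: everything else; subsample does not apply to Bayesian.
    return "Bayesian", None


print(default_bootstrap("RMSE"))              # ('MVS', 0.8)
print(default_bootstrap("MultiClass"))        # ('Bayesian', None)
print(default_bootstrap("YetiRankPairwise"))  # ('Bernoulli', 0.5)
```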

Supported processing units

CPU and GPU

bagging_temperature

Command-line: --bagging-temperature

Description

Defines the settings of the Bayesian bootstrap. It is used by default in classification and regression modes.

Use the Bayesian bootstrap to assign random weights to objects.

The weights are sampled from exponential distribution if the value of this parameter is set to 1. All weights are equal to 1 if the value of this parameter is set to 0.

Possible values are in the range $[0; \infty)$. The higher the value, the more aggressive the bagging is.

This parameter can be used if the selected bootstrap type is Bayesian.
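
One weighting rule that is consistent with both special cases described above is $w = (-\ln u)^{t}$, where $u$ is uniform on $(0, 1]$ and $t$ is the bagging temperature. The sketch below assumes this rule for illustration; the function name and the use of Python's `random` module are not taken from CatBoost.

```python
import math
import random


def bayesian_bootstrap_weights(n_objects, bagging_temperature, seed=0):
    """Sample one weight per object as (-ln u) ** bagging_temperature."""
    if bagging_temperature < 0:
        raise ValueError("bagging_temperature must be in [0, inf)")
    rng = random.Random(seed)
    weights = []
    for _ in range(n_objects):
        u = 1.0 - rng.random()               # uniform on (0, 1], so log(u) is finite
        exponential_draw = -math.log(u)      # Exp(1) distributed
        weights.append(exponential_draw ** bagging_temperature)
    return weights


print(bayesian_bootstrap_weights(3, 0))  # [1.0, 1.0, 1.0]
```

With a temperature of 1 the weights are plain exponential draws, and with 0 every weight equals 1, matching the two cases stated in the description.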

Type

float

Default value

Supported processing units

CPU and GPU

subsample

Command-line: --subsample

Description

Sample rate for bagging.

This parameter can be used if one of the following bootstrap types is selected:

Poisson
Bernoulli
MVS

Type

float

Default value

The default value depends on the dataset size and the bootstrap type:

Datasets with fewer than 100 objects — 1
Datasets with 100 objects or more:
- Poisson, Bernoulli — 0.66
- MVS — 0.8

Supported processing units

CPU and GPU

sampling_frequency

Command-line: --sampling-frequency

Description

Frequency to sample weights and objects when building trees.

Supported values:

PerTree — Before constructing each new tree
PerTreeLevel — Before choosing each new split of a tree

Type

string

Default value

PerTreeLevel

Supported processing units

CPU

sampling_unit

Command-line: --sampling-unit

Description

The sampling scheme.

Possible values:

Object — The weight $w_{i}$ of the i-th object $o_{i}$ is used for sampling the corresponding object.
Group — The weight $w_{j}$ of the group $g_{j}$ is used for sampling each object $o_{i_{j}}$ from the group $g_{j}$.

Type

string

Default value

Object

Supported processing units

CPU and GPU

mvs_reg

Command-line: --mvs-reg

Description

Affects the weight of the denominator and can be used for balancing between importance sampling and Bernoulli sampling: setting it to 0 implies importance sampling, and setting it to $\infty$ implies Bernoulli sampling.

Note

This parameter is supported only for the MVS sampling method (the bootstrap_type parameter must be set to MVS).

Type

float

Default value

The value is set based on the gradient distribution on the current iteration

Supported processing units

CPU

random_strength

Command-line: --random-strength

Description

The amount of randomness to use for scoring splits when the tree structure is selected. Use this parameter to avoid overfitting the model.

The value of this parameter is used when selecting splits. On every iteration each possible split gets a score (for example, the score indicates how much adding this split will improve the loss function for the training dataset). The split with the highest score is selected.

By themselves, the scores contain no randomness. To introduce it, a normally distributed random variable is added to the score of the feature. It has a zero mean and a variance that decreases during the training. The value of this parameter is the multiplier of the variance.
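
A single split selection with randomized scores might look like the sketch below. It follows the description literally (the parameter multiplies the variance of the noise); `select_split` and `base_variance` are hypothetical names, and the schedule by which the variance decreases during training is not modeled.

```python
import math
import random


def select_split(scores, random_strength, base_variance=1.0, seed=0):
    """Return the index of the best split after adding zero-mean Gaussian noise.

    base_variance stands in for the variance that decreases during training.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(random_strength * base_variance)
    noisy = [score + rng.gauss(0.0, sigma) for score in scores]
    # The split with the highest (noisy) score is selected.
    return max(range(len(noisy)), key=noisy.__getitem__)


print(select_split([0.1, 0.9, 0.5], random_strength=0))  # 1
```

With `random_strength=0` the choice is deterministic; larger values make it more likely that a split other than the top-scoring one is selected.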

Note

This parameter is not supported for the following loss functions:

QueryCrossEntropy
YetiRankPairwise
PairLogitPairwise

Type

float

Default value

Supported processing units

CPU

use_best_model

Command-line: --use-best-model

Description

If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:

Build the number of trees defined by the training parameters.
Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric).

No trees are saved after this iteration.

This option requires a validation dataset to be provided.
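
The selection step can be sketched as follows, assuming the per-iteration values of the evaluation metric on the validation dataset are already available as a list. `trees_to_keep` is a hypothetical helper, not a CatBoost function.

```python
def trees_to_keep(validation_metric, lower_is_better=True):
    """Return the number of trees to keep given per-iteration metric values."""
    if not validation_metric:
        raise ValueError("a validation dataset is required")
    pick = min if lower_is_better else max
    best_iteration = pick(range(len(validation_metric)),
                          key=validation_metric.__getitem__)
    return best_iteration + 1   # iterations are zero-based


rmse_per_iteration = [0.90, 0.62, 0.48, 0.45, 0.47, 0.51]
print(trees_to_keep(rmse_per_iteration))  # 4
```

In this example the metric is best at the fourth iteration, so the two trees built afterwards are dropped.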

Type

bool

Default value

True if a validation set is input (the eval_set parameter is defined) and at least one of the label values of objects in this set differs from the others. False otherwise.

Supported processing units

CPU and GPU

best_model_min_trees

Command-line: --best-model-min-trees

Description

The minimal number of trees that the best model should have. If set, the output model contains at least the given number of trees even if the optimal value of the evaluation metric on the validation dataset is achieved with a smaller number of trees.

Should be used with the --use-best-model parameter.

Type

int

Default value

Python package, R package

None (The minimal number of trees for the best model is not set)

Command-line

The minimal number of trees for the best model is not set

Supported processing units

CPU and GPU

depth

Command-line: -n, --depth

Alias: max_depth

Description

Depth of the trees.

The range of supported values depends on the processing unit type and the type of the selected loss function:

CPU — Any integer up to 16.
GPU — Any integer up to 8 for pairwise modes (YetiRank, PairLogitPairwise, and QueryCrossEntropy), and up to 16 for all other loss functions.

Type

int

Default value

6 (16 if the growing policy is set to Lossguide)

Supported processing units

CPU and GPU

grow_policy

Command-line: --grow-policy

Description

The tree growing policy. Defines how to perform greedy tree construction.

Possible values:

SymmetricTree — A tree is built level by level until the specified depth is reached. On each iteration, all leaves from the last tree level are split with the same condition. The resulting tree structure is always symmetric.
Depthwise — A tree is built level by level until the specified depth is reached. On each iteration, all non-terminal leaves from the last tree level are split. Each leaf is split by the condition with the best loss improvement.

Note

Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
Lossguide — A tree is built leaf by leaf until the specified maximum number of leaves is reached. On each iteration, the non-terminal leaf with the best loss improvement is split.

Note

Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.

Type

string

Default value

SymmetricTree

Supported processing units

CPU and GPU

min_data_in_leaf

Command-line: --min-data-in-leaf

Alias: min_child_samples

Description

The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with a sample count less than the specified value.
Can be used only with the Lossguide and Depthwise growing policies.

Type

int

Default value

Supported processing units

CPU and GPU

max_leaves

Command-line: --max-leaves

Alias: num_leaves

Description

The maximum number of leaves in the resulting tree. Can be used only with the Lossguide growing policy.

Note

It is not recommended to use values greater than 64, since it can significantly slow down the training process.

Type

int

Default value

Supported processing units

CPU and GPU

ignored_features

Command-line: -I, --ignore-features

Description

Feature indices to exclude from the training.

Python package

It is assumed that all passed values are feature names if at least one of the passed values cannot be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.

Specifics:

Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to 42, the corresponding non-existing feature is successfully ignored.
The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount – 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: cat feature<\t>label value<\t>num feature. So for the row rock<\t>0<\t>42, the identifier for the rock feature is 0, and for the 42 feature it's 1.

For example, use the following construction if features indexed 1, 2, 7, 42, 43, 44, 45 should be ignored: [1,2,7,42,43,44,45]
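
The index numbering described above (non-feature columns are skipped) can be illustrated with a small sketch. The helper `feature_indices` is hypothetical, and the column type labels `Categ`, `Label`, and `Num` are used here only to tag the columns of the example row.

```python
def feature_indices(column_types, feature_types=("Categ", "Num")):
    """Map column position -> feature index, skipping non-feature columns."""
    mapping = {}
    for column, column_type in enumerate(column_types):
        if column_type in feature_types:
            mapping[column] = len(mapping)   # next free zero-based feature index
    return mapping


# Row layout from the example: cat feature <\t> label value <\t> num feature
print(feature_indices(["Categ", "Label", "Num"]))  # {0: 0, 2: 1}
```

The label column at position 1 receives no index, so the numerical feature in the third column gets the feature index 1.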

R package

Specifics:

Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to 42, the corresponding non-existing feature is successfully ignored.
The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount – 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: cat feature<\t>label value<\t>num feature. So for the row rock<\t>0<\t>42, the identifier for the rock feature is 0, and for the 42 feature it's 1.

For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, the value of this parameter should be set to c(1,2,7,42,43,44,45).

Command-line

Specifics:

Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to 42, the corresponding non-existing feature is successfully ignored.
The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount – 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: cat feature<\t>label value<\t>num feature. So for the row rock<\t>0<\t>42, the identifier for the rock feature is 0, and for the 42 feature it's 1.

For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, use the following construction: 1:2:7:42-45.
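
The colon-and-range notation can be expanded into explicit indices as in the sketch below. `parse_ignored_features` is a hypothetical helper that handles only numeric indices and inclusive ranges; feature names are not covered.

```python
def parse_ignored_features(spec):
    """Expand a string such as '1:2:7:42-45' into a sorted list of indices."""
    indices = set()
    for part in spec.split(":"):
        if "-" in part:
            first, last = (int(bound) for bound in part.split("-"))
            indices.update(range(first, last + 1))   # ranges are inclusive
        else:
            indices.add(int(part))
    return sorted(indices)


print(parse_ignored_features("1:2:7:42-45"))  # [1, 2, 7, 42, 43, 44, 45]
```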

Default value

Python package, R package

None

Command-line

Omitted

Supported processing units

CPU and GPU

one_hot_max_size

Command-line: --one-hot-max-size

Description

Use one-hot encoding for all categorical features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features.

See details.
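
The threshold rule can be illustrated as follows. This is a sketch over plain Python lists; `one_hot_features` is a hypothetical helper, and the data is made up for the example.

```python
def one_hot_features(columns, one_hot_max_size):
    """Return names of categorical columns with at most one_hot_max_size distinct values."""
    return [name for name, values in columns.items()
            if len(set(values)) <= one_hot_max_size]


data = {
    "color": ["red", "blue", "red", "blue"],     # 2 distinct values
    "city": ["Oslo", "Rome", "Lima", "Oslo"],    # 3 distinct values
}
print(one_hot_features(data, one_hot_max_size=2))  # ['color']
```

Here only `color` would be one-hot encoded; `city` exceeds the threshold and would be handled with Ctrs instead.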

Type

int

Default value

The default value depends on various conditions:

N/A if training is performed on CPU in Pairwise scoring mode.
The following loss functions use Pairwise scoring:
- YetiRankPairwise
- PairLogitPairwise
- QueryCrossEntropy
Pairwise scoring is slightly different from regular training on pairs, since pairs are generated only internally during the training for the corresponding metrics. One-hot encoding is not available for these loss functions.
255 if training is performed on GPU and the selected Ctr types require target data that is not available during the training
10 if training is performed in Ranking mode
2 if none of the conditions above is met

Supported processing units

CPU and GPU

has_time

Command-line: --has-time

Description

Use the order of objects in the input data (do not perform random permutations during the Transforming categorical features to numerical features and Choosing the tree structure stages).

The Timestamp column type is used to determine the order of objects if specified in the input data.

Type

bool

Default value

False (not used; generates random permutations)

Supported processing units

CPU and GPU

rsm

Command-line: --rsm

Alias: colsample_bylevel

Description

Random subspace method. The percentage of features to use at each split selection, when features are selected over again at random.

The value must be in the range (0;1].

Type

float (0;1]

Default value

None (set to 1)

Supported processing units

CPU; GPU for pairwise ranking

nan_mode

Command-line: --nan-mode

Description

The method for processing missing values in the input dataset.

Possible values:

"Forbidden" — Missing values are not supported, their presence is interpreted as an error.
"Min" — Missing values are processed as the minimum value (less than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
"Max" — Missing values are processed as the maximum value (greater than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.

Using the Min or Max value of this parameter guarantees that a split between missing values and other values is considered when selecting a new split in the tree.
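
The three modes can be illustrated with a small sketch that maps missing values to a stand-in ordering. Using negative and positive infinity as the "less than all" and "greater than all" placeholders is an assumption made for the example, not a description of how CatBoost stores missing values; `apply_nan_mode` is a hypothetical helper.

```python
import math


def apply_nan_mode(values, nan_mode="Min"):
    """Replace NaN according to nan_mode, or raise in Forbidden mode."""
    if nan_mode not in ("Forbidden", "Min", "Max"):
        raise ValueError(f"unknown nan_mode: {nan_mode}")
    filler = -math.inf if nan_mode == "Min" else math.inf
    result = []
    for value in values:
        if math.isnan(value):
            if nan_mode == "Forbidden":
                raise ValueError("missing values are not supported in Forbidden mode")
            value = filler   # sorts before (Min) or after (Max) every real value
        result.append(value)
    return result


print(apply_nan_mode([1.0, float("nan"), 3.0], "Min"))  # [1.0, -inf, 3.0]
```

Because the placeholder lies strictly outside the range of real values, a threshold between it and the nearest real value separates missing values from all others.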

Note

The method for processing missing values can be set individually for each feature in the Custom quantization borders and missing value modes input file. Such values override the ones specified in this parameter.

Type

string

Default value

Min

Supported processing units

CPU and GPU

input_borders

Command-line: --input-borders-file

Description

Load Custom quantization borders and missing value modes from a file (do not generate them).

Borders are automatically generated before training if this parameter is not set.

Type

string

Default value

Python package

None

Command-line

The file is not loaded, the values are generated

Supported processing units

CPU and GPU

output_borders

Command-line: --output-borders-file

Description

Save quantization borders for the current dataset to a file.

Refer to the file format description.

Type

string

Default value

Python package

None

Command-line

The file is not saved

Supported processing units

CPU and GPU

fold_permutation_block

Command-line: --fold-permutation-block

Description

Objects in the dataset are grouped in blocks before the random permutations. This parameter defines the size of the blocks. The smaller the value, the slower the training. Large values may result in quality degradation.
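
A block-wise permutation can be sketched as below. The description does not state whether objects are also shuffled inside a block, so this sketch keeps the order within each block; `block_permutation` is a hypothetical helper.

```python
import random


def block_permutation(n_objects, block_size, seed=0):
    """Shuffle object indices in contiguous blocks of block_size."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    blocks = [list(range(start, min(start + block_size, n_objects)))
              for start in range(0, n_objects, block_size)]
    random.Random(seed).shuffle(blocks)          # permute whole blocks only
    return [index for block in blocks for index in block]


permutation = block_permutation(10, block_size=4)
print(sorted(permutation) == list(range(10)))  # True
```

With a block size of 1 this reduces to an ordinary permutation of individual objects.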

Type

int

Default value

Python package

R package, Command-line

The default value differs depending on the dataset size and ranges from 1 to 256 inclusive

Supported processing units

CPU and GPU

leaf_estimation_method

Command-line: --leaf-estimation-method

Description

The method used to calculate the values in leaves.

Possible values:

Newton
Gradient
Exact

Type

string

Default value

Depends on the mode and the selected loss function:

Regression with Quantile or MAE loss functions — One Exact iteration.
Regression with any loss function but Quantile or MAE – One Gradient iteration.
Classification mode – Ten Newton iterations.
Multiclassification mode – One Newton iteration.

Supported processing units

CPU and GPU

leaf_estimation_iterations

Command-line: --leaf-estimation-iterations

Description

CatBoost might calculate leaf values using several gradient or Newton steps instead of a single one.

This parameter regulates how many steps are done in every tree when calculating leaf values.

Type

int

Default value

Python package

None (Depends on the training objective)

R package, Command-line

Depends on the training objective

Supported processing units

CPU and GPU

leaf_estimation_backtracking

Command-line: --leaf-estimation-backtracking

Description

When the value of the leaf_estimation_iterations parameter is greater than 1, CatBoost makes several gradient or Newton steps when calculating the resulting leaf values of a tree.

The behaviour differs depending on the value of this parameter:

No — Every next step is a regular gradient or Newton step: the gradient step is calculated and added to the leaf.
Any other value — Backtracking is used.
In this case, before adding a step, a condition is checked. If the condition is not met, then the step size is reduced (divided by 2), otherwise the step is added to the leaf.

When leaf_estimation_iterations for the Command-line version is set to n, the leaf estimation iterations are calculated as follows: each iteration is either an addition of the next step to the leaf value, or a scaling of the leaf value. Scaling counts as a separate iteration. Thus, it is possible that instead of having n gradient steps, the algorithm makes a single gradient step that is reduced n times, which means that it is divided by $2^{n}$.

Possible values:

No — Do not use backtracking. Supported on CPU and GPU.
AnyImprovement — Reduce the descent step up to the point when the loss function value is smaller than it was on the previous step. The trial reduction factors are 2, 4, 8, and so on. Supported on CPU and GPU.
Armijo — Reduce the descent step until the Armijo condition is met. Supported only on GPU.
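
The AnyImprovement rule can be sketched for a single scalar leaf value as follows. The function name, the `max_reductions` cap, and the quadratic toy loss are assumptions for the example; the sketch only illustrates halving a step until the loss decreases.

```python
def any_improvement_step(loss, leaf_value, step, max_reductions=10):
    """Add `step` to `leaf_value`, halving it until the loss decreases."""
    current_loss = loss(leaf_value)
    for _ in range(max_reductions + 1):
        candidate = leaf_value + step
        if loss(candidate) < current_loss:
            return candidate           # condition met: the step is added
        step /= 2.0                    # trial reduction factors: 2, 4, 8, ...
    return leaf_value                  # no improving step found


def quadratic(x):
    return (x - 1.0) ** 2


print(any_improvement_step(quadratic, leaf_value=0.0, step=4.0))  # 1.0
```

In this run the full step and the halved step do not reduce the loss, while the step divided by 4 does, so that one is applied.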

Type

string

Default value

AnyImprovement

Supported processing units

Depends on the selected value

fold_len_multiplier

Command-line: --fold-len-multiplier

Description

Coefficient for changing the length of folds.

The value must be greater than 1. The best validation result is achieved with minimum values.

With values close to 1 (for example, $1+\epsilon$), each iteration takes an amount of memory and time that is quadratic in the number of objects in the iteration. Thus, low values are possible only when there is a small number of objects.

Type

float

Default value

Supported processing units

CPU and GPU

approx_on_full_history

Command-line: --approx-on-full-history

Description

The principles for calculating the approximated values.

Possible values:

False — Use only a fraction of the fold for calculating the approximated values. The size of the fraction is calculated as follows: $\frac{1}{X}$, where X is the specified coefficient for changing the length of folds. This mode is faster and in rare cases slightly less accurate.
True — Use all the preceding rows in the fold for calculating the approximated values. This mode is slower and in rare cases slightly more accurate.

Type

bool

Default value

Python package, Command-line

False

R package

True

Supported processing units

CPU

class_weights

Command-line: --class-weights

Description

Class weights. The values are used as multipliers for the object weights. This parameter can be used for solving binary classification and multiclassification problems.

Python package

Note

For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.

For example, class_weights=[0.1, 4] multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.

If class labels are not standard consecutive integers [0, 1 ... class_count-1], use the dict or collections.OrderedDict type with label to weight mapping.

For example, class_weights={'a': 1.0, 'b': 0.5, 'c': 2.0} multiplies the weights of objects with class label a by 1.0, the weights of objects with class label b by 0.5 and the weights of objects with class label c by 2.0.

The dictionary form can also be used with standard consecutive integers class labels for additional readability. For example: class_weights={0: 1.0, 1: 0.5, 2: 2.0}.

Note

Class labels are extracted from dictionary keys for the following types of class_weights:

dict
collections.OrderedDict (when the order of classes in the model is important)

The class_names parameter can be skipped when using these types.

Alert

Do not use this parameter with auto_class_weights and scale_pos_weight.
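
The ratio suggested in the note for imbalanced binary datasets can be computed as in the sketch below. It interprets sum_negative and sum_positive as the total object weights of class 0 and class 1 (plain counts when all weights are 1), which is an assumption; `binary_class_weights` is a hypothetical helper.

```python
def binary_class_weights(labels, weights=None):
    """Return {0: 1.0, 1: sum_negative / sum_positive} for 0/1 labels."""
    if weights is None:
        weights = [1.0] * len(labels)
    sum_negative = sum(w for y, w in zip(labels, weights) if y == 0)
    sum_positive = sum(w for y, w in zip(labels, weights) if y == 1)
    if sum_positive == 0:
        raise ValueError("no objects of class 1")
    return {0: 1.0, 1: sum_negative / sum_positive}


labels = [0, 0, 0, 0, 0, 0, 1, 1]
print(binary_class_weights(labels))  # {0: 1.0, 1: 3.0}
```

The resulting dictionary has the label-to-weight form described above for the class_weights parameter.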

R package

For example, class_weights <- c(0.1, 4) multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.

Alert

Do not use this parameter with auto_class_weights.

Command-line

Note

The quantity of class weights must match the quantity of class names specified in the --class-names parameter and the number of classes specified in the --classes-count parameter.

For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.

Format:

<value for class 1>,..,<value for class N>

For example:

0.85,1.2,1

Alert

Do not use this parameter with auto_class_weights.

Type

list
dict
collections.OrderedDict

Default value

None (the weight for all classes is set to 1)

Supported processing units

CPU and GPU

class_names

Description

Class names. Allows redefining the default values when using the MultiClass and Logloss metrics.

If the upper limit for the numeric class label is specified, the number of class names should match this value.

Warning

The number of class names must match the number of class weights specified in the --class-weights parameter and the number of classes specified in the --classes-count parameter.

Format:

<name for class 1>,..,<name for class N>

For example:

smartphone,touchphone,tablet

Type

list of strings

Default value

None

Supported processing units

CPU and GPU

auto_class_weights

Command-line: --auto-class-weights

Description

Automatically calculate class weights based either on the total weight or the total number of objects in each class. The values are used as multipliers for the object weights.

Supported values:

None — All class weights are set to 1
Balanced:

$CW_k=\displaystyle\frac{\max_{c=1}^{K}(\sum_{t_{i}=c}{w_i})}{\sum_{t_{i}=k}{w_{i}}}$
SqrtBalanced:

$CW_k=\sqrt{\displaystyle\frac{\max_{c=1}^{K}(\sum_{t_i=c}{w_i})}{\sum_{t_i=k}{w_i}}}$

Alert

Do not use this parameter with class_weights and scale_pos_weight.
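
The Balanced and SqrtBalanced formulas can be evaluated directly, as in the following sketch. `auto_class_weights` here is a hypothetical stand-alone function that mirrors the formulas; when no object weights are given, every $w_i$ is taken to be 1.

```python
import math
from collections import defaultdict


def auto_class_weights(labels, weights=None, mode="Balanced"):
    """Compute CW_k = max_c(total weight of class c) / (total weight of class k)."""
    if weights is None:
        weights = [1.0] * len(labels)
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight            # sum of w_i over objects with t_i = label
    largest = max(totals.values())
    if mode == "Balanced":
        return {label: largest / total for label, total in totals.items()}
    if mode == "SqrtBalanced":
        return {label: math.sqrt(largest / total) for label, total in totals.items()}
    raise ValueError(f"unsupported mode: {mode}")


labels = [0, 0, 0, 0, 1]
print(auto_class_weights(labels))                       # {0: 1.0, 1: 4.0}
print(auto_class_weights(labels, mode="SqrtBalanced"))  # {0: 1.0, 1: 2.0}
```

The largest class always receives the weight 1, and smaller classes are scaled up in proportion to how much lighter they are.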

Type

string

Default value

None — All class weights are set to 1

Supported processing units

CPU and GPU

scale_pos_weight

Description

The weight for class 1 in binary classification. The value is used as a multiplier for the weights of objects from class 1.

Note

For imbalanced datasets, the weight multiplier can be set to $\left(\frac{sum\_negative}{sum\_positive}\right)$

Alert

Do not use this parameter with auto_class_weights and class_weights.

Type

float

Default value

1.0

Supported processing units

CPU and GPU

boosting_type

Command-line: --boosting-type

Description

Boosting scheme.

Possible values:

Ordered — Usually provides better quality on small datasets, but it may be slower than the Plain scheme.
Plain — The classic gradient boosting scheme.

Type

string

Default value

Depends on the processing unit type, the number of objects in the training dataset and the selected learning mode:

CPU

Plain
GPU
- Any number of objects, MultiClass or MultiClassOneVsAll mode: Plain
- More than 50 thousand objects, any mode: Plain
- Less than or equal to 50 thousand objects, any mode but MultiClass or MultiClassOneVsAll: Ordered
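
The default selection listed above can be written as a short function. This is a transcription of the listed cases only; `default_boosting_type` is a hypothetical helper.

```python
def default_boosting_type(task_type, n_objects, loss_function):
    """Return the default boosting scheme according to the listed cases."""
    if task_type == "CPU":
        return "Plain"
    # GPU cases
    if loss_function in ("MultiClass", "MultiClassOneVsAll"):
        return "Plain"
    if n_objects > 50_000:
        return "Plain"
    return "Ordered"


print(default_boosting_type("GPU", 10_000, "Logloss"))     # Ordered
print(default_boosting_type("GPU", 10_000, "MultiClass"))  # Plain
```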

Supported processing units

CPU and GPU

Only the Plain mode is supported for the MultiClass loss on GPU

boost_from_average

Command-line: --boost-from-average

Description

Initialize approximate values with the best constant value for the specified loss function. Sets the value of bias to the initial best constant value.

Available for the following loss functions:

RMSE
Logloss
CrossEntropy
Quantile
MAE
MAPE
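
For some of these loss functions the best constant has a well-known closed form: the mean minimizes squared error, the median minimizes absolute error, and for Logloss the optimum in raw-score space is the log-odds of the share of class 1. The sketch below computes these textbook minimizers; that CatBoost uses exactly these expressions internally is an assumption, and `best_constant` is a hypothetical helper.

```python
import math
import statistics


def best_constant(targets, loss_function):
    """Return the constant prediction that minimizes the given loss."""
    if loss_function == "RMSE":
        return statistics.fmean(targets)      # mean minimizes squared error
    if loss_function == "MAE":
        return statistics.median(targets)     # median minimizes absolute error
    if loss_function == "Logloss":
        p = statistics.fmean(targets)         # share of class 1 for 0/1 targets
        return math.log(p / (1.0 - p))        # log-odds; requires 0 < p < 1
    raise ValueError(f"not covered by this sketch: {loss_function}")


print(best_constant([1, 2, 3, 10], "RMSE"))  # 4.0
print(best_constant([1, 2, 3, 10], "MAE"))   # 2.5
```

Starting from such a constant means the first trees do not have to spend capacity on learning the overall offset of the target.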

Type

bool

Default value

Depends on the selected loss function:

True for RMSE, Quantile, MAE, MAPE
False for all other loss functions

Supported processing units

CPU and GPU

langevin

Command-line: --langevin

Description

Enables the Stochastic Gradient Langevin Boosting mode.

Refer to the SGLB: Stochastic Gradient Langevin Boosting paper for details.

Type

bool

Default value

False

Supported processing units

CPU

diffusion_temperature

Command-line: --diffusion-temperature

Description

The diffusion temperature of the Stochastic Gradient Langevin Boosting mode.

Only non-negative values are supported.

Type

float

Default value

10000

Supported processing units

CPU

posterior_sampling

Command-line: --posterior-sampling

Description

If this parameter is set, the model parameters are checked and the following options are specified in order to obtain uncertainty predictions with good theoretical properties:

Langevin: true,
DiffusionTemperature: objects in learn pool count,
ModelShrinkRate: 1 / (2. * objects in learn pool count).
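
The derived options depend only on the number of objects in the learn pool. A minimal sketch, using the Python parameter names from this page as dictionary keys (`posterior_sampling_options` itself is a hypothetical helper):

```python
def posterior_sampling_options(learn_pool_size):
    """Return the options implied by posterior_sampling for a given pool size."""
    if learn_pool_size <= 0:
        raise ValueError("learn_pool_size must be positive")
    return {
        "langevin": True,
        "diffusion_temperature": learn_pool_size,
        "model_shrink_rate": 1.0 / (2.0 * learn_pool_size),
    }


print(posterior_sampling_options(1000))
# {'langevin': True, 'diffusion_temperature': 1000, 'model_shrink_rate': 0.0005}
```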

Type

bool

Default value

False

Supported processing units

CPU only

allow_const_label

Command-line: --allow-const-label

Description

Use it to train models with datasets that have equal label values for all objects.

Type

bool

Default value

False

Supported processing units

CPU and GPU

score_function

Command-line: --score-function

Description

The score type used to select the next split during the tree construction.

Possible values:

Cosine (do not use this score type with the Lossguide tree growing policy)
L2
NewtonCosine (do not use this score type with the Lossguide tree growing policy)
NewtonL2

Type

string

Default value

Cosine

Supported processing units

The supported score functions vary depending on the processing unit type:

GPU — All score types
CPU — Cosine, L2

monotone_constraints

Command-line: --monotone-constraints

Description

Impose monotonic constraints on numerical features.

Possible values:

1 — Increasing constraint on the feature. The algorithm forces the model to be a non-decreasing function of this feature.
-1 — Decreasing constraint on the feature. The algorithm forces the model to be a non-increasing function of this feature.
0 — Constraints are disabled.

Supported formats for setting the value of this parameter (all feature indices are zero-based):

Set constraints individually for each feature as a string (the number of features is n).

Format
```
"(<constraint_0>, <constraint_1>, .., <constraint_n-1>)"
```
Zero constraints for features at the end of the list may be dropped.

In monotone_constraints = "(1,0,-1)", an increasing constraint is set on the first feature and a decreasing one on the third. Constraints are disabled for all other features.
Set constraints individually for each explicitly specified feature as a string (the number of features is n).
```
"<feature index or name>:<constraint>, .., <feature index or name>:<constraint>"
```
These examples
```
monotone-constraints = "2:1,4:-1"
```
```
monotone-constraints = "Feature2:1,Feature4:-1"
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Set constraints individually for each required feature as an array or a dictionary (the number of features is n).

Format
```
[<constraint_0>, <constraint_1>, .., <constraint_n-1>]
```
```
{"<feature index or name>":<constraint>, .., "<feature index or name>":<constraint>}
```
Array examples
```
monotone_constraints = [1, 0, -1]
```
These dictionary examples
```
monotone_constraints = {"Feature2":1,"Feature4":-1}
```
```
monotone_constraints = {"2":1, "4":-1}
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
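
To make the equivalence of these formats concrete, the sketch below expands each of them into one constraint per feature. It is an illustration only: `expand_constraints` and the feature names are hypothetical, and this is not how CatBoost itself parses the parameter.

```python
def expand_constraints(value, feature_names):
    """Expand any of the formats above into one constraint per feature."""
    result = [0] * len(feature_names)  # 0 means "constraints are disabled"

    def position(key):
        # A key is either a zero-based feature index or a feature name.
        key = str(key).strip()
        return int(key) if key.isdigit() else feature_names.index(key)

    if isinstance(value, str):
        text = value.strip()
        if text.startswith("("):
            # "(1,0,-1)": positional form; trailing zeros may be dropped.
            value = [int(item) for item in text.strip("()").split(",")]
        else:
            # "2:1,4:-1": explicit "<index or name>:<constraint>" pairs.
            value = dict(pair.split(":") for pair in text.split(","))

    if isinstance(value, dict):
        for key, constraint in value.items():
            result[position(key)] = int(constraint)
    else:
        for index, constraint in enumerate(value):
            result[index] = int(constraint)
    return result


names = ["Feature0", "Feature1", "Feature2", "Feature3", "Feature4"]
print(expand_constraints("(1,0,-1)", names))
print(expand_constraints("2:1,4:-1", names))
print(expand_constraints({"Feature2": 1, "Feature4": -1}, names))
```

The last two calls produce the same list, which is what "identical" means for the index-based and name-based forms.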

Type

list of strings
string
dict
list

Default value

Python package, R package

None

Command-line

Omitted

Supported processing units

CPU

feature_weights

Command-line: --feature-weights

Description

Per-feature multiplication weights used when choosing the best split. The score of each candidate is multiplied by the weights of features from the current split.

Non-negative float values are supported for each weight.

Supported formats for setting the value of this parameter:

Set the multiplication weight for each feature as a string (the number of features is n).

Format
```
"(<feature-weight_0>,<feature-weight_1>,..,<feature-weight_n-1>)"
```
Note

Spaces between values are not allowed.

Values should be passed as a parenthesized string of comma-separated values. Multiplication weights equal to 1 at the end of the list may be dropped.

In this example
```
feature_weights = "(0.1,1,3)"
```
the multiplication weight is set to 0.1, 1 and 3 for the first, second and third features respectively. The multiplication weight for all other features is set to 1.
Set the multiplication weight individually for each explicitly specified feature as a string (the number of features is n).

Format
```
"<feature index or name>:<weight>, .., <feature index or name>:<weight>"
```
Note

Spaces between values are not allowed.
These examples
```
feature_weights = "2:1.1,4:0.1"
```
```
feature_weights = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Set the multiplication weight individually for each required feature as an array or a dictionary (the number of features is n).

Format
```
[<feature-weight_0>, <feature-weight_1>, .., <feature-weight_n-1>]
```
```
{"<feature index or name>":<weight>, .., "<feature index or name>":<weight>}
```
Array examples
```
feature_weights = [0.1, 1, 3]
```
These dictionary examples
```
feature_weights = {"Feature2":1.1,"Feature4":0.3}
```
```
feature_weights = {"2":1.1, "4":0.3}
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
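
The effect of the weights on split selection can be illustrated with a toy example. This sketch assumes that a higher score is better and that each candidate split uses a single feature; `pick_split` and the scores are hypothetical and only mirror the multiplication described above.

```python
def pick_split(candidates, feature_weights):
    """Pick the best (feature_index, raw_score) candidate after weighting."""
    def weighted(candidate):
        feature_index, raw_score = candidate
        # Features without an explicit weight keep the default weight of 1.
        return raw_score * feature_weights.get(feature_index, 1.0)

    return max(candidates, key=weighted)


# (feature index, raw score) pairs for three candidate splits.
candidates = [(0, 10.0), (1, 4.0), (2, 3.5)]

print(pick_split(candidates, {}))                # all weights are 1
print(pick_split(candidates, {0: 0.1, 2: 3.0}))  # feature 0 damped, feature 2 boosted
```

With the weights applied, the candidate on feature 2 wins even though its raw score is the lowest, because its weighted score (3.5 * 3.0) exceeds the others.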

Type

list
numpy.ndarray
string
dict

Default value

1 for all features

Supported processing units

CPU

first_feature_use_penalties

Command-line: --first-feature-use-penalties

Description

Per-feature penalties for the first occurrence of the feature in the model. The given value is subtracted from the score if the current candidate is the first one to include the feature in the model.

Refer to the Per-object and per-feature penalties section for details on applying different score penalties.

Non-negative float values are supported for each penalty.

Set the penalty for each feature as a string (the number of features is n).

Format
```
"(<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>)"
```
Note

Spaces between values are not allowed.

Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.

In this example

first_feature_use_penalties parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
per_object_feature_penalties parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
Set the penalty individually for each explicitly specified feature as a string (the number of features is n).

Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note

Spaces between values are not allowed.

These examples

first_feature_use_penalties parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
per_object_feature_penalties parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.

Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).

Format

[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]

{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}

Array examples.

first_feature_use_penalties parameter:

first_feature_use_penalties = [0.1, 1, 3]

per_object_feature_penalties parameter:

per_object_feature_penalties = [0.1, 1, 3]

These dictionary examples

first_feature_use_penalties parameter:

first_feature_use_penalties = {"Feature2":1.1,"Feature4":0.1}

first_feature_use_penalties = {"2":1.1, "4":0.1}

per_object_feature_penalties parameter:

per_object_feature_penalties = {"Feature2":1.1,"Feature4":0.1}

per_object_feature_penalties = {"2":1.1, "4":0.1}

are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
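
The way a first-use penalty changes a candidate's score can be sketched as follows. This is an illustration of the description above, not CatBoost's implementation: `penalized_score` is a hypothetical helper, and the default coefficient of 1.0 in it is an assumption of the sketch.

```python
def penalized_score(raw_score, feature_index, used_features,
                    first_use_penalties, penalties_coefficient=1.0):
    """Subtract the penalty only if the feature is not yet used in the model."""
    if feature_index in used_features:
        return raw_score  # the feature already occurs in the model: no penalty
    # Features without an explicit penalty keep the default penalty of 0.
    penalty = first_use_penalties.get(feature_index, 0.0)
    return raw_score - penalties_coefficient * penalty


penalties = {2: 1.1, 4: 0.1}
used = {4}  # feature 4 was already used by an earlier split

print(penalized_score(5.0, 2, used, penalties))  # first use of feature 2: penalized
print(penalized_score(5.0, 4, used, penalties))  # feature 4 already used: unchanged
print(penalized_score(5.0, 0, used, penalties))  # no penalty configured for feature 0
```

Once a feature has entered the model, later splits on it compete on their raw score, so the penalty acts as a one-time cost of adding a new feature.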

Type

list
numpy.ndarray
string
dict

Default value

0 for all features

Supported processing units

CPU

fixed_binary_splits

Command-line: --fixed-binary-splits

Description

A list of indices of binary features to put at the top of each tree; ignored if grow_policy is Symmetric.

Type

list

Default value

None

Supported processing units

GPU

penalties_coefficient

Command-line: --penalties-coefficient

Description

A single-value common coefficient to multiply all penalties.

Non-negative values are supported.

Type

float

Default value

1

Supported processing units

CPU

per_object_feature_penalties

Command-line: --per-object-feature-penalties

Description

Per-object penalties for the first use of the feature for the object. The given value is multiplied by the number of objects that are divided by the current split and use the feature for the first time.

Refer to the Per-object and per-feature penalties section for details on applying different score penalties.

Non-negative float values are supported for each penalty.

Python package

Set the penalty for each feature as a string (the number of features is n).

Format
```
"(<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>)"
```
Note

Spaces between values are not allowed.

Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.

In this example

first_feature_use_penalties parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
per_object_feature_penalties parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
Set the penalty individually for each explicitly specified feature as a string (the number of features is n).

Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note

Spaces between values are not allowed.

These examples

first_feature_use_penalties parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
per_object_feature_penalties parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.

Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).

Format

[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]

{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}

Array examples.

first_feature_use_penalties parameter:

first_feature_use_penalties = [0.1, 1, 3]

per_object_feature_penalties parameter:

per_object_feature_penalties = [0.1, 1, 3]

These dictionary examples

first_feature_use_penalties parameter:

first_feature_use_penalties = {"Feature2":1.1,"Feature4":0.1}

first_feature_use_penalties = {"2":1.1, "4":0.1}

per_object_feature_penalties parameter:

per_object_feature_penalties = {"Feature2":1.1,"Feature4":0.1}

per_object_feature_penalties = {"2":1.1, "4":0.1}

are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.

R package

Set the penalty for each feature as a string (the number of features is n).

Format
```
"(<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>)"
```
Note

Spaces between values are not allowed.

Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.

In this example
first_feature_use_penalties parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
per_object_feature_penalties parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
Set the penalty individually for each explicitly specified feature as a string (the number of features is n).

Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note

Spaces between values are not allowed.
These examples
first_feature_use_penalties parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
per_object_feature_penalties parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
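
The per-object penalty described for this parameter scales with the number of objects that meet the feature for the first time. The sketch below illustrates that arithmetic under stated assumptions: `per_object_penalty` and the bookkeeping dictionary `features_seen_by_object` are hypothetical, and the coefficient default of 1.0 is an assumption of the sketch.

```python
def per_object_penalty(feature_index, objects_in_split, features_seen_by_object,
                       per_object_penalties, penalties_coefficient=1.0):
    """Total penalty for a split on `feature_index` over `objects_in_split`."""
    # Count the objects for which this feature has not been used before.
    first_time = sum(
        1 for obj in objects_in_split
        if feature_index not in features_seen_by_object.get(obj, set())
    )
    # Features without an explicit penalty keep the default penalty of 0.
    penalty = per_object_penalties.get(feature_index, 0.0)
    return penalties_coefficient * penalty * first_time


# Objects 0 and 3 have already passed through a split on feature 2.
seen = {0: {2}, 1: set(), 3: {2, 4}}
objects = [0, 1, 2, 3]

print(per_object_penalty(2, objects, seen, {2: 0.5}))
```

Here two objects (1 and 2) use feature 2 for the first time, so the split is charged twice the configured penalty of 0.5.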

Type

list
numpy.ndarray
string
dict

Default value

0 for all objects

Supported processing units

CPU

model_shrink_rate

Command-line: --model-shrink-rate

Description

The constant used to calculate the coefficient for multiplying the model on each iteration.
The actual model shrinkage coefficient calculated at each iteration depends on the value of the model_shrink_mode parameter (--model-shrink-mode for the command-line version). The resulting value of the coefficient should always be in the range (0, 1].

Type

float

Default value

The default value depends on the values of the following parameters:

model_shrink_mode (--model-shrink-mode for the command-line version)
monotone_constraints (--monotone-constraints for the command-line version)

Supported processing units

CPU

model_shrink_mode

Command-line: --model-shrink-mode

Description

Determines how the actual model shrinkage coefficient is calculated at each iteration.

Possible values:

Constant:

$1 - model\_shrink\_rate \cdot learning\_rate {,}$
- $model\_shrink\_rate$ is the value of the model_shrink_rate parameter (--model-shrink-rate for the command-line version).
- $learning\_rate$ is the value of the learning_rate parameter (--learning-rate for the command-line version).
Decreasing:

$1 - \frac{model\_shrink\_rate}{i} {,}$
- $model\_shrink\_rate$ is the value of the model_shrink_rate parameter (--model-shrink-rate for the command-line version).
- $i$ is the identifier of the iteration.
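
The two formulas can be evaluated directly to check that a chosen model_shrink_rate keeps the coefficient in the range (0, 1]. This is a minimal sketch: `shrink_coefficient` is a hypothetical helper, and it assumes that the iteration identifier $i$ starts at 1, which the text above does not state.

```python
def shrink_coefficient(mode, model_shrink_rate, learning_rate=None, iteration=None):
    """Compute the model shrinkage coefficient for one iteration."""
    if mode == "Constant":
        coefficient = 1.0 - model_shrink_rate * learning_rate
    elif mode == "Decreasing":
        # Assumption of this sketch: iteration identifiers start at 1.
        coefficient = 1.0 - model_shrink_rate / iteration
    else:
        raise ValueError(f"unknown model_shrink_mode: {mode}")
    if not 0.0 < coefficient <= 1.0:
        raise ValueError(f"coefficient {coefficient} is outside (0, 1]")
    return coefficient


print(shrink_coefficient("Constant", 0.5, learning_rate=0.5))
print(shrink_coefficient("Decreasing", 0.5, iteration=2))
```

In Constant mode the coefficient is the same on every iteration, while in Decreasing mode it approaches 1 as the iteration number grows, so the shrinkage fades over time.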

Type

string

Default value

Constant

Supported processing units

CPU
