Bootstrap options
Regularization
To prevent overfitting, the weight of each training example is varied over the steps of choosing different splits (but not over the scoring of different candidates for one split) or over different trees.
Speeding up
When building a new tree, CatBoost calculates a score for each of the numerous split candidates. The computational complexity of this procedure is $O(|C| \cdot n)$, where:
 $|C|$ is the number of numerical features, each providing many split candidates.
 $n$ is the number of examples.
Usually, this computation dominates all other steps of each CatBoost iteration (see Table 1 in the paper). Hence, it is appealing to speed up this procedure by using only a subset of the examples for scoring all the split candidates.
Bootstrap type  Description  Associated parameters 

Bayesian
Description. The weight of an example is set to $w = a^{t}$, where:
 $t$ is defined by the bagging_temperature parameter.
 $a = -\log(\psi)$, where $\psi$ is independently generated from Uniform[0, 1].
Note. The Bayesian bootstrap serves only for regularization, not for speeding up.
Associated parameters:
 bagging_temperature (command-line version: --bagging-temperature). Defines the settings of the Bayesian bootstrap, which assigns random weights to objects. It is used by default in classification and regression modes. The weights are sampled from the exponential distribution if the value of this parameter is set to 1. All weights are equal to 1 if the value is set to 0. Possible values are in the range $[0; \infty)$. The higher the value, the more aggressive the bagging is. This parameter can be used only if the selected bootstrap type is Bayesian.
 sampling_unit (command-line version: --sampling-unit). The sampling scheme. Possible values: Object, Group.
Bernoulli
Description. Corresponds to Stochastic Gradient Boosting (SGB, refer to the paper for details). Each example is independently sampled for choosing the current split with the probability defined by the subsample parameter. All the sampled examples have equal weights. Though SGB was originally proposed for regularization, it also speeds up calculations by a factor of almost $1/\text{subsample}$.
Associated parameters:
 subsample (command-line version: --subsample). Sample rate for bagging. This parameter can be used if one of the following bootstrap types is selected: Poisson, Bernoulli, MVS.
 sampling_unit (command-line version: --sampling-unit). The sampling scheme. Possible values: Object, Group.
MVS (supported only on CPU)
Description. Implements the importance sampling algorithm called Minimum Variance Sampling (MVS). The score of a split candidate is based on an estimate of the expected gradient in each leaf (provided by this candidate), where the gradient $g_i$ for the example $i$ is calculated as follows: $g_i = \left.\frac{\partial L(y_i, z)}{\partial z}\right|_{z = M(i)}$, where $L$ is the loss function, $y_i$ is the target of the example $i$, and $M(i)$ is the current model prediction for the example $i$.
For this estimation, MVS samples a subsample of the examples such that the examples with the largest values of $|g_i|$ are taken with probability $p_i = 1$ and each other example $i$ is sampled with probability $p_i = |g_i| / \mu$, where $\mu$ is the threshold above which a gradient is considered large. Then, the estimate of the expected gradient is calculated as follows:
$\hat{E}\,g = \frac{\sum_{i \in \text{sampled}} g_i / p_i}{\sum_{i \in \text{sampled}} 1 / p_i}$
Associated parameters:
 mvs_reg (command-line version: --mvs-reg). Affects the weight of the denominator and can be used for balancing between importance sampling and Bernoulli sampling (setting it to 0 implies importance sampling, and setting it to $\infty$ implies Bernoulli sampling).
 sampling_unit (command-line version: --sampling-unit). The sampling scheme. Possible values: Object, Group.
Poisson (refer to the paper for details; supported only on GPU)
Description. The weights of examples are sampled i.i.d. from the Poisson distribution with the parameter $-\log(1 - \text{subsample})$, which makes the expected fraction of examples with positive weights equal to the subsample parameter. If subsample is equal to 0.66, this approximates the classical bootstrap (sampling $n$ examples with repetitions).
Associated parameters:
 subsample (command-line version: --subsample). Sample rate for bagging. This parameter can be used if one of the following bootstrap types is selected: Poisson, Bernoulli, MVS.
 sampling_unit (command-line version: --sampling-unit). The sampling scheme. Possible values: Object, Group.
No
Description. All training examples are used with equal weights.
Associated parameters:
 sampling_unit (command-line version: --sampling-unit). The sampling scheme. Possible values: Object, Group.
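The weighting schemes in the table can be illustrated with a minimal sketch in plain Python. It is not CatBoost's implementation: the function names are hypothetical, the Bayesian weight assumes the form $a^{t}$ with $a = -\log(\psi)$, and the Poisson rate $-\log(1 - \text{subsample})$ is derived from the requirement that the fraction of positive weights equals subsample.

```python
import math
import random


def bayesian_weight(temperature, rng):
    """Weight a**t with a = -log(psi), psi ~ Uniform(0, 1] (assumed form)."""
    psi = 1.0 - rng.random()  # shift to (0, 1] so that log(0) cannot occur
    a = -math.log(psi)        # a follows the Exp(1) distribution
    return a ** temperature   # temperature 0 gives weight 1 for every example


def bernoulli_weight(subsample, rng):
    """Keep the example with probability `subsample`; kept examples weigh 1."""
    return 1.0 if rng.random() < subsample else 0.0


def poisson_weight(subsample, rng):
    """Poisson weight whose probability of being positive equals `subsample`."""
    lam = -math.log(1.0 - subsample)  # requires subsample < 1
    limit = math.exp(-lam)            # equals 1 - subsample
    count, product = 0, rng.random()
    while product > limit:            # Knuth's multiplication method
        count += 1
        product *= rng.random()
    return float(count)


rng = random.Random(0)
weights = [poisson_weight(0.66, rng) for _ in range(10)]
print(weights)
```

Drawing a fresh weight per example before each tree or before each split corresponds to the resampling frequency discussed next.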
The frequency of resampling and reweighting is defined by the sampling_frequency parameter:
 PerTree — Before constructing each new tree
 PerTreeLevel — Before choosing each new split of a tree
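As an illustration of how these settings combine, the sketch below builds a parameter dictionary for the CatBoost Python package and checks it against the table. The mapping `TYPE_SPECIFIC` and the helper `misplaced_params` are hypothetical and only restate the table; they are not part of CatBoost.

```python
# Type-specific parameters per bootstrap type, read off the table above.
# Illustrative mapping only, not an official CatBoost structure.
TYPE_SPECIFIC = {
    "Bayesian": {"bagging_temperature"},
    "Bernoulli": {"subsample"},
    "MVS": {"subsample", "mvs_reg"},
    "Poisson": {"subsample"},
    "No": set(),
}


def misplaced_params(params):
    """Hypothetical helper: type-specific keys that do not fit bootstrap_type."""
    allowed = TYPE_SPECIFIC[params["bootstrap_type"]]
    all_specific = set().union(*TYPE_SPECIFIC.values())
    return sorted(k for k in params if k in all_specific and k not in allowed)


params = {
    "bootstrap_type": "Bernoulli",
    "subsample": 0.66,
    "sampling_frequency": "PerTreeLevel",  # resample before each new split
    "sampling_unit": "Object",
}
print(misplaced_params(params))

# With the catboost package installed, the dictionary can be passed on, e.g.:
#   from catboost import CatBoostClassifier
#   model = CatBoostClassifier(**params)
```

Keeping the check separate from training makes a mismatch, such as bagging_temperature combined with a non-Bayesian bootstrap type, visible before a model is constructed.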
MVS is recommended when speed matters more than regularization, which is usually the case when working with large datasets. For regularization, other options might be more appropriate.
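The MVS idea can be sketched as follows, under several assumptions: examples with a gradient magnitude of at least the threshold are always kept, the others are kept with probability proportional to the magnitude, the threshold is found by bisection so that the expected sample size matches the requested fraction, and the effect of mvs_reg is ignored. All function names are hypothetical, and CatBoost's actual implementation may differ.

```python
import random


def mvs_threshold(abs_grads, target_size, iters=60):
    """Bisection for mu such that sum(min(1, |g| / mu)) is about target_size."""
    total = sum(abs_grads)
    if total == 0.0:
        raise ValueError("at least one gradient must be nonzero")
    lo, hi = 0.0, total / target_size  # at hi the expected size is <= target
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        expected = sum(min(1.0, a / mid) for a in abs_grads)
        if expected > target_size:
            lo = mid  # too many examples kept: raise the threshold
        else:
            hi = mid
    return hi


def mvs_sample(grads, subsample, rng):
    """Return (gradient, keep probability) pairs for the sampled examples."""
    abs_grads = [abs(g) for g in grads]
    mu = mvs_threshold(abs_grads, subsample * len(grads))
    probs = [min(1.0, a / mu) for a in abs_grads]
    return [(g, p) for g, p in zip(grads, probs) if rng.random() < p]


def estimate_mean_gradient(sampled):
    """Ratio of the estimated gradient sum to the estimated example count."""
    numerator = sum(g / p for g, p in sampled)
    denominator = sum(1.0 / p for _, p in sampled)
    return numerator / denominator


rng = random.Random(0)
grads = [rng.uniform(0.5, 2.0) * (1.0 if rng.random() < 0.7 else -1.0)
         for _ in range(2000)]
sampled = mvs_sample(grads, 0.3, rng)
print(len(sampled), estimate_mean_gradient(sampled))
```

Examples with tiny gradients receive a small keep probability and therefore a large weight $1/p_i$ when they are sampled, which makes the denominator noisy. This is presumably the kind of effect that a regularization parameter such as mvs_reg is meant to dampen.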
Related papers
 Estimating Uncertainty for Massive Data Streams
N. Chamandy, O. Muralidharan, A. Najmi, and S. Naidu, 2012

 Stochastic gradient boosting
J. H. Friedman
Computational Statistics & Data Analysis, 38(4):367–378, 2002

 Training Deep Models Faster with Robust, Approximate Importance Sampling
T. B. Johnson and C. Guestrin
In Advances in Neural Information Processing Systems, pages 7276–7286, 2018

 LightGBM: A highly efficient gradient boosting decision tree
G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu
In Advances in Neural Information Processing Systems, pages 3146–3154, 2017

 CatBoost: unbiased boosting with categorical features
L. Prokhorenkova, G. Gusev, A. Vorobev, A. V. Dorogush, and A. Gulin
NeurIPS, 2018
NeurIPS 2018 paper explaining the principles of ordered boosting and ordered statistics for categorical features.