CatBoostClassifier

class CatBoostClassifier(iterations=None,
                         learning_rate=None,
                         depth=None,
                         l2_leaf_reg=None,
                         model_size_reg=None,
                         rsm=None,
                         loss_function='Logloss',
                         border_count=None,
                         feature_border_type=None,
                         input_borders=None,
                         output_borders=None,
                         old_permutation_block_size=None,
                         od_pval=None,
                         od_wait=None,
                         od_type=None,
                         nan_mode=None,
                         counter_calc_method=None,
                         leaf_estimation_iterations=None,
                         leaf_estimation_method=None,
                         thread_count=None,
                         random_seed=None,
                         use_best_model=None,
                         verbose=None,
                         logging_level=None,
                         metric_period=None,
                         ctr_leaf_count_limit=None,
                         store_all_simple_ctr=None,
                         max_ctr_complexity=None,
                         has_time=None,
                         allow_const_label=None,
                         classes_count=None,
                         class_weights=None,
                         one_hot_max_size=None,
                         random_strength=None,
                         name=None,
                         ignored_features=None,
                         train_dir=None,
                         custom_loss=None,
                         custom_metric=None,
                         eval_metric=None,
                         bagging_temperature=None,
                         save_snapshot=None,
                         snapshot_file=None,
                         snapshot_interval=None,
                         fold_len_multiplier=None,
                         used_ram_limit=None,
                         gpu_ram_part=None,
                         allow_writing_files=None,
                         final_ctr_computation_mode=None,
                         approx_on_full_history=None,
                         boosting_type=None,
                         simple_ctr=None,
                         combinations_ctr=None,
                         per_feature_ctr=None,
                         task_type=None,
                         device_config=None,
                         devices=None,
                         bootstrap_type=None,
                         subsample=None,
                         sampling_unit=None,
                         dev_score_calc_obj_block_size=None,
                         max_depth=None,
                         n_estimators=None,
                         num_boost_round=None,
                         num_trees=None,
                         colsample_bylevel=None,
                         random_state=None,
                         reg_lambda=None,
                         objective=None,
                         eta=None,
                         max_bin=None,
                         scale_pos_weight=None,
                         gpu_cat_features_storage=None,
                         data_partition=None,
                         metadata=None,
                         early_stopping_rounds=None,
                         cat_features=None,
                         growing_policy=None,
                         min_samples_in_leaf=None,
                         max_leaves_count=None,
                         leaf_estimation_backtracking=None)

Purpose

Trains and applies models for classification problems. When the model is applied, only the probability that the object belongs to the class is returned. Provides compatibility with scikit-learn tools.

Parameters

Parameter Description Default value
metadata The key-value string pairs to store in the model's metadata storage after the training. None
cat_features

A one-dimensional array of categorical column indices.

Categorical features of the catboost.Pool object must be equal to those of the model if a catboost.Pool object is used for training.

Note. Do not use this parameter if the input training dataset (specified in the X parameter) type is catboost.Pool.
None (all features are considered numerical)

See Training parameters for the full list of parameters.

Note. Some parameters duplicate the ones specified for the fit method. In these cases the values specified for the fit method take precedence.

Attributes

Attribute Description
tree_count_

Return the number of trees in the model.

feature_importances_

Output the calculated feature importances.

random_seed_

The random seed used for training.

learning_rate_

The learning rate used for training.

feature_names_

The names of features in the dataset.

evals_result_

Return the values of metrics calculated during the training.

best_score_

Return the best result for each metric calculated on each validation dataset.

best_iteration_

Return the identifier of the iteration with the best result of the evaluation metric or loss function on the last validation set.
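The relation between the metric history and the best iteration can be sketched in plain Python. The nested dictionary below (dataset name, then metric name, then one value per iteration) is an assumed layout with made-up numbers, and the helper best_iteration is hypothetical; it is not part of the CatBoost API. The sketch also assumes a metric that is minimized, such as Logloss.

```python
# Hypothetical metric history: dataset name -> metric name -> value per iteration.
evals_result = {
    "learn": {"Logloss": [0.62, 0.51, 0.43, 0.38, 0.35]},
    "validation": {"Logloss": [0.64, 0.55, 0.50, 0.52, 0.56]},
}


def best_iteration(history, dataset, metric, lower_is_better=True):
    """Return (iteration index, value) of the best metric value for one dataset."""
    values = history[dataset][metric]
    pick = min if lower_is_better else max
    best_value = pick(values)
    # list.index returns the first iteration that reached the best value.
    return values.index(best_value), best_value


iteration, score = best_iteration(evals_result, "validation", "Logloss")
print(iteration, score)  # 2 0.5
```

In this made-up history the training loss keeps decreasing while the validation loss is lowest at iteration index 2, which is the kind of iteration a best-iteration attribute points to.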

Methods

Method Description
fit

Train a model.

predict

Apply the model to the given dataset.

predict_proba

Apply the model to the given dataset to predict the probability that the object belongs to the given classes.

staged_predict

Apply the model to the given dataset and calculate the results for each i-th tree of the model taking into consideration only the trees in the range [1;i].

staged_predict_proba

Apply the model to the given dataset to predict the probability that the object belongs to the class and calculate the results for each i-th tree of the model taking into consideration only the trees in the range [1;i].
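The "trees in the range [1;i]" idea behind the staged methods can be sketched with a toy additive ensemble. This is not CatBoost's implementation: the per-tree contributions are made-up numbers, and mapping the raw score to a probability with a sigmoid is an assumption that matches a binary Logloss setting.

```python
import math
from itertools import accumulate

# Hypothetical raw contributions of four trees for a single object.
tree_outputs = [0.4, 0.3, -0.1, 0.2]


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Stage i uses only trees 1..i, so its raw score is a running sum.
staged_raw = list(accumulate(tree_outputs))

# One probability per stage, computed from the partial ensemble.
staged_proba = [sigmoid(raw) for raw in staged_raw]

for stage, (raw, proba) in enumerate(zip(staged_raw, staged_proba), start=1):
    print(f"trees [1;{stage}]: raw={raw:.2f} proba={proba:.3f}")
```

The last stage equals the prediction of the full ensemble, so staged output is mainly useful for seeing how the prediction evolves as trees are added.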

eval_metrics

Calculate the specified metrics for the specified dataset.

get_feature_importance

Calculate and return the feature importances.

get_object_importance

Calculate the effect of objects from the training dataset on the optimized metric values for the objects from the input dataset:
  • Positive values reflect that the optimized metric increases.
  • Negative values reflect that the optimized metric decreases.

load_model

Load the model from a file.

save_model

Save the model to a file.