catboost.save_pool

catboost.save_pool(data,
                   label = NULL,
                   weight = NULL,
                   baseline = NULL,
                   pool_path = "data.pool",
                   cd_path = "cd.pool")

Purpose

Save the dataset in the CatBoost format. Two files are created: one with the dataset description (pool_path) and one with the columns description (cd_path).

Use the catboost.load_pool function to read the resulting files. These files can also be used in the Command-line version and the Python package.
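The round trip described above can be sketched as follows. This is a minimal illustration, not an official example: the toy data, the variable names, and the use of tempdir() are assumptions made for the sketch, and it assumes that catboost.load_pool accepts a file path together with a column_description argument.

```r
library(catboost)

# Toy dataset (hypothetical): two double features and one factor feature.
features <- data.frame(
  f_num1 = c(0.1, 0.2, 0.3, 0.4),
  f_num2 = c(10, 20, 30, 40),
  f_cat  = factor(c("a", "b", "a", "c"))
)
labels <- c(1, 0, 1, 0)

# Write to a temporary directory so the working directory stays clean.
pool_path <- file.path(tempdir(), "data.pool")
cd_path   <- file.path(tempdir(), "cd.pool")

# Save the dataset description and the columns description.
catboost.save_pool(features,
                   label = labels,
                   pool_path = pool_path,
                   cd_path = cd_path)

# Read the saved files back into a pool object.
pool <- catboost.load_pool(pool_path, column_description = cd_path)
```

Both paths are passed explicitly here so that it is obvious which file holds which part of the output.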

Arguments

data

Description

A data.frame or matrix with features.

The following column types are supported:

double
factor. Categorical features are assumed to be given in columns of this type. The standard CatBoost processing procedure is applied to such columns:
1. The values are converted to strings.
2. The ConvertCatFeatureToFloat function is applied to the resulting string.
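Since only double and factor columns are listed as supported, one cautious approach is to convert other column types before saving. The sketch below uses base R only; the raw data and column names are hypothetical, and it does not call ConvertCatFeatureToFloat, which is applied internally.

```r
# Hypothetical raw data: an integer column and a character column,
# neither of which appears in the list of supported types above.
raw <- data.frame(
  age  = c(25L, 32L, 47L),
  city = c("Oslo", "Lima", "Oslo"),
  stringsAsFactors = FALSE
)

prepared <- raw
prepared$age  <- as.double(prepared$age)  # integer -> double
prepared$city <- factor(prepared$city)    # character -> factor

# Step 1 of the procedure above: factor values converted to strings.
city_strings <- as.character(prepared$city)
```

The prepared data.frame can then be passed as the data argument.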

Default value

Required argument

label

Description

The target variables (in other words, the objects' label values) of the dataset.

Default value

NULL

weight

Description

The weights of objects.

Default value

NULL

baseline

Description

A vector of formula values for all input objects. Training starts from these values instead of from zero.

Default value

NULL

pool_path

Description

The path to the output file that contains the dataset description.

Default value

data.pool

cd_path

Description

The path to the output file that contains the columns description.

Default value

cd.pool
