Scale and bias

Purpose

Set and/or print the model scale and bias.

Execution format

catboost normalize-model [optional parameters]

Options

-m, --model-file, --model-path

Description

The name of the input file with the description of the model obtained as the result of training.

Default value

model.bin

--model-format

Description

The format of the input model.

Possible values:

  • CatboostBinary.
  • AppleCoreML (only datasets without categorical features are currently supported).
  • json (multiclassification models are not currently supported). Refer to the CatBoost JSON model tutorial for format details.
  • onnx — ONNX-ML format (only datasets without categorical features are currently supported). Refer to https://onnx.ai for details. See the ONNX section for details on applying the resulting model.
  • pmml — PMML version 4.3 format. Categorical features must be interpreted as one-hot encoded during the training if present in the training dataset. This can be accomplished by setting the --one-hot-max-size/one_hot_max_size parameter to a value that is greater than the maximum number of unique categorical feature values among all categorical features in the dataset. See the PMML section for details on applying the resulting model.

Default value

CatboostBinary
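
The one-hot requirement for the pmml format described above can be estimated from the training data before training. The following is a minimal sketch and not part of CatBoost: the helper name `min_one_hot_max_size` and the toy rows are hypothetical. It counts the unique values in each categorical column and returns the smallest value that is strictly greater than the largest count.

```python
def min_one_hot_max_size(rows, cat_feature_indices):
    """Return the smallest value that exceeds the largest number of unique
    values found in any of the given categorical columns."""
    max_unique = 0
    for idx in cat_feature_indices:
        unique_values = {row[idx] for row in rows}
        max_unique = max(max_unique, len(unique_values))
    return max_unique + 1


# Toy dataset (made up): column 0 is numerical, columns 1 and 2 are categorical.
rows = [
    (0.5, "red", "S"),
    (1.2, "green", "M"),
    (0.7, "blue", "S"),
    (3.1, "red", "L"),
]
one_hot_max_size = min_one_hot_max_size(rows, cat_feature_indices=[1, 2])
print(one_hot_max_size)
```

The resulting number is a candidate value for --one-hot-max-size when the model is trained.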

--column-description, --cd

Description

The path to the input file that contains the columns description.

Default value

If omitted, it is assumed that the first column in the file with the dataset description defines the label value, and the other columns are the values of numerical features.

--delimiter

Description

The delimiter character used to separate the data in the dataset description input file.

Only single-character delimiters are supported. If the specified value contains more than one character, only the first one is used.

Note

Used only if the dataset is given in the Delimiter-separated values format.

Default value

The input data is assumed to be tab-separated

--has-header

Description

Read the column names from the first line of the dataset description file if this parameter is set.

Note

Used only if the dataset is given in the Delimiter-separated values format.

Default value

False (the first line is treated as a data row, like all the other lines)
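
The dataset-reading options above (--column-description, --delimiter, --has-header) can be illustrated with a short sketch. It is not CatBoost's own reader: the helper names `read_dsv` and `split_default_columns` and the sample text are hypothetical, and Python's `csv` module applies quoting rules that CatBoost may not share. The sketch keeps only the first character of the delimiter, optionally takes column names from the first line, and applies the default column layout (label first, numerical features after it).

```python
import csv
import io


def read_dsv(text, delimiter="\t", has_header=False):
    """Parse delimiter-separated text into (column_names, rows)."""
    # Mirror the documented rule: only the first character of the value is used.
    sep = delimiter[0]
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


def split_default_columns(row):
    """Default layout without a column description file:
    the first column is the label, the rest are numerical features."""
    return row[0], [float(value) for value in row[1:]]


# Made-up sample; the second delimiter character "," is ignored.
sample = "label;f1;f2\n1;0.5;2.0\n0;1.5;3.0\n"
names, data = read_dsv(sample, delimiter=";,", has_header=True)
print(names)
print([split_default_columns(row) for row in data])
```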

--set-scale

Description

The model scale.

Default value

1

--set-bias

Description

The model bias.

The model prediction results are calculated as follows:

$\sum leaf\_values \cdot scale + bias$

The value of this parameter affects the prediction by changing the default value of the bias.

Default value

Depends on the value of the --boost-from-average parameter of the command-line version:

  • True — The best constant value for the specified loss function
  • False — 0
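
The effect of --set-scale and --set-bias on a prediction can be sketched as follows, assuming the raw prediction is the sum of the leaf values multiplied by the scale, plus the bias. The function name `apply_model` and the leaf values are hypothetical and chosen only for illustration.

```python
def apply_model(leaf_values, scale=1.0, bias=0.0):
    """Combine per-tree leaf values into a raw prediction: sum * scale + bias."""
    return sum(leaf_values) * scale + bias


# Hypothetical leaf values selected by three trees for one object.
leaf_values = [0.25, -0.5, 0.75]

# Same trees, two different normalizations of the model.
before = apply_model(leaf_values, scale=1.0, bias=1.5)
after = apply_model(leaf_values, scale=0.8, bias=0.8)
print(before, after)
```

The trees themselves are unchanged by normalization. Only the final linear transformation of their summed output differs.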

--print-scale-and-bias

Description

Print the scale and bias of the model.

These values affect the results of applying the model, since the model prediction results are calculated as follows:

$\sum leaf\_values \cdot scale + bias$

Default value

Scale and bias are not output

--logging-level

Description

The logging level to output to stdout.

Possible values:

  • Silent — Do not output any logging information to stdout.
  • Verbose — Output the following data to stdout:
    • optimized metric
    • elapsed time of training
    • remaining time of training
  • Info — Output additional information and the number of trees.
  • Debug — Output debugging information.

Default value

Info

-T, --thread-count

Description

The number of threads to use.

Default value

4

--input-path

Description

The name of the input file with the dataset description.

Default value

input.tsv

--output-model

Description

The path to the output model.

Default value

model.bin

--output-model-format

Description

The format of the output model.

Possible values:

  • CatboostBinary.
  • AppleCoreML (only datasets without categorical features are currently supported).
  • json (multiclassification models are not currently supported). Refer to the CatBoost JSON model tutorial for format details.
  • onnx — ONNX-ML format (only datasets without categorical features are currently supported). Refer to https://onnx.ai for details. See the ONNX section for details on applying the resulting model.
  • pmml — PMML version 4.3 format. Categorical features must be interpreted as one-hot encoded during the training if present in the training dataset. This can be accomplished by setting the --one-hot-max-size/one_hot_max_size parameter to a value that is greater than the maximum number of unique categorical feature values among all categorical features in the dataset. See the PMML section for details on applying the resulting model.

Default value

CatboostBinary

Usage examples

Set the scale and bias to 0.8 and print both the input and the output values:

catboost normalize-model --set-scale 0.8 --set-bias 0.8 --print-scale-and-bias

The output of this example:

Input model scale 1 bias 1.405940652
Output model scale 0.8 bias 0.8
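
When the command is driven from a script, the call and its output can be handled as in the sketch below. The helpers `build_normalize_command` and `parse_scale_and_bias` are hypothetical. They use only flags listed in this section, and the line pattern is an assumption based on the two output lines shown above.

```python
import re
import shlex


def build_normalize_command(scale=None, bias=None, print_values=False, model_file=None):
    """Assemble a `catboost normalize-model` argument list from documented flags."""
    args = ["catboost", "normalize-model"]
    if model_file is not None:
        args += ["--model-file", model_file]
    if scale is not None:
        args += ["--set-scale", str(scale)]
    if bias is not None:
        args += ["--set-bias", str(bias)]
    if print_values:
        args.append("--print-scale-and-bias")
    return args


# Assumed line shape, taken from the example output above.
LINE_RE = re.compile(r"^(Input|Output) model scale (\S+) bias (\S+)$")


def parse_scale_and_bias(output):
    """Map 'Input' / 'Output' to a (scale, bias) tuple of floats."""
    result = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            kind, scale, bias = match.groups()
            result[kind] = (float(scale), float(bias))
    return result


command = build_normalize_command(scale=0.8, bias=0.8, print_values=True)
print(shlex.join(command))

sample_output = "Input model scale 1 bias 1.405940652\nOutput model scale 0.8 bias 0.8\n"
print(parse_scale_and_bias(sample_output))
```

If the catboost binary is available on PATH, the argument list can be passed to `subprocess.run(command, capture_output=True, text=True)` and the captured stdout given to `parse_scale_and_bias`.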