Performance settings

thread_count

Command-line: -T, --thread-count

Description

The number of threads to use during training.

Python package, Command-line
  • For CPU

    Optimizes the speed of execution. This parameter doesn't affect results.

  • For GPU

    The given value is used for reading the data from the hard drive and does not affect the training.

    During training, one main thread and one thread for each GPU are used.

R package

Optimizes the speed of execution. This parameter doesn't affect results.

Type

int

Default value

-1 (the number of threads is equal to the number of processor cores)

Supported processing units

CPU and GPU
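
As a rough illustration of the default, the sketch below resolves a thread_count of -1 to the core count reported by the operating system. The helper resolve_thread_count is hypothetical and not part of CatBoost; it only mirrors the rule stated above. Note that os.cpu_count() reports logical cores, which may differ from what the library itself counts.

```python
import os


def resolve_thread_count(thread_count: int = -1) -> int:
    """Hypothetical helper: map -1 to the number of cores seen by the OS."""
    if thread_count == -1:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        return os.cpu_count() or 1
    if thread_count < 1:
        raise ValueError("thread_count must be -1 or a positive integer")
    return thread_count


# Parameters as they might be passed to the Python package, for example
# CatBoostClassifier(**params). The value 4 is an arbitrary example.
params = {"thread_count": resolve_thread_count(4)}
```

On the command line the same setting would be given as -T 4 or --thread-count 4.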

used_ram_limit

Command-line: --used-ram-limit

Description

Attempt to limit the amount of CPU RAM used.

Alert

  • This option affects only the CTR calculation memory usage.
  • In some cases it is impossible to limit the amount of CPU RAM used in accordance with the specified value.

Format:

<size><measure of information>

Supported measures of information (case-insensitive):

  • MB
  • KB
  • GB

For example:

2gb

Type

int

Default value

None (memory usage is not limited)

Supported processing units

CPU
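
To make the <size><measure of information> format concrete, here is a minimal sketch of how such a string could be converted to bytes. The function parse_size is hypothetical and is not CatBoost's own parser; it also assumes binary multiples (1 KB = 1024 bytes), which the text above does not state.

```python
import re

# Assumption: units are binary multiples (1 KB = 1024 bytes).
_UNITS = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_size(text: str) -> int:
    """Hypothetical parser for '<size><unit>' strings such as '2gb'."""
    match = re.fullmatch(r"\s*(\d+)\s*(kb|mb|gb)\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    size, unit = match.groups()
    # Units are matched case-insensitively, so normalize before the lookup.
    return int(size) * _UNITS[unit.lower()]


limit_bytes = parse_size("2gb")
```

On the command line the string itself is passed unchanged, for example --used-ram-limit 2gb.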

gpu_ram_part

Command-line: --gpu-ram-part

Description

The fraction of the GPU RAM to use for training.

Type

float

Default value

0.95

Supported processing units

GPU
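
The arithmetic behind this setting can be sketched as follows, assuming the value is read as a fraction of the total device memory and that valid values lie in the range (0, 1]. Both points are assumptions, as are the helper gpu_memory_budget and the 16 GiB device used in the example.

```python
def gpu_memory_budget(total_bytes: int, gpu_ram_part: float = 0.95) -> int:
    """Hypothetical helper: bytes of GPU RAM available under gpu_ram_part."""
    # Assumption: the parameter is a fraction in the range (0, 1].
    if not 0.0 < gpu_ram_part <= 1.0:
        raise ValueError("gpu_ram_part must be in the range (0, 1]")
    return int(total_bytes * gpu_ram_part)


total = 16 * 1024 ** 3  # hypothetical device with 16 GiB of RAM
budget = gpu_memory_budget(total)  # default share of 0.95
half = gpu_memory_budget(total, gpu_ram_part=0.5)
```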

pinned_memory_size

Command-line: --pinned-memory-size

Description

How much pinned (page-locked) CPU RAM to use per GPU.

The value should be a positive integer or inf. A measure of information can be specified for integer values.

Format:

<size><measure of information>

Supported measures of information (case-insensitive):

  • MB
  • KB
  • GB

For example:

2gb

Type

int

Default value

1073741824 (1024³, that is, 1 GB expressed in bytes)

Supported processing units

GPU

gpu_cat_features_storage

Command-line: --gpu-cat-features-storage

Description

The method for storing the categorical features' values.

Possible values:

  • CpuPinnedMemory
  • GpuRam

Note

Use the CpuPinnedMemory value if feature combinations are used and the available GPU RAM is not sufficient.

Type

string

Default value

Python package

None (set to GpuRam)

Command-line

GpuRam

Supported processing units

GPU
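
The GPU memory settings described so far might be combined in one parameter dictionary for the Python package, as in the sketch below. The values are arbitrary examples rather than recommendations, and task_type is a training parameter that this section does not describe; it is included here on the assumption that these options only take effect when training on a GPU.

```python
# Example values only; every key except task_type is described in this section.
gpu_params = {
    "task_type": "GPU",  # assumption: GPU training is selected explicitly
    "gpu_ram_part": 0.95,  # documented default
    "pinned_memory_size": 1073741824,  # documented default, equal to 1024 ** 3
    # Chosen per the note above for the case where feature combinations
    # are used and the available GPU RAM is not sufficient.
    "gpu_cat_features_storage": "CpuPinnedMemory",
}
```

With the catboost package installed and a GPU available, the dictionary would typically be unpacked into a model constructor, for example CatBoostClassifier(**gpu_params).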

data_partition

Command-line: --data-partition

Description

The method for splitting the input dataset between multiple workers.

Possible values:

  • FeatureParallel — Split the input dataset by features and calculate the value of each of these features on a certain GPU.

    For example:

    • GPU0 is used to calculate the values of features indexed 0, 1, 2
    • GPU1 is used to calculate the values of features indexed 3, 4, 5, etc.
  • DocParallel — Split the input dataset by objects and calculate all features for each of these objects on a certain GPU. It is recommended to use powers of two as the value for optimal performance.

    For example:

    • GPU0 is used to calculate all features for objects indexed object_1, object_2
    • GPU1 is used to calculate all features for objects indexed object_3, object_4, etc.

Type

string

Default value

Depends on the learning mode and the input dataset

Supported processing units

GPU
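
The two splitting schemes can be pictured with a small sketch. It assumes contiguous, near-equal blocks of indices per GPU, which matches the examples above but is not necessarily how CatBoost distributes data internally; split_contiguous is a hypothetical helper.

```python
from typing import Dict, List


def split_contiguous(n_items: int, n_gpus: int) -> Dict[int, List[int]]:
    """Assign item indices 0..n_items-1 to GPUs in contiguous blocks."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    base, extra = divmod(n_items, n_gpus)
    assignment: Dict[int, List[int]] = {}
    start = 0
    for gpu in range(n_gpus):
        # The first `extra` GPUs take one additional item each.
        size = base + (1 if gpu < extra else 0)
        assignment[gpu] = list(range(start, start + size))
        start += size
    return assignment


# FeatureParallel: the items are feature indices.
feature_parallel = split_contiguous(n_items=6, n_gpus=2)

# DocParallel: the items are object (row) indices; each GPU then
# computes all features for the objects assigned to it.
doc_parallel = split_contiguous(n_items=4, n_gpus=2)
```

With two GPUs, the first call assigns features 0, 1, 2 to GPU0 and 3, 4, 5 to GPU1, mirroring the FeatureParallel example; the second call assigns the first two objects to GPU0 and the next two to GPU1.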