Training on GPU
CatBoost supports training on GPUs.
Training on GPU is non-deterministic because the order of floating-point summations is not fixed in this implementation, so repeated runs on the same data can produce slightly different models.
See the section for your implementation below for details on the parameters required to start training on GPU.
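Summation order matters because floating-point addition is not associative. The following is a minimal sketch in plain Python (CPU arithmetic only, not CatBoost code) that shows the same numbers giving different totals when grouped or ordered differently. The helper name sum_in_order is made up for this illustration.

```python
# Floating-point addition is not associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # the two groupings round differently


def sum_in_order(values):
    """Add values strictly left to right (illustrative helper, not a CatBoost API)."""
    total = 0.0
    for value in values:
        total += value
    return total


# The same four numbers, summed in two different orders.
values = [1e16, 1.0, -1e16, 1.0]
forward = sum_in_order(values)
reordered = sum_in_order(sorted(values))
print(forward, reordered)
```

On a GPU, many threads accumulate partial sums in an order that can vary between runs, so small differences of this kind can appear in the trained model.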
Python package
The parameters that enable and customize training on GPU are set in the constructors of the following classes:
- CatBoost
- CatBoostClassifier
- CatBoostRegressor
Parameters
task_type
The processing unit type to use for training.
Possible values:
- CPU
- GPU
devices
IDs of the GPU devices to use for training (indices are zero-based).
Format
- <unit ID> for one device (for example, 3)
- <unit ID1>:<unit ID2>:..:<unit IDN> for multiple devices (for example, devices='0:1:3')
- <unit ID1>-<unit IDN> for a range of devices (for example, devices='0-3')
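To make the three formats concrete, here is a small sketch that expands a devices string into the list of device IDs it names. parse_devices is a hypothetical helper written for this illustration, not part of the CatBoost API, and it assumes that a range includes both of its endpoints.

```python
def parse_devices(spec):
    """Expand a devices string such as '3', '0:1:3' or '0-3' into a list of IDs.

    Hypothetical helper for illustration only; not part of CatBoost.
    Assumption: a range '<first>-<last>' includes both endpoints.
    """
    spec = spec.strip()
    if "-" in spec:
        # Range form: <unit ID1>-<unit IDN>
        first, last = (int(part) for part in spec.split("-"))
        if first > last:
            raise ValueError(f"invalid device range: {spec!r}")
        return list(range(first, last + 1))
    # Single device or colon-separated list: <unit ID1>:<unit ID2>:..:<unit IDN>
    return [int(part) for part in spec.split(":")]


print(parse_devices("3"))
print(parse_devices("0:1:3"))
print(parse_devices("0-3"))
```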
Note
Other training parameters are also available. Some of them are CPU-specific or GPU-specific. See the Python package training parameters section for more details.
For example, use the following code to train a classification model on GPU:
from catboost import CatBoostClassifier

train_data = [[0, 3],
              [4, 1],
              [8, 1],
              [9, 1]]
train_labels = [0, 0, 1, 1]

model = CatBoostClassifier(iterations=1000,
                           task_type="GPU",
                           devices='0')
model.fit(train_data,
          train_labels,
          verbose=False)
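The example above assumes that a GPU is present. One possible way to keep a script portable is to choose task_type at run time, sketched below. training_params and gpu_device_count are hypothetical helpers introduced here; the sketch relies on get_gpu_device_count from catboost.utils to report how many GPUs are visible, and falls back to CPU when CatBoost is not installed or no GPU is found.

```python
def gpu_device_count():
    """Return the number of GPUs visible to CatBoost, or 0 if CatBoost is not installed."""
    try:
        from catboost.utils import get_gpu_device_count
    except ImportError:
        return 0
    return get_gpu_device_count()


def training_params(gpu_count, iterations=1000):
    """Build constructor keyword arguments for the available hardware.

    Hypothetical helper for illustration; not part of CatBoost.
    """
    params = {"iterations": iterations}
    if gpu_count > 0:
        params["task_type"] = "GPU"
        # Colon-separated list of all visible devices, for example '0:1:2'
        params["devices"] = ":".join(str(i) for i in range(gpu_count))
    else:
        params["task_type"] = "CPU"
    return params


print(training_params(0))
print(training_params(2))
```

With CatBoost installed, the result could be passed straight to a constructor, for example CatBoostClassifier(**training_params(gpu_device_count())).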
R package
For the catboost.train method:
Parameters
task_type
The processing unit type to use for training.
Possible values:
- CPU
- GPU
devices
IDs of the GPU devices to use for training (indices are zero-based).
Format
- <unit ID> for one device (for example, 3)
- <unit ID1>:<unit ID2>:..:<unit IDN> for multiple devices (for example, devices='0:1:3')
- <unit ID1>-<unit IDN> for a range of devices (for example, devices='0-3')
For example, use the following code to train a model on GPU:
library(catboost)

dataset <- matrix(c(1900, 7,
                    1896, 1,
                    1896, 41),
                  nrow = 3,
                  ncol = 2,
                  byrow = TRUE)
label_values <- c(0, 1, 1)

fit_params <- list(iterations = 100,
                   loss_function = 'Logloss',
                   task_type = 'GPU')

pool <- catboost.load_pool(dataset, label = label_values)
model <- catboost.train(pool, params = fit_params)
Command-line version
For the catboost fit command:
Command keys
--task-type
The processing unit type to use for training.
Possible values:
- CPU
- GPU
--devices
IDs of the GPU devices to use for training (indices are zero-based).
Format
- <unit ID> for one device (for example, --devices 3)
- <unit ID1>:<unit ID2>:..:<unit IDN> for multiple devices (for example, --devices 0:1:3)
- <unit ID1>-<unit IDN> for a range of devices (for example, --devices 0-3)
Train a classification model on GPU:
catboost fit --learn-set ../pytest/data/adult/train_small --column-description ../pytest/data/adult/train.cd --task-type GPU
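When the command-line version is driven from a script, the same command can be assembled as an argument list. The sketch below uses only the keys shown in this section. build_fit_command is a hypothetical helper, and the file paths are the ones from the command above.

```python
import shlex


def build_fit_command(learn_set, column_description, devices=None):
    """Assemble a 'catboost fit' argument list for GPU training.

    Hypothetical helper for illustration; it only uses the keys described above.
    """
    command = ["catboost", "fit",
               "--learn-set", learn_set,
               "--column-description", column_description,
               "--task-type", "GPU"]
    if devices is not None:
        # Optional device selection, for example '0', '0:1:3' or '0-3'
        command += ["--devices", devices]
    return command


command = build_fit_command("../pytest/data/adult/train_small",
                            "../pytest/data/adult/train.cd",
                            devices="0:1")
print(shlex.join(command))
```

The list can then be passed to subprocess.run(command, check=True) on a machine where the catboost binary is on the PATH.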