Quick start
To get started:
- Prepare a dataset using the catboost.load_pool function:
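library(catboost)
# Categorical columns must be factors; R 4.0+ no longer converts strings automatically.
features <- data.frame(feature1 = c(1, 2, 3),
                       feature2 = c('A', 'B', 'C'),
                       stringsAsFactors = TRUE)
labels <- c(0, 0, 1)
train_pool <- catboost.load_pool(data = features, label = labels)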
The dataset is created from a synthetic data.frame called features in this example. The data argument can also reference a dataset file or a matrix of numerical features.
- Train the model using the catboost.train function:
model <- catboost.train(train_pool, NULL, params = list(loss_function = 'Logloss', iterations = 100, metric_period = 10))
The second argument in this example (test_pool) is set to NULL. It can also be used to pass a validation dataset (the labelled data used for estimating the prediction error while training). The params argument is used to specify the training parameters.
- Apply the trained model using the catboost.predict function:
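# Build the prediction pool the same way as the training pool (no labels needed).
real_data <- data.frame(feature1 = c(2, 1, 3),
                        feature2 = c('D', 'B', 'C'),
                        stringsAsFactors = TRUE)
real_pool <- catboost.load_pool(real_data)
prediction <- catboost.predict(model, real_pool)
print(prediction)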
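
The steps above mention two variations that the example does not exercise: passing a numeric matrix as the data argument, and passing a validation dataset as the second argument of catboost.train. The following is a minimal sketch of both, not a reference implementation. The data is randomly generated, the variable names (train_matrix, valid_pool and so on) are illustrative, and the use of prediction_type = 'Probability' assumes the installed catboost.predict supports that option.

```r
library(catboost)

# Synthetic numeric data (illustrative values, not from the original example).
set.seed(42)
train_matrix <- matrix(rnorm(200), ncol = 2)
train_labels <- as.numeric(train_matrix[, 1] + train_matrix[, 2] > 0)
valid_matrix <- matrix(rnorm(60), ncol = 2)
valid_labels <- as.numeric(valid_matrix[, 1] + valid_matrix[, 2] > 0)

# The data argument accepts a numeric matrix as well as a data.frame.
train_pool <- catboost.load_pool(data = train_matrix, label = train_labels)
valid_pool <- catboost.load_pool(data = valid_matrix, label = valid_labels)

# Pass the validation pool as the second argument instead of NULL.
model <- catboost.train(
  train_pool,
  valid_pool,
  params = list(loss_function = 'Logloss', iterations = 50, metric_period = 10)
)

# Assumption: 'Probability' is an available prediction type for Logloss models.
probabilities <- catboost.predict(model, valid_pool, prediction_type = 'Probability')
print(head(probabilities))
```

With a validation pool supplied, the training log reports the metric on both datasets every metric_period iterations, which makes it easier to judge whether the iteration count is reasonable.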