Overfitting detector

If overfitting occurs, CatBoost can stop training earlier than the training parameters dictate, for example before the specified number of trees has been built. The overfitting detector is configured in the starting parameters.
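As a rough illustration of what "set in the starting parameters" can look like, the sketch below builds a parameter dictionary for the CatBoost Python package. The parameter names `od_type`, `od_pval` and `od_wait` are not named on this page and are assumed here from the Python package; the values are arbitrary examples, not recommendations.

```python
# Illustrative parameter set; the names od_type / od_pval / od_wait are
# assumed from the CatBoost Python package and do not appear on this page.
params = {
    "iterations": 1000,      # upper bound on the number of trees
    "od_type": "IncToDec",   # detector type: "IncToDec" or "Iter"
    "od_pval": 1e-3,         # example Threshold for the IncToDec detector
    "od_wait": 20,           # example iteration limit for the Iter detector
}

# With the package installed, the dictionary would typically be unpacked
# into a model constructor, e.g. CatBoostClassifier(**params), and the
# detector only has an effect when a validation dataset is passed to fit().
```

The detector compares values measured on the validation dataset, so it has nothing to act on when no such dataset is supplied.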

The following overfitting detection methods are supported:

IncToDec

Before building each new tree, CatBoost checks the resulting loss change on the validation dataset. The overfitting detector is triggered if $CurrentPValue$ falls below the $Threshold$ value set in the starting parameters:

$CurrentPValue < Threshold$

$CurrentPValue$ is calculated from the values $score[i]$ of the metric being maximized, in three steps:

$ExpectedInc[i]$ is calculated:

$ExpectedInc[i] = \max_{i_{1} \leq i_{2} \leq i} 0.99^{i - i_{1}} \cdot (score[i_{2}] - score[i_{1}])$

$x$ is calculated:

$x = \frac{ExpectedInc[i]}{\max_{j \leq i} score[j] - score[i]}$

$CurrentPValue$ is calculated:

$CurrentPValue = \exp \left(- \frac{0.5}{x}\right)$
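
The three steps above can be sketched in plain Python. This is a direct transcription of the formulas, not CatBoost's internal implementation. The function name `current_p_value`, the `decay` argument and the handling of the two degenerate cases (no drop from the best score, and no observed increase) are assumptions made for illustration.

```python
import math


def current_p_value(score, decay=0.99):
    """Return CurrentPValue for the last iteration of `score`.

    `score` holds the values of a metric that is being maximized,
    one value per iteration. Hypothetical helper, formulas as stated above.
    """
    i = len(score) - 1

    # ExpectedInc[i]: largest decayed gain over all pairs i1 <= i2 <= i.
    expected_inc = 0.0
    for i1 in range(i + 1):
        best_later = max(score[i1:])  # max of score[i2] over i2 >= i1
        gain = decay ** (i - i1) * (best_later - score[i1])
        expected_inc = max(expected_inc, gain)

    # Denominator of x: how far the current score is below the best so far.
    drop = max(score) - score[i]

    # Assumption: no drop means x is unbounded, so the p-value tends to 1.
    if drop == 0:
        return 1.0
    # Assumption: a drop with no prior gain means x = 0, so the p-value is 0.
    if expected_inc == 0:
        return 0.0

    x = expected_inc / drop
    return math.exp(-0.5 / x)


# The detector fires when the p-value falls below the threshold.
scores = [0.5, 0.7, 0.6]
threshold = 0.9  # deliberately large so that this tiny example triggers
triggered = current_p_value(scores) < threshold
```

A larger drop from the best score relative to the expected increase gives a smaller $x$ and therefore a smaller $CurrentPValue$, which is why the detector fires when the value falls below $Threshold$.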

Iter

Before building each new tree, CatBoost checks the number of iterations since the iteration with the optimal loss function value.

The model is considered overfitted if the number of iterations exceeds the value specified in the training parameters.
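
The rule can be sketched as a small Python function. The name `iter_detector_stop` and the `wait` argument are hypothetical, and the strict comparison follows the word "exceeds" in the description above; the library's exact boundary behavior may differ.

```python
def iter_detector_stop(losses, wait):
    """Return the iteration index at which training would stop, or None.

    `losses` holds the validation loss per iteration (lower is better);
    `wait` is the allowed number of iterations since the best one.
    """
    best_i = 0
    for i, loss in enumerate(losses):
        if loss < losses[best_i]:
            best_i = i  # new optimal loss value found
        if i - best_i > wait:
            return i  # too many iterations since the optimum
    return None


losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.74, 0.9]
stop_at = iter_detector_stop(losses, wait=2)  # best loss is at index 2
```

In this example the best loss occurs at iteration 2, and iteration 5 is the first one that lies more than 2 iterations past it, so `stop_at` is 5.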
