Objectives and metrics

This section describes the metrics supported for various machine learning problems.

Metrics can be calculated during the training or separately from the training for a specified model. The calculated values are written to files and can be plotted by visualization tools (both during and after the training) for further analysis.

User-defined parameters

Some metrics accept user-defined parameters. These parameters must be set together with the metric name when the metric is specified.

The parameters for each metric are set in the following format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

The supported parameters vary from one metric to another and are listed alongside the corresponding descriptions.

Usage examples
Quantile:alpha=0.1
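
As a minimal sketch, a metric description in the format above can be assembled programmatically. The helper name `format_metric` is hypothetical and not part of any library; it only joins a metric name and its parameters into the `<Metric>:<parameter>=<value>;..` form.

```python
def format_metric(name, **params):
    """Build a metric description string such as 'Quantile:alpha=0.1'.

    Hypothetical helper for illustration only; it is not a library function.
    """
    if not params:
        # A metric without parameters is written as the bare name.
        return name

    def render(value):
        # Booleans are written in lowercase ("true"/"false").
        return str(value).lower() if isinstance(value, bool) else str(value)

    # Parameters are written as key=value pairs separated by semicolons.
    joined = ";".join(f"{key}={render(value)}" for key, value in params.items())
    return f"{name}:{joined}"


print(format_metric("Quantile", alpha=0.1))  # Quantile:alpha=0.1
print(format_metric("Logloss"))              # Logloss
```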

List of most important parameters

The following table contains the description of parameters that are used in several metrics. The default values vary from one metric to another and are listed alongside the corresponding descriptions.

Parameter Description
use_weights

Use object/group weights to calculate metrics if the specified value is “true” and set all weights to “1” regardless of the input data if the specified value is “false”.

Note.

This parameter cannot be used with the optimized objective. If weights are present, they are always used to calculate the optimized objective; this behaviour cannot be disabled.

top

The number of top samples in a group that are used to calculate the ranking metric. Top samples are either the samples with the largest approx values or the ones with the lowest target values if approx values are the same.
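
The effect of these two parameters can be illustrated with a small sketch. It is not the library's implementation: the mean absolute error is used here only as a stand-in metric, and the function names `mean_absolute_error` and `top_indices` are hypothetical.

```python
def mean_absolute_error(targets, approxes, weights=None, use_weights=True):
    """Weighted mean absolute error (stand-in metric for illustration).

    If use_weights is False, every weight is replaced by 1 regardless of
    the weights passed in.
    """
    if weights is None or not use_weights:
        weights = [1.0] * len(targets)
    total = sum(w * abs(t - a) for t, a, w in zip(targets, approxes, weights))
    return total / sum(weights)


def top_indices(targets, approxes, top):
    """Indices of the `top` samples of a single group.

    Samples are ordered by the largest approx value first; ties in the
    approx value are broken in favour of the lowest target value.
    """
    order = sorted(range(len(approxes)), key=lambda i: (-approxes[i], targets[i]))
    return order[:top]


targets = [1.0, 2.0]
approxes = [0.0, 0.0]
weights = [3.0, 1.0]
print(mean_absolute_error(targets, approxes, weights, use_weights=True))   # 1.25
print(mean_absolute_error(targets, approxes, weights, use_weights=False))  # 1.5
print(top_indices([1, 0, 0, 1], [0.9, 0.5, 0.9, 0.1], top=2))              # [2, 0]
```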

Enable, disable and configure metrics calculation

The calculation of metrics can be resource-intensive. It creates a bottleneck in some cases, for example, if many metrics are calculated during the training or the computation is performed on GPU.

The training can be sped up by disabling the calculation of some metrics for the training dataset. Use the hints=skip_train~true parameter to disable the calculation of the specified metrics.

Note.

The calculation of some metrics is disabled by default for the training dataset to speed up the training. Use the hints=skip_train~false parameter to enable the calculation.

Metrics that are not calculated by default for the training dataset
  • PFound
  • YetiRank
  • NDCG
  • YetiRankPairwise
  • AUC
  • NormalizedGini
  • FilteredDCG
  • DCG

Usage examples
Enable the calculation of the AUC metric:
AUC:hints=skip_train~false
Disable the calculation of the Logloss metric:
Logloss:hints=skip_train~true
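
The hint strings can also be generated, for example when the same setting is applied to several metrics. The sketch below is hypothetical: `with_skip_train` is not a library function, and joining the hint to existing parameters with a semicolon is an assumption based on the parameter format described earlier.

```python
def with_skip_train(metric, skip):
    """Append the skip_train hint to a metric description string.

    Hypothetical helper for illustration only.
    """
    flag = "true" if skip else "false"
    # Assumption: a bare metric name takes ':' before its first parameter,
    # while a metric that already has parameters takes ';' before the next one.
    separator = ";" if ":" in metric else ":"
    return f"{metric}{separator}hints=skip_train~{flag}"


metrics = [
    with_skip_train("AUC", skip=False),     # enable AUC on the training dataset
    with_skip_train("Logloss", skip=True),  # disable Logloss on the training dataset
]
print(metrics)  # ['AUC:hints=skip_train~false', 'Logloss:hints=skip_train~true']
```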

Another way to speed up the training is to set how often, in iterations, the values of metrics are calculated. Use one of the following parameters:

Command-line version parameters    Python parameters    R parameters
--metric-period                    metric_period        metric_period

For example, use the following parameter in Python or R to calculate metrics once per 50 iterations:
metric_period=50
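
A possible training configuration that combines both speed-ups is sketched below. It assumes the Python package is CatBoost, where `iterations`, `loss_function`, `custom_metric` and `metric_period` are constructor parameters; the concrete values are illustrative only, and the model is neither built nor trained here.

```python
# Illustrative configuration; the values are assumptions, not recommendations.
params = {
    "iterations": 500,
    "loss_function": "Logloss",
    # AUC is skipped on the training dataset by default, so enable it explicitly.
    "custom_metric": ["AUC:hints=skip_train~false"],
    # Calculate metric values once per 50 iterations.
    "metric_period": 50,
}

# With CatBoost installed, the dictionary could be passed as keyword
# arguments, e.g. CatBoostClassifier(**params) (assumed, not executed here).
print(params["metric_period"])  # 50
```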

Variables used

The following common variables are used in formulas of the described metrics:

  • $t_i$ is the label value for the i-th object (from the input data for training).
  • $a_i$ is the result of applying the model to the i-th object.
  • $p_i$ is the predicted success probability.
  • $N$ is the total number of objects.
  • $M$ is the number of classes.
  • $c_i$ is the class of the object for binary classification.
  • $w_i$ is the weight of the i-th object. It is set in the dataset description in columns with the Weight type (unless otherwise stated) or in the sample_weight parameter of the Python package. The default is 1 for all objects.
  • $P$, $TP$, $TN$, $FP$, $FN$ are abbreviations for Positive, True Positive, True Negative, False Positive and False Negative.

    By default, $P$, $TP$, $TN$, $FP$, $FN$ use weights. For example, $TP$ is the sum of the weights $w_i$ of the true positive objects rather than their count.

  • $Pairs$ is the array of pairs specified in the Pairs description or in the pairs parameter of the Python package.
  • $N_{Pairs}$ is the number of pairs for the Pairwise metrics.
  • $a_p$ is the value calculated using the resulting model for the winner object for the Pairwise metrics.
  • $a_n$ is the value calculated using the resulting model for the loser object for the Pairwise metrics.
  • $w_{pn}$ is the weight of the ($p$; $n$) pair for the Pairwise metrics.
  • $Group$ is the array of object identifiers from the input dataset with a common GroupId. It is used to calculate the Groupwise metrics.
  • $Groups$ is the set of all arrays of identifiers from the input dataset with a common GroupId. It is used to calculate the Groupwise metrics.
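
To make the weighted counts concrete, the sketch below computes True Positive, True Negative, False Positive and False Negative as sums of object weights for a binary problem. It rests on two assumptions that the list above does not state: the predicted success probability is the sigmoid of the model output, and the predicted class is 1 when that probability exceeds 0.5. The function names are hypothetical.

```python
import math


def sigmoid(value):
    """Map a raw model output to a probability (assumed link function)."""
    return 1.0 / (1.0 + math.exp(-value))


def weighted_confusion(labels, approxes, weights=None, threshold=0.5):
    """Return weighted TP, TN, FP and FN for binary labels in {0, 1}.

    Each object contributes its weight instead of 1; with no weights given,
    every weight defaults to 1 and the results are plain counts.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    counts = {"TP": 0.0, "TN": 0.0, "FP": 0.0, "FN": 0.0}
    for label, approx, weight in zip(labels, approxes, weights):
        # Assumed decision rule: class 1 if the probability exceeds the threshold.
        predicted = 1 if sigmoid(approx) > threshold else 0
        if predicted == 1:
            counts["TP" if label == 1 else "FP"] += weight
        else:
            counts["FN" if label == 1 else "TN"] += weight
    return counts


labels = [1, 0, 1, 0]
approxes = [2.0, 1.0, -1.0, -3.0]
weights = [2.0, 1.0, 0.5, 1.0]
print(weighted_confusion(labels, approxes, weights))
# {'TP': 2.0, 'TN': 1.0, 'FP': 1.0, 'FN': 0.5}
```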