Ranking: objectives and metrics

Pairwise metrics

Pairwise metrics use specially labeled information: pairs of dataset objects where one object is considered the “winner” and the other the “loser”. This information does not have to be exhaustive (not all possible pairs of objects need to be labeled). It is also possible to specify a weight for each pair.

If GroupId is specified and the dataset is used in pairwise modes, both members of each pair must belong to the same group.

If the labeled pairs data is not specified for the dataset, pairs are generated automatically within each group using per-object label values (labels must be specified and must be numerical). The object with the greater label value in a pair is considered the “winner”.
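
For illustration, here is how explicitly labeled pairs, pair weights, and groups might be passed with the Python package (a minimal sketch; the feature values, pair indices, and weights are made up, and PairLogit is used as the pairwise objective):

    from catboost import CatBoost, Pool

    # Toy data: 6 objects in 2 groups (made-up values).
    X = [[0.1, 3], [0.2, 1], [0.9, 2],   # group 0
         [0.4, 5], [0.5, 2], [0.3, 7]]   # group 1
    group_id = [0, 0, 0, 1, 1, 1]

    # Labeled pairs as (winner_index, loser_index); both members of
    # each pair belong to the same group, as required.
    pairs = [[0, 1], [2, 1],             # pairs inside group 0
             [3, 4], [5, 4]]             # pairs inside group 1
    pairs_weight = [1.0, 2.0, 1.0, 0.5]  # optional per-pair weights

    train = Pool(X, group_id=group_id, pairs=pairs, pairs_weight=pairs_weight)
    model = CatBoost({'loss_function': 'PairLogit', 'iterations': 50,
                      'verbose': False})
    model.fit(train)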

Specific variables used

The following variables are used in the formulas of the described pairwise metrics:
  • $p$ is the positive (“winner”) object in the pair.
  • $n$ is the negative (“loser”) object in the pair.
  • $a_p$ and $a_n$ are the model predictions (approxes) for these objects, and $w_{pn}$ is the weight of the pair.

Objectives and metrics

Each metric below lists its name, whether it can be used for optimization, its user-defined parameters, and its formula and/or description.

PairLogit

Used for optimization: yes.

Calculation principles

$PairLogit = \dfrac{\sum_{(p,n) \in Pairs} w_{pn} \, \log\left(1 + e^{-(a_p - a_n)}\right)}{\sum_{(p,n) \in Pairs} w_{pn}}$

Note.

The object weights are not used to calculate and optimize the value of this metric. The weights of object pairs are used instead.

PairLogitPairwise

Used for optimization: yes.

Calculation principles

This metric may give more accurate results on large datasets than PairLogit, but it is significantly slower to calculate.

This technique is described in the Winning The Transfer Learning Track of Yahoo!’s Learning To Rank Challenge with YetiRank paper.

Note.

The object weights are not used to calculate and optimize the value of this metric. The weights of object pairs are used instead.

PairAccuracy

Used for optimization: no.

User-defined parameters: use_weights (default: true).

Calculation principles

$PairAccuracy = \dfrac{\sum_{(p,n) \in Pairs} w_{pn} \cdot [a_p > a_n]}{\sum_{(p,n) \in Pairs} w_{pn}}$

Note.

The object weights are not used to calculate the value of this metric. The weights of object pairs are used instead.
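
As a sanity check, both formulas above can be evaluated directly. The following NumPy sketch mirrors the formulas as written (the predictions, pairs, and weights are made up; this is not CatBoost's internal code):

    import numpy as np

    a = np.array([2.0, 0.5, 1.0, -0.3])         # model predictions (approxes)
    pairs = np.array([[0, 1], [2, 3], [1, 3]])  # (positive p, negative n)
    w = np.array([1.0, 1.0, 2.0])               # pair weights w_{pn}

    diff = a[pairs[:, 0]] - a[pairs[:, 1]]      # a_p - a_n for every pair
    pair_logit = np.sum(w * np.log1p(np.exp(-diff))) / np.sum(w)
    pair_accuracy = np.sum(w * (diff > 0)) / np.sum(w)
    print(pair_logit, pair_accuracy)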


Groupwise metrics

Each metric below lists its name, whether it can be used for optimization, its user-defined parameters, and its formula and/or description.

YetiRank

Used for optimization: yes.

An approximation of ranking metrics (such as NDCG and PFound); it allows ranking metrics to be used for optimization.

The value of this metric cannot be calculated. The metric that is written to output data when YetiRank is optimized depends on the range of all N target values ($t_1, \ldots, t_N$) of the dataset:
  • If $t_i \in [0, 1]$ for all $i$, PFound is written.
  • Otherwise, NDCG is written.

This metric gives less accurate results on big datasets than YetiRankPairwise, but it is significantly faster.

Note.

The object weights are not used to optimize this metric. The group weights are used instead.

This objective is used to optimize PairLogit. Automatically generated object pairs are used for this purpose; these pairs are generated independently within each object group. Use the Group weights file or the GroupWeight column of the Columns description file to change the group importance. In this case, the weight of each generated pair is multiplied by the value of the corresponding group weight.
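
In the Python package, the same effect as the Group weights file or the GroupWeight column can be sketched with the group_weight argument of Pool (made-up data; every object in a group must carry the same group weight):

    from catboost import CatBoost, Pool

    X = [[0.1], [0.2], [0.9], [0.4], [0.5], [0.3]]  # made-up features
    y = [0.0, 0.3, 1.0, 0.2, 0.8, 0.5]              # relevance labels
    group_id = [0, 0, 0, 1, 1, 1]
    group_weight = [2.0, 2.0, 2.0, 1.0, 1.0, 1.0]   # group 0 counts twice

    train = Pool(X, label=y, group_id=group_id, group_weight=group_weight)

    # All targets lie in [0, 1], so PFound is written to the output
    # while YetiRank is being optimized.
    model = CatBoost({'loss_function': 'YetiRank', 'iterations': 50,
                      'verbose': False})
    model.fit(train)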

YetiRankPairwise

Used for optimization: yes.

An approximation of ranking metrics (such as NDCG and PFound); it allows ranking metrics to be used for optimization.

The value of this metric cannot be calculated. The metric that is written to output data when YetiRankPairwise is optimized depends on the range of all N target values ($t_1, \ldots, t_N$) of the dataset:
  • If $t_i \in [0, 1]$ for all $i$, PFound is written.
  • Otherwise, NDCG is written.

This metric gives more accurate results on big datasets than YetiRank, but it is significantly slower.

This technique is described in the Winning The Transfer Learning Track of Yahoo!’s Learning To Rank Challenge with YetiRank paper.

Note.

The object weights are not used to optimize this metric. The group weights are used instead.

This objective is used to optimize PairLogit. Automatically generated object pairs are used for this purpose; these pairs are generated independently within each object group. Use the Group weights file or the GroupWeight column of the Columns description file to change the group importance. In this case, the weight of each generated pair is multiplied by the value of the corresponding group weight.
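
Because the value of the loss itself cannot be calculated, a practical pattern is to query the underlying ranking metrics explicitly, for example with eval_metrics (a sketch with made-up data and parameters):

    from catboost import CatBoost, Pool

    X = [[0.1], [0.2], [0.9], [0.4], [0.5], [0.3]]
    y = [0.0, 0.3, 1.0, 0.2, 0.8, 0.5]
    train = Pool(X, label=y, group_id=[0, 0, 0, 1, 1, 1])

    model = CatBoost({'loss_function': 'YetiRankPairwise', 'iterations': 20,
                      'verbose': False})
    model.fit(train)

    # Compute the ranking metrics per iteration on a pool.
    scores = model.eval_metrics(train, metrics=['NDCG', 'PFound'])
    for name, values in scores.items():
        print(name, values[-1])   # value after the last iteration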

StochasticFilter

Used for optimization: yes.

Directly optimizes the FilteredDCG metric, which is calculated for a predefined order of objects; the model learns to filter objects under a fixed ranking. As a result, the FilteredDCG metric can be used for optimization.

$t_i$ is the relevance of the $i$-th object in the group, and the sum is computed over the documents with positive raw formula values.

The filtration is defined via the raw formula value: $f_i = [a_i > 0]$. Zeros correspond to filtered instances and ones correspond to the remaining ones.

The ranking is defined by the order of objects in the dataset.

Attention.

Sort objects by the column you are interested in before training with this loss function, and use the has_time option to prevent further reordering of objects.

For optimization, a distribution over filtrations is defined: each object's inclusion is modeled as an independent Bernoulli random variable parameterized by the raw formula value.
  • The gradient is estimated via REINFORCE.

Refer to the Learning to Select for a Predefined Ranking paper for calculation details.
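
To make the REINFORCE step concrete, here is a schematic NumPy sketch, not CatBoost's implementation. It assumes a Bernoulli filtration with keep probability sigmoid(a_i) and the common 1/log2(pos + 1) DCG discount over the kept objects in their fixed order; both details are assumptions, so refer to the paper for the exact formulation:

    import numpy as np

    rng = np.random.default_rng(0)

    def filtered_dcg(t, keep):
        # DCG over the kept objects in their fixed order, with positions
        # renumbered among the kept ones (assumed discount: 1/log2(pos + 1)).
        kept = t[keep.astype(bool)]
        pos = np.arange(1, kept.size + 1)
        return float(np.sum(kept / np.log2(pos + 1)))

    def reinforce_grad(a, t, n_samples=1000):
        # Distribution over filtrations: keep_i ~ Bernoulli(sigmoid(a_i)).
        p = 1.0 / (1.0 + np.exp(-a))
        grad = np.zeros_like(a)
        for _ in range(n_samples):
            keep = (rng.random(a.size) < p).astype(float)
            reward = filtered_dcg(t, keep)
            # For a Bernoulli parameterized through the logit a_i,
            # d log P(keep | a) / d a_i = keep_i - p_i.
            grad += reward * (keep - p)
        return grad / n_samples

    t = np.array([3.0, 1.0, 2.0, 0.0])    # relevances, fixed order
    a = np.array([0.5, -0.2, 0.1, -1.0])  # raw formula values
    print(reinforce_grad(a, t))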

StochasticRank

Used for optimization: yes.

Common parameters: