QueryCrossEntropy

Consider a classification problem on a dataset with grouped objects. For example, the task may be to predict user clicks on a search engine results page.

Generally, this task can be solved by optimizing the Logloss function:
$$Logloss = \displaystyle\frac{1}{\sum\limits_{i = 1}^{N} w_{i}} \sum_{group} \left( \sum_{obj\_in\_group} w_{i} \left(t_{i} \cdot log(p_{i}) + (1 - t_{i}) \cdot log(1 - p_{i}) \right) \right)$$

  • $t_{i}$ is the label value for the i-th object (from the input data for training). Possible values are in the range $[0;1]$.
  • $a_{i}$ is the raw formula prediction used in Logloss.
  • $p_{i}$ is the predicted probability that the object belongs to the positive class: $p_{i} = \sigma(a_{i})$ (refer to the "Logistic function, odds, odds ratio, and logit" section of the Logistic regression article in Wikipedia for details).
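
The formula above can be sketched in plain Python. This is a minimal illustration, not CatBoost's implementation: the function names `sigmoid` and `logloss` and the sample data are assumptions made for the example. The sketch follows the formula exactly as written, so the result is a weighted average log-likelihood (never positive); a quantity to be minimized would be its negation.

```python
import math


def sigmoid(a):
    """Logistic function: maps a raw prediction to a probability."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # avoids overflow for large negative inputs
    return e / (1.0 + e)


def logloss(targets, raw_preds, weights, group_ids):
    """Weighted Logloss, accumulated group by group as in the formula."""
    per_group = {}
    for t, a, w, g in zip(targets, raw_preds, weights, group_ids):
        p = sigmoid(a)
        term = w * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
        per_group[g] = per_group.get(g, 0.0) + term
    # Normalize by the total weight of all N objects.
    return sum(per_group.values()) / sum(weights)


# Hypothetical data: two groups (for example, two search queries).
targets = [1.0, 0.0, 1.0, 0.0]
raw_preds = [0.0, 0.0, 2.0, -1.0]
weights = [1.0, 1.0, 1.0, 1.0]
group_ids = ["q1", "q1", "q2", "q2"]

value = logloss(targets, raw_preds, weights, group_ids)
print(value)
```

Because the inner sums are simply added together, grouping does not change the value of the plain Logloss; it only becomes relevant once a per-group term is introduced.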

Since the internal structure of the data is known, it can be assumed that predictions differ between groups. This can be modeled by adding a $shift\_group$ term to each formula prediction within a group:
$$\bar p_{i} = \sigma(a_{i} + shift\_group)$$
The $shift\_group$ parameter is jointly optimized for each group during training.

In this case, the Logloss formula for grouped objects takes the following form:
$$Logloss_{group} = \displaystyle\frac{1}{\sum\limits_{i = 1}^{N} w_{i}} \sum_{group} \left( \sum_{obj\_in\_group} w_{i} \left( t_{i} \cdot log(\bar p_{i}) + (1 - t_{i}) \cdot log(1 - \bar p_{i}) \right) \right)$$
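
The grouped formula can be illustrated with a short Python sketch. The source only states that the shift is optimized per group during training, so the fitting procedure here is an assumption: `fit_group_shift` (a hypothetical helper) picks the shift that maximizes the group's weighted log-likelihood using a few Newton steps. All names and data are illustrative and do not reflect CatBoost internals.

```python
import math


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def fit_group_shift(targets, raw_preds, weights, n_iter=20):
    """Assumed procedure: Newton's method on the group's log-likelihood."""
    shift = 0.0
    for _ in range(n_iter):
        probs = [sigmoid(a + shift) for a in raw_preds]
        # First derivative of the group's log-likelihood w.r.t. the shift.
        grad = sum(w * (t - p) for t, p, w in zip(targets, probs, weights))
        # Negated second derivative (always non-negative).
        hess = sum(w * p * (1.0 - p) for p, w in zip(probs, weights))
        if hess < 1e-12:  # saturated probabilities: stop updating
            break
        shift += grad / hess
    return shift


def logloss_group(targets, raw_preds, weights, group_ids, shifts):
    """Logloss with a per-group shift added to every raw prediction."""
    total = 0.0
    for t, a, w, g in zip(targets, raw_preds, weights, group_ids):
        p_bar = sigmoid(a + shifts[g])
        total += w * (t * math.log(p_bar) + (1.0 - t) * math.log(1.0 - p_bar))
    return total / sum(weights)


# Hypothetical data: two groups with mixed labels.
targets = [1.0, 1.0, 0.0, 1.0, 0.0, 0.0]
raw_preds = [0.5, -0.2, 0.1, -1.0, -2.0, -1.5]
weights = [1.0] * 6
group_ids = ["q1", "q1", "q1", "q2", "q2", "q2"]

shifts = {}
for g in set(group_ids):
    idx = [i for i, gid in enumerate(group_ids) if gid == g]
    shifts[g] = fit_group_shift(
        [targets[i] for i in idx],
        [raw_preds[i] for i in idx],
        [weights[i] for i in idx],
    )

zero_shifts = {g: 0.0 for g in shifts}
unshifted_value = logloss_group(targets, raw_preds, weights, group_ids, zero_shifts)
shifted_value = logloss_group(targets, raw_preds, weights, group_ids, shifts)
print(shifts, unshifted_value, shifted_value)
```

With all shifts set to zero the function reproduces the plain Logloss, which makes the two variants easy to compare on the same data. A group whose labels are all 0 or all 1 has no finite optimal shift, so the sketch caps the number of iterations.
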
The QueryCrossEntropy metric is calculated as follows:
$$QueryCrossEntropy(\alpha) = (1 - \alpha) \cdot Logloss + \alpha \cdot Logloss_{group}$$
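
The mixing step itself is a linear interpolation between the two values. The sketch below shows it in Python; the function name `query_cross_entropy` and the two input values are made up for illustration and are not produced by any real model.

```python
def query_cross_entropy(logloss_value, logloss_group_value, alpha):
    """Mix the regular and the group-shifted Logloss values."""
    return (1.0 - alpha) * logloss_value + alpha * logloss_group_value


# Hypothetical values of the two Logloss variants on the same dataset.
regular = -0.62
shifted = -0.55

only_regular = query_cross_entropy(regular, shifted, alpha=0.0)
only_shifted = query_cross_entropy(regular, shifted, alpha=1.0)
half_and_half = query_cross_entropy(regular, shifted, alpha=0.5)
print(only_regular, only_shifted, half_and_half)
```

Setting `alpha` to 0 keeps only the regular term, setting it to 1 keeps only the group-shifted term, and intermediate values weight the two proportionally.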

User-defined parameters

Parameter: alpha

Description

The coefficient that defines how the regular ($Logloss$) and shifted ($Logloss_{group}$) versions of the Logloss function are mixed. With $\alpha = 0$ the metric reduces to $Logloss$; with $\alpha = 1$ it reduces to $Logloss_{group}$.