Calculation principles

The calculation of this function consists of the following steps:

  1. Model values are calculated for the objects from the input dataset.

  2. The top $M$ model values are selected for each group. The quantity $M$ is user-defined.

    For example, let's assume that the number of top model values is limited to 2 and the following values are calculated for the input dataset:

    Document ID    Model value
    1              10.4
    2              20.1
    3              1.1

    In this case, the objects with document IDs 2 and 1 are selected, because they have the two largest model values (20.1 and 10.4).

  3. The average of the label values is calculated for the objects selected at step 2.

    For example, if the dataset consists of one group and the documents match the ones mentioned in the description of step 2, the QueryAverage metric is calculated as follows:

    $QueryAverage = \displaystyle\frac{LabelValue_{object2} + LabelValue_{object1}}{2}$
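
The three steps above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the function name `query_average_per_group`, its `top` argument (standing in for $M$), and the label values are assumptions made for the example, since the text above does not list labels. The sketch returns one value per group and does not attempt to show how values from several groups are combined.

```python
from collections import defaultdict


def query_average_per_group(model_values, labels, group_ids, top):
    """Return {group_id: mean label of the `top` objects with the largest model values}.

    Hypothetical helper for illustration only.
    """
    # Step 1 is assumed to be done already: `model_values` holds the model output.
    groups = defaultdict(list)
    for value, label, group in zip(model_values, labels, group_ids):
        groups[group].append((value, label))

    result = {}
    for group, items in groups.items():
        # Step 2: keep the objects with the largest model values.
        selected = sorted(items, key=lambda pair: pair[0], reverse=True)[:top]
        # Step 3: average the label values of the selected objects.
        result[group] = sum(label for _, label in selected) / len(selected)
    return result


# Model values from the table above; the labels are made up for this example.
model_values = [10.4, 20.1, 1.1]
labels = [1.0, 0.0, 3.0]
group_ids = [0, 0, 0]  # a single group

per_group = query_average_per_group(model_values, labels, group_ids, top=2)
print(per_group)  # {0: 0.5}
```

With these assumed labels, documents 2 and 1 are selected, and the result is (0.0 + 1.0) / 2 = 0.5.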

User-defined parameters



The number of top samples in a group that are used to calculate the ranking metric. Top samples are either the samples with the largest approx values or the ones with the lowest target values if approx values are the same.

Default value: –1 (all label values are used)
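
The selection rule described above (largest approx value first, lowest target value on ties, and –1 meaning "use everything") can be sketched as a sort with a two-part key. The helper name `select_top` and the sample numbers are hypothetical. The sketch also assumes that a group smaller than the requested number of samples simply contributes all of its samples, which the text above does not state.

```python
def select_top(approx, target, top=-1):
    """Return the indices of the selected samples within one group.

    Hypothetical helper: sorts by approx descending, then by target ascending.
    """
    order = sorted(range(len(approx)), key=lambda i: (-approx[i], target[i]))
    # -1 means that all samples of the group are used.
    return order if top == -1 else order[:top]


approx = [0.7, 0.7, 0.2]   # samples 0 and 1 have equal approx values
target = [3.0, 1.0, 2.0]

print(select_top(approx, target, top=1))  # [1]
print(select_top(approx, target))         # [1, 0, 2]
```

Samples 0 and 1 share the same approx value, so the sample with the lower target value (index 1) ranks first.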