Recovering training after an interruption

During training, CatBoost makes snapshots, which are backup copies of intermediate results. If training is interrupted unexpectedly (for instance, the computer turns off), it can be continued from the saved state, so the completed tree-building iterations don't need to be repeated.

Snapshot saving is enabled through the training parameters described below. When it is enabled, a snapshot is made every 600 seconds by default. This period can also be changed with a training parameter.

To resume an interrupted training from a previously saved snapshot, launch the training again in the same folder with the same parameters. CatBoost then finds the snapshot and resumes training from the iteration where it stopped.
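A minimal sketch of this workflow with the Python package, assuming the fit method accepts the save_snapshot, snapshot_file and snapshot_interval parameters described below. The helper names, the file name progress.cbsnapshot and the train_dir value are illustrative choices, not names taken from this page.

```python
def snapshot_fit_kwargs(snapshot_file="progress.cbsnapshot", interval_seconds=600):
    """Keyword arguments that turn on snapshotting for a fit() call."""
    return {
        "save_snapshot": True,                  # enable snapshotting
        "snapshot_file": snapshot_file,         # hypothetical file name
        "snapshot_interval": interval_seconds,  # seconds between snapshots
    }


def train_with_snapshots(X, y, iterations=1000):
    # Imported lazily so the helper above works without catboost installed.
    from catboost import CatBoostRegressor

    # train_dir is an assumption here: it pins the working folder so that
    # a relaunch looks for the snapshot in the same place.
    model = CatBoostRegressor(iterations=iterations, train_dir="snapshot_demo")
    model.fit(X, y, **snapshot_fit_kwargs())
    return model
```

If the process is interrupted, calling train_with_snapshots again with the same data and the same arguments should resume from the saved snapshot rather than start over, as described above.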

Method

Parameters

save_snapshot

Enable snapshotting for restoring the training progress after an interruption. If enabled, the default period for making snapshots is 600 seconds. Use the snapshot_interval parameter to change this period.

Set this parameter to "True".

snapshot_file

The name of the file to save the training progress information in. This file is used for recovering training after an interruption.

Depending on whether the specified file exists in the file system:

  • Missing — Write information about training progress to the specified file.
  • Exists — Load data from the specified file and continue training from where it left off.
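The two cases above can be sketched as a small check on the file path. This only illustrates the rule as stated here and is not CatBoost's own code; the function name snapshot_action and the returned labels are hypothetical.

```python
import os


def snapshot_action(snapshot_file):
    """Return what a training run would do with the given snapshot path."""
    if os.path.exists(snapshot_file):
        # Exists: load the saved progress and continue training.
        return "resume"
    # Missing: start training and write progress to this file.
    return "start"
```

A check like this can be useful before launching a long run, for example to log whether the run is expected to start from scratch or to pick up an earlier one.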

Note

This parameter is not supported in the params parameter of the cv function.

snapshot_interval

The interval between saving snapshots in seconds.

The first snapshot is taken after the specified number of seconds since the start of training. Every subsequent snapshot is taken after the specified number of seconds since the previous one. The last snapshot is taken at the end of the training.
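The schedule described above can be worked through with a short calculation. This is an idealized sketch: it assumes the total training time is known in advance and ignores any alignment of snapshots to iteration boundaries. The function name snapshot_times is hypothetical.

```python
def snapshot_times(training_seconds, interval_seconds=600):
    """Times, in seconds from the start, at which snapshots would be taken."""
    # One snapshot per full interval that elapses before training ends.
    times = list(range(interval_seconds, training_seconds, interval_seconds))
    # The last snapshot is taken at the end of training.
    times.append(training_seconds)
    return times


print(snapshot_times(1500))      # [600, 1200, 1500]
print(snapshot_times(300, 600))  # [300]
```

With the default 600-second interval, a 25-minute run produces snapshots at 10 and 20 minutes plus a final one at the end, while a run shorter than the interval produces only the final snapshot.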

Method catboost.train

Parameters

save_snapshot

Enable snapshotting for restoring the training progress after an interruption. If enabled, the default period for making snapshots is 600 seconds. Use the snapshot_interval parameter to change this period.

Set this parameter to "True".

snapshot_file

The name of the file to save the training progress information in. This file is used for recovering training after an interruption.

Depending on whether the specified file exists in the file system:

  • Missing — Write information about training progress to the specified file.
  • Exists — Load data from the specified file and continue training from where it left off.

Note

This parameter is not supported in the params parameter of the cv function.

snapshot_interval

The interval between saving snapshots in seconds.

The first snapshot is taken after the specified number of seconds since the start of training. Every subsequent snapshot is taken after the specified number of seconds since the previous one. The last snapshot is taken at the end of the training.

Command catboost fit

Command keys

--snapshot-file

Settings for recovering training after an interruption.

Depending on whether the specified file exists in the file system:

  • Missing — Write information about training progress to the specified file.
  • Exists — Load data from the specified file and continue training from where it left off.

Use this parameter to enable snapshotting.

--snapshot-interval

The interval between saving snapshots in seconds.

The first snapshot is taken after the specified number of seconds since the start of training. Every subsequent snapshot is taken after the specified number of seconds since the previous one. The last snapshot is taken at the end of the training.
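As an illustration, the two keys can be combined into a single catboost fit invocation. The sketch below only assembles the argument list in Python and prints it; the --learn-set key and the file names train.tsv and progress.cbsnapshot are assumptions that do not come from this page.

```python
import shlex


def catboost_fit_command(learn_set, snapshot_file, interval_seconds=600):
    """Build the argument list for a `catboost fit` run with snapshotting."""
    return [
        "catboost", "fit",
        "--learn-set", learn_set,            # assumed key for the training data
        "--snapshot-file", snapshot_file,    # enables snapshotting
        "--snapshot-interval", str(interval_seconds),
    ]


cmd = catboost_fit_command("train.tsv", "progress.cbsnapshot", 300)
print(shlex.join(cmd))
# catboost fit --learn-set train.tsv --snapshot-file progress.cbsnapshot --snapshot-interval 300
```

The list can be passed to subprocess.run(cmd, check=True). Running the same command again from the same folder after an interruption should continue from the existing snapshot file.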