optuna_integration.lightgbm.LightGBMTunerCV
- class optuna_integration.lightgbm.LightGBMTunerCV(params, train_set, num_boost_round=1000, folds=None, nfold=5, stratified=True, shuffle=True, feval=None, feature_name='auto', categorical_feature='auto', fpreproc=None, seed=0, callbacks=None, time_budget=None, sample_size=None, study=None, optuna_callbacks=None, *, show_progress_bar=True, model_dir=None, return_cvbooster=False, optuna_seed=None)[source]
Hyperparameter tuner for LightGBM with cross-validation.

It employs the same stepwise approach as LightGBMTuner. LightGBMTunerCV invokes lightgbm.cv() to train and validate boosters, while LightGBMTuner invokes lightgbm.train(). See a simple example which optimizes the validation log loss of cancer detection.

Note

Arguments and keyword arguments for lightgbm.cv() can be passed except metrics, init_model and eval_train_metric. For params, please check the official documentation for LightGBM.

The arguments that only LightGBMTunerCV has are listed below:

- Parameters:
time_budget (int | None) – A time budget for parameter tuning in seconds.

study (optuna.study.Study | None) – A Study instance to store optimization results. The Trial instances in it have the following user attributes: elapsed_secs is the elapsed time since the optimization started, average_iteration_time is the average time per iteration to train the booster model in the trial, and lgbm_params is a JSON-serialized dictionary of the LightGBM parameters used in the trial.

optuna_callbacks (list[Callable[[Study, FrozenTrial], None]] | None) – List of Optuna callback functions that are invoked at the end of each trial. Each function must accept two parameters with the following types in this order: Study and FrozenTrial. Please note that this is not the callbacks argument of lightgbm.train().

model_dir (str | None) – A directory to save boosters. By default, it is set to None and no boosters are saved. Please set a shared directory (e.g., a directory on NFS) if you want to access get_best_booster() in distributed environments; otherwise, it may raise ValueError. If the directory does not exist, it will be created. The filenames of the boosters will be {model_dir}/{trial_number}.pkl (e.g., ./boosters/0.pkl).

show_progress_bar (bool) – Flag to show progress bars or not. To disable the progress bar, set this to False.

Note

Progress bars will be fragmented by logging messages of LightGBM and Optuna. Please suppress such messages to show the progress bars properly.

return_cvbooster (bool) – Flag to enable get_best_booster().

optuna_seed (int | None) – seed of TPESampler for the random number generator that affects sampling for num_leaves, bagging_fraction, bagging_freq, lambda_l1, and lambda_l2.

Note

The deterministic parameter of LightGBM makes training reproducible. Please enable it when you use this argument.
train_set (lgb.Dataset)
num_boost_round (int)
folds (Generator[tuple[int, int], None, None] | Iterator[tuple[int, int]] | 'BaseCrossValidator' | None)
nfold (int)
stratified (bool)
shuffle (bool)
feval (Callable[..., Any] | None)
feature_name (str)
categorical_feature (str)
fpreproc (Callable[..., Any] | None)
seed (int)
callbacks (list[Callable[..., Any]] | None)
sample_size (int | None)
Methods

compare_validation_metrics(val_score, best_score)

get_best_booster()
    Return the best cvbooster.

higher_is_better()

run()
    Perform the hyperparameter-tuning with given parameters.

sample_train_set()
    Make subset of self.train_set Dataset object.

tune_bagging([n_trials])

tune_feature_fraction([n_trials])

tune_feature_fraction_stage2([n_trials])

tune_min_data_in_leaf()

tune_num_leaves([n_trials])

tune_regularization_factors([n_trials])

Attributes

best_params
    Return parameters of the best booster.

best_score
    Return the score of the best booster.
- get_best_booster()[source]
Return the best cvbooster.

If the best booster cannot be found, ValueError will be raised. To prevent this error, please save boosters by specifying both the model_dir and the return_cvbooster arguments of __init__() when you resume tuning or run tuning in parallel.
- Return type:
lgb.CVBooster
- run()
Perform the hyperparameter-tuning with given parameters.
- Return type:
None
- sample_train_set()
Make subset of self.train_set Dataset object.
- Return type:
None