optuna_integration.lightgbm.LightGBMTuner
- class optuna_integration.lightgbm.LightGBMTuner(params, train_set, num_boost_round=1000, valid_sets=None, valid_names=None, feval=None, feature_name='auto', categorical_feature='auto', keep_training_booster=False, callbacks=None, time_budget=None, sample_size=None, study=None, optuna_callbacks=None, model_dir=None, *, show_progress_bar=True, optuna_seed=None)
Hyperparameter tuner for LightGBM.
It optimizes the following hyperparameters in a stepwise manner: lambda_l1, lambda_l2, num_leaves, feature_fraction, bagging_fraction, bagging_freq, and min_child_samples.
You can find the details of the algorithm and benchmark results in this blog article by Kohei Ozaki, a Kaggle Grandmaster.
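The stepwise search can be sketched as follows. This is a minimal, non-authoritative example: the helper names default_params and tune, the binary objective, and the early-stopping patience of 20 rounds are illustrative assumptions, not part of the API described here.

```python
def default_params():
    """Fixed LightGBM parameters; the tuner searches the remaining ones."""
    return {
        "objective": "binary",       # assumption: a binary classification task
        "metric": "binary_logloss",
        "verbosity": -1,             # keep LightGBM logging quiet
    }


def tune(train_set, valid_set, num_boost_round=100):
    """Run the stepwise search and return (best_params, best_score)."""
    # Imported lazily so that default_params() works without these packages.
    import lightgbm as lgb
    from optuna_integration.lightgbm import LightGBMTuner

    tuner = LightGBMTuner(
        default_params(),
        train_set,
        num_boost_round=num_boost_round,
        valid_sets=[valid_set],  # a validation set is needed to score trials
        callbacks=[lgb.early_stopping(stopping_rounds=20)],
    )
    tuner.run()
    return tuner.best_params, tuner.best_score


# Usage (requires lightgbm and optuna-integration; X_* and y_* are your arrays):
#   import lightgbm as lgb
#   dtrain = lgb.Dataset(X_train, label=y_train)
#   dval = lgb.Dataset(X_val, label=y_val)
#   best_params, best_score = tune(dtrain, dval)
```

The tuned hyperparameters are deliberately left out of default_params so that the tuner is free to search them.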
Note
Arguments and keyword arguments for lightgbm.train() can be passed. For params, please check the official documentation for LightGBM.
The arguments that only LightGBMTuner has are listed below:
- Parameters:
time_budget (int | None) – A time budget for parameter tuning in seconds.
study (optuna.study.Study | None) – A Study instance to store optimization results. The Trial instances in it have the following user attributes:
elapsed_secs is the elapsed time since the optimization started.
average_iteration_time is the average time per iteration to train the booster model in the trial.
lgbm_params is a JSON-serialized dictionary of the LightGBM parameters used in the trial.
optuna_callbacks (list[Callable[[Study, FrozenTrial], None]] | None) – List of Optuna callback functions that are invoked at the end of each trial. Each function must accept two parameters with the following types in this order: Study and FrozenTrial. Please note that this is not the callbacks argument of lightgbm.train().
model_dir (str | None) – A directory to save boosters. By default, it is set to None and no boosters are saved. Please set a shared directory (e.g., a directory on NFS) if you want to access get_best_booster() in distributed environments. Otherwise, it may raise ValueError. If the directory does not exist, it will be created. The filenames of the boosters will be {model_dir}/{trial_number}.pkl (e.g., ./boosters/0.pkl).
show_progress_bar (bool) – Flag to show progress bars or not. To disable the progress bar, set this to False.
Note
Progress bars will be fragmented by logging messages of LightGBM and Optuna. Please suppress such messages to show the progress bars properly.
optuna_seed (int | None) – seed of TPESampler for the random number generator that affects sampling for num_leaves, bagging_fraction, bagging_freq, lambda_l1, and lambda_l2.
Note
The deterministic parameter of LightGBM makes training reproducible. Please enable it when you use this argument.
train_set (lgb.Dataset)
num_boost_round (int)
valid_sets (list['lgb.Dataset'] | tuple['lgb.Dataset', ...] | 'lgb.Dataset' | None)
valid_names (Any | None)
feval (Callable[..., Any] | None)
feature_name (str)
categorical_feature (str)
keep_training_booster (bool)
callbacks (list[Callable[..., Any]] | None)
sample_size (int | None)
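As an illustration of the study user attributes and the optuna_callbacks argument described above, the sketch below defines a callback that reads the three user attributes from each finished trial. The function names summarize_trial and log_trial are hypothetical. The code relies only on the number and user_attrs attributes of FrozenTrial and on lgbm_params being a JSON string, as stated above.

```python
import json


def summarize_trial(trial):
    """Collect the tuner-specific user attributes of a finished trial."""
    attrs = trial.user_attrs
    raw_params = attrs.get("lgbm_params")
    return {
        "elapsed_secs": attrs.get("elapsed_secs"),
        "average_iteration_time": attrs.get("average_iteration_time"),
        # lgbm_params is stored as a JSON string, so decode it into a dict.
        "lgbm_params": json.loads(raw_params) if raw_params is not None else None,
    }


def log_trial(study, trial):
    """Optuna callback: receives (Study, FrozenTrial) at the end of each trial."""
    print(f"trial {trial.number}: {summarize_trial(trial)}")
```

Such a function would be passed as optuna_callbacks=[log_trial] when constructing the tuner. LightGBM callbacks such as early stopping still go in the separate callbacks argument.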
Methods
compare_validation_metrics(val_score, best_score)
get_best_booster() – Return the best booster.
higher_is_better()
run() – Perform the hyperparameter-tuning with given parameters.
sample_train_set() – Make a subset of the self.train_set Dataset object.
tune_bagging([n_trials])
tune_feature_fraction([n_trials])
tune_feature_fraction_stage2([n_trials])
tune_min_data_in_leaf()
tune_num_leaves([n_trials])
tune_regularization_factors([n_trials])
Attributes
best_params – Return parameters of the best booster.
best_score – Return the score of the best booster.
- get_best_booster()
Return the best booster.
If the best booster cannot be found, ValueError will be raised. To prevent this error, save boosters by specifying the model_dir argument of __init__() when you resume tuning or run tuning in parallel.
- Return type:
lgb.Booster
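When model_dir is set, boosters are written to {model_dir}/{trial_number}.pkl, so a saved booster can also be inspected directly. The sketch below is a hedged illustration: booster_path and load_saved_booster are hypothetical helpers, and it assumes the .pkl files are standard Python pickles, which the file extension suggests but this page does not state.

```python
import pickle
from pathlib import Path


def booster_path(model_dir, trial_number):
    """Build the documented location: {model_dir}/{trial_number}.pkl."""
    return Path(model_dir) / f"{trial_number}.pkl"


def load_saved_booster(model_dir, trial_number):
    """Load one saved booster (assumption: the file is a standard pickle)."""
    path = booster_path(model_dir, trial_number)
    if not path.exists():
        raise FileNotFoundError(f"no saved booster at {path}")
    # Only unpickle files you trust. Loading a real booster also requires
    # lightgbm to be importable in the current environment.
    with path.open("rb") as f:
        return pickle.load(f)
```

In normal use, get_best_booster() is the supported way to retrieve the best model. A helper like this is mainly useful for looking at boosters from other trials.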
- run()
Perform the hyperparameter-tuning with given parameters.
- Return type:
None
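Because run() takes no arguments, options such as time_budget and optuna_seed are set when the tuner is constructed. The sketch below shows one way to assemble settings for a reproducible, time-limited run. The helper names, the regression objective, and the 600-second default budget are assumptions for illustration. The force_row_wise setting follows LightGBM's own advice for use with deterministic and is not something this page requires.

```python
def reproducible_settings(seed, time_budget_secs=None):
    """Return (LightGBM params, LightGBMTuner keyword arguments)."""
    params = {
        "objective": "regression",  # assumption: a regression task
        "metric": "rmse",
        "verbosity": -1,            # fewer log lines breaking up progress bars
        "deterministic": True,      # recommended together with optuna_seed
        "force_row_wise": True,     # LightGBM suggests this with deterministic
        "seed": seed,
    }
    tuner_kwargs = {
        "optuna_seed": seed,             # seeds the TPESampler
        "time_budget": time_budget_secs, # seconds; None means no limit
        "show_progress_bar": False,
    }
    return params, tuner_kwargs


def run_reproducible(train_set, valid_set, seed=0, time_budget_secs=600):
    """Construct the tuner, run it, and return it for inspection."""
    # Imported lazily so reproducible_settings() works without the package.
    from optuna_integration.lightgbm import LightGBMTuner

    params, tuner_kwargs = reproducible_settings(seed, time_budget_secs)
    tuner = LightGBMTuner(params, train_set, valid_sets=[valid_set], **tuner_kwargs)
    tuner.run()  # returns None; results are read from the tuner afterwards
    return tuner
```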
- sample_train_set()
Make a subset of the self.train_set Dataset object.
- Return type:
None