python - `sklearn` asking for eval dataset when there is one

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I am working with the StackingRegressor from sklearn, and I use lightgbm for one of the base models. My lightgbm model has early stopping enabled, so I pass it an eval dataset and a metric.

When I feed it into the StackingRegressor, I get this error:

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

This is frustrating because I do have them in my code. What is happening? Here's my code.

import numpy as np 
import pandas as pd 

import lightgbm as lgb
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
import xgboost as xgb
from sklearn.ensemble import StackingRegressor

opt_parameters_LGBM = {'bagging_fraction': 0.37031434827212084, 'bagging_seed': 47, 'boosting_type': 'gbdt', 
                       'feature_fraction': 0.3894822966866982, 'learning_rate': 0.01, 'max_bin': 177, 'max_depth': -1, 
                       'metric': 'rmse', 'min_child_weight': 1000.0, 'num_leaves': 161, 'objective': 'regression', 
                       'random_state': 47, 'reg_alpha': 10, 'reg_lambda': 50, 'verbosity': -1}  
m1 = lgb.LGBMRegressor(valid_sets = [lgb_train, lgb_eval], verbose_eval = 30, num_boost_round = 10000, early_stopping_rounds = 10, n_jobs=4, n_estimators=3000, **opt_parameters_LGBM)
m1.fit(X_train_df, y_train_df, eval_set = (X_val_df, y_val_df), eval_metric = 'rmse')

opt_parameters_ADA = {'learning_rate': 0.03, 'n_estimators': 5} 
m2 = AdaBoostRegressor(base_estimator=DecisionTreeRegressor(max_depth=3, min_samples_leaf=1, min_impurity_decrease=10, random_state=47), random_state=47, **opt_parameters_ADA)
m2.fit(X_train_df, y_train_df)

'''
Where problem starts
'''

gbm = xgb.XGBRegressor(
 learning_rate = 0.02,
 n_estimators= 5,
 max_depth= 4,
 min_child_weight= 2,
 gamma=0.9,                        
 subsample=0.8,
 colsample_bytree=0.8,
 objective= 'reg:squaredlogerror',
 nthread= -1,
 verbosity=3,
 random_state=20)

estimators = [('lgbm', m1), ('ada', m2)]

gbm = StackingRegressor(estimators=estimators, final_estimator=gbm, cv=5, verbose=1)
gbm.fit(X_train_df, y_train_df)

Question from: https://stackoverflow.com/questions/65713104/sklearn-asking-for-eval-dataset-when-there-is-one

1 Reply

深蓝 · Answer 1 · 2021-10-06T18:53:28+0000

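I guess the issue is caused by the fact that early stopping is enabled on the LGBMRegressor, so it still expects eval data when StackingRegressor() refits it. StackingRegressor clones each base estimator and calls fit(X, y) on the clone without passing eval_set, so the eval data you supplied in your own m1.fit(...) call is never seen.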

Try doing the following:

Right after the line where you fit your LGBMRegressor() model, m1.fit(X_train_df, y_train_df, eval_set = (X_val_df, y_val_df), eval_metric = 'rmse'), add these lines:

params = m1.get_params()

# clear early_stopping_rounds: m1 is already fitted, and refitting without an eval_set would otherwise raise
params["early_stopping_rounds"] = None
m1.set_params(**params)

See if the error goes away.
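To illustrate why clearing the parameter should help, here is a minimal sketch of the mechanism. It assumes only scikit-learn and NumPy are installed. ToyEarlyStopper is a hypothetical stand-in written for this example (it is not LightGBM, and its check only imitates the error message from the question); the data and the try_stack helper are made up as well.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression


class ToyEarlyStopper(RegressorMixin, BaseEstimator):
    """Hypothetical estimator that imitates an early-stopping check (not LightGBM)."""

    def __init__(self, early_stopping_rounds=None):
        self.early_stopping_rounds = early_stopping_rounds

    def fit(self, X, y, eval_set=None):
        # Imitate the check behind the error in the question.
        if self.early_stopping_rounds is not None and eval_set is None:
            raise ValueError("For early stopping, at least one dataset and "
                             "eval metric is required for evaluation")
        self.n_features_in_ = np.asarray(X).shape[1]
        self.mean_ = float(np.mean(y))  # trivial "model": predict the mean
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


# Made-up regression data.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X[:, 0] + rng.normal(scale=0.1, size=40)

m1 = ToyEarlyStopper(early_stopping_rounds=10)
m1.fit(X, y, eval_set=[(X, y)])  # works: eval_set is supplied by hand

# clone() keeps constructor parameters but drops the fitted state.
copy = clone(m1)
kept_param = copy.early_stopping_rounds
has_state = hasattr(copy, "mean_")


def try_stack(est):
    """Fit a stack around est and report 'ok' or the error message."""
    stack = StackingRegressor(estimators=[("toy", est)],
                              final_estimator=LinearRegression(), cv=5)
    try:
        stack.fit(X, y)  # base estimators are cloned and fitted without eval_set
        return "ok"
    except ValueError as exc:
        return str(exc)


before = try_stack(m1)                     # early stopping still configured
m1.set_params(early_stopping_rounds=None)  # same idea as the fix above
after = try_stack(m1)
print(before)
print(after)
```

Note that the stack refits its own clones from scratch, so the copy of the model inside the stack is trained without early stopping; whether that is acceptable depends on your n_estimators setting.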
