【发布时间】:2021-01-19 04:47:27
【问题描述】:
我正在尝试在使用 spark 时使用交叉验证,但它会引发错误:
gbtClassifier = GBTClassifier(featuresCol= "features", labelCol="is_goal")
lr = LogisticRegression(featuresCol= "features" ,labelCol="is_goal")
pipelineStages = stringIndexers + encoders + [featureAssembler]
pipeline = Pipeline(stages=pipelineStages)
param_grid_lr = ParamGridBuilder().addGrid(lr.regParam, [0.1,0.01]).addGrid(lr.elasticNetParam, [0,0.5,1]).build()
crossval = CrossValidator(estimator=lr, estimatorParamMaps=param_grid_lr ,evaluator=BinaryClassificationEvaluator(), numFolds=3)
cross_model = crossval.fit(df_tr)
IllegalArgumentException:标签不存在。 Available: event_type_str, event_team, shot_place_str, location_str, assist_method_str, situation_str, country_code, is_goal, event_type_str_idx, event_team_idx, shot_place_str_idx, location_str_idx, assist_method_str_idx, situation_str_idx, country_code_idx, event_type_str_vec, event_team_vec, shot_place_str_vec, location_str_vec, assist_method_str_vec, situation_str_vec, country_code_vec, features, CrossValidator_2fc516202d9d_rand, rawPrediction,概率,预测
[这是我的特征的样子1
【问题讨论】:
标签: apache-spark pyspark