Predict for each sample, check the value in a pandas DataFrame, and append to a new DataFrame

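【Question title】: Predict for each sample, check the value in pandas dataframe and append to a new dataframe
【Posted】: 2018-04-17 10:43:21
【Question】:

I am new to machine learning with Python and pandas DataFrames. I am training my model and making predictions on x_test (a DataFrame). I want to predict for each row (sample) in x_test, and if the predicted value is less than some threshold (0.4), I want to append that row to a new DataFrame (new_train). I have provided the body of my idea below. Can you help me?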

 c = XGBRegressor()  
 dt = c.fit(x_train, y_train)

 new_train = pd.DataFrame()  

 for rows in x_test:  
     y_pred = c.predict(x_test[rows])  
     if y_pred < 0.4:
           new_train.append(x_test[rows])

【Comments on the question】:

Tags: python pandas dataframe machine-learning xgboost

【Solution 1】:

You have basically figured it out; it just needs a few tweaks. You can use iloc this way:

 for i in range(x_test.shape[0]):  
     row_i = x_test.iloc[i] # a row in x_test
     y_pred = c.predict(row_i)  
     if y_pred < 0.4:
           new_train = new_train.append(row_i)

Or like this:

 for i in range(len(x_test)):  
     row_i = x_test.iloc[i, :] # a row in x_test
     y_pred = c.predict(row_i)  
     if y_pred < 0.4:
           new_train = new_train.append(row_i)

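In both versions, row_i is of type <class 'pandas.core.series.Series'>.

Calling the .append() method on a pd.DataFrame object is not an in-place operation: it returns a new object, so the result has to be assigned back to new_train. See here for more information.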
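As a minimal sketch of what "not in-place" means in practice: the snippet below uses pd.concat rather than .append(), because DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0. The column names f1 and f2 and all values are made up for illustration.

```python
import pandas as pd

# Hypothetical data; f1 and f2 are illustrative column names.
new_train = pd.DataFrame({"f1": [0.1], "f2": [0.2]})
row = pd.DataFrame({"f1": [0.3], "f2": [0.4]})

# The result is discarded here, so new_train is unchanged.
pd.concat([new_train, row], ignore_index=True)
n_before = len(new_train)

# Assigning the result back is what actually grows new_train.
new_train = pd.concat([new_train, row], ignore_index=True)
n_after = len(new_train)
```

On older pandas versions where .append() still exists, the same rule applies: `new_train = new_train.append(row)` keeps the row, while a bare `new_train.append(row)` does not.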

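【Discussion】:

Neither worked the way I need. The first one runs, but nothing gets appended to the new_train DataFrame (I checked y_pred < 0.4), and the second gives me a value error. Thanks for your effort, but could you work out what is going wrong?
It is hard to say without more information. All the answer above does is: (1) for each row in my DataFrame, (2) grab that row, (3) send that row to my classifier (watch the type here, array vs. Series), (4) give me the output prediction. Do it one step at a time and you should be fine.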
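The type note in step (3) can be checked without any model. This is a small sketch on made-up data (f1 and f2 are illustrative column names): a single row selected with iloc[i] is a one-dimensional Series, whereas scikit-learn-style estimators generally expect two-dimensional input of shape (n_samples, n_features).

```python
import pandas as pd

# Hypothetical test set with two feature columns.
x_test = pd.DataFrame({"f1": [0.1, 0.9], "f2": [0.2, 0.8]})

row_series = x_test.iloc[0]                  # 1-D Series, one entry per feature
row_frame = x_test.iloc[[0]]                 # one-row DataFrame (list indexer keeps 2-D)
row_frame_t = x_test.iloc[0].to_frame().T    # Series -> column frame -> transposed to one row

print(row_series.shape, row_frame.shape, row_frame_t.shape)
```

Passing either of the two one-row DataFrames to predict keeps the column layout the model was trained on, which is likely what the row-by-row loop needs.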

【Solution 2】:

I think this is what you are looking for:

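    for i in range(len(X_test)):
        # Keep the row two-dimensional: a one-row DataFrame of shape (1, n_features)
        row = X_test.iloc[i, :].to_frame().T
        y_pred = forest.predict(row)
        # predict returns an array; take its single element for the comparison
        if y_pred.item(0) < 0.4:
            new_train = new_train.append(row)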
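A possible alternative to the loop, not part of the original answers, is to predict for the whole test set at once and filter with a boolean mask. The sketch below uses a hypothetical stand-in class, MeanModel, in place of a fitted regressor such as XGBRegressor so that it runs without extra dependencies; the data and column names are also made up.

```python
import numpy as np
import pandas as pd


class MeanModel:
    """Hypothetical stand-in for a fitted regressor; not a real library class."""

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        if X.ndim != 2:
            raise ValueError("expected 2-D input of shape (n_samples, n_features)")
        # Toy rule: the "prediction" is the mean of each row's features.
        return X.mean(axis=1)


# Hypothetical test set; the index values are arbitrary row labels.
x_test = pd.DataFrame(
    {"f1": [0.1, 0.9, 0.3], "f2": [0.2, 0.8, 0.3]},
    index=[10, 11, 12],
)

model = MeanModel()
y_pred = model.predict(x_test)   # one prediction per row
mask = y_pred < 0.4              # boolean array, True where the row qualifies
new_train = x_test.loc[mask]     # keep only the qualifying rows
```

With a real model, only the `model` line would change. Predicting once on the full DataFrame avoids both the Series-versus-DataFrame issue and the repeated copying that growing a DataFrame row by row involves.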

【Discussion】: