Write out the whole list of inputs in a class for model prediction

[Question title]: Write out the whole list of inputs in a class for model prediction
[Posted]: 2016-07-15 10:00:51
[Question description]:

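I have the following code, a very simple model built with sklearn in Python on top of BaseEstimator and ClassifierMixin. It is meant to report a predicted score (y) for a city (X). As a simple model, I just want it to report a city's average score as its prediction each time that city is passed in.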

class MeanClassifier(BaseEstimator, ClassifierMixin):
    def __inif__(self):
        self.cityid_ = []
        self.cntX = []

    def X3(self, X):
        self.cityid_, idx = np.unique(X, return_inverse = True)
        self.cntX = map(list(self.cityid_).index, X)
        return self.cntX

    def fit(self, X, y):
        self.meanclasses_, meanindicies = np.unique(y, return_inverse = True)
        self.cityid_, idx = np.unique(X, return_inverse = True)
        self.df = pd.DataFrame({"X":X, "y":y})
        self.mean_ = self.df.groupby(['X'].mean())

    def predict(self, X):
        return self.df['y']['X']

To use the class, I create B, where city is the list of cities that serves as X in the class and stars is y.

B = MeanClassifier()
asncityid = city

B.fit(asncityid, stars)
pred = B.predict(asncityid[2]) #use the third city in the city list for prediction
print(pred)

When I run this code, I get the following error:

  `File "ml2_cp.py", line 66, in <module>
   pred = B.predict(asncityid[2])
  File "ml2_cp.py", line 58, in predict
  return self.df['y']['X'] ## using sklearn requires all X inputs
  File "/opt/conda/lib/python2.7/site-packages/pandas/core/series.py", line 583, in __getitem__
  result = self.index.get_value(self, key)
  File "/opt/conda/lib/python2.7/site-packages/pandas/indexes/base.py", line 1980, in get_value
  tz=getattr(series.dtype, 'tz', None))
  File "pandas/index.pyx", line 103, in pandas.index.IndexEngine.get_value (pandas/index.c:3332)
  File "pandas/index.pyx", line 111, in pandas.index.IndexEngine.get_value (pandas/index.c:3035)
  File "pandas/index.pyx", line 161, in pandas.index.IndexEngine.get_loc (pandas/index.c:4084)
KeyError: 'X'`

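I am confused, though, about how to keep the whole list of X in def predict(self, X). I am sure the way I wrote it is wrong, since I also have y in there. Please let me know of any possible solution; if anything is unclear, I am happy to explain my code and problem further. Thanks a lot.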

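[Question comments]:

Sorry, you say you "get the following error" but what you listed are line numbers rather than the error. Was an exception thrown?

Tags: python machine-learning scikit-learn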

[Solution 1]:

I think maybe you want to have

self.mean_ = self.df.groupby(['X']).mean()

instead of

self.mean_ = self.df.groupby(['X'].mean())

and

return self.mean_.ix[X].values

instead of

return self.df['y']['X']
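
For context, the KeyError arises because self.df['y'] is a Series indexed by row number, so ['X'] looks for a row labelled 'X' that does not exist. Putting both changes together, the class might look roughly like the sketch below. It is only an illustration under a few assumptions: the city names and star values are made up, the mean is stored as a Series rather than a DataFrame, and .loc is used for the lookup because .ix was deprecated and then removed in pandas 1.0. The mixin is also listed before BaseEstimator, which is the order the scikit-learn developer guide suggests.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, ClassifierMixin


class MeanClassifier(ClassifierMixin, BaseEstimator):
    """Predicts, for each city, the mean score seen for that city in fit()."""

    def fit(self, X, y):
        df = pd.DataFrame({"X": X, "y": y})
        # Group by city first, then average the scores within each group.
        self.mean_ = df.groupby("X")["y"].mean()
        return self  # scikit-learn convention: fit returns the estimator

    def predict(self, X):
        # atleast_1d lets a single city be passed as well as a whole list.
        keys = np.atleast_1d(X)
        # .loc is the label-based lookup that replaces the removed .ix.
        return self.mean_.loc[keys].to_numpy()


# Hypothetical data for illustration only, not taken from the question.
city = ["Phoenix", "Las Vegas", "Phoenix", "Las Vegas", "Madison"]
stars = [4.0, 3.0, 5.0, 2.0, 4.5]

B = MeanClassifier().fit(city, stars)
print(B.predict(city[2]))  # third city in the list -> [4.5]
print(B.predict(city))     # whole list -> [4.5 2.5 4.5 2.5 4.5]
```

A city that never appeared in fit() would raise a KeyError from .loc here, so a real model would probably need a fallback such as the overall mean.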

[Comments]:

Thank you so much!! I didn't even know about .ix, I need to look it up!!