Unable to plot scatter plot because of TypeError

【Question title】: Unable to plot scatter plot because of TypeError
【Posted】: 2019-10-05 17:17:58
【Question】:

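I have a dataset, and I am applying k-means clustering using only one of its columns. However, when I plot the result, I get a TypeError that mentions 'numpy.ndarray'. I tried converting the data to float, but I still face the same problem.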

DataFrame:

Code:

 from sklearn.cluster import KMeans
 import numpy as np
 km = KMeans(n_clusters=4, init='k-means++',n_init=10)
 km.fit(df1)
 x = km.fit_predict(df1)
 x
 array([0, 0, 0, ..., 3, 3, 3])

 np.shape(x)
 (1097,)

  import matplotlib.pyplot as plt
  %matplotlib inline

  plt.scatter(df1[x ==1,0], df1[x == 0,1], s=100, c='red')
  plt.scatter(df1[x ==1,0], df1[x == 1,1], s=100, c='black')
  plt.scatter(df1[x ==2,0], df1[x == 2,1], s=100, c='blue')
  plt.scatter(df1[x ==3,0], df1[x == 3,1], s=100, c='cyan')

Error:

   ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-62-5f0966ccc828> in <module>()
     1 import matplotlib.pyplot as plt
     2 get_ipython().run_line_magic('matplotlib', 'inline')
  ----> 3 plt.scatter(df1[x ==1,0], df1[x == 0,1], s=100, c='red')
     4 plt.scatter(df1[x ==1,0], df1[x == 1,1], s=100, c='black')
     5 plt.scatter(df1[x ==2,0], df1[x == 2,1], s=100, c='blue')

     ~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
     2137             return self._getitem_multilevel(key)
     2138         else:
   ->2139             return self._getitem_column(key)
     2140 
     2141     def _getitem_column(self, key):

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
     2144         # get column
     2145         if self.columns.is_unique:
  -> 2146             return self._get_item_cache(key)
     2147 
     2148         # duplicate columns & possible reduce dimensionality

   ~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
     1838         """Return the cached item, item represents a label indexer."""
     1839         cache = self._item_cache
  -> 1840         res = cache.get(item)
     1841         if res is None:
     1842             values = self._data.get(item)

   TypeError: unhashable type: 'numpy.ndarray'

【Comments】:

Tags: python-3.x matplotlib scatter-plot numpy-ndarray

【Solution 1】:

In my case, I was trying to randomly select 2 features and run a KMeans classifier on them.

sample = df[['f1','f2','f3','f4','f5','f6','f7']].sample(2, axis=1) # select random features
kmeans_classifier = KMeans(n_clusters=3)
y_kmeans = kmeans_classifier.fit_predict(sample)
plt.scatter(sample[y_kmeans == 0, 0], sample[y_kmeans == 0, 1], s = 75, c ='red', label = 'Zero')

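The last line throws the TypeError. I solved this by using values to convert the sample DataFrame to its NumPy representation, so that the [mask, column] indexing is handled by NumPy rather than by pandas.

Modified code: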

sample = df[['f1','f2','f3','f4','f5','f6','f7']].sample(2, axis=1).values
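
A minimal sketch of why the conversion helps, assuming a small made-up DataFrame (the data, the two column names, and the hard-coded labels are hypothetical stand-ins for the sampled features and the output of fit_predict). A DataFrame treats a tuple such as (mask, 0) inside [...] as a single column key, whereas a NumPy array reads it as row-and-column indexing. to_numpy() is used here as the newer equivalent of values.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the two sampled feature columns.
sample = pd.DataFrame({"f1": [1.0, 1.2, 8.0, 8.3], "f2": [0.5, 0.7, 9.1, 9.4]})
# Hypothetical cluster labels, shaped like the output of fit_predict.
y_kmeans = np.array([0, 0, 1, 1])

# On the DataFrame, the tuple (mask, 0) is looked up as a column key and fails.
try:
    sample[y_kmeans == 0, 0]
    frame_indexing_failed = False
except Exception as exc:  # the exact exception type varies across pandas versions
    frame_indexing_failed = True
    print(type(exc).__name__)

# On the NumPy array, the same expression selects rows by mask and column 0.
X = sample.to_numpy()
first_feature = X[y_kmeans == 0, 0]
second_feature = X[y_kmeans == 0, 1]
print(first_feature, second_feature)
```

The two arrays can then be passed straight to plt.scatter, as in the original plotting line.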

【Comments】:

【Solution 2】:

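If I understand your code correctly, you are trying to slice the DataFrame based on the values of x in order to plot it. To do that, you should use df1.loc[x==1,0] instead of df1[x==1,0] (and likewise for all the other slices).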
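
One caveat worth sketching: .loc selects columns by label, so df1.loc[x==1,0] can only work when a column is literally labelled 0. When the column has a name, either pass that name to .loc or use the positional counterpart .iloc. The data, the column name "value", and the labels below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical single-column data and cluster labels.
df1 = pd.DataFrame({"value": [1.0, 1.1, 5.0, 5.2]})
x = np.array([0, 0, 1, 1])

mask = x == 1  # boolean NumPy array, one entry per row

by_label = df1.loc[mask, "value"]  # label-based: the column named "value"
by_position = df1.iloc[mask, 0]    # position-based: the first column

print(by_label.tolist())
print(by_position.tolist())
```

Both selections return the same rows here; which one fits depends on whether the columns are addressed by name or by position.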

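【Comments】:

Still getting an error: TypeError: cannot do label indexing on with these indexers [0] of
I guess it is because there is only one variable. How do I plot a univariate graph in this case?
@anaghas that (new?) error sounds like your DataFrame has a "string" index (rather than e.g. 0,1,2,...), so you cannot use x=[0,0,0,..3,3,] (of type int64) as a mask. What do print(df1.index.dtype) and print(x.dtype) return?
I am running into the same problem. print(sample[y_kmeans == 0]) works fine, i.e. it correctly filters the rows whose value == 0. However, print(sample[y_kmeans == 0, 0]) throws the following error: TypeError: '(array([False, False, False, False, False, False, False, False, False,True, True]), 0)' is an invalid key. I also tried print(sample[y_kmeans == 0, True]), which throws the same error. Any suggestions, please?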
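
For the single-column case raised in the comments, one possible approach (an assumption, not something confirmed in this thread) is to plot each value against its row position and colour the points by cluster label. The data, the column name "value", the labels, and the colour list below are all hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical single-column data and cluster labels.
df1 = pd.DataFrame({"value": [1.0, 1.2, 5.0, 5.1, 9.0, 9.3]})
x = np.array([0, 0, 1, 1, 2, 2])

# Work on plain NumPy arrays so that boolean masks index them directly.
values = df1.iloc[:, 0].to_numpy()
positions = np.arange(len(values))  # row position serves as the horizontal axis

fig, ax = plt.subplots()
colors = ["red", "black", "blue"]
for cluster, color in enumerate(colors):
    mask = x == cluster
    ax.scatter(positions[mask], values[mask], s=100, c=color,
               label=f"cluster {cluster}")
ax.set_xlabel("row position")
ax.set_ylabel("value")
ax.legend()
```

Because only one feature exists, the horizontal axis carries no information beyond row order; a constant such as zero could be used there instead.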