Debug TypeError: unhashable type: 'numpy.ndarray'

【Question title】: Debug TypeError: unhashable type: 'numpy.ndarray'
【Posted】: 2016-06-10 12:43:06
【Question】:

I am working on k-means clustering. I wrote the code below with the help of some references available on the web, but when I run it, it raises an error:

Traceback (most recent call last):
  File "clustering.py", line 16, in <module>
    ds = df[np.where(labels==i)]
  File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1678, in __getitem__
    return self._getitem_column(key)
  File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1685, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/lib/python2.7/dist-packages/pandas/core/generic.py", line 1050, in _get_item_cache
    res = cache.get(item)
TypeError: unhashable type: 'numpy.ndarray'

Many earlier threads report the same error, but none of their solutions handles it in my program. How can I debug this error?

The code I am using:

from sklearn import cluster
import pandas as pd

df = [
[0.57,-0.845,-0.8277,-0.1585,-1.616],
[0.47,-0.14,-0.5277,-0.158,-1.716],
[0.17,-0.845,-0.5277,-0.158,-1.616],
[0.27,-0.14,-0.8277,-0.158,-1.716]]

df = pd.DataFrame(df,columns= ["a","b","c","d", "e"])

# df = pd.read_csv("cleaned_remove_cor.csv")

k = 3
kmeans = cluster.KMeans(n_clusters=k)
kmeans.fit(df)
labels = kmeans.labels_
centroids = kmeans.cluster_centers_
from matplotlib import pyplot
import numpy as np

for i in range(k):
    # select only data observations with cluster label == i
    ds = df[np.where(labels==i)]
    # plot the data observations
    pyplot.plot(ds[:,0],ds[:,1],'o')
    # plot the centroids
    lines = pyplot.plot(centroids[i,0],centroids[i,1],'kx')
    # make the centroid x's bigger
    pyplot.setp(lines,ms=15.0)
    pyplot.setp(lines,mew=2.0)
pyplot.show()

The shape of my DataFrame is (8127x600).

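【Comments】:

Always give the full error traceback, not just the last line.
@cel I have updated the error log.
ds = df[np.where(labels==i)] looks strange. Did you mean ds = df[labels==i]?
Trim your dataset and turn this into a self-contained, runnable example.
@DavidG I have updated my question with a simple example. I ran this code on the DataFrame above and it throws the same error.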

Tags: python numpy pandas matplotlib

【Solution 1】:

I tried this and it worked for me: convert the pandas df to a NumPy matrix:

df = df.as_matrix(columns= ["a","b","c","d", "e"])
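
Some background on why the conversion helps: np.where(labels==i) returns a tuple of arrays, and df[...] on a DataFrame treats that tuple as a column key and tries to hash it, which is where the TypeError in the traceback comes from. The following is a minimal sketch of two ways around it, not the asker's full script. The labels array is a hypothetical stand-in for kmeans.labels_, and df.to_numpy() is assumed to be available (it exists in newer pandas releases; on older ones, df.values or the as_matrix call above plays the same role).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [0.57, -0.845, -0.8277, -0.1585, -1.616],
        [0.47, -0.14, -0.5277, -0.158, -1.716],
        [0.17, -0.845, -0.5277, -0.158, -1.616],
        [0.27, -0.14, -0.8277, -0.158, -1.716],
    ],
    columns=["a", "b", "c", "d", "e"],
)

# Hypothetical labels standing in for kmeans.labels_ (one entry per row).
labels = np.array([0, 1, 0, 2])

# Option 1: convert to a NumPy array so positional slicing such as
# ds[:, 0] works, as the plotting code in the question expects.
data = df.to_numpy()          # older pandas: df.values or df.as_matrix()
ds_array = data[labels == 0]  # boolean mask keeps the rows labelled 0

# Option 2: stay in pandas and select rows with a boolean mask,
# then pick columns by position with .iloc.
ds_frame = df.loc[labels == 0]
x, y = ds_frame.iloc[:, 0], ds_frame.iloc[:, 1]
```

With either option, the loop in the question can plot ds_array[:, 0] against ds_array[:, 1] (or x against y) for each cluster label. Note that as_matrix has been removed from newer pandas releases, so the first option is the closest equivalent of this answer there.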

【Comments】: