【Posted】: 2018-12-03 03:50:43
【Question】:
I am trying to implement the example project on DZone (https://dzone.com/articles/cv-r-cvs-retrieval-system-based-on-job-description) and have run into a problem. In this case, I have set
dir_pca_we_EWE = 'pickle_model_pca.pkl'
and am executing the following:
def reduce_dimensions_WE(dir_we_EWE, dir_pca_we_EWE):
    m1 = KeyedVectors.load_word2vec_format('./wiki.en/GoogleNews.bin', binary=True)
    model1 = {}
    # normalize vectors
    for string in m1.wv.vocab:
        model1[string] = m1.wv[string] / np.linalg.norm(m1.wv[string])
    # reduce dimensionality
    pca = decomposition.PCA(n_components=200)
    pca.fit(np.array(list(model1.values())))
    model1 = pca.transform(np.array(list(model1.values())))
    i = 0
    for key, value in model1.items():
        model1[key] = model1[i] / np.linalg.norm(model1[i])
        i = i + 1
    with open(dir_pca_we_EWE, 'wb') as handle:
        pickle.dump(model1, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return model1
This produces the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in reduce_dimensions_WE
AttributeError: 'numpy.ndarray' object has no attribute 'items'
As always, any help is greatly appreciated!
【Discussion】:
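-
You are storing the PCA transform result in the model1 variable. pca.transform returns an np.array, not a dict.
-
Thanks to Andreas and to datasailor below. How can I change the code above so that it successfully reduces the dimensionality to 200?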
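One possible way to address this, as a minimal sketch rather than a confirmed answer from the thread: keep the word list separate from the matrix, run PCA on the matrix, and rebuild the dict afterwards by pairing each word with its row. The function name reduce_dimensions and the parameters word_vectors, n_components, and out_path are hypothetical names chosen for illustration. The sketch assumes word_vectors is a plain {word: vector} mapping, that no vector has zero norm, and that n_components does not exceed the number of words or the original dimensionality.

```python
import pickle

import numpy as np
from sklearn import decomposition


def reduce_dimensions(word_vectors, n_components=200, out_path=None):
    """Normalize, PCA-reduce, and re-normalize a {word: vector} mapping.

    Hypothetical helper: assumes no vector has zero norm and that
    n_components <= min(number of words, original dimensionality).
    """
    # Fix the word order once so that rows and words stay aligned.
    words = list(word_vectors.keys())
    matrix = np.array([word_vectors[w] for w in words], dtype=np.float64)

    # Normalize every row to unit length.
    matrix /= np.linalg.norm(matrix, axis=1, keepdims=True)

    # fit_transform returns an ndarray of shape (n_words, n_components),
    # so it is kept in its own variable instead of overwriting the dict.
    pca = decomposition.PCA(n_components=n_components)
    reduced = pca.fit_transform(matrix)

    # Re-normalize the reduced rows.
    reduced /= np.linalg.norm(reduced, axis=1, keepdims=True)

    # Rebuild the dict by pairing each word with its reduced row.
    model = dict(zip(words, reduced))

    if out_path is not None:
        with open(out_path, 'wb') as handle:
            pickle.dump(model, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return model
```

With the gensim API used in the question, the input mapping could presumably be built as {w: m1.wv[w] for w in m1.wv.vocab} and passed in together with out_path=dir_pca_we_EWE. The key difference from the original code is that the result of the PCA step is never assigned back to model1, so .items() is only ever called on a real dict.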
Tags: python numpy machine-learning pca numpy-ndarray