【Question title】: ValueError: Shape of passed values is, indices imply
【Posted】: 2021-02-15 18:56:59
【Question】:

Reposting because my first post got no replies.

I have the following data:

desc = pd.DataFrame(description, columns =['new_desc'])

                                             new_desc
257623  the public safety report is compiled from crim...
161135  police say a sea isle city man ordered two pou...
156561  two people are behind bars this morning, after...
41690   pumpkin soup is a beloved breakfast soup in ja...
70092   right now, 15 states are grappling with how be...
...                                                   ...
207258  operation legend results in 59 more arrests, i...
222170                                      see story, 3a
204064  st. louis — missouri secretary of state jason ...
151443  tony lavell jones, 54, of sunset view terrace,...
97367   walgreens, on the other hand, is still going t...

[9863 rows x 1 columns]

I am trying to find the dominant topic in each document. When I run the following code:

best_lda_model = lda_desc
data_vectorized = tfidf
lda_output = best_lda_model.transform(data_vectorized)
topicnames = ["Topic " + str(i) for i in range(best_lda_model.n_components)]
docnames = ["Doc " + str(i) for i in range(len(dataset))]
df_document_topic = pd.DataFrame(np.round(lda_output, 2), columns = topicnames, index = docnames)
dominant_topic = np.argmax(df_document_topic.values, axis = 1)
df_document_topic['dominant_topic'] = dominant_topic

I have tried adjusting the code, but no matter what I change, I get the following traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
c:\python36\lib\site-packages\pandas\core\internals\managers.py in create_block_manager_from_blocks(blocks, axes)
   1673 
-> 1674         mgr = BlockManager(blocks, axes)
   1675         mgr._consolidate_inplace()

c:\python36\lib\site-packages\pandas\core\internals\managers.py in __init__(self, blocks, axes, do_integrity_check)
    148         if do_integrity_check:
--> 149             self._verify_integrity()
    150 

c:\python36\lib\site-packages\pandas\core\internals\managers.py in _verify_integrity(self)
    328             if block.shape[1:] != mgr_shape[1:]:
--> 329                 raise construction_error(tot_items, block.shape[1:], self.axes)
    330         if len(self.items) != tot_items:

ValueError: Shape of passed values is (9863, 8), indices imply (0, 8)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-41-bd470d69b181> in <module>
      4 topicnames = ["Topic " + str(i) for i in range(best_lda_model.n_components)]
      5 docnames = ["Doc " + str(i) for i in range(len(dataset))]
----> 6 df_document_topic = pd.DataFrame(np.round(lda_output, 2), columns = topicnames, index = docnames)
      7 dominant_topic = np.argmax(df_document_topic.values, axis = 1)
      8 df_document_topic['dominant_topic'] = dominant_topic

c:\python36\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    495                 mgr = init_dict({data.name: data}, index, columns, dtype=dtype)
    496             else:
--> 497                 mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
    498 
    499         # For data is list-like, or Iterable (will consume into list)

c:\python36\lib\site-packages\pandas\core\internals\construction.py in init_ndarray(values, index, columns, dtype, copy)
    232         block_values = [values]
    233 
--> 234     return create_block_manager_from_blocks(block_values, [columns, index])
    235 
    236 

c:\python36\lib\site-packages\pandas\core\internals\managers.py in create_block_manager_from_blocks(blocks, axes)
   1679         blocks = [getattr(b, "values", b) for b in blocks]
   1680         tot_items = sum(b.shape[0] for b in blocks)
-> 1681         raise construction_error(tot_items, blocks[0].shape[1:], axes, e)
   1682 
   1683 

ValueError: Shape of passed values is (9863, 8), indices imply (0, 8)

The desired result is a list of the documents that belong to a given topic. Below is the sample code and the desired output.

df_document_topic(df_document_topic['dominant_topic'] == 2).head(10)

When I run this code, I get the following traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-55-8cf9694464e6> in <module>
----> 1 df_document_topic(df_document_topic['dominant_topic'] == 2).head(10)

TypeError: 'DataFrame' object is not callable

Below is the desired output.

Any help would be greatly appreciated.

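【Comments】:

  • Could you please post the traceback?
  • I edited the post to include the traceback.
  • Is docnames an empty list?
  • When you filter on a value, use square brackets: df_document_topic[df_document_topic['dominant_topic'] == 2].head(10)
  • Awesome, that worked!!! Thanks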
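
To illustrate the bracket fix from the comments, here is a minimal sketch. The table values are made up for illustration and are not the asker's data; only the name df_document_topic and the dominant_topic column come from the question.

```python
import pandas as pd

# Hypothetical document-topic table; the numbers are invented for illustration.
df_document_topic = pd.DataFrame(
    {
        "Topic 0": [0.7, 0.1, 0.2],
        "Topic 1": [0.2, 0.1, 0.1],
        "Topic 2": [0.1, 0.8, 0.7],
        "dominant_topic": [0, 2, 2],
    },
    index=["Doc 0", "Doc 1", "Doc 2"],
)

# Boolean Series: True for rows whose dominant topic is 2.
mask = df_document_topic["dominant_topic"] == 2

# Parentheses attempt to call the DataFrame like a function, which fails.
try:
    df_document_topic(mask)
    call_failed = False
except TypeError as err:
    call_failed = True
    print("TypeError:", err)

# Square brackets index the DataFrame with the boolean mask instead.
topic_2_docs = df_document_topic[mask].head(10)
print(topic_2_docs)
```

Building the mask as a separate variable is optional; it just makes it easier to see that the brackets are doing row selection.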

Tags: python-3.x pandas jupyter-notebook nlp tf-idf


【Solution 1】:

The index you pass as docnames is empty. It is built from dataset, like this:

docnames = ["Doc " + str(i) for i in range(len(dataset))]

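So dataset must be empty as well, which is why the index implies 0 rows while lda_output has 9863. As a workaround, you can build the Doc index from the size of lda_output instead, like this: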

docnames = ["Doc " + str(i) for i in range(len(lda_output))]

Let me know if this works.
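
For reference, here is a small self-contained sketch of the mismatch and the workaround. It assumes a random 5 x 3 array as a stand-in for lda_output and an empty list as a stand-in for dataset; neither is the asker's real data or model.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for lda_output: 5 documents x 3 topics.
rng = np.random.default_rng(0)
lda_output = rng.dirichlet(np.ones(3), size=5)

topicnames = ["Topic " + str(i) for i in range(lda_output.shape[1])]

# Assumption: dataset is empty, so the index built from it has no labels.
dataset = []
empty_docnames = ["Doc " + str(i) for i in range(len(dataset))]

# 5 rows of values with 0 index labels: pandas rejects the construction.
try:
    pd.DataFrame(np.round(lda_output, 2), columns=topicnames, index=empty_docnames)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Workaround: size the index from lda_output so rows and labels always match.
docnames = ["Doc " + str(i) for i in range(len(lda_output))]
df_document_topic = pd.DataFrame(
    np.round(lda_output, 2), columns=topicnames, index=docnames
)

# Column position of the largest topic weight in each row.
dominant_topic = np.argmax(df_document_topic.values, axis=1)
df_document_topic["dominant_topic"] = dominant_topic
print(df_document_topic)
```

The workaround makes the cell run, but it may still be worth checking why dataset is empty, since the row labels are only meaningful if they line up with the documents that were vectorized.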

【Comments】:

  • This works for that cell, but when I run the next cell I get TypeError: 'DataFrame' object is not callable
  • See my latest comment and check whether it works.