Answers: indexing a pandas Series with an integer value

【Question title】: Indexing a pandas Series with an integer value
【Posted】: 2016-04-19 10:35:30
【Question description】:

I have two pandas Series, shown below.

bulk_order_id
Out[283]: 
3    523
Name: order_id, dtype: object

and

cluster_6_loc
Out[285]: 
3    Cluster 3
Name: Clusters, dtype: object

Now I want a new Series that looks like this.

Cluster 3  523

I am doing the following in Python:

cluster_final = pd.Series()
for i in range(len(cluster_6_loc)):
    cluster_final.append(pd.Series(bulk_order_id.values[i], index =  
    cluster_6_loc.iloc[i]))

This gives me the following error:

TypeError: Index(...) must be called with a collection of some kind, 'Cluster 3' was passed

【Question comments】:

Tags: python pandas concat series

【Solution 1】:

I'm not sure I understood your question correctly, but what is wrong with pd.concat() (see docs):

s1 = pd.Series(data=['523'], index=[3])

3    523
dtype: object

s2 = pd.Series(data=['Cluster 3'], index=[3])

3    Cluster 3
dtype: object

and using pd.concat(), which also works for multiple values:

pd.concat([s1, s2], axis=1)

     0          1
3  523  Cluster 3

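This produces a DataFrame, which you may need when combining Series that hold multiple values. You can use .set_index() to move any of the values into the index, or add .squeeze() to get a Series.

So pd.concat([s1, s2], axis=1).set_index(1) gives: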

             0
1             
Cluster 3  523
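
The last step, going from this DataFrame back to a Series, can be sketched as follows. This is a minimal example that assumes the same one-row sample data as above; the names frame and result are only illustrative. On a frame with exactly one row and one column, a bare .squeeze() collapses both axes and returns a scalar, so the sketch restricts it to the column axis.

```python
import pandas as pd

s1 = pd.Series(data=['523'], index=[3])
s2 = pd.Series(data=['Cluster 3'], index=[3])

# Combine column-wise and move column 1 (the cluster labels) into the index.
frame = pd.concat([s1, s2], axis=1).set_index(1)

# squeeze(axis=1) collapses only the column axis, so a one-row frame
# still comes back as a Series instead of a bare scalar.
# rename_axis(None) drops the leftover index name (1).
result = frame.squeeze(axis=1).rename_axis(None)
print(result)
```

Selecting the column directly, for example frame[0], should give the same Series and may read more clearly when the column label is known.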

【Comments】:

It gives me an empty Series. Out[298]: Series([], dtype: float64)
You had asked about the TypeError. I have updated my answer to reflect your new question about multiple values per Series.

【Solution 2】:

You can pass the values of cluster_6_loc as the index and the values of bulk_order_id as the data to pd.Series:

bulk_order_id = pd.Series(523, index=[3])
cluster_6_loc= pd.Series('Cluster 3', index=[3])

cluster_final = pd.Series(bulk_order_id.values, cluster_6_loc.values)

In [149]: cluster_final 
Out[149]:
Cluster 3    523
dtype: int64
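
The same constructor call also covers Series with several entries, with no loop needed. The sketch below uses made-up sample values and assumes both Series have the same length and the same row order, because .values pairs the entries by position and ignores the original index labels.

```python
import pandas as pd

# Hypothetical sample data with several rows per Series.
bulk_order_id = pd.Series(['523', '528', '527'], index=[1, 2, 3], name='order_id')
cluster_6_loc = pd.Series(['Cluster 1', 'Cluster 2', 'Cluster 3'],
                          index=[1, 2, 3], name='Clusters')

# Pair the two Series positionally: cluster labels become the index,
# order ids become the data.
cluster_final = pd.Series(bulk_order_id.values,
                          index=cluster_6_loc.values,
                          name='order_id')
print(cluster_final)
```

If the two Series could be ordered differently, aligning them first (for example with cluster_6_loc.reindex(bulk_order_id.index)) would be a safer starting point.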

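Edit

It looks strange at first, but append on a Series does not modify it in place: it returns a new Series, so the original stays empty unless you assign the result back (checked in version 0.17.1):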

s = pd.Series()

In [199]: s.append(pd.Series(1, index=[0]))
Out[199]:
0    1
dtype: int64

In [200]: s
Out[200]: Series([], dtype: float64)
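
The output above shows that s is still empty after the call. Series.append was later deprecated and then removed in pandas 2.0, so a loop-based version on current pandas would collect the pieces in a list and call pd.concat once. The following is a sketch with made-up sample data, and the name pieces is only illustrative. Passing index=[label] as a list also avoids the TypeError from the question, since the index argument has to be a collection.

```python
import pandas as pd

# Hypothetical sample data.
bulk_order_id = pd.Series(['523', '528'], index=[3, 4], name='order_id')
cluster_6_loc = pd.Series(['Cluster 3', 'Cluster 4'], index=[3, 4], name='Clusters')

pieces = []
for label, value in zip(cluster_6_loc, bulk_order_id):
    # index must be list-like, so wrap the single label in a list.
    pieces.append(pd.Series([value], index=[label]))

# Concatenate once at the end instead of growing a Series step by step.
cluster_final = pd.concat(pieces)
print(cluster_final)
```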

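By the way, you can do this with set_value (available in pandas at the time; it was deprecated in 0.21 and removed in 1.0):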

cluster_final = pd.Series()
for i in range(len(cluster_6_loc)):
    cluster_final.set_value(cluster_6_loc.iloc[i], bulk_order_id.values[i])

In [209]: cluster_final
Out[209]:
Cluster 3    523
dtype: int64
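
On pandas versions without set_value, label-based assignment through .loc is one possible replacement for the same loop. This is a sketch with made-up sample data, and it relies on .loc enlarging the Series when the label is not yet present.

```python
import pandas as pd

# Hypothetical sample data.
bulk_order_id = pd.Series(['523', '528'], index=[3, 4], name='order_id')
cluster_6_loc = pd.Series(['Cluster 3', 'Cluster 4'], index=[3, 4], name='Clusters')

# Start from an empty Series with an explicit dtype.
cluster_final = pd.Series(dtype=object)
for i in range(len(cluster_6_loc)):
    # Assigning to a label that does not exist yet adds a new entry.
    cluster_final.loc[cluster_6_loc.iloc[i]] = bulk_order_id.iloc[i]

print(cluster_final)
```

Growing a Series one entry at a time copies data on every step, so for larger inputs building the Series in a single constructor call is likely the better choice.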

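【Comments】:

OK, it works. However, it does not work when I iterate with a for loop.
@user2927983 Why do you need to iterate with a for loop? Can't you just use the values of the two Series?
I can have multiple values in both Series. That is why I want to iterate with a for loop.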

【Solution 3】:

Maybe it is better to use concat and set_index:

print(bulk_order_id)

1    523
2    528
3    527
4    573
Name: order_id, dtype: object

print(cluster_6_loc)

1    Cluster 1
2    Cluster 2
3    Cluster 3
4    Cluster 4
Name: Clusters, dtype: object

cluster_final = pd.concat([bulk_order_id, cluster_6_loc], axis=1).set_index('Clusters')
# reset index name
cluster_final.index.name = ''

# select the first (and only) column as a Series
print(cluster_final.iloc[:, 0])

Cluster 1    523
Cluster 2    528
Cluster 3    527
Cluster 4    573
Name: order_id, dtype: object
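
For readers who want to run this approach end to end, here is a self-contained sketch that first constructs the sample Series shown above. The construction code is an assumption, since the answer only shows the printed data. It selects the column by name and clears the index name with rename_axis.

```python
import pandas as pd

# Rebuild the sample data from the printed output above.
bulk_order_id = pd.Series(['523', '528', '527', '573'],
                          index=[1, 2, 3, 4], name='order_id')
cluster_6_loc = pd.Series(['Cluster 1', 'Cluster 2', 'Cluster 3', 'Cluster 4'],
                          index=[1, 2, 3, 4], name='Clusters')

# concat aligns the two Series on their shared index (1..4);
# set_index then moves the 'Clusters' column into the index.
frame = pd.concat([bulk_order_id, cluster_6_loc], axis=1).set_index('Clusters')

# Pick the remaining column by name and drop the index name.
cluster_final = frame['order_id'].rename_axis(None)
print(cluster_final)
```

Unlike pairing the raw .values, concat matches rows by index label, so this version stays correct when the two Series are stored in different orders.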

【Comments】: