Iterating over select cells in a pandas DataFrame and replacing a value

【Question title】: Iterating over select cells in pandas DataFrame and replacing a value
【Posted】: 2017-10-28 15:56:05
【Question】:

I have a pandas DataFrame similar to the following example:

      tags      tag1      tag2      tag3
0     [a,b,c]     0         0         0
1     [a,b]       0         0         0
2     [b,d]       0         0         0
...
n     [a,b,d]     0         0         0

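If a tag from the tags array is present in the row index of tag1, tag2, tag3, I would like to encode those columns as 1.

However, I can't quite work out how to iterate correctly; so far, my idea is as follows: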

for i, row in dataset.iterrows():
    for tag in row[0]:
        for column in range (1,4):
            if dataset.iloc[:,column].index == tag:
                dataset.set_value(i, column, 1)

However, when the dataset is returned from this method, the columns are still all 0.

Thanks!

【Comments】:

Try dataset = dataset.set_value(i, column, 1) ?

Tags: python pandas iteration indices

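【Solution 1】:

It seems you need:

astype to convert the column to strings, if it contains lists
str.strip to remove the []
str.get_dummies to build one indicator column per tag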

df1 = df['tags'].astype(str).str.strip('[]').str.get_dummies(', ')
print (df1)
   'a'  'b'  'c'  'd'
0    1    1    1    0
1    1    1    0    0
2    0    1    0    1
3    1    1    0    1
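
For anyone who wants to run this end to end, here is a minimal sketch of the same steps. It assumes the tags column holds Python lists of strings; the sample data is made up to mirror the question. The final line, which strips the quotes from the column names, is an extra step that is not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data mirroring the question; 'tags' holds lists of strings.
df = pd.DataFrame({
    "tags": [["a", "b", "c"], ["a", "b"], ["b", "d"], ["a", "b", "d"]],
    "tag1": 0,
    "tag2": 0,
    "tag3": 0,
})

# str(['a', 'b']) is "['a', 'b']"; stripping the brackets leaves "'a', 'b'",
# which get_dummies splits on ', ' into one 0/1 indicator column per tag.
df1 = df["tags"].astype(str).str.strip("[]").str.get_dummies(", ")

# The quotes come from the list repr, so they end up in the column names.
# Removing them is optional and is an addition to the original answer.
df1.columns = df1.columns.str.strip("'")
print(df1)
```

The quoted column names in the output above are a side effect of converting each list to its string representation first.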

Finally, add df1 to the original DataFrame with concat:

df = pd.concat([df,df1], axis=1)
print (df)
        tags  tag1  tag2  tag3  'a'  'b'  'c'  'd'
0  [a, b, c]     0     0     0    1    1    1    0
1     [a, b]     0     0     0    1    1    0    0
2     [b, d]     0     0     0    0    1    0    1
3  [a, b, d]     0     0     0    1    1    0    1
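
If the tags column really holds lists (not strings that look like lists), one possible variation is to join each list with a separator and let str.get_dummies split on it, which avoids the quoted column names. This is only a sketch under that assumption, with made-up sample data, and it also assumes no tag contains the '|' character.

```python
import pandas as pd

# Hypothetical sample data; 'tags' is assumed to hold real Python lists.
df = pd.DataFrame({
    "tags": [["a", "b", "c"], ["a", "b"], ["b", "d"], ["a", "b", "d"]],
})

# Join each list into "a|b|c", then split on '|' (the default separator
# of str.get_dummies) into one 0/1 column per distinct tag.
dummies = df["tags"].str.join("|").str.get_dummies()

# Attach the indicator columns to the original frame, keeping its other columns.
out = pd.concat([df, dummies], axis=1)
print(out)
```

Because concat aligns on the index, the rest of the original columns are kept alongside the new indicator columns.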

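【Comments】:

Thanks - works like a charm, although it drops all the rest of my dataset. I'll merge the contents of the result into the original dataset.
Beautiful! Thanks a lot.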