Pandas - decompose column contains strings and lists

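【Question title】: Pandas - decompose column contains strings and lists
【Posted】: 2017-10-10 20:31:15
【Question description】:

I have a dataframe in which one column contains plain strings in some rows and lists in others. How can I decompose the lists into separate columns? This is what I have: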

>>> df2 = pd.DataFrame(["abc","[u'abc', u'xyz']"])
>>> df2

                  0
0               abc
1  [u'abc', u'xyz']

This is what I want to end up with:

     0     1
0  abc  None
1  abc   xyz

I tried something like this, but it raises an error:

>>> for col, col_data in df2.iteritems():
...   col_data = pd.get_dummies(pd.DataFrame(list(col_data)), prefix = col)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/remote/iims003/harpreet/anaconda2/lib/python2.7/site-packages/pandas/core/reshape.py", line 1095, in get_dummies
    for (col, pre, sep) in zip(columns_to_encode, prefix, prefix_sep):
TypeError: izip argument #2 must support iteration

【Question discussion】:

Tags: python pandas dataframe

【Solution 1】:

You can use apply with a function that returns a Series:

In [11]: from ast import literal_eval

In [12]: def to_series(s):
    ...:     try:
    ...:         return pd.Series(literal_eval(s))  # makes it an actual list
    ...:     except (ValueError, SyntaxError):  # not a Python literal, e.g. "abc" or "abc def"
    ...:         return pd.Series([s])
    ...:

In [13]: df2[0].apply(to_series)
Out[13]:
     0    1
0  abc  NaN
1  abc  xyz
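
For readers who want to run this outside an IPython session, here is a minimal standalone sketch of the same idea. It assumes, as in the question, that the lists are stored as strings. The `isinstance` check and the extra `SyntaxError` handling are additions for illustration and are not part of the answer above.

```python
from ast import literal_eval

import pandas as pd


def to_series(value):
    """Expand a stringified list into a Series; wrap anything else as one element."""
    try:
        parsed = literal_eval(value)  # "[u'abc', u'xyz']" -> ['abc', 'xyz']
    except (ValueError, SyntaxError):
        # Plain strings such as "abc" are not valid Python literals.
        return pd.Series([value])
    if isinstance(parsed, (list, tuple)):
        return pd.Series(list(parsed))
    # Other literals (e.g. "123") are kept as the original string.
    return pd.Series([value])


df2 = pd.DataFrame(["abc", "[u'abc', u'xyz']"])

# Series.apply with a function that returns a Series yields a DataFrame,
# one column per list position; shorter rows are padded with NaN.
expanded = df2[0].apply(to_series)
print(expanded)
```

Note that the missing cell is filled with NaN rather than the None shown in the question's desired output.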

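【Discussion】:

I copied your code and ran it on the original df2, `df2 = pd.DataFrame(["abc","[u'abc', u'xyz']"])`, but I don't get your output. I just get the original df2 back as the result.
@moondra Yes, good catch, I had missed a step (I think it was a copy-paste error on my part!)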