【Question Title】: Pandas: Create a tuple column from multiple columns
【Posted】: 2018-04-04 08:28:24
【Question】:

I have the following DataFrame, my_df:

Person       event         time
---------------------------------
John          A        2017-10-11
John          B        2017-10-12
John          C        2017-10-14
John          D        2017-10-15
Ann           X        2017-09-01
Ann           Y        2017-09-02
Dave          M        2017-10-05
Dave          N        2017-10-07
Dave          Q        2017-10-20

I want to create a new column that holds the (event, time) pair for each row. It should look like this:

Person       event         time        event_time
------------------------------------------------------
John          A        2017-10-11     (A, 2017-10-11)
John          B        2017-10-12     (B, 2017-10-12)
John          C        2017-10-14     (C, 2017-10-14)
John          D        2017-10-15     (D, 2017-10-15)
Ann           X        2017-09-01     (X, 2017-09-01)
Ann           Y        2017-09-02     (Y, 2017-09-02)
Dave          M        2017-10-05     (M, 2017-10-05)
Dave          N        2017-10-07     (N, 2017-10-07)
Dave          Q        2017-10-20     (Q, 2017-10-20)

Here is my code:

my_df['event_time'] = my_df.apply(lambda row: (row['event'] , row['time']), axis=1)

But I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4309         blocks = form_blocks(arrays, names, axes)
-> 4310         mgr = BlockManager(blocks, axes)
   4311         mgr._consolidate_inplace()

/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2794         if do_integrity_check:
-> 2795             self._verify_integrity()
   2796 

/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in _verify_integrity(self)
   3005             if block._verify_integrity and block.shape[1:] != mgr_shape[1:]:
-> 3006                 construction_error(tot_items, block.shape[1:], self.axes)
   3007         if len(self.items) != tot_items:

/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in construction_error(tot_items, block_shape, axes, e)
   4279     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280         passed, implied))
   4281 

ValueError: Shape of passed values is (128, 2), indices imply (128, 3)

Any idea what I am doing wrong in my code? Thanks!

【Comments】:

    Tags: python-3.x pandas tuples apply


    【Answer 1】:

    You can use:

    my_df['event_time'] = my_df[['event','time']].apply(tuple, axis=1)
    

    Or:

    my_df['event_time'] = tuple(zip(my_df['event'], my_df['time']))
    

    Or:

    my_df['event_time'] = [tuple(x) for x in my_df[['event','time']].values.tolist()]
    

    All three return:

    print (my_df)
      Person event        time       event_time
    0   John     A  2017-10-11  (A, 2017-10-11)
    1   John     B  2017-10-12  (B, 2017-10-12)
    2   John     C  2017-10-14  (C, 2017-10-14)
    3   John     D  2017-10-15  (D, 2017-10-15)
    4    Ann     X  2017-09-01  (X, 2017-09-01)
    5    Ann     Y  2017-09-02  (Y, 2017-09-02)
    6   Dave     M  2017-10-05  (M, 2017-10-05)
    7   Dave     N  2017-10-07  (N, 2017-10-07)
    8   Dave     Q  2017-10-20  (Q, 2017-10-20)
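
As a self-contained check of the three approaches above, here is a minimal sketch. The three-row frame and the names via_apply, via_zip and via_values are made up for illustration, and it assumes a reasonably recent pandas, where apply(tuple, axis=1) returns a Series of tuples rather than trying to expand them into columns.

```python
import pandas as pd

# Hypothetical sample data, shaped like my_df in the question.
my_df = pd.DataFrame({
    "Person": ["John", "John", "Ann"],
    "event": ["A", "B", "X"],
    "time": ["2017-10-11", "2017-10-12", "2017-09-01"],
})

# 1) Row-wise apply on just the two columns of interest.
via_apply = my_df[["event", "time"]].apply(tuple, axis=1)

# 2) Pair the two columns element by element with zip.
via_zip = list(zip(my_df["event"], my_df["time"]))

# 3) Convert the underlying values to lists, then each row to a tuple.
via_values = [tuple(x) for x in my_df[["event", "time"]].values.tolist()]

# Any of the three can be assigned as the new column.
my_df["event_time"] = via_zip
print(my_df)
```

The zip variant avoids calling a Python function once per row through apply, so it is often the quicker choice on larger frames, although that is worth measuring on your own data.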
    

    【Comments】:

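    • I tried the first approach but got this error: ValueError: Wrong number of items passed 2, placement implies 1. What am I missing here?
    • Could there be some NaN values? I will test it.
    • Yes, some of my events are None but still have a timestamp. I would like the corresponding tuple to be (None, timestamp).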
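
The thread does not show how the missing events were finally handled. As a hedged sketch of the case described in the last comment, the zip approach simply carries a missing event through into the tuple. The sample frame below is hypothetical, and it assumes the missing events are stored as None in an object-dtype column.

```python
import pandas as pd

# Hypothetical data: the second row has a timestamp but no event.
df = pd.DataFrame({
    # dtype=object is an assumption; it keeps None as None instead of NaN.
    "event": pd.Series(["A", None, "C"], dtype=object),
    "time": ["2017-10-11", "2017-10-12", "2017-10-14"],
})

# zip pairs the values as they are, so a missing event stays in the tuple.
df["event_time"] = list(zip(df["event"], df["time"]))
print(df)
```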
    【Answer 2】:

    Without apply:

    df.assign(event_time=list(zip(df.event,df.time)))
    Out[1011]: 
      Person event        time        event_time
    0   John     A  2017-10-11  (A, 2017-10-11)
    1   John     B  2017-10-12  (B, 2017-10-12)
    2   John     C  2017-10-14  (C, 2017-10-14)
    3   John     D  2017-10-15  (D, 2017-10-15)
    4    Ann     X  2017-09-01  (X, 2017-09-01)
    5    Ann     Y  2017-09-02  (Y, 2017-09-02)
    6   Dave     M  2017-10-05  (M, 2017-10-05)
    7   Dave     N  2017-10-07  (N, 2017-10-07)
    8   Dave     Q  2017-10-20  (Q, 2017-10-20)
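
One detail worth noting is that assign returns a new DataFrame and leaves the original untouched, so the result has to be bound to a name. A minimal sketch, with a made-up two-row frame and a hypothetical name result:

```python
import pandas as pd

# Hypothetical sample data with the same columns as in the question.
df = pd.DataFrame({
    "Person": ["Dave", "Dave"],
    "event": ["M", "N"],
    "time": ["2017-10-05", "2017-10-07"],
})

# assign builds a copy with the extra column; df itself is not modified.
result = df.assign(event_time=list(zip(df["event"], df["time"])))

print(result)
print("event_time" in df.columns)  # the original frame has no new column
```

Bracket access (df["time"]) is used here instead of attribute access (df.time) because it also works when a column name clashes with a DataFrame attribute or method.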
    
