【Question title】: Pandas Series to 2D array
【Posted】: 2018-07-22 22:33:41
【Question】:

I used the answer to "Put a 2d Array into a Pandas Series" to store a 2D NumPy array in a pandas Series. In short:

import numpy as np
import pandas as pd
a = np.zeros((5,2))
s = pd.Series(list(a))

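Now, what is the cheapest way to convert the pandas Series back to a 2D array? If I try s.values, I get an array of arrays with object dtype.

So far I have tried np.vstack(s.values), but of course that copies the data.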
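To make the problem concrete, here is a minimal sketch (not part of the original post) that shows what s.values returns for such a Series and checks that the np.vstack result does not share memory with the original array. The variable name stacked is only illustrative.

```python
import numpy as np
import pandas as pd

a = np.zeros((5, 2))
s = pd.Series(list(a))

# The Series holds one 1D array per row, so the values come back
# as a 1D object array rather than a 2D float array.
print(s.values.dtype, s.values.shape)

# np.vstack builds a fresh 2D float array from those rows.
stacked = np.vstack(s.values)
print(stacked.dtype, stacked.shape)

# Check whether the result shares memory with the original array;
# if it does not, the data was copied.
print(np.shares_memory(stacked, a))
```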

    Tags: python pandas


    【Solution 1】:

    I believe you need:

    a = np.array(s.values.tolist())
    print (a)
    [[ 0.  0.]
     [ 0.  0.]
     [ 0.  0.]
     [ 0.  0.]
     [ 0.  0.]]
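As a quick sanity check, the list-based conversion can be wrapped in a small function and compared against np.vstack. This is only a sketch: series_to_2d is a hypothetical helper name, and it assumes every element of the Series is a 1D array of the same length.

```python
import numpy as np
import pandas as pd

def series_to_2d(s):
    """Hypothetical helper: rebuild a 2D array from a Series of equal-length 1D arrays."""
    return np.array(s.tolist())

# Use distinct values so that a wrong row order would be visible.
a = np.arange(10, dtype=float).reshape(5, 2)
s = pd.Series(list(a))

b = series_to_2d(s)
print(b.shape, b.dtype)

# Compare both conversion routes against the original array.
print(np.array_equal(b, a), np.array_equal(np.vstack(s.values), a))
```

Like np.vstack, this approach still allocates a new array; the gain reported in this answer is in speed, not in avoiding the copy.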
    

    a = np.zeros((50000,2))
    s = pd.Series(list(a))
    
    In [131]: %timeit (np.vstack(s.values))
    10 loops, best of 3: 107 ms per loop
    
    In [132]: %timeit (np.array(s.values.tolist()))
    10 loops, best of 3: 19.7 ms per loop
    
    In [133]: %timeit (np.array(s.tolist()))
    100 loops, best of 3: 19.6 ms per loop
    

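    But if the array is transposed (2 rows of 50000 values instead of 50000 rows of 2), the difference is very small (note the caching warnings, though):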

    a = np.zeros((2,50000))
    s = pd.Series(list(a))
    #print (s)
    
    In [159]: %timeit (np.vstack(s.values))
    The slowest run took 23.31 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000 loops, best of 3: 55.7 µs per loop
    
    In [160]: %timeit (np.array(s.values.tolist()))
    The slowest run took 7.20 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000 loops, best of 3: 49.8 µs per loop
    
    In [161]: %timeit (np.array(s.tolist()))
    The slowest run took 7.31 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000 loops, best of 3: 62.6 µs per loop
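The %timeit comparisons above only run in IPython. A rough way to repeat them in a plain script is sketched below with the standard timeit module. best_time is a hypothetical helper, the number and repeat values are arbitrary choices, and the absolute timings will depend on the machine and on the NumPy and pandas versions.

```python
import timeit
import numpy as np
import pandas as pd

def best_time(func, number=5, repeat=3):
    """Hypothetical helper: best per-call time in seconds over several repeats."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number

results = {}
for shape in [(50000, 2), (2, 50000)]:
    s = pd.Series(list(np.zeros(shape)))
    # Bind s as a default argument so each lambda keeps its own Series.
    candidates = {
        "np.vstack(s.values)": lambda s=s: np.vstack(s.values),
        "np.array(s.values.tolist())": lambda s=s: np.array(s.values.tolist()),
        "np.array(s.tolist())": lambda s=s: np.array(s.tolist()),
    }
    for label, func in candidates.items():
        results[(shape, label)] = best_time(func)
        print(shape, label, f"{results[(shape, label)] * 1e3:.3f} ms")
```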
    
