【Question Title】: Sort 2d numpy string/float array by column
【Posted】: 2020-09-07 20:16:33
【Question Description】:

I have the following numpy array (number of rocket launches per country since 1957), and I want to sort it by number of launches in ascending order.

   ['Australia', 6.0],
   ['Brazil', 3.0],
   ['China', 269.0],
   ['France', 303.0],
   ['India', 76.0],
   ['Iran', 14.0],
   ['Israel', 11.0],
   ['Japan', 126.0],
   ['Kazakhstan', 701.0],
   ['Kenya', 9.0],
   ['New Zealand', 13.0],
   ['North Korea', 5.0],
   ['Pacific Ocean', 36.0],
   ['Russian Federation', 1398.0],
   ['South Korea', 3.0],
   ['USA', 1351.0]

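The problem is that np.sort(a, axis=0) sorts the values only, without keeping the countries associated with them, so e.g. North Korea ends up with 269 launches (which is probably more than the actual 5).

Alternatively, if I run np.sort(a, axis=1), I get an error saying

TypeError: '<' not supported between instances of 'float' and 'str'

Any ideas are much appreciated!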

【Comments】:

  • This looks more like a list than an array. What are the shape and dtype? You may need to use argsort.
  • arr[arr[:, 1].argsort()] should work
  • arr[arr[:, 1].argsort()] works. Thanks
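
The dtype question in the first comment matters for the argsort one-liner. Here is a minimal sketch on a four-row subset of the data; the names `rows`, `str_arr`, `lexicographic` and `numeric` are made up for illustration. With an object array the second column holds real floats and sorts numerically. If the array instead holds everything as strings (assumed here by building it explicitly), the same indexing sorts text lexicographically, and converting the column with astype(float) is one way around that.

```python
import numpy as np

rows = [['Australia', 6.0], ['Iran', 14.0], ['Japan', 126.0], ['Brazil', 3.0]]

# Object dtype keeps the floats as floats, so argsort compares numbers.
arr = np.array(rows, dtype=object)
by_launches = arr[arr[:, 1].argsort()]

# Hypothetical all-string version of the same data (every cell is text).
str_arr = np.array([[name, str(n)] for name, n in rows])

# Sorting the string column compares characters, not numeric values.
lexicographic = str_arr[str_arr[:, 1].argsort()]

# Converting the column to float first restores numeric ordering.
numeric = str_arr[str_arr[:, 1].astype(float).argsort()]

print(by_launches[:, 0])
print(lexicographic[:, 0])
print(numeric[:, 0])
```

In both cases the whole row is reordered by the index array, so each country stays paired with its count.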

Tags: python arrays numpy sorting


【Solution 1】:
import numpy as np

data = [
   ['Australia', 6.0], 
   ['Brazil', 3.0],
   ['China', 269.0],
   ['France', 303.0],
   ['India', 76.0],
   ['Iran', 14.0],
   ['Israel', 11.0],
   ['Japan', 126.0],
   ['Kazakhstan', 701.0],
   ['Kenya', 9.0],
   ['New Zealand', 13.0],
   ['North Korea', 5.0],
   ['Pacific Ocean', 36.0],
   ['Russian Federation', 1398.0],
   ['South Korea', 3.0],
   ['USA', 1351.0]
]

We can create a structured array and then sort by key:

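dtype = [
    ('name', '<U18'),    # Unicode string, up to 18 characters
    ('rockets', float)   # number of launches
]

data = np.array([tuple(x) for x in data], dtype=dtype)  # each record must be a tuple
sorted_data = np.sort(data, order=['rockets'])          # sort by the 'rockets' field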

print(sorted_data)
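
As a small follow-up to the structured-array approach, the sketch below uses a three-row subset of the data to show two things the answer does not cover: breaking ties by passing a second field in `order`, and reading a single field back out. The variable names `rows`, `names` and `descending` are illustrative only.

```python
import numpy as np

# Subset of the data with a tie on the launch count.
rows = [('South Korea', 3.0), ('Brazil', 3.0), ('North Korea', 5.0)]
dtype = [('name', '<U18'), ('rockets', float)]
data = np.array(rows, dtype=dtype)

# Sort by launch count first; ties are broken by name.
sorted_data = np.sort(data, order=['rockets', 'name'])

# Individual fields are accessed by name.
names = sorted_data['name'].tolist()
counts = sorted_data['rockets'].tolist()

# Reversing the sorted array gives descending order.
descending = sorted_data[::-1]

print(names)
print(counts)
```

Listing 'name' after 'rockets' only affects rows whose launch counts are equal.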

【Comments】:

    【Solution 2】:

    This is easy to sort as a Python list:

    In [208]: alist = [   ['Australia', 6.0],
         ...:    ['Brazil', 3.0],
         ...:    ['China', 269.0],
         ...:    ['France', 303.0],
         ...:    ['India', 76.0],
         ...:    ['Iran', 14.0],
         ...:    ['Israel', 11.0],
         ...:    ['Japan', 126.0],
         ...:    ['Kazakhstan', 701.0],
         ...:    ['Kenya', 9.0],
         ...:    ['New Zealand', 13.0],
         ...:    ['North Korea', 5.0],
         ...:    ['Pacific Ocean', 36.0],
         ...:    ['Russian Federation', 1398.0],
         ...:    ['South Korea', 3.0],
         ...:    ['USA', 1351.0]]
    In [209]: newlist = sorted(alist, key=lambda x: x[1])
    In [210]: newlist
    Out[210]: 
    [['Brazil', 3.0],
     ['South Korea', 3.0],
     ['North Korea', 5.0],
     ['Australia', 6.0],
     ['Kenya', 9.0],
     ['Israel', 11.0],
     ['New Zealand', 13.0],
     ['Iran', 14.0],
     ['Pacific Ocean', 36.0],
     ['India', 76.0],
     ['Japan', 126.0],
     ['China', 269.0],
     ['France', 303.0],
     ['Kazakhstan', 701.0],
     ['USA', 1351.0],
     ['Russian Federation', 1398.0]]
    

    Using an object dtype array (which preserves the string and float columns):

    In [211]: arr = np.array(alist, object)
    In [212]: arr
    Out[212]: 
    array([['Australia', 6.0],
           ['Brazil', 3.0],
           ['China', 269.0],
           ['France', 303.0],
           ...
           ['USA', 1351.0]], dtype=object)
    

    Looking only at the second column gives the sort indices:

    In [213]: idx = np.argsort(arr[:,1])
    In [214]: idx
    Out[214]: array([ 1, 14, 11,  0,  9,  6, 10,  5, 12,  4,  7,  2,  3,  8, 15, 13])
    In [215]: arr[idx]
    Out[215]: 
    array([['Brazil', 3.0],
           ['South Korea', 3.0],
           ['North Korea', 5.0],
           ['Australia', 6.0],
           ['Kenya', 9.0],
           ...
           ['Russian Federation', 1398.0]], dtype=object)
    

    The structured array approach in the other answer is also good.
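
Two variations on the argsort approach may be useful: keeping tied rows in their original order, and sorting in descending order. This is a minimal sketch on a four-row subset; `counts`, `ascending` and `descending` are illustrative names, and converting the column to float before sorting is a choice made here, not something the answer requires.

```python
import numpy as np

arr = np.array([['Brazil', 3.0],
                ['China', 269.0],
                ['South Korea', 3.0],
                ['Iran', 14.0]], dtype=object)

# Pull the numeric column out as a plain float array.
counts = arr[:, 1].astype(float)

# kind='stable' keeps rows with equal counts in their original order.
ascending = arr[np.argsort(counts, kind='stable')]

# Negating the key sorts from largest to smallest, still stable for ties.
descending = arr[np.argsort(-counts, kind='stable')]

print(ascending[:, 0])
print(descending[:, 0])
```

Negating the key rather than reversing the result keeps tied rows (Brazil and South Korea here) in their original relative order.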

    【Comments】:
