【Question title】: Python insertion sorting a csv by row
【Posted】: 2021-06-02 16:40:44
【Question description】:

My goal is to use insertion sort to sort the contents of a csv file by the numbers in the first column. For example, I want to take this:

[[7831703,  Christian,  Schmidt]
[2299817,   Amber,  Cohen]
[1964394,   Gregory,    Hanson]
[1984288,   Aaron,  White]
[9713285,   Alexander,  Kirk]
[7025528,   Janice, Lee]
[6441979,   Sarah,  Browning]
[8815776,   Rick,   Wallace]
[2395480,   Martin, Weinstein]
[1927432,   Stephen,    Morrison]]

and sort it into:

[[1927432,  Stephen,    Morrison]
[1964394,   Gregory,    Hanson]
[1984288,   Aaron,  White]
[2299817,   Amber,  Cohen]
[2395480,   Martin, Weinstein]
[6441979,   Sarah,  Browning]
[7025528,   Janice, Lee]
[7831703,   Christian,  Schmidt]
[8815776,   Rick,   Wallace]
[9713285,   Alexander,  Kirk]]

The rows should be ordered by the number in the first column, using Python. My current code looks like this:

import csv
with open('EmployeeList.csv', newline='') as File:  
    reader = csv.reader(File)
    readList = list(reader)
    for row in reader:
        print(row)

def insertionSort(readList): 
  #Traverse through 1 to the len of the list
    for row in range(len(readList)):
# Traverse through 1 to len(arr) 
        for i in range(1, len(readList[row])): 
    
            key = readList[row][i] 
    


    # Move elements of arr[0..i-1], that are 
    # greater than key, to one position ahead 
    # of their current position
            j = i-1
            while j >=0 and key < readList[row][j] : 
                    readList[row] = readList[row] 
                    j -= 1
            readList[row] = key 

insertionSort(readList)
print ("Sorted array is:") 
for i in range(len(readList)): 
    print ( readList[i])

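The code can already sort the contents of the 2D array, but it tries to sort everything. I thought it would work if I got rid of the [], but in testing that did not give me what I need. To clarify again: I want to reorder the rows based on the numeric value in the first column.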

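【Question discussion】:

  • By the way, for the last part of your code you can just write for x in readList: print(x). You don't need an index as in some other languages; just loop over the items of the iterable. That is the more Pythonic way.

Tags: python csv sorting multidimensional-array insertion-sort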


【Solution 1】:

Sorry if I haven't understood your needs correctly. But you have a list and you need to sort it? Why not just use the sort method of the list object?

>>> data = [[7831703,  "Christian",  "Schmidt"],
... [2299817,   "Amber",  "Cohen"],
... [1964394,   "Gregory",    "Hanson"],
... [1984288,   "Aaron",  "White"],
... [9713285,   "Alexander",  "Kirk"],
... [7025528,   "Janice", "Lee"],
... [6441979,   "Sarah",  "Browning"],
... [8815776,   "Rick",   "Wallace"],
... [2395480,   "Martin", "Weinstein"],
... [1927432,   "Stephen",    "Morrison"]]
>>> data.sort()
>>> from pprint import pprint
>>> pprint(data)
[[1927432, 'Stephen', 'Morrison'],
 [1964394, 'Gregory', 'Hanson'],
 [1984288, 'Aaron', 'White'],
 [2299817, 'Amber', 'Cohen'],
 [2395480, 'Martin', 'Weinstein'],
 [6441979, 'Sarah', 'Browning'],
 [7025528, 'Janice', 'Lee'],
 [7831703, 'Christian', 'Schmidt'],
 [8815776, 'Rick', 'Wallace'],
 [9713285, 'Alexander', 'Kirk']]
>>> 

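Note that the first element of each row is an integer here. This matters if you want to sort by numeric value (99 before 100). csv.reader returns every field as a string, so a column read from a file has to be converted with int() first.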

Don't be confused by the pprint import. You don't need it for sorting; I only use it to get nicer output in the console.

Also note that list.sort() is an in-place method. It does not return the sorted list (it returns None); it sorts the list itself.
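The notes above can be sketched roughly as follows. This is only an illustration under assumptions: `SAMPLE` stands in for the contents of EmployeeList.csv (the real file's exact layout is not shown in the question), and `load_rows` is a hypothetical helper name, not something from the original code.

```python
import csv
import io

# Hypothetical stand-in for EmployeeList.csv; the real file layout is assumed.
SAMPLE = (
    "7831703,Christian,Schmidt\n"
    "2299817,Amber,Cohen\n"
    "1964394,Gregory,Hanson\n"
)


def load_rows(file_obj):
    """Read CSV rows and convert the first column from str to int."""
    rows = []
    for row in csv.reader(file_obj):
        if not row:
            continue  # skip blank lines
        rows.append([int(row[0].strip())] + [cell.strip() for cell in row[1:]])
    return rows


rows = load_rows(io.StringIO(SAMPLE))

# sorted() builds a new list and leaves `rows` unchanged.
by_id = sorted(rows, key=lambda row: row[0])

# list.sort() reorders `rows` itself and returns None.
result = rows.sort(key=lambda row: row[0])

# Compared as text, "100" sorts before "99"; compared as int, 99 comes first.
as_text = sorted(["99", "100"])
as_numbers = sorted([99, 100])
```

With a real file, `load_rows` could be called inside `with open('EmployeeList.csv', newline='') as f:` instead of on the `io.StringIO` object. The `key` function makes it explicit that only the first column decides the order.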

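*** Edit ***

Here are two different sorting functions. Both could be optimized a lot, but I hope they give you an idea of how to do it. Both should work, and you can add some print calls inside the loops to see what is happening.

First, a recursive version. Each run sorts the list a little more, until it is fully sorted.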

def recursiveSort(readList):
    # Work on a copy so the caller's list is left untouched.
    data = readList.copy()
    changed = False
    res = []
    while data:  # for a non-empty list `while True` also works, since the loop eventually breaks
        if len(data) == 1:
            # Only one element left: append it to the end of the result.
            res.append(data[0])
            break
        if data[0][0] > data[1][0]:
            # Compare the first two rows by their first column.
            # If the first is bigger, move the second one to the result
            # and flag that the order changed during this pass.
            res.append(data.pop(1))
            changed = True
        else:
            # Otherwise move the first row to the result.
            res.append(data.pop(0))

    if not changed:
        # No changes were made, so the list is in order.
        return res
    # Changes were made, so sort the result one more time.
    return recursiveSort(res)

Here is an iterative version, closer to your original function.

def iterativeSort(readList):
    res = []
    for i in range(len(readList)):
        # Loop through the original list.
        if len(res) == 0:
            # The result list is empty, so add the first row.
            res.append(readList[i])
        else:
            done = False
            for j in range(len(res)):
                # Loop through the result list built so far.
                if res[j][0] > readList[i][0]:
                    # Our row is smaller than res[j], so insert it here.
                    res.insert(j, readList[i])
                    done = True
                    break
            if not done:
                # Our row is not smaller than anything in the result list, so it goes last.
                res.append(readList[i])
        print(res)  # trace output: the result list after each insertion
    return res
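
For comparison, here is a minimal sketch of an in-place insertion sort that moves whole rows and compares only the first column. It assumes the first column has already been converted to int; the name `insertion_sort_rows` and the short `employees` list are hypothetical and only there for illustration.

```python
def insertion_sort_rows(rows):
    """Sort a list of rows in place by each row's first column."""
    for i in range(1, len(rows)):
        current = rows[i]  # the whole row is moved, not a single cell
        j = i - 1
        # Shift rows whose first column is larger one slot to the right.
        while j >= 0 and rows[j][0] > current[0]:
            rows[j + 1] = rows[j]
            j -= 1
        # Drop the saved row into the gap that opened up.
        rows[j + 1] = current
    return rows  # returned for convenience; the list itself is modified


employees = [
    [7831703, "Christian", "Schmidt"],
    [2299817, "Amber", "Cohen"],
    [1964394, "Gregory", "Hanson"],
    [1984288, "Aaron", "White"],
]

insertion_sort_rows(employees)
for row in employees:
    print(row)
```

The outer loop walks over row positions and the inner loop compares `rows[j][0]` with `current[0]`, so the names in the other columns never take part in the comparison.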

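【Discussion】:

  • I need to do it using my insertion sort function
  • Your insertion sort function doesn't return anything. Is that intentional?
  • If you don't need to return anything from the function, you could even use lambda readList: readList.sort() to do the same thing: just call it and your list is sorted in place.
  • Or is this an assignment where you need to implement a sort function yourself? If so, please read meta.stackoverflow.com/questions/334822/…
  • Strictly speaking I'm not a student; I'm not enrolled in any program or course. I'm entirely self-taught. For this, I need to use the insertSort function I wrote and can't use any prebuilt sorting functions. That way I can learn about array manipulation, which, as I understand it, is a valuable skill.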