【Question Title】: How to select rows in a csv file based on the length?
【Posted】: 2021-05-11 07:02:59
【Question】:

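I am trying to select rows based on their length. Some rows in my csv file have 5 items, some have 20, and some have 40. I want to collect all rows whose length is between 24 and 34. So I tried the following code: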

my_path = r'c:\data\FF\Desktop\my_files' 

for file in os.listdir(my_path):
    path_file = os.path.join(my_path, file)
    with open(path_file, 'r') as output:
        reader = csv.reader(output, delimiter = ',')
        read = [row for row in reader if row] 
        for row in read:
            if len(row) > 24 or len(row) < 34:
                if row[9] == '3080':
                    print(row[0] + ',' + row[24] + ',' + row[25] + ',' 
                          + row[26] + ',' + row[27] + ',' + row[28] + ',' + row[29]
                          + ',' + row[30] + ',' + row[31] + ',' + row[32] + ',' + row[33] + ','
                          + row[34])

I get the following error:

  File "C:\data\FF\Desktop\Python\PythongMySQL\untitled2.py", line 15, in <module>
    if row[9] == '3080':

IndexError: list index out of range

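I expect to get the few rows whose length is between 24 and 34.

【Comments】:

  • Use `and` instead of `or` when checking the length. Every row satisfies the `or` condition.
  • @MarkTolonen I changed `or` to `and` and got the following error: ` + row[34]) IndexError: list index out of range`
  • That requires a row of length 35.
  • @MarkTolonen Still the same error...
  • Or make it Pythonic with `if 24 < len(row) < 34:`, and then once you have determined the length of the row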
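
To illustrate the first comment, here is a minimal sketch showing why the `or` test lets every row through while a chained comparison does not. The row lengths are made up for illustration and do not come from the asker's files.

```python
# Hypothetical row lengths, chosen only for illustration.
lengths = [5, 20, 24, 30, 34, 40]

# Any integer is either > 24 or < 34, so the `or` test never rejects a row.
matches_or = [n for n in lengths if n > 24 or n < 34]

# A chained comparison keeps only lengths inside the inclusive range.
matches_range = [n for n in lengths if 24 <= n <= 34]

print(matches_or)     # [5, 20, 24, 30, 34, 40]
print(matches_range)  # [24, 30, 34]
```

Because a row of length 5 passes the `or` test, the later `row[9]` lookup raises the `IndexError` shown in the question.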

Tags: python list csv for-loop if-statement


【Solution 1】:

As you said, you want the length to be between 24 and 34, so you have to update the condition as follows:

my_path = r'c:\data\FF\Desktop\my_files' 

for file in os.listdir(my_path):
    path_file = os.path.join(my_path, file)
    with open(path_file, 'r') as output:
        reader = csv.reader(output, delimiter = ',')
        read = [row for row in reader if row] 
        for row in read:
            if len(row) >= 24 and len(row) <= 34:
                if row[9] == '3080':
                    print(row[0] + ',' + row[24] + ',' + row[25] + ',' 
                          + row[26] + ',' + row[27] + ',' + row[28] + ',' + row[29]
                          + ',' + row[30] + ',' + row[31] + ',' + row[32] + ',' + row[33] + ','
                          + row[34])

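Note the operator change. This small change ensures that you only get rows whose length is in the 24 to 34 range, as you expect.

【Comments】:

  • @E.Mancebo Yes, I'm still getting the same error.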
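
A likely reason the error persists: `row[34]` needs a row with at least 35 fields, but the condition only admits rows of length 34 or less. The sketch below guards the lookup by checking the largest index first. `pick_fields` is a hypothetical helper written for this illustration, not part of the `csv` module.

```python
def pick_fields(row, indices):
    """Return the fields at `indices`, or None if the row is too short."""
    # The largest index i is only valid when len(row) >= i + 1.
    if len(row) <= max(indices):
        return None
    return [row[i] for i in indices]

# Indices 0 and 24..34, the same columns the question prints.
wanted = [0] + list(range(24, 35))

short_row = [str(i) for i in range(34)]  # length 34: last valid index is 33
long_row = [str(i) for i in range(35)]   # length 35: index 34 exists

print(pick_fields(short_row, wanted))           # None
print(','.join(pick_fields(long_row, wanted)))  # 0,24,25,26,27,28,29,30,31,32,33,34
```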
【Solution 2】:

Here is an example where the data rows range in length from 20 to 39. Only rows whose length is between 24 and 34 and whose column 9 == 3080 are printed:

input.csv

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21,22
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37
0,1,2,3,4,5,6,7,8,3080,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38

test.py

import csv

with open('input.csv','r',newline='') as f:
    r = csv.reader(f)
    for row in r:
        if 24 <= len(row) <= 34:
            if row[9] == '3080':
                print(','.join([row[0]] + row[24:]))

Output:

0,24,25,26,27,28,29,30,31
0,24,25,26,27,28,29,30,31,32
0,24,25,26,27,28,29,30,31,32,33
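
The question loops over every file in a directory, so the same filter could be wrapped in a generator along these lines. This is only a sketch: `iter_matching_rows` and its parameters are hypothetical names, and the demo writes made-up data to a temporary directory rather than reading the asker's folder.

```python
import csv
import os
import tempfile

def iter_matching_rows(directory, min_len=24, max_len=34, key='3080'):
    """Yield [row[0]] + row[24:] for matching rows in every file of `directory`."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        with open(path, 'r', newline='') as f:
            for row in csv.reader(f):
                # len(row) > 9 guards row[9] in case min_len is lowered.
                if min_len <= len(row) <= max_len and len(row) > 9 and row[9] == key:
                    yield [row[0]] + row[24:]

# Demo on a temporary directory holding one small file (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    rows = []
    for n in (23, 32, 35):
        fields = [str(i) for i in range(n)]
        fields[9] = '3080'
        rows.append(fields)
    with open(os.path.join(tmp, 'sample.csv'), 'w', newline='') as f:
        csv.writer(f).writerows(rows)
    result = [','.join(fields) for fields in iter_matching_rows(tmp)]

print(result)  # ['0,24,25,26,27,28,29,30,31']
```

Slicing with `row[24:]` never raises, which is why this approach avoids the `IndexError` that fixed indices such as `row[34]` can trigger.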

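【Comments】:

  • Why aren't the 34 and 38 rows printed?
  • 0-34 is length 35 and 0-38 is length 39, both outside the 24-34 length range. If that confuses you, I don't think you explained your requirements very well.
  • Yes. Got it. Thank you. Could you upvote my question?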
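
The exchange above turns on the difference between a row's length and its last index. If the intended requirement was that the last column index, rather than the length, falls between 24 and 34 (an assumption, since the question does not say so), the test could be written as in this sketch. The in-memory data and the `make_row` helper are made up for illustration.

```python
import csv
import io

def make_row(n):
    """Build a CSV line with fields 0..n-1 and '3080' in column 9."""
    fields = [str(i) for i in range(n)]
    fields[9] = '3080'
    return ','.join(fields)

# In-memory stand-in for a CSV file with rows of 34, 35 and 39 fields.
data = io.StringIO('\n'.join(make_row(n) for n in (34, 35, 39)) + '\n')

selected = []
for row in csv.reader(data):
    last_index = len(row) - 1  # a row of length n has indices 0..n-1
    if 24 <= last_index <= 34 and row[9] == '3080':
        selected.append(len(row))

print(selected)  # [34, 35]
```

Under this reading the row ending in 34 (length 35) is selected, while the row ending in 38 is still excluded.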