【Posted】: 2021-05-20 04:11:19
【Question】:
I have a list named new_to_csv that looks like this:
[['88', 'BRC', 'LON', '2020-09-07 06:05:07+00:00', '2020-09-07 09:03:00+00:00'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00'],
['200', 'BRC', 'LON', '2020-09-07 05:08:32+00:00', '2020-09-07 13:05:00+00:00']]
I also have a file named table_x that looks like this:
XXXXXX Version 1.0 BATOS DE ESPANA
11-12-2011 t/m 08-12-2012
RG Mini Maxi Organization ISVL PR BSS Afw Rage TrT Trajecto IBB Ant Dato
*****data*****
R 1 99 BALEARIA BALEARIA Gers INT INT C IB 99
R 100 103 Espanola NIG 4
R 104 105 BALEARIA BALEARIA PC ICE C Asd - Ut - Ah - Espana IB 2
R 106 119 Espanola NIG 14
R 120 129 BALEARIA BALEARIA PC ICE C Asd - Ut - Ah - Barcelona - Almeria IB 10
R 130 139 Espanola NIG 10
R 140 149 BALEARIA BALEARIA PC INT C Shl - Amf - Dv - Hgl - Bh - Algeciras IB 10
R 150 159 BALEARIA BALEARIA SVS ICE PC ICE C Asd - Ut - Vl - Barcelona - Almeria IB 10
R 160 219 BALEARIA BALEARIA Gere INT PC INT C IB 60
R 220 229 BALEARIA BALEARIA PC ICE C Asd - Ut - Ah - Barcelona - Almeria IB 10
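Now, for each x[0] in new_to_csv, I first check whether it is a number. If it is, I check whether it lies between Mini and Maxi. If it does, I take columns 3:8 of the row it falls in and append them to that list. If x[0] is not a number, I skip it and write "Unknown" as the output.
Here is my code: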
import csv

table_x = r"C:\Users\ELK\Downloads\table_x.txt"
result_list = []
with open(table_x, "r") as outer:
    reader = csv.reader(outer, delimiter="\t")
    next(reader)  # skip the first 4 lines
    next(reader)
    next(reader)
    next(reader)
    for row in reader:
        mini = row[1]
        maxi = row[2]
        for x in new_to_csv:
            veh_number = x[0]
            check_digit = veh_number.isdigit()
            if check_digit:  # <--- check if it's a digit
                if int(veh_number) > int(mini) and int(veh_number) < int(maxi):
                    result_list.append(x + [row[3]] + [row[4]] + [row[5]] + [row[6]] + [row[7]] + [row[8]])
            if check_digit is False:  # if it's not a digit
                result_list.append(x + [row[3]] + [row[4]] + [row[5]] + ["Unkown"])
print(result_list)
This is my expected output:
[['88', 'BRC', 'LON', '2020-09-07 06:05:07+00:00', '2020-09-07 09:03:00+00:00', 'BALEARIA', 'BALEARIA', '', 'Gers INT', '', 'INT'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00','Unkown', 'Unkown', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'Unkown', 'Unkown', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00','2020-09-07 23:30:01+00:00', 'Unkown', 'Unkown', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'Unkown', 'Unkown', '', 'Unkown'],
['200', 'BRC', 'LON', '2020-09-07 05:08:32+00:00', '2020-09-07 13:05:00+00:00', 'BALEARIA', 'BALEARIA', '', 'Gere INT', 'PC', 'INT']]
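Above you see my expected output. 88 is a number and lies between 1 and 99, so I take columns 3:8 from that row. *021 is not a number, so I write 'Unknown', and so on.
Below you see the output I actually receive; it seems to process the non-digits multiple times. I don't understand why...
This is the output I receive: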
[['88', 'BRC', 'LON', '2020-09-07 06:05:07+00:00', '2020-09-07 09:03:00+00:00', 'BALEARIA', 'BALEARIA', '', 'Gers INT', '', 'INT'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00','BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00','2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'Espanola', '', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'Espanola', '', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'Espanola', '', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'Espanola', '', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'Espanola', '', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'Espanola', '', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'Espanola', '', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'Espanola', '', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'Espanola', '', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'Espanola', '', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'Espanola', '', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'Espanola', '', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA','BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['200', 'BRC', 'LON', '2020-09-07 05:08:32+00:00', '2020-09-07 13:05:00+00:00', 'BALEARIA', 'BALEARIA', '', 'Gere INT', 'PC', 'INT'],
['*021', 'BRC', 'LON', '2020-09-07 08:26:07+00:00', '2020-09-07 09:34:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['*023', 'BRC', 'LON', '2020-09-07 08:30:00+00:00', '2020-09-07 13:50:18+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['nodigit', 'BRC', 'LON', '2020-09-07 03:00:03+00:00', '2020-09-07 23:30:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown'],
['++158', 'BRC', 'LON', '2020-09-07 05:42:05+00:00', '2020-09-07 10:02:01+00:00', 'BALEARIA', 'BALEARIA', '', 'Unkown']]
【Comments】:
-
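Why does your expected output have 'Unkown', 'Unkown', '', 'Unkown' rather than a single "Unknown", or as many "Unknown" entries as there are fields in columns 3:8?
-
@00 I really don't know... what do you mean?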
-
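You should read table_x on its own (in one loop), store the result in a variable (perhaps a dict or similar), and then iterate over new_to_csv separately. Right now you are iterating over new_to_csv inside your iteration over table_x. That won't work as is.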
-
@00 But why does it work fine for the digits? The problem only seems to occur with the non-digits.
-
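What I mean is, you state "If x[0] is not a digit, I skip it and write 'Unkown' as output." Yet the expected output for rows whose first column is not a number has "Unknown" three times plus an empty string, rather than the single "Unknown" that sentence implies.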
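The restructuring suggested in the comments (read table_x once, then iterate over new_to_csv separately) could be sketched roughly as below. In the posted code the non-digit branch sits inside the loop over table rows, so every non-numeric entry is appended once per table row, while a numeric entry is appended only for the one row whose range contains it. This is only a sketch under several assumptions: the helper names load_ranges and annotate are made up for illustration, the tab layout of the sample rows is inferred from the outputs shown in the question, the range check is treated as inclusive, and non-numeric or unmatched entries get a single "Unknown" placeholder (the question leaves the exact placeholder shape open).

```python
import csv
import io


def load_ranges(lines, header_lines=4):
    """Read the table once; return a list of (mini, maxi, columns 3..8)."""
    reader = csv.reader(lines, delimiter="\t")
    ranges = []
    for i, row in enumerate(reader):
        if i < header_lines or len(row) < 3:
            continue  # skip the header block and blank/short lines
        if not (row[1].isdigit() and row[2].isdigit()):
            continue  # skip rows without numeric Mini/Maxi values
        # Pad short rows so the slice always yields six fields.
        fields = (row[3:9] + [""] * 6)[:6]
        ranges.append((int(row[1]), int(row[2]), fields))
    return ranges


def annotate(entries, ranges):
    """Append the matching table fields (or "Unknown") to each entry once."""
    result = []
    for entry in entries:
        number = entry[0]
        extra = ["Unknown"]  # assumption: one placeholder per entry
        if number.isdigit():
            for mini, maxi, fields in ranges:
                if mini <= int(number) <= maxi:  # assumption: inclusive
                    extra = fields
                    break
        result.append(entry + extra)
    return result


# Hypothetical sample data; with the real file, pass the open file object.
sample_table = io.StringIO(
    "XXXXXX Version 1.0\n"
    "11-12-2011 t/m 08-12-2012\n"
    "RG\tMini\tMaxi\tOrganization\n"
    "*****data*****\n"
    "R\t1\t99\tBALEARIA\tBALEARIA\t\tGers INT\t\tINT\tC\t\tIB\t99\n"
    "R\t100\t103\tEspanola\t\t\t\t\t\t\t\tNIG\t4\n"
    "R\t160\t219\tBALEARIA\tBALEARIA\t\tGere INT\tPC\tINT\tC\t\tIB\t60\n"
)
sample_entries = [
    ["88", "BRC", "LON"],
    ["*021", "BRC", "LON"],
    ["200", "BRC", "LON"],
]

ranges = load_ranges(sample_table)
result_list = annotate(sample_entries, ranges)
for row in result_list:
    print(row)
```

Because the outer loop now runs over new_to_csv, each entry produces exactly one output row regardless of how many rows the table has.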
Tags: python list csv for-loop filter