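【Posted】:2020-09-08 17:41:55
【Question】:
I am trying to parse generated files into a list of objects.
Unfortunately, the generated files do not always have the same structure, but they all contain the same fields (along with a lot of other garbage).
For example: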
function foo(); # Don't Care
function maybeanotherfoo(); # Don't Care
int maybemoregarbage; # Don't Care
product_serial = "CDE1102"; # I want this <---------------------
unnecessary_info1 = 10; # Don't Care
unnecessary_info2 = "red" # Don't Care
product_id = 1134412; # I want this <---------------------
unnecessary_info3 = "88" # Don't Care
product_serial = "DD1232"; # I want this <---------------------
product_id = 3345111; # I want this <---------------------
unnecessary_info1 = "22" # Don't Care
unnecessary_info2 = "panda" # Don't Care
product_serial = "CDE1102"; # I want this <---------------------
unnecessary_info1 = 10; # Don't Care
unnecessary_info2 = "red" # Don't Care
unnecessary_info3 = "bear" # Don't Care
unnecessary_info4 = 119 # Don't Care
product_id = 1112331; # I want this <---------------------
unnecessary_info5 = "jj" # Don't Care
I want a list of objects, where each object has a serial number and an ID.
I tried the following:
import re

class Product:
    def __init__(self, id, serial):
        self.product_id = id
        self.product_serial = serial

linenum = 0
first_string = "product_serial"
second_string = "product_id"
with open('products.txt', "r") as products_file:
    for line in products_file:
        linenum += 1
        if line.find(first_string) != -1:
            product_serial = re.search('\"([^"]+)', line).group(1)
            # How do I proceed?
Any suggestions would be greatly appreciated. Thanks!
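【Comments】:
-
So what does your code do? Does it work? Are there errors? If so, what are they?
-
My code finds the first product_serial (CDE1102). But how do I find the product_id and then continue parsing from there?
-
Please review on topic and how to ask from the intro tour. "Show me how to solve this coding problem" is not a Stack Overflow question. You are expected to make an honest attempt, and then ask a specific question about your algorithm or technique. "Any suggestions" is too broad for Stack Overflow. There are many tutorials that show how to read a file, how to process string data, and so on. You should be able to identify the constant strings in the input and split the input lines.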
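One possible way to continue from the loop in the question is to remember the most recent product_serial and emit an object as soon as the matching product_id shows up. The sketch below is not from the thread; it assumes, as in the sample, that every product_serial line comes before its product_id line and that both fields sit at the start of a line. The names `parse_products`, `SERIAL_RE`, and `ID_RE` are hypothetical helpers introduced for illustration, and `Product` is rewritten as a dataclass for brevity.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Product:
    product_id: int
    product_serial: str


# Assumption: fields look like the sample, e.g.
#   product_serial = "CDE1102";
#   product_id = 1134412;
SERIAL_RE = re.compile(r'^\s*product_serial\s*=\s*"([^"]+)"')
ID_RE = re.compile(r'^\s*product_id\s*=\s*(\d+)')


def parse_products(lines: Iterable[str]) -> List[Product]:
    """Pair each product_serial with the next product_id that follows it."""
    products: List[Product] = []
    pending_serial: Optional[str] = None  # serial still waiting for its id

    for line in lines:
        serial_match = SERIAL_RE.match(line)
        if serial_match:
            pending_serial = serial_match.group(1)
            continue

        id_match = ID_RE.match(line)
        # Assumption: an id with no preceding serial is ignored.
        if id_match and pending_serial is not None:
            products.append(Product(int(id_match.group(1)), pending_serial))
            pending_serial = None

    return products
```

Because a file object is iterable line by line, this could be called as `parse_products(products_file)` inside the existing `with open('products.txt', "r") as products_file:` block. If the real files can list product_id before product_serial, the pairing logic would need to track both pending values instead of just the serial.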