【Posted】: 2021-11-18 16:28:05
【Question】:
Update (I got a bit further...)
My goal is to write a parser for a strange script format that looks like XML but is not XML.
<[file][][]
<[cultivation][][]
<[string8][coordinate_system][lonlat]>
<[list_vegetation_map_exclusion_zone][vegetation_map_exclusion_zone_list][]
>
<[string8][buildings_texture_folder][]>
<[list_plant][plant_list][]
>
<[list_building][building_list][]
<[building][element][0]
<[vector3_float64][position][7.809637 46.182262 0]>
<[float32][direction][-1.82264196872711]>
<[float32][length][25.9434452056885]>
<[float32][width][17.4678573608398]>
<[int32][floors][3]>
<[stringt8c][roof][gable]>
<[stringt8c][usage][residential]>
> ...
So far I have this:
import pyparsing as pp
from pyparsing import (Combine, Literal, OneOrMore, Optional, SkipTo,
                       Suppress, Word, alphas, nums)


def toc_parser(file_path):
    # read the complete file into a variable
    with open(file_path, "r") as f:
        toc = f.read()
    parser = OneOrMore(Word(alphas))
    # ignore // comments
    # note: `parser` is not part of `fileganz` below, so this ignore does not affect it
    parser.ignore('//' + pp.restOfLine())
    # suppress the delimiters < > [ ]
    klammern = Suppress("<")
    klammernzu = Suppress(">")
    eckig = Suppress("[")
    eckigzu = Suppress("]")
    element = Suppress("[element]")
    leer = Suppress("[]")
    # grammar:
    nameBuilding = "building"
    namePosition = "position"
    nameDirection = "direction"
    nameLength = "length"
    nameWidth = "width"
    nameFloors = "floors"
    nameRoof = "roof"
    nameUsage = "usage"
    buildingzahl = klammern + eckig + nameBuilding + eckigzu + element + eckig + Word(nums) + eckigzu
    pos = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + namePosition + eckigzu + eckig + Combine(Word(nums) + "." + Word(nums)) + Combine(Word(nums) + "." + Word(nums)) + Word(nums) + eckigzu + klammernzu
    direc = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameDirection + eckigzu + eckig + Combine(Optional("-") + Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    leng = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameLength + eckigzu + eckig + Combine(Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    widt = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameWidth + eckigzu + eckig + Combine(Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    floors = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameFloors + eckigzu + eckig + Word(nums) + eckigzu + klammernzu
    roof = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameRoof + eckigzu + eckig + Word(alphas) + eckigzu + klammernzu
    usag = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameUsage + eckigzu + eckig + Word(alphas) + eckigzu + klammernzu
    building = buildingzahl + pos + direc + leng + widt + floors + roof + usag + klammernzu
    file = klammern + eckig + Literal("file") + eckigzu + leer + leer + klammern + eckig + Literal("cultivation") + eckigzu + leer + leer
    vegexcl = Literal("<[list_vegetation_map_exclusion_zone][vegetation_map_exclusion_zone_list][]") + klammernzu
    coordsis = Literal("<[string8][coordinate_system][lonlat]>")
    textures = Literal("<[string8][buildings_texture_folder][]>")
    listPlants = Literal("<[list_plant][plant_list][]") + klammernzu
    listBuildings = Literal("<[list_building][building_list][]") + OneOrMore(building) + klammernzu
    listLights = Literal("<[list_light][light_list][]") + klammernzu
    listAirportLights = Literal("<[list_airport_light][airport_light_list][]") + klammernzu
    listXref = Literal("<[list_xref][xref_list][]") + klammernzu
    fileganz = file + coordsis + vegexcl + textures + listPlants + listBuildings + listLights + listAirportLights + listXref + klammernzu + klammernzu
    print(fileganz.parseString(toc))
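Problem:
I need to be able to override certain values in the script from the outside. I found (here) that this is how you do it, but it always ends up in the "else" branch: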
from pyparsing import ParseException

# define the values to be updated
valuesToUpdate = {
    "building": "home",
}

def updateSelectedDefinitions(tokens):
    if tokens.name in valuesToUpdate:
        newVal = valuesToUpdate[tokens.name]
        return "%s" % tokens.name, newVal
    else:
        raise ParseException("no update defined")
Thank you very much for your help :)
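One plausible reason the function above always reaches the else branch: pyparsing's ParseResults returns an empty string for a results name that was never defined, and none of the expressions in the grammar above is labeled "name". The following is a minimal sketch, not the original poster's code. The `entry` grammar, the contents of `values_to_update`, and the `update_selected` function are all assumptions made for illustration, and the sketch only handles leaf entries with a non-empty value. It labels the tokens with results names and uses `transformString` to rewrite matching entries in place.

```python
import pyparsing as pp

# Hypothetical overrides, keyed by the second bracketed field (the name).
values_to_update = {"floors": "5", "roof": "flat"}

# An undefined results name does not raise, it silently yields "".
unnamed = pp.Word(pp.alphas).parseString("building")
print(repr(unnamed.name))

LBRACK, RBRACK = pp.Suppress("["), pp.Suppress("]")
ident = pp.Word(pp.alphanums + "_")

# A leaf entry <[type][name][value]>; each field gets a results name.
entry = (
    pp.Suppress("<")
    + LBRACK + ident("type") + RBRACK
    + LBRACK + ident("name") + RBRACK
    + LBRACK + pp.CharsNotIn("]")("value") + RBRACK
    + pp.Suppress(">")
)


def update_selected(tokens):
    """Rebuild the entry text, swapping in an override when one exists."""
    new_value = values_to_update.get(tokens["name"], tokens["value"])
    return "<[{}][{}][{}]>".format(tokens["type"], tokens["name"], new_value)


entry.setParseAction(update_selected)

text = """<[building][element][0]
    <[int32][floors][3]>
    <[stringt8c][roof][gable]>
    <[stringt8c][usage][residential]>
>"""

# transformString replaces every match of `entry` with the parse action's result
# and copies everything else through unchanged.
updated = entry.transformString(text)
print(updated)
```

Entries without an override are rebuilt with their original value, so nothing has to be raised for the "no update" case.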
【Comments】:
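- XML parsers usually parse the generic <tag attr=val>some content</tag> format without hard-coding the actual tag values. The generic framework of your structure is <[type][name][value] contents...>, where the optional contents are recursive instances of the same <[type][name] etc.> format. Writing this in pyparsing should be very simple, just a few lines of code. You would then walk the parsed structure to extract "building" or "position" or whatever values you need. You might also consider having the parser convert to JSON or XML and then using the stdlib to extract your values.
- @PaulMcG Could you elaborate on how I would do that? Maybe with an example?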
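The generic approach suggested in the first comment could look roughly like the sketch below. It is not PaulMcG's actual answer, only one way to read the suggestion. The names `field`, `node`, `to_dict`, and `find_all` are hypothetical, every node is assumed to carry exactly three bracketed fields, and a field is assumed never to contain a "]" character.

```python
import json

import pyparsing as pp

LT, GT = pp.Suppress("<"), pp.Suppress(">")
# One bracketed field, e.g. [float32] or []; assumes no "]" inside a field.
field = pp.Regex(r"\[[^\]]*\]")

# A node is <[type][name][value] child child ...>, and each child is a node again.
node = pp.Forward()
node <<= pp.Group(LT + field + field + field + pp.Group(pp.ZeroOrMore(node)) + GT)


def to_dict(parsed):
    """Turn one parsed node into a plain dict, stripping the [ ] brackets."""
    type_, name, value, children = parsed
    return {
        "type": type_[1:-1],
        "name": name[1:-1],
        "value": value[1:-1],
        "children": [to_dict(child) for child in children],
    }


def find_all(tree, key, wanted):
    """Yield every node dict in the tree whose `key` field equals `wanted`."""
    if tree[key] == wanted:
        yield tree
    for child in tree["children"]:
        yield from find_all(child, key, wanted)


SAMPLE = """
<[file][][]
  <[cultivation][][]
    <[string8][coordinate_system][lonlat]>
    <[list_building][building_list][]
      <[building][element][0]
        <[vector3_float64][position][7.809637 46.182262 0]>
        <[int32][floors][3]>
        <[stringt8c][roof][gable]>
      >
    >
  >
>
"""

tree = to_dict(node.parseString(SAMPLE, parseAll=True)[0])

# Walk the structure: print the position of every building.
for building in find_all(tree, "type", "building"):
    position = next(find_all(building, "name", "position"))
    print(position["value"])

# The plain-dict form can be handed to the stdlib, for example as JSON.
as_json = json.dumps(tree, indent=2)
```

Because the grammar does not hard-code any tag names, new entry types in the file need no grammar changes; only the code that walks the resulting dicts has to know which types or names it is looking for.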
Tags: python python-3.x list parsing pyparsing