【Posted】: 2023-03-21 09:23:01
【Question】:
I want to use XPath to extract N.1.2, N.1.1, N.2.r.1, ..., N.1.3, N.1.4.
To do that, I keep the XPath expressions in a dictionary.
# Value - Types of Message in batch
"N.1.1": R3Item(
    elemId="N.1.1",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/name[@codeSystem='2.16.840.1.113883.3.989.2.1.1.1']/@code",
    required=True,
    comment="N.1.1 - Types of Message in batch",
),
# Types of Message in batch
"N.1.1_csv": R3Item(
    elemId="N.1.1_csv",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/name[@codeSystem='2.16.840.1.113883.3.989.2.1.1.1']/@codeSystemVersion",
    required=True,
),
# Value - Batch Number
"N.1.2": R3Item(
    elemId="N.1.2",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/id[@root='2.16.840.1.113883.3.989.2.1.3.22']/@extension",
    required=True,
    comment="N.1.2 - Batch Number",
),
# Value - Batch Sender Identifier
"N.1.3": R3Item(
    elemId="N.1.3",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/sender[@typeCode='SND']/device[@classCode='DEV'][@determinerCode='INSTANCE']/id[@root='2.16.840.1.113883.3.989.2.1.3.13'][1]/@extension",
    required=True,
    comment="N.1.3 - Batch Sender Identifier",
),
# Value - Batch Receiver Identifier
"N.1.4": R3Item(
    elemId="N.1.4",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/receiver[@typeCode='RCV']/device[@classCode='DEV'][@determinerCode='INSTANCE']/id[@root='2.16.840.1.113883.3.989.2.1.3.14'][1]/@extension",
    required=True,
    comment="N.1.4 - Batch Receiver Identifier",
),
# Value - Date of Batch Transmission
"N.1.5": R3Item(
    elemId="N.1.5",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/creationTime/@value",
    required=True,
    comment="N.1.5 - Date of Batch Transmission",
),
# Value - Message Identifier
"N.2.r.1": R3Item(
    elemId="N.2.r.1",
    xPath="//PORR_IN049016UV[r]/id[@root='2.16.840.1.113883.3.989.2.1.3.1'][1]/@extension",
    required=True,
    comment="N.2.r.1 - Message Identifier",
),
# Value - Message Sender Identifier
"N.2.r.2": R3Item(
    elemId="N.2.r.2",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/PORR_IN049016UV[r]/sender[@typeCode='SND']/device[@classCode='DEV'][@determinerCode='INSTANCE']/id[@root='2.16.840.1.113883.3.989.2.1.3.11'][1]/@extension",
    required=True,
    comment="N.2.r.2 - Message Sender Identifier",
),
# Value - Message Receiver Identifier
"N.2.r.3": R3Item(
    elemId="N.2.r.3",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/PORR_IN049016UV[r]/receiver[@typeCode='RCV']/device[@classCode='DEV'][@determinerCode='INSTANCE']/id[@root='2.16.840.1.113883.3.989.2.1.3.12'][1]/@extension",
    required=True,
    comment="N.2.r.3 - Message Receiver Identifier",
),
# Value - Date of Message Creation
"N.2.r.4": R3Item(
    elemId="N.2.r.4",
    xPath="/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/PORR_IN049016UV[r]/creationTime/@value",
    required=True,
    comment="N.2.r.4 - Date of Message Creation",
),
Below is part of the sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<MCCI_IN200100UV01 ITSVersion="XML_1.0" xsi:schemaLocation="urn:hl7-org:v3 MCCI_IN200100UV01.xsd" xmlns="urn:hl7-org:v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <id extension="N.1.2" root="2.16.840.1.113883.3.989.2.1.3.22"/>
    <creationTime value="N.1.5"/>
    <responseModeCode code="D"/>
    <interactionId extension="MCCI_IN200100UV01" root="2.16.840.1.113883.1.6"/>
    <name code="N.1.1" codeSystem="2.16.840.1.113883.3.989.2.1.1.1" codeSystemVersion="1.01"/>
    <PORR_IN049016UV>
        <id extension="N.2.r.1" root="2.16.840.1.113883.3.989.2.1.3.1"/>
        <creationTime value="N.2.r.4"/>
        <interactionId extension="PORR_IN049016UV" root="2.16.840.1.113883.1.6"/>
        <processingCode code="P"/>
        <processingModeCode code="T"/>
        <acceptAckCode code="AL"/>
        <receiver typeCode="RCV">
            <device classCode="DEV" determinerCode="INSTANCE">
                <id extension="N.2.r.3" root="2.16.840.1.113883.3.989.2.1.3.12"/>
            </device>
        </receiver>
    </PORR_IN049016UV>
    <receiver typeCode="RCV">
        <device classCode="DEV" determinerCode="INSTANCE">
            <id extension="N.1.4" root="2.16.840.1.113883.3.989.2.1.3.14"/>
        </device>
    </receiver>
    <sender typeCode="SND">
        <device classCode="DEV" determinerCode="INSTANCE">
            <id extension="N.1.3" root="2.16.840.1.113883.3.989.2.1.3.13"/>
        </device>
    </sender>
</MCCI_IN200100UV01>
Below is my code, but every result is an empty list. I want to extract values such as "N.1.1".
def extractData(tree):
    """r3 data extracted by xpath"""
    root = tree.getroot()
    keys = getList(R3_DATA)
    for key in keys:
        xPath = getxPath(key)
        print(root.xpath(xPath))
How can I fix this, or what should I do instead? If another library or some sample code can do this, could you point me to it?
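On the request for another library: a minimal sketch using only the standard library's `xml.etree.ElementTree`, assuming Python 3. ElementTree supports only a subset of XPath and cannot end a path in `/@attribute`, so the attribute is read with `.get()` after locating the element. The `hl7` prefix and the `attr` helper are names invented for this example, and the XML is a trimmed copy of the sample above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample batch; only the elements used below are kept.
XML = """<MCCI_IN200100UV01 ITSVersion="XML_1.0" xmlns="urn:hl7-org:v3">
  <id extension="N.1.2" root="2.16.840.1.113883.3.989.2.1.3.22"/>
  <name code="N.1.1" codeSystem="2.16.840.1.113883.3.989.2.1.1.1" codeSystemVersion="1.01"/>
  <receiver typeCode="RCV">
    <device classCode="DEV" determinerCode="INSTANCE">
      <id extension="N.1.4" root="2.16.840.1.113883.3.989.2.1.3.14"/>
    </device>
  </receiver>
</MCCI_IN200100UV01>"""

# "hl7" is an arbitrary prefix chosen for this sketch; only the URI must match.
NS = {"hl7": "urn:hl7-org:v3"}


def attr(root, path, name):
    """Return attribute `name` of the first element matching `path`, or None."""
    node = root.find(path, NS)
    return None if node is None else node.get(name)


root = ET.fromstring(XML)

# Paths are relative to the root element, so the root's own name is not repeated.
batch_type = attr(
    root, "hl7:name[@codeSystem='2.16.840.1.113883.3.989.2.1.1.1']", "code"
)
receiver_id = attr(
    root,
    "hl7:receiver[@typeCode='RCV']/hl7:device"
    "/hl7:id[@root='2.16.840.1.113883.3.989.2.1.3.14']",
    "extension",
)
print(batch_type, receiver_id)
```

This trades full XPath for zero dependencies; the existing dictionary of full XPath 1.0 expressions would still need lxml or a similar engine.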
【Comments】:
-
getvalue returns the dictionary path for a key.
-
The elements are in the namespace xmlns="urn:hl7-org:v3", so your XPath evaluation code needs to take that namespace into account.
-
In some older lxml versions, though not the latest ones, I think you could use root.xpath(xPath, namespaces = { None : 'urn:hl7-org:v3' })
-
That raises an error (TypeError: empty namespace prefix is not supported in XPath). root.xpath("/MCCI_IN200100UV01[@ITSVersion='XML_1.0'][@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']/PORR_IN049016UV[1]/sender[@typeCode='SND']/device[@classCode='DEV'][@determinerCode='INSTANCE']/id[@root='2.16.840.1.113883.3.989.2.1.3.11'][1]/@extension", namespaces={None: 'urn:hl7-org:v3', "xsi": "w3.org/2001/XMLSchema-instance"})
-
Every element in each path needs to use a namespace prefix.
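The namespace advice in the comments can be sketched as follows, assuming lxml is installed. The prefix `hl7` is an arbitrary name chosen for this example (XPath 1.0 has no default namespace, so some prefix must be bound to `urn:hl7-org:v3`), and the XML is a trimmed copy of the sample from the question. Attributes without a prefix are in no namespace, so only element steps and `xsi:schemaLocation` need one.

```python
from lxml import etree

# Trimmed copy of the sample batch from the question.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<MCCI_IN200100UV01 ITSVersion="XML_1.0"
    xsi:schemaLocation="urn:hl7-org:v3 MCCI_IN200100UV01.xsd"
    xmlns="urn:hl7-org:v3"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <id extension="N.1.2" root="2.16.840.1.113883.3.989.2.1.3.22"/>
  <name code="N.1.1" codeSystem="2.16.840.1.113883.3.989.2.1.1.1" codeSystemVersion="1.01"/>
  <PORR_IN049016UV>
    <id extension="N.2.r.1" root="2.16.840.1.113883.3.989.2.1.3.1"/>
  </PORR_IN049016UV>
</MCCI_IN200100UV01>"""

# Prefix names are free to choose; the URIs must match the document exactly.
NS = {
    "hl7": "urn:hl7-org:v3",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

root = etree.fromstring(XML)

# Unprefixed element names look for elements in no namespace, so nothing matches.
unprefixed = root.xpath("/MCCI_IN200100UV01/name/@code")

# Same idea with every element step prefixed.
batch_type = root.xpath(
    "/hl7:MCCI_IN200100UV01[@ITSVersion='XML_1.0']"
    "[@xsi:schemaLocation='urn:hl7-org:v3 MCCI_IN200100UV01.xsd']"
    "/hl7:name[@codeSystem='2.16.840.1.113883.3.989.2.1.1.1']/@code",
    namespaces=NS,
)

# The [r] placeholder is replaced with a concrete position (1 here).
message_id = root.xpath(
    "//hl7:PORR_IN049016UV[1]"
    "/hl7:id[@root='2.16.840.1.113883.3.989.2.1.3.1'][1]/@extension",
    namespaces=NS,
)
print(unprefixed, batch_type, message_id)
```

In `extractData`, the equivalent change would be to pass the same `namespaces` mapping to `root.xpath(...)` and to store prefixed paths in the dictionary.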
Tags: python-3.x xml xpath xml-parsing lxml
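Rather than editing every dictionary entry by hand, the paths could be rewritten at lookup time. The sketch below is a hypothetical helper (`qualify`, with an assumed `hl7` prefix), not part of lxml. It also treats `[r]` in the N.2.r.* entries as a repetition placeholder, which is an assumption based on the `[1]` used in the comments; as literal XPath, `[r]` would test for a child element named `r`. The regular expression is deliberately simple: it assumes no path is already prefixed and that no quoted literal contains a `/` followed by a letter, which holds for the paths shown in the question.

```python
import re


def qualify(xpath, prefix="hl7", r=None):
    """Prefix every element step and optionally expand the [r] placeholder.

    Hypothetical helper for illustration only.
    """
    # An element step is a '/' followed directly by a name character.
    # Attribute steps ('/@...') and predicates ('[...]') are left untouched.
    qualified = re.sub(r"/(?=[A-Za-z_])", f"/{prefix}:", xpath)
    if r is not None:
        # Swap the placeholder for a concrete 1-based position.
        qualified = qualified.replace("[r]", f"[{r}]")
    return qualified


batch_number = qualify(
    "/MCCI_IN200100UV01[@ITSVersion='XML_1.0']"
    "/id[@root='2.16.840.1.113883.3.989.2.1.3.22']/@extension"
)
first_message_id = qualify(
    "//PORR_IN049016UV[r]/id[@root='2.16.840.1.113883.3.989.2.1.3.1'][1]/@extension",
    r=1,
)
print(batch_number)
print(first_message_id)
```

The rewritten strings can then be passed to `root.xpath(...)` together with a `namespaces` mapping that binds the chosen prefix to `urn:hl7-org:v3` and `xsi` to the XML Schema instance URI.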