How to remove bold tags from an XML document in Python 3 without removing the enclosed text?

【Question title】: How to get rid of the bold tag from xml document in python 3 without removing the enclosed text?
【Posted】: 2019-03-21 02:34:35
【Question description】:

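I am trying to remove the bold tags (<b> Some text in bold here </b>) from this xml document, while keeping the text enclosed by the tags intact. The bold tags appear around the following words/text: Objectives, Design, Setting, Participants, Interventions, Main outcome measures, Results, Conclusions, and Trial registration.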

Here is my Python code:

import urllib.request
import xml.etree.ElementTree as etree

urlHead = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&rettype=abstract&id='
pmid = "28420629"
completeUrl = urlHead + pmid
# Fetch the PubMed record and parse the XML response
response = urllib.request.urlopen(completeUrl)
tree = etree.parse(response)
studyAbstractParts = tree.findall('.//AbstractText')
for studyAbstractPart in studyAbstractParts:
    # .text only holds the text that precedes the first child element
    print(studyAbstractPart.text)

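The problem with this code is that it finds every "AbstractText" element, but the printed text stops at (or ignores) the bold tags and everything after them. What I need is all the text between the "<AbstractText> </AbstractText>" tags; the bold formatting <b> </b> is just getting in the way.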
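The behaviour described above follows from how ElementTree stores mixed content: an element's `.text` holds only the text before its first child, and the text that follows a child's closing tag is stored on that child as `.tail`. The sketch below illustrates this on a made-up sample string shaped like one `<AbstractText>` element (the sample content is an assumption, not the actual PubMed record).

```python
import xml.etree.ElementTree as etree

# Hypothetical sample, shaped like one <AbstractText> element with <b> labels
sample = (
    "<AbstractText><b>Objectives</b> To determine X. "
    "<b>Design</b> A trial.</AbstractText>"
)
elem = etree.fromstring(sample)

# Text before the first child element; empty here because <b> comes first
print(repr(elem.text))

# Each <b> child carries its own text plus the text that follows it (tail)
for child in elem:
    print(repr(child.text), repr(child.tail))
```

So printing only `.text` misses both the bold labels and the sentences after them.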

Tags: python xml elementtree

【Solution 1】:

You can use the itertext() method to get all the text inside <AbstractText>, including the text of its child elements.

studyAbstractParts = tree.findall('.//AbstractText')
for studyAbstractPart in studyAbstractParts:
    for t in studyAbstractPart.itertext():
        print(t)

Output:

Objectives
 To determine whether preoperative dexamethasone reduces postoperative vomiting in patients undergoing elective bowel surgery and whether it is associated with other measurable benefits during recovery from surgery, including quicker return to oral diet and reduced length of stay.
Design
 Pragmatic two arm parallel group randomised trial with blinded postoperative care and outcome assessment.
Setting
 45 UK hospitals.
Participants
 1350 patients aged 18 or over undergoing elective open or laparoscopic bowel surgery for malignant or benign pathology.
Interventions
 Addition of a single dose of 8 mg intravenous dexamethasone at induction of anaesthesia compared with standard care.
Main outcome measures
 Primary outcome: reported vomiting within 24 hours reported by patient or clinician.
vomiting with 72 and 120 hours reported by patient or clinician; use of antiemetics and postoperative nausea and vomiting at 24, 72, and 120 hours rated by patient; fatigue and quality of life at 120 hours or discharge and at 30 days; time to return to fluid and food intake; length of hospital stay; adverse events.
Results
 1350 participants were recruited and randomly allocated to additional dexamethasone (n=674) or standard care (n=676) at induction of anaesthesia. Vomiting within 24 hours of surgery occurred in 172 (25.5%) participants in the dexamethasone arm and 223 (33.0%) allocated standard care (number needed to treat (NNT) 13, 95% confidence interval 5 to 22; P=0.003). Additional postoperative antiemetics were given (on demand) to 265 (39.3%) participants allocated dexamethasone and 351 (51.9%) allocated standard care (NNT 8, 5 to 11; P<0.001). Reduction in on demand antiemetics remained up to 72 hours. There was no increase in complications.
Conclusions
 Addition of a single dose of 8 mg intravenous dexamethasone at induction of anaesthesia significantly reduces both the incidence of postoperative nausea and vomiting at 24 hours and the need for rescue antiemetics for up to 72 hours in patients undergoing large and small bowel surgery, with no increase in adverse events.
Trial registration
 EudraCT (2010-022894-32) and ISRCTN (ISRCTN21973627).
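
If each abstract part is wanted as a single string, or if the `<b>` elements should actually be removed from the tree, something along the following lines may help. This is a minimal sketch on a made-up sample rather than the live PubMed response; `joined_text` and `unwrap_children` are hypothetical helper names, and the unwrapping assumes the `<b>` elements contain only text and are direct children of `<AbstractText>`.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample standing in for the fetched PubMed XML
sample = (
    "<Abstract><AbstractText><b>Objectives</b> To determine X. "
    "<b>Design</b> A trial.</AbstractText></Abstract>"
)
root = etree.fromstring(sample)


def joined_text(elem):
    """Return all text inside elem, including text in and after child tags."""
    return "".join(elem.itertext())


def unwrap_children(parent, tag):
    """Remove direct children named `tag`, keeping their text and tail.

    Assumes the removed children hold only text (no nested elements).
    """
    previous = None  # last child that was kept
    for child in list(parent):
        if child.tag != tag:
            previous = child
            continue
        kept = (child.text or "") + (child.tail or "")
        if previous is None:
            # No kept sibling before it: merge into the parent's text
            parent.text = (parent.text or "") + kept
        else:
            # Otherwise append to the tail of the preceding kept sibling
            previous.tail = (previous.tail or "") + kept
        parent.remove(child)


part = root.find(".//AbstractText")
text_before = joined_text(part)
print(text_before)

unwrap_children(part, "b")
xml_after = etree.tostring(part, encoding="unicode")
print(xml_after)
```

`joined_text` leaves the tree untouched and is enough when only the plain text is needed; `unwrap_children` modifies the tree in place, which is useful if the XML is going to be serialised again without the bold markup.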
