【Posted】: 2019-10-24 07:16:03
【Question】:
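I have been using the following Python 3 code to extract text from content requests stored in a database. How can I strip the \r, \t and \n characters from the resulting string? I have tried regular expressions and various other approaches, but nothing has worked so far.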
import re

original = sitetextslist[0]
original = str(original)
# Attempt to collapse every run of whitespace into a single space
original = re.sub(r'\s+', ' ', original)
print(original)
When I run the Python code above, the following string is printed (line breaks added here for readability):
target="_blank" class="clearnet">Tor Network Status</a> (<a href="https://jlve2y45zacpbz6s.onion/"
rel="noreferrer" target="_blank">alt</a>)</li>\r\n\t\t\t\t\t<li><a href="https://www.privoxy.org/"
rel="noreferrer" target="_blank"
class="clearnet">Privoxy</a></li>\r\n\t\t\t\t\t</ul>\r\n\t\t\t\t\t</li>\r\n\t\t\t\t\t</ul>\r\n\t\t\t\t\t
<ul>\r\n\t\t\t\t\t<li>\r\n\t\t\t\t\t<h2>Security & Guides</h2>\r\n\t\t\t\t\t<ul>\r\n\t\t\t\t\t<li><a
href="https://en.wikipedia.org/wiki/Five_Eyes" rel="noreferrer" target="_blank" class="clearnet">5 Eyes
(info)</a></li>\r\n\t\t\t\t\t<li><a href="https://www.bleachbit.org/" rel="noreferrer" target="_blank"
class="clearnet">Bleachbit</a><!-- <span class="error" title="Blocks certain Tor exit nodes
completely! Be careful! (01/05/19)">Tor-Block!</span>--></li>\r\n\t\t\t\t\t<li><a
href="http://bv4saxizrmqmtqpz55bdsxlle2brn46kx7gvnneh7qgs267ii3s3vbid.onion
/viewtopic.php?f=150&t=89829" target="blank">The Official BV4 Whonix Guide</a></li>\r\n\t\t\t\t\t
<li><a href="http://world.std.com/~reinhold/diceware.html" rel="noreferrer" target="_blank"
class="clearnet">Diceware: Secure Passphrases</a></li>\r\n\t\t\t\t\t<li
【Discussion】:
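The printed output still shows \r, \n and \t after a `\s+` substitution, which suggests they are literal backslash sequences rather than real whitespace. One possible cause, which is an assumption the post does not confirm, is that `sitetextslist[0]` is a `bytes` object, so `str()` returns its repr with the control characters escaped. The sketch below uses a hypothetical `raw` value in place of the database row and shows two ways to clean it under that assumption.

```python
import re

# Hypothetical stand-in for sitetextslist[0]; assumes the database returns bytes.
raw = b'</li>\r\n\t\t\t\t\t<li><a href="https://www.privoxy.org/">Privoxy</a></li>'

# str() on bytes gives the repr, so \r, \n and \t become literal
# backslash sequences that \s+ cannot match.
as_repr = str(raw)
unchanged = re.sub(r'\s+', ' ', as_repr)

# Option 1: decode the bytes first, then collapse the real whitespace.
decoded = raw.decode('utf-8', errors='replace')
cleaned = re.sub(r'\s+', ' ', decoded).strip()

# Option 2: if only the escaped text is available, replace runs of the
# literal two-character sequences. The b'...' wrapper from the repr remains.
cleaned_literal = re.sub(r'(?:\\[rnt])+', ' ', as_repr)

print(cleaned)
print(cleaned_literal)
```

Decoding first is usually the safer choice when the original bytes are available, since Option 2 would also rewrite any legitimate backslash followed by r, n or t in the content. The UTF-8 encoding is itself a guess here; the right codec depends on how the pages were stored.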
Tags: python regex python-3.x