Is it possible to decode HTML using lynx, in a Python script? (Answer)

【Question title】: Is it possible to decode HTML using lynx, in a Python script?
【Posted】: 2021-09-07 16:25:31
【Question】:

Let the variable html be a string containing the entire source code of a web page, for example:

html = "<!doctype html>\n<html><head><title>My title</title></head>LOTS OF CHARS HERE</html>"

I would like to print this web page in a human-readable format, using lynx if possible. I have tried various things, such as

print(subprocess.run(['echo', html, '|', 'lynx', '-stdin', '-dump'], capture_output=True, text=True).stdout)

or

p1 = subprocess.Popen(["echo", html], stdout=subprocess.PIPE)
print(subprocess.run(['lynx', '-stdin', '-dump'], stdin=p1.stdout, capture_output=True, text=True).stdout)

but it fails with the following error:

OSError: [Errno 7] Argument list too long: 'echo'

Any idea how to get this working?

【Question comments】:

Tags: python subprocess lynx

【Solution 1】:

There is no need for echo; pass html directly to lynx as its input.

print(subprocess.run(['lynx', '-stdin', '-dump'], input=html, capture_output=True, text=True).stdout)
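
For context, the attempts in the question fail because the whole page is handed to echo as a command-line argument, and the operating system caps the size of an argument list (that is what Errno 7 reports). In the first attempt, the '|' is also passed to echo as a literal argument, because no shell is involved to interpret it as a pipe. Sending the text over stdin avoids both problems. Below is a minimal sketch of the same idea wrapped in a function; the name render_html and its cmd parameter are hypothetical, introduced here only for illustration, and it assumes lynx is installed and on PATH.

```python
import shutil
import subprocess


def render_html(html, cmd=("lynx", "-stdin", "-dump")):
    """Send html to the given command on stdin and return its stdout.

    render_html and cmd are illustrative names, not part of any library.
    The default command assumes lynx is installed and on PATH.
    """
    # Fail early with a clear message if the program is not available.
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]!r} was not found on PATH")

    # input= writes the string to the child's stdin, so the page never
    # appears on the command line and the argument-size limit does not apply.
    # text=True lets input and stdout be str rather than bytes.
    # check=True raises CalledProcessError on a non-zero exit status.
    result = subprocess.run(
        list(cmd),
        input=html,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With lynx available, print(render_html(html)) should behave like the one-liner above, with the added difference that a non-zero exit status from lynx raises an exception instead of silently printing empty output.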

【Comments】:

Nice; I didn't realize subprocess.run now allows passing a literal string to input.