How to extract an email body from a file using email.Parser? Answers

【Question title】: How to extract an email body from a file using email.Parser?
【Posted】: 2016-01-10 02:13:58
【Question description】:

I am trying to parse an email from a file using Python and email.Parser. I use the following command

headers = Parser().parse(open(filename, 'r'))

to parse the file. But when I try to get the body using, for example,

print(headers.get_payload()[0])

I get something like

From nobody Mon Oct 12 16:32:25 2015
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi Alex,
....

Is there any way to strip the first three or four lines? And how do I decode things like "fr=C3=BCher"?

【Discussion】:

Tags: python email

【Solution 1】:

Use Message.get_payload

See this answer: Python : How to parse the Body from a raw email , given that raw email does not have a "Body" tag or anything
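As a rough illustration of why printing get_payload()[0] shows header lines: for a multipart message, get_payload() returns a list of Message objects, and each of those still carries its own headers. The sketch below uses a made-up raw message (the addresses, boundary, and text are assumptions for the example, not taken from the question's file).

```python
import email
from email.message import Message

# Hypothetical raw message, invented purely for illustration.
RAW = (
    "From: sender@example.com\n"
    "To: alex@example.com\n"
    "Subject: Hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Hi Alex, wie fr=C3=BCher\n"
    "--BOUNDARY--\n"
)

msg = email.message_from_string(RAW)

parts = msg.get_payload()                 # multipart: a list of Message objects
first = parts[0]                          # a Message, so printing it includes its headers
raw_text = first.get_payload()            # str, still quoted-printable encoded
decoded = first.get_payload(decode=True)  # bytes, transfer encoding undone

print(type(first).__name__)
print(raw_text)
print(decoded.decode("utf-8"))
```

Calling get_payload() again on the sub-part, rather than printing the sub-part itself, is what drops the header lines.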

【Discussion】:

【Solution 2】:

To get the message body, you have to walk() its different parts, i.e.:

import email

# message_from_file is shorthand for Parser().parse
with open(filename, 'r') as f:
    a = email.message_from_file(f)
body = ''

if a.is_multipart():
    for part in a.walk():
        ctype = part.get_content_type()
        cdispo = str(part.get('Content-Disposition'))

        # skip any text/plain (txt) attachments
        if ctype == 'text/plain' and 'attachment' not in cdispo:
            body = part.get_payload(decode=True)  # decode (returns bytes in Python 3)
            break
# not multipart - i.e. plain text, no attachments
else:
    body = a.get_payload(decode=True)

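Passing decode=True to get_payload() undoes the Content-Transfer-Encoding (base64 or quoted-printable), which is what turns strings like 'fr=C3=BCher' back into readable text. In Python 3 the result is bytes, so it still has to be decoded with the part's charset to get a str.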
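The walk-and-decode approach above could be wrapped in a small helper that also applies the part's charset. This is only a sketch: the function name extract_text_body, the sample message, and the UTF-8 fallback when no charset is declared are all assumptions made for the example.

```python
import email


def extract_text_body(msg):
    """Return the first non-attachment text/plain part as str ('' if none)."""
    # walk() yields the message itself first, so this also covers
    # non-multipart messages without a separate branch.
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        if "attachment" in str(part.get("Content-Disposition")):
            continue
        raw = part.get_payload(decode=True)  # bytes after base64/quoted-printable decoding
        # Falling back to UTF-8 is an assumption, not something the library mandates.
        charset = part.get_content_charset() or "utf-8"
        return raw.decode(charset, errors="replace")
    return ""


# Hypothetical single-part message, invented for illustration.
RAW = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Hi Alex,\n"
    "wie fr=C3=BCher\n"
)

body = extract_text_body(email.message_from_string(RAW))
print(body)
```

Returning a str here is a design choice for convenience; code that needs the raw bytes can stop after get_payload(decode=True).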

【Discussion】:

Sure; you can see my longer rant on this topic in the question Anurag already linked