[Posted]: 2014-10-03 07:30:50
[Question]:
I have an input file like this:
This is a text block start
This is the end

And this is another
with more than one line
and another line.
The task is to read the file in sections separated by some special line, which in this case is an empty line, e.g. [out]:
[['This is a text block start', 'This is the end'],
['And this is another','with more than one line', 'and another line.']]
I have been getting the desired output with this:
def per_section(it):
    """ Read a file and yield sections using empty line as delimiter """
    section = []
    for line in it:
        if line.strip('\n'):
            section.append(line)
        else:
            yield ''.join(section)
            section = []
    # yield any remaining lines as a section too
    if section:
        yield ''.join(section)
But if the special line is one that starts with #, e.g.:
# Some comments, maybe the title of the following section
This is a text block start
This is the end
# Some other comments and also the title
And this is another
with more than one line
and another line.
I have to do this:
def per_section(it):
    """ Read a file and yield sections using lines starting with '#' as delimiter """
    section = []
    for line in it:
        if line[0] != "#":
            section.append(line)
        else:
            yield ''.join(section)
            section = []
    # yield any remaining lines as a section too
    if section:
        yield ''.join(section)
If I let per_section() take a delimiter parameter, I could try this:
def per_section(it, delimiter='\n'):
    """ Read a file and yield sections using an empty line or a '#' line as delimiter """
    section = []
    for line in it:
        if line.strip('\n') and delimiter == '\n':
            section.append(line)
        elif delimiter == '#' and line[0] != "#":
            section.append(line)
        else:
            yield ''.join(section)
            section = []
    # yield any remaining lines as a section too
    if section:
        yield ''.join(section)
But is there a way to do this without hard-coding every possible delimiter?
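One direction that might avoid the hard-coding, shown here only as a minimal sketch: let the caller pass a function that decides whether a line is a delimiter. The parameter name `is_delimiter` and its blank-line default are assumptions made for illustration and are not part of the code above. Unlike the versions above, this sketch yields lists of lines (matching the [out] shown earlier) and skips empty sections.

```python
import io


def per_section(it, is_delimiter=lambda line: not line.strip()):
    """Yield sections (lists of lines) split wherever is_delimiter(line) is true.

    is_delimiter is a hypothetical parameter: any callable taking a line and
    returning True for delimiter lines. The default treats blank or
    whitespace-only lines as delimiters.
    """
    section = []
    for line in it:
        if is_delimiter(line):
            # Only yield non-empty sections, e.g. when the input starts
            # with a delimiter or has two delimiters in a row
            if section:
                yield section
                section = []
        else:
            section.append(line.rstrip('\n'))
    # Yield any remaining lines as a section too
    if section:
        yield section


# In-memory stand-ins for the two input files from the question
blank_separated = io.StringIO(
    "This is a text block start\n"
    "This is the end\n"
    "\n"
    "And this is another\n"
    "with more than one line\n"
    "and another line.\n"
)
hash_separated = io.StringIO(
    "# Some comments, maybe the title of the following section\n"
    "This is a text block start\n"
    "This is the end\n"
    "# Some other comments and also the title\n"
    "And this is another\n"
    "with more than one line\n"
    "and another line.\n"
)

by_blank = list(per_section(blank_separated))
by_hash = list(per_section(hash_separated, lambda line: line.startswith('#')))
```

With this shape, a new kind of delimiter needs only a different callable at the call site, and the body of per_section() stays unchanged.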
[Comments]:
- Why not just pass it in as a parameter instead of hard-coding it?
- By the way, @falsetru's per_section() has been added to github.com/alvations/lazyme =)
Tags: python file delimiter yield