Reading lines from a file with offset for readline: answers

【Question title】: Reading lines from a file with offset for readline
【Posted】: 2020-10-29 08:42:53
【Question description】:

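I want to read a file line by line, but I want the line pointer to advance by only one line for every two reads. Judging by the examples below, the file holds one value per line (100, 200, 300, 400).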

So, if I write

line_1 = f.readline()  # 100
line_2 = f.readline()  # 200

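then the third readline will give me 300. What I want is to get 100 with readline and 200 with an increment statement. I will then put both in a loop, and in the end I want to get the lines this way: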

iteration #1: 100 and 200
iteration #2: 200 and 300
iteration #3: 300 and 400

How can I do that?

【Question comments】:

Check docs.python.org/3.8/library/linecache.html - with an index you can call getline(index) and getline(index+1), then increase the index by one, and rinse and repeat until the end.
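The linecache suggestion in the comment above could be sketched roughly as follows. The function name pairs_by_index is made up for illustration, and the sketch assumes the file is small enough for linecache to cache it whole, since linecache reads the entire file into memory on first access.

```python
import linecache


def pairs_by_index(path):
    """Yield (line N, line N+1) pairs using linecache's 1-based line numbers.

    pairs_by_index is a hypothetical helper name, not part of linecache.
    """
    index = 1
    while True:
        first = linecache.getline(path, index)
        second = linecache.getline(path, index + 1)
        if not second:  # getline returns '' once the line number is past the end
            break
        yield first.rstrip('\n'), second.rstrip('\n')
        index += 1  # move the window forward by one line only
```

Calling list(pairs_by_index(path)) on the four-line file from the question should give the three overlapping pairs. Note that this trades the streaming behaviour of readline for random access by line number.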

Tags: python file

【Solution 1】:

You can create a generator (it also strips the EOL character; if you want something different, you can drop the rstrip):

def readpairsoflines(f):
    l1 = f.readline().rstrip('\n')
    for l2 in f:
        l2 = l2.rstrip('\n')
        yield l1, l2
        l1 = l2

And use it like this:

with open(filename) as f:
    for l1, l2 in readpairsoflines(f):
        # Do something with your pair of lines, for example print them
        print(f'{l1} and {l2}')

Result:

100 and 200
200 and 300
300 and 400

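With this approach, only two lines are read and kept in memory at any time, so it also works for large files where memory could be a problem.

【Comments】:

【Solution 2】: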
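As a possible variation on the generator above (not part of the original answer), the same overlapping pairs can be built from the standard library's itertools.tee and zip. The name pairwise_lines is an illustrative choice, and io.StringIO stands in for a real file.

```python
import io
from itertools import tee


def pairwise_lines(f):
    """Return an iterator of consecutive (previous, current) line pairs."""
    # tee gives two independent iterators over the same stripped lines
    first, second = tee(line.rstrip('\n') for line in f)
    next(second, None)  # advance the second iterator by one line
    return zip(first, second)


# In-memory stand-in for the file from the question
sample = io.StringIO('100\n200\n300\n400\n')
for l1, l2 in pairwise_lines(sample):
    print(f'{l1} and {l2}')
```

Because the two iterators stay one line apart, tee only has to buffer a single line, so this should remain suitable for large files. On Python 3.10 and later, itertools.pairwise offers the same pairing directly.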

I have always been a fan of simple and readable solutions (even if they are sometimes less "pythonic").

with open("example.txt") as f:
    old = f.readline().rstrip()
    
    for line in f:
        line = line.rstrip()
        print("{} and  {}".format(old, line))
        old = line

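Perform the first read before iterating over the remaining lines.
Then print the desired output and update the old string.
The rstrip() calls are needed in order to remove the unwanted trailing '\n'.
I assumed that nothing needs to be printed for files with fewer than two lines; the code can easily be modified to handle that special case as needed.

Output: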
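One way the special case of fewer than two lines might be handled is sketched below. The function name print_pairs, the messages, and the returned pair count are all assumptions added for illustration; the original answer simply prints nothing in that case.

```python
import io


def print_pairs(f):
    """Print consecutive line pairs and return how many pairs were printed."""
    old = f.readline()
    if not old:  # readline returns '' only at end of file
        print("file is empty")
        return 0
    old = old.rstrip('\n')
    count = 0
    for line in f:
        line = line.rstrip('\n')
        print("{} and {}".format(old, line))
        old = line
        count += 1
    if count == 0:  # exactly one line, so no pair could be formed
        print("only one line: {}".format(old))
    return count


# In-memory stand-in for a one-line file
print_pairs(io.StringIO('100\n'))
```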

100 and  200
200 and  300
300 and  400

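【Comments】:

May I suggest using rstrip or line = line[:-1] instead of replace? Both alternatives are more efficient :) Also, you were calling replace twice per line, which makes no sense.
@RiccardoBucco Thanks for the suggestion; I updated the answer accordingly. I used the rstrip suggestion, applied only once per line. line = line[:-1], on the other hand, is wrong: it removes the last character unconditionally, so it would cut off the last character of the final line when that line has no trailing newline.

【Solution 3】: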
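The difference discussed in these two comments can be checked with a small experiment. The sample strings are assumptions standing in for a middle line and a final line that lacks a trailing newline.

```python
middle_line = '300\n'  # a line followed by a newline
last_line = '400'      # final line of a file with no trailing newline

# Slicing removes the last character whatever it is
sliced_middle = middle_line[:-1]
sliced_last = last_line[:-1]

# rstrip('\n') removes only trailing newline characters
stripped_middle = middle_line.rstrip('\n')
stripped_last = last_line.rstrip('\n')

print(repr(sliced_middle), repr(sliced_last))      # '300' '40'
print(repr(stripped_middle), repr(stripped_last))  # '300' '400'
```

Keep in mind that rstrip() without an argument strips all trailing whitespace, not just the newline, which may or may not matter for a given file.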

For now, I would suggest splitting the document on newlines, like this

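with open('params.txt') as file:
    data = file.read()
# splitlines() avoids the empty trailing item that split('\n') produces
# when the file ends with a newline
data = data.splitlines()
for index, item in enumerate(data):
    try:
        # Print the current item together with the next one
        print(item + ' ' + data[index + 1])
    except IndexError:
        # The last item has no successor, so print it on its own
        print(item)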
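and using some list logic to print what you need. What this code does is build a list of the values (not efficient for verrry large files) and walk through it by index, so when it prints an item it also prints the next item in the list. It does this for every item; the IndexError is caught because the last item has no next item, but you could also handle that case with an if/else statement.

【Comments】:

This does not read the file line by line. It reads the whole file and keeps it in memory.
That's what I said: not efficient for very large files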
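The if/else alternative mentioned above might look roughly like this. The function name pair_lines is hypothetical, and the sketch takes the file contents as a string and uses splitlines(), which is an assumption made so that a trailing newline does not produce an empty last item.

```python
def pair_lines(text):
    """Return each line joined with its successor; the last line stands alone."""
    lines = text.splitlines()
    result = []
    for index, item in enumerate(lines):
        if index + 1 < len(lines):
            # A next item exists, so pair the current one with it
            result.append(item + ' ' + lines[index + 1])
        else:
            # Last item: nothing follows it
            result.append(item)
    return result


for row in pair_lines('100\n200\n300\n400\n'):
    print(row)
```

As with the original version, the whole text is held in memory, so this is a sketch for small files only.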