【Question title】: Read a line from a file without getting "\n" appended at the end [duplicate]
【Posted】: 2012-07-02 01:50:46
【Question】:

My file is "xml.txt", and it contains the following:

books.xml 
news.xml
mix.xml

If I use the readline() function, it appends "\n" to every file name, which causes an error because I want to open the files listed in xml.txt. This is what I wrote:

fo = open("xml.txt","r")
for i in range(count.__len__()): # here count is one of many arrays that I'm using
    file = fo.readline()
    find_root(file) # here find_root is my own function, not shown here

I get this error when running the code:

IOError: [Errno 2] No such file or directory: 'books.xml\n'

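【Comments】:

  • Use len(count) instead of count.__len__().
  • Although the question asks specifically about the '\n' character, there is a more general problem of reading lines without their line endings, whatever the file may be. Almost none of the answers address this. (Daniel F.'s seems to.)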
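
To illustrate the more general point in the last comment, here is a minimal sketch that strips exactly one line terminator, whichever it is. The helper name strip_eol is hypothetical, and the sketch assumes only "\r\n", "\n" and "\r" need handling.

```python
def strip_eol(line):
    """Remove one trailing line terminator ("\\r\\n", "\\n" or "\\r"), if present.

    strip_eol is a hypothetical helper for illustration only; any other
    trailing whitespace is left intact.
    """
    # Check "\r\n" first so that it is removed as a single terminator.
    for eol in ("\r\n", "\n", "\r"):
        if line.endswith(eol):
            return line[:-len(eol)]
    return line


print(repr(strip_eol("books.xml\n")))     # 'books.xml'
print(repr(strip_eol("books.xml \r\n")))  # 'books.xml '
print(repr(strip_eol("mix.xml")))         # 'mix.xml'
```

Unlike rstrip("\n"), this removes at most one terminator and never touches trailing spaces.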

Tags: python linux file-io ubuntu-10.04


【Solution 1】:

A use case for @Lars Wirzenius's answer:

with open("list.txt", "r") as myfile:
    for line in myfile:
        line = line.rstrip('\n')    # the trick
        try:
            with open(line) as listed_file:
                print("ok")
        except IOError as e:
            print("file does not exist")

【Comments】:

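    【Solution 2】:
    # mode: 'r' (read), 'w' (write), 'a' (append)
    # "ur_filename" and "out_filename" are placeholders; the output file is an assumption,
    # since the original snippet never defined fn.
    with open("ur_filename", "r") as f, open("out_filename", "w") as fn:
        for t in f:
            if t:
                fn.write(t.rstrip("\n"))
    

    The "if" condition checks whether the line contains a string; if it does, the next line strips the trailing "\n" and writes the result to the file. The code is tested. ;)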

    【Comments】:

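      【Solution 3】:

      I timed the options out of curiosity. Below are the results for different large files.

      tl;dr: reading the whole file and then splitting it appears to be the fastest way to process a large file.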

      with open(FILENAME, "r") as file:
          lines = file.read().split("\n")
      

      However, if you need to loop over the lines anyway, you probably want:

      with open(FILENAME, "r") as file:
          for line in file:
              line = line.rstrip("\n")
      

      Python 3.4.2

      import timeit
      
      
      FILENAME = "mylargefile.csv"
      DELIMITER = "\n"
      
      
      def splitlines_read():
          """Read the file then split the lines from the splitlines builtin method.
      
          Returns:
              lines (list): List of file lines.
          """
          with open(FILENAME, "r") as file:
              lines = file.read().splitlines()
          return lines
      # end splitlines_read
      
      def split_read():
          """Read the file then split the lines.
      
          This method will return empty strings for blank lines (Same as the other methods).
          This method may also have an extra additional element as an empty string (compared to
          splitlines_read).
      
          Returns:
              lines (list): List of file lines.
          """
          with open(FILENAME, "r") as file:
              lines = file.read().split(DELIMITER)
          return lines
      # end split_read
      
      def strip_read():
          """Loop through the file and build a new list of lines, removing any "\\n" with rstrip.
      
          Returns:
              lines (list): List of file lines.
          """
          with open(FILENAME, "r") as file:
              lines = [line.rstrip(DELIMITER) for line in file]
          return lines
      # end strip_read
      
      def strip_readlines():
          """Loop through the lines from readlines() and build a new list of lines, removing any
          "\\n" with rstrip. Probably slower than strip_read, but might as well test everything.
      
          Returns:
              lines (list): List of file lines.
          """
          with open(FILENAME, "r") as file:
              lines = [line.rstrip(DELIMITER) for line in file.readlines()]
          return lines
      # end strip_readlines
      
      def compare_times():
          run = 100
          splitlines_t = timeit.timeit(splitlines_read, number=run)
          print("Splitlines Read:", splitlines_t)
      
          split_t = timeit.timeit(split_read, number=run)
          print("Split Read:", split_t)
      
          strip_t = timeit.timeit(strip_read, number=run)
          print("Strip Read:", strip_t)
      
          striplines_t = timeit.timeit(strip_readlines, number=run)
          print("Strip Readlines:", striplines_t)
      # end compare_times
      
      def compare_values():
          """Compare the values of the file.
      
          Note: the split_read comparison fails because its list has an extra empty string at the
          end. That is the only reason why it fails.
          """
          splr = splitlines_read()
          sprl = split_read()
          strr = strip_read()
          strl = strip_readlines()
      
          print("splitlines_read")
          print(repr(splr[:10]))
      
          print("split_read", splr == sprl)
          print(repr(sprl[:10]))
      
          print("strip_read", splr == strr)
          print(repr(strr[:10]))
      
          print("strip_readlines", splr == strl)
          print(repr(strl[:10]))
      # end compare_values
      
      if __name__ == "__main__":
          compare_values()
          compare_times()
      

      Results:

      run = 1000
      Splitlines Read: 201.02846901328783
      Split Read: 137.51448011841822
      Strip Read: 156.18040391519133
      Strip Readline: 172.12281272950372
      
      run = 100
      Splitlines Read: 19.956802833188124
      Split Read: 13.657361738959867
      Strip Read: 15.731161020969516
      Strip Readlines: 17.434831199281092
      
      run = 100
      Splitlines Read: 20.01516321280158
      Split Read: 13.786344555543899
      Strip Read: 16.02410587620824
      Strip Readlines: 17.09326775703279
      

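      Reading the whole file and then splitting it appears to be the fastest way to process a large file.

      Note: read followed by split("\n") leaves an extra empty string at the end of the list when the file ends with a newline.

      Note: read followed by splitlines() recognizes more line endings than just "\n", such as "\r\n".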
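
The two notes above can be seen on a small in-memory string. This is only an illustrative sketch; the sample text is made up and mixes "\n" and "\r\n" endings on purpose.

```python
# Sample content standing in for file.read(); the middle line ends with "\r\n".
text = "books.xml\nnews.xml\r\nmix.xml\n"

by_split = text.split("\n")
by_splitlines = text.splitlines()

print(by_split)
# ['books.xml', 'news.xml\r', 'mix.xml', '']
print(by_splitlines)
# ['books.xml', 'news.xml', 'mix.xml']
```

split("\n") keeps the stray "\r" and adds a trailing empty string, while splitlines() does neither.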

      【Comments】:

        【Solution 4】:

        【Comments】:

          【Solution 5】:

          To remove the newline from the end, you can also use something like this:

          for line in file:
              # assumes every line, including the last one, ends with "\n"
              print(line[:-1])
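
A caveat with slicing is that a final line without a trailing newline loses its last real character. The sketch below compares the plain slice with a guarded variant; io.StringIO is used here only as a stand-in for a real file, and the sample content is made up.

```python
import io

# io.StringIO stands in for a real file; note the last line has no "\n".
fake_file = io.StringIO("books.xml\nnews.xml\nmix.xml")

sliced = []
guarded = []
for line in fake_file:
    sliced.append(line[:-1])  # unconditional slice
    # Slice only when there really is a newline to remove.
    guarded.append(line[:-1] if line.endswith("\n") else line)

print(sliced)   # ['books.xml', 'news.xml', 'mix.xm']
print(guarded)  # ['books.xml', 'news.xml', 'mix.xml']
```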
          

          【Comments】:

            【Solution 6】:

            It is better to use a context manager for the file, and len() rather than calling .__len__():

            with open("xml.txt","r") as fo:
                for i in range(len(count)): # here count is one of many arrays that I'm using
                    file = next(fo).rstrip("\n")
                    find_root(file) # here find_root is my own function, not shown here
            

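            【Comments】:

            • You forgot to mention that good Python style also includes not shadowing built-ins with your own names, like file...
            • @martineau, yes, I let that one slide since it is deprecated.
            【Solution 7】:

            You can use the string object's .rstrip() method to get a copy with the trailing whitespace (including the newline) removed.

            For example: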

            find_root(file.rstrip())
            

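            【Comments】:

            • Can you show me the syntax? I mean, how and where should I add this?
            • This solution removes all trailing whitespace, not just the newline. If the line read is 'foo \n', .rstrip() returns 'foo', whereas the question requires 'foo '.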
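
The difference described in the last comment is easy to check on the same sample string. This is a small sketch, not part of the original answer.

```python
line = "foo \n"

no_args = line.rstrip()        # strips all trailing whitespace
only_nl = line.rstrip("\n")    # strips only trailing "\n" characters

print(repr(no_args))  # 'foo'
print(repr(only_nl))  # 'foo '
```

Note that rstrip("\n") removes every trailing "\n", so "foo\n\n" also becomes "foo".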
            【Solution 8】:

            To remove only the trailing newline:

            line = line.rstrip('\n')
            

            readline keeps the newline so that you can tell an empty line (which still has a newline) apart from the end of the file (an empty string).
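
As a sketch of why that distinction matters, the loop below reads with readline(), stops at the empty string that marks end of file, and skips blank lines. io.StringIO is only a stand-in for a real file, and the sample content is made up.

```python
import io

# io.StringIO stands in for a real file: a name, a blank line, another name.
fo = io.StringIO("books.xml\n\nmix.xml\n")

names = []
while True:
    line = fo.readline()
    if line == "":          # empty string: end of file
        break
    name = line.rstrip("\n")
    if name == "":          # the line was just "\n": a blank line, keep reading
        continue
    names.append(name)

print(names)  # ['books.xml', 'mix.xml']
```

If readline() stripped the newline itself, the blank line and the end of the file would both come back as "" and the loop could not tell them apart.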

            【Comments】:
