【Question title】: Is there a nice way to split a (potentially) long string without splitting words in Python?
【Posted】: 2012-04-15 15:01:30
【Question description】:

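I want to make sure that I only print lines that are at most 80 characters long, but I have a string s that can be either shorter or longer than that. So I want to split it into lines without splitting any words.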

Example of a long string:

s = "This is a long string that is holding more than 80 characters and thus should be split into several lines. That is if everything is working properly and nicely and all that. No mishaps no typos. No bugs. But I want the code too look good too. That's the problem!"

I can think of ways to do this, such as:

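words = s.split(" ")
line = ""
for w in words:
    # line already ends with a space, so len(line) + len(w) is the new visible length
    if len(line) + len(w) <= 80:
        line += "%s " % w
    else:
        print(line.rstrip())  # drop the trailing separator space
        line = "%s " % w

print(line.rstrip())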

Similarly, I could use s.find(" ") iteratively in a while loop:

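sub_str_left = 0   # start index of the current line
pos = 0            # last space known to fit on the current line
next_pos = s.find(" ", pos)
while next_pos > -1:
    if next_pos - sub_str_left > 80:
        print(s[sub_str_left:pos])  # slice end is an index, not a length
        sub_str_left = pos + 1

    pos = next_pos
    next_pos = s.find(" ", pos + 1)  # pos + 1, otherwise the same space is found forever

# The text after the last space may still push the final line past 80.
if len(s) - sub_str_left > 80:
    print(s[sub_str_left:pos])
    sub_str_left = pos + 1
print(s[sub_str_left:])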

Neither of these is very elegant, so my question is: is there a cooler, more Pythonic way to do this? (Maybe using a regex or so.)

【Question comments】:

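  • Your question is similar to one I asked a few days ago. stackoverflow.com/questions/9894983/…
  • I missed that one when searching old posts, I guess because I was looking for splitting while it talks about wrapping, but they are similar.
  • Well, technically this is called wrapping.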

Tags: python string split word word-wrap


【Solution 1】:

There is a module for that: textwrap

For example, you can use

import textwrap

print('\n'.join(textwrap.wrap(s, 80)))

or

print(textwrap.fill(s, 80))
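
As a slightly fuller sketch of the same idea, the snippet below wraps the sample string from the question and checks that no line exceeds the limit. `wrap_to_width` is a hypothetical helper name, not part of textwrap, and passing `break_long_words=False` is an assumption about the desired behavior (keep an over-long word whole rather than cut it).

```python
import textwrap


def wrap_to_width(text, width=80):
    """Return a list of lines, each split at whitespace (hypothetical helper)."""
    # break_long_words=False leaves a word longer than `width` on its own line;
    # the textwrap default (True) would cut such a word to make it fit.
    return textwrap.wrap(text, width=width, break_long_words=False)


s = ("This is a long string that is holding more than 80 characters and thus "
     "should be split into several lines. That is if everything is working "
     "properly and nicely and all that. No mishaps no typos. No bugs. But I "
     "want the code too look good too. That's the problem!")

lines = wrap_to_width(s)
print("\n".join(lines))

# No line is longer than the limit for this input.
longest = max(len(line) for line in lines)
```

textwrap.fill(s, 80) is equivalent to joining the result of textwrap.wrap(s, 80) with newlines, so either form from the answer can be used depending on whether a list or a single string is more convenient.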

【Comments】:

    【Solution 2】:
    import re
    
    s = "This is a long string that is holding more than 80 characters and thus should be split into several lines. That is if everything is working properly and nicely and all that. No misshaps no typos. No bugs. But I want the code too look good too. That's the problem!"
    
    print('\n'.join(line.strip() for line in re.findall(r'.{1,80}(?:\s+|$)', s)))
    

    Output:

    This is a long string that is holding more than 80 characters and thus should be
    split into several lines. That is if everything is working properly and nicely
    and all that. No misshaps no typos. No bugs. But I want the code too look good
    too. That's the problem!
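
To make the pattern easier to read, here is a sketch of the same approach as a function with the width as a parameter. `regex_wrap` is a hypothetical name, and the sketch assumes the text contains no newlines and no word longer than the width.

```python
import re


def regex_wrap(text, width=80):
    """Split text into lines of at most `width` characters (hypothetical helper)."""
    # .{1,width} : greedily take up to `width` characters (never a newline)
    # (?:\s+|$)  : accept the run only if whitespace or the end of the text follows
    pattern = r'.{1,%d}(?:\s+|$)' % width
    # Each match carries the whitespace it broke at; strip() removes it.
    return [line.strip() for line in re.findall(pattern, text)]


s = ("This is a long string that is holding more than 80 characters and thus "
     "should be split into several lines. That is if everything is working "
     "properly and nicely and all that. No misshaps no typos. No bugs. But I "
     "want the code too look good too. That's the problem!")

lines = regex_wrap(s)
print("\n".join(lines))
```

Because the trailing whitespace is matched outside the `.{1,width}` part and then stripped, the visible part of each line stays within the width.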
    

    【Comments】:

      【Solution 3】:
      import re
      re.findall(r'.{1,80}(?:\W|$)', s)
      

      【Comments】:

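      • Compared with a basic word-wrap algorithm, this is a bad joke.
      • Not in terms of speed. I just benchmarked it against textwrap, and it is about 50 times faster. (Note: I know speed isn't everything; only fun is everything.)
      • Speed is (almost; you could still try to change the requirements) irrelevant if features are missing ;)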
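
The "missing features" point in the comments can be illustrated with a small sketch. The input below is a hypothetical one chosen to contain a word longer than the limit; it makes no claim about timing, only about what each approach does with such a word.

```python
import re
import textwrap

# Hypothetical input: one 100-character "word" followed by a short one.
text = "x" * 100 + " end"

# The regex needs a non-word character (or the end of the text) right after
# each chunk, so no match can start until only 80 x's remain before the space.
# findall silently skips everything before that point.
regex_lines = re.findall(r'.{1,80}(?:\W|$)', text)

# textwrap cuts the over-long word instead (break_long_words defaults to True).
wrapped_lines = textwrap.wrap(text, 80)

kept_by_regex = sum(line.count("x") for line in regex_lines)
kept_by_textwrap = sum(line.count("x") for line in wrapped_lines)
print(kept_by_regex, kept_by_textwrap)
```

Note also that `\W` matches any non-word character, so punctuation such as an apostrophe can count as a break point, and the matched break character stays at the end of each returned chunk.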
      【Solution 4】:

      You can try this Python script

      s = "This is a long string that is holding more than 80 characters and thus should be split into several lines. That is if everything is working properly and nicely and all that. No misshaps no typos. No bugs. But I want the code too look good too. That's the problem!"
      limit = 80
      b = 0  # start index of the current line
      while len(s) - b > limit:
          # Walk back from the limit until whitespace is found.
          e = b + limit
          while e > b and s[e] not in (" ", "\t"):
              e -= 1
          if e == b:
              # No whitespace in range: hard-break the over-long word.
              e = b + limit
          print(s[b:e])
          # Skip the whitespace character we broke at, if any.
          b = e + 1 if s[e] in (" ", "\t") else e
      print(s[b:])
      

      【Comments】:
