【Question title】: Python: split sentence from file
【Posted】: 2018-10-07 02:48:48
【Question】:

I have a file, data.txt, that looks like this:

<<a
<<t This is a title 01
/t>>
<<c
This is a sentence. This is a sentence. This is a sentence. This is a sentence.
This is a sentence. This is a sentence. This is a sentence. This is a sentence.
/c>>
/a>>
<<a
<<t This is a title 02
/t>>
<<c
This is a sentence. This is a sentence. This is a sentence. This is a sentence.
This is a sentence. This is a sentence. This is a sentence. This is a sentence.
/c>>
/a>>

I want to read the file and split it so that each title and each sentence becomes its own list, like this:

[[This is a title 01],[This is a sentence.],[This is a sentence.]...[This is a title 02],[This is a sentence.]...]

Thanks in advance for your help.

【Question comments】:

  • Hello! What have you tried on your own so far?

Tags: python python-3.x


【Solution 1】:

You can try the following approach:

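result = []
with open('data.txt', 'r') as f:
  for line in f:
    line = line.strip()
    if line.startswith('<<t'):
      # Title line: drop the '<<t' marker by slicing. lstrip('<<t') would
      # strip a set of characters, not the literal prefix.
      result.append(line[len('<<t'):].strip())
    elif line and not line.startswith('<<') and not line.endswith('>>'):
      # Content line: split at periods, skip empty pieces, restore the period.
      for part in line.split('.'):
        if part.strip():
          result.append(part.strip() + '.')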

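How does this work?
Open the file and iterate over it line by line. To extract a title, remove the <<t marker and the surrounding whitespace.
To extract the sentences, split the line string at each period (.). Then append everything to the result list.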
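The two steps above can be sketched as small helpers. This is only an illustration: the names `extract_title` and `split_sentences` are hypothetical and not part of the original answer. It assumes that titles always start with the `<<t` marker and that every sentence ends with a period. Note that a bare `str.split('.')` drops the periods, keeps leading spaces, and leaves a trailing leftover piece, so the sketch filters and trims the parts.

```python
def extract_title(line):
    """Return the title text from a '<<t ...' line (hypothetical helper)."""
    line = line.strip()
    # Slice off the literal marker; str.lstrip('<<t') would instead strip
    # any leading run of the characters '<' and 't'.
    return line[len('<<t'):].strip()


def split_sentences(line):
    """Split a line at periods and return trimmed sentences (hypothetical helper)."""
    parts = (part.strip() for part in line.split('.'))
    # Skip the empty pieces that split() leaves behind and restore the period.
    return [part + '.' for part in parts if part]


line = "This is a sentence. This is a sentence.\n"
print(line.split('.'))        # ['This is a sentence', ' This is a sentence', '\n']
print(split_sentences(line))  # ['This is a sentence.', 'This is a sentence.']
print(extract_title("<<t This is a title 01\n"))  # This is a title 01
```

Splitting on a period is a simplification; text containing abbreviations or decimal numbers would need a more careful sentence splitter.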
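Edit:
Note: you will end up with a list of strings. Since you are new to Python, I will leave converting the list of strings into a list of lists as an exercise for you. It should be quite simple.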
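For readers who want to check their answer to the exercise, here is one possible sketch that produces the list of lists the question asks for. It is not the answerer's own code. The `parse` function and the `SAMPLE` string are hypothetical, and it assumes that every markup line either starts with `<<` or ends with `>>`, as in the sample file. An in-memory `io.StringIO` stands in for `open('data.txt')` so the example runs on its own.

```python
import io

# Shortened copy of the sample file from the question (assumption: same markup).
SAMPLE = """<<a
<<t This is a title 01
/t>>
<<c
This is a sentence. This is a sentence.
/c>>
/a>>
"""


def parse(lines):
    """Return one single-element list per title or sentence found in `lines`."""
    result = []
    for raw in lines:
        line = raw.strip()
        if line.startswith('<<t'):
            # Title line: drop the marker, wrap the text in its own list.
            result.append([line[len('<<t'):].strip()])
        elif line and not line.startswith('<<') and not line.endswith('>>'):
            # Content line: one single-element list per non-empty sentence.
            for part in line.split('.'):
                if part.strip():
                    result.append([part.strip() + '.'])
    return result


parsed = parse(io.StringIO(SAMPLE))
print(parsed)
# [['This is a title 01'], ['This is a sentence.'], ['This is a sentence.']]
```

If you already have the flat list of strings from the answer, `[[item] for item in result]` performs the same wrapping in one step.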

【Comments】:
