【Question title】: python: read a folder with file name and file concept
【Posted】: 2017-04-30 10:14:03
【Question】:

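I have a DataFrame (shown below). From each value in the 'name' column, I want to remove the leading directory, which for the first row is

'/Users/xccxken/Desktop/NNRelease/paperVersion/'

and the trailing extension

'.txt'

so that only the middle part remains. For the first row that is

'Topic+Topic_of_Situation.shortageglut'

I want this applied to every row.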

,n_1,n_2,name
0,water,shortage,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Situation.shortageglut.txt
1,supply,shortage,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Situation.shortageglut.txt
2,skill,shortage,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Situation.shortageglut.txt
214,income,policy,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.rules.legal.txt
215,immigration,policy,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.rules.legal.txt
216,health,policy,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.rules.legal.txt
485,license,agreement,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.deal.txt
486,lease,agreement,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.deal.txt
487,immunity,agreement,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.deal.txt
488,franchise,agreement,/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Plan&Deal&Rules.deal.txt

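【Comments on the question】:

  • Your example shows a CSV file, not a DataFrame. Do you intend to use pandas or a CSV reader?
  • It is a DataFrame; I just printed it as CSV so it is easier to read. Thanks.

Tags: python xml list contain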


【Solution 1】:

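You can remove the known prefix and suffix. Avoid .str.lstrip() and .str.rstrip() for this: they strip a set of characters rather than a literal substring, so .str.rstrip('.txt') would also remove the final "t" of "shortageglut". Use .str.removeprefix() and .str.removesuffix() instead (available in pandas 1.4 and later):

prefix = '/Users/xccxken/Desktop/NNRelease/paperVersion/'
suffix = '.txt'
# Remove the literal prefix and suffix, not a set of characters
df['name'] = df['name'].str.removeprefix(prefix).str.removesuffix(suffix)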
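When choosing a method here, it helps to remember that Python's `strip` family treats its argument as a set of characters, not as a literal substring, and the pandas `.str` versions behave the same way. A minimal sketch with plain strings, assuming Python 3.9+ for `str.removeprefix` and `str.removesuffix`; the variable names `stripped` and `exact` are illustrative only:

```python
# One path taken from the question's sample data.
path = '/Users/xccxken/Desktop/NNRelease/paperVersion/Topic+Topic_of_Situation.shortageglut.txt'
prefix = '/Users/xccxken/Desktop/NNRelease/paperVersion/'
suffix = '.txt'

# rstrip/lstrip remove any trailing/leading characters that appear in the
# argument, so the final "t" of "shortageglut" is removed along with ".txt".
stripped = path.rstrip(suffix).lstrip(prefix)

# removeprefix/removesuffix (Python 3.9+) remove only the literal substring.
exact = path.removeprefix(prefix).removesuffix(suffix)

print(stripped)  # Topic+Topic_of_Situation.shortageglu
print(exact)     # Topic+Topic_of_Situation.shortageglut
```

The character-set behaviour happens to leave the other sample rows intact, which is why the problem is easy to miss.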

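Or use a regular expression:

# Capture everything after the last "/" and before the trailing ".txt"
description = r'([^/]+)\.txt$'
# expand=False returns a Series, which can be assigned back to the column
df['name'] = df['name'].str.extract(description, expand=False)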
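To check the regular-expression approach end to end, here is a small sketch on three rows taken from the question's data. It assumes pandas is installed. The `base` variable is introduced only for brevity; the `$` anchor keeps the match at the end of the path, and `expand=False` makes `.str.extract` return a Series.

```python
import pandas as pd

# Shared directory from the question, factored out for readability.
base = '/Users/xccxken/Desktop/NNRelease/paperVersion/'
df = pd.DataFrame({
    'n_1': ['water', 'income', 'license'],
    'n_2': ['shortage', 'policy', 'agreement'],
    'name': [
        base + 'Topic+Topic_of_Situation.shortageglut.txt',
        base + 'Topic+Topic_of_Plan&Deal&Rules.rules.legal.txt',
        base + 'Topic+Topic_of_Plan&Deal&Rules.deal.txt',
    ],
})

# Capture everything after the last "/" and before a trailing ".txt".
description = r'([^/]+)\.txt$'
df['name'] = df['name'].str.extract(description, expand=False)

print(df['name'].tolist())
```

Unlike the prefix-based approach, this does not depend on every row sharing the same directory.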

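【Comments】:

  • Thanks. Could you tell me how to write "description2" if I want to extract 'Topic+Topic_of_Situation' from 'Topic+Topic_of_Situation.othersituation'? And how would I write "description3" to get 'Topic' from 'Topic+Topic_of_Situation'? Thanks!
  • Your frame contains several different patterns. You may want to read more about Python regular expressions (docs.python.org/3/library/re.html) and experiment with them online to find expressions that work: regex101.com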
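As a starting point for the follow-up question, the two patterns could be sketched as below with Python's `re` module. The names `description2` and `description3` come from the comment; the patterns themselves are assumptions based only on the two examples given there (the text before the first `.`, and the text before the first `+`) and may need adjusting for other rows.

```python
import re

# Assumed pattern: everything before the first "." in the cleaned name.
description2 = r'^([^.]+)'
# Assumed pattern: everything before the first "+".
description3 = r'^([^+]+)'

name = 'Topic+Topic_of_Situation.othersituation'

topic_group = re.match(description2, name).group(1)
topic = re.match(description3, topic_group).group(1)

print(topic_group)  # Topic+Topic_of_Situation
print(topic)        # Topic
```

The same patterns can be passed to `df['name'].str.extract(..., expand=False)` to apply them to a whole column.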