【Posted】: 2016-02-15 17:38:46
【Question】:
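In Python 2.7, I want to find and count the files whose names contain any string from a given list.
File list:
- Passport_Mike.pdf
- David-Passport.pdf
- Iain ID Card.pdf
- CopyPassport Michael.pdf
- Driving License John.pdf
I want to count all files whose names contain "Passport" or "ID".
So far I have found a way to split each file name into separate words on delimiters (_-/'). This does not always find my files, because a name cannot always be split: in "CopyPassport Michael", for example, there is no delimiter separating "Passport" from "Copy".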
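To illustrate the failure, here is a minimal sketch. It uses re.split as a stand-in for my own split function (an assumption made for brevity), and the variable names are only illustrative:

```python
import re

name = "CopyPassport Michael.pdf"

# Split on the delimiters '_', ' ', '-' and '.'.
tokens = re.split(r"[_ \-.]", name)

# Whole-token comparison misses the file: no token equals "Passport",
# because "Copy" and "Passport" are not separated by any delimiter.
token_match = "Passport" in tokens

# A plain substring test on the full name does find it.
substring_match = "Passport" in name
```

Here tokens contains "CopyPassport" as a single word, so token_match is False while substring_match is True.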
My code is based on this answer to another question. For this code I use collections.Counter().
Here is my code:
from collections import Counter

listOfFiles = ["Passport_Mike.pdf", "David-Passport.pdf", "Iain ID Card.pdf", "CopyPassport Michael.pdf", "Driving License John.pdf"]
searchTermsList = ["Passport", "ID"]

def fileSplit(string, delimiters):
    # Split `string` on each delimiter in turn and return the flat list of parts.
    delimiters = tuple(delimiters)
    stack = [string,]
    for delimiter in delimiters:
        for i, substring in enumerate(stack):
            substack = substring.split(delimiter)
            stack.pop(i)
            for j, _substring in enumerate(substack):
                stack.insert(i+j, _substring)
    return stack
#This is a complicated split function but this method makes the files split into parts in my next function. Other split methods didn't work for me.

def searchTermsCount(listOfFiles, searchTermsList):
    counts = Counter()
    for myFile in listOfFiles:
        myFileSplit = fileSplit(myFile,('_',' ','-','.'))
        counts.update(word.upper() for word in myFileSplit)
    myCount = 0
    for word in searchTermsList:
        # Look up the upper-cased term, because the counted words were upper-cased above.
        myCount += counts[word.upper()]
    print "Count files:", myCount
What is a Python 2.7 way to count the files whose names contain any string from a list, without relying on delimiters?
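The behaviour I am after can be sketched as a substring test against each whole file name. This is only a sketch under stated assumptions: the helper name count_matching_files is hypothetical, matching is case-sensitive (a case-insensitive test would also find "id" inside "David"), and each file is counted once even if its name contains several terms.

```python
def count_matching_files(file_names, search_terms):
    """Count file names that contain at least one search term as a substring.

    Hypothetical helper, not taken from any library. Matching is
    case-sensitive and each file is counted at most once.
    """
    return sum(
        1
        for name in file_names
        if any(term in name for term in search_terms)
    )


list_of_files = [
    "Passport_Mike.pdf",
    "David-Passport.pdf",
    "Iain ID Card.pdf",
    "CopyPassport Michael.pdf",
    "Driving License John.pdf",
]
search_terms = ["Passport", "ID"]

match_count = count_matching_files(list_of_files, search_terms)
```

With the file list above, match_count is 4: every file except "Driving License John.pdf" is counted, including "CopyPassport Michael.pdf". The code avoids the print statement so it runs unchanged on Python 2.7 and later versions.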
【Discussion】:
Tags: python string list python-2.7 search