Looping through a directory of files in Python: answers

【Question title】: Looping through a directory of files in Python
【Posted】: 2012-11-20 19:07:40
【Question】:

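I'm 99% done with my first Python script, but I'm getting tripped up by the for-each loop over the files in a directory. My script works on a single file; I'm just not sure how to apply it to multiple files, one at a time.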

I have a path, path = ~/documents, and an XML file containing the filenames I want to exclude:

 <root><synced name="Already Synced"><sfile name="Filename">base</sfile><sfile name="Filename">File1.blah</sfile><sfile name="Filename">File2.blah</sfile><sfile name="Filename">File3.blah</sfile></synced></root>

How would I run my script on all files that end in *.blah and are not listed in the XML file?

I have this, but it doesn't work:

path = '~/documents'
tree = ET.parse("sync_list.xml")
root = tree.getroot()
for elem in root.findall('sfile'):
    synced = elem.text
do_library = os.listdir(path)
if glob.fnmatch.fnmatch(file,"*.blah") and not synced:
  for entry in do_library:
    file = os.path.join(path, entry)
    result = plistlib.readPlist('file')

Any help you can provide is much appreciated.

【Comments on the question】:

What is it doing, and what did you expect it to do?

Tags: python loops elementtree os.walk

【Solution 1】:

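import fnmatch
import os
import xml.etree.ElementTree as ET

# Expand "~" explicitly; Python does not do shell-style tilde expansion.
path = os.path.expanduser('~/documents')
tree = ET.parse("sync_list.xml")
root = tree.getroot()
# Names listed under <synced> are the files to skip.
synced = [elt.text for elt in root.findall('synced/sfile')]
for filename in os.listdir(path):
    if fnmatch.fnmatch(filename, '*.blah') and filename not in synced:
        # Full path of a file that still needs processing.
        filename = os.path.join(path, filename)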

Edit: added os.path.expanduser as suggested by @mata.

【Comments】:
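
For anyone who wants to reuse the loop above, here is a minimal sketch that wraps it in a function. The name find_unsynced and its parameters are hypothetical and do not come from the original answer. It assumes the XML has the <root><synced><sfile> layout shown in the question, and it returns the full paths instead of rebinding filename inside the loop.

```python
import fnmatch
import os
import xml.etree.ElementTree as ET


def find_unsynced(directory, sync_list_path, pattern="*.blah"):
    """Return full paths of files in `directory` that match `pattern`
    and are not named in the sync-list XML (hypothetical helper)."""
    root = ET.parse(sync_list_path).getroot()
    # A set keeps the "already synced?" check cheap for long lists.
    synced = {elt.text for elt in root.findall("synced/sfile")}
    return [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if fnmatch.fnmatch(name, pattern) and name not in synced
    ]
```

Each returned path can then be handed to the per-file logic that already works for a single file, for example `for full_path in find_unsynced(path, "sync_list.xml"): ...`.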

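This is very close to working, but it doesn't seem to exclude the filenames listed in sync_list.xml. It does loop through the files correctly and runs all of my logic.
synced was empty because root.findall('sfile') should have been root.findall('synced/sfile'). Maybe try it now.
Perfect! Thank you so much.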
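
The difference between the two findall calls discussed in these comments can be checked in isolation. This is a small sketch using a shortened copy of the XML from the question, parsed from a string rather than from sync_list.xml. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

xml_text = (
    '<root><synced name="Already Synced">'
    '<sfile name="Filename">base</sfile>'
    '<sfile name="Filename">File1.blah</sfile>'
    '</synced></root>'
)
root = ET.fromstring(xml_text)

# A bare tag name only matches direct children of <root>, and <sfile> is not one.
direct_children = root.findall("sfile")
# Spelling out the parent descends through <synced> to reach the entries.
nested = root.findall("synced/sfile")
# ".//" searches at any depth, if the exact nesting is not known.
anywhere = root.findall(".//sfile")

print(len(direct_children))
print([elt.text for elt in nested])
```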

【Solution 2】:

path = '~/documents'

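This will not give you the documents folder in your home directory. ~ does not itself mean the home directory; what lets you use it the way you normally do is the shell, which performs tilde expansion. Without that expansion, ~ can be the name of any file or directory, and Python treats it as such. To get the path correctly, use: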

path = os.path.expanduser('~/documents')
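
A quick way to see the effect, as a sketch that assumes a home directory can be resolved for the current user (for example via the HOME environment variable on POSIX systems):

```python
import os

literal = "~/documents"
expanded = os.path.expanduser(literal)

# os.path.join does no expansion, so the tilde stays a literal path component.
joined_literal = os.path.join(literal, "File1.blah")

print(joined_literal)
print(expanded)
```

If no home directory can be determined, os.path.expanduser returns the path unchanged, so checking the result before using it can be worthwhile.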

【Comments】: