Recursive code to traverse directories in Python and filter files: answers

【Question title】: Recursive code to traverse through directories in python and filter files
【Posted】: 2018-08-30 16:16:48
【Question】:

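I want to recursively search the "Projects" directory for "Feedback Report" folders, and when such a folder has no further subdirectories, I want to process its files in a specific way.

Once the target directory is reached, I want to find the latest feedback report.xlsx in it (the directory will contain many earlier versions of the report).

The data is huge and the directory structure is inconsistent. I believe the following algorithm should get me close to the behavior I want, but I am still not sure. I tried several messy scripts that convert the tree into a JSON path hierarchy and then parse that, but the inconsistencies made the code very large and unreadable.

The paths of the files matter.

The algorithm I want to implement is: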

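import fnmatch
import os
from datetime import datetime

dictionary_of_files_paths = {}
PATTERN = "*eedback*port*"  # shell-style wildcard, so fnmatch is used rather than re
# "mtime > 2016" is read here as "modified on or after 2017-01-01" (an assumption)
CUTOFF = datetime(2017, 1, 1).timestamp()

def recursive_traverse(path):
    # plain files end the recursion
    if not os.path.isdir(path):
        return
    entries = [os.path.join(path, name) for name in os.listdir(path)]
    subdirs = [entry for entry in entries if os.path.isdir(entry)]

    # not sure if this is the right base case
    if fnmatch.fnmatch(os.path.basename(path), PATTERN) and not subdirs:
        process(path, [entry for entry in entries if os.path.isfile(entry)])
        return

    for subdir in subdirs:
        recursive_traverse(subdir)

def process(path, files):
    # keep only .xlsx files whose name matches the pattern and that are recent enough
    files = [f for f in files
             if f.lower().endswith(".xlsx")
             and fnmatch.fnmatch(os.path.basename(f), PATTERN)
             and os.path.getmtime(f) >= CUTOFF]
    # newest first
    files.sort(key=os.path.getmtime, reverse=True)
    if files:
        dictionary_of_files_paths[path] = files[0]

recursive_traverse("T:\\Something\\Something\\Projects")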

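I need guidance before I actually implement this, and I would like to verify that the approach is correct.

I also found another snippet on Stack Overflow for walking a path hierarchy: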

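import errno
import os

def recursive_traverse(path):
    try:
        for contents in os.listdir(path):
            recursive_traverse(os.path.join(path, contents))
    except OSError as e:
        # ENOTDIR means path is a file rather than a directory
        if e.errno != errno.ENOTDIR:
            raise
        # path is a file: handle it here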
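
The snippet above tells files from directories by catching ENOTDIR. As a possible alternative, and only as a sketch, os.walk already reports the subdirectory names of every directory it visits, so a folder with no further subdirectories is one whose dirnames list is empty. The function name find_leaf_dirs and its pattern parameter are made up for illustration; they are not from the original post.

```python
import fnmatch
import os


def find_leaf_dirs(root, pattern="*eedback*port*"):
    """Yield (dirpath, filenames) for matching directories that have no subdirectories.

    find_leaf_dirs and its parameters are hypothetical names used for this sketch.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        name = os.path.basename(dirpath)
        # an empty dirnames list means this directory contains files only
        if not dirnames and fnmatch.fnmatch(name, pattern):
            yield dirpath, filenames
```

Each yielded dirpath is the full path from root, so the location of every file can be rebuilt with os.path.join(dirpath, filename).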

【Comments on the question】:

Tags: python-3.x lambda subdirectory os.path

【Solution 1】:

Use pathlib and glob.

Test directory structure:

.
├── Untitled.ipynb
├── bar
│   └── foo
│       └── file2.txt
└── foo
    ├── bar
    │   └── file3.txt
    ├── foo
    │   └── file1.txt
    └── test4.txt

Code:

from pathlib import Path
here = Path('.')
for subpath in here.glob('**/foo/'):
    if any(child.is_dir() for child in subpath.iterdir()):
        continue # Skip the current path if it has child directories
    for file in subpath.iterdir():
        print(file.name)
        # process your files here according to whatever logic you need

Output:

file1.txt
file2.txt
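
To connect this back to the question, here is a hedged sketch of how the same pathlib approach might be adapted to pick the most recently modified .xlsx report in each matching folder. The function name latest_reports and the default pattern are assumptions for illustration, and the question's date cutoff is left out to keep the example short.

```python
from pathlib import Path


def latest_reports(root, pattern="*eedback*port*"):
    """Map each matching leaf directory to its most recently modified .xlsx file.

    latest_reports and its parameters are hypothetical names used for this sketch.
    """
    result = {}
    for folder in Path(root).rglob(pattern):
        if not folder.is_dir():
            continue  # rglob also returns files whose names match the pattern
        children = list(folder.iterdir())
        if any(child.is_dir() for child in children):
            continue  # skip folders that still contain subdirectories
        reports = [
            child for child in children
            if child.suffix.lower() == ".xlsx" and child.match(pattern)
        ]
        if reports:
            # the largest modification time belongs to the newest file
            result[folder] = max(reports, key=lambda p: p.stat().st_mtime)
    return result
```

The keys of the returned dictionary are the folder paths and the values are full file paths, which keeps the location information the question says is important.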

【Comments】: