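【Question title】: Copying specific files to a new folder, while maintaining the original subdirectory tree
【Posted】: 2023-03-27 06:59:01
【Question description】:

I have a large directory containing many subdirectories that I am trying to sort. I want to copy specific file types to a new folder while keeping the original subdirectory structure.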

import os
import shutil

def copyFile(src, dest):
    try:
        shutil.copy(src, dest)
    except shutil.Error as e:
        print('Error: %s' % e)
    except IOError as e:
        print('Error: %s' % e.strerror)


# Bookkeeping lists used below; `directory` (source root) and `dest`
# (destination folder) are assumed to be defined elsewhere in the script.
dir_paths, dir_names = [], []
EmptyDirs, EmptyDirPath = [], []
file_paths, file_names = [], []
SolidModels = []

for root, directories, files in os.walk(directory):
    for directoryname in directories:
        dirpath = os.path.join(root, directoryname)
        dir_paths.append(dirpath)
        dir_names.append(directoryname)

        if not os.listdir(dirpath):  # Checking if directory is empty
            print("Empty")
            EmptyDirs.append(directoryname)  # Add directory name to empty directory list
            EmptyDirPath.append(dirpath)

    for filename in files:
        filepath = os.path.join(root, filename)
        file_paths.append(filepath)
        file_names.append(filename)

        # Copy .sldasm and .sldprt files; every match lands directly in dest.
        if filename.lower().endswith((".sldasm", ".sldprt")):
            print(filename.encode('utf8'))
            SolidModels.append(filename)
            copyFile(filepath, dest)

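This is the code I am using now, but it only copies the files and not the subdirectories they were originally in, so they end up completely unorganized in the new folder.

This is the new code using copytree, but now the specific files are not copied; only the subdirectories are.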

def copytree(src, dst, symlinks=False, ignore=None):
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

    os.makedirs(dst)
    errors = []

    for name in names:
        if name in ignored_names:
            continue

        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)

        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                if src is "*.sldasm":
                    copy2(srcname, dstname)
                elif src is "*.sldprt":
                    copy2(srcname, dstname)

        except (IOError, os.error) as why:
            errors.append((srcname, dstname, str(why)))

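【Comments】:

  • See shutil.copytree (docs.python.org/2/library/shutil.html), in particular the copytree example code.
  • I used the sample, thanks! I got all the subdirectories to copy, but now I can't get the files to copy into the subdirectories they belong in. Any suggestions? I have added my updated code above.
  • The indentation in your code is very confusing because it mixes tabs and spaces. Using only spaces is generally considered a best practice, and most editors can be configured to convert tabs to spaces automatically as you type.

Tags: python copy subdirectory shutil os.walk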


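【Solution 1】:

You can do what you want with the built-in shutil.copytree() function by using (abusing?) its optional ignore keyword argument. The tricky part is that, if given, it must be a callable that returns what in each directory should not be copied, rather than what should be.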
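To make the contract of the ignore argument concrete, here is a minimal sketch using the standard shutil.ignore_patterns() helper. The directory path and file names are made up for illustration; the callable is invoked by hand the way copytree() would invoke it for one directory.

```python
import shutil

# shutil.ignore_patterns() is a factory: it returns a callable with the
# signature that copytree() expects for its ignore argument.
ignore = shutil.ignore_patterns('*.png', '*.gif')

# copytree() calls it once per directory, passing the directory path and the
# list of names inside it. The return value is the set of names to SKIP.
names = ['logo.png', 'notes.txt', 'anim.gif', 'subdir']  # made-up listing
skipped = ignore('some/dir', names)                      # made-up path

print(sorted(skipped))
```

Everything not in the returned set (here notes.txt and subdir) is copied, which is the opposite of what the question needs.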

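However, it is possible to write a factory function, similar to shutil.ignore_patterns(), that creates a function doing what is needed, and to pass the result as the value of the ignore keyword argument.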

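The returned function first uses fnmatch.filter() to determine which names to keep, then removes them, along with any subdirectory names, from the set of everything in the given directory; whatever remains is ignored. Subdirectories are never ignored, so they are left for later [recursive] processing. (This is why the whole directory tree gets copied, and it is probably what went wrong in your attempt to write your own copytree() function.)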
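One further likely cause in the hand-written copytree() from the question: the test src is "*.sldasm" compares the source directory path with a string literal by identity. The is operator never performs wildcard matching, so neither copy2() branch is effectively ever taken. A name-based test could look like the sketch below; wanted() and PATTERNS are hypothetical names introduced only for this illustration, and the lowercasing mirrors the .lower() call in the question's first snippet.

```python
import fnmatch
import os

PATTERNS = ('*.sldasm', '*.sldprt')  # patterns taken from the question

def wanted(srcname):
    """Return True if the file's base name matches one of PATTERNS."""
    # Lowercase the name so that "Fork.SLDPRT" matches on every platform.
    name = os.path.basename(srcname).lower()
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in PATTERNS)

print(wanted(os.path.join('parts', 'Fork.SLDPRT')))
print(wanted(os.path.join('parts', 'readme.txt')))
```

In the question's function, a check like this would be applied to srcname (the individual file), not to src (the directory being listed).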

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Works in Python 2.7 & 3.x

import fnmatch
from os.path import isdir, join

def include_patterns(*patterns):
    """ Function that can be used as shutil.copytree() ignore parameter that
    determines which files *not* to ignore, the inverse of "normal" usage.

    This is a factory function that creates a function which can be used as a
    callable for copytree()'s ignore argument, *not* ignoring files that match
    any of the glob-style patterns provided.

    `patterns` is a sequence of pattern strings used to identify the files to
    include when copying the directory tree.

    Example usage:

        copytree(src_directory, dst_directory,
                 ignore=include_patterns('*.sldasm', '*.sldprt'))
    """
    def _ignore_patterns(path, all_names):
        # Determine names which match one or more patterns (that shouldn't be
        # ignored).
        keep = (name for pattern in patterns
                        for name in fnmatch.filter(all_names, pattern))
        # Ignore file names which *didn't* match any of the patterns given that
        # aren't directory names.
        dir_names = (name for name in all_names if isdir(join(path, name)))
        return set(all_names) - set(keep) - set(dir_names)

    return _ignore_patterns


if __name__ == '__main__':

    from shutil import copytree, rmtree
    import os

    src = r'C:\vols\Files\PythonLib\Stack Overflow'
    dst = r'C:\vols\Temp\temp\test'

    # Make sure the destination folder does not exist.
    if os.path.exists(dst) and os.path.isdir(dst):
        print('removing existing directory "{}"'.format(dst))
        rmtree(dst, ignore_errors=False)

    copytree(src, dst, ignore=include_patterns('*.png', '*.gif'))

    print('done')
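For comparison, the os.walk() loop from the question can also keep the tree intact by rebuilding each file's path relative to the source root. This is an alternative sketch, not the copytree() approach shown above; copy_matching() is a hypothetical helper name, and it assumes the destination lies outside the source tree.

```python
import fnmatch
import os
import shutil

def copy_matching(src_root, dst_root, patterns):
    """Copy files matching any glob pattern, keeping their relative paths.

    Returns the list of copied paths, relative to src_root.
    """
    copied = []
    for root, _dirs, files in os.walk(src_root):
        # Path of the current directory relative to the source root
        # ("." for the root itself).
        rel_dir = os.path.relpath(root, src_root)
        for name in files:
            if not any(fnmatch.fnmatch(name, p) for p in patterns):
                continue
            # Recreate the same relative directory under the destination.
            dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
            if not os.path.isdir(dst_dir):
                os.makedirs(dst_dir)
            shutil.copy2(os.path.join(root, name), os.path.join(dst_dir, name))
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied

# Example call (paths are placeholders):
# copy_matching(src, dst, ['*.sldasm', '*.sldprt'])
```

One difference in behavior: this version only creates destination directories that end up containing a matching file, whereas copytree() with an include-style ignore function recreates every subdirectory, including ones that stay empty.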

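【Comments】:

  • Great, thank you very much! Can I also use this to include files based on their name? For example, if I wanted to include all files with "Fork" in the name?
  • Yes, I believe so: any glob-style filename pattern you want can be used, e.g. "*Fork*.*".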
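As a quick check of what the "*Fork*.*" pattern from the comment above selects, here is a small sketch with fnmatch.filter(), the same function the include_patterns() factory relies on. The file names are invented for illustration.

```python
import fnmatch

names = ['Fork_A.sldprt', 'LeftFork.sldasm', 'Bracket.sldprt', 'Forklift']

# "*Fork*.*" needs "Fork" somewhere in the name AND a literal dot after it,
# so the extensionless "Forklift" is not matched.
with_dot = fnmatch.filter(names, '*Fork*.*')
print(with_dot)

# "*Fork*" drops the dot requirement.
without_dot = fnmatch.filter(names, '*Fork*')
print(without_dot)
```

Note that fnmatch.filter() normalizes case according to the operating system, so a name such as fork_b.sldprt would match on Windows but not on a case-sensitive platform.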