How to split a filename using Python: answers

【Question title】: How to split the filename using Python
【Posted】: 2021-03-11 16:09:08
【Question description】:

I am creating an XML file in Python using Element and SubElement. My folder contains a list of zip files like this:

Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip
Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_param_value_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip

I want to split these zip file names and get names like the following:

Retirement_participant-plan_info_v1_getPlankeys
Retirement_participant-plan_info_resetcache_secretmanager
Retirement_participant-plan_info_v1_mypru_plankeys
Retirement_participant-plan_info_resetcache_param_value
Retirement_participant-plan_info_resetcache_param_v1_balances

PS: I want to remove _rev1_2021_03_09.zip when building the name from the zip file name.

Here is my Python code. It works for Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip, but it stops working when the zip file name is longer, for example Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip.

    Proxies = SubElement(proxy, 'Proxies')
    path = "./"
    for f in os.listdir(path):
        if '.zip' in f:
            Proxy = SubElement(Proxies, 'Proxy')
            name = SubElement(Proxy, 'name')
            fileName = SubElement(Proxy, 'fileName')
            a = f.split('_')
            name.text = '_'.join(a[:3])
            fileName.text = str(f)

【Comments on the question】:

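Is the part you want to strip from the end always the same, i.e. _rev1_2021_03_09.zip? Or does it vary while always following the same pattern, so that on another day it might be _rev8_2021_03_11.zip?
I think the problem is in the second-to-last line: name.text = '_'.join(a[:3]). Right now it only takes the first 3 segments of the _ split. Because the beginning of your file names varies in length, this sometimes cuts off part of the name. Since the ending is consistent, you can change 3 to -4, which keeps everything except the last four segments.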
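
The slicing change suggested in the comment above could look roughly like this. It is a minimal sketch that assumes every file name ends in exactly four underscore-separated segments (rev1, year, month, and day plus .zip); the helper name strip_suffix_parts is hypothetical and does not come from the original post.

```python
def strip_suffix_parts(filename, trailing_parts=4):
    """Drop the last `trailing_parts` underscore-separated segments."""
    parts = filename.split('_')
    # A negative slice bound keeps everything except the trailing segments,
    # regardless of how many segments the name starts with.
    return '_'.join(parts[:-trailing_parts])

short_name = strip_suffix_parts(
    'Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip')
long_name = strip_suffix_parts(
    'Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip')
print(short_name)
print(long_name)
```

Counting from the end rather than the start is what makes this independent of the length of the leading part of the name.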

Tags: python python-3.x xml python-2.7

【Solution 1】:

You can use str.split on '_rev1_' and keep the first piece:

>>> filenames
['Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_resetcache_param_value_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip']
>>> names = [fname.split('_rev1_')[0] for fname in filenames]
>>> names
['Retirement_participant-plan_info_v1_getPlankeys',
 'Retirement_participant-plan_info_resetcache_secretmanager',
 'Retirement_participant-plan_info_v1_mypru_plankeys',
 'Retirement_participant-plan_info_resetcache_param_value',
 'Retirement_participant-plan_info_resetcache_param_v1_balances']

The same result can be achieved with str.rsplit by limiting maxsplit to 4, so that only the last four underscores are split off:

>>> names = [fname.rsplit('_', 4)[0] for fname in filenames]
>>> names
['Retirement_participant-plan_info_v1_getPlankeys',
 'Retirement_participant-plan_info_resetcache_secretmanager',
 'Retirement_participant-plan_info_v1_mypru_plankeys',
 'Retirement_participant-plan_info_resetcache_param_value',
 'Retirement_participant-plan_info_resetcache_param_v1_balances']
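
If the revision number or the date can vary, as one of the question comments asks, a regular expression is one possible alternative to the two approaches above. This is only a sketch: the pattern assumes the suffix always has the shape _rev<digits>_YYYY_MM_DD.zip, the helper name base_name is hypothetical, and the second example file name is made up to show a different revision and date.

```python
import re

# Assumed suffix shape: _rev<digits>_<YYYY>_<MM>_<DD>.zip at the end of the name.
SUFFIX_RE = re.compile(r'_rev\d+_\d{4}_\d{2}_\d{2}\.zip$')

def base_name(filename):
    """Return the filename with the revision/date suffix removed, if present."""
    return SUFFIX_RE.sub('', filename)

examples = [
    'Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip',
    # Hypothetical name with a different revision and date.
    'Retirement_participant-plan_info_resetcache_param_value_rev8_2021_03_11.zip',
]
print([base_name(name) for name in examples])
```

Names that do not match the pattern are returned unchanged, so the function is safe to call on every entry in a folder.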

【Comments】:

Thanks, that actually worked. I just updated my code with the logic you described:

    for f in os.listdir(path):
        if '.zip' in f:
            Proxy = SubElement(Proxies, 'Proxy')
            name = SubElement(Proxy, 'name')
            fileName = SubElement(Proxy, 'fileName')
            a = f.split('_rev1_')[0]  # a = f.split('_')
            name.text = a  # name.text = '_'.join(a[:5])
            fileName.text = str(f)

Can we use a variable defined with argparse in this logic? Instead of names = [fname.split('_rev1_')[0] for fname in filenames] I tried names = [fname.split('repa.version')[0] for fname in filenames], but it does not work for me. Is there a better way?
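
One likely reason the attempt in the comment above fails is that 'repa.version' is written inside quotes, so Python searches for that literal text instead of using the value of the variable. A minimal sketch of passing the revision marker through argparse follows; the option name --version, its default, and the helper names are assumptions for illustration, not part of the original thread.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        description='Strip a revision suffix from zip file names.')
    # Hypothetical option; the default mirrors the 'rev1' marker in the question.
    parser.add_argument('--version', default='rev1',
                        help='revision marker that starts the suffix, e.g. rev1')
    return parser

def strip_revision(filenames, version):
    # Build the separator from the variable's value; do not quote the variable name.
    separator = '_' + version + '_'
    return [fname.split(separator)[0] for fname in filenames]

# An explicit argument list is passed so the sketch runs without a command line;
# a real script would call parse_args() with no arguments.
args = build_parser().parse_args(['--version', 'rev1'])
names = strip_revision(
    ['Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip'],
    args.version)
print(names)
```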

【Solution 2】:

If the rev and the date are always the same (2021_03_09), just replace them with an empty string:

filenames = [f.replace("_rev1_2021_03_09.zip", "") for f in os.listdir(path)]
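
Note that os.listdir returns every entry in the folder, and str.replace leaves names without the suffix unchanged, so non-zip files would end up in the list as well. A possible variant that keeps only matching entries is sketched below, assuming the suffix is fixed as in the question; the function name names_without_suffix and the non-zip example entry are hypothetical.

```python
SUFFIX = '_rev1_2021_03_09.zip'  # fixed suffix taken from the question

def names_without_suffix(entries, suffix=SUFFIX):
    """Keep only entries ending in `suffix` and cut it off the end."""
    return [entry[:-len(suffix)] for entry in entries if entry.endswith(suffix)]

# In the real script `entries` would come from os.listdir(path).
entries = [
    'Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip',
    'build_proxies.py',  # hypothetical non-zip file in the same folder
]
print(names_without_suffix(entries))
```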

【Comments】: