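[Posted]: 2018-01-15 03:51:43
[Question]:
Just for fun, I'm trying to write a batch rename application in Python 3.6.0. It should walk a directory, split each file name on a regular expression, and rename the file accordingly. For testing purposes, I'm writing the results to an output file until it works properly.
Here is my code: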
    def batch_rename(self):
        if self._root is None:
            raise NotADirectoryError("self._root is empty")
        with open('output.txt', 'w') as self._open_file:
            for root, dirs, files in os.walk(self._root):
                for name in files:
                    new_file = self._rename_file(root, name)
                    self._add_size(root, name)
                    self._open_file.write("\"{0}\" renamed to \"{1}\"\n".format(name, new_file))
                    self._count += 1
            self._open_file.write("\n")
            self._open_file.write("Total files: {0}\n".format(self._count))
            self._open_file.write("Total size: {0}\n".format(self._get_total_size()))
    def _rename_file(self, root_path, file_name):
        file_name = bytes(file_name, 'utf-8').decode('utf-8', 'ignore')
        # file_name = ''.join(x for x in file_name if x in string.printable)
        split_names = re.split(pattern=self._re, string=file_name)
        if len(split_names) > 1:
            new_file = self._prefix + ' ' + ''.join(split_names)
        else:
            new_file = self._prefix + ' ' + '' + split_names[0]
        new_file = new_file.replace('  ', ' ')
        return new_file
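I'm running into encoding problems because of characters that cannot be written to the file, such as:
- Russian letters (weird, I know)
- Symbols such as hearts, clubs, and spades.
The error message I get is: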
    Traceback (most recent call last):
      File "C:/Users/thisUser/OneDrive/Projects/Examples.Python/BatchFileRenamer/BatchFileRename2.py", line 90, in <module>
        br.batch_rename()
      File "C:/Users/thisUser/OneDrive/Projects/Examples.Python/BatchFileRenamer/BatchFileRename2.py", line 34, in batch_rename
        self._open_file.write("\"{0}\" renamed to \"{1}\"\n".format(name, new_file))
      File "C:\Users\thisUser\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\u2665' in position 10: character maps to <undefined>
I looked through three SO questions and their answers:
- Delete every non utf-8 symbols from string
- read a file and try to remove all non UTF-8 chars
- 'str' object has no attribute 'decode' in Python3
None of the answers helped.
Can anyone help? I'd really appreciate it :)
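
Editorial sketch of the failure in isolation, assuming (as the `cp1252.py` frame in the traceback suggests) that `output.txt` was opened with the Windows default encoding. The file name below is hypothetical; only the character `\u2665` comes from the error message. It also illustrates why the `bytes(...).decode(..., 'ignore')` line in `_rename_file` does not appear to remove anything: an ordinary `str` survives a UTF-8 round trip unchanged.

```python
# Hypothetical file name containing the heart symbol from the traceback.
name = "10 \u2665 hearts.txt"

# Encoding to UTF-8 and decoding again returns the same text, so the
# 'ignore' handler has nothing to drop for an ordinary str like this one.
round_tripped = bytes(name, "utf-8").decode("utf-8", "ignore")
same = round_tripped == name

# cp1252 has no mapping for U+2665, which matches the UnicodeEncodeError
# raised when the log line is written.
try:
    name.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can represent the character, so a UTF-8 output file would accept it.
utf8_bytes = name.encode("utf-8")
```

So the file names themselves look fine as Python strings; the problem seems to be the encoding of the output file rather than the names.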
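[Comments]:
-
If you want to convert those Unicode file names to plain ASCII, take a look at Unidecode.
-
Have you tried opening the file with codecs instead of open? - with codecs.open('output.txt', 'w', 'utf-8')..., see here - stackoverflow.com/questions/934160/…
@droravr That worked for me! If you'd like to post it as an answer, I'll mark it as accepted :)
-
@Sometowngeek, sure, thanks :)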
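
A minimal sketch of the fix suggested in the comments, under a few assumptions: it uses the built-in `open` with `encoding="utf-8"`, which in Python 3 should serve the same purpose here as `codecs.open('output.txt', 'w', 'utf-8')`. The helper `write_log`, the sample names, and the temporary path are hypothetical and not part of the original code.

```python
import os
import tempfile


def write_log(path, entries):
    """Write (old_name, new_name) pairs to path, always encoded as UTF-8."""
    # An explicit encoding avoids falling back to the platform default
    # (cp1252 in the traceback above), which cannot represent U+2665.
    with open(path, "w", encoding="utf-8") as log:
        for old_name, new_name in entries:
            log.write('"{0}" renamed to "{1}"\n'.format(old_name, new_name))


# Hypothetical sample data: a heart symbol plus a Cyrillic word.
entries = [("\u2665 \u0444\u0430\u0439\u043b.txt", "prefix file.txt")]
log_path = os.path.join(tempfile.mkdtemp(), "output.txt")
write_log(log_path, entries)

# Read the log back with the same encoding to check it round-trips.
with open(log_path, encoding="utf-8") as log:
    content = log.read()
```

In `batch_rename`, the equivalent change would be to pass `encoding='utf-8'` to the existing `open('output.txt', 'w')` call; the rest of the method can stay as it is.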
Tags: python string python-3.x encoding utf-8