【Question title】: Python - 'ZipFile' object has no attribute 'seek'
【Posted】: 2023-10-02 06:56:01
【Question description】:

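I'm having a problem with my code. I am trying to get a script working that can make ePub files. They are zip files that are deflated (i.e. no compression), and the entries must be added in a specific order. The current script creates a .zip, but the result is unusable both in the Python shell and in the Terminal application when running the zip -t command.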

The error in the Python shell is as follows:

Traceback (most recent call last):
  File "/Users/Hal/Documents/GitHub/Damore-essay-ebook/GenEpub-old.py", line 29, in <module>
    if zipfile.is_zipfile(zf) is True:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/zipfile.py", line 183, in is_zipfile
    result = _check_zipfile(fp=filename)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/zipfile.py", line 169, in _check_zipfile
    if _EndRecData(fp):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/zipfile.py", line 241, in _EndRecData
    fpin.seek(0, 2)
AttributeError: 'ZipFile' object has no attribute 'seek'

The error in the Mac terminal (though I'm sure the output would be the same wherever I run zip -t):

Archive:  IdealogicalEcho.epub
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of IdealogicalEcho.epub or
        IdealogicalEcho.epub.zip, and cannot find IdealogicalEcho.epub.ZIP, period.

Python source code:

#!/usr/bin/env python

#GenEpub.py - Generates an .epub file from the data provided.
#Ideally with no errors or warnings from epubcheck (needs to be implemented, maybe with the Python wrapper).

import os
import json
import zipfile

with open('metadata.json') as json_file:
        data = json.load(json_file)

#The ePub standard requires deflated compression and a compression order.
zf = zipfile.ZipFile(data["fileName"] + '.epub', mode='w', compression=zipfile.ZIP_STORED)

zf.write(data["fileName"] + '/mimetype')

for dirname, subdirs, files in os.walk(data["fileName"] + '/META-INF'):
    zf.write(dirname)
    for filename in files:
        zf.write(os.path.join(dirname, filename))

for dirname, subdirs, files in os.walk(data["fileName"] + '/EBOOK'):
    zf.write(dirname)
    for filename in files:
        zf.write(os.path.join(dirname, filename))

#zipfile has a built-in validator for debugging
if zipfile.is_zipfile(zf) is True:
    print("ZIP file is valid.")

#Extra debugging information
#print(getinfo.compress_type(zf))
#print(getinfo.compress_size(zf))
#print(getinfo.file_size(zf))

zf.close()

The JSON file I am using:

{
        "comment1": "Metadata.json - Insert the e-book's metadata here. WIP",

        "comment2": "Technical metadata - This is the where the cover image is specified. Recommended to use ePub V2.0.1 over 3.0 for epubVersion and Reflowable rather than Fixed for textPresentation (unless doing a project that requires a specific layout). mobiCover and generateKindle are currently unused but added for futureproofing.",
        "epubCover": "cover.jpg",
        "mobiCover": "cover.jpg",
        "fileName": "IdealogicalEcho",
        "epubVersion": "2.0.1",
        "textPresentation": "Reflowable",
        "generateKindle": "no",

        "comment3": "Book metadata - Information about the e-book itself. Language is specified with ISO 639-1. Rights can be worldwide, country specific or under a permissable license such as Creative-Commons SA",
        "title": "Google's Idealogical Echochamber",
        "creator": "James Damore",
        "subject": "Academic",
        "publisher": "Hal Motley",
        "ISBN": "-",
        "language": "en",
        "rights": "Creative-Commons SA",

        "comment4": "This is the page order that the e-book has. The first number before the colon is the page order, the second is the indentation, third is the page name and fourth is file itself.",
            "pages": [
                    {
                        "1": [0, "Cover", "bookcover.xhtml"],
                        "2": [0, "Title", "title.xhtml"],
                        "3": [0, "Indicia", "indicia.xhtml"],
                        "4": [0, "License", "license.xhtml"],
                        "5": [0, "Contents", "toc.xhtml"],
                        "6": [0, "Foreword", "foreword.xhtml"],
                        "7": [0, "Article", "article.xhtml"]
                    }
                            ]
}

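【Comments】:

  • Although the title of the OP's question suggests it is a duplicate of the question here, the two are actually unrelated. Review finished. Please finish the tour. Enjoy ;-)
  • "... deflated (i.e. no compression)" is a non sequitur. Deflate is a compression scheme; "Stored" means no compression (which is also a valid compression scheme, although not a very popular one, because it only increases the total file size). But the e-pub format does support compression.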
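
To make the distinction in the comment above concrete, here is a minimal sketch that writes the same bytes once with ZIP_STORED and once with ZIP_DEFLATED and compares the sizes recorded in the archive. The helper name compressed_size and the sample payload are made up for illustration and do not come from the question.

```python
import io
import zipfile


def compressed_size(payload: bytes, method: int) -> int:
    """Write payload into an in-memory archive and return its compressed size."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=method) as archive:
        archive.writestr("sample.txt", payload)
    # Reopen the finished archive read-only to inspect the entry metadata.
    with zipfile.ZipFile(buffer) as archive:
        return archive.getinfo("sample.txt").compress_size


payload = b"ePub " * 1000  # highly repetitive, so Deflate shrinks it a lot

stored = compressed_size(payload, zipfile.ZIP_STORED)
deflated = compressed_size(payload, zipfile.ZIP_DEFLATED)

print("stored:  ", stored)    # equals len(payload): the bytes are kept as-is
print("deflated:", deflated)  # considerably smaller for this input
```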

Tags: python json zipfile epub seek


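【Solution 1】:

The problem is somewhere inside is_zipfile. Even though the documentation says "filename may be a file or file-like object too" (13.5.1. ZipFile Objects: zipfile.is_zipfile), the call fails with the seek error. The likely reason is that zf is a ZipFile instance rather than a path or a file-like object, so it has no seek method.

A possible solution is to close the file and reopen it just for the check: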

zf.close()

# Reopen in binary mode: is_zipfile needs a path or a binary file-like object.
with open(data["fileName"] + '.epub', 'rb') as f:
    if zipfile.is_zipfile(f):
        print("ZIP file is valid.")

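I also found that this check is very basic: it returns True even if you manually corrupt some bytes. It takes some effort to really make it fail.

Interestingly, the apparently more thorough zipfile.ZipFile.testzip function needs zf again, but it also fails if it is called before zf.close(). And there is no zf.flush() ...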
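
One way around this, sketched here on the assumption that the archive has already been closed, is to reopen it in read mode before calling testzip. The function name first_bad_member and the throwaway demo.zip are hypothetical; testzip itself returns the name of the first member with a bad CRC or header, or None if nothing is wrong.

```python
import os
import tempfile
import zipfile


def first_bad_member(path):
    """Reopen a finished archive read-only and check every member's CRC."""
    with zipfile.ZipFile(path, mode="r") as archive:
        return archive.testzip()  # None means no corrupt member was found


# Build a small throwaway archive (a stand-in for the generated .epub).
workdir = tempfile.mkdtemp()
demo_path = os.path.join(workdir, "demo.zip")
with zipfile.ZipFile(demo_path, mode="w") as archive:
    archive.writestr("hello.txt", "hello")
# Leaving the with-block closes the archive and writes the central directory.

print(first_bad_member(demo_path))
```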

Fortunately, checking the created ePub file with zip after running the script shows that it has no errors:

~/Documents $ zip -T IdealogicalEcho.epub 
test of IdealogicalEcho.epub OK

(By the way, that does not tell you it is a valid epub. (It isn't.))
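
One likely reason it is not a valid epub: the EPUB container format (OCF) expects the mimetype file to be the first entry in the archive, stored without compression, and named exactly mimetype at the archive root. The question's script passes the on-disk path to zf.write, so every entry ends up under an IdealogicalEcho/ prefix. Below is a sketch of an alternative that uses the arcname parameter. The function build_epub and its arguments are hypothetical, and the result has not been run through epubcheck.

```python
import os
import zipfile


def build_epub(source_dir, epub_path):
    """Zip source_dir into epub_path with 'mimetype' first and uncompressed."""
    with zipfile.ZipFile(epub_path, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes first, stored as-is, at the archive root.
        archive.write(
            os.path.join(source_dir, "mimetype"),
            arcname="mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        for dirname, subdirs, files in os.walk(source_dir):
            subdirs.sort()  # walk subdirectories in a predictable order
            for filename in sorted(files):
                full_path = os.path.join(dirname, filename)
                # Entry names are relative to source_dir, with forward slashes.
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname != "mimetype":  # already written above
                    archive.write(full_path, arcname=arcname)
```

With the question's metadata this might be called as build_epub(data["fileName"], data["fileName"] + '.epub'), assuming the output file is written outside the source folder so that it is not picked up by os.walk.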

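【Comments】:

  • That is where my use of is_zipfile came from. epubcheck kept failing because it treated the zip file as invalid, and every e-book reader refused to open the book. I do appreciate getting rid of the seek error, though, which technically answers the question. Now to make an IDPF-validated ePub generator.
  • @HalMotley: Unzip a working epub and mimic its structure. There should be some meta folders and files that you can use as a template.
  • I ran the code on an ePub that I know works fully: I changed the extension to .zip, extracted the archive into a folder, changed the fileName property in my JSON, then executed GenEpub.py. Adobe Digital Editions went from recognising the e-book beforehand to having no idea what it is. It must be my code that is causing this. :-( github.com/inferno986return/Damore-essay-ebook


【Solution 2】: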

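I suggest you try closing the file before validating it. Performing whole-file operations on a file that is still open for writing may not give valid results.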
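
A minimal sketch of that ordering: a with block closes the archive, and only then is the path (not the ZipFile object) handed to is_zipfile. The function write_and_check and the file names are hypothetical.

```python
import os
import tempfile
import zipfile


def write_and_check(path, members):
    """Write members (name -> text) to path, close it, then validate the file."""
    with zipfile.ZipFile(path, mode="w") as archive:
        for name, text in members.items():
            archive.writestr(name, text)
    # The archive is closed here, so the central directory is on disk.
    return zipfile.is_zipfile(path)  # pass the path, not the ZipFile object


demo_path = os.path.join(tempfile.mkdtemp(), "demo.zip")
print(write_and_check(demo_path, {"mimetype": "application/epub+zip"}))
```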

【Comments】:

  • No dice running the validator after zf.close(). Still an interesting idea, though.