Take uploaded files on Plone and download them via a Python script? Answers

【Question title】: Take uploaded files on Plone and download them via a Python script?
【Posted】: 2019-04-11 09:23:25
【Question description】:

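I created a documentation site on Plone that supports file uploads. I can see that Plone stores the uploads on the file system as blobs. Now I need to retrieve them with a Python script so that the downloaded PDFs can be run through OCR. Does anyone know how to do this? Thanks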

【Question comments】:

Tags: python pdf blob ocr plone

【Solution 1】:

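I am not sure how to extract the PDFs directly from the BLOB storage, or whether that is possible at all, but you can extract them from the running Plone site (for example, by executing the script through a browser view):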

import os
from Products.CMFCore.utils import getToolByName

def isPdf(search_result):
    """Check mime_type for Plone >= 5.1, otherwise check the file extension."""
    return mimeTypeIsPdf(search_result) or search_result.id.endswith('.pdf')


def mimeTypeIsPdf(search_result):
    """
    Plone 5.1 introduced the mime_type attribute on files.
    Return True if mime_type exists and is PDF, otherwise False.
    """
    # Older versions have no mime_type on the search result; getattr with
    # a default replaces the bare except, which would hide unrelated errors.
    return getattr(search_result, 'mime_type', None) == 'application/pdf'


def exportPdfFiles(context, export_path):
    """
    Get all PDF files of the site and write them to export_path on the
    filesystem. Preserve the folder structure of the site.
    """
    catalog = getToolByName(context, 'portal_catalog')
    search_results = catalog(portal_type='File', Language='all')
    for search_result in search_results:
        # For each PDF file:
        if isPdf(search_result):
            # getPath() starts with '/', so export_path needs no trailing slash.
            file_path = export_path + search_result.getPath()
            file_content = search_result.getObject().data
            parent_path = os.path.dirname(file_path)
            # Create missing directories on the fly:
            if not os.path.exists(parent_path):
                os.makedirs(parent_path)
            # Write the PDF in binary mode so the bytes are not altered:
            with open(file_path, 'wb') as fil:
                fil.write(file_content)
            print('Wrote ' + file_path)

    print('Finished exporting PDF-files to ' + export_path)

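This example preserves the Plone site's folder structure in the export directory. If you would rather have the files flat in a single directory, you need a handler for duplicate file names.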
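
One possible shape for such a duplicate-name handler, as a rough sketch only: the helper name `unique_flat_name` and the `-1`, `-2` suffix scheme are my own assumptions and not part of Plone. It takes the catalog path of a file and a set of names already written, and returns a name that does not collide with any of them.

```python
import posixpath


def unique_flat_name(site_path, used_names):
    """Return a file name for site_path that is not yet in used_names.

    site_path is a catalog-style path such as '/Plone/docs/report.pdf'.
    used_names is a set of names already taken; it is updated in place.
    (Hypothetical helper, not part of Plone.)
    """
    # Catalog paths always use '/', so posixpath is used on every platform.
    base, ext = posixpath.splitext(posixpath.basename(site_path))
    candidate = base + ext
    counter = 1
    # Append -1, -2, ... before the extension until the name is free.
    while candidate in used_names:
        candidate = '%s-%d%s' % (base, counter, ext)
        counter += 1
    used_names.add(candidate)
    return candidate


used = set()
first = unique_flat_name('/Plone/a/report.pdf', used)
second = unique_flat_name('/Plone/b/report.pdf', used)
```

In the export loop you would then create one empty set before iterating and join `export_path` with the returned name instead of appending the full site path.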

【Comments】:

De nada, happy exporting!