【Question title】: How to create a PDF of images stored in Google Cloud Storage?
【Posted】: 2018-05-23 02:13:50
【Question description】:

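Apologies if this is a silly question. I am very new to GCP.

For a web app, I need to create a PDF from images stored in Cloud Storage.

First, I tried the Python package fpdf with files stored in Cloud Storage to see whether this is feasible. Because the images are stored online, I used urllib2 to fetch them.

Code: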

from fpdf import FPDF
import urllib2
import os

imagelist = ["https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses.jpg", "https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses2.jpg"]

pdf = FPDF()
i = 0
for image in imagelist:
    image = urllib2.urlopen(image)

    # writing image files in current folder
    with open('image'+str(i)+'.jpg','wb') as output:
        output.write(image.read())

    pdf.add_page()
    pdf.image('image'+str(i)+'.jpg', 10, 10, 100, 100) # pdf.image(image,x,y,w,h)

    # removing images
    os.remove('image'+str(i)+'.jpg')
    i += 1

# Creating PDF in current folder
pdf.output("yourfile.pdf", "F")

This works fine.

Then I tried deploying the same code on the local server:

import webapp2
from fpdf import FPDF
import urllib2
import os

pdf = FPDF()

class MainPage(webapp2.RequestHandler):
    def get(self):
        imagelist = ["https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses.jpg", "https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses2.jpg"]

        pdf = FPDF()
        i = 0
        for image in imagelist:
            image = urllib2.urlopen(image)

            with open('image'+str(i)+'.jpg','wb') as output:
                output.write(image.read())

            pdf.add_page()
            pdf.image('image'+str(i)+'.jpg', 10, 10, 100, 100) # pdf.image(image,x,y,w,h)

            os.remove('image'+str(i)+'.jpg')
            i += 1

        pdf.output("yourfile.pdf", "F")

application = webapp2.WSGIApplication([('/', MainPage)],
                                      debug=True)

However, I get this error:

WARNING  2017-12-08 19:21:56,184 sandbox.py:1082] The module _winreg is whitelisted for local dev only. If your application relies on _winreg, it is likely that it will not function properly in production.
WARNING  2017-12-08 14:21:56,190 urlfetch_stub.py:551] Stripped prohibited headers from URLFetch request: ['Host']
ERROR    2017-12-08 19:21:57,332 webapp2.py:1528] [Errno 30] Read-only file system: 'image0.jpg'
Traceback (most recent call last):
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "C:\MyMiniGCPProjects\FPDF\main.py", line 23, in get
    with open('image'+str(i)+'.jpg','wb') as output:
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\python\runtime\stubs.py", line 278, in __init__
    raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'image0.jpg'
ERROR    2017-12-08 19:21:57,339 wsgi.py:279]
Traceback (most recent call last):
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\runtime\wsgi.py", line 267, in Handle
    result = handler(dict(self._environ), self._StartResponse)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1519, in __call__
    response = self._internal_error(e)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\lib\webapp2-2.3\webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "C:\MyMiniGCPProjects\FPDF\main.py", line 23, in get
    with open('image'+str(i)+'.jpg','wb') as output:
  File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\python\runtime\stubs.py", line 278, in __init__
    raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'image0.jpg'

I couldn't find any workable solution. Is there a way to use the files in Cloud Storage directly and save the PDF to Cloud Storage?

【Comments】:

    Tags: python google-app-engine pdf google-cloud-storage webapp2


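    【Solution 1】:

    You are running into one of the sandbox restrictions. From The sandbox:

    An App Engine application cannot:

    • write to the filesystem. Applications must use Cloud Datastore for storing persistent data. Reading from the filesystem is allowed, and all application files uploaded with the application are available.

    Well, the note about the Datastore is actually a bit misleading: there are several storage options, and the best fit for your case is, IMHO, Cloud Storage (GCS).

    But you cannot use the regular open() to write files to GCS; you need to use the GCS client library for that. You can find an example here: Write a CSV to store in Google Cloud Storage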
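To make the "use the GCS client library instead of open()" point concrete, here is a minimal sketch, not a tested App Engine deployment. It assumes the `cloudstorage` package (GoogleAppEngineCloudStorageClient) is vendored with the app and uses its `cloudstorage.open()` call. The helper names `write_bytes_to_gcs` and `read_bytes_from_gcs`, the `gcs_open` parameter, and the bucket name in the usage note are hypothetical and only there for illustration.

```python
def _default_gcs_open():
    # Imported lazily: the `cloudstorage` package is only available in a
    # project that vendors GoogleAppEngineCloudStorageClient (assumption).
    import cloudstorage
    return cloudstorage.open


def write_bytes_to_gcs(data, gcs_path, content_type, gcs_open=None):
    """Write `data` to a GCS object; `gcs_path` looks like '/bucket/object'."""
    opener = gcs_open or _default_gcs_open()
    gcs_file = opener(gcs_path, 'w', content_type=content_type)
    try:
        gcs_file.write(data)
    finally:
        gcs_file.close()  # closing the file completes the write


def read_bytes_from_gcs(gcs_path, gcs_open=None):
    """Return the full contents of a GCS object as a byte string."""
    opener = gcs_open or _default_gcs_open()
    gcs_file = opener(gcs_path)  # default mode is read
    try:
        return gcs_file.read()
    finally:
        gcs_file.close()
```

In the handler from the question, the last line could then become something like `write_bytes_to_gcs(pdf.output(dest='S'), '/your-bucket/yourfile.pdf', 'application/pdf')`, since PyFPDF's `output()` returns the document as a string when `dest='S'`, so no local file is needed for the PDF itself. This only covers the output side; how the images reach `pdf.image()` without a local file is a separate problem.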

    【Discussion】:

    • Thank you. The images in imagelist are stored in Google Cloud Storage. I can't pass them directly to FPDF().image(), because it needs the image to be stored locally. So I use urllib2.urlopen(image) to create a file object for the image. But because FPDF().image() needs the image stored locally, I use with open('image'+str(i)+'.jpg','wb') as output, which produces the error above. You suggest storing in Google Cloud Storage, but the images are already in Google Cloud Storage. Any idea how to solve this?
    • The documentation states that the file parameter can also be a URL: "Path or URL of the image": fpdf.org/en/doc/image.htm
    • Yes, it does say that, but when I use: pdf.image("https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses.jpg", 10, 10, 100, 100) I get the error: RuntimeError: FPDF error: Missing or incorrect image file: https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses.jpg. error: [Errno 22] invalid mode ('rb') or filename: 'https://storage.googleapis.com/seventh-terrain-179700.appspot.com/excuses.jpg'
    • So I use urllib2.urlopen(image) and store the image locally, as described here: link. Do you know how to make this work?