【Question title】:google app engine + python: uploading to blobstore causes wrong encoding
【Posted】:2014-05-30 13:58:28
【Question】:

I am trying to upload a blob to the Google App Engine blobstore using the following HTML form:

<!DOCTYPE html>
<html>
<head>
<meta charset=utf-8>
</head>
<body>
<form id=upload action={{upload_url}} method=post enctype=multipart/form-data>
  Name: <input type=text name=name>
  Your photo: <input type=file name=image required=required><br><br>
  <input type=submit value=submit>
</form>
</body>
</html>

The value of the template variable {{upload_url}} is obtained on the server side with upload_url = blobstore.create_upload_url('/upload'). The script that handles the POST is as follows:

    class Test(ndb.Model):
        name = ndb.StringProperty()
        image = ndb.StringProperty()

    test = Test()
    test.name = self.request.get('name')
    image = self.get_uploads('image')[0]
    test.image = str(image.key())
    test.put()

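The name field is usually filled with non-English characters (for example, Chinese). The program above works fine on my local SDK. However, when it runs on Google App Engine, name is not encoded correctly. What is the problem?

【Comments】:

  • Try: test.name = self.request.get('name').decode('utf-8')
  • Well, that gives an error: UnicodeEncodeError: 'ascii' codec can't encode character u'\u6211' in position 0: ordinal not in range(128)
  • You could try uploading without upload_url and the redirect to track down the encoding problem. Have a look at gcs_upload.py in this gist: gist.github.com/voscausa/9541133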
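
The UnicodeEncodeError in the second comment is what Python 2 raises when .decode('utf-8') is called on a value that is already a unicode object: the value is first encoded implicitly with the ASCII codec, and that step fails for Chinese characters. The sketch below makes that implicit step explicit so it also runs on Python 3. It assumes that self.request.get('name') returned a unicode string, which the thread does not confirm.

```python
# -*- coding: utf-8 -*-
# Assumption: the form value is already a unicode string, not UTF-8 bytes.
name = u'\u6211'  # the character named in the error message

message = None
try:
    # Python 2 performs this ASCII encode implicitly inside unicode.decode().
    name.encode('ascii')
except UnicodeEncodeError as exc:
    message = str(exc)
    print(message)

# decode() is only meaningful on bytes; a round trip through UTF-8 is lossless.
raw = name.encode('utf-8')
print(raw.decode('utf-8') == name)
```

If this is what happened, the suggested .decode('utf-8') call could not have fixed the stored value, because the value was never a byte string.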

Tags: python google-app-engine blobstore


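【Answer 1】:

Don't you need quotes around the meta tag attribute values: <meta charset="UTF-8">? Also, try: <meta http-equiv="content-type" content="text/html; charset=utf-8" />. And make sure the template file itself is saved with UTF-8 encoding.

【Comments】:

  • Thanks. But HTML5 allows the quotes to be omitted, so for efficiency I think it is better to keep the HTML file smaller.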
【Answer 2】:

I just found out that this is a bug that has been around for years; see here. There are two solutions:

(1) Add the following to app.yaml:

libraries:
- name: webob
  version: "1.2.3"

(2) Add a file appengine_config.py with the following content:

# -*- coding: utf-8 -*-
from webob import multidict

def from_fieldstorage(cls, fs):
    """Create a dict from a cgi.FieldStorage instance.
    See this for more details:
    http://code.google.com/p/googleappengine/issues/detail?id=2749
    """
    import base64
    import quopri

    obj = cls()
    if fs.list:
        # fs.list can be None when there's nothing to parse
        for field in fs.list:
            if field.filename:
                obj.add(field.name, field)
            else:
                # first, set a common charset to utf-8.
                common_charset = 'utf-8'
                # second, check Content-Transfer-Encoding and decode
                # the value appropriately
                field_value = field.value
                transfer_encoding = field.headers.get('Content-Transfer-Encoding', None)
                if transfer_encoding == 'base64':
                    field_value = base64.b64decode(field_value)
                if transfer_encoding == 'quoted-printable':
                    field_value = quopri.decodestring(field_value)
                if 'charset' in field.type_options and field.type_options['charset'] != common_charset:
                    # decode with a charset specified in each
                    # multipart, and then encode it again with a
                    # charset specified in top level FieldStorage
                    field_value = field_value.decode(field.type_options['charset']).encode(common_charset)
                # TODO: Should we take care of field.name here?
                # Add every non-file field, whether or not it was re-encoded above.
                obj.add(field.name, field_value)
    return obj

multidict.MultiDict.from_fieldstorage = classmethod(from_fieldstorage)
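
The core of the patch above is undoing the Content-Transfer-Encoding of each text field before it is treated as UTF-8. Here is a minimal standalone sketch of that one step, written so it can be run outside App Engine. The helper name decode_field_value and the sample values are hypothetical and are not part of webob or the App Engine SDK.

```python
import base64
import quopri


def decode_field_value(raw, transfer_encoding=None, charset='utf-8'):
    """Undo the transfer encoding of one form field and return text.

    raw: the field body as bytes, as found in the multipart part.
    transfer_encoding: the part's Content-Transfer-Encoding header, if any.
    charset: the charset of the part (assumed to be utf-8 when absent).
    """
    if transfer_encoding == 'base64':
        raw = base64.b64decode(raw)
    elif transfer_encoding == 'quoted-printable':
        raw = quopri.decodestring(raw)
    return raw.decode(charset)


name = u'\u6211'  # the character from the error message in the comments

# The same value as it might arrive under each transfer encoding.
print(decode_field_value(b'5oiR', 'base64') == name)
print(decode_field_value(b'=E6=88=91', 'quoted-printable') == name)
print(decode_field_value(name.encode('utf-8')) == name)
```

A field that skips this step is stored in its still-encoded form (for example 5oiR instead of the Chinese character). That would match the symptom in the question, though the thread does not show the actual stored value.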
