【Title】: Wagtail Documents: Large file size (>2GB) upload fails
【Posted】: 2019-05-13 02:25:02
【Question】:

I'm trying to upload a file using the built-in wagtaildocs app in a Wagtail application. My Ubuntu 16.04 server was set up using the Digital Ocean tutorial method with Nginx | Gunicorn | Postgres.

Some preliminary notes:

  1. In my Nginx config I have set client_max_body_size 10000M;
  2. In my production settings I have the following lines: MAX_UPLOAD_SIZE = "5242880000" and WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
  3. My file type is .zip
  4. This is a production test at this point. I've only implemented a basic Wagtail app with no extra modules.

So as long as my file size stays under 10GB I should be fine from a configuration standpoint, unless I'm missing something or blind to a typo.
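For context, converting the configured limits to bytes confirms there should be plenty of headroom for a ~2GB file (a quick sanity check using the values from the notes above):

```python
# Sanity-check the upload limits quoted above, converted to GiB.
nginx_limit = 10000 * 1024 * 1024      # client_max_body_size 10000M
max_upload = 5242880000                # MAX_UPLOAD_SIZE
wagtail_images = 5000 * 1024 * 1024    # WAGTAILIMAGES_MAX_UPLOAD_SIZE

print(round(nginx_limit / 1024**3, 2))    # 9.77 GiB
print(round(max_upload / 1024**3, 2))     # 4.88 GiB
print(max_upload == wagtail_images)       # True: both are exactly 5000 MiB
```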

I've already tried bumping all of the config values to unreasonably large numbers. I've also tried other file extensions; nothing changes the error.

I think this is related to a TCP or SSL connection being closed during the session. I've never hit this problem before, so I'd appreciate some help.

Here is my error message:

Internal Server Error: /admin/documents/multiple/add/
Traceback (most recent call last):
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.DatabaseError: SSL SYSCALL error: Operation timed out


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/urls/__init__.py", line 102, in wrapper
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/decorators.py", line 34, in decorated_view
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/utils.py", line 151, in wrapped_view_func
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/vary.py", line 20, in inner_func
    response = func(*args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/documents/views/multiple.py", line 60, in add
    doc.save()
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 741, in save
    force_update=force_update, update_fields=update_fields)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 779, in save_base
    force_update, using, update_fields,
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 870, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 908, in _do_insert
    using=using, raw=raw)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1335, in execute_sql
    cursor.execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 99, in execute
    return super().execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.DatabaseError: SSL SYSCALL error: Operation timed out

Here are my settings:

### base.py ###
import os

PROJECT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
BASE_DIR = os.path.dirname(PROJECT_DIR)
SECRET_KEY = os.getenv('SECRET_KEY_WAGTAILDEV')

# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/


# Application definition

INSTALLED_APPS = [
    'home',
    'search',

    'wagtail.contrib.forms',
    'wagtail.contrib.redirects',
    'wagtail.embeds',
    'wagtail.sites',
    'wagtail.users',
    'wagtail.snippets',
    'wagtail.documents',
    'wagtail.images',
    'wagtail.search',
    'wagtail.admin',
    'wagtail.core',

    'modelcluster',
    'taggit',

    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'storages',
]

MIDDLEWARE = [
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'django.middleware.security.SecurityMiddleware',

    'wagtail.core.middleware.SiteMiddleware',
    'wagtail.contrib.redirects.middleware.RedirectMiddleware',
]

ROOT_URLCONF = 'wagtaildev.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [
            os.path.join(PROJECT_DIR, 'templates'),
        ],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'wagtaildev.wsgi.application'


# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'HOST': os.getenv('DATABASE_HOST_WAGTAILDEV'),
        'USER': os.getenv('DATABASE_USER_WAGTAILDEV'),
        'PASSWORD': os.getenv('DATABASE_PASSWORD_WAGTAILDEV'),
        'NAME': os.getenv('DATABASE_NAME_WAGTAILDEV'),
        'PORT': '5432',
    }
}


# Password validation
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators

AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]


# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/

LANGUAGE_CODE = 'en-us'

TIME_ZONE = 'UTC'

USE_I18N = True

USE_L10N = True

USE_TZ = True


# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/

STATICFILES_FINDERS = [
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
]

STATICFILES_DIRS = [
    os.path.join(PROJECT_DIR, 'static'),
]

# ManifestStaticFilesStorage is recommended in production, to prevent outdated
# Javascript / CSS assets being served from cache (e.g. after a Wagtail upgrade).
# See https://docs.djangoproject.com/en/2.2/ref/contrib/staticfiles/#manifeststaticfilesstorage
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

STATIC_ROOT = os.path.join(BASE_DIR, 'static')
STATIC_URL = '/static/'

MEDIA_ROOT = os.path.join(BASE_DIR, 'media')
MEDIA_URL = '/media/'


# Wagtail settings

WAGTAIL_SITE_NAME = "wagtaildev"

# Base URL to use when referring to full URLs within the Wagtail admin backend -
# e.g. in notification emails. Don't include '/admin' or a trailing slash
BASE_URL = 'http://example.com'

### production.py ###

from .base import *

DEBUG = True

ALLOWED_HOSTS = ['wagtaildev.wesgarlock.com', '127.0.0.1','134.209.230.125']

from wagtaildev.aws.conf import *

EMAIL_BACKEND = 'django.core.mail.backends.console.EmailBackend'

MAX_UPLOAD_SIZE = "5242880000"
WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
FILE_UPLOAD_TEMP_DIR = str(os.path.join(BASE_DIR, 'tmp'))

Here is my Nginx configuration:

server {
    listen 80;

    server_name wagtaildev.wesgarlock.com;
    client_max_body_size 10000M;

    location = /favicon.ico { access_log off; log_not_found off; }

    location / {
        include proxy_params;
        proxy_pass http://unix:/home/wesgarlock/run/wagtaildev.sock;
    }
}
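One thing worth checking in a setup like this (these directives are not in the original config, so treat the values as illustrative): client_max_body_size only caps the request size, while a slow multi-gigabyte upload can also trip Nginx's read/send timeouts, which default to 60s.

```nginx
server {
    client_max_body_size 10000M;

    # Timeouts that can bite on long uploads (all default to 60s):
    client_body_timeout 300s;   # max delay between successive reads of the request body
    proxy_read_timeout  300s;   # max wait for a response from the upstream app server
    proxy_send_timeout  300s;   # max delay between successive writes to the upstream
}
```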

【Comments】:

  • I tried that with no effect. I created a new droplet and a new Wagtail project with minimal settings to test this. I'm now trying uploads of 1.9GB and 2.3GB files. I'll post the output.

Tags: python django wagtail


【Solution 1】:

I haven't been able to solve this directly, but I did come up with a workaround.

I'm not a Wagtail or Django expert, so I'm sure there is a proper solution to this, but regardless, here is what I did. If you have any suggestions for improvement, feel free to comment.

As a note, this really is documentation to remind me of what I did. At this point (05-25-19) there are a lot of redundant lines of code, because I Frankensteined a lot of code together. I'll edit it over time.

Here are the tutorials I Frankensteined together to build this solution:

  1. https://www.codingforentrepreneurs.com/blog/large-file-uploads-with-amazon-s3-django/
  2. http://docs.wagtail.io/en/v2.1.1/advanced_topics/documents/custom_document_model.html
  3. https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html
  4. https://medium.com/faun/summary-667d0fdbcdae
  5. http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-browser-credentials-federated-id.html
  6. https://kite.com/python/examples/454/threading-wait-for-a-thread-to-finish
  7. http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#usage-systemd

There may have been a few others, but these are the main ones.

OK, here we go.

I created an app called 'files' and then a custom document model in its models.py file. You need to specify WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument' in your settings file. The only reason I did this was to track the behavior I was changing more explicitly. This custom document model just extends the standard document model in Wagtail.
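For completeness, the setting mentioned above is a one-liner in the project settings:

```python
# settings (e.g. base.py): point Wagtail's document system at the custom model
WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument'
```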

#models.py

from django.db import models
from wagtail.documents.models import AbstractDocument
from wagtail.admin.edit_handlers import FieldPanel
# Create your models here.
class LargeDocument(AbstractDocument):

    admin_form_fields = (
        'file',
    )
    panels = [
        FieldPanel('file', classname='fn'),
    ]

Next, you need to create a wagtail_hooks.py file with the following content (note the name: Wagtail auto-discovers hooks from a file called wagtail_hooks.py, not wagtail_hook.py).

#wagtail_hooks.py
from wagtail.contrib.modeladmin.options import (
    ModelAdmin, modeladmin_register)
from .models import LargeDocument
from .views import LargeDocumentAdminView


class LargeDocumentAdmin(ModelAdmin):
    model = LargeDocument

    menu_label = 'Large Documents'  # ditch this to use verbose_name_plural from model
    menu_icon = 'pilcrow'  # change as required
    menu_order = 200  # will put in 3rd place (000 being 1st, 100 2nd)
    add_to_settings_menu = False  # or True to add your model to the Settings sub-menu
    exclude_from_explorer = False # or True to exclude pages of this type from Wagtail's explorer view

    create_template_name = 'large_document_index.html'

# Now you just need to register your customised ModelAdmin class with Wagtail
modeladmin_register(LargeDocumentAdmin)

This lets you do two things:

  1. Create a new menu item for uploading large documents, while keeping the standard Documents menu item's behavior intact.
  2. Specify a custom HTML template for handling large uploads.

Here is the HTML:

{% extends "wagtailadmin/base.html" %}
{% load staticfiles cache %}
{% load static wagtailuserbar %}
{% load compress %}
{% load underscore_hyphan_to_space %}
{% load url_vars %}
{% load pagination_value %}

{% load static %}
{% load i18n %}

{% block titletag %}{{ view.page_title }}{% endblock %}

{% block content %}

    {% include "wagtailadmin/shared/header.html" with title=view.page_title icon=view.header_icon %}
          <!-- Google Signin Button -->
          <div class="g-signin2" data-onsuccess="onSignIn" data-theme="dark">
          </div>
          <!-- Select the file to upload -->

          <div class="input-group mb-3">
            <link rel="stylesheet" href="{% static 'css/input.css'%}"/>
            <div class="custom-file">
              <input type="file" class="custom-file-input" id="file" name="file">
              <label id="file_label" class="custom-file-label" style="width:auto!important;" for="inputGroupFile02" aria-describedby="inputGroupFileAddon02">Choose file</label>
            </div>
            <div class="input-group-append">
              <span class="input-group-text" id="file_submission_button">Upload</span>
            </div>
            <div id="start_progress"></div>
          </div>
          <div class="progress-upload">
            <div class="progress-upload-bar" role="progressbar" style="width: 100%;" aria-valuenow="100" aria-valuemin="0" aria-valuemax="100"></div>
          </div>
{% endblock %}

{% block extra_js %}
    {{ block.super }}
    {{ form.media.js }}
    <script src="https://apis.google.com/js/platform.js" async defer></script>
    <script src="https://sdk.amazonaws.com/js/aws-sdk-2.148.0.min.js"></script>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
    <script src="{% static 'js/awsupload.js' %}"></script>
{% endblock %}

{% block extra_css %}
    {{ block.super }}
    {{ form.media.css }}
    <meta name="google-signin-client_id" content="847336061839-9h651ek1dv7u1i0t4edsk8pd20d0lkf3.apps.googleusercontent.com">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">

{% endblock %}

Then I created a few views in views.py:

#views.py
from django.shortcuts import render

# Create your views here.
import base64
import hashlib
import hmac
import os
import time
from rest_framework import permissions, status, authentication
from rest_framework.response import Response
from rest_framework.views import APIView
from .config_aws import (
    AWS_UPLOAD_BUCKET,
    AWS_UPLOAD_REGION,
    AWS_UPLOAD_ACCESS_KEY_ID,
    AWS_UPLOAD_SECRET_KEY
)
from .models import LargeDocument
import datetime
from wagtail.contrib.modeladmin.views import WMABaseView
from django.db.models.fields.files import FieldFile
from django.core.files import File
import urllib.request
from django.core.mail import send_mail
from .tasks import file_creator

class FilePolicyAPI(APIView):
    """
    This view is to get the AWS Upload Policy for our s3 bucket.
    What we do here is first create a LargeDocument object instance in our
    Django backend. This is to include the LargeDocument instance in the path
    we will use within our bucket as you'll see below.
    """
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        """
        The initial post request includes the filename
        and auth credentials. In our case, we'll use
        Session Authentication but any auth should work.
        """
        filename_req = request.data.get('filename')
        if not filename_req:
                return Response({"message": "A filename is required"}, status=status.HTTP_400_BAD_REQUEST)
        policy_expires = int(time.time()+5000)
        user = request.user
        username_str = str(request.user.username)
        """
        Below we create the Django object. We'll use this
        in our upload path to AWS.

        Example:
        To-be-uploaded file's name: Some Random File.mp4
        Eventual Path on S3: <bucket>/username/2312/2312.mp4
        """
        doc_obj = LargeDocument.objects.create(uploaded_by_user=user, )
        doc_obj_id = doc_obj.id
        doc_obj.title=filename_req
        upload_start_path = "{location}".format(
                    location = "LargeDocuments/",
            )
        file_extension = os.path.splitext(filename_req)
        filename_final = "{title}".format(
                    title= filename_req,
                )
        """
        Eventual file_upload_path includes the renamed file to the
        Django-stored LargeDocument instance ID. Renaming the file is
        done to prevent issues with user generated formatted names.
        """
        final_upload_path = "{upload_start_path}/{filename_final}".format(
                                 upload_start_path=upload_start_path,
                                 filename_final=filename_final,
                            )
        if filename_req and file_extension:
            """
            Save the eventual path to the Django-stored LargeDocument instance
            """
            policy_document_context = {
                "expire": policy_expires,
                "bucket_name": AWS_UPLOAD_BUCKET,
                "key_name": "",
                "acl_name": "public-read",
                "content_name": "",
                "content_length": 524288000,
                "upload_start_path": upload_start_path,

                }
            policy_document = """
            {"expiration": "2020-01-01T00:00:00Z",
              "conditions": [
                {"bucket": "%(bucket_name)s"},
                ["starts-with", "$key", "%(upload_start_path)s"],
                {"acl": "public-read"},

                ["starts-with", "$Content-Type", "%(content_name)s"],
                ["starts-with", "$filename", ""],
                ["content-length-range", 0, %(content_length)d]
              ]
            }
            """ % policy_document_context
            aws_secret = str.encode(AWS_UPLOAD_SECRET_KEY)
            policy_document_str_encoded = str.encode(policy_document.replace(" ", ""))
            url = 'https://thearchmedia.s3.amazonaws.com/'
            policy = base64.b64encode(policy_document_str_encoded)
            signature = base64.b64encode(hmac.new(aws_secret, policy, hashlib.sha1).digest())
            doc_obj.file_hash = signature
            doc_obj.path = final_upload_path

            doc_obj.save()



        data = {
            "policy": policy,
            "signature": signature,
            "key": AWS_UPLOAD_ACCESS_KEY_ID,
            "file_bucket_path": upload_start_path,
            "file_id": doc_obj_id,
            "filename": filename_final,
            "url": url,
            "username": username_str,
        }
        return Response(data, status=status.HTTP_200_OK)

class FileUploadCompleteHandler(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        size = request.POST.get('fileSize')
        data = {}
        type_ = request.POST.get('fileType')
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            obj.size = int(size)
            obj.uploaded = True
            obj.type = type_
            obj.file_hash
            obj.save()
            data['id'] = obj.id
            data['saved'] = True
            data['url']=obj.url
        return Response(data, status=status.HTTP_200_OK)

class ModelFileCompletion(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        url = request.POST.get('aws_url')
        data = {}
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            file_creator.delay(obj.pk)
            data['test'] = 'process started'
        return Response(data, status=status.HTTP_200_OK)

def LargeDocumentAdminView(request):
    # The original version called super(WMABaseView, self) inside a plain
    # function (no self exists here) and never returned the response;
    # this is the minimal fix that renders the custom template.
    return render(request, 'modeladmin/files/index.html', {})

This view works around the standard file-handling system. I didn't want to abandon the standard file-handling system or write a new one, which is why I call this a hack and a non-ideal solution.
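As an aside, the policy/signature step in FilePolicyAPI above is the classic S3 POST-policy (signature v2) scheme: base64-encode the JSON policy document, then sign that base64 string with HMAC-SHA1 of the secret key. A minimal standalone sketch (with a dummy secret, not real credentials):

```python
import base64
import hashlib
import hmac

def sign_policy(policy_json, secret_key):
    # Base64 the policy document, then sign the base64 string itself
    # with HMAC-SHA1 and base64 the resulting digest.
    policy = base64.b64encode(policy_json.encode())
    signature = base64.b64encode(
        hmac.new(secret_key.encode(), policy, hashlib.sha1).digest()
    )
    return policy, signature

policy, signature = sign_policy('{"expiration": "2020-01-01T00:00:00Z"}', 'dummy-secret')
```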

// javascript upload file "awsupload.js"
var id_token; //token we get upon Authentication with Web Identiy Provider
function onSignIn(googleUser) {
  var profile = googleUser.getBasicProfile();
  // The ID token you need to pass to your backend:
  id_token = googleUser.getAuthResponse().id_token;
}

$(document).ready(function(){

  // setup session cookie data. This is Django-related
  function getCookie(name) {
      var cookieValue = null;
      if (document.cookie && document.cookie !== '') {
          var cookies = document.cookie.split(';');
          for (var i = 0; i < cookies.length; i++) {
              var cookie = jQuery.trim(cookies[i]);
              // Does this cookie string begin with the name we want?
              if (cookie.substring(0, name.length + 1) === (name + '=')) {
                  cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
                  break;
              }
          }
      }
      return cookieValue;
  }
  var csrftoken = getCookie('csrftoken');
  function csrfSafeMethod(method) {
      // these HTTP methods do not require CSRF protection
      return (/^(GET|HEAD|OPTIONS|TRACE)$/.test(method));
  }
  $.ajaxSetup({
      beforeSend: function(xhr, settings) {
          if (!csrfSafeMethod(settings.type) && !this.crossDomain) {
              xhr.setRequestHeader("X-CSRFToken", csrftoken);
          }
      }
  });
  // end session cookie data setup.

  // declare an empty array for potential uploaded files
  var fileItemList = []

  $(document).on('click','#file_submission_button', function(event){
      var selectedFiles = $('#file').prop('files');
      formItem = $(this).parent()
      $.each(selectedFiles, function(index, item){
          uploadFile(item)
      })
      $(this).val('');
      // reset the shared progress bar before the uploads start
      var progress = 0;
      $('.progress-upload-bar').attr('aria-valuenow', progress);
      $('.progress-upload-bar').attr('style', 'width:' + progress + '%');
      $('.progress-upload-bar').text(progress + '%');
  })
  $(document).on('change','#file', function(event){
      var selectedFiles = $('#file').prop('files');
      $('#file_label').text(selectedFiles[0].name)
  })



  function constructFormPolicyData(policyData, fileItem) {
     var contentType = fileItem.type != '' ? fileItem.type : 'application/octet-stream'
      var url = policyData.url
      var filename = policyData.filename
      var responseUser = policyData.username
      // var keyPath = 'www/' + repsonseUser + '/' + filename
      var keyPath = policyData.file_bucket_path
      var fd = new FormData()
      fd.append('key', keyPath + filename);
      fd.append('acl','private');
      fd.append('Content-Type', contentType);
      fd.append("AWSAccessKeyId", policyData.key)
      fd.append('Policy', policyData.policy);
      fd.append('filename', filename);
      fd.append('Signature', policyData.signature);
      fd.append('file', fileItem);
      return fd
  }

  function fileUploadComplete(fileItem, policyData){
      data = {
          uploaded: true,
          fileSize: fileItem.size,
          file: policyData.file_id,

      }
      $.ajax({
          method:"POST",
          data: data,
          url: "/api/files/complete/",
          success: function(data){
              displayItems(fileItemList)
          },
          error: function(jqXHR, textStatus, errorThrown){
              alert("An error occurred, please refresh the page.")
          }
      })
  }

  function modelComplete(policyData, aws_url){
      data = {
          file: policyData.file_id,
          aws_url: aws_url
      }
      $.ajax({
          method:"POST",
          data: data,
          url: "/api/files/modelcomplete/",
          success: function(data){
              console.log('model complete success')
          },
          error: function(jqXHR, textStatus, errorThrown){
              alert("An error occurred, please refresh the page.")
          }
      })
  }

  function displayItems(fileItemList){
      var itemList = $('.item-loading-queue')
      itemList.html("")
      $.each(fileItemList, function(index, obj){
          var item = obj.file
          var id_ = obj.id
          var order_ = obj.order
          var html_ = "<div class=\"progress\">" +
            "<div class=\"progress-bar\" role=\"progressbar\" style='width:" + item.progress + "%' aria-valuenow='" + item.progress + "' aria-valuemin=\"0\" aria-valuemax=\"100\"></div></div>"
          itemList.append("<div>" + order_ + ") " + item.name + "<a href='#' class='srvup-item-upload float-right' data-id='" + id_ + ")'>X</a> <br/>" + html_ + "</div><hr/>")

      })
  }

  function uploadFile(fileItem){
          var policyData;
          var newLoadingItem;
          // get AWS upload policy for each file uploaded through the POST method
          // Remember we're creating an instance in the backend so using POST is
          // needed.
          $.ajax({
              method:"POST",
              data: {
                  filename: fileItem.name
              },
              url: "/api/files/policy/",
              success: function(data){
                      policyData = data
              },
              error: function(data){
                  alert("An error occurred, please try again later")
              }
          }).done(function(){
              // construct the needed data using the policy for AWS
              var file = fileItem;
              AWS.config.credentials = new AWS.WebIdentityCredentials({
                  RoleArn: 'arn:aws:iam::120974195102:role/thearchmedia-google-role',
                  ProviderId: null, // this is null for Google
                  WebIdentityToken: id_token // Access token from identity provider
              });
              var bucket = 'thearchmedia'
              var key = 'LargeDocuments/'+file.name
              var aws_url = 'https://'+bucket+'.s3.amazonaws.com/'+ key
              var s3bucket = new AWS.S3({params: {Bucket: bucket}});
              var params = {Key: key , ContentType: file.type, Body: file, ACL:'public-read', };
              s3bucket.upload(params, function (err, data) {
                  $('#results').html(err ? 'ERROR!' : 'UPLOADED :' + data.Location);
                }).on(
                  'httpUploadProgress', function(evt) {
                    progress = parseInt((evt.loaded * 100) / evt.total)
                    $('.progress-upload-bar').attr('aria-valuenow',progress)
                    $('.progress-upload-bar').attr('width',progress.toString()+'%')
                    $('.progress-upload-bar').attr('style',"width:"+progress.toString()+'%')
                    $('.progress-upload-bar').text(progress.toString()+'%')

                  }).send(
                    function(err, data) {
                      alert("File uploaded successfully.")
                      fileUploadComplete(fileItem, policyData)
                      modelComplete(policyData, aws_url)
                    });
          })
  }


})

Explanation of how the .js and views.py interact

First, an Ajax call carrying the file info creates the Document object, but since the file never touches the server, no 'File' object is created on the Document. That 'File' object contains functionality I need, so more work was required. Next, my JavaScript uploads the file to my S3 bucket using the AWS JavaScript SDK. The SDK's s3bucket.upload() function is robust enough to upload files up to 5GB as-is, and with some further modifications it can upload up to 5TB (the AWS limit). After the file is uploaded to the S3 bucket, my final API call fires. That final API call triggers a Celery task that downloads the file into a temporary directory on my remote server. Once the file exists on my remote server, the File object is created and saved to the Document model.

The tasks.py file handles downloading the file from the S3 bucket to the remote server, then creating the File object and saving it to the Document model.

#tasks.py
from .models import LargeDocument
from celery import shared_task
import urllib.request
from django.core.mail import send_mail
from django.core.files import File
import threading

@shared_task
def file_creator(pk_num):
    obj = LargeDocument.objects.get(pk=pk_num)
    tmp_loc = 'tmp/'+ obj.title
    def downloadit():
        urllib.request.urlretrieve('https://thearchmedia.s3.amazonaws.com/LargeDocuments/' + obj.title, tmp_loc)

    def after_dwn():
         dwn_thread.join()           # wait until the download thread has finished executing
         #next chunk of code after download, goes here
         send_mail(
             obj.title + ' has finished downloading to the server',
             obj.title + ' downloaded to server',
             'info@thearchmedia.com',
             ['wes@wesgarlock.com'],
             fail_silently=False,
         )
         reopen = open(tmp_loc, 'rb')
         django_file = File(reopen)
         obj.file = django_file
         obj.save()
         send_mail(
             obj.title + ' file model created',
             'File model created for ' + obj.title,
             'info@thearchmedia.com',
             ['wes@wesgarlock.com'],
             fail_silently=False,
         )

    dwn_thread = threading.Thread(target=downloadit)
    dwn_thread.start()

    metadata_thread = threading.Thread(target=after_dwn)
    metadata_thread.start()

This process needs to run in Celery because downloading a large file takes time, and I didn't want to wait with the browser open. Inside this tasks.py there is also a Python thread that forces the process to wait until the file has successfully downloaded to the remote server. If you're new to Celery, here is the start of their documentation (http://docs.celeryproject.org/en/master/getting-started/introduction.html)
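The join() pattern inside the task can be seen in isolation in a toy sketch (time.sleep stands in for the real S3 download):

```python
import threading
import time

results = []

def download():
    # stand-in for urllib.request.urlretrieve(...) in the real task
    time.sleep(0.1)
    results.append('file downloaded')

def after_download(download_thread):
    download_thread.join()  # block until the download thread has finished
    results.append('file object saved')

dl = threading.Thread(target=download)
dl.start()
post = threading.Thread(target=after_download, args=(dl,))
post.start()
post.join()
print(results)  # ['file downloaded', 'file object saved']
```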

I also added some email notifications to confirm the process completed.

Final note: I created a /tmp directory in my project and set up a daily cron job that deletes old files, to give it tmp-like behavior:

crontab -e
# run once a day (the time of day here is arbitrary)
0 3 * * * find ~/thearchmedia/tmp -mtime +1 -delete

【Discussion】:

    【Solution 2】:

    I suspect the exception psycopg2.DatabaseError: SSL SYSCALL error: Operation timed out can occur if the droplet runs out of memory.

    Try creating a swap partition or adding more memory.

    Creating a swap partition
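    On Ubuntu, the swap-file route from that guide boils down to the following standard commands (a sketch; adjust the size to your droplet):

```shell
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# make it permanent across reboots:
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```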

    【Discussion】:

    • I'll try this and get back to you.