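When uploading objects (files) to Google Cloud Storage with a Signed URL, a size limit on the PUT upload can indeed be enforced.
This is done by setting the HTTP header x-goog-content-length-range (find the documentation here) and specifying the range of bytes you want the signed URL to allow. For example:
"x-goog-content-length-range":"0,24117249"
This specifies that files uploaded to that URL are accepted if their size is between 0 bytes and 24117249 bytes (roughly 23 MiB; strictly, 23 MiB plus one byte).
You must use this header both when creating the signed URL and when calling it (that is, when uploading the file).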
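As a rough sketch of what "use the header in both places" means, the snippet below builds the header once and passes the same dictionary to both the signing step and the upload step. The helper names (`content_length_range`, `generate_put_signed_url`, `upload`) are made up for illustration, and the two third-party libraries are imported lazily so the pure helper can be tried without credentials. Treat it as a minimal example under those assumptions, not a production implementation.

```python
import datetime

MAX_UPLOAD_BYTES = 24117249  # upper bound used in this answer


def content_length_range(min_bytes: int, max_bytes: int) -> dict:
    """Build the size-limit header (hypothetical helper, not a library call)."""
    if min_bytes < 0 or max_bytes < min_bytes:
        raise ValueError("expected 0 <= min_bytes <= max_bytes")
    return {"x-goog-content-length-range": f"{min_bytes},{max_bytes}"}


def generate_put_signed_url(bucket_name: str, blob_name: str, size_headers: dict) -> str:
    """Sign a PUT URL with the size header included in the signature."""
    from google.cloud import storage  # lazy import: needs credentials at call time

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    return blob.generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=15),
        method="PUT",
        headers=size_headers,  # same headers must be sent with the upload
    )


def upload(signed_url: str, payload: bytes, size_headers: dict):
    """Send the file to the signed URL with the identical size header."""
    import requests  # lazy import: only needed when actually uploading

    return requests.put(signed_url, data=payload, headers=size_headers)


size_headers = content_length_range(0, MAX_UPLOAD_BYTES)
```

With credentials configured, something like `upload(generate_put_signed_url("sample-bucket", "your-desired-filename", size_headers), data, size_headers)` would tie the two steps together.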
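Edit:
In response to Martin Zeitler's comment, I researched the topic further and managed to get a working script that uses a signed URL with a resumable upload.
How does it work? First, we create a signed URL for the POST method with a header that tells the bucket to start a resumable upload operation. In exchange, it responds with a Location header containing the URI to which we must send the file with a PUT request.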
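The two steps can be sketched in plain Python as follows. This is only an illustration: the function names are hypothetical, and `http` is assumed to be any object with `requests`-style `post` and `put` methods (for example a `requests.Session()`), which keeps the flow easy to try with a stub.

```python
def start_resumable_session(http, signed_url: str, token: str) -> str:
    """Step 1: POST to the signed URL and return the session URI."""
    response = http.post(
        signed_url,
        data="",
        headers={
            "Authorization": f"Bearer {token}",
            "x-goog-resumable": "start",  # must match the header signed into the URL
        },
    )
    if response.status_code != 201:  # 201 Created means the session was opened
        raise RuntimeError(f"could not start upload: HTTP {response.status_code}")
    return response.headers["Location"]


def put_file(http, session_uri: str, token: str, payload: bytes,
             size_range: str = "0,24117249"):
    """Step 2: PUT the file body to the session URI with the size limit."""
    return http.put(
        session_uri,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "x-goog-content-length-range": size_range,
        },
    )
```

With the real library this would be called as `start_resumable_session(requests.Session(), url, token)` followed by `put_file(...)` on the returned URI.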
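You will want to set up your credentials before starting the server. Learn more about how to do that here.
However, to have the permissions needed to call the signed URL and upload the file to the bucket, we need an access token. You can get one here. You can also learn more about OAuth2 Authentication. The access token does not have to be the same for obtaining the upload URI and for the upload itself; for simplicity, however, I decided to keep it the same.
You do not want to use the script itself in production: it was written for illustration purposes only.
(You need the flask and google-cloud-storage Python libraries for it to work)
main.py: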
from flask import Flask, render_template
import datetime
import requests
from google.cloud import storage

#----------------------------------------------------------------------
#----------------------------------------------------------------------

def generate_upload_signed_url_v4(bucket_name, blob_name):
    """Return a v4 signed URL that starts a resumable upload when POSTed to."""
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)  # Sets name of the target bucket
    blob = bucket.blob(blob_name)  # Sets the filename our object will have once uploaded to the bucket
    headers = {
        "x-goog-resumable": "start",  # Needed for creating a resumable upload: https://cloud.google.com/storage/docs/xml-api/reference-headers#xgoogresumable
    }
    url = blob.generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=15),
        headers=headers,
        method="POST",
    )
    return url

#----------------------------------------------------------------------
#----------------------------------------------------------------------

bucket_name = 'sample-bucket'  # INSERT YOUR BUCKET NAME HERE
blob_name = 'your-desired-filename'  # INSERT THE NAME OF THE FILE HERE
url = generate_upload_signed_url_v4(bucket_name, blob_name)  # Creates the Signed URL used to get the Session ID to upload the file

app = Flask(__name__)  # Flask

token = "access-token"  # Insert access token here
headers = {  # Must have the same headers used in the generation of the Signed URL + the Authorization header
    "Authorization": f"Bearer {token}",
    "x-goog-resumable": "start",
}

# Get Session ID from the `Location` response header and store it in the `session_url` variable
r = requests.post(url, data="", headers=headers)
if r.status_code == requests.codes.created:
    session_url = r.headers["Location"]
else:
    session_url = "None"  # The session could not be started; the upload from the page will fail

#----------------------------------------------------------------------
#----------------------------------------------------------------------

@app.route("/gcs", methods=["PUT", "GET", "POST"])
def main():
    return render_template("index.html", token=token, url=session_url)  # Sends token and session_url to the template

if __name__ == "__main__":
    app.run(debug=True, port=8080, host="0.0.0.0")  # Starts the server on port 8080, listening on all network interfaces (0.0.0.0)
templates/index.html (learn more about Flask templates here):
<html>
  <head>
  </head>
  <body>
    <input type="file" id="fileinput" />
    <script>
      // Select your input type file and store it in a variable
      const input = document.getElementById('fileinput');

      // This will upload the file after having read it
      const upload = (file) => {
        fetch('{{ url }}', { // Your PUT endpoint -> In this case, the Session ID URL retrieved by the Signed URL
          method: 'PUT',
          body: file,
          headers: {
            "Authorization": "Bearer {{ token }}", //I don't think it's a good idea to have this publicly available.
            "x-goog-content-length-range":"0,24117249" //Having this on the front-end may allow users to tamper with your system.
          }
        }).then(
          response => response.text()
        ).then(str => (new window.DOMParser()).parseFromString(str, "text/xml")
        ).then(data => { console.log(data); return data; } //Prints response sent from server in an XML format
        ).then(success => console.log(success) // Handle the success response object
        ).catch(
          error => console.log(error) // Handle the error response object
        );
      };

      const onSelectFile = () => upload(input.files[0]);
      input.addEventListener('change', onSelectFile, false); //Whenever a file is selected, the EventListener is triggered and executes the `onSelectFile` function
    </script>
  </body>
</html>
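Now we have to configure the CORS settings for our bucket. We must allow our server by changing the origin value. Then we must explicitly state the HTTP headers and methods we want to allow. If this is not set up correctly, a CORS error will be raised.
cors.json: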
[
  {
    "origin": ["http://<your-ip-here>:<yourport-here>"],
    "responseHeader": [
      "Content-Type",
      "Authorization",
      "Access-Control-Allow-Origin",
      "X-Upload-Content-Length",
      "X-Goog-Resumable",
      "x-goog-content-length-range"
    ],
    "method": ["PUT", "OPTIONS", "POST"],
    "maxAgeSeconds": 3600
  }
]
Once it is properly configured, we can apply this configuration to our bucket with the command:
gsutil cors set <name-of-configfile> gs://<name-of-bucket>
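If you would rather stay in Python than use gsutil, the same policy can, as far as I know, be applied through the `cors` property of a bucket in the google-cloud-storage library. The sketch below mirrors the cors.json above; `build_cors_policy` and `apply_cors` are hypothetical helper names, and the origin value is a placeholder you would replace with your own server address.

```python
def build_cors_policy(origin: str, max_age_seconds: int = 3600) -> list:
    """Return the same policy as cors.json, as a Python structure."""
    return [
        {
            "origin": [origin],
            "responseHeader": [
                "Content-Type",
                "Authorization",
                "Access-Control-Allow-Origin",
                "X-Upload-Content-Length",
                "X-Goog-Resumable",
                "x-goog-content-length-range",
            ],
            "method": ["PUT", "OPTIONS", "POST"],
            "maxAgeSeconds": max_age_seconds,
        }
    ]


def apply_cors(bucket_name: str, policy: list) -> list:
    """Set the policy on the bucket and return what the bucket now reports."""
    from google.cloud import storage  # lazy import: needs credentials at call time

    bucket = storage.Client().get_bucket(bucket_name)
    bucket.cors = policy
    bucket.patch()  # sends the changed property to Cloud Storage
    return bucket.cors


policy = build_cors_policy("http://localhost:8080")  # placeholder origin
```

Calling `apply_cors("sample-bucket", policy)` with valid credentials should have the same effect as the gsutil command.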
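To try this out, open your browser and enter this URL: http://<your-ip>:<your-port>/gcs.
Select a file of your choice (smaller than roughly 23 MiB, or whatever upper limit you set) and watch it actually get uploaded to your bucket.
Now you may want to try uploading a file larger than the upper limit set in the x-goog-content-length-range header and observe how the upload fails with an EntityTooLarge error.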
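If the upload is driven from a script rather than the browser, the failure can be detected by reading the error code out of the XML body. The sketch below assumes the error body has an `Error` root element with a `Code` child; the sample body and its message text are illustrative, not a captured server response, and `gcs_error_code` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def gcs_error_code(body: str) -> Optional[str]:
    """Return the text of <Code> from an XML error body, or None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None  # not XML at all, e.g. an empty success body
    if root.tag != "Error":
        return None
    return root.findtext("Code")


# Illustrative body only; the real message text may differ.
sample_body = (
    "<Error><Code>EntityTooLarge</Code>"
    "<Message>upload exceeds the allowed size</Message></Error>"
)

if gcs_error_code(sample_body) == "EntityTooLarge":
    print("File is larger than the x-goog-content-length-range upper bound")
```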