【Posted】: 2013-08-24 01:40:11
【Question】:
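Has anyone run into a limit with postForm in the RCurl package?
I am pulling data from a server and, almost out of nowhere, I got the error message * HTTP 1.0, assume close after body, followed by 500 Internal Server Error. I tested the configuration and everything seemed fine. I then created a clean database and re-uploaded my cases 20 to 30 at a time, repeatedly pulling the data with the API/postForm call from R. Everything worked until I reached about 150 cases, and then the error message appeared. Regardless of the order in which I upload the cases, the error occurs at around 150 to 160 cases, a total file size of about 11 to 12 MB. In other words, the error does not seem to depend on a specific case, because the case that breaks it is not the same each time.
Any suggestions would be greatly appreciated.
I have attached a screenshot to add a bit of color to this rather dull post, and to make up for not having a working example.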
Update 2013-08-24 19:33:18Z
Here is my curlVersion()$version and sessionInfo() information:
> curlVersion()$version
[1] "7.22.0"
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RCurl_1.95-4.1 bitops_1.0-6
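Update 2013-08-26 05:39:26Z
As suggested in hadley's comment, I have added the verbose RCurl output from a call that works and from a call that fails; see below.
Call that works, with fewer than 150 cases in the database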
> R.object.API <- postForm(R.object.URL, token=R.object.token, content="record", type="flat", format="csv", rawOrLabel="Label", .opts=curlOptions(ssl.verifypeer=TRUE, cainfo=R.object.crt, verbose=TRUE))
* About to connect() to research.org port 443 (#0)
* Trying xx.xx.xxx.xxx... * connected
* successfully set certificate verify locations:
* CAfile: /home/dir/research.cert
CApath: /etc/ssl/certs
* SSL connection using DHE-RSA-AES256-SHA
* Server certificate:
* subject: C=XX; postalCode=XXXXX-XXXX; ST=XX; L=XXXXXX; street=XXX; street=XX XXXXXX XX; O=XXXX, XXX; OU=XXX; CN=research.org
* start date: 2013-02-04 00:00:00 GMT
* expire date: 2016-02-04 23:59:59 GMT
* subjectAltName: research.org matched
* issuer: C=US; O=XXXXXX; OU=XXXXXX; CN=XXXXXX Server XX
* SSL certificate verify ok.
> POST /api/ HTTP/1.1
Host: research.org
Accept: */*
Content-Length: 573
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------XXXXXXXXXXXX
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Date: Mon, 26 Aug 2013 05:16:44 GMT
< Server: Apache/2.2.15 (Red Hat)
< X-Powered-By: PHP/5.3.3
< Expires: 0
< cache-control: no-store, no-cache, must-revalidate
< Pragma: no-cache
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=utf-8
<
* Closing connection #0
>
Call that fails, with more than 150 cases in the database
> R.object.API <- postForm(R.object.URL, token=R.object.token, content="record", type="flat", format="csv", rawOrLabel="Label", .opts=curlOptions(ssl.verifypeer=TRUE, cainfo=R.object.crt, verbose=TRUE))
* About to connect() to research.org port 443 (#0)
* Trying xx.xx.xxx.xxx... * connected
* successfully set certificate verify locations:
* CAfile: /home/dir/research.cert
CApath: /etc/ssl/certs
* SSL connection using DHE-RSA-AES256-SHA
* Server certificate:
* subject: C=XX; postalCode=XXXXX-XXXX; ST=XX; L=XXXXXX; street=XXX; street=XX XXXXXX XX; O=XXXX, XXX; OU=XXX; CN=research.org
* start date: 2013-02-04 00:00:00 GMT
* expire date: 2016-02-04 23:59:59 GMT
* subjectAltName: research.org matched
* issuer: C=US; O=XXXXXX; OU=XXXXXX; CN=XXXXXX Server XX
* SSL certificate verify ok.
> POST /api/ HTTP/1.1
Host: research.org
Accept: */*
Content-Length: 573
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------XXXXXXXXXXXX
< HTTP/1.1 100 Continue
* HTTP 1.0, assume close after body
< HTTP/1.0 500 Internal Server Error
< Date: Mon, 26 Aug 2013 05:15:05 GMT
< Server: Apache/2.2.15 (Red Hat)
< X-Powered-By: PHP/5.3.3
< Expires: 0
< cache-control: no-store, no-cache, must-revalidate
< Pragma: no-cache
< Content-Length: 276
< Connection: close
< Content-Type: text/html; charset=UTF-8
<
* Closing connection #0
Error: Internal Server Error
【Comments】:
-
Have you tried passing a value for curl's --keepalive-time parameter?
-
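Thanks for responding to my question! No, I haven't tried that yet. When I am back at my workstation I will read the documentation and try passing a value to RCurl's "keepalive-time parameter".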
-
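@scottyaz, I am using RCurl version 1.95-4.1, and I used listCurlOptions() to look up the option names the RCurl package understands, but --keepalive-time <seconds> isn't one of the 174 options listed. keepalive-time is mentioned in the libcurl manual page, but not in listCurlOptions(). Could you (or anyone else) possibly show how, within .opts=curlOptions( ... ), to keep the connection open longer than the default 60 seconds?
-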
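I doubt that changing the keepalive parameter will affect this. Even if the connection isn't kept alive, it should reconnect automatically. I suspect you are either sending data in a way the server doesn't understand, or you have some other server problem.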
-
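@hadley, thanks for your comments. Do you have any suggestions for how I could start identifying the error? In the problematic call I am only fetching data from the server, not sending actual data (assuming I am using the terminology correctly). Thanks.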
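
One possible starting point, offered as a sketch rather than a confirmed fix: the failing response above carries a 276-byte body (Content-Length: 276), and that body may contain the server's own error text. The helper below repeats the call from the question but routes the response headers and body into RCurl text gatherers, so they can be inspected even when the status is 500. The function name fetch_with_diagnostics and its arguments are hypothetical, and it assumes that postForm leaves a user-supplied writefunction in place instead of installing its own reader.

```r
library(RCurl)

# Hypothetical helper (not from the original post): repeat the API call,
# but keep the raw response headers and body for inspection.
fetch_with_diagnostics <- function(url, token, cainfo) {
  body   <- basicTextGatherer()  # accumulates the response body
  header <- basicTextGatherer()  # accumulates the raw response headers

  # If RCurl raises an error for the HTTP status, keep its message
  # instead of aborting, so the gathered text is still returned.
  err <- tryCatch({
    postForm(url, token = token, content = "record", type = "flat",
             format = "csv", rawOrLabel = "Label",
             .opts = curlOptions(ssl.verifypeer = TRUE, cainfo = cainfo,
                                 writefunction  = body$update,
                                 headerfunction = header$update))
    NULL
  }, error = function(e) conditionMessage(e))

  list(error = err, headers = header$value(), body = body$value())
}

# Usage, with the objects from the question:
# res <- fetch_with_diagnostics(R.object.URL, R.object.token, R.object.crt)
# cat(res$headers)
# cat(res$body)
```

If the body turns out to be only a generic error page, the server-side Apache and PHP logs for the same timestamp are likely the next place to look, since the request itself is identical in the working and failing calls.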