【Question Title】: php curl gives different result than from command line
【Posted】: 2019-08-30 11:31:24
【Question】:

I'm downloading a ZIP from a URL, but I've run into a problem. The first step of my algorithm is to check the Content-Type and Content-Length of the given URL:

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip");
curl_setopt($ch, CURLOPT_HEADER, 1); //I
curl_setopt($ch, CURLOPT_NOBODY, 1); //without body
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //L
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

curl_exec($ch);
$content_type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);

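However, the value of the variable $content_type is text/html; charset=utf-8.

Then I checked the Content-Type from the command line like this:

curl -IL https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip

and got the correct result (application/zip).

So what is the difference between these two approaches, and how can I get the correct Content-Type in my PHP script?

Edit: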

// Assumed: $verbose is a writable stream that collects the verbose log.
$verbose = fopen('php://temp', 'w+');
curl_setopt($ch, CURLOPT_URL, 'https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD');
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_STDERR, $verbose);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);

Verbose output from PHP curl:

* Hostname was found in DNS cache
* Hostname in DNS cache was stale, zapped
*   Trying 162.125.69.1...
* Connected to www.dropbox.com (162.125.69.1) port 443 (#14)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*    subject: businessCategory=Private Organization; 1.3.6.1.4.1.311.60.2.1.3=US; 1.3.6.1.4.1.311.60.2.1.2=Delaware; serialNumber=4348296; C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=www.dropbox.com
*    start date: 2017-11-14 00:00:00 GMT
*    expire date: 2020-02-11 12:00:00 GMT
*    subjectAltName: www.dropbox.com matched
*    issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
*    SSL certificate verify ok.
> HEAD /s/0hvgw7nvbdnh13d/ColaClassic.zip HTTP/1.1
Host: www.dropbox.com
Accept: */*

Verbose output from command-line curl:

*   Trying 162.125.69.1...
* TCP_NODELAY set
* Connected to www.dropbox.com (162.125.69.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: businessCategory=Private Organization; jurisdictionCountryName=US; jurisdictionStateOrProvinceName=Delaware; serialNumber=4348296; C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=www.dropbox.com
*  start date: Nov 14 00:00:00 2017 GMT
*  expire date: Feb 11 12:00:00 2020 GMT
*  subjectAltName: host "www.dropbox.com" matched cert's "www.dropbox.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fd8c4007a00)
> HEAD /s/0hvgw7nvbdnh13d/ColaClassic.zip HTTP/2
> Host: www.dropbox.com
> User-Agent: curl/7.54.0
> Accept: */*

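【Comments】:

  • curl_setopt($ch, CURLOPT_HEADER, 1); //I - yeah, no, that's wishful thinking. -I means make a HEAD request, whereas CURLOPT_HEADER means include the response headers in the output. You want curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); to translate -I correctly.
  • @misorude Even after adding the CURLOPT_CUSTOMREQUEST option, I still get text/html as the Content-Type.
  • You can "translate" your cURL command to PHP here: incarnate.github.io/curl-to-php. If it still doesn't work, I would start by sending a request with both methods to a script of my own that just logs all the request headers, and then check for notable differences.
  • Yes, I already tried that. I removed FOLLOWLOCATION (set it to false); in PHP I get HTTP status code 200, and on the command line I get 301. How is that possible? It's the same link.
  • Well, something about these two requests must be different, hence my suggestion to start by logging what they actually look like.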
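The logging approach suggested in the comments could look roughly like the sketch below. It assumes the script is served by a PHP setup that exposes request headers as HTTP_* entries in $_SERVER, which is the usual convention. format_request_headers() is a hypothetical helper name, and where error_log() writes depends on the server configuration. Logging is used instead of echoing because the body of a response to a HEAD request is discarded.

```php
<?php
// Hypothetical helper: turn a $_SERVER-style array into "Name: value" lines.
function format_request_headers(array $server): string
{
    $lines = [];
    foreach ($server as $key => $value) {
        // Request headers are conventionally exposed as HTTP_* keys.
        if (strpos($key, 'HTTP_') !== 0) {
            continue;
        }
        // Convert e.g. HTTP_USER_AGENT to User-Agent.
        $words = ucwords(strtolower(str_replace('_', ' ', substr($key, 5))));
        $lines[] = str_replace(' ', '-', $words) . ': ' . $value;
    }
    return implode("\n", $lines);
}

// When served over HTTP, write the request method and headers to the error log.
if (PHP_SAPI !== 'cli') {
    $method = $_SERVER['REQUEST_METHOD'] ?? '(unknown method)';
    error_log($method . "\n" . format_request_headers($_SERVER));
}
```

Pointing both the PHP script and command-line curl at wherever this file is served, then comparing the logged lines, should surface differences such as a missing User-Agent header.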

Tags: php curl


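【Solution 1】:

It seems Dropbox issues different response codes depending on the user agent, or rather on the lack of one. Your command-line call sends something like curl/7.47.0 (or whatever your version is), whereas the PHP script sends no user agent at all. Adding a user agent to your PHP request makes Dropbox respond with the appropriate HTTP/1.1 301 Moved Permanently, and your script will then follow the Location header as expected: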

$ch = curl_init();
// emulates user agent from command line.
$user_agent = 'curl/' . curl_version()['version'];
curl_setopt($ch, CURLOPT_URL, "https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip");
curl_setopt($ch, CURLOPT_HEADER, 1); //I
curl_setopt($ch, CURLOPT_NOBODY, 1); //without body
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //L
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);

curl_exec($ch);
$content_type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
echo $content_type;

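Update: Strangely, I just tried a few other things, such as emulating various browser user agent strings, and it seems Dropbox only issues the redirect when a curl/X.X.X user agent is used. ¯\_(ツ)_/¯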
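To check which user agents trigger the redirect, it can help to list every hop in the chain rather than only the final Content-Type. The sketch below assumes the request was made with CURLOPT_HEADER, CURLOPT_FOLLOWLOCATION and CURLOPT_RETURNTRANSFER enabled, so that curl_exec() returns the header blocks of all hops as one string. summarize_hops() is a hypothetical helper name, and the sample string is made up for illustration, not captured from Dropbox.

```php
<?php
// Hypothetical helper: summarise each response in a redirect chain.
function summarize_hops(string $rawHeaders): array
{
    $hops = [];
    // Header blocks of successive responses are separated by a blank line.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    foreach ($blocks as $block) {
        $lines = preg_split('/\r?\n/', $block);
        // Status line, e.g. "HTTP/1.1 301 Moved Permanently" or "HTTP/2 200".
        if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $lines[0], $m)) {
            continue;
        }
        $hop = ['status' => (int) $m[1], 'content_type' => null, 'location' => null];
        foreach (array_slice($lines, 1) as $line) {
            if (stripos($line, 'Content-Type:') === 0) {
                $hop['content_type'] = trim(substr($line, 13));
            } elseif (stripos($line, 'Location:') === 0) {
                $hop['location'] = trim(substr($line, 9));
            }
        }
        $hops[] = $hop;
    }
    return $hops;
}

// Illustrative input only; a real string would come from curl_exec($ch).
$sample = "HTTP/1.1 301 Moved Permanently\r\nLocation: /hypothetical/target\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: application/zip\r\n\r\n";

foreach (summarize_hops($sample) as $i => $hop) {
    printf("hop %d: %d %s\n", $i + 1, $hop['status'], $hop['content_type'] ?? '(none)');
}
```

Running this once per user agent string and comparing the hop lists would show whether a given agent gets the 301 or goes straight to a 200 text/html page.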

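【Comments】:

  • Thank you! Yes, after you gave me this answer I also experimented with the user agent parameter. It also works with just curl/ as the user agent. If I use my browser's user agent, it no longer works. Maybe it behaves this way because, when I visit that page in a browser, the page opens and shows the contents of the zip file.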