file_get_contents fails via PHP, works via browser

【Question title】: file_get_contents fails via PHP, works via browser
【Posted】: 2016-11-30 00:20:53
【Question description】:

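What I am trying to achieve:
Send a GET request to an API endpoint, retrieve the XML, and then parse the result.
I am sending a file_get_contents request to do this.

Problem:

`file_get_contents` fails with this error: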

Warning: file_get_contents(https://api.twitter.com/1.1/statuses/mentions_timeline.json):
failed to open stream: 
        A connection attempt failed because the connected party did not properly 
respond after a period of time, or established connection failed because 
connected host has failed to respond.

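Update 17/08

To consolidate my current understanding:
1. What fails:
1.a It fails via PHP (timeout)
1.b It fails via the command line (curl -G http://api.eve-central.com/api/quicklook?typeid=34)
1.c file_get_contents
1.d file_get_contents with stream_context_create

2. What works:
2.a Pasting the URL into a Chrome tab
2.b Via Postman

What I have tried:
- Inspecting the headers in Postman and trying to replicate them via PHP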

Postman Headers sent back by eve-central:
Access-Control-Allow-Origin → *  
Connection → Keep-Alive  
Content-Encoding → gzip  
Content-Type → text/xml; charset=UTF-8  
Date → Wed, 17 Aug 2016 10:40:24 GMT  
Proxy-Connection → Keep-Alive  
Server → nginx  
Transfer-Encoding → chunked  
Vary → Accept-Encoding  
Via → HTTP/1.1 proxy10014

Corresponding code:

// $url is assumed to hold the endpoint being requested.
$curl = curl_init();

// CURLOPT_HTTPHEADER expects a plain list of "Name: value" strings.
// Repeating the 'header' key in an associative array keeps only the last entry.
// Note: these values were copied from the response headers shown in Postman.
$headers = array(
    'Connection: Keep-Alive',
    'Content-Encoding: gzip',
    'Content-Type: text/xml',
    'Proxy-Connection: Keep-Alive',
    'Server: nginx',
    'Transfer-Encoding: chunked',
    'Vary: Accept-Encoding',
    'Via: HTTP/1.1 proxy10014',
);
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_PORT, 8080); // Attempt at changing port in the event it was blocked.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_POST, false);
curl_setopt($curl, CURLOPT_URL, $url);

$resp = curl_exec($curl);
if (curl_error($curl)) {
    echo 'error:' . curl_error($curl);
}

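- Capturing the GET request with Wireshark to see whether changing the port helped
- Running cURL from the command line
I am out of ideas and options. So the questions are:
1. If it works in the browser and in Postman, why does it not work via PHP?
2. How can I modify my code to mimic what Postman does?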
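
The `Via` and `Proxy-Connection` entries in the Postman response headers listed above suggest that the browser's traffic passes through an HTTP proxy. PHP does not read the browser's proxy settings, so one thing worth testing is passing the proxy explicitly. The following is a minimal sketch, not a confirmed fix: `buildProxyContextOptions` is a hypothetical helper, and the proxy host `proxy.example.internal` and port 8080 are placeholders that would have to be replaced with the values from the browser or system proxy configuration.

```php
<?php
// Hypothetical helper: builds stream-context options that route an HTTP(S)
// request through a proxy. Host and port are placeholders, not values from the post.
function buildProxyContextOptions(string $proxyHost, int $proxyPort, array $requestHeaders = []): array
{
    return [
        'http' => [
            'method'          => 'GET',
            'proxy'           => "tcp://{$proxyHost}:{$proxyPort}",
            'request_fulluri' => true, // many proxies expect the full URL in the request line
            'timeout'         => 10,   // seconds
            // Request headers go in one string, separated by CRLF.
            'header'          => implode("\r\n", $requestHeaders),
        ],
    ];
}

$options = buildProxyContextOptions('proxy.example.internal', 8080, [
    'Accept: text/xml',
    'Accept-Encoding: identity',
]);
$context = stream_context_create($options);

// With a real proxy address in place, the request would look like this:
// $xml = file_get_contents('http://api.eve-central.com/api/quicklook?typeid=34', false, $context);
```

Only request headers such as `Accept` belong in the `header` option; values like `Server` or `Transfer-Encoding` describe the response and have no effect when sent by the client. With cURL, the equivalent setting is `CURLOPT_PROXY`.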

Previous attempts. What I tried: various cURL options from other threads, such as

function curl_get_contents($url) {
    $ch = curl_init();
    if (!$ch) {
        die("Couldn't initialize a cURL handle");
    } else {
        echo "Curl Handle initialized ";
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $data = curl_exec($ch);
    // Check if any error occurred
    if (!curl_errno($ch)) {
        $info = curl_getinfo($ch);
        echo 'Took ', $info['total_time'], ' seconds to send a request to ', $info['url'], "";
        displayData($info); // displayData() is the asker's own helper (not shown)
    } else {
        echo "Failed Curl, reason: " . curl_error($ch) . " ";
    }
    curl_close($ch);
    return $data;
}

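Result: nothing, no data returned.
- Checked the php.ini options:
  - allow_url_fopen = On
  - allow_url_include = On
  - the relevant SSL extensions are enabled
- Raised the timeout window:
  - both via php.ini
  - and via explicit declarations in the PHP file
- Tried different URLs:
  - same error, so it does not depend on my particular endpoint
  - for example, twitter/wikipedia/google all return the same error
- Tried:
  - file_get_contents on a local XML file (https://msdn.microsoft.com/en-us/library/ms762271(v=vs.85).aspx) --> works
  - file_get_contents on a remote XML file (http://www.xmlfiles.com/examples/note.xml) --> fails with the same error
- Overall, the following holds so far:
  - cURL fails with a timeout
  - file_get_contents fails with a timeout
  - opening the XML file URL in a browser works
  - making the GET request via Postman works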
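
Since every remote URL times out while local files work, it can help to probe the connection one stage at a time and see whether name resolution or the TCP connect is the step that fails. This is a minimal diagnostic sketch, assuming direct (non-proxied) access is being tested; `probeEndpoint` is a hypothetical helper and the loopback address in the example call is only there so the snippet runs without network access.

```php
<?php
// Hypothetical diagnostic helper: checks DNS resolution, then a raw TCP connect.
function probeEndpoint(string $host, int $port = 80, float $timeoutSeconds = 3.0): array
{
    // gethostbyname() returns the input unchanged when resolution fails.
    $ip    = gethostbyname($host);
    $dnsOk = filter_var($ip, FILTER_VALIDATE_IP) !== false;

    $tcpOk = false;
    $error = '';
    if ($dnsOk) {
        $errno  = 0;
        $errstr = '';
        // @ suppresses the PHP warning; the failure is reported through $errno/$errstr.
        $socket = @fsockopen($ip, $port, $errno, $errstr, $timeoutSeconds);
        if ($socket !== false) {
            $tcpOk = true;
            fclose($socket);
        } else {
            $error = "[$errno] $errstr";
        }
    }

    return ['ip' => $ip, 'dns_ok' => $dnsOk, 'tcp_ok' => $tcpOk, 'error' => $error];
}

// Example call against the loopback address.
// For the endpoint in question, something like probeEndpoint('www.xmlfiles.com', 80) would be used.
$probe = probeEndpoint('127.0.0.1', 1, 1.0);
print_r($probe);
```

If DNS succeeds but the TCP connect times out, outbound traffic from the PHP host is probably being blocked or is expected to go through a proxy, which would match the symptoms listed above.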

Clearly, in every case where file_get_contents fails via PHP, I can easily access the file through any browser.

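Attempts to work around the problem.
Attempt 1:
Using nitrous.io, I created a LAMP stack and ran the operation on that platform. Result: file_get_contents works; however, because of the large number of XML files to retrieve, the operation times out. Tentative solution:
- Download the XML files from the source
- Zip them
- Download the xml_file
- Parse said XML files locally
Later, write a small PHP script that, when called, performs the steps above, sends the data to a local directory, then unpacks it and does the additional work on it.
Another attempt would be to use Google Sheets, which has a user function that pulls data into a sheet, and then dump the Excel file/values into MySQL.
For my purposes, although this is a very crude solution, it does the trick.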
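
The "parse locally" step of the workaround above could look roughly like the following. It is a minimal sketch, assuming the file has already been downloaded and unpacked; `parseLocalXml` is a hypothetical helper, and the sample document is illustrative data written to a temporary file so the snippet is self-contained.

```php
<?php
// Hypothetical helper: parses an XML file from disk and returns null on failure.
function parseLocalXml(string $path): ?SimpleXMLElement
{
    // Collect libxml parse errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_file($path);
    if ($xml === false) {
        foreach (libxml_get_errors() as $error) {
            error_log('XML error: ' . trim($error->message));
        }
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        return null;
    }
    libxml_use_internal_errors($previous);
    return $xml;
}

// Illustrative sample file standing in for a downloaded XML document.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, '<note><to>Tove</to><from>Jani</from></note>');

$note = parseLocalXml($path);
if ($note !== null) {
    echo 'to: ', $note->to, ', from: ', $note->from, "\n";
}
unlink($path);
```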

Code used to avoid the timeout problem on shared hosting:

function downloadUrlToFile2($url, $outFileName)
{
    //file_put_contents($xmlFileName, fopen($link, 'r'));
    //copy($link, $xmlFileName); // download xml file
    echo "Passing $url into $outFileName ";

    if (is_file($url)) {
        // Local path: a plain copy is enough.
        return copy($url, $outFileName);
    }

    $fp = fopen($outFileName, "w");
    if ($fp === false) {
        return false;
    }

    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL     => $url,
        CURLOPT_FILE    => $fp,   // stream the response body straight into the file
        CURLOPT_TIMEOUT => 28800, // set this to 8 hours so we don't time out on big files
    ));

    // With CURLOPT_FILE set, curl_exec() returns true/false rather than the body,
    // so the result must not be written to the file with fwrite().
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    return $ok;
}

I also added this at the top of the script:

ignore_user_abort(true);
set_time_limit(0);
ini_set('memory_limit', '2048M');
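
As an alternative to raising `memory_limit`, large files can be copied stream to stream so that the whole document is never held in memory at once. This is a minimal sketch under the assumption that the source can be opened with `fopen()`; `streamToFile` is a hypothetical helper, and temporary local files stand in for the real source so the snippet runs without network access.

```php
<?php
// Hypothetical helper: copies a readable stream to a file without loading it fully into memory.
function streamToFile(string $source, string $destination): int
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open source: $source");
    }
    $out = fopen($destination, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open destination: $destination");
    }

    // stream_copy_to_stream() moves the data in internal chunks and returns the byte count.
    $bytes = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);

    if ($bytes === false) {
        throw new RuntimeException('Copy failed');
    }
    return $bytes;
}

// Illustrative local files standing in for a large remote XML document.
$src = tempnam(sys_get_temp_dir(), 'src');
$dst = tempnam(sys_get_temp_dir(), 'dst');
file_put_contents($src, str_repeat('<row/>', 1000));

$copied = streamToFile($src, $dst);
echo "Copied $copied bytes\n";
```

For a remote source, the first argument would be the URL, which requires `allow_url_fopen` to be enabled and a working outbound connection.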

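【Comments】:

You are trying to fetch data without performing any authentication. Why don't you try one of the PHP wrappers for Twitter? dev.twitter.com/overview/api/twitter-libraries
Thanks for the reply. The Twitter URL is just one of the random URLs I used to try different options. The result does not change if you run file_get_contents($url) with $url set to xmlfiles.com/examples/note.xml. As you can see from that URL, it is a plain XML file that requires no authentication of any kind, and it still fails with a timeout error.
Where is the code running? Have you considered whether the machine you run it on has a direct internet connection? (The server may be sitting behind a forward proxy.) Can it resolve names? Is there no firewall blocking this access? Is it not restricted by other security mechanisms? Have you checked the logs? For security reasons, web server hosts are commonly configured to prevent them from making outgoing connections over the Internet (and this is the default for Redhat's SELinux policy).
I do have internet access. For example, I can reach xmlfiles.com/examples/note.xml via Chrome, but not via file_get_contents or curl. However, your point may be valid. I will try running the same code from my home PC to make sure it is not a proxy/firewall rule. Still, if I can make the GET request through the browser and the XML displays correctly, I should be able to do the same via PHP, shouldn't I? After all, they use the same HTTP stack, including curl, to retrieve the data.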

Tags: php curl timeout file-get-contents simplexml

【Solution 1】:

I found that HTTPS URL requests have some problems; to fix the issue, you have to add the following lines to your cURL request:

function curl_get_contents($url) { 
    $ch = curl_init();
    $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
    $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: ";
    curl_setopt( $ch, CURLOPT_HTTPHEADER, $header ); 

    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);

    // I have added below two lines
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}
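
The function above returns only the body, so a timeout, a DNS failure, and a TLS problem all look the same to the caller. A variant that reports the cURL error number makes it easier to tell them apart. This is a minimal sketch rather than part of the original answer: `classifyCurlError` and `curlGetWithDiagnostics` are hypothetical names, and the numeric codes are libcurl's documented error numbers.

```php
<?php
// Hypothetical helper: maps common libcurl error numbers to a short label.
function classifyCurlError(int $errno): string
{
    switch ($errno) {
        case 0:  return 'ok';
        case 5:  return 'proxy-dns'; // could not resolve proxy
        case 6:  return 'dns';       // could not resolve host
        case 7:  return 'connect';   // could not connect to host or proxy
        case 28: return 'timeout';   // operation timed out
        case 35:                     // SSL connect error
        case 60: return 'tls';       // peer certificate could not be verified
        default: return 'other';
    }
}

// Hypothetical wrapper: fetches a URL and returns the body together with error details.
// Requires the cURL extension.
function curlGetWithDiagnostics(string $url, int $timeoutSeconds = 10): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    $body  = curl_exec($ch);
    $errno = curl_errno($ch);
    $result = [
        'body'   => $body === false ? null : $body,
        'errno'  => $errno,
        'reason' => classifyCurlError($errno),
        'detail' => curl_error($ch),
    ];
    curl_close($ch);
    return $result;
}

// Example usage (needs network access, so it is left commented out):
// $result = curlGetWithDiagnostics('http://www.xmlfiles.com/examples/note.xml');
// echo $result['reason'], ': ', $result['detail'], "\n";
echo classifyCurlError(28), "\n";
```

A result of `timeout` or `connect` on a plain HTTP URL points at the network path (firewall or proxy) rather than at certificate checks, in which case disabling SSL verification would not be expected to change the outcome.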

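【Comments】:

Thanks. I incorporated your feedback and added more error catching; the code has been added to the original block. Interestingly, curl also times out: Getting contents from xmlfiles.com/examples/note.xml Curl Handle Initialized Failed Curl, reason: Connection timed out after 5008 milliseconds
@user3375601 I have the same problem with a URL: it works in the browser, but it does not work in PHP via file_get_contents or a curl request.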