【Question title】: Navigates to URL but gets no response
【Posted】: 2017-06-12 02:09:03
【Question】:

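I'm using CasperJS. I have an array of links and I open each page in turn. The loop over the array always stops at http://finishline.com/. It looks like I never get a response. I isolated the case and found that I can't get anything at all from that link.

It just stops: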

...
[info] [phantom] Running suite: 2 steps
[debug] [phantom] opening url: http://finishline.com/, HTTP GET
[debug] [phantom] Navigation requested: url=http://finishline.com/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=http://www.finishline.com/, type=Other, willNavigate=true, isMainFrame=true

Here is the isolated code:

var casper = require("casper").create({
    verbose: true,
    logLevel: "debug",
    pageSettings: {
        userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36",
        customHeaders: {
            "Server": "Apache/2.4.1 (Unix)"
        }
    },
    clientScripts: ["jquery.js"],
    viewportSize: {
        width: 1920,
        height: 1080
    }
});

casper.start().thenOpen("http://finishline.com")
    .then(function () {
        console.log("title : ", this.getTitle());
    });
casper.run();

Why can't I scrape Finish Line or get a response from it?

And how can I make the code continue when no response comes back?

【Comments】:

  • Out of curiosity, why are you sending custom headers with the request that should really be response headers?

Tags: javascript node.js web-scraping phantomjs casperjs


【Answer 1】:

I ran your code with SlimerJS:

casperjs --engine=slimerjs my_casper_script.js

It ran without any problems:

[info] [phantom] Starting... 
[info] [phantom] Running suite: 2 steps 
[debug] [phantom] opening url: http://finishline.com/, HTTP GET 
[debug] [phantom] Navigation requested: url=http://finishline.com/, type=Undefined, willNavigate=true, isMainFrame=true 
[debug] [phantom] url changed to "http://www.finishline.com/" 
[debug] [phantom] Navigation requested: url=https://a7474150781.cdn.optimizely.com/client_storage/a7474150781.html, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://a7474150781.cdn.optimizely.com/client_storage/a7474150781.html, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://cdns.us1.gigya.com/gs/webSdk/Api.aspx?apiKey=3_wvYmaqsebd4zKyXKtXex5iT6qVdAQNm8T5Vjh1LhGavP4EApJp4T5CcdKuXozmAe#origin=http://www.finishline.com/&hasGmid=false, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://cdns.us1.gigya.com/gs/webSdk/Api.aspx?apiKey=3_wvYmaqsebd4zKyXKtXex5iT6qVdAQNm8T5Vjh1LhGavP4EApJp4T5CcdKuXozmAe#origin=http://www.finishline.com/&hasGmid=false, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://4978775.fls.doubleclick.net/activityi;src=4978775;type=aa;cat=links00;dc_lat=;dc_rdid=;tag_for_child_directed_treatment=;&ord=0.17442537640363165, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://4978775.fls.doubleclick.net/activityi;src=4978775;type=aa;cat=links00;dc_lat=;dc_rdid=;tag_for_child_directed_treatment=;&ord=0.17442537640363165, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=http://login.dotomi.com/ucm/UCMController?dtm_com=28&dtm_fid=101&dtm_cid=61247&dtm_cmagic=963c2b&dtm_format=5&cli_promo_id=1&dtm_user_id=o867995714&dtmc_ref=&dtmc_loc=http%3A//www.finishline.com/&dtm_user_token=, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=http://login.dotomi.com/ucm/UCMController?dtm_com=28&dtm_fid=101&dtm_cid=61247&dtm_cmagic=963c2b&dtm_format=5&cli_promo_id=1&dtm_user_id=o867995714&dtmc_ref=&dtmc_loc=http%3A//www.finishline.com/&dtm_user_token=, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://20725988p.rfihub.com/ca.html?rfiidc=1048283195230333342&rfiaid=8c49df89b5394c92b2630ec07269f9c3&ver=9&rb=24949&ca=20725988&_o=24949&_t=20725988&pe=https%3A%2F%2F4978775.fls.doubleclick.net%2Factivityi%3Bdc_pre%3DCPyZuc2nuNQCFXah7Qod9IsPqg%3Bsrc%3D4978775%3Btype%3Daa%3Bcat%3Dlinks00%3Bdc_lat%3D%3Bdc_rdid%3D%3Btag_for_child_directed_treatment%3D%3B%26ord%3D0.17442537640363165&pf=http%3A%2F%2Fwww.finishline.com%2F&ra=82836050341159, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=https://20725988p.rfihub.com/ca.html?rfiidc=1048283195230333342&rfiaid=8c49df89b5394c92b2630ec07269f9c3&ver=9&rb=24949&ca=20725988&_o=24949&_t=20725988&pe=https%3A%2F%2F4978775.fls.doubleclick.net%2Factivityi%3Bdc_pre%3DCPyZuc2nuNQCFXah7Qod9IsPqg%3Bsrc%3D4978775%3Btype%3Daa%3Bcat%3Dlinks00%3Bdc_lat%3D%3Bdc_rdid%3D%3Btag_for_child_directed_treatment%3D%3B%26ord%3D0.17442537640363165&pf=http%3A%2F%2Fwww.finishline.com%2F&ra=82836050341159, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=http://dis.us.criteo.com/dis/dis.aspx?p=3616&cb=23181624377&ref=&sc_r=1280x1024&sc_d=24, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Navigation requested: url=http://dis.us.criteo.com/dis/dis.aspx?p=3616&cb=23181624377&ref=&sc_r=1280x1024&sc_d=24, type=Undefined, willNavigate=true, isMainFrame=false 
[debug] [phantom] Automatically injected jquery.js client side 
[debug] [phantom] Successfully injected Casper client-side utilities 
[info] [phantom] Step anonymous 2/2 http://www.finishline.com/ (HTTP 301) 
title :  Finish Line: Shoes, Sneakers & Athletic Gear 
[info] [phantom] Step anonymous 2/2: done in 7686ms. 
[info] [phantom] Done 2 steps in 7694ms 

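When I ran your code with PhantomJS as the engine, I got the same result as you did. SlimerJS has its own installation instructions, but if you already have npm installed, you can install it with this command: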

npm install slimerjs -g
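
For the second part of the question (carrying on when a page never responds), one thing that may be worth trying is CasperJS's `stepTimeout` and `onStepTimeout` options. The following is only a minimal sketch and is untested against finishline.com: `buildOptions` and the `links` array are hypothetical names, the 10-second limit is arbitrary, and calling `this.page.stop()` to abort the stuck load is an assumption that may behave differently under PhantomJS and SlimerJS.

```javascript
// Hypothetical helper: builds the options object handed to casper.create().
function buildOptions(stepTimeoutMs) {
    return {
        verbose: true,
        logLevel: "debug",
        // Maximum time (ms) a single step may take before onStepTimeout fires.
        stepTimeout: stepTimeoutMs,
        // Replaces the default handler, which ends the script with an error.
        onStepTimeout: function (timeout, stepNum) {
            this.log("Step " + stepNum + " exceeded " + timeout + " ms, skipping", "warning");
            // Assumption: stopping the page load lets the suite move on.
            if (this.page && typeof this.page.stop === "function") {
                this.page.stop();
            }
        }
    };
}

// Only start scraping when run by CasperJS, where the `phantom` global exists.
if (typeof phantom !== "undefined") {
    var casper = require("casper").create(buildOptions(10000));
    var links = ["http://finishline.com/"]; // placeholder for your array of links

    casper.start();
    // eachThen adds one step per item; the item is available as response.data.
    casper.eachThen(links, function (response) {
        this.thenOpen(response.data, function () {
            console.log("title : ", this.getTitle());
        });
    });
    casper.run();
}
```

Whether the loop really moves on to the next link after a timeout depends on the engine, so check the debug log for the warning line before relying on it.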

Hope this helps.

【Comments】:

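  • I'll test it when I get a chance. Thanks for your input.
  • Oh, and if you're on Ubuntu, I wrote a bash script that automatically installs casperjs, slimerjs, phantomjs, and the graphics engine needed to make slimerjs fully headless. You can get it here.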