【Question title】: Scraping Data from Google Maps
【Posted】: 2015-01-06 22:14:32
【Question】:

I can scrape the first 10 map results directly from the default link, for example:

https://www.google.com/maps/search/toronto+dentist/

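However, I cannot scrape the next 10 results. When I click the right arrow, the page fetches the data with JavaScript and the URL does not change, so I have no URL that points to the source of the next 10 results.

The classic version of Google Maps can be scraped, but its results differ from those of the current Google Maps.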

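【Comments】:

  • Google does not want you to do this.
  • While I agree with @Dagon, it is easy to see the requests being made in the console of the browser's developer tools.

Tags: javascript php google-maps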


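【Solution 1】:

You want to use pjscrape. Rather than simply scraping the HTML, it has a full headless WebKit engine that runs JavaScript and renders the page accordingly.

The API is very simple. Just make sure to rate-limit your requests, because Google is quite strict about this kind of thing.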
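
One way to apply the rate limiting mentioned above is to enforce a minimum interval between requests. The following is a minimal sketch in PHP (7.4 or later), not part of pjscrape. The class name `Throttle`, its methods, and the 1-second interval are assumptions chosen for illustration only.

```php
<?php
// Hypothetical helper (not part of pjscrape): enforces a minimum
// interval between consecutive requests.
final class Throttle
{
    private float $minInterval;          // seconds required between requests
    private ?float $lastRequest = null;  // timestamp of the previous request

    public function __construct(float $minIntervalSeconds)
    {
        $this->minInterval = $minIntervalSeconds;
    }

    // Seconds still to wait at time $now before the next request is allowed.
    public function remaining(float $now): float
    {
        if ($this->lastRequest === null) {
            return 0.0;
        }
        return max(0.0, $this->lastRequest + $this->minInterval - $now);
    }

    // Sleeps if needed, then records the time of this request.
    public function wait(): void
    {
        $delay = $this->remaining(microtime(true));
        if ($delay > 0) {
            usleep((int) round($delay * 1000000));
        }
        $this->lastRequest = microtime(true);
    }
}

$throttle = new Throttle(1.0); // assumed interval; tune it to your own needs
foreach (['toronto dentist', 'toronto plumber'] as $query) {
    $throttle->wait();
    // ... run the scraper for $query here ...
}
```

The first call to `wait()` returns immediately; each later call blocks until the configured interval has passed since the previous one.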

【Comments】:

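    【Solution 2】:

    Alternatively, you can use a third-party solution such as SerpApi. It is a paid API with a free trial.

    Each page contains 20 results. To paginate, use the start parameter, which defines the result offset (for example, 0 (the default) is the first page of results, 20 is the second page, 40 is the third page, and so on).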

    require 'path/to/google_search_results';
    
    // Search parameters for the Google Maps engine
    $query = [
      "engine" => "google_maps",
      "q" => "toronto dentist",
      "ll" => "@43.7162866,-79.4689839,12z",
      "type" => "search",
      "api_key" => "SECRET_API_KEY"
    ];
    
    $search = new GoogleSearch();
    $results = $search->json($query);
    // Read the places from the decoded response
    $local_results = $results->local_results;
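
The pagination rule described above (20 results per page, offset passed as start) can be sketched as follows. This is only an illustration: the helper name `buildPagedQueries` and the constant `RESULTS_PER_PAGE` are hypothetical and not part of the SerpApi library, and the sketch only builds the query arrays without sending any request.

```php
<?php
// Page size stated in the answer above: 20 results per page.
const RESULTS_PER_PAGE = 20;

/**
 * Hypothetical helper: returns one query array per page, each with the
 * "start" offset set (0, 20, 40, ...).
 */
function buildPagedQueries(array $baseQuery, int $pages): array
{
    $queries = [];
    for ($page = 0; $page < $pages; $page++) {
        $queries[] = array_merge($baseQuery, ['start' => $page * RESULTS_PER_PAGE]);
    }
    return $queries;
}

$baseQuery = [
    'engine'  => 'google_maps',
    'q'       => 'toronto dentist',
    'll'      => '@43.7162866,-79.4689839,12z',
    'type'    => 'search',
    'api_key' => 'SECRET_API_KEY',
];

foreach (buildPagedQueries($baseQuery, 3) as $query) {
    // Each $query could be passed to $search->json($query) as in the snippet above.
    echo $query['q'], ' start=', $query['start'], PHP_EOL;
}
```

Keeping the offset calculation in one small function leaves the base query untouched and makes it easy to stop once a page comes back with no local results.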
    

    Example output:

    "local_results": [
      {
        "position": 1,
        "title": "Archer Dental Rosedale",
        "data_id": "0x882b2c15997e2805:0xb33e0a75557c03da",
        "reviews_link": "https://serpapi.com/search.json?engine=google_maps_reviews&hl=en&place_id=0x882b2c15997e2805%3A0xb33e0a75557c03da",
        "gps_coordinates": {
          "latitude": 43.6715812,
          "longitude": -79.37675469999999
        },
        "place_id_search": "https://serpapi.com/search.json?data=%214m5%213m4%211s0x882b2c15997e2805%3A0xb33e0a75557c03da%218m2%213d43.6715812%214d-79.37675469999999&engine=google_maps&google_domain=google.com&hl=en&type=place",
        "rating": 4.5,
        "reviews": 180,
        "type": "Dental clinic",
        "address": "600 Sherbourne St #808, Toronto, ON M4X 1W4, Canada",
        "hours": "Opens at 8:00 AM",
        "phone": "+1 416-964-0010",
        "website": "https://www.archerdental.ca/locations/archer-dental-rosedale/",
        "thumbnail": "https://lh5.googleusercontent.com/p/AF1QipMqm7_r6phLhJ-QMwQL1lE1ukz7DBJ4DeyK4yzK=w161-h92-k-no"
      },
      {
        "position": 2,
        "title": "Toronto Dentist Office",
        "data_id": "0x882b2d0d9df040c7:0xc2a062472b901c76",
        "reviews_link": "https://serpapi.com/search.json?engine=google_maps_reviews&hl=en&place_id=0x882b2d0d9df040c7%3A0xc2a062472b901c76",
        "gps_coordinates": {
          "latitude": 43.777342999999995,
          "longitude": -79.41546699999999
        },
        "place_id_search": "https://serpapi.com/search.json?data=%214m5%213m4%211s0x882b2d0d9df040c7%3A0xc2a062472b901c76%218m2%213d43.777342999999995%214d-79.41546699999999&engine=google_maps&google_domain=google.com&hl=en&type=place",
        "rating": 4.6,
        "reviews": 10,
        "type": "Dentist",
        "address": "5460 Yonge St Suite 201, North York, ON M2N 6K7, Canada",
        "hours": "Opens at 9:00 AM",
        "phone": "+1 416-733-7855",
        "website": "https://torontodentistoffice.ca/",
        "thumbnail": "https://lh5.googleusercontent.com/p/AF1QipMrNZUWbaZ7TkWUittUPY_UoWVOBO4Mj3vwyUZ7=w138-h92-k-no"
      },
      ...
    ],
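
To show how a response of this shape might be consumed, here is a minimal sketch that reduces each entry of local_results to a few fields. The JSON is a trimmed copy of the sample above, wrapped in braces so that it parses. The helper name `summarizeLocalResults` is hypothetical, and which fields to keep is an assumption made for illustration.

```php
<?php
// Trimmed copy of the sample output above, wrapped in braces to form valid JSON.
$json = <<<'JSON'
{
  "local_results": [
    {"position": 1, "title": "Archer Dental Rosedale", "rating": 4.5, "reviews": 180, "phone": "+1 416-964-0010"},
    {"position": 2, "title": "Toronto Dentist Office", "rating": 4.6, "reviews": 10, "phone": "+1 416-733-7855"}
  ]
}
JSON;

/**
 * Hypothetical helper: keeps only title, rating and phone for each place.
 * Fields missing from a result become null.
 */
function summarizeLocalResults(object $response): array
{
    $rows = [];
    foreach ($response->local_results ?? [] as $place) {
        $rows[] = [
            'title'  => $place->title ?? null,
            'rating' => $place->rating ?? null,
            'phone'  => $place->phone ?? null,
        ];
    }
    return $rows;
}

$rows = summarizeLocalResults(json_decode($json));
foreach ($rows as $row) {
    echo $row['title'], ' | ', $row['rating'], ' | ', $row['phone'], PHP_EOL;
}
```

The null-coalescing defaults matter because not every place is guaranteed to carry every field, for example a listing without a phone number.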
    

    You can check the documentation for more details.

    Disclaimer: I work at SerpApi.

    【Comments】:
