【Question title】: Scrapy extracting the wrong IMG SRC
【Posted】: 2017-02-10 12:57:30
【Question description】:

I am trying to use Scrapy to get the URL of the image with ID HERO_PHOTO on a page. The target element has the following HTML:

<img alt="Photo of Gray Line" style="position: relative; left: -50px; top: 0px;" id="HERO_PHOTO" class="flexibleImage" src="https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg" width="352" height="260">

In the Chrome browser console, running

$('#HERO_PHOTO').attr('src')

correctly returns the URL

"https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg"

Problem: in Scrapy, however, each of the following selectors

response.css('#HERO_PHOTO::attr(src)').extract_first()

response.css('#HERO_PHOTO').xpath('@src').extract_first()

response.css('#HERO_PHOTO[src]').extract_first()

gives us

https://static.tacdn.com/img2/x.gif

Using .extract() also returns the same wrong URL.

Why does Scrapy pick up a different SRC value?

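【Comments】:

    Tags: python css python-2.7 scrapy


    【Answer 1】:

    The image link is in the page, but not directly in the <img> tag. Some JavaScript processing is indeed involved. The HTML contains a JavaScript snippet that holds the image link you want (reformatted a bit):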

    ...
    }(window,ta));
    </script>
    <script type="text/javascript">
    var lazyImgs = [{
        "data": "//maps.google.com/maps/api/staticmap?&channel=ta.desktop&zoom=15&size=340x225&client=gme-tripadvisorinc&sensor=falselanguageParam&center=45.503395,-73.573174&maptype=roadmap&&markers=icon:http%3A%2F%2Fc1.tacdn.com%2Fimg2%2Fmaps%2Ficons%2Fpin_v2_CurrentCenter.png|45.503395,-73.57317&signature=FqI7Z1egbpsVrlEE0yjw9HmsMJ8=",
        "scroll": false,
        "tagType": "img",
        "id": "lazyload_1098682971_0",
        "priority": 500,
        "logerror": false
    }, {
        "data": "//ad.atdmt.com/i/img;p=11007200799198;cache=?ord=1475487471489",
        "scroll": false,
        "tagType": "img",
        "id": "lazyload_1098682971_1",
        "priority": 1000,
        "logerror": false
    }, {
        "data": "//ad.doubleclick.net/ad/N4764.TripAdvisor/B7050081;sz=1x1?ord=1475487471489",
        "scroll": false,
        "tagType": "img",
        "id": "lazyload_1098682971_2",
        "priority": 1000,
        "logerror": false
    }, {
        "data": "https://static.tacdn.com/img2/maps/icons/spinner24.gif",
        "scroll": false,
        "tagType": "img",
        "id": "lazyload_1098682971_3",
        "priority": 100,
        "logerror": false
    }, {
        "data": "https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg",
        "scroll": false,
        "tagType": "img",
        "id": "HERO_PHOTO",
        "priority": 100,
        "logerror": false
    }, {
        "data": "https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/98/montreal-night-tour.jpg",
        "scroll": false,
        "tagType": "img",
        "id": "THUMB_PHOTO1",
        "priority": 100,
        "logerror": false
    }, {
        "data": "https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/8f/montreal-night-tour.jpg",
        "scroll": false,
        "tagType": "img",
        "id": "THUMB_PHOTO2",
        "priority": 100,
        "logerror": false
    }, {
        "data": "https://static.tacdn.com/img2/generic/site/no_user_photo-v1.gif",
        "scroll": false,
        "tagType": "img",
        "id": "lazyload_1098682971_4",
        "priority": 100,
        "logerror": false
    }...
    

    One way to parse this is to use js2xml:

    from pprint import pprint

    import js2xml

    # iterate over the text content of every inline `<script>`
    for js in response.xpath('.//script[@type="text/javascript"]/text()').extract():
        try:
            jstree = js2xml.parse(js)

            # look for the assignment to `var lazyImgs`
            for imgs in jstree.xpath('//var[@name="lazyImgs"]/*'):

                # use js2xml.make_dict() -- poor name I know
                # to build a useful Python object
                data = js2xml.make_dict(imgs)

                pprint(data)

                # only the first match in this script is needed
                break

        except Exception:
            # skip script blocks that js2xml cannot parse
            pass
    

    This is what you get:

    [{'data': '//maps.google.com/maps/api/staticmap?&channel=ta.desktop&zoom=15&size=340x225&client=gme-tripadvisorinc&sensor=falselanguageParam&center=45.503395,-73.573174&maptype=roadmap&&markers=icon:http%3A%2F%2Fc1.tacdn.com%2Fimg2%2Fmaps%2Ficons%2Fpin_v2_CurrentCenter.png|45.503395,-73.57317&signature=FqI7Z1egbpsVrlEE0yjw9HmsMJ8=',
      'id': 'lazyload_-1977833463_0',
      'logerror': False,
      'priority': 500,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/maps/icons/spinner24.gif',
      'id': 'lazyload_-1977833463_1',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg',
      'id': 'HERO_PHOTO',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/98/montreal-night-tour.jpg',
      'id': 'THUMB_PHOTO1',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/8f/montreal-night-tour.jpg',
      'id': 'THUMB_PHOTO2',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/generic/site/no_user_photo-v1.gif',
      'id': 'lazyload_-1977833463_2',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/08/38/19/cb/gayle-h.jpg',
      'id': 'lazyload_-1977833463_3',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_01.png',
      'id': 'lazyload_-1977833463_4',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_02.png',
      'id': 'lazyload_-1977833463_5',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_6',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_7',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/b1/32/93/holidays1958.jpg',
      'id': 'lazyload_-1977833463_8',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_04.png',
      'id': 'lazyload_-1977833463_9',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_04.png',
      'id': 'lazyload_-1977833463_10',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_11',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_12',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_13',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/06/4d/bc/f6/disneybus.jpg',
      'id': 'lazyload_-1977833463_14',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_06.png',
      'id': 'lazyload_-1977833463_15',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_06.png',
      'id': 'lazyload_-1977833463_16',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_17',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_18',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_19',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/a7/avatar078.jpg',
      'id': 'lazyload_-1977833463_20',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_01.png',
      'id': 'lazyload_-1977833463_21',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_22',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_23',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/9f/avatar070.jpg',
      'id': 'lazyload_-1977833463_24',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_02.png',
      'id': 'lazyload_-1977833463_25',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_03.png',
      'id': 'lazyload_-1977833463_26',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_27',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_28',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/03/9f/a6/94/facebook-avatar.jpg',
      'id': 'lazyload_-1977833463_29',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_04.png',
      'id': 'lazyload_-1977833463_30',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_05.png',
      'id': 'lazyload_-1977833463_31',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_32',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_33',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_34',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/06/f3/32/86/complsv.jpg',
      'id': 'lazyload_-1977833463_35',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_04.png',
      'id': 'lazyload_-1977833463_36',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_05.png',
      'id': 'lazyload_-1977833463_37',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_38',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_39',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_40',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/05/f2/4d/68/christine-n.jpg',
      'id': 'lazyload_-1977833463_41',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_03.png',
      'id': 'lazyload_-1977833463_42',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_04.png',
      'id': 'lazyload_-1977833463_43',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_44',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_45',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_46',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/80/avatar001.jpg',
      'id': 'lazyload_-1977833463_47',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_03.png',
      'id': 'lazyload_-1977833463_48',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_04.png',
      'id': 'lazyload_-1977833463_49',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_50',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_51',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_52',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/0a/45/46/e2/tracey-g.jpg',
      'id': 'lazyload_-1977833463_53',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_06.png',
      'id': 'lazyload_-1977833463_54',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/rev_06.png',
      'id': 'lazyload_-1977833463_55',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
      'id': 'lazyload_-1977833463_56',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
      'id': 'lazyload_-1977833463_57',
      'logerror': False,
      'priority': 100,
      'scroll': False,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
      'id': 'lazyload_-1977833463_58',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-f/02/6d/40/b2/montreal-amphi-bus-tour.jpg',
      'id': 'lazyload_-1977833463_59',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/39/2d/43/old-montreal-walking.jpg',
      'id': 'lazyload_-1977833463_60',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/06/df/96/c7/excursions-montreal-private.jpg',
      'id': 'lazyload_-1977833463_61',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/02/ad/57/0a/filename-p1010076-jpg.jpg',
      'id': 'lazyload_-1977833463_62',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/04/b5/6a/8d/ali-l.jpg',
      'id': 'lazyload_-1977833463_63',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/87/avatar008.jpg',
      'id': 'lazyload_-1977833463_64',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/06/8a/c5/7d/leonard-d.jpg',
      'id': 'lazyload_-1977833463_65',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/05/6d/32/ca/rpm13111.jpg',
      'id': 'lazyload_-1977833463_66',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/87/avatar008.jpg',
      'id': 'lazyload_-1977833463_67',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/neighborhood/icon_hood_white.png',
      'id': 'lazyload_-1977833463_68',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/oyster/500/08/5b/34/b0/sherbrooke-street-west-shopping--.jpg',
      'id': 'lazyload_-1977833463_69',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/maps/icons/icon_mapControl_expand_idle_30x30.png',
      'id': 'lazyload_-1977833463_70',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/maps/icons/icon_mapControl_expand_hover_30x30.png',
      'id': 'lazyload_-1977833463_71',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/a1/f2/6b/marche-atwater.jpg',
      'id': 'lazyload_-1977833463_72',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/41/78/a3/mcgill-university-lower.jpg',
      'id': 'lazyload_-1977833463_73',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/04/06/16/08/musee-grevin.jpg',
      'id': 'lazyload_-1977833463_74',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/03/4a/9a/85/laurie-raphael.jpg',
      'id': 'lazyload_-1977833463_75',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/09/45/53/16/cafe-humble-lion.jpg',
      'id': 'lazyload_-1977833463_76',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/03/2f/37/03/essence.jpg',
      'id': 'lazyload_-1977833463_77',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/branding/logo_with_tagline.png',
      'id': 'LOGOTAGLINE',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'},
     {'data': 'https://static.tacdn.com/img2/icons/bell.png',
      'id': 'lazyload_-1977833463_78',
      'logerror': False,
      'priority': 100,
      'scroll': True,
      'tagType': 'img'}]
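If installing js2xml is not an option, the same lookup can be sketched with the standard library alone. This is a minimal sketch, not the answer's method: it assumes the `lazyImgs` array literal happens to be valid JSON (double-quoted keys, lowercase `false`), as it is in the snippet above. The helper name `extract_lazy_images` and the shortened `SAMPLE_HTML` are hypothetical, introduced only for illustration.

```python
import json
import re


def extract_lazy_images(html):
    """Return {element id: real image URL} parsed from `var lazyImgs = [...]`.

    Assumes the array literal is valid JSON; returns {} when the
    variable is missing or the literal cannot be decoded.
    """
    match = re.search(r"var\s+lazyImgs\s*=\s*", html)
    if match is None:
        return {}
    try:
        # raw_decode parses one JSON value starting at the given index
        # and ignores whatever follows it (the rest of the script).
        entries, _ = json.JSONDecoder().raw_decode(html, match.end())
    except ValueError:
        return {}
    return {e["id"]: e["data"] for e in entries if "id" in e and "data" in e}


# Shortened stand-in for the page source (hypothetical sample data).
SAMPLE_HTML = """
<img id="HERO_PHOTO" class="flexibleImage" src="https://static.tacdn.com/img2/x.gif">
<script type="text/javascript">
var lazyImgs = [{
    "data": "https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg",
    "scroll": false, "tagType": "img", "id": "HERO_PHOTO",
    "priority": 100, "logerror": false
}];
</script>
"""

images = extract_lazy_images(SAMPLE_HTML)
print(images.get("HERO_PHOTO"))
```

In a spider callback, the helper would presumably be called on the response body, for example `extract_lazy_images(response.text)`. If the site ever emits a literal that is not strict JSON (single quotes, trailing commas), this sketch returns an empty dict and a real JavaScript parser such as js2xml is the safer choice.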
    

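    【Comments】:

      【Answer 2】:

      I believe you are using the wrong CSS selector. Looking at W3Schools, it appears that [src] selects the attribute you want.

      Try this.

      response.css('#HERO_PHOTO[src]').extract_first()

      My next suggestion is to look at what you get without using extract_first(), and check whether the URL is in the return value of response.css('#HERO_PHOTO[src]').

      Edit: I think the problem you are running into is that you are querying the page source, not the rendered HTML. Here is a link describing what I think is happening.

      This question's first answer

      You are querying what the server responded with, not what JavaScript has had a chance to manipulate.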
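The point can be illustrated without Scrapy. The following is a minimal sketch using only Python's standard library: any parser that does not execute JavaScript reads the `src` exactly as the server sent it. The `SrcById` class and the shortened `RAW_HTML` string are hypothetical, modeled on the markup Scrapy returned for this element.

```python
from html.parser import HTMLParser


class SrcById(HTMLParser):
    """Record the src attribute of the first tag whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.src = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.src is None and attributes.get("id") == self.target_id:
            self.src = attributes.get("src")


# Markup as served, before any lazy-loading script has run (sample data).
RAW_HTML = (
    '<img alt="Photo of Gray Line" id="HERO_PHOTO" class="flexibleImage" '
    'src="https://static.tacdn.com/img2/x.gif">'
)

parser = SrcById("HERO_PHOTO")
parser.feed(RAW_HTML)
print(parser.src)  # https://static.tacdn.com/img2/x.gif
```

The browser console shows the real photo URL only because the page's script has already replaced the placeholder by the time the query runs.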

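      【Comments】:

      • response.css('#HERO_PHOTO[src]').extract() gives me [u'<img alt="Photo of PHI Centre" style="position: relative;" id="HERO_PHOTO" class="flexibleImage" src="https://static.tacdn.com/img2/x.gif">']
      • Can you post the HTML you are trying to extract from?
      • This is the page: tripadvisor.com/…. The HTML block I am targeting is in the question.
      • The page source does not seem to contain what you are looking for.
      • I have revised my answer.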