[Question title]: In VBA, how can we get data that shows up with "Inspect Element" but not with "View Page Source"?
[Posted]: 2021-10-20 12:01:12
[Question description]:

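I am trying to scrape a web page that contains multiple tabs. I want the quarterly data that appears when the Quarterly tab is clicked, but my code keeps returning the yearly data that appears when the Yearly tab is clicked. The problem is that both kinds of data live at the same URL, and when I right-click and choose "Inspect Element", their IDs are also the same, so the element IDs for the quarterly data cannot be told apart from those for the yearly data. "Inspect Element" shows both the quarterly and the yearly data, but "View Page Source" shows only the yearly data. Can anyone tell me how to get the quarterly data? Thank you very much.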

    Sub Getquarterdata()

        ' Requires a reference to the Microsoft HTML Object Library.
        Dim html As HTMLDocument
        Dim URL As String

        Set html = New HTMLDocument
        URL = "https://s.cafef.vn/hose/VCB-ngan-hang-thuong-mai-co-phan-ngoai-thuong-viet-nam.chn"

        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", URL, False
            .SetRequestHeader "User-Agent", "Mozilla/5.0"
            .send
            html.body.innerHTML = .responseText
        End With

        ' Using "Inspect Element" on the quarterly data, I counted the "td" elements and came up with these lines, but they print yearly data.
        Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(23).innerText  ' => prints  9,091,070,000 (year 2017 data)
        Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(24).innerText  ' => prints 14,605,578,000 (year 2018 data)
        Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(25).innerText  ' => prints 18,510,898,000 (year 2019 data)
        Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(26).innerText  ' => prints 18,451,311,000 (year 2020 data)
        ' The thing is that the quarterly data shows up only with "Inspect Element", not with "View Page Source".

        Set html = Nothing

    End Sub


Links

  1. URL: https://s.cafef.vn/hose/VCB-ngan-hang-thuong-mai-co-phan-ngoai-thuong-viet-nam.chn

  2. Quarterly data shown when the Quarterly tab is clicked: https://drive.google.com/file/d/1oRtrBZxAoKgdE7gMSBsmkpSX_Ljv1c7L/view?usp=sharing

  3. Yearly data shown when the Yearly tab is clicked: https://drive.google.com/file/d/1-tI5TU7IMOXFIhsfH8tGvsCRoB0O7Xl1/view?usp=sharing

  4. Inspecting the quarterly data: https://drive.google.com/file/d/1Xc5hRPTBIKFu7hQoLh4mStp92CxipNpU/view?usp=sharing

  5. Inspecting the yearly data: https://drive.google.com/file/d/1LedAF3gvAYSIOKOKfZURR9A2rhK0SNgB/view?usp=sharing

[Question discussion]:

    Tags: javascript vba getelementbyid getelementsbytagname getattribute


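    [Solution 1]:

    One of the clues is in the element's name, where you can see it says Ajax (for example, the `divHoSoCongTyAjax` ID used in your code). This is dynamically added content, which is why it appears under "Inspect Element" but not in "View Page Source". If you open the Network tab of the dev tools (F12) and manually select the Quarterly tab, you will see a request to the following endpoint, which serves the data you are after: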

    https://s.cafef.vn/Ajax/Bank/BHoSoCongTy.aspx?symbol=VCB&Type=1&PageIndex=0&PageSize=4&donvi=1


    Option Explicit
    
    Public Sub GetQuarterlyTable()
        'Required references: VBE (Alt+F11) > Tools > References > Microsoft HTML Object Library; Microsoft XML, v6.0 (your version may vary)
    
        Dim hTable As MSHTML.HTMLTable
        Dim xhr As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
       
        Set xhr = New MSXML2.XMLHTTP60
        Set html = New MSHTML.HTMLDocument
    
        With xhr
            .Open "GET", "https://s.cafef.vn/Ajax/Bank/BHoSoCongTy.aspx?symbol=VCB&Type=1&PageIndex=0&PageSize=4&donvi=1", False
            .send
            html.body.innerHTML = .responseText
        End With
    
        Set hTable = html.querySelector(".tab1child_content")
        
        'Do something with table
        Stop
    End Sub
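
    As one possible way to fill in the "Do something with table" step, here is a minimal sketch that requests the same endpoint and copies each table cell onto the active worksheet. The procedure name `WriteQuarterlyTable` and its `symbol` parameter are hypothetical and not part of the answer above. It also assumes that the endpoint accepts other ticker symbols in the same way as `VCB`, and that `.tab1child_content` still matches a table element in the response.

    ```vba
    Option Explicit

    ' Sketch only: WriteQuarterlyTable and its symbol parameter are hypothetical names.
    ' Required references: Microsoft HTML Object Library; Microsoft XML, v6.0.
    Public Sub WriteQuarterlyTable(Optional ByVal symbol As String = "VCB")
        Dim xhr As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
        Dim hTable As MSHTML.HTMLTable
        Dim tr As MSHTML.HTMLTableRow, td As MSHTML.HTMLTableCell
        Dim r As Long, c As Long

        Set xhr = New MSXML2.XMLHTTP60
        Set html = New MSHTML.HTMLDocument

        With xhr
            ' Same endpoint as in the answer; only the symbol is swapped in.
            ' Assumption: other symbols behave like VCB and need no URL encoding.
            .Open "GET", "https://s.cafef.vn/Ajax/Bank/BHoSoCongTy.aspx?symbol=" & symbol & _
                  "&Type=1&PageIndex=0&PageSize=4&donvi=1", False
            .send
            If .Status <> 200 Then
                Debug.Print "Request failed: " & .Status & " " & .statusText
                Exit Sub
            End If
            html.body.innerHTML = .responseText
        End With

        Set hTable = html.querySelector(".tab1child_content")
        If hTable Is Nothing Then
            Debug.Print "Table not found in the response"
            Exit Sub
        End If

        ' Copy the table to the active sheet, one HTML cell per worksheet cell,
        ' starting at A1.
        For Each tr In hTable.Rows
            r = r + 1
            c = 0
            For Each td In tr.Cells
                c = c + 1
                ActiveSheet.Cells(r, c).Value = td.innerText
            Next td
        Next tr
    End Sub
    ```

    Calling `WriteQuarterlyTable` with no argument uses `VCB`, the symbol from the question. The status and `Nothing` checks are there so that a failed request or a changed page layout prints a message instead of raising a runtime error.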
    

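    [Discussion]:

    • Awesome, @QHarr, it works like a charm, almost like magic. I thought I might have to learn something new beyond the scope of VBA, such as Selenium or injecting a JavaScript object, to solve this, but you gave me a great idea for moving forward. Thank you very much for your help.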