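【Posted】: 2020-07-01 02:34:51
【Question】:
I want to use VBA to pull the Recommended Customer Price from URLs defined in an Excel worksheet. The URLs are stored in Cells(i, 11), starting at row 5, and each one points to a specific page on https://ark.intel.com.
For example, to look up the price of the Intel Xeon 8268, I navigate to https://ark.intel.com/content/www/us/en/ark/products/192481/intel-xeon-platinum-8268-processor-35-75m-cache-2-90-ghz.html. Viewing the page source makes it clear that this content is generated by JavaScript, so I used the "Inspect Element" option in Firefox instead.
From there I can drill down and find the value I am looking for inside a tag (shown in a screenshot in the original post).
I have not been able to capture that value and write it to an Excel column, namely column E. Here is one of my attempts: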
Sub ProcessorPricing()
    Dim URL As String, lastRow As Long, i As Long
    Dim XMLHTTP As Object, HTML As Object, objResult As Object, Price As Object
    lastRow = Range("A" & Rows.Count).End(xlUp).Row
    Dim cookie As String
    Dim result_cookie As String
    For i = 5 To lastRow
        If Cells(i, 1) <> "" Then
            ' Column K holds the ark.intel.com product URL
            URL = Cells(i, 11)
            Set XMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
            XMLHTTP.Open "GET", URL, False
            XMLHTTP.setRequestHeader "Content-Type", "text/xml"
            XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
            XMLHTTP.send
            ' Load the raw response into an HTML document for DOM queries
            Set HTML = CreateObject("htmlfile")
            HTML.body.innerHTML = XMLHTTP.responseText
            ' The DOM method is getElementById (singular); getElementsByID does not exist
            Set objResult = HTML.getElementById("bladeInside")
            Set Price = objResult.getElementsByTagName("span")(0)
            ' A span has no Value property, so read its text instead.
            ' The price is injected by JavaScript, so it may still be missing from this response.
            Cells(i, 5) = Price.innerText
            DoEvents
        End If
    Next
End Sub
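Any help would be greatly appreciated.
PS: I also tried the code found at https://www.myonlinetraininghub.com/web-scraping-with-vba, also to no avail.
Update:
Everything works now with your help. Thank you, Bertrand Martel and Stavros Jon.
Here is the entire script: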
Sub UpdateProcessorInfo()
    'requirements: JSON parser must be added to the project - https://github.com/VBA-tools/VBA-JSON - (Download latest release -> File -> Import File -> JsonConverter.bas)
    'requirements: Windows only, include a reference to "Microsoft Scripting Runtime" (Tools -> References -> Check Microsoft Scripting Runtime)
    'requirements: Add a reference to Microsoft WinHTTP Services 5.1 (Tools -> References -> Check Microsoft WinHTTP Services 5.1)
    Dim Connection As WorkbookConnection
    Dim url As String, lastRow As Long, i As Long
    Dim XMLHTTP As Object, html As Object, objResultDiv As Object, link As Object
    Dim cookie As String
    Dim result_cookie As String
    Dim req As New WinHttpRequest
    Dim ids As String
    Dim responseJSON As Object
    ' Refresh every data connection in the workbook
    For Each Connection In ThisWorkbook.Connections
        Connection.Refresh
    Next Connection
    ' Copy the processor names into the comparison sheet as plain values
    Worksheets("Processor_DB_Intel").Range("A2:A1000").Copy
    Worksheets("Processor Comparisons").Range("A5").PasteSpecial Paste:=xlPasteValues
    lastRow = Range("A" & Rows.Count).End(xlUp).Row
    Range("K5:K300").Clear
    ' Pass 1: search Google for each processor's ark.intel.com page and store the first result link in column K
    For i = 5 To lastRow
        If Cells(i, 1) <> "" Then
            url = "https://www.google.com/search?q=" & "site:ark.intel.com " & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
            Set XMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
            XMLHTTP.Open "GET", url, False
            XMLHTTP.setRequestHeader "Content-Type", "text/xml"
            XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
            XMLHTTP.send
            Set html = CreateObject("htmlfile")
            html.body.innerHTML = XMLHTTP.responseText
            Set objResultDiv = html.getElementById("rso")
            Set link = objResultDiv.getElementsByTagName("a")(0)
            Cells(i, 11) = link
            DoEvents
        End If
    Next
    ' Pass 2: request the price as JSON, using the product ID held in column M
    lastRow = Range("A" & Rows.Count).End(xlUp).Row
    For i = 5 To lastRow
        ids = Cells(i, 13)
        url = "https://ark.intel.com/libs/apps/intel/support/ark/recommendedCustomerPrice?ids=" & ids & "&siteName=ark"
        If Cells(i, 1) <> "" Then
            With req
                .Open "GET", url, False
                .send
                Set responseJSON = JsonConverter.ParseJson(.responseText)
            End With
            ' Skip rows whose response contains no price entry
            On Error Resume Next
            'Debug.Print responseJSON(1)("displayPrice")
            Cells(i, 14) = responseJSON(1)("displayPrice")
        End If
    Next
End Sub
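The script reads its `ids` value from column M but does not show how that column is filled. One possibility, offered only as a sketch, is to derive it from the product URL in column K: in the example URL from the question, the number 192481 follows the `/products/` segment. The helper below extracts that segment. The name `ArkProductId` is hypothetical, and it is an assumption, not something the post confirms, that this number is the ID the `recommendedCustomerPrice` endpoint expects.

```vba
' Hypothetical helper, not part of the original script.
' Returns the path segment that follows "/products/" in an ARK URL,
' or an empty string when the URL contains no such segment.
Function ArkProductId(ByVal url As String) As String
    Const MARKER As String = "/products/"
    Dim startPos As Long, endPos As Long

    startPos = InStr(1, url, MARKER, vbTextCompare)
    If startPos = 0 Then Exit Function      ' marker missing: result stays ""

    startPos = startPos + Len(MARKER)       ' first character after the marker
    endPos = InStr(startPos, url, "/")      ' next slash ends the ID segment
    If endPos = 0 Then endPos = Len(url) + 1

    ArkProductId = Mid$(url, startPos, endPos - startPos)
End Function
```

Placed in a standard module, it could be called from the loop as `ids = ArkProductId(Cells(i, 11))`, or from the worksheet as `=ArkProductId(K5)`.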
【Comments】:
Tags: javascript excel vba web-scraping screen-scraping