[Posted]: 2020-08-20 14:45:11
[Question]:
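I want to import some data from the website https://www.amfiindia.com/nav-history-download. On this page there is a link, "Download Complete NAV Report in Text Format", that gives me the data I need. The link is not static, however, so I cannot hard-code it in VBA to download the data. How can I download data from a hyperlink on a web page using Excel?
My approach is to read the hyperlink into a variable first and then use that variable to fetch the data: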
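- First, get the hyperlink with the getElementsByTagName function, as shown below.
- Then use it as the URL to fetch the data.
- However, I get a type mismatch error when I assign my hyperlink to website (which is a String).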
I don't know what type href is. I tried inspecting it in the Watch window, which shows Variant, but I still get the error.
Please help me solve this.
Sub webscraping()
    Dim request As Object
    Dim response As String
    Dim html As New HTMLDocument
    Dim website As String
    Dim price As Variant
    Dim cellAddress As String
    Dim rowNumber As Long
    Dim ie As InternetExplorer
    Dim ht As HTMLDocument
    Dim hr As MSHTML.IHTMLElement
    'Dim Hra As MSHTML.IHTMLElement

    Set ie = New InternetExplorer
    ie.Visible = True
    ie.Navigate ("https://www.amfiindia.com/nav-history-download")
    Do Until ie.ReadyState >= 4
        DoEvents
    Loop
    Set ht = ie.Document
    'MsgBox ht.getElementById("navhistorydownload")
    Set hr = ht.getElementsByTagName("a")(18).href

    ' Website to go to.
    website = StrConv(hr, vbUnicode)
    ' Create the object that will make the webpage request.
    Set request = CreateObject("MSXML2.XMLHTTP")
    ' Where to go and how to go there - probably don't need to change this.
    request.Open "GET", website, False
    ' Get fresh data.
    request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
    ' Send the request for the webpage.
    request.send
    ' Get the webpage response data into a variable.
    response = StrConv(request.responseBody, vbUnicode)
    ' Put the webpage into an html object to make data references easier.
    html.body.innerHTML = response
    ' Get the price from the specified element on the page.
    'price = html.getElementsByTagName("a").Item(0).innerText
    cellAddress = Range("A" & Rows.Count).End(xlUp).Address
    rowNumber = Range(cellAddress).Row
    ThisWorkbook.Sheets(1).Cells(rowNumber + 1, 1) = response
    ' MsgBox rowNumber
    ' MsgBox cellAddress
    ' Output the price into a message box.
    'MsgBox price
End Sub
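For illustration, the two-step approach described in the question (read the link, then request it) might be sketched as follows. It relies on the fact that an anchor's href property is a String, so it is assigned with a plain `=` into a String variable, with neither `Set` nor `StrConv`. The procedure name `DownloadNavReportSketch`, the search phrase used to find the link, and the assumption that the response is plain line-based text are all hypothetical and not confirmed by the post. The sketch assumes the same project references as the question's code (Microsoft HTML Object Library and Microsoft Internet Controls).

```vba
' Sketch only. Assumes references to "Microsoft HTML Object Library" and
' "Microsoft Internet Controls", as in the question's code.
Sub DownloadNavReportSketch()
    Dim ie As InternetExplorer
    Dim doc As HTMLDocument
    Dim link As Object
    Dim linkUrl As String      ' href is a String, so it is assigned without Set
    Dim request As Object
    Dim lines() As String
    Dim i As Long

    Set ie = New InternetExplorer
    ie.Visible = False
    ie.Navigate "https://www.amfiindia.com/nav-history-download"
    Do While ie.Busy Or ie.ReadyState <> 4      ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop
    Set doc = ie.Document

    ' Find the anchor by its visible text instead of a fixed index such as (18).
    ' The search phrase is an assumption about the page's wording.
    For Each link In doc.getElementsByTagName("a")
        If InStr(1, link.innerText, "Complete NAV Report", vbTextCompare) > 0 Then
            linkUrl = link.href
            Exit For
        End If
    Next link
    ie.Quit

    If Len(linkUrl) = 0 Then
        MsgBox "Link not found; check the search text."
        Exit Sub
    End If

    ' Request the linked resource directly.
    Set request = CreateObject("MSXML2.XMLHTTP")
    request.Open "GET", linkUrl, False
    request.send

    ' Assumes a plain-text response: write one line per row in column A.
    lines = Split(request.responseText, vbLf)
    With ThisWorkbook.Sheets(1)
        For i = LBound(lines) To UBound(lines)
            .Cells(i + 1, 1).Value = lines(i)
        Next i
    End With
End Sub
```

Matching on the link text rather than on position 18 is a design choice meant to survive changes in the page's link order. Writing one line per row also avoids placing the whole response in a single cell, which Excel limits to 32,767 characters.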
[Discussion]:
Tags: html excel vba web-scraping import