【Question title】: Use Excel VBA to Extract Data From a Webpage
【Posted】: 2021-10-18 16:35:42
【Question】:

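I am trying to use Excel VBA to extract some data from a webpage (https://www.churchofjesuschrist.org/maps/meetinghouses/lang=eng&q=1148+W+100+N). The code I'm using opens Internet Explorer, navigates to the website, and extracts the topmost result. But I can't figure out how to extract the rest of the results (i.e. ward, language, contact name, contact #). Thoughts?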

Sub MeethinghouseLocator()

Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.navigate Sheets("Sheet1").Range("A1").Value
IE.Visible = True
While IE.Busy
DoEvents
Wend

Application.Wait (Now + TimeValue("0:00:01"))

IE.document.querySelector("button.search-input__execute.button--primary").Click

  Dim Doc As HTMLDocument
  Set Doc = IE.document
        
Application.Wait (Now + TimeValue("0:00:01"))
        
'WardName
    Dim aaaaFONT As String
    aaaaFONT = Trim(Doc.getElementsByClassName("location-header__name ng-binding")(0).innerText)
    Sheets("Sheet1").Range("D6").Value = aaaaFONT
    
Application.Wait (Now + TimeValue("0:00:01"))
    
'Language
    Dim aaabFONT As String
    aaabFONT = Trim(Doc.getElementsByClassName("location-header__language ng-binding ng-scope")(0).innerText)
    Sheets("Sheet1").Range("E6").Value = aaabFONT

'Click 1st Link
    IE.document.getElementsByClassName("location-header__name ng-binding")(0).Click

Application.Wait (Now + TimeValue("0:00:01"))

'Contact Name
    Dim aaacFONT As String
    aaacFONT = Trim(Doc.getElementsByClassName("maps-card__group maps-card__group--inline ng-scope")(2).innerText)
    Sheets("Sheet1").Range("H6").Value = aaacFONT

'Contact Name Function
    Range("F6").Select
    ActiveCell.FormulaR1C1 = _
        "=LEFT(RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3),FIND(RIGHT(RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3),LEN(RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3))-FIND(CHAR(10),RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3))),RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3))-1)"

'Contact Phone Number
    Dim aaadFONT As String
    aaadFONT = Trim(Doc.getElementsByClassName("phone ng-binding")(0).innerText)
    Sheets("Sheet1").Range("G6").Value = aaadFONT

    
  IE.Quit

End Sub

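【Comments】:

  • I just tried it and everything seems to work, but because of the location suggestions I had to click a few more things (I guess your initial query parameters weren't precise enough).
  • Thanks, Raymond! About adding the While loop: very helpful. I'll be sure to incorporate it. I think the problem I'm facing (hopefully I can explain it) is that the code I posted navigates to the webpage I want and extracts the information for, e.g., North Park 3rd Ward (the top result). But I'm not sure how to extract the information for the second result (Edgemont 22nd Ward).
  • So you want to go through the wards but can't, because the results disappear after you click the first ward?
  • Sorry, yes; I think that's a better way to put it. Yes.
  • I've updated my code. It's late and I'm going to bed; please give it a try, and if you run into any problems, try to work through them and comment here if needed.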

Tags: excel vba data-extraction


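【Solution 1】:

Most of your code actually works, so I'm not sure exactly what problem you're facing. However, you don't account for the page loading after each link is clicked, so I added While loops that check IE's Busy and readyState properties before continuing.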
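The repeated wait loop can be pulled into one helper. The following is only a sketch: WaitForLoad and its timeoutSeconds parameter are hypothetical names that do not appear in the original answer, and the 15-second default is an arbitrary assumption. It polls the same Busy and readyState properties, but gives up after a deadline instead of looping forever if the page never finishes loading.

```vba
' Hypothetical helper (not part of the original answer).
' Returns True once the browser is idle and fully loaded,
' or False if timeoutSeconds elapse first.
Function WaitForLoad(ByVal browser As Object, _
                     Optional ByVal timeoutSeconds As Long = 15) As Boolean
    Const READYSTATE_COMPLETE As Long = 4   ' same value the answer compares against

    Dim deadline As Date
    deadline = Now + timeoutSeconds / 86400#   ' 86400 seconds in one day

    Do While browser.Busy Or browser.readyState <> READYSTATE_COMPLETE
        If Now > deadline Then Exit Function   ' timed out: result stays False
        DoEvents                               ' keep Excel responsive while polling
    Loop

    WaitForLoad = True
End Function
```

A call such as `If Not WaitForLoad(IE) Then Exit Sub` could then stand in for each While ... Wend block. Note that readyState only reports that the document itself has loaded. The ng-binding class names suggest the results are rendered by script afterwards, which is presumably why the answer still keeps a one-second Application.Wait before reading the results.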

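Edit: The code now loops through all the wards listed in the results. The idea is to keep the first IE instance on the results page and pass each ward's URL and output row to the sub ExtractWard, which opens another IE instance, navigates to the given URL, and extracts the ward details.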

Sub MeethinghouseLocator()
   
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.navigate Sheets("Sheet1").Range("A1").Value
    IE.Visible = True
    While IE.Busy Or IE.readyState <> 4
        DoEvents
    Wend
            
    IE.document.querySelector("button.search-input__execute.button--primary").Click
    
    While IE.Busy Or IE.readyState <> 4
        DoEvents
    Wend
    ' HTMLDocument requires a reference to "Microsoft HTML Object Library"
    Dim Doc As HTMLDocument
    Set Doc = IE.document
        
    Application.Wait (Now + TimeValue("0:00:01"))
        
    Dim wardContent As Object
    Set wardContent = Doc.getElementsByClassName("maps-card__content")(2)
    
    Dim wardCollection As Object
    Set wardCollection = wardContent.getElementsByClassName("location-header")
    
    Dim rowNum As Long
    rowNum = 6
        
    Dim i As Long
    For i = 0 To wardCollection.Length - 1
        With wardCollection(i)
            'WardName
            Dim aaaaFONT As String
            aaaaFONT = Trim(.getElementsByClassName("location-header__name ng-binding")(0).innerText)
            Sheets("Sheet1").Cells(rowNum, "D").Value = aaaaFONT
                
            'Language
            Dim aaabFONT As String
            aaabFONT = Trim(.getElementsByClassName("location-header__language ng-binding ng-scope")(0).innerText)
            Sheets("Sheet1").Cells(rowNum, "E").Value = aaabFONT
        
            Dim wardURL As String
            wardURL = .getElementsByClassName("location-header__name ng-binding")(0).href
            
            ExtractWard wardURL, rowNum
        End With
        
        rowNum = rowNum + 1
    Next i
            
    Set Doc = Nothing
    IE.Quit
    Set IE = Nothing
End Sub

Private Sub ExtractWard(argURL As String, argRow As Long)
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.navigate argURL
    IE.Visible = True
    While IE.Busy Or IE.readyState <> 4
        DoEvents
    Wend
            
    Dim Doc As HTMLDocument
    Set Doc = IE.document
            
    'Contact Name
    Dim aaacFONT As String
    aaacFONT = Trim(Doc.getElementsByClassName("maps-card__group maps-card__group--inline ng-scope")(2).innerText)
    Sheets("Sheet1").Cells(argRow, "H").Value = aaacFONT
    
    'Contact Name Function
    Sheets("Sheet1").Cells(argRow, "F").FormulaR1C1 = _
        "=LEFT(RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3),FIND(RIGHT(RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3),LEN(RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3))-FIND(CHAR(10),RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3))),RIGHT(RC[2],LEN(RC[2])-FIND(CHAR(10),RC[2])-3))-1)"
    
    'Contact Phone Number
    Dim aaadFONT As String
    aaadFONT = Trim(Doc.getElementsByClassName("phone ng-binding")(0).innerText)
    Sheets("Sheet1").Cells(argRow, "G").Value = aaadFONT
        
    Set Doc = Nothing
    IE.Quit
    Set IE = Nothing
End Sub
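The nested worksheet formula written to column F can also be expressed in VBA, which may be easier to read and debug. This is a sketch under assumptions: ContactNameFromBlock and skipChars are hypothetical names, and the function simply mirrors what the formula does (take the text after the first line feed, skip three more characters, and keep everything up to the next line feed). Why three characters are skipped is not stated in the thread, so that offset is kept as a parameter. Unlike the formula, this version drops the trailing line feed, trims spaces, and returns the remaining text instead of an error when no second line feed exists.

```vba
' Hypothetical helper (not part of the original answer).
' Mirrors the column F formula: returns the text between the first
' line feed (plus skipChars extra characters) and the next line feed.
Function ContactNameFromBlock(ByVal rawText As String, _
                              Optional ByVal skipChars As Long = 3) As String
    Dim firstLf As Long
    firstLf = InStr(1, rawText, vbLf)
    If firstLf = 0 Then Exit Function          ' no line feed: return ""

    ' Text after the first line feed, skipping skipChars characters
    Dim rest As String
    rest = Mid$(rawText, firstLf + 1 + skipChars)

    ' Cut at the next line feed, if there is one
    Dim nextLf As Long
    nextLf = InStr(1, rest, vbLf)
    If nextLf > 0 Then rest = Left$(rest, nextLf - 1)

    ContactNameFromBlock = Trim$(rest)
End Function
```

Inside ExtractWard, the formula assignment could then be replaced with something like `Sheets("Sheet1").Cells(argRow, "F").Value = ContactNameFromBlock(aaacFONT)`, so the name is stored as a plain value and no longer depends on column H.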
