【问题标题】:Searching collections搜索集合
【发布时间】:2016-02-12 13:56:00
【问题描述】:

我正在处理一个相当大的数据集(>100,000 行)并尝试比较两个列表以找出新列表中的哪些项目尚未在主列表中。换句话说,我想找到新的独特物品。

我有一些使用 vlookup 的 VBA 代码和有效的数组,但是当数组变得太大(~70,000)时会爆炸。所以我转向收藏。但是,我在使用 vlookup 或 match 搜索集合时遇到了困难。

Sub find_uniqueIDs()

Dim a As Long
Dim n As Long
Dim m As Variant
Dim oldnum As Long
Dim oldIDs As Variant
Dim oldcoll As New Collection
Dim newnum As Long
Dim newIDs As Variant
Dim newcoll As New Collection

oldnum = 75000
oldIDs = Range("A1", Range("A" & oldnum))
newnum = 45000 + 3
newIDs = Range("G3", Range("G" & newnum))

'Using arrays to search, but bombs out when oldnum or newnum are ~70000
For n = 1 To newnum - 3
    m = Application.VLookup(newIDs(n, 1), oldIDs, 1, False)
    If IsError(m) Then Range("E100000").End(xlUp).Offset(1, 0) = newIDs(n, 1)
Next n

'Using collections to search
For n = 1 To oldnum
On Error Resume Next
    oldcoll.Add oldIDs(n, 1)
On Error GoTo 0
Next n

For m = 1 To newnum
On Error Resume Next
    newcoll.Add newIDs(m, 1)
On Error GoTo 0
Next m

'This bit of code doesn't work
For a = 1 To newcoll.Count
If Application.VLookup(newcoll(a), oldcoll, 1, False) = "#N/A" Then _
    Range("E100000").End(xlUp).Offset(1, 0) = newcoll(a)
Next a

End Sub

任何想法如何使用集合确定特定项目是否在主列表中?

【问题讨论】:

标签: vba excel


【解决方案1】:

VLookup 是一个worksheet function,不是常规的 VBA 函数,因此是 it's for searching in Ranges, not Collections.

语法: VLOOKUP (lookup_value, table_array, col_index_num, [range_lookup])

[...]

table_array(必需):VLOOKUP 将在其中搜索lookup_value 和返回值的单元格范围

为了在其他 VBA 数据结构(如数组、集合等)中进行搜索,您必须找出其他方法并可能手动实现它。

【讨论】:

    【解决方案2】:

    我会这样做:

    Sub test()
    
    Dim newRow As Long, oldRow As Long
    Dim x As Long, Dim y As Long
    Dim checker As Boolean
    
    With ActiveSheet
    
    newRow = .Cells(.Rows.Count,7).End(xlUp).Row
    oldRow = .Cells(.Rows.Count,1).End(xlUp).Row
    checker = True
    
    for y = 1 To oldRow
    
        for x = 1 To newRow
    
        If .Cells(y,1).Value = .Cells(x,7).Value Then
    
        checker = False
    
        Exit For
    
        End If
    
        Next
    
    If checker Then
    
    Range("E10000").End(xlUp).Offset(1,0).Value = .Cells(y,1).Value
    
    End If
    
    checker = True
    
    Next
    
    End With
    
    End Sub
    

    【讨论】:

      【解决方案3】:

      虽然@Jeeped 建议使用 Scripting.Dictionary 对象可能是最好的,但您也可以尝试使用应用于您的数组的 Filter() function

      【讨论】:

        【解决方案4】:

        这里有一个简短的 sub 演示了一些脚本字典方法。

        Sub list_New_Unique()
            Dim dMASTER As Object, dNEW As Object, k As Variant
            Dim v As Long, vVALs() As Variant, vNEWs() As Variant
        
            Debug.Print "Start: " & Timer
        
            Set dMASTER = CreateObject("Scripting.Dictionary")
            Set dNEW = CreateObject("Scripting.Dictionary")
            dMASTER.comparemode = vbTextCompare
            dNEW.comparemode = vbTextCompare
        
            With Worksheets("Sheet7")
                vVALs = .Range("A2:A100000").Value2
                vNEWs = .Range("C2:C100000").Value2
            End With
        
            'populate the dMASTER values
            For v = LBound(vVALs, 1) To UBound(vVALs, 1)
                dMASTER.Add Key:=vVALs(v, 1), Item:=vVALs(v, 1)
            Next v
        
            'only populate dNEW with items not found in dMASTER
            For v = LBound(vNEWs, 1) To UBound(vNEWs, 1)
                If Not dMASTER.exists(vNEWs(v, 1)) Then
                    If Not dNEW.exists(vNEWs(v, 1)) Then _
                        dNEW.Add Key:=vNEWs(v, 1), Item:=vNEWs(v, 1)
                End If
            Next v
        
            Debug.Print dNEW.Count
        
            For Each k In dNEW.keys
                'Debug.Print k
            Next k
        
            Debug.Print "End: " & Timer
        
            dNEW.RemoveAll: Set dNEW = Nothing
            dMASTER.RemoveAll: Set dMASTER = Nothing
        End Sub
        

        A2:A100000 中有 99,999 个唯一条目,C2:C89747 中有 89747 个随机条目,因此在 9.87 秒内找到了 A2:A100000 中未找到的 70,087 个唯一新条目。

        【讨论】:

          猜你喜欢
          • 2014-12-24
          • 2014-07-31
          • 2020-02-15
          • 2019-03-10
          • 1970-01-01
          • 1970-01-01
          • 2021-04-17
          • 2021-11-03
          • 1970-01-01
          相关资源
          最近更新 更多