【问题标题】:Remove duplicates from array using VBA使用 VBA 从数组中删除重复项
【发布时间】:2012-08-05 21:24:52
【问题描述】:

假设我在 Excel 2010 中有一个数据块,100 行乘 3 列。

C 列包含一些重复项,假设它开始为

1、1、1、2、3、4、5、……、97、98

使用 VBA,我想删除重复的行,所以我剩下 98 行和 3 列。

1、2、3、……、97、98

我知道 Excel 2010 中有一个按钮可以执行此操作,但它随后会干扰我的其余代码并给出不正确的结果。

另外,我想在数组中做,然后将结果粘贴到工作表上,而不是像Application.Worksheetfunction.countif(.....这样的方法

比如:

Dim myarray() as Variant
myarray=cells(1,1).Currentregion.value

Dim a as Long

For a=1 to Ubound(myarray,1)

    'something here to 

Next a

【问题讨论】:

  • 由于 OP 想要一个接近 RemoveDuplicates 保留相关数组行的 VBA 解决方案,我发布了一个迟到的回复 ►“从数组中删除重复项(加上相关的行项)”

标签: excel vba


【解决方案1】:

I answered a similar question。这是我使用的代码:

Dim dict As Object
Dim rowCount As Long
Dim strVal As String

Set dict = CreateObject("Scripting.Dictionary")

rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count

'you can change the loop condition to iterate through the array rows instead
Do While rowCount > 1
  strVal = Sheet1.Cells(rowCount, 1).Value2

  If dict.exists(strVal) Then
    Sheet1.Rows(rowCount).EntireRow.Delete
  Else
    'if doing this with an array, then add code in the Else block
    ' to assign values from this row to the array of unique values
    dict.Add strVal, 0
  End If

  rowCount = rowCount - 1
Loop

Set dict = Nothing

如果要使用数组,则使用相同的条件 (if/else) 语句遍历元素。如果字典中不存在该项,则可以将其添加到字典中并将行值添加到另一个数组中。

老实说,我认为最有效的方法是调整从宏记录器获得的代码。您可以在一行中执行上述功能:

    Sheet1.UsedRange.RemoveDuplicates Columns:=3, Header:=xlYes

【讨论】:

  • 这会从工作表中删除实际行,而问题是删除 VBA 中的重复项。此外,任何行删除都应始终自下而上进行,以避免跳过行
  • @brettdj 这两段代码都可以删除重复项。提问者想在 VBA 中执行此操作,最好使用数组(在这种情况下,他/她可以轻松地修改循环以遍历数组的行而不是范围,然后仅将唯一元素添加到单独的数组中)。如果您查看代码,您会发现行删除确实是自下而上(rowCount 递减)。 :)
  • 那么问题的标题是VBA,从数组中删除重复项。您提出了一个很好的观点,即您的代码确实按照您使用 .Count 的方式自下而上,尽管在这种情况下,询问者可能需要指定第一次出现应该保持在顶部还是底部。
  • 我稍微编辑了我的答案以反映数组请求。在问题内部,它变得模棱两可,用户表示希望执行单击“删除重复项”按钮所做的操作,但在 VBA 中(使用数组)。内置的删除重复项也可以自下而上工作。如果用户回来并想要一个更彻底的答案,或者发现这个不能帮助他们弄清楚,那么我可以提供更多。也可以随意编辑我的或添加您自己的。感谢您的批评! :)
  • 这个 Scripting.Dictionary 方法是否也适用于 Mac?
【解决方案2】:
Function eliminateDuplicate(poArr As Variant) As Variant
    Dim poArrNoDup()

    dupArrIndex = -1
    For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
    Next i

    eliminateDuplicate = poArrNoDup
End Function

【讨论】:

  • 如果代码有问题,请不要像this revision那样编辑其他SO用户的答案,请发表评论并让他们知道。
  • 你需要为 i 的每个循环设置 dupBool 为 false
【解决方案3】:

对@RBILLC 和@radoslav006 答案的改进,此版本在删除重复项的数组中搜索现有值,因此它搜索较少的值来查找重复项。

Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
    Dim duplicateFound As Boolean
    Dim arrayIndex As Integer, i As Integer, j As Integer
    Dim deduplicatedArray() As Variant
    
    arrayIndex = -1
    deduplicatedArray = Array(1)

    For i = LBound(sourceArray) To UBound(sourceArray)
        duplicateFound = False

        For j = LBound(deduplicatedArray) To UBound(deduplicatedArray)
            If sourceArray(i) = deduplicatedArray(j) Then
                duplicateFound = True
                Exit For
            End If
        Next j

        If duplicateFound = False Then
            arrayIndex = arrayIndex + 1
            ReDim Preserve deduplicatedArray(arrayIndex)
            deduplicatedArray(arrayIndex) = sourceArray(i)
        End If
    Next i

    RemoveDuplicatesFromArray = deduplicatedArray
End Function

【讨论】:

    【解决方案4】:

    从一维数组中删除重复项的简单函数

    Private Function DeDupeArray(vArray As Variant) As Variant
      Dim oDict As Object, i As Long
      Set oDict = CreateObject("Scripting.Dictionary")
      For i = LBound(vArray) To UBound(vArray)
        oDict(vArray(i)) = True
      Next
      DeDupeArray = oDict.keys()
    End Function
    

    编辑:

    使用stdVBA(主要由我自己维护的一个库),您可以使用:

    uniqueValues = stdEnumerator.CreateFromArray(myArray).Unique().AsArray()
    

    【讨论】:

    • CreateObject("Scripting.Dictionary") 失败。 MS Word v.16.32 (Office 365)、VBA v. 7.1。
    • @AndriyMakukha Mac?
    • 这很酷。我不知道您可以简单地执行oDict(key) = value 将项目添加到字典中。 :o
    【解决方案5】:

    这是处理数组的另一种方法:

    Sub tester()
    
        Dim arr, arrout
        
        arr = Range("A1").CurrentRegion.Value   'collect the input array
         
        arrout = UniqueRows(arr)                'get only unique rows
        
        Range("H1").Resize(UBound(arrout, 1), UBound(arrout, 2)).Value = arrout
        
    End Sub
    
    
    
    
    Function UniqueRows(arrIn As Variant) As Variant
        Dim keys, rw As Long, col As Long, k, sep, arrout
        Dim dict As Object, lbr As Long, lbc As Long, ubr As Long, ubc As Long, rwOut As Long
        Set dict = CreateObject("scripting.dictionary")
        'input array bounds
        lbr = LBound(arrIn, 1)
        ubr = UBound(arrIn, 1)
        lbc = LBound(arrIn, 2)
        ubc = UBound(arrIn, 2)
        ReDim keys(lbr To ubr)
        'First pass:collect all the row "keys" in an array 
        '    and unique keys in a dictionary
        For rw = lbr To ubr
            k = "": sep = ""
            For col = lbc To ubc
                k = k & sep & arrIn(rw, col)
                sep = Chr(0)
            Next col
            keys(rw) = k     'collect key for this row
            dict(k) = True   'just collecting unique keys
        Next rw
    
        'Resize output array to # of unique rows
        ReDim arrout(lbr To dict.Count + (lbr - 1), lbc To ubc)
        rwOut = lbr
        'Second pass: copy each unique row to the output array
        For rw = lbr To ubr
            If dict(keys(rw)) Then      'not yet output?
                For col = lbc To ubc    'copying this row over to output...
                    arrout(rwOut, col) = arrIn(rw, col)
                Next col
                rwOut = rwOut + 1      'increment output "row"
                dict(keys(rw)) = False 'flag this key as copied
            End If
        Next rw
        UniqueRows = arrout
    End Function
    

    【讨论】:

      【解决方案6】:

      @RBILLC 的回答可以通过在内部循环中添加 Exit For 来轻松改进:

      Function eliminateDuplicate(poArr As Variant) As Variant
          Dim poArrNoDup()
      
          dupArrIndex = -1
          For i = LBound(poArr) To UBound(poArr)
              dupBool = False
      
              For j = LBound(poArr) To i
                  If poArr(i) = poArr(j) And Not i = j Then
                      dupBool = True
                      Exit For
                  End If
              Next j
      
              If dupBool = False Then
                  dupArrIndex = dupArrIndex + 1
                  ReDim Preserve poArrNoDup(dupArrIndex)
                  poArrNoDup(dupArrIndex) = poArr(i)
              End If
          Next i
      
          eliminateDuplicate = poArrNoDup
      End Function
      

      【讨论】:

        【解决方案7】:

        我认为这确实是使用excel的本机函数的情况,至少对于初始数组获取而言,我认为没有更简单的方法可以做到这一点。这个 sub 将输出从第 5 列开始的唯一值。我假设目标范围是空的,所以如果不是,请更改 r 和 c。

        Sub testUniques()
            
            Dim arr, r As Long, c As Long, h As Long, w As Long
            Dim this As Worksheet: Set this = ActiveSheet
            arr = Application.Unique(this.Cells(1, 1).CurrentRegion)
            
            r = 1
            c = 5
            h = UBound(arr, 1) - 1
            w = UBound(arr, 2) - 1
            
            this.Range(this.Cells(r, c), this.Cells(r + h, c + w)) = arr
            
        End Sub
        

        【讨论】:

          【解决方案8】:

          我知道这已经过时了,但这是我用来将重复值复制到另一个范围的东西,以便我可以快速查看它们,从而为我从各种电子表格中建立的数据库建立数据完整性。要使该过程删除重复项,只需将dupRng 行替换为Cell.Delete Shift:=xlToLeft 或类似的东西即可。

          我没有亲自测试过,但它应该可以工作。

          Sub PartCompare()
              Dim partRng As Range, partArr() As Variant, i As Integer
              Dim Cell As Range, lrow As Integer
          
              lrow = ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
              i = 0
          
              Set partRng = ThisWorkbook.Worksheets("Sheet1").Range(Cells(1, 1), Cells(lrow, 1))
          
              For Each Cell In partRng.Cells
                  ReDim Preserve partArr(i)
                  partArr(i) = Cell.Value
                  i = i + 1
              Next
          
              Dim dupRng As Range, j As Integer, x As Integer, c As Integer
          
              Set dupRng = ThisWorkbook.Worksheets("Sheet1").Range("D1")
          
              x = 0
              c = 1
              For Each Cell In partRng.Cells
                  For j = c To UBound(partArr)
                      If partArr(j) = Cell.Value Then
                          dupRng.Offset(x, 0).Value = Cell.Value
                          dupRng.Offset(x, 1).Value = Cell.Address()
                          x = x + 1
                          Exit For
                      End If
                  Next j
                  c = c + 1
              Next Cell
          End Sub
          

          【讨论】:

            【解决方案9】:

            从数组中删除重复项(以及相关的行项)

            由于 OP 想要一个接近 RemoveDuplicates 的 VBA 解决方案,我演示了一种使用 ►dictionary 的数组方法来获取唯一项目本身 (dict.keys),而是相关的 第一次出现的行索引 (dict.items)。

            这些用于通过过程 LeaveUniques 保留整行数据,这得益于 ►Application.Index() 函数的高级可能性 - c.f. Some peculiarities of the the Application.Index function

            调用示例

            Sub ExampleCall()
            '[0]define range and assign data to 1-based 2-dim datafield
                With Sheet1                   ' << reference to your project's sheet Code(Name)
                    Dim lastRow: lastRow = .Cells(.Rows.Count, "C").End(xlUp).Row
                    Dim rng:  Set rng = .Range("C2:E" & lastRow)
                End With
                Dim data: data = rng        ' assign data to 2-dim datafield
            '[1]get uniques (column 1) and remove duplicate rows
                LeaveUniques data           ' << call procedure LeaveUniques (c.f. RemoveDuplicates)
            '[2]overwrite original range
                rng.Clear
                rng.Resize(UBound(data), UBound(data, 2)) = data
            End Sub
            

            程序LeaveUniques

            Sub LeaveUniques(ByRef data As Variant, Optional ByVal colNum As Long = 1)
            'Purpose: procedure removes duplicates of given column number in entire array
                data = Application.Index(data, uniqueRowIndices(data, colNum), nColIndices(UBound(data, 2)))
            End Sub
            

            帮助功能到LeaveUniques

            Function uniqueRowIndices(data, Optional ByVal colNum As Long = 1)
            'Purpose: return data index numbers referring to uniques
            'a) set late bound dictionary to memory
                Dim dict As Object
                Set dict = CreateObject("Scripting.Dictionary")
            'b) slice e.g. first data column (colNum = 1)
                Dim colData
                colData = Application.Index(data, 0, colNum)
            'c) fill dictionary with uniques referring to first occurencies
                Dim i As Long
                For i = 1 To UBound(colData)
                    If Not dict.exists(dict(colData(i, 1))) Then dict(colData(i, 1)) = i
                Next
            'd) return 2-dim array of valid unique 1-based index numbers
                uniqueRowIndices = Application.Transpose(dict.items)
            End Function
            
            Function nColIndices(ByVal n As Long)
            'Purpose: return "flat" array of n column indices, e.g. for n = 3 ~> Array(1, 2, 3)
                nColIndices = Application.Transpose(Evaluate("row(1:" & n & ")"))
            End Function
            
            

            【讨论】:

              猜你喜欢
              • 2016-02-21
              • 1970-01-01
              • 2020-02-23
              • 1970-01-01
              • 1970-01-01
              • 1970-01-01
              • 2015-04-13
              • 2017-05-03
              • 2012-12-06
              相关资源
              最近更新 更多