【问题标题】:Replace words in a cell替换单元格中的单词
【发布时间】:2014-08-21 19:43:49
【问题描述】:

我正在尝试创建一个简单的翻译脚本,该脚本将从一个范围(列)中查看每个单元格中的一个句子,并根据简单的两列(lookat/replace)翻译记忆库逐字翻译我'已创建。

如果单元格包含

"This app is cool"

翻译记忆库是

This | 1
app  | 2
cool | 3

结果应该是:

"1 2 is 3"

但是,使用.Replace 方法,下面的字符串:

"This apple from the cooler"

会回来

"1 2le from the 3er"

我使用数组和拆分方法将句子分解为单词,然后从我的翻译列表中查找每个单词以查找 xlwhole 匹配项。我有大约 10,000 行句子,将每个句子分解成单词大约需要 100,000 个单词,每个单词要查看大约 1,000 个翻译单词列表。它的话..但有点慢。

还有其他方法,也许是更好的方法?

【问题讨论】:

    标签: excel vba


    【解决方案1】:

    这是另一个使用替换方法和单词边界的正则表达式解决方案(正则表达式模式中的“\b”表示单词边界)。它假定您的来源在 A 列,结果将进入 B 列。

    翻译表在宏中是硬编码的,但您可以轻松地将其更改为从工作簿中的表中获取。

    Option Explicit
    Sub Translate()
        Dim V As Variant
        Dim RE As Object
        Dim arrTranslate As Variant
        Dim I As Long, J As Long
        Dim S As String
    
    V = Range("a1", Cells(Rows.Count, "A").End(xlUp))
    ReDim Preserve V(1 To UBound(V, 1), 1 To 2)
    
    arrTranslate = VBA.Array(Array("This", 1), Array("app", 2), Array("cool", 3))
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .ignorecase = True
    End With
    
    For I = 1 To UBound(V, 1)
        S = V(I, 1)
        For J = 0 To UBound(arrTranslate)
            RE.Pattern = "\b" & arrTranslate(J)(0) & "\b"
            S = RE.Replace(S, arrTranslate(J)(1))
        Next J
        V(I, 2) = S
    Next I
    
    Range(Cells(1, 1), Cells(UBound(V, 1), UBound(V, 2))) = V
    
    End Sub
    

    【讨论】:

    • +1 高效完成。享受regexp 解决方案。
    • 我已经开始使用Regex了...非常漂亮的对象。
    • @brettdj 谢谢。正则表达式可能比直接的 VBA 函数慢,但它更容易编程。
    • @RonRosenfeld 感谢您的回答。我使用拆分数组的方式让它工作,但不会享受更快的方法 =]
    【解决方案2】:

    Word 救援:这里我使用 Word 的查找/替换功能中的“仅匹配整个单词”选项。

    Dim rngSentences As Range
    Dim sentences, translatedSentences, wordsToReplace, newStrings 
    Dim iWord As Long
    Dim iSentence As Long
    Dim cell As Range
    Dim w As Word.Application
    Dim d As Word.Document
    
    Set rngSentences = Range("A1:A5")
    wordsToReplace = Array("this", "app", "cool")
    newStrings = Array("1", "2", "3")
    
    Set w = New Word.Application
    Set d = w.Documents.Add(DocumentType:=wdNewBlankDocument)
    sentences = rngSentences.Value ' read sentences from sheet
    ReDim translatedSentences(LBound(sentences, 1) To UBound(sentences, 1), _
        LBound(sentences, 2) To UBound(sentences, 2))
    
    For iSentence = LBound(sentences, 1) To UBound(sentences, 1)
        'Put sentence in Word document
        d.Range.Text = sentences(iSentence, 1)
        'Replace the words
        For iWord = LBound(wordsToReplace) To UBound(wordsToReplace)
            d.Range.Find.Execute Findtext:=wordsToReplace(iWord), _
                Replacewith:=newStrings(iWord), MatchWholeWord:=True
        Next iWord
        'Grab sentence back from Word doc
        translatedSentences(iSentence, 1) = d.Range.Text
    Next iSentence
    'slap translated sentences onto sheet
    rngSentences.Offset(0, 1) = translatedSentences
    
    w.Quit savechanges:=False
    

    另一种可能更快的替代方法是将所有句子一次性粘贴到 Word 文档中,替换所有内容,然后将所有内容立即复制粘贴回 Excel 工作表。它可能会更快;我不知道,我没有对它进行过广泛的测试;由您决定。

    要实现这一点,Set d = ... 之后的行可以替换为:

    'Copy-paste all sentences into Word doc
    rngSentences.Copy
    d.Range.PasteSpecial DataType:=wdPasteText
    'Replace words
    For iWord = LBound(wordsToReplace) To UBound(wordsToReplace)
        d.Range.Find.Execute Findtext:=wordsToReplace(iWord), Replacewith:=newStrings(iWord), _
            MatchWholeWord:=True
    Next iWord
    'Copy-paste back to Excel sheet
    d.Range.Copy
    rngSentences.Offset(0, 1).PasteSpecial xlPasteValues
    w.Quit savechanges:=False
    

    【讨论】:

      【解决方案3】:

      如果您愿意,可以使用 Regex
      遵循方案:

      代码:

      ' reference: "Microsoft VBScript Regular Expressions 5.5"
      Dim RegX As Object, Mats As Object, Counter As Long
      Set RegX = CreateObject("VBScript.RegExp")
      
      Dim TrA(1 To 1000) As String
      Dim TrB(1 To 1000) As String
      Dim TrMax As Integer
      Dim StrSp
      
      For i = 1 To 9999
          If Range("D" & i).Value = "" Then Exit For
          TrA(i) = Range("D" & i).Value
          TrB(i) = Range("E" & i).Value
          TrMax = i
      Next
      
      Range("B1:B10").ClearContents
      
      For i = 1 To 9999
          If Range("A" & i).Value = "" Then Exit For
      
          With RegX
              .Global = True
              .Pattern = "[a-zA-Z0-9]+"
              Set Mats = .Execute(Range("A" & i).Value)
          End With
      
          kk = Range("A" & i).Value
          For Counter = 0 To Mats.Count - 1
              For e = 1 To TrMax
                  If LCase(Mats(Counter)) = TrA(e) Then
                      kk = Replace(kk, Mats(Counter), TrB(e), , 1)
                  End If
              Next
          Next
          Range("B" & i).Value = kk
      
      Next
      Set Mats = Nothing
      Set RegX = Nothing
      

      Regex 很快,但是 Word 代码很有趣(复制和粘贴 ... :-)

      【讨论】:

      • +1 但循环遍历单元格非常慢;当TrMax 达到数百个时,这将需要很长时间才能运行。
      猜你喜欢
      • 2016-05-30
      • 2013-09-10
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2021-12-29
      • 1970-01-01
      • 2020-07-25
      相关资源
      最近更新 更多