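Although stribizhev's answer is a good fit for this situation, I want to take this opportunity to highlight the (negative) performance impact of using Regex for simple tasks.
Quicker (x2) alternative to Regex (which is always quite slow in these kinds of situations)
My approach is based on removing the spaces recursively. I have created two versions: the first with a conventional loop (withoutRegex) and the second relying on LINQ (withoutRegex2); the latter is actually identical to stribizhev's answer except for the Regex part.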
Private Function withoutRegex(input As String) As String
    Dim output As String = ""
    Dim temp() = input.Split(","c)
    For i As Integer = 0 To temp.Length - 1
        ' Clean each item and append a comma after every item except the last one.
        output = output & recursiveSpaceRemoval(temp(i).Trim()) & If(i < temp.Length - 1, ",", "")
    Next
    Return output
End Function
' Requires Imports System.Linq
Private Function withoutRegex2(input As String) As String
    Return String.Join(",", _
        input _
        .Split(","c) _
        .Select(Function(x) recursiveSpaceRemoval(x.Trim())) _
        .ToArray())
End Function
Private Function recursiveSpaceRemoval(input As String) As String
    ' Replace every pair of spaces with a single space.
    Dim output As String = input.Replace("  ", " ")
    ' Nothing changed: no runs of spaces are left.
    If output = input Then Return output
    Return recursiveSpaceRemoval(output)
End Function
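As a quick illustration of what the recursive removal does, here is a minimal console sketch. The module name and the sample string are made up for the example; the two functions repeat the logic of recursiveSpaceRemoval and withoutRegex2 so that the snippet can be compiled on its own.

```vbnet
Imports System
Imports System.Linq

Module SpaceRemovalDemo
    ' Same logic as recursiveSpaceRemoval: collapse runs of spaces into one space.
    Function RecursiveSpaceRemoval(input As String) As String
        Dim output As String = input.Replace("  ", " ")
        If output = input Then Return output
        Return RecursiveSpaceRemoval(output)
    End Function

    ' Same logic as withoutRegex2: split on commas, clean each item, re-join.
    Function WithoutRegex2(input As String) As String
        Return String.Join(",", _
            input _
            .Split(","c) _
            .Select(Function(x) RecursiveSpaceRemoval(x.Trim())) _
            .ToArray())
    End Function

    Sub Main()
        ' Hypothetical sample with runs of spaces around and inside the items.
        Dim sample As String = "Ireland,   UK ,  United    States"
        Console.WriteLine(WithoutRegex2(sample)) ' prints "Ireland,UK,United States"
    End Sub
End Module
```

Each recursive call halves the length of a run of spaces, so even long runs need only a handful of calls.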
To prove my point, I created the following testing framework:
Dim input As String = "Ireland, UK, United States of America, Belgium, Germany , Some Country"
Dim output As String = ""
Dim count As Integer = 0
Dim countMax As Integer = 20
Dim with0 As Long = 0
Dim without As Long = 0
Dim without2 As Long = 0
While count < countMax
    count = count + 1

    ' Regex version
    Dim sw As Stopwatch = New Stopwatch
    sw.Start()
    output = withRegex(input)
    sw.Stop()
    with0 = with0 + sw.ElapsedTicks

    ' Loop version
    sw = New Stopwatch
    sw.Start()
    output = withoutRegex(input)
    sw.Stop()
    without = without + sw.ElapsedTicks

    ' LINQ version
    sw = New Stopwatch
    sw.Start()
    output = withoutRegex2(input)
    sw.Stop()
    without2 = without2 + sw.ElapsedTicks
End While
MessageBox.Show("With: " & with0.ToString)
MessageBox.Show("Without: " & without.ToString)
MessageBox.Show("Without 2: " & without2.ToString)
where withRegex refers to stribizhev's answer, that is:
' Requires Imports System.Linq and Imports System.Text.RegularExpressions
Private Function withRegex(input As String) As String
    Return String.Join(",", _
        input _
        .Split(","c) _
        .Select(Function(m) Regex.Replace(m.Trim(), "\p{Zs}{2,}", " ")) _
        .ToArray())
End Function
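This is a simplistic testing framework that analyses very quick actions, where every bit counts (the point of the 20-iteration loop is precisely to try to improve the reliability of the measurements). That is, the results are affected even by changing the order in which the methods are called.
In any case, the differences between the methods stayed more or less consistent across all my tests. The averages I got after some tests (total Stopwatch ticks over the 20 iterations) were: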
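One possible way to reduce that dependence on call order is to give each method an untimed warm-up call and then time all of its iterations in one go. The sketch below is only an illustration of that idea and was not used for the figures in this answer; MeasureTicks and the stand-in lambda are hypothetical names introduced for the example.

```vbnet
Imports System
Imports System.Diagnostics

Module TimingSketch
    ' Hypothetical helper: total elapsed ticks for `iterations` calls,
    ' measured after one untimed warm-up call.
    Function MeasureTicks(candidate As Func(Of String, String), input As String, iterations As Integer) As Long
        candidate.Invoke(input) ' warm-up call, not timed

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            candidate.Invoke(input)
        Next
        sw.Stop()
        Return sw.ElapsedTicks
    End Function

    Sub Main()
        ' Stand-in for withRegex / withoutRegex / withoutRegex2.
        Dim candidate As Func(Of String, String) = Function(s) s.Trim()
        Console.WriteLine("Ticks: " & MeasureTicks(candidate, "  Some   Country ", 20).ToString())
    End Sub
End Module
```

Passing each of the three methods to the same helper keeps the measured code path identical, so whatever overhead the helper adds applies equally to all of them.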
With: 2500-2700
Without: 1100-1300
Without 2: 900-1200
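NOTE: as far as this is a general criticism of Regex performance (at least in cases simple enough to be easily replaced by alternatives like the ones I am showing here), any advice on how to improve it (the performance of Regex in .NET) will be more than welcome. But please avoid unclear generic statements and be as specific as possible (for example, by suggesting changes to the proposed testing framework).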
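As an example of the kind of specific change that could be measured, one variant is to build the Regex once with RegexOptions.Compiled and reuse the instance instead of calling the static Regex.Replace each time. This is only a sketch: I have not measured it, so whether it narrows the gap is an open question, and the names CachedRegexSketch, MultiSpace and WithCachedRegex are invented for the example.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module CachedRegexSketch
    ' Built once and reused by every call; same pattern as in withRegex.
    Private ReadOnly MultiSpace As New Regex("\p{Zs}{2,}", RegexOptions.Compiled)

    ' Hypothetical variant of withRegex that uses the shared instance.
    Function WithCachedRegex(input As String) As String
        Return String.Join(",", _
            input _
            .Split(","c) _
            .Select(Function(m) MultiSpace.Replace(m.Trim(), " ")) _
            .ToArray())
    End Function

    Sub Main()
        Console.WriteLine(WithCachedRegex("Germany   ,  Some    Country")) ' prints "Germany,Some Country"
    End Sub
End Module
```

WithCachedRegex could be added to the testing framework as a fourth method to see how it compares with the other three.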