【Question title】: Replace a given value in array with values from second array in C#
【Posted】: 2016-04-20 04:02:16
【Question】:

I have an array of integers that contains a number of negative values:

var arrayExisting = new int[]{1,2,-1,3,5,-1,0,0,-1};

I also have a second array holding the corresponding values that I want to insert into the first array:

var replacements = new int[]{7,6,5};

Is there a really efficient way to do this?

What I currently have is:

var i = 0; // index of the next unused replacement value
var newArray = arrayExisting.Select(val =>
        {
            if (val != -1) return val;
            var ret = replacements[i];
            i++;
            return ret;
        }).ToArray();

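This is fairly fast. The arrays in question are only about 15 integers long; that may grow, but is unlikely to exceed 100. The problem is that I have to do this more than a million times for my medium-sized test system, and for the real system I am looking at around 10e10 iterations involving this code!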

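【Comments on the question】:

  • Could you build a list of the negative indices when the existing array is first created or populated?
  • Not to nitpick, but you say the array can have "negative values", yet your code snippet only checks for -1.
  • Yes, I can build the list of negative indices up front. I have posted another answer below with some benchmarks.
  • Re: "10e10 iterations". Do you mean 1.0e10 or 1e11?
  • I mean 1e11, sorry for the confusion.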
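
As a quick check of the notation in the last two comments: a C# floating-point literal `10e10` means 10 × 10^10, which is the same value as `1e11`. A minimal sketch (the class and member names are made up for illustration):

```csharp
public static class IterationCount
{
    public const double AsWritten = 10e10; // 10 x 10^10
    public const double Intended = 1e11;   // 1 x 10^11

    // True when both literals denote the same double value.
    public static bool Agree => AsWritten == Intended;
}
```

Both literals denote 100,000,000,000, which a double represents exactly, so `IterationCount.Agree` evaluates to true.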

Tags: c# arrays performance linq optimization


【Solution 1】:

I would use a for loop and replace the values in the original array in place.

int replacementIndex = 0;
for (var i = 0; i < arrayExisting.Length; i++) {
    if (arrayExisting[i] < 0) {
        arrayExisting[i] = replacements[replacementIndex++];
    }
}

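This avoids the overhead of creating a new array. If you do need a new array, you can create a new int[arrayExisting.Length] and write the values into that instead.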
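
A minimal sketch of that non-mutating variant, assuming negative entries are the placeholders (as in the loop above). `ArrayFill` and `ReplaceNegatives` are hypothetical names, and the bounds check is an addition that the original loop does not have:

```csharp
using System;

public static class ArrayFill
{
    // Returns a new array and leaves 'existing' untouched.
    // Negative entries are replaced, in order, by successive values from 'replacements'.
    public static int[] ReplaceNegatives(int[] existing, int[] replacements)
    {
        var result = new int[existing.Length];
        var next = 0; // index of the next unused replacement
        for (var i = 0; i < existing.Length; i++)
        {
            if (existing[i] < 0)
            {
                if (next >= replacements.Length)
                    throw new ArgumentException("Not enough replacement values.", nameof(replacements));
                result[i] = replacements[next++];
            }
            else
            {
                result[i] = existing[i];
            }
        }
        return result;
    }
}
```

With the arrays from the question, `ReplaceNegatives(new[] {1,2,-1,3,5,-1,0,0,-1}, new[] {7,6,5})` yields `{1,2,7,3,5,6,0,0,5}`.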

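Running a quick benchmark, the for loop seems to be about 4x faster, even in the worst case where every element has to be replaced and a new array is built to hold the result. Times are in milliseconds: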

Select: 12672
For: 3386

If you are interested, here is the benchmark.

var loops = 1000000;
var arrayExisting = Enumerable.Repeat(-1, 1000).ToArray();
var replacements = Enumerable.Repeat(1, 1000).ToArray();

var selectTimer = Stopwatch.StartNew();
for (var j = 0; j < loops; j++)
{
    var i = 0;
    var newArray = arrayExisting.Select(val =>
    {
        if (val != -1) return val;
        var ret = replacements[i];
        i++;
        return ret;
    }).ToArray();
}
selectTimer.Stop();

var forTimer = Stopwatch.StartNew();
for (var j = 0; j < loops; j++)
{
    var replaced = new int[arrayExisting.Length];
    int replacementIndex = 0;
    for (var i = 0; i < arrayExisting.Length; i++)
    {
        if (arrayExisting[i] < 0)
        {
            replaced[i] = replacements[replacementIndex++];
        }
        else
        {
            replaced[i] = arrayExisting[i];
        }
    }
}
forTimer.Stop();

Console.WriteLine("Select: " + selectTimer.ElapsedMilliseconds);
Console.WriteLine("For: " + forTimer.ElapsedMilliseconds);

【Comments】:

    【Solution 2】:

    Try it with pointers:

    int replacementsLength = arrayReplacements.Length;
    fixed (int* existing = arrayExisting, replacements = arrayReplacements)
    {
        int* exist = existing;
        int* replace = replacements;
        int i = 0;
        while (i < replacementsLength)
        {
            if (*exist == -1)
            {
                *exist = *replace;
                i++;
                replace++;
            }
            exist++; //edit: forgot to put exist++ outside the if block
        }
    }
    

    Edit: this code only works if you are sure there are exactly as many replacements as -1 values. To handle every case, use the following code:

    int replacementsLength = arrayReplacements.Length;
    int existingLength = arrayExisting.Length;
    fixed (int* existing = arrayExisting, replacements = arrayReplacements)
    {
        int* exist = existing;
        int* replace = replacements;
        int i = 0;
        int x = 0;
        while (i < replacementsLength && x < existingLength)
        {
            if (*exist == -1)
            {
                *exist = *replace;
                i++;
                replace++;
            }
            exist++;
            x++;
        }
    }
    

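    Running the same test as Joey, the results (in milliseconds) are:

    Select: 17378
    For: 2172
    Pointer: 1780
    Edit: my mistake, I forgot to run my code 1,000,000 times. It is still faster.

    Here is the test code: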

    private unsafe static void test()
    {
        var loops = 1000000;
        var arrayExisting = Enumerable.Repeat(-1, 1000).ToArray();
        var arrayReplacements = Enumerable.Repeat(1, 1000).ToArray();
        int[] newArray = null;
    
        var selectTimer = Stopwatch.StartNew();
        for (var j = 0; j < loops; j++)
        {
            var i = 0;
            newArray = arrayExisting.Select(val =>
            {
                if (val != -1) return val;
                var ret = arrayReplacements[i];
                i++;
                return ret;
            }).ToArray();
        }
        selectTimer.Stop();
    
        printResult("linQ", newArray);
    
        arrayExisting = Enumerable.Repeat(-1, 1000).ToArray();
        arrayReplacements = Enumerable.Repeat(1, 1000).ToArray();
        int[] replaced = null;
    
        var forTimer = Stopwatch.StartNew();
        for (var j = 0; j < loops; j++)
        {
            replaced = new int[arrayExisting.Length];
            int replacementIndex = 0;
            for (var i = 0; i < arrayExisting.Length; i++)
            {
                if (arrayExisting[i] < 0)
                {
                    replaced[i] = arrayReplacements[replacementIndex++];
                }
                else
                {
                    replaced[i] = arrayExisting[i];
                }
            }
        }
        forTimer.Stop();
    
        printResult("for", replaced);
    
        arrayExisting = Enumerable.Repeat(-1, 1000).ToArray();
        arrayReplacements = Enumerable.Repeat(1, 1000).ToArray();
    
        int[] copy = null;
    
        var pointerTimer = Stopwatch.StartNew();
        //EDIT: fixed the test code
        for (int j = 0; j < loops; j++)
        {
            copy = new int[arrayExisting.Length];
            Array.Copy(arrayExisting, copy, arrayExisting.Length);
            int replacementsLength = arrayReplacements.Length;
            int existingLength = arrayExisting.Length;
            fixed (int* existing = copy, replacements = arrayReplacements)
            {
                int* exist = existing;
                int* replace = replacements;
                int i = 0;
                int x = 0;
                while (i < replacementsLength && x < existingLength)
                {
                    if (*exist == -1)
                    {
                        *exist = *replace;
                        i++;
                        replace++;
                    }
                    exist++;
                    x++;
                }
            }
        }
        pointerTimer.Stop();
    
        printResult("pointer", copy);
    
        File.AppendAllText(@"E:\dev\test.txt", "\r\n" +
            "Select: " + selectTimer.ElapsedMilliseconds + "\r\n" +
            "For: " + forTimer.ElapsedMilliseconds + "\r\n" + 
            "Pointer: " + pointerTimer.ElapsedMilliseconds);
    }
    

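    【Comments】:

    • Can you provide the complete test code? In my tests, this unsafe code is only 35% faster than a for loop that does not create a new array every time.
    • I realized my mistake: I forgot to loop the code 1,000,000 times. I am going to change it and test again.
    • Thanks. As a tip, you can create a pointer like int* replaceEnd = replace + arrayReplacements.Length; and test against replace < replaceEnd. That saves you the i counter.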
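
    The end-pointer idea from the last comment could look roughly like the sketch below. `PointerFill` and `ReplaceInPlace` are hypothetical names, and using a second end pointer for the existing array as well is an extension of the commenter's suggestion. Compiling it requires unsafe code to be enabled (for example `AllowUnsafeBlocks` in the project file).

```csharp
public static class PointerFill
{
    // Replaces -1 entries of 'existing' in place with successive values from 'replacements'.
    // Stops when either array is exhausted.
    public static unsafe void ReplaceInPlace(int[] existing, int[] replacements)
    {
        // 'fixed' on an empty array yields a null pointer, so return early.
        if (existing.Length == 0 || replacements.Length == 0) return;

        fixed (int* existingStart = existing, replacementsStart = replacements)
        {
            int* exist = existingStart;
            int* existEnd = existingStart + existing.Length;
            int* replace = replacementsStart;
            int* replaceEnd = replacementsStart + replacements.Length;

            // Comparing against the end pointers replaces the i and x counters.
            while (exist < existEnd && replace < replaceEnd)
            {
                if (*exist == -1)
                {
                    *exist = *replace;
                    replace++;
                }
                exist++;
            }
        }
    }
}
```

    Whether this is measurably faster than the counter-based loop is not something the thread establishes; it would need to be benchmarked.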
    【Solution 3】:

    Following @TVOHM's comment on the original question, I implemented the following code:

    public static int[] ReplaceUsingLinq(IEnumerable<int> arrayFromExisting, IEnumerable<int> x)
        {
            var indices = x.ToArray();
            var i = 0;
            var newArray = arrayFromExisting.Select(val =>
            {
                if (val != -1) return val;
                var ret = indices[i];
                i++;
                return ret;
            }).ToArray();
            return newArray;
    
        }
    
        public static int[] ReplceUsingForLoop(int[] arrayExisting, IEnumerable<int> x)
        {
    
            var arrayReplacements = x.ToArray();
            var replaced = new int[arrayExisting.Length];
            var replacementIndex = 0;
            for (var i = 0; i < arrayExisting.Length; i++)
            {
                if (arrayExisting[i] < 0)
                {
                    replaced[i] = arrayReplacements[replacementIndex++];
                }
                else
                {
                    replaced[i] = arrayExisting[i];
                }
            }
    
            return replaced;
        }
    
        public static unsafe int[] ReplaceUsingPointers(int[] arrayExisting, IEnumerable<int> reps)
        {
    
            var arrayReplacements = reps.ToArray();
            int replacementsLength = arrayReplacements.Length;
            var replaced = new int[arrayExisting.Length];
            Array.Copy(arrayExisting, replaced, arrayExisting.Length);
            int existingLength = replaced.Length;
            fixed (int* existing = replaced, replacements = arrayReplacements)
            {
                int* exist = existing;
                int* replace = replacements;
                int i = 0;
                int x = 0;
                while (i < replacementsLength && x < existingLength)
                {
                    if (*exist == -1)
                    {
                        *exist = *replace;
                        i++;
                        replace++;
                    }
                    exist++;
                    x++;
                }
            }
    
            return replaced;
        }
    
        public static int[] ReplaceUsingLoopWithMissingArray(int[] arrayExisting, IEnumerable<int> x,
            int[] missingIndices)
        {
    
            var arrayReplacements = x.ToArray();
            var replaced = new int[arrayExisting.Length];
            Array.Copy(arrayExisting, replaced, arrayExisting.Length);
            var replacementIndex = 0;
            foreach (var index in missingIndices)
            {
                replaced[index] = arrayReplacements[replacementIndex];
                replacementIndex++;
            }
            return replaced;
        }
    

    and benchmarked it with the following code:

    public void BenchmarkArrayItemReplacements()
        {
            var rand = new Random();
            var arrayExisting = Enumerable.Repeat(2, 1000).ToArray();
            var arrayReplacements = Enumerable.Repeat(1, 100);
            var toReplace = Enumerable.Range(0, 100).Select(x => rand.Next(100)).ToList();
            toReplace.ForEach(x => arrayExisting[x] = -1);
            var misisngIndices = toReplace.ToArray();
            var sw = Stopwatch.StartNew();
    
    
            var result = ArrayReplacement.ReplceUsingForLoop(arrayExisting, arrayReplacements);
            Console.WriteLine($"for loop took {sw.ElapsedTicks}");
    
            sw.Restart();
            result = ArrayReplacement.ReplaceUsingLinq(arrayExisting, arrayReplacements);
            Console.WriteLine($"linq took {sw.ElapsedTicks}");
            sw.Restart();
            result = ArrayReplacement.ReplaceUsingLoopWithMissingArray(arrayExisting, arrayReplacements, misisngIndices);
            Console.WriteLine($"with missing took {sw.ElapsedTicks}");
            sw.Restart();
            result = ArrayReplacement.ReplaceUsingPointers(arrayExisting, arrayReplacements);
            Console.WriteLine($"Pointers took {sw.ElapsedTicks}");
    
        }
    

    This gives the following results (in Stopwatch ticks):

    for loop took      848
    linq took         2879
    with missing took  584
    Pointers took      722
    

    So it seems that knowing in advance where the missing values are (where the -1s are) is the key to making this fast.
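
    A minimal sketch of that idea, assuming the placeholder positions stay the same across many calls so they only need to be found once. `IndexedFill`, `FindPlaceholderIndices` and `Fill` are hypothetical names, not part of the code above:

```csharp
using System;
using System.Collections.Generic;

public static class IndexedFill
{
    // Scan once and remember where the placeholders (-1) are.
    public static int[] FindPlaceholderIndices(int[] template)
    {
        var indices = new List<int>();
        for (var i = 0; i < template.Length; i++)
        {
            if (template[i] == -1) indices.Add(i);
        }
        return indices.ToArray();
    }

    // Reuse the remembered positions for each new set of replacement values.
    public static int[] Fill(int[] template, int[] placeholderIndices, int[] replacements)
    {
        if (replacements.Length < placeholderIndices.Length)
            throw new ArgumentException("Not enough replacement values.", nameof(replacements));

        var result = (int[])template.Clone();
        for (var k = 0; k < placeholderIndices.Length; k++)
        {
            result[placeholderIndices[k]] = replacements[k];
        }
        return result;
    }
}
```

    For the array in the question, `FindPlaceholderIndices` returns `{2, 5, 8}`, and each later `Fill` call only touches those three positions instead of testing every element.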

    Incidentally, if I loop each call to the relevant method 10,000 times and check the timings, I get:

    for loop took     190988
    linq took         489052
    with missing took  69198
    Pointers took     159102
    

    The effect is larger here.

    【Comments】:
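
    One possible reason the looped figures separate the methods more clearly is that a single timed call also includes one-time costs such as JIT compilation; that is an assumption, not something measured above. A minimal sketch of a timing helper that makes one untimed warm-up call before measuring (`MicroBenchmark` and `TimeTicks` are hypothetical names, not part of the original benchmark):

```csharp
using System;
using System.Diagnostics;

public static class MicroBenchmark
{
    // Runs 'action' once untimed, then returns the total elapsed
    // Stopwatch ticks for 'iterations' timed runs.
    public static long TimeTicks(Action action, int iterations)
    {
        action(); // warm-up call, excluded from the measurement

        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.ElapsedTicks;
    }
}
```

    Each method under test would be passed in as a lambda and timed with the same iteration count.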
