Check two List<int>'s for the same numbers

【Question title】: Check two List<int>'s for the same numbers
【Posted】: 2009-01-26 14:55:02
【Question description】:

I have two lists and want to find the numbers they have in common.

For example:

List<int> a = new List<int>(){1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};

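This should give the result 4. Is there a simple way to do this without looping through the lists too many times?

I'm using .NET 3.0 for the project that needs this, so no LINQ.

【Comments】:

How many items are you likely to be storing?
Not many. At most 15 in the first list and 20 in the second, but typically no more than 4 in the first and 10 in the second.
So an outer loop with IndexOf() would probably be fine? An answer might make that clearer.

Tags: c# .net generics list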

【Solution 1】:

You can use the .NET 3.5 .Intersect() extension method:

List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };

List<int> common = a.Intersect(b).ToList();

【Comments】:

【Solution 2】:

Jeff Richter's excellent PowerCollections has a Set type with an Intersection method. It works all the way back to .NET 2.0.

http://www.codeplex.com/PowerCollections

    Set<int> set1 = new Set<int>(new[]{1,2,3,4,5});
    Set<int> set2 = new Set<int>(new[]{0,4,8,12});
    Set<int> set3 = set1.Intersection(set2);

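【Comments】:

【Solution 3】:

You can do this the same way LINQ effectively does it: with a set. Before 3.5 there was no proper set type, so you would need to use a Dictionary<int, int> or something similar:

1. Create a Dictionary<int, int> and populate it from list a, using each element as both the key and the value of its entry. (The value in the entry really doesn't matter.)
2. Create a new list for the intersection (or write it as an iterator block, whichever you prefer).
3. Iterate through list b and check each element with dictionary.ContainsKey: if the key is present, add the element to the list or yield it.

This should be O(N+M), i.e. linear in the size of both lists.

Note that this will give you repeated entries if list b contains duplicates. If you want to avoid that, you can always change the value of the dictionary entry the first time you see it in list b.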
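
The dictionary approach described in this answer could be sketched roughly as follows. This is only an illustration, not code from the answer: the class and method names are made up, and it stores a bool flag instead of an int so that each common value is reported once even when list b contains duplicates. It uses nothing beyond System.Collections.Generic, so it should not need LINQ.

```csharp
using System;
using System.Collections.Generic;

static class DictionaryIntersection
{
    // Hypothetical helper: returns the values of b that also occur in a,
    // each reported at most once.
    public static List<int> Intersect(List<int> a, List<int> b)
    {
        // Key = element of a; value = whether it has already been reported.
        Dictionary<int, bool> seen = new Dictionary<int, bool>();
        foreach (int x in a)
        {
            seen[x] = false; // the indexer tolerates duplicates in a
        }

        List<int> result = new List<int>();
        foreach (int x in b)
        {
            bool reported;
            if (seen.TryGetValue(x, out reported) && !reported)
            {
                result.Add(x);
                seen[x] = true; // flip the flag so duplicates in b are skipped
            }
        }
        return result;
    }

    static void Main()
    {
        List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
        List<int> b = new List<int>() { 0, 4, 8, 12 };
        foreach (int x in Intersect(a, b))
        {
            Console.WriteLine(x); // prints 4
        }
    }
}
```

Each list is walked once and every dictionary operation is expected O(1), which is where the O(N+M) estimate comes from.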

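【Comments】:

I assume Dictionary.ContainsKey() is faster than List.Contains()?
A dictionary lookup is O(1); a list lookup is O(N).
Of course, the O(N) part only really matters for large inputs. For very small lists, List.Contains may well be faster.
@Jon Skeet, I agree that in this case (the author mentions 15-20 entries per list) List.Contains is sufficient.

【Solution 4】:

You can sort the second list, then iterate over the first list and binary-search the second list for each value.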
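
A minimal sketch of the sort-then-binary-search idea, assuming it is acceptable to sort a copy of the second list rather than the original. The class and method names are invented for illustration; List<T>.Sort and List<T>.BinarySearch are the standard framework methods.

```csharp
using System;
using System.Collections.Generic;

static class BinarySearchIntersection
{
    // Hypothetical helper: returns the values of a that also occur in b.
    public static List<int> Intersect(List<int> a, List<int> b)
    {
        // Sort a copy so the caller's list is not reordered.
        List<int> sorted = new List<int>(b);
        sorted.Sort();

        List<int> result = new List<int>();
        foreach (int x in a)
        {
            // BinarySearch returns a non-negative index when the value is found.
            if (sorted.BinarySearch(x) >= 0)
            {
                result.Add(x);
            }
        }
        return result;
    }

    static void Main()
    {
        List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
        List<int> b = new List<int>() { 0, 4, 8, 12 };
        foreach (int x in Intersect(a, b))
        {
            Console.WriteLine(x); // prints 4
        }
    }
}
```

Note that a value repeated in the first list would be reported once per occurrence in this sketch.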

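【Comments】:

【Solution 5】:

If both lists are sorted, you can easily do this in O(n) time with a modified version of the merge step from merge sort: simply "remove" (advance the counter past) the lower of the two leading numbers, and whenever the two are equal, save that number to the result list and "remove" both. This takes fewer than n(1) + n(2) steps, where n(1) and n(2) are the lengths of the two lists. That assumes, of course, that the lists are sorted. But sorting an array of integers isn't exactly expensive, O(n log(n))... I think. I can put together some code showing how to do this if you like, but the idea is simple.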
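
For illustration, the modified merge step might look something like the sketch below. It assumes both inputs are already sorted in ascending order; the class and method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;

static class SortedMergeIntersection
{
    // Hypothetical helper: both lists must already be sorted ascending.
    public static List<int> Intersect(List<int> sortedA, List<int> sortedB)
    {
        List<int> result = new List<int>();
        int i = 0; // position in sortedA
        int j = 0; // position in sortedB

        while (i < sortedA.Count && j < sortedB.Count)
        {
            if (sortedA[i] < sortedB[j])
            {
                i++; // "remove" the lower leading number
            }
            else if (sortedA[i] > sortedB[j])
            {
                j++;
            }
            else
            {
                result.Add(sortedA[i]); // equal: keep it and advance both
                i++;
                j++;
            }
        }
        return result;
    }

    static void Main()
    {
        List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
        List<int> b = new List<int>() { 0, 4, 8, 12 };
        foreach (int x in Intersect(a, b))
        {
            Console.WriteLine(x); // prints 4
        }
    }
}
```

Every pass through the loop advances at least one of the two counters, so the loop body runs fewer than n(1) + n(2) times.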

【Comments】:

The lists aren't sorted, and there isn't much data; thanks anyway.

【Solution 6】:

Tested on 3.0

    List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
    List<int> b = new List<int>() { 0, 4, 8, 12 };
    List<int> intersection = new List<int>();
    Dictionary<int, int> dictionary = new Dictionary<int, int>();
    a.ForEach(x => { if(!dictionary.ContainsKey(x))dictionary.Add(x, 0); });
    b.ForEach(x => { if(dictionary.ContainsKey(x)) dictionary[x]++; });
    foreach(var item in dictionary)
    {
        if(item.Value > 0)
            intersection.Add(item.Key);
    }

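【Comments】:

This looks like a good naive implementation. It does the job, but it won't scale well given all the memory and computational overhead.
It does the job in O(n). In any case, different problems call for different solutions, and sometimes a naive solution is better than a general optimized one when the input is not large. Given the current problem, I think the naive solution will work.

【Solution 7】:

In a comment, the question's author said there would be

at most 15 in the first list and at most 20 in the second

In that case, I wouldn't bother optimizing and would just use List.Contains.

For larger lists, hashing can be used to take advantage of O(1) lookups, which leads to an O(N+M) algorithm, as Jon mentioned.

Hashing requires extra space. To reduce memory use, we should hash the shorter list.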

List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> shortestList;
List<int> longestList;
if (a.Count > b.Count)
{
    shortestList = b;
    longestList = a;
}
else
{
    shortestList = a;
    longestList = b;                
}

Dictionary<int, bool> dict = new Dictionary<int, bool>();
shortestList.ForEach(x => dict[x] = true); // indexer avoids an exception on duplicate values

foreach (int i in longestList)
{
    if (dict.ContainsKey(i))
    {
        Console.WriteLine(i);
    }
}

【Comments】:

【Solution 8】:

var c = a.Intersect(b);

This only works in 3.5. I just saw your requirement, my apologies.

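【Comments】:

【Solution 9】:

If you plan to implement this from scratch, the approach ocdecio recommends is a good one. Compare its time complexity with that of the naive approach:

Sort / binary search approach: T ~= O(n log n) + O(n) * O(log n) ~= O(n log n)

Looping through both lists (naive approach): T ~= O(n) * O(n) ~= O(n ^ 2)

There may be a faster approach, but I'm not aware of one. Hopefully this justifies choosing his method.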

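【Comments】:

The binary search approach is O(n log n) + O(n log n): one search for each item in the list.

【Solution 10】:

(Previous answer: changed IndexOf to Contains, because IndexOf converts to an array first)

Given that these are two small lists, the code below should be fine. I'm not sure whether there is a library with an intersection method like the one Java has (although a List is not a set, so it wouldn't apply); I know someone pointed out that the PowerCollections library has one.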

List<int> a = new List<int>() {1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};

List<int> result = new List<int>();
for (int i=0;i < a.Count;i++)
{
    if (b.Contains(a[i]))
        result.Add(a[i]);
}

foreach (int i in result)
    Console.WriteLine(i);

Update 2: HashSet is a silly answer, since it is in 3.5 rather than 3.0.

Update: HashSet seems to be the obvious answer:

// Method 2 - HashSet from System.Core
HashSet<int> aSet = new HashSet<int>(a);
HashSet<int> bSet = new HashSet<int>(b);
aSet.IntersectWith(bSet);
foreach (int i in aSet)
    Console.WriteLine(i);

【Comments】:

【Solution 11】:

Here is a method that removes duplicate strings. Change it to work with int and it will work fine.

public List<string> removeDuplicates(List<string> inputList)
{
    Dictionary<string, int> uniqueStore = new Dictionary<string, int>();
    List<string> finalList = new List<string>();

    foreach (string currValue in inputList)
    {
        if (!uniqueStore.ContainsKey(currValue))
        {
            uniqueStore.Add(currValue, 0);
            finalList.Add(currValue);
        }
    }
    return finalList;
}

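Update: Sorry, I was actually combining the lists and then removing the duplicates; I pass the combined list to this method. Not exactly what you asked for.

【Comments】:

【Solution 12】:

Wow. The answers so far look really complicated. Why not just use: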

List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
List<int> b = new List<int>() { 0, 4, 8, 12 };

...

public List<int> Dups(List<int> a, List<int> b)
{
    List<int> ret = new List<int>();

    foreach (int x in b)
    { 
        if (a.Contains(x))
        {
            ret.Add(x);
        }
    }

    return ret;
}

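This seems more straightforward to me... unless I've missed part of the question, which is entirely possible.

【Comments】:

This way, for each item in b you are looping over every item in a, so performance will degrade on larger data sets.
That is essentially O(M + M*N) rather than the O(M+N) performance of the dictionary example.
He asked how to do it without LINQ. I didn't see anything about doing it in the fewest CPU cycles. No, it isn't the fastest, but it does work, and it is easy to read and maintain.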