Why do these two LINQ queries return different numbers of results?

【Question title】: Why do these two linq queries return different numbers of results?
【Posted】: 2013-11-19 00:22:53
【Question】:

In a web application I work with, I found a slow piece of code that I wanted to speed up. The original code is below:

foreach (Guid g in SecondaryCustomersIds)
{
    var Customer = (from d in Db.CustomerRelationships

                    join c in Db.Customers on
                    d.PrimaryCustomerId equals c.CustomerId

                    where c.IsPrimary == true && d.SecondaryCustomerId == g
                    select c).Distinct().SingleOrDefault();
   //Add this customer to a List<>
}

I thought it might be faster to load all of these in a single query, so I tried rewriting it as follows:

var Customers = (from d in Db.CustomerRelationships

                 join c in Db.Customers on
                 d.PrimaryCustomerId equals c.CustomerId

                 where c.IsPrimary == true && SecondaryCustomersIds.Contains(d.SecondaryCustomerId)
                 select c).Distinct();

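This is indeed faster, but the new query now returns fewer records than the first. It seems to me that the two blocks of code are doing the same thing and should return the same number of records. Can anyone see why they don't? What am I missing here?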

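【Comments】:

Maybe you have duplicates in the first one? Have you tried adding .Distinct to the final list that all the customers are added to?
No, sorry, I should have mentioned that. Taking the distinct values of the final list returns the same count, so it is not returning duplicates.
How is SecondaryCustomersIds defined?
It's a List<Guid>, populated by a method that is probably beyond the scope of this question. I can say that both pieces of code use the same SecondaryCustomersIds.
How about benchmarking here? It would give you an answer, but might not explain why.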

Tags: c# .net performance linq

【Answer 1】:

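The first query can add null objects to the list: SingleOrDefault returns the default value for the type, which here is null, when it cannot find a matching entity. So for every secondary customer id without a matching relationship, you may be adding a null to that List<>, which increases the count.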
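A minimal sketch of that effect, using LINQ to Objects instead of the question's Db context. The Customer and CustomerRelationship classes, the NullCountDemo class, and its method names are hypothetical stand-ins invented for illustration; only the query shapes come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the question's entities.
public class Customer
{
    public Guid CustomerId { get; set; }
    public bool IsPrimary { get; set; }
}

public class CustomerRelationship
{
    public Guid PrimaryCustomerId { get; set; }
    public Guid SecondaryCustomerId { get; set; }
}

public static class NullCountDemo
{
    // Mirrors the original loop: one SingleOrDefault() per id,
    // and the result is added to the list whether or not it is null.
    public static List<Customer> PerIdLookup(
        IEnumerable<Guid> secondaryIds,
        IEnumerable<CustomerRelationship> relationships,
        IEnumerable<Customer> customers)
    {
        var result = new List<Customer>();
        foreach (Guid g in secondaryIds)
        {
            var customer = (from d in relationships
                            join c in customers on d.PrimaryCustomerId equals c.CustomerId
                            where c.IsPrimary && d.SecondaryCustomerId == g
                            select c).Distinct().SingleOrDefault();
            result.Add(customer); // null when no relationship matches g
        }
        return result;
    }

    // Mirrors the rewritten query: ids without a match contribute no row at all.
    public static List<Customer> SingleQuery(
        ICollection<Guid> secondaryIds,
        IEnumerable<CustomerRelationship> relationships,
        IEnumerable<Customer> customers)
    {
        return (from d in relationships
                join c in customers on d.PrimaryCustomerId equals c.CustomerId
                where c.IsPrimary && secondaryIds.Contains(d.SecondaryCustomerId)
                select c).Distinct().ToList();
    }
}
```

With one id that has a relationship and one that does not, PerIdLookup returns two entries (one of them null) while SingleQuery returns one. Skipping nulls before adding, or filtering the final list with .Where(c => c != null), makes the two counts comparable.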

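【Comments】:

【Answer 2】:

In your first scenario, does your final List<Customers> contain duplicates?

You are calling Distinct, but you are also looping, which means you are not performing the Distinct over the whole collection.

Your second example calls Distinct over the whole collection.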
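A small sketch of the difference in scope, assuming two secondary ids can point at the same primary customer. The integer ids, the Relationships data, and the DistinctScopeDemo class are hypothetical and exist only to illustrate the point.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctScopeDemo
{
    // Hypothetical relationship rows: (SecondaryId, PrimaryId).
    public static readonly (int SecondaryId, int PrimaryId)[] Relationships =
    {
        (1, 100),
        (2, 100), // a second secondary customer pointing at the same primary
        (3, 200),
    };

    // Distinct() runs once per id, so duplicates across iterations survive.
    public static List<int> DistinctPerIteration(IEnumerable<int> secondaryIds)
    {
        var result = new List<int>();
        foreach (int id in secondaryIds)
        {
            result.AddRange(Relationships
                .Where(r => r.SecondaryId == id)
                .Select(r => r.PrimaryId)
                .Distinct());
        }
        return result;
    }

    // Distinct() runs once over the combined result set.
    public static List<int> DistinctOverAll(ICollection<int> secondaryIds)
    {
        return Relationships
            .Where(r => secondaryIds.Contains(r.SecondaryId))
            .Select(r => r.PrimaryId)
            .Distinct()
            .ToList();
    }
}
```

For the ids 1, 2 and 3, the per-iteration version yields primary 100 twice plus 200, while the single query yields 100 and 200 once each. Whether this applies to a given data set can be checked by calling Distinct on the final list and comparing counts.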

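【Comments】:

@Noctis Although I feel this should have been a comment rather than an answer (too speculative), we both submitted our posts at the same time, so I won't claim royalties.
I added this as a comment, but just to clarify: selecting Distinct from the final List<Customers> does not change the number of records.
Same as the comment, true, but I hadn't seen it when I was answering.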