Filtering for duplicate rows: answers

【Question title】: Filtering for duplicate rows
【Posted】: 2018-06-08 10:53:17
【Question】:

I have this piece of code:

var grouped = nonGrouped.GroupBy(x => new
{
    x.Id,
    x.Col1,
    x.Col2
}).Select(x => new MyDbTable
{
    Id = x.Key.Id,
    VALUE = x.Sum(y => y.Value),
    Col1 = x.Key.Col1,
    Col2 = x.Key.Col2
}).ToList();

//Filter out rows with the same Col1/Col2 combination
var dbTableList = new List<MyDbTable>();
grouped.ForEach(x =>
{
    if (!dbTableList.Any(a => a.Col1 == x.Col2 && a.Col2 == x.Col1))
    {
        dbTableList.Add(x);
    }
});

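I would like to remove the code under the comment "//Filter out rows with the same Col1/Col2 combination" and somehow fold that functionality into the LINQ statement above the comment.

【Comments】:

a.Col1 == x.Col2: you should be comparing Col1 with Col1.
Possible duplicate of "Remove duplicates in the list using linq".
Why not call Distinct() before ToList()? It should work, in particular on the custom fields returned by Select().
Do you need the Id? Your logic actually removes rows that are not duplicates, because they have different Ids. If you don't need the Id, you can drop it entirely and you won't have duplicates any more.
Do you really just want to keep the first Id (and its sum) for each Col1/Col2 combination? If so, you can group by the columns only, use Distinct(), and take the first Id and the sum for that Id.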
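The last suggestion above (group by the columns only, then keep the first Id and the sum for that Id) could look roughly like the sketch below. It is only an illustration: the tuple row shape, the column types, and the sample data are assumptions, since the thread never shows how MyDbTable or nonGrouped are defined, and the statement lambda means it only applies to in-memory sequences (LINQ to Objects).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstIdPerPairDemo
{
    // Hypothetical row shape; the real column types are not given in the thread.
    public static List<(int Id, string Col1, string Col2, decimal Value)> Run()
    {
        var nonGrouped = new List<(int Id, string Col1, string Col2, decimal Value)>
        {
            (1, "A", "B", 10m),
            (1, "A", "B", 5m),
            (2, "A", "B", 20m),
            (3, "C", "D", 40m),
        };

        return nonGrouped
            // Group by the column pair only, ignoring Id.
            .GroupBy(r => new { r.Col1, r.Col2 })
            .Select(g =>
            {
                // The first Id seen for this pair, in source order.
                var firstId = g.First().Id;
                // Sum only the rows that belong to that Id.
                var sum = g.Where(r => r.Id == firstId).Sum(r => r.Value);
                return (Id: firstId, Col1: g.Key.Col1, Col2: g.Key.Col2, Value: sum);
            })
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in Run())
            Console.WriteLine($"{row.Id}: {row.Col1}/{row.Col2} = {row.Value}");
    }
}
```

With this sample data, the A/B pair keeps Id 1 with a sum of 15 and the C/D pair keeps Id 3 with 40; the rows for Id 2 are dropped.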

Tags: c# .net linq

【Solution 1】:

I think you are looking for a custom Distinct over the grouped list:

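class MyDbTableComparer : IEqualityComparer<MyDbTable>
{
    public bool Equals(MyDbTable x, MyDbTable y)
    {
        // Mirrors the swapped Col1/Col2 comparison used in the question.
        return x.Col1 == y.Col2 && x.Col2 == y.Col1;
    }

    // Required by IEqualityComparer<T>; Distinct() checks the hash before Equals.
    public int GetHashCode(MyDbTable obj)
    {
        // XOR is order-independent, so swapped pairs hash alike (assumes non-null columns).
        return obj.Col1.GetHashCode() ^ obj.Col2.GetHashCode();
    }
}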

Then change the statement above to:

.Select(x => new MyDbTable
{
    Id = x.Key.Id,
    VALUE = x.Sum(y => y.Value),
    Col1 = x.Key.Col1,
    Col2 = x.Key.Col2
}).ToList().Distinct(new MyDbTableComparer());

I don't know, though, whether it will work properly if you call Distinct() before ToList().
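For anyone who wants to try the comparer idea in isolation, here is a minimal self-contained sketch. It is a variation, not the exact comparer from this answer: it assumes the intent is to treat a Col1/Col2 pair as a duplicate whether it appears in the same or in swapped order. The MyDbTable property types, the UnorderedPairComparer name, and the sample rows are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape; the real MyDbTable column types are not shown in the thread.
public class MyDbTable
{
    public int Id { get; set; }
    public string Col1 { get; set; }
    public string Col2 { get; set; }
    public decimal VALUE { get; set; }
}

// Assumption: rows are duplicates when they hold the same unordered Col1/Col2 pair.
public class UnorderedPairComparer : IEqualityComparer<MyDbTable>
{
    public bool Equals(MyDbTable x, MyDbTable y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return (x.Col1 == y.Col1 && x.Col2 == y.Col2)
            || (x.Col1 == y.Col2 && x.Col2 == y.Col1);
    }

    public int GetHashCode(MyDbTable obj)
    {
        // XOR is symmetric, so ("A","B") and ("B","A") produce the same hash.
        int h1 = obj.Col1 == null ? 0 : obj.Col1.GetHashCode();
        int h2 = obj.Col2 == null ? 0 : obj.Col2.GetHashCode();
        return h1 ^ h2;
    }
}

public static class DistinctDemo
{
    public static List<MyDbTable> Run()
    {
        var grouped = new List<MyDbTable>
        {
            new MyDbTable { Id = 1, Col1 = "A", Col2 = "B", VALUE = 10m },
            new MyDbTable { Id = 2, Col1 = "B", Col2 = "A", VALUE = 20m },
            new MyDbTable { Id = 3, Col1 = "A", Col2 = "B", VALUE = 30m },
            new MyDbTable { Id = 4, Col1 = "C", Col2 = "D", VALUE = 40m },
        };

        // Distinct keeps one element per equivalence class defined by the comparer.
        return grouped.Distinct(new UnorderedPairComparer()).ToList();
    }

    public static void Main()
    {
        foreach (var row in Run())
            Console.WriteLine($"{row.Id}: {row.Col1}/{row.Col2} = {row.VALUE}");
    }
}
```

Two rows remain here, one for the A/B pair and one for C/D. If nonGrouped comes from a database query provider, a custom comparer like this generally cannot be translated to SQL, which is one reason to materialize the results before calling Distinct().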

【Comments】:

【Solution 2】:

Can't you just do this?

var grouped = nonGrouped.GroupBy(x => new
            {
                x.Id,
                x.Col1,
                x.Col2
            }).Select(x => new MyDbTable
            {
                Id = x.Key.Id,
                VALUE = x.Sum(y => y.Value),
                Col1 = x.Key.Col1,
                Col2 = x.Key.Col2
            }).Where(z => z.Col1 != z.Col2).ToList();

Or have I completely misunderstood your question? It has happened before :)

【Comments】:

【Solution 3】:

This should work nicely for you:

var dbTableList =
    nonGrouped
        .GroupBy(x => new
        {
            x.Id,
            x.Col1,
            x.Col2
        })
        .Select(x => new MyDbTable
        {
            Id = x.Key.Id,
            VALUE = x.Sum(y => y.Value),
            Col1 = x.Key.Col1,
            Col2 = x.Key.Col2
        })
        .GroupBy(x => new
        {
            x.Col1,
            x.Col2
        })
        .SelectMany(xs => xs.Take(1))
        .ToList();

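The key to making this work is the GroupBy/SelectMany/Take(1) combination: the second GroupBy collects the rows that share a Col1/Col2 pair, and SelectMany with Take(1) keeps only the first row of each group.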
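To see the GroupBy/SelectMany/Take(1) combination run end to end, here is a small self-contained sketch over in-memory data. The tuple row shape, column types, and sample values are assumptions made for the example; the thread does not show the real definitions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstPerGroupDemo
{
    // Hypothetical row shape standing in for the rows of nonGrouped.
    public static List<(int Id, string Col1, string Col2, decimal Value)> Run()
    {
        var nonGrouped = new List<(int Id, string Col1, string Col2, decimal Value)>
        {
            (1, "A", "B", 10m),
            (1, "A", "B", 5m),
            (2, "A", "B", 20m),
            (3, "C", "D", 40m),
        };

        return nonGrouped
            // First pass: sum Value per (Id, Col1, Col2).
            .GroupBy(r => new { r.Id, r.Col1, r.Col2 })
            .Select(g => (Id: g.Key.Id, Col1: g.Key.Col1, Col2: g.Key.Col2,
                          Value: g.Sum(r => r.Value)))
            // Second pass: collect rows sharing a Col1/Col2 pair...
            .GroupBy(r => new { r.Col1, r.Col2 })
            // ...and keep only the first row of each group.
            .SelectMany(g => g.Take(1))
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in Run())
            Console.WriteLine($"{row.Id}: {row.Col1}/{row.Col2} = {row.Value}");
    }
}
```

With LINQ to Objects, groups and their elements come out in source order, so this sample yields Id 1 (A/B, sum 15) and Id 3 (C/D, sum 40). Select(g => g.First()) would be an equivalent way to write the last step, since every group has at least one element.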

【Comments】:

【Solution 4】:

nonGrouped
    .GroupBy(x => new { x.Col1, x.Col2 })
    // Keep only Col1/Col2 combinations that occur exactly once
    .Where(group => !group.Skip(1).Any())
    .SelectMany(group => group)
    .GroupBy(x => new { x.Id, x.Col1, x.Col2 });

might work.

I'm not sure what your logic is here, though (if you want to remove all rows with equal Col1 and Col2, why is grouping by Id, Col1 and Col2 necessary?).
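If the goal really is to drop every row whose Col1/Col2 combination occurs more than once (rather than keeping one representative), a filter on group size does that. The sketch below is a minimal illustration under that assumption; the tuple row shape and the sample data are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniquePairsDemo
{
    // Hypothetical row shape; the real column types are not given in the thread.
    public static List<(int Id, string Col1, string Col2)> Run()
    {
        var nonGrouped = new List<(int Id, string Col1, string Col2)>
        {
            (1, "A", "B"),
            (2, "A", "B"),
            (3, "C", "D"),
        };

        return nonGrouped
            .GroupBy(r => new { r.Col1, r.Col2 })
            // Skip(1).Any() is true when a group has two or more rows,
            // so the negation keeps only groups with exactly one row.
            .Where(g => !g.Skip(1).Any())
            // Flatten the surviving single-row groups back into rows.
            .SelectMany(g => g)
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in Run())
            Console.WriteLine($"{row.Id}: {row.Col1}/{row.Col2}");
    }
}
```

Only the C/D row (Id 3) survives here, because both A/B rows share a combination. Note that Take(1).Any() would not work as the test: every group has at least one element, so that check is always true.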

【Comments】: