C#: Merging a Dictionary and a List (Answers)

[Question title]: C#: Merging Dictionary and List
[Posted]: 2010-09-20 06:06:16
[Question]:

I have a List of String like

List<String> MyList=new List<String>{"A","B"};

and a

Dictionary<String, Dictionary<String,String>> MyDict=new Dictionary<String,Dictionary<String,String>>();

which contains

Key      Value
         Key      Value

"ONE"    "A_1"    "1"
         "A_2"    "2"
         "X_1"    "3"
         "X_2"    "4"
         "B_1"    "5"

"TWO"    "Y_1"    "1"
         "B_9"    "2"
         "A_4"    "3"
         "B_2"    "6"
         "X_3"    "7"
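
For anyone who wants to reproduce this, the data above could be set up roughly as follows. This is only a sketch: the SampleData class and its BuildList/BuildDict methods are hypothetical names, not part of the original post. Note that the expected result depends on the order in which the entries are enumerated, and Dictionary<TKey,TValue> does not formally guarantee any enumeration order, even though in practice entries usually come back in insertion order when nothing has been removed.

```csharp
using System.Collections.Generic;

// Hypothetical helper that rebuilds the sample data from the question.
public static class SampleData
{
    public static List<string> BuildList()
    {
        return new List<string> { "A", "B" };
    }

    public static Dictionary<string, Dictionary<string, string>> BuildDict()
    {
        return new Dictionary<string, Dictionary<string, string>>
        {
            {
                "ONE", new Dictionary<string, string>
                {
                    { "A_1", "1" }, { "A_2", "2" }, { "X_1", "3" },
                    { "X_2", "4" }, { "B_1", "5" }
                }
            },
            {
                "TWO", new Dictionary<string, string>
                {
                    { "Y_1", "1" }, { "B_9", "2" }, { "A_4", "3" },
                    { "B_2", "6" }, { "X_3", "7" }
                }
            }
        };
    }
}
```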

I need to merge the list and the dictionary into a new dictionary

Dictionary<String,String> ResultDict = new Dictionary<String,String>();

The resulting dictionary should contain

Key     Value

"A_1"   "1"
"A_2"   "2"
"B_1"   "5"
"A_4"   "3"
"B_2"   "6"
"X_2"   "4"
"X_3"   "7"

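Merge rules

First, add the items whose key prefix (the substring before the last '_') equals any item in the list.
Then merge in the remaining items from "MyDict", so that the result contains no duplicate keys and no duplicate values.

Here is my source code.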

        Dictionary<String, String> ResultDict = new Dictionary<string, string>();
        List<String> TempList = new List<string>(MyDict.Keys);

        // Pass 1: add only the entries whose key prefix (text before the last '_') is in MyList.
        for (int i = 0; i < TempList.Count; i++)
        {
            ResultDict = ResultDict.Concat(MyDict[TempList[i]])
                                   .Where(TEMP => MyList.Contains(TEMP.Key.Contains('_')
                                                      ? TEMP.Key.Substring(0, TEMP.Key.LastIndexOf('_'))
                                                      : TEMP.Key.Trim()))
                                   // Keep the first value seen for each key...
                                   .ToLookup(TEMP => TEMP.Key, TEMP => TEMP.Value)
                                   .ToDictionary(TEMP => TEMP.Key, TEMP => TEMP.First())
                                   // ...and the first key seen for each value.
                                   .GroupBy(pair => pair.Value)
                                   .Select(group => group.First())
                                   .ToDictionary(pair => pair.Key, pair => pair.Value);
        }

        // Pass 2: add the remaining entries, again without duplicate keys or values.
        for (int i = 0; i < TempList.Count; i++)
        {
            ResultDict = ResultDict.Concat(MyDict[TempList[i]])
                                   .ToLookup(TEMP => TEMP.Key, TEMP => TEMP.Value)
                                   .ToDictionary(TEMP => TEMP.Key, TEMP => TEMP.First())
                                   .GroupBy(pair => pair.Value)
                                   .Select(group => group.First())
                                   .ToDictionary(pair => pair.Key, pair => pair.Value);
        }

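It works fine, but I need to eliminate the two for loops, or at least one of them (in any way that uses LINQ or lambda expressions).

[Comments on the question]:

In your example, MyList doesn't include "X", so why are "X_2" and "X_3" in the result dictionary?
It would also be useful to clarify how you want the "unique" rule to work. If two items have the same key but different values, should one be excluded? If two items have the same value but different keys, should one be excluded? In each case, if so, which item should be excluded? Perhaps you could include an example of each of the three "distinct" cases in your sample?
Why do you want to get rid of the two for loops?
@Daniel Renshaw: In merge rule 1, I mentioned "First add the items whose substring equals any item in the list." So after that I need to add the remaining elements (without duplicate keys or values).
@Daniel Renshaw: I need to keep the first value (key-value pair). The result should not contain the same key, the same value, or the same key-value pair twice. The code above works fine. I need to do this without using for loops.

Tags: c# list dictionary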

[Solution 1]:

Here is one way to do this with LINQ and lambdas, as requested:

var keysFromList = new HashSet<string>(MyList);
var results =
    MyDict.Values
          .SelectMany(x => x)
          .OrderBy(x => {
                            int i = x.Key.LastIndexOf('_');
                            string k = (i < 0) ? x.Key.Trim() 
                                               : x.Key.Substring(0, i);
                            return keysFromList.Contains(k) ? 0 : 1;
                        })
          .Aggregate(new {
                             Results = new Dictionary<string, string>(),
                             Values = new HashSet<string>()
                         },
                     (a, x) => {
                                   if (!a.Results.ContainsKey(x.Key)
                                           && !a.Values.Contains(x.Value))
                                   {
                                       a.Results.Add(x.Key, x.Value);
                                       a.Values.Add(x.Value);
                                   }
                                   return a;
                               },
                     a => a.Results);

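[Discussion]:

Nice solution :) Twice the speed of my foreach one, but most likely that doesn't matter, since both are faster than the original.
@Mikael: Really? I haven't benchmarked, but I'd expect your version to be faster, if anything, especially for larger data sets. Yours looks like O(n) to me, whereas using OrderBy makes mine roughly O(n log n). (I originally wrote something almost identical to yours, then noticed you had already posted it, so I posted the LINQ version instead!)
I meant that mine was faster :) I can see now that I used the wrong words in that sentence.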
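
To illustrate the complexity point above: the OrderBy is only there to put the list-matching entries first, and the same ordering can be obtained in linear time by filtering twice and concatenating. The following is a minimal sketch of that idea, not the code of either answer. MergeSketch, Merge and Prefix are hypothetical names, and the final step uses a plain foreach rather than Aggregate for readability.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MergeSketch
{
    // Hypothetical helper: the text before the last '_',
    // or the trimmed key when there is no '_'.
    private static string Prefix(string key)
    {
        int i = key.LastIndexOf('_');
        return i < 0 ? key.Trim() : key.Substring(0, i);
    }

    public static Dictionary<string, string> Merge(
        IEnumerable<string> list,
        Dictionary<string, Dictionary<string, string>> dict)
    {
        var prefixes = new HashSet<string>(list);

        // Flatten all inner dictionaries once, in enumeration order.
        var all = dict.Values.SelectMany(inner => inner).ToList();

        // Stable partition: list-matching entries first, then the rest.
        var ordered = all.Where(p => prefixes.Contains(Prefix(p.Key)))
                         .Concat(all.Where(p => !prefixes.Contains(Prefix(p.Key))));

        var result = new Dictionary<string, string>();
        var seenValues = new HashSet<string>();
        foreach (var pair in ordered)
        {
            // Skip duplicate keys; HashSet.Add returns false for a value already seen.
            if (!result.ContainsKey(pair.Key) && seenValues.Add(pair.Value))
            {
                result.Add(pair.Key, pair.Value);
            }
        }
        return result;
    }
}
```

With the sample data from the question, and assuming the dictionaries enumerate in insertion order, this should produce the seven entries listed there. Like Solution 1, it keeps the first entry encountered for each key and for each value.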

[Solution 2]:

Loop-wise, this code is simpler, but it is not LINQ:

public static Dictionary<string, string> Test()
{
    int initcount = _myDict.Sum(keyValuePair => keyValuePair.Value.Count);

    var usedValues = new Dictionary<string, string>(initcount); //reverse val/key
    var result = new Dictionary<string, string>(initcount);
    foreach (KeyValuePair<string, Dictionary<string, string>> internalDicts in _myDict)
    {
        foreach (KeyValuePair<string, string> valuePair in internalDicts.Value)
        {
            bool add = false;
            if (KeyInList(_myList, valuePair.Key))
            {
                string removeKey;
                if (usedValues.TryGetValue(valuePair.Value, out removeKey))
                {
                    if (KeyInList(_myList, removeKey)) continue;
                    result.Remove(removeKey);
                }
                usedValues.Remove(valuePair.Value);
                add = true;
            }
            if (!add && usedValues.ContainsKey(valuePair.Value)) continue;
            result[valuePair.Key] = valuePair.Value;
            usedValues[valuePair.Value] = valuePair.Key;
        }
    }
    return result;
}

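private static bool KeyInList(List<string> myList, string subKey)
{
    // The prefix is the text before the last '_'. Fall back to the trimmed key when
    // there is no '_', because Substring(0, -1) would throw ArgumentOutOfRangeException.
    int i = subKey.LastIndexOf('_');
    string key = (i < 0) ? subKey.Trim() : subKey.Substring(0, i);
    return myList.Contains(key);
}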

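[Discussion]:

@Mikael Svenson: This approach is good, but it takes more time to execute. I have a large amount of data to process. It takes about 12.45 seconds, whereas the other one takes only 0.016 seconds.
I'll fiddle with it some more and see if I can squeeze something else out of my head.
Just benchmarked with a small amount of test data, and my version is 5 times faster than yours. Try initializing the dictionaries with the expected number of items.
Edited my code to work with larger sets. Swapped result.ContainsValue(...) for usedValues.ContainsKey(...). That saves considerable time, because ContainsValue is O(n) whereas ContainsKey is O(1).
If _myList might contain many values, you could squeeze out some extra performance by populating a HashSet<> before starting the loop: HashSet<>.Contains is O(1), whereas List<>.Contains is O(n).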
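
The HashSet<> suggestion in the last comment might look roughly like this. It is only a sketch: PrefixLookup, BuildPrefixSet and KeyInSet are hypothetical names, intended as a drop-in alternative to KeyInList from Solution 2.

```csharp
using System.Collections.Generic;

public static class PrefixLookup
{
    // Build the set once, before the loops start.
    public static HashSet<string> BuildPrefixSet(IEnumerable<string> myList)
    {
        return new HashSet<string>(myList);
    }

    // Same check as KeyInList, but against a HashSet, so each lookup
    // is O(1) on average instead of a linear scan of the list.
    public static bool KeyInSet(HashSet<string> prefixes, string subKey)
    {
        int i = subKey.LastIndexOf('_');
        string key = i < 0 ? subKey.Trim() : subKey.Substring(0, i);
        return prefixes.Contains(key);
    }
}
```

In Test(), the set would be created once at the top, for example var prefixes = PrefixLookup.BuildPrefixSet(_myList);, and each KeyInList(_myList, ...) call would be replaced with PrefixLookup.KeyInSet(prefixes, ...). For a two-item list like the one in the question, the difference is unlikely to be measurable.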