Asymptotic complexity of .NET collection classes: answers

【Question title】: Asymptotic complexity of .NET collection classes
【Posted】: 2010-10-25 12:22:28
【Question】:

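Are there any resources on the asymptotic complexity (big-O and the rest) of the methods of the .NET collection classes (Dictionary<K,V>, List<T>, etc.)?

I know that the C5 library's documentation includes some information about it (example), but I'm interested in the standard .NET collections too... (and information on PowerCollections would also be nice).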

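【Comments】:

By the complexity of a class, I would understand cyclomatic complexity rather than asymptotic time/space complexity. I would attribute the latter to the operations within a class.
You could always write a program to time the particular function you're interested in, plotting the results against N for various input patterns. I think the main reason time complexity isn't documented is that it is an implementation detail, so the .NET team reserves the right to change those implementation details in the future. As such, the specification of these classes is based on their functionality rather than their performance. If a specific performance characteristic is very important for your requirements, it may be best to implement the algorithm yourself.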
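
The timing approach suggested in the comment above could look roughly like the sketch below. It is only an illustration: `TimeFrontInserts` is a hypothetical helper name, the sizes are arbitrary, and a single Stopwatch run is noisy, so treat the numbers as a trend rather than a measurement.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: time n insertions at the front of a List<int>.
static TimeSpan TimeFrontInserts(int n)
{
    var list = new List<int>();
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < n; i++)
    {
        list.Insert(0, i); // every existing element is shifted by one slot
    }
    stopwatch.Stop();
    return stopwatch.Elapsed;
}

// Double N each time; if one insert is O(N), the total should roughly quadruple.
var timings = new Dictionary<int, TimeSpan>();
foreach (int n in new[] { 10_000, 20_000, 40_000 })
{
    timings[n] = TimeFrontInserts(n);
    Console.WriteLine($"N = {n,6}: {timings[n].TotalMilliseconds:F2} ms");
}
```

Plotting the printed values against N (or just comparing the ratios between successive rows) gives a rough empirical growth rate for the operation being timed.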

Tags: .net collections big-o asymptotic-complexity

【Answer 1】:

MSDN lists these:

Dictionary<,>
List<>
SortedList<,> (edit: wrong link; here's the generic version)
SortedDictionary<,>

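and so on. For example:

The SortedList(TKey, TValue) generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this, it is similar to the SortedDictionary(TKey, TValue) generic class. The two classes have similar object models, and both have O(log n) retrieval. Where the two classes differ is in memory use and speed of insertion and removal:

SortedList(TKey, TValue) uses less memory than SortedDictionary(TKey, TValue).

SortedDictionary(TKey, TValue) has faster insertion and removal operations for unsorted data, O(log n) as opposed to O(n) for SortedList(TKey, TValue).

If the list is populated all at once from sorted data, SortedList(TKey, TValue) is faster than SortedDictionary(TKey, TValue).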
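
To make the quoted comparison concrete, here is a small sketch that fills both collections with the same keys. The keys and values are made up for illustration; the point is only which operations each class exposes and where the quoted costs apply.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] unsortedKeys = { 5, 1, 4, 2, 3 };

var sortedList = new SortedList<int, string>();
var sortedDictionary = new SortedDictionary<int, string>();
foreach (int key in unsortedKeys)
{
    sortedList.Add(key, $"value {key}");       // O(n) per the quote: may shift array elements
    sortedDictionary.Add(key, $"value {key}"); // O(log n) per the quote: tree insertion
}

// Both collections enumerate their keys in ascending order.
bool sameOrder = sortedList.Keys.SequenceEqual(sortedDictionary.Keys);
Console.WriteLine(sameOrder);

// SortedList additionally allows positional access to its keys.
int smallestKey = sortedList.Keys[0];
Console.WriteLine(smallestKey);

// Keys arriving already sorted are always appended at the end,
// which is the cheap case for SortedList described in the quote.
var appended = new SortedList<int, string>();
for (int key = 0; key < 1000; key++)
{
    appended.Add(key, "x");
}
```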

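【Comments】:

That (old, since-removed) quote confuses a binary search tree with a collection based on a sorted array. en.wikipedia.org/wiki/Binary_search_tree
Note where they list the O notation. "The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table."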

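【Answer 2】:

This page summarises some of the time complexities for various collection types in Java, though they should be exactly the same for .NET.

I've taken the tables from that page and altered/expanded them for the .NET framework. See also the MSDN pages for SortedDictionary and SortedList, which detail the time complexities required for various operations.

Searching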

Type of search/Collection types	Complexity	Comments
Linear search Array/ArrayList/LinkedList	O(N)	Unsorted data.
Binary search sorted Array/ArrayList/	O(log N)	Requires sorted data.
Search Hashtable/Dictionary	O(1)	Uses hash function.
Binary search SortedDictionary/SortedKey	O(log N)	Sorting is automated.
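
As a rough illustration of the three kinds of search in the table, the sketch below runs each one over the same small data set. The sample values and variable names are invented for the example.

```csharp
using System;
using System.Collections.Generic;

var sorted = new List<int> { 2, 3, 5, 7, 11, 13 };

// Linear search: scans from the front, O(N); works on unsorted data too.
int linearIndex = sorted.IndexOf(7);

// Binary search: O(log N), but only valid because the list is sorted.
int binaryIndex = sorted.BinarySearch(7);

// For a missing value, BinarySearch returns a negative number whose
// bitwise complement is the index where the value would be inserted.
int missing = sorted.BinarySearch(8);
int insertionPoint = ~missing;

// Hash lookup: close to O(1), no ordering required.
var indexByValue = new Dictionary<int, int>();
for (int i = 0; i < sorted.Count; i++)
{
    indexByValue[sorted[i]] = i;
}
bool found = indexByValue.TryGetValue(7, out int hashedIndex);

Console.WriteLine($"{linearIndex} {binaryIndex} {insertionPoint} {found} {hashedIndex}");
```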

Retrieval and Insertion

Operation	Array/ArrayList	LinkedList	SortedDictionary	SortedList
Access back	O(1)	O(1)	O(log N)	O(log N)
Access front	O(1)	O(1)	N.A.	N.A.
Access middle	O(1)	O(N)	N.A.	N.A.
Insert at back	O(1)	O(1)	O(log N)	O(N)
Insert at front	O(N)	O(1)	N.A.	N.A.
Insert in middle	O(N)	O(1)	N.A.	N.A.

Deletion should have the same complexity as insertion for the associated collection.
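
A short sketch of the insertion rows for List<T> and LinkedList<T> follows. One caveat worth spelling out: the O(1) figure for inserting into the middle of a linked list presumably assumes you already hold a reference to the neighbouring node; finding that node by value is itself O(N). The values here are arbitrary.

```csharp
using System;
using System.Collections.Generic;

var list = new List<int> { 1, 2, 3 };
list.Add(4);        // insert at back: amortized O(1)
list.Insert(0, 0);  // insert at front: O(N), shifts every element

var linked = new LinkedList<int>(new[] { 1, 2, 3 });
linked.AddLast(4);  // O(1)
linked.AddFirst(0); // O(1)

// Locating the node by value is a linear scan...
var node = linked.Find(2);
if (node != null)
{
    // ...but once the node is known, the insertion itself is O(1).
    linked.AddAfter(node, 99);
}

string listContents = string.Join(",", list);
string linkedContents = string.Join(",", linked);
Console.WriteLine(listContents);
Console.WriteLine(linkedContents);
```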

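SortedList has a few notable peculiarities for insertion and retrieval.

Insertion (Add method):

This method is an O(n) operation for unsorted data, where n is Count. It is an O(log n) operation if the new element is added at the end of the list. If insertion causes a resize, the operation is O(n).

Retrieval (Item property):

Retrieving the value of this property is an O(log n) operation, where n is Count. Setting the property is an O(log n) operation if the key is already in the SortedList(TKey, TValue). If the key is not in the list, setting the property is an O(n) operation for unsorted data, or O(log n) if the new element is added at the end of the list. If insertion causes a resize, the operation is O(n).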
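
The cases described for Add and the Item property (the indexer) can be seen in a few lines. This is only a sketch with made-up keys; the complexity comments simply restate the documentation quoted above.

```csharp
using System;
using System.Collections.Generic;

var prices = new SortedList<string, decimal>();

prices.Add("banana", 0.25m);   // first element, nothing to shift

// Key not present: the indexer inserts it. "apple" sorts before "banana",
// so existing entries are shifted (the O(n) case).
prices["apple"] = 0.40m;

// Key already present: the indexer overwrites in place (the O(log n) case).
prices["banana"] = 0.30m;

// Retrieval by key is a binary search over the sorted keys.
decimal applePrice = prices["apple"];
int bananaIndex = prices.IndexOfKey("banana");

Console.WriteLine($"{applePrice} {bananaIndex} {prices.Count}");
```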

Note that ArrayList is equivalent to List<T> in terms of the complexity of all operations.

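【Comments】:

Are you sure the complexities should be the same for .NET? I'd think it's more subtle than that - for example, there are differences between SortedDictionary, SortedList and Hashtable in .NET.
Yes, there's no fundamental difference - the basic algorithms and data structures are going to be virtually identical. I hadn't detailed SortedDictionary/SortedList, but I'll add them now. I believe Hashtable should have the same complexities as Dictionary (it's pretty much a non-generic version of it).
There's no guarantee that the underlying implementation is comparable.
No, but that is the case for the official .NET implementation.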

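【Answer 3】:

I don't know in general (the other answer just posted may give you exactly what you're after) - but you can of course reflect this and other methods using ILSpy (a little awkward with FSharp code, admittedly), and this eventually yields the following function as C#: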

internal static a maximumElementAux<a>(SetTree<a> s, a n)
{
  while (true)
  {
    SetTree<a> setTree = s;
    if (setTree is SetTree<a>.SetOne)
    {
      break;
    }
    if (setTree == null)
    {
      return n;
    }
    SetTree<a>.SetNode setNode = (SetTree<a>.SetNode)s;
    SetTree<a> arg_23_0 = setNode.item3;
    n = setNode.item1;
    s = arg_23_0;
  }
  return ((SetTree<a>.SetOne)s).item;
  return n;
}

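OK, so this isn't exactly "proper" code in C# terms - but the presence of the while(true) loop means it can't be O(1) at least; as for what it actually is... well, my head hurts too much to work it out :)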
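
The loop in the decompiled function keeps following one child reference until it runs out of nodes, which is the usual way to find the maximum of a binary search tree. The sketch below shows that idea on a hypothetical array-encoded tree (the `keys` and `right` arrays are an assumption made for the example, not how FSharp.Core stores its sets). The number of steps equals the length of the right spine, so the cost is O(height), which is O(log n) only if the tree is kept balanced.

```csharp
using System;

// Hypothetical encoding: right[i] is the index of node i's right child, or -1.
// The tree is: 8 -> (3 -> (1, 6), 10 -> (-, 14))
int[] keys  = { 8, 3, 10, 1, 6, 14 };
int[] right = { 2, 4, 5, -1, -1, -1 };
int root = 0;

int steps = 0;
int current = root;
while (right[current] != -1)
{
    current = right[current]; // always move to the larger side
    steps++;
}
int max = keys[current];

Console.WriteLine($"max = {max}, found after {steps} steps");
```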

【Comments】:

FYI: merged from stackoverflow.com/questions/6313896/…

【Answer 4】:

This page gives short notes on some of the key pros and cons of most .NET collections:

http://geekswithblogs.net/BlackRabbitCoder/archive/2011/06/16/c.net-fundamentals-choosing-the-right-collection-class.aspx

Collection	Ordering	Contiguous Storage	Direct Access	Lookup Efficiency	Manipulate Efficiency	Notes
Dictionary	Unordered	Yes	Via Key	Key: O(1)	O(1)	Best for high performance lookups.
SortedDictionary	Sorted	No	Via Key	Key: O(log n)	O(log n)	Compromise of Dictionary speed and ordering, uses binary search tree.
SortedList	Sorted	Yes	Via Key	Key: O(log n)	O(n)	Very similar to SortedDictionary, except tree is implemented in an array, so has faster lookup on preloaded data, but slower loads.
List	User has precise control over element ordering	Yes	Via Index	Index: O(1) Value: O(n)	O(n)	Best for smaller lists where direct access required and no sorting.
LinkedList	User has precise control over element ordering	No	No	Value: O(n)	O(1)	Best for lists where inserting/deleting in middle is common and no direct access required.
HashSet	Unordered	Yes	Via Key	Key: O(1)	O(1)	Unique unordered collection, like a Dictionary except key and value are same object.
SortedSet	Sorted	No	Via Key	Key: O(log n)	O(log n)	Unique sorted collection, like SortedDictionary except key and value are same object.
Stack	LIFO	Yes	Only Top	Top: O(1)	O(1)*	Essentially same as List except only process as LIFO
Queue	FIFO	Yes	Only Front	Front: O(1)	O(1)	Essentially same as List except only process as FIFO
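
A few of the rows above, shown as a minimal sketch (the sample values are arbitrary): the set types reject duplicates, SortedSet keeps its elements ordered, and Stack and Queue only expose one end.

```csharp
using System;
using System.Collections.Generic;

// HashSet: unique, unordered; Add reports whether the item was new.
var seen = new HashSet<string>();
bool firstAdd = seen.Add("a");
bool secondAdd = seen.Add("a");

// SortedSet: unique and kept in sorted order.
var sortedSet = new SortedSet<int> { 5, 1, 3 };
int smallest = sortedSet.Min;
int largest = sortedSet.Max;

// Stack: last in, first out.
var stack = new Stack<int>();
stack.Push(1);
stack.Push(2);
stack.Push(3);
int popped = stack.Pop();

// Queue: first in, first out.
var queue = new Queue<int>();
queue.Enqueue(1);
queue.Enqueue(2);
queue.Enqueue(3);
int dequeued = queue.Dequeue();

Console.WriteLine($"{firstAdd} {secondAdd} {smallest} {largest} {popped} {dequeued}");
```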

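【Comments】:

The link is dead, which is why it's best to quote the relevant content as well; now people can't refer to this possibly useful information.
Luckily it is backed up on the Internet Archive here: web.archive.org/web/20121022141414/http://geekswithblogs.net/…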

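【Answer 5】:

There's no such thing as "complexity of collection classes". Rather, different operations on these collections have different complexities. For instance, adding an element to a Dictionary<K, V>...

...approaches an O(1) operation. If the capacity must be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.

Whereas retrieving an element from a Dictionary<K, V>...

...approaches an O(1) operation.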
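
For illustration, the sketch below shows adding and retrieving, plus one common way to avoid the occasional O(n) resize when the final size is known in advance: passing an initial capacity to the constructor. The keys, values and the figure of 100,000 are arbitrary.

```csharp
using System;
using System.Collections.Generic;

var counts = new Dictionary<string, int>();
counts.Add("pear", 2);   // close to O(1), unless the table has to grow
counts["apple"] = 1;     // the indexer also adds when the key is absent

// Retrieval is close to O(1); TryGetValue avoids an exception for missing keys.
bool hasPear = counts.TryGetValue("pear", out int pearCount);
bool hasPlum = counts.TryGetValue("plum", out int plumCount);

// Pre-sizing means the adds below should not need to grow the table.
var squares = new Dictionary<int, int>(capacity: 100_000);
for (int i = 0; i < 100_000; i++)
{
    squares.Add(i, i * i);
}

Console.WriteLine($"{hasPear} {pearCount} {hasPlum} {squares.Count}");
```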

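【Comments】:

I meant their operations; I've edited the question to make that clearer.

【Answer 6】:

The documentation says it is built on a binary tree and doesn't mention tracking the maximum element. If the documentation is correct, that means it should be O(log n). There used to be at least one mistake in the collections documentation (referring to an array-backed data structure as a binary search tree), but that has been corrected.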

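【Comments】:

To be fair, an array is perfectly good storage for a binary tree. See: webdocs.cs.ualberta.ca/~holte/T26/tree-as-array.html
Yes and no. Yes, since of course it all maps onto main memory, which provides an array-like interface (though one that strongly favours access to data in the same cache line). No, since this doesn't provide a reasonable implementation for any but the smallest (and balanced) trees. A multiway tree fits current processor designs much better.
FYI: merged from stackoverflow.com/questions/6313896/…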