【Question title】: What is the difference between chain and chain.from_iterable in itertools?
【Posted】: 2013-02-06 22:12:25
【Question description】:

I could not find any valid example on the internet where I can see the difference between them and why to choose one over the other.

【Comments】:

    Tags: python iterator itertools


    【Answer 1】:

    The first takes 0 or more arguments, each an iterable; the second takes a single argument that is expected to produce the iterables:

    from itertools import chain
    
    chain(list1, list2, list3)
    
    iterables = [list1, list2, list3]
    chain.from_iterable(iterables)
    

    iterables can be any iterator that yields the iterables:

    def gen_iterables():
        for i in range(10):
            yield range(i)
    
    chain.from_iterable(gen_iterables())
    

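    Using the second form is usually a matter of convenience, but because it loops over the input iterables lazily, it is also the only way to chain an infinite number of finite iterators: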

    def gen_iterables():
        while True:
            for i in range(5, 10):
                yield range(i)
    
    chain.from_iterable(gen_iterables())
    

    The above example gives you an iterable that yields a cyclic pattern of numbers and never stops, yet never consumes more memory than a single range() call requires.
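
A minimal sketch of how such an endless chain can be consumed safely, assuming itertools.islice is used to take a bounded prefix. The name first_twelve and the slice length are illustrative and not part of the answer:

```python
from itertools import chain, islice

def gen_iterables():
    # Endless stream of finite iterables: range(5) .. range(9), repeated forever.
    while True:
        for i in range(5, 10):
            yield range(i)

flat = chain.from_iterable(gen_iterables())

# islice stops after 12 items, so the infinite source is never exhausted.
first_twelve = list(islice(flat, 12))
print(first_twelve)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0]
```

Calling list(flat) directly would never return, which is why a bounded consumer such as islice is used here.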

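    【Comments】:

    • I still can't get it. Can you show me the difference in output in a practical situation, and the use cases for each?
    • @user1994660: There is no output difference. It is an input difference. It makes certain inputs easier to use.
    • @user1994660: I use the second form in this answer.
    • @user1994660: Run this code:

          import itertools

          # Return an iterator of iterators
          def it_it():
              return iter([iter([11, 22]), iter([33, 44])])

          print(list(itertools.chain.from_iterable(it_it())))
          print(list(itertools.chain(it_it())))
          print(list(itertools.chain(*it_it())))

      The first is best. The second does not reach the nested iterators: it returns the iterators rather than the desired numbers. The third produces the correct output but is not fully lazy: the "*" forces all the iterators to be created. For this silly input that doesn't matter.
    • Note that if iterables is not too large, you can also use itertools.chain(*iterables).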
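
The laziness difference described in the comments can be observed directly. The following is a small sketch, assuming a hypothetical helper logged_iterables that records each inner iterable at the moment it is produced (neither the helper nor the produced list comes from the thread):

```python
from itertools import chain

produced = []  # records which inner iterables have been created so far

def logged_iterables():
    # Hypothetical helper: notes each inner list at the moment it is yielded.
    for i in range(3):
        produced.append(i)
        yield [i, i * 10]

lazy = chain.from_iterable(logged_iterables())
lazy_before = list(produced)      # nothing has been pulled yet
first_item = next(lazy)
lazy_after = list(produced)       # only the first inner list was needed

produced.clear()
eager = chain(*logged_iterables())
eager_before = list(produced)     # "*" already drained the generator

print(lazy_before, lazy_after, eager_before)  # [] [0] [0, 1, 2]
```

Both objects yield the same items in the end; only the moment at which the outer iterable is consumed differs.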
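    【Answer 2】:

    I could not find any valid example ... where I can see the difference between them [chain and chain.from_iterable] and why to choose one over the other

    The accepted answer is thorough. For those seeking a quick application, consider flattening several lists: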

    import itertools

    list(itertools.chain(["a", "b", "c"], ["d", "e"], ["f"]))
    # ['a', 'b', 'c', 'd', 'e', 'f']
    

    You may wish to reuse these lists later, so you make an iterable of lists:

    iterable = (["a", "b", "c"], ["d", "e"], ["f"])
    

    Attempt

    However, passing a single iterable to chain gives an unflattened result:

    list(itertools.chain(iterable))
    # [['a', 'b', 'c'], ['d', 'e'], ['f']]
    

    Why? You passed in one item (a tuple). chain needs each list as a separate argument.


    Solutions

    When possible, you can unpack the iterable:

    list(itertools.chain(*iterable))
    # ['a', 'b', 'c', 'd', 'e', 'f']
    
    list(itertools.chain(*iter(iterable)))
    # ['a', 'b', 'c', 'd', 'e', 'f']
    

    More generally, use .from_iterable (as it also works with infinite iterators):

    list(itertools.chain.from_iterable(iterable))
    # ['a', 'b', 'c', 'd', 'e', 'f']
    
    g = itertools.chain.from_iterable(itertools.cycle(iterable))
    next(g)
    # "a"
    

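    【Comments】:

      【Answer 3】:

      They do very similar things. For a small number of iterables, itertools.chain(*iterables) and itertools.chain.from_iterable(iterables) perform similarly.

      The key advantage of from_iterable lies in its ability to handle a large (potentially infinite) number of iterables, since they do not all need to be available at the time of the call.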

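      【Comments】:

      • Does anyone know whether the * operator unpacks iterables lazily?
      • @Rotareti, yes, it does unpack lazily (one at a time), but in this case itertools.chain(*iterables) is a function call. All arguments must be present at the time of the call.
      • Is this true? From the CPython code, it seems to be the same: stackoverflow.com/a/62513808/610569
      • @alvas Try changing the number of elements to something very large; in the range 10_000 to 1_000_000 you will see from_iterable become faster.
      【Answer 4】: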

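      Another way to see it:

      chain(iterable1, iterable2, iterable3, ...) is for when you already know which iterables you have, so you can write them out as comma-separated arguments.

      chain.from_iterable(iterable) is for when your iterables (like iterable1, iterable2, iterable3) are obtained from another iterable.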
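
As an illustration of this distinction, here is a short sketch in which the inner iterables are produced by a generator expression. The sample lines and the variable names are made up for the example:

```python
from itertools import chain

lines = ["the quick brown", "fox jumps", "over the lazy dog"]

# Known up front: pass each iterable as its own argument.
fixed = list(chain(lines[0].split(), lines[1].split(), lines[2].split()))

# Produced by another iterable (here a generator expression): use from_iterable.
words = list(chain.from_iterable(line.split() for line in lines))

print(words)
```

The second form keeps working unchanged if lines grows or is replaced by a file object or another lazy source.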

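      【Comments】:

        【Answer 5】:

        Extending @martijn-pieters' answer

        Access to the inner items of the iterables remains the same, and implementation-wise,

        • itertools_chain_from_iterable (i.e. chain.from_iterable in Python) and
        • chain_new (i.e. chain in Python)

        both build on chain_new_internal in the CPython implementation.


        Are there any optimization benefits to using chain.from_iterable(x), where x is an iterable of iterables, when the main purpose is to ultimately consume the flattened list of items?

        We can try to benchmark it with: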

        import random
        from itertools import chain
        from functools import wraps
        from time import time
        
        def timing(f):
            @wraps(f)
            def wrap(*args, **kw):
                ts = time()
                result = f(*args, **kw)
                te = time()
                print('func:%r args:[%r, %r] took: %2.4f sec' % (f.__name__, args, kw, te-ts))
                return result
            return wrap
        
        def generate_nm(m, n):
            # Generator function that yields exactly one item: a generator expression
            # producing m shuffled lists, each holding the n integers of range(n).
            yield iter(random.sample(range(n), n) for _ in range(m))
            
        
        def chain_star(x):
            # Stores an iterable that will unpack and flatten the list of list.
            chain_x = chain(*x)
            # Consumes the items in the flatten iterable.
            for i in chain_x:
                pass
        
        def chain_from_iterable(x):
            # Stores an iterable that will unpack and flatten the list of list.
            chain_x = chain.from_iterable(x)
            # Consumes the items in the flatten iterable.
            for i in chain_x:
                pass
        
        
        @timing
        def versus(f, m, n):
            # Time f on freshly generated input; the data is produced lazily inside f.
            f(generate_nm(m, n))
        

        P/S: The benchmark is running ... waiting for the results.


        Results

        chain_star, m=1000, n=1000

        for _ in range(10):
            versus(chain_star, 1000, 1000)
        

        [out]:

        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6494 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6603 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6367 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6350 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6296 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6399 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6341 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6381 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6343 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6309 sec
        

        chain_from_iterable, m=1000, n=1000

        for _ in range(10):
            versus(chain_from_iterable, 1000, 1000)
        

        [out]:

        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6416 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6315 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6535 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6334 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6327 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6471 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6426 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6287 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6353 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6297 sec
        

        chain_star, m=10000, n=1000

        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2659 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2966 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2953 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.3141 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2802 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2799 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2848 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.3299 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2730 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.3052 sec
        

        chain_from_iterable, m=10000, n=1000

        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.3129 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.3064 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.3071 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2660 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2837 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2877 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2756 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2939 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2715 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2877 sec
        

        chain_star, m=100000, n=1000

        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.7874 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.3744 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.5584 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.3745 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.7982 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.4054 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.6769 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.6476 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.7397 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.8980 sec
        

        chain_from_iterable, m=100000, n=1000

        for _ in range(10):
            versus(chain_from_iterable, 100000, 1000)
        

        [out]:

        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7227 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7717 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7159 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7569 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7906 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.6211 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7294 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.8260 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.8356 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.9738 sec
        

        chain_star, m=500000, n=1000

        for _ in range(3):
            versus(chain_star, 500000, 1000)
        

        [out]:

        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 500000, 1000), {}] took: 314.5671 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 500000, 1000), {}] took: 313.9270 sec
        func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 500000, 1000), {}] took: 313.8992 sec
        

        chain_from_iterable, m=500000, n=1000

        for _ in range(3):
            versus(chain_from_iterable, 500000, 1000)
        

        [out]:

        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 500000, 1000), {}] took: 313.8301 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 500000, 1000), {}] took: 313.8104 sec
        func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 500000, 1000), {}] took: 313.9440 sec
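
Because generate_nm is lazy, the timings above also include the cost of producing the random data inside the timed call. A possible alternative, sketched here under the assumption that the input lists are built before timing starts, uses the standard timeit module. The helper names, sizes, and repetition count are illustrative only, and no particular outcome is implied:

```python
import timeit
from itertools import chain

def build_lists(m, n):
    # Materialize m lists of n integers up front so generation is not timed.
    return [list(range(n)) for _ in range(m)]

def consume_star(lists):
    # Unpack the outer list into arguments, then drain the chain.
    for _ in chain(*lists):
        pass

def consume_from_iterable(lists):
    # Hand the outer list over as a single argument, then drain the chain.
    for _ in chain.from_iterable(lists):
        pass

data = build_lists(200, 100)
t_star = timeit.timeit(lambda: consume_star(data), number=20)
t_from = timeit.timeit(lambda: consume_from_iterable(data), number=20)

print(f"chain(*lists):              {t_star:.4f} s")
print(f"chain.from_iterable(lists): {t_from:.4f} s")
```

Increasing m while keeping n small shifts the emphasis toward the cost of argument unpacking, which is the part where the two forms differ.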
        

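        【Comments】:

          【Answer 6】:

          Another way to look at it is to use chain.from_iterable

          when you have an iterable of iterables, such as a nested iterable (or a compound iterable), and use chain for simple iterables.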
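
A brief sketch of what this means for nested data, assuming a made-up nested value: chain.from_iterable removes exactly one level of nesting and leaves deeper levels untouched.

```python
from itertools import chain

# Illustrative input: the first inner list itself contains a list.
nested = [[1, [2, 3]], [4], (5, 6)]

flat_once = list(chain.from_iterable(nested))
print(flat_once)  # [1, [2, 3], 4, 5, 6]
```

Flattening deeper structures would need either repeated application or a recursive helper, which is outside what chain provides.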

          【Comments】:
