Finding the index of an element in a sorted list efficiently

【Question title】: Finding the index of an element in a sorted list efficiently
【Posted】: 2014-04-30 22:51:08
【Question description】:

I have a sorted list l (about 20,000 elements) and want to find the first element in l that exceeds a given value t_min. Currently, my code is as follows.

def find_index(l, t_min):
    # Scan for the first element greater than t_min (linear time).
    first = next((t for t in l if t > t_min), None)
    if first is None:
        return None
    else:
        return l.index(first)

To benchmark the code, I ran a test loop under cProfile and compared its time against a control loop, which removes the time needed to generate the random list:

import numpy
import cProfile

def test_loop(n):
    for _ in range(n):
        test_l=sorted(numpy.random.random_sample(20000))
        find_index(test_l, 0.5)

def control_loop(n):
    for _ in range(n):
        test_l=sorted(numpy.random.random_sample(20000))

# cProfile.run('test_loop(1000)') takes 10.810 seconds
# cProfile.run('control_loop(1000)') takes 9.650 seconds

Each call to find_index takes about 1.16 ms. Given that we know the list is sorted, is there a way to make the code more efficient?
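As an alternative way to isolate the cost of find_index, the list can be built once and only the calls timed. The following is a minimal sketch using the standard-library timeit module, not the measurement setup used above; the names n_calls and per_call_ms are illustrative, and random.random stands in for numpy.random.random_sample.

```python
import random
import timeit

def find_index(l, t_min):
    # Linear scan, as in the question: first element greater than t_min.
    first = next((t for t in l if t > t_min), None)
    if first is None:
        return None
    return l.index(first)

# Build the sorted list once so its cost is excluded from the timing.
test_l = sorted(random.random() for _ in range(20000))

n_calls = 100  # illustrative repeat count
total = timeit.timeit(lambda: find_index(test_l, 0.5), number=n_calls)
per_call_ms = 1000.0 * total / n_calls
print("find_index: {:.3f} ms per call".format(per_call_ms))
```

The absolute numbers depend on the machine, so they are only meaningful when compared against another implementation timed the same way.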

【Question comments】:

Can't you use search_sorted?
Do you mean docs.scipy.org/doc/numpy/reference/generated/…?
Yes. If you can use a numpy array and sort it, this will be fast; you would basically do np.searchsorted(my_array, find_val, side='right')
Even with the sorting it will be fast: docs.scipy.org/doc/numpy/reference/generated/…
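The np.searchsorted suggestion from the comments could look roughly like the sketch below. It assumes the data is already a sorted NumPy array; the wrapper name find_index_np is hypothetical. With side='right', the returned position is the index of the first element strictly greater than the search value.

```python
import numpy as np

def find_index_np(arr, t_min):
    """Index of the first element of sorted arr greater than t_min, or None."""
    # side='right' places the insertion point after any elements equal to t_min.
    i = int(np.searchsorted(arr, t_min, side='right'))
    return i if i < len(arr) else None

# Sort once as an array instead of converting to a Python list.
arr = np.sort(np.random.random_sample(20000))
idx = find_index_np(arr, 0.5)
```

If the data starts out as a Python list, converting it to an array has its own cost, so this pays off mainly when the array is reused across many lookups.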

Tags: python sorting python-2.7 optimization

【Solution 1】:

The standard library bisect module is useful for this, and the docs contain an example of exactly this use case.

from bisect import bisect_right

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError
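Since the question asks for the index rather than the value, a small variation can return the position that bisect_right computes. This is a sketch, not part of the documented recipe; the name find_gt_index is hypothetical, and it returns None when no element exceeds x, to match the behaviour of find_index in the question.

```python
from bisect import bisect_right

def find_gt_index(a, x):
    """Index of the leftmost value in sorted list a greater than x, or None."""
    # bisect_right returns the insertion point after any entries equal to x,
    # which is exactly the index of the first element strictly greater than x.
    i = bisect_right(a, x)
    return i if i != len(a) else None
```

Binary search takes O(log n) comparisons, so on a list of 20,000 elements it needs about 15 steps instead of scanning up to the whole list.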

【Comments】:

Thanks for your answer - I wasn't aware of bisect. I think I'm looking for find_gt rather than index here, though.