Finding a value in a dict given an integer that falls between the dictionary's tuple keys

【Question title】: Finding value in dict given an integer that can be found in between dictionary's tuple key
【Posted】: 2021-07-17 23:06:03
【Question description】:

Given a dictionary x with tuple keys and string values:

x = {(0, 4): 'foo', (4,9): 'bar', (9,10): 'sheep'}

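The task is to write a function that finds the value for a given number. For example, if the user inputs 3, it should return 'foo'. We can assume the keys contain no overlapping numbers.

As another example, if the user inputs 9, it should return 'bar'.

I tried converting the x dict to a list and wrote the function below, but this is not optimal when the range of values in the keys is very large: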

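from itertools import chain

# Pre-fill one slot per integer up to the largest boundary found in the keys
mappings = [None] * max(chain(*x))

# Copy each value into every index of its range; this fills k[0]..k[1]-1
for k in x:
    for i in range(k[0], k[1]):
        mappings[i] = x[k]

def myfunc(num):
    return mappings[num]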

How else could the myfunc function be written?
Is there a better data structure for holding the mapping?

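【Question comments】:

Are the tuple ranges open or closed?
Can we assume the keys are contiguous ranges?
If the keys are not contiguous ranges, then your mapping function is incorrect.
Ah yes, I left out that I actually take the max and pre-fill with None before running the list comprehension.
I have updated my answer. Please let me know whether it performs well enough for your dict.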

Tags: python dictionary data-structures tuples key-value

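【Solution 1】:

You can simply iterate over the keys and compare the values instead of creating a mapping. This is more efficient than building the mapping first, because you might have a key like (0, 100000), which would expand into 100000 list entries for a single range.

Answer edited based on the OP's comments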

x = {(0, 4): 'foo', (4,9): 'bar', (9,10): 'sheep'}

def find_value(k):
    for t1, t2 in x:
        if k > t1 and k <= t2:   # edited based on comments
            return x[(t1, t2)]
    
    # if we end up here, we can't find a match
    # do whatever appropriate, e.g. return None or raise exception
    return None

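Note: ~~It is unclear whether your tuple keys are inclusive ranges for the input number. E.g. if the user inputs 4, should they get 'foo' or 'bar'? This affects the comparison used in the function in my snippet above.~~ (See the edit above, which should meet your requirements.)

~~In the example above, an input of 4 would return 'foo', because it satisfies the condition k >= 0 and k <= 4 and so returns before the loop continues.~~

Edit: wording and typo fixes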
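A quick way to check the boundary behaviour of this linear scan is to run it on the edge values. The sketch below is only an illustration: it assumes each key (t1, t2) means the range (t1, t2], as clarified in the comments, and the name find_value_in and the choice to pass the dict as a parameter are hypothetical, not part of the answer above.

```python
def find_value_in(dct, k):
    """Linear scan treating each key (t1, t2) as the half-open range (t1, t2]."""
    for (t1, t2), value in dct.items():
        if t1 < k <= t2:
            return value
    # No range contains k; returning None is one option, raising is another
    return None


x = {(0, 4): 'foo', (4, 9): 'bar', (9, 10): 'sheep'}

# Probe both sides of every boundary
for k in (0, 3, 4, 9, 10, 11):
    print(k, find_value_in(x, k))
```

Under that assumption 4 maps to 'foo' and 9 maps to 'bar', while 0 and 11 fall outside every range and give None.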

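【Comments】:

I would take the same approach, except
@jwal Yes, agreed, hopefully the OP can clarify whether the key values are inclusive. Another complication is that we are assuming t1
Hmm, the ranges are (0,4], (4, 9], (9, 10], where [ ] means the number is included.
@alvas Gotcha, then I will just change the comparison to k > t1 and k <= t2. I also see that the (10, 10) key was edited to (9, 10), which makes more sense for this kind of comparison.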

【Solution 2】:

Iterate over the dictionary and compare the number against each key:

x = {(0, 4): 'foo', (4, 9): 'bar', (9, 10): 'sheep'}

def find_tuple(dct, num):
    for tup, val in dct.items():
        if tup[0] <= num < tup[1]:
            return val
    return None

print(find_tuple(x, 3))
# foo
print(find_tuple(x, 9))
# sheep
print(find_tuple(x, 11))
# None

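A better data structure is a dictionary that holds only the left boundary of each interval (as the key) and the corresponding value. You can then use bisect, as the other answers mention. Note that this relies on the dict keeping its keys in sorted insertion order.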

import bisect
import math

x = {
    -math.inf: None,
    0: 'foo',
    4: 'bar',
    9: 'sheep',
    10: None,
}

def find_tuple(dct, num):
    idx = bisect.bisect_right(list(dct.keys()), num)
    return list(dct.values())[idx-1]

print(find_tuple(x, 3))
# foo
print(find_tuple(x, 9))
# sheep
print(find_tuple(x, 11))
# None
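The left-boundary dictionary above is written out by hand. As a rough sketch, it could also be derived from the original tuple-keyed dict. This assumes the same left-closed convention [lo, hi) used in this answer and non-overlapping keys; the helper names build_boundaries and lookup are hypothetical. The boundary list is built once, so each lookup is a single bisect call instead of rebuilding list(dct.keys()) every time.

```python
import bisect


def build_boundaries(dct):
    """Turn {(lo, hi): value} into parallel sorted lists of left boundaries and values.

    A None entry is added at the end of every interval. If the next interval
    starts at that same point, the None is overwritten with its value, so only
    gaps and the far right edge end up mapping to None.
    """
    lefts, values = [], []
    for lo, hi in sorted(dct):
        if lefts and lefts[-1] == lo:
            # Previous interval ended exactly where this one starts
            values[-1] = dct[(lo, hi)]
        else:
            lefts.append(lo)
            values.append(dct[(lo, hi)])
        lefts.append(hi)
        values.append(None)
    return lefts, values


def lookup(lefts, values, num):
    """Return the value of the interval [lo, hi) containing num, or None."""
    idx = bisect.bisect_right(lefts, num)
    if idx == 0:
        return None  # num is left of the first interval
    return values[idx - 1]


x = {(0, 4): 'foo', (4, 9): 'bar', (9, 10): 'sheep'}
lefts, values = build_boundaries(x)
print(lookup(lefts, values, 3))   # foo
print(lookup(lefts, values, 9))   # sheep
print(lookup(lefts, values, 11))  # None
```

The explicit idx == 0 check stands in for the -math.inf sentinel key used above.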

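【Comments】:

【Solution 3】:

You can convert your keys into a numpy array and use numpy.searchsorted to look up the query. Since the keys are left-open, I have incremented the open (left) value of each key by 1 in the array.

Each query is of order O(log(n)).

Create the array: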

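import numpy as np

A = np.array([[k1+1, k2] for k1, k2 in x])
vals = list(x.values())  # assumption: values in the same order as the keys; used by myfunc below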
>>> A
array([[ 1,  4],
       [ 5,  9],
       [10, 10]])

The lookup function:

def myfunc(num):
    ind1 = np.searchsorted(A[:, 0], num, 'right')
    ind2 = np.searchsorted(A[:, 1], num, 'left')
    if ind1 == 0 or ind2 == A.shape[0] or ind1 <= ind2: return None
    return vals[ind2]

Output:

>>> myfunc(3)
'foo'

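【Comments】:

【Solution 4】:

Here is one solution using pandas.IntervalIndex and pandas.cut. Note that I "adjusted" the last key to (10, 11), because I use closed="left" in the IntervalIndex. You can change this setting if you want the intervals closed on a different side (or on both):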

import pandas as pd

x = {(0, 4): "foo", (4, 9): "bar", (10, 11): "sheep"}

bins = pd.IntervalIndex.from_tuples(x, closed="left")
result = pd.cut([3], bins)[0]

print(x[(result.left, result.right)])

Prints:

foo

Another solution using the bisect module (assuming the ranges are contiguous, so there are no "gaps"):

from bisect import bisect_left

x = {(0, 4): "foo", (4, 9): "bar", (10, 10): "sheep"}

bins, values = [], []
for k in sorted(x):
    bins.append(k[1])  # intervals are closed "right", eg. (0, 4]
    values.append(x[k])

idx = bisect_left(bins, 4)
print(values[idx])

Prints:

foo
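The bisect version above assumes contiguous ranges and a query that lies inside them. A minimal sketch of one way to relax that, assuming right-closed ranges (lo, hi] and non-overlapping keys as stated in the question: after bisecting on the right boundaries, also check the left boundary of the candidate interval. The name make_lookup is hypothetical.

```python
from bisect import bisect_left


def make_lookup(dct):
    """Return a lookup function treating each key (lo, hi) as the range (lo, hi]."""
    keys = sorted(dct)                 # sort once, reuse for every query
    rights = [hi for _, hi in keys]    # right boundaries, in sorted order

    def lookup(num):
        # Index of the first interval whose right boundary is >= num
        idx = bisect_left(rights, num)
        # Reject queries past the last interval or inside a gap
        if idx < len(keys) and keys[idx][0] < num:
            return dct[keys[idx]]
        return None

    return lookup


x = {(0, 4): "foo", (4, 9): "bar", (9, 10): "sheep"}
find = make_lookup(x)
print(find(4))   # foo
print(find(9))   # bar
print(find(11))  # None
```

Sorting happens once in make_lookup, so each query costs one binary search plus one comparison.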

【Comments】:

pandas is a world-sized hammer for a very small problem.
@jwal Yes, that is why I also added the bisect approach.