Find the indices at which any element of one list occurs in another, with duplicates

【Question title】: Find the indices at which any element of one list occurs in another, with duplicates
【Posted】: 2019-10-28 13:50:08
【Question description】:

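New to Python, coming from MATLAB. My question is very similar to this post (Find the indices at which any element of one list occurs in another), but with a few twists that I can't quite integrate: handling duplicate values and missing values.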

Following that example, I have two lists, haystack and needles:

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

In my case, however, both haystack and needles are lists of dates. For each element of needles, I need its index in haystack, collected into a list, so that:

result = [5, 6, 7, nan, 5, 9]

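The two main differences between my problem and the linked example are:
1. My needles contain duplicates (the haystack has none), which, as far as I understand, means I can't use set().
2. In rare cases an element of needles may not be in haystack, in which case I want to insert a nan (or some other placeholder).

So far I have this (not efficient enough given how large my haystack and needles are):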

import numpy as np

def find_idx(a, func):
    # Indices of every element of `a` for which func(val) is true
    return [i for (i, val) in enumerate(a) if func(val)]

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

result = []
for x in needles:
    try:
        idx = find_idx(haystack, lambda y: y == x)
        result.append(idx[0])
    except IndexError:
        # No match: idx is empty, so idx[0] raises IndexError
        result.append(np.nan)

As far as I can tell this code does what I want, but it isn't fast enough. Is there a more efficient alternative?

【Question comments】:

This is a duplicate of this question with a different title
The answer is simply [ haystack.index(x) if x in haystack else None for x in needles ]

Tags: python python-3.x

【Solution 1】:

If your arrays are very large, it may be worth building a dictionary that maps each haystack element to its index:

import numpy as np

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles  = ['F', 'G', 'H', 'I', 'F', 'K']

# Map each haystack element to its index, then look up every needle
hayDict  = { K:i for i,K in enumerate(haystack) }
result   = [ hayDict.get(N,np.nan) for N in needles]

print(result)

# [5, 6, 7, nan, 5, 9]
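
Since the question says the real lists hold dates, here is a minimal sketch of the same dictionary lookup using `datetime.date` values. The sample dates and the helper name `index_in_haystack` are made up for illustration; the only assumption is that the date objects are hashable (standard `date` and `datetime` objects are). The sketch uses `setdefault` so that, if the haystack ever did contain duplicates, the first index would be kept, whereas a plain dict comprehension keeps the last one.

```python
import math
from datetime import date

def index_in_haystack(haystack, needles, missing=math.nan):
    """Return, for each needle, its index in haystack, or `missing` if absent."""
    first_index = {}
    for i, item in enumerate(haystack):
        # setdefault only stores the index the first time an item is seen
        first_index.setdefault(item, i)
    return [first_index.get(n, missing) for n in needles]

# Hypothetical sample data: 1 Oct 2019 through 10 Oct 2019
haystack = [date(2019, 10, d) for d in range(1, 11)]
needles = [date(2019, 10, 6), date(2019, 10, 7), date(2019, 11, 1), date(2019, 10, 6)]

result = index_in_haystack(haystack, needles)
print(result)  # [5, 6, nan, 5]
```

Passing `missing=None` (or any other sentinel) swaps the placeholder without touching the lookup logic.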

【Comments】:

Thanks! All of the earlier responses work, but this one turned out to be the fastest by far. Thanks for all the great replies!
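
To check the speed difference on your own data, a rough timing sketch along these lines may help. The data here is synthetic, the sizes are arbitrary, and the function names `with_index` and `with_dict` are invented for this example. The `list.index` version rescans the haystack for every needle, while the dictionary version builds the lookup once and then answers each needle in roughly constant time.

```python
import timeit

# Synthetic data for illustration: some needles repeat, some are missing
haystack = list(range(2000))
needles = list(range(0, 2500, 5)) * 2

def with_index(haystack, needles):
    # Scans the haystack (up to twice) for every needle
    return [haystack.index(x) if x in haystack else None for x in needles]

def with_dict(haystack, needles):
    # Builds the element -> index lookup once, then does hash lookups
    lookup = {k: i for i, k in enumerate(haystack)}
    return [lookup.get(x) for x in needles]

# Both approaches should agree before comparing their speed
assert with_index(haystack, needles) == with_dict(haystack, needles)

for fn in (with_index, with_dict):
    seconds = timeit.timeit(lambda: fn(haystack, needles), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

Actual timings depend on the machine and on the list sizes, so no figures are quoted here.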

【Solution 2】:

How about this?

results = []
haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

for n in needles:
    if n in haystack:
        results.append(haystack.index(n))
    else:
        results.append("NaN")
print(results)

Or, method 2:

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

results = []

def getInd(n, haystack):
    if n in haystack:
        return haystack.index(n)
    else:
        return "NaN"

for n in needles:
    results.append(getInd(n, haystack))

print(results)

【Comments】: