Pandas: comparing rows of dataframes and returning a set based on conditions

【Question title】: Pandas comparing rows of dataframes and returning set based on conditions
【Posted】: 2017-08-05 16:30:39
【Question description】:

I have two dataframes:

[in] print(testing_df.head(n=5))
print(product_combos1.head(n=5))

[out]
                     product_id  length
transaction_id                         
001                      (P01,)       1
002                  (P01, P02)       2
003             (P01, P02, P09)       3
004                  (P01, P03)       2
005             (P01, P03, P05)       3

             product_id  count  length
0            (P06, P09)  36340       2
1  (P01, P05, P06, P09)  10085       4
2            (P01, P06)  36337       2
3            (P01, P09)  49897       2
4            (P02, P09)  11573       2

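I want to return the product_combos rows with the highest frequency whose length is len(testing_df) + 1 and which contain the testing_df strings. For example, for transaction_id 001 I would like to return product_combos[3] (only P09, though).

For the first part (comparing on length only), I tried: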

# Return the product combos values that are of the appropriate length and the strings match
for i in testing_df['length']:
    for k in product_combos1['length']:
        if (i)+1 == (k):
            matches = list(k)

However, this returns the error:

TypeError: 'numpy.int64' object is not iterable

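【Comments】:

Tags: python pandas dataframe

【Solution 1】:

You can't create a list from a non-iterable object like that. Try replacing matches = list(k) with matches = [k]. The parentheses are also redundant: you can replace if (i)+1 == (k): with if i + 1 == k:.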
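To make the difference concrete, here is a minimal sketch. The value `k` is an assumed stand-in for a single element of product_combos1['length'], not the asker's actual data.

```python
import numpy as np

k = np.int64(2)  # stand-in for one value taken from a 'length' column

try:
    list(k)  # list() expects an iterable; a NumPy scalar is not one
except TypeError as exc:
    error_message = str(exc)

matches = [k]  # a list literal wraps the scalar in a one-element list

print(error_message)
print(len(matches))
```

Note that matches = [k] still overwrites the list on every iteration, so only the last matching value would survive the loop.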

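【Comments】:

【Solution 2】:

Just use the .append() method. I would also suggest setting "matches" to an empty list at the top, so that you don't get duplicates when re-running the cell.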

# Setup
import pandas as pd

testing_df = pd.DataFrame({
    'product_id': [('P01',), ('P01', 'P02')],
    'length': [1, 2],
})
# Build the frame from a dict: assigning `product_combos1.count = ...`
# would shadow the DataFrame.count method instead of creating a column.
product_combos1 = pd.DataFrame({
    'product_id': [('P01',), ('P01', 'P02')],
    'count': [100, 5000],
    'length': [3, 1],
})

# Matching

matches = []
for i in testing_df['length']:
    for k in product_combos1['length']:
        if i+1 == k:
            matches.append(k)
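The loop above compares lengths only. As a rough sketch of the full requirement stated in the question (one item longer, contains every product of the transaction, highest count), something like the following might work. The data is modelled on the question's printed output, `best_combo` is a hypothetical helper name, and this is not the method the asker eventually used, which was never posted.

```python
import pandas as pd

# Illustrative data modelled on the question's printed output.
testing_df = pd.DataFrame(
    {"product_id": [("P01",), ("P01", "P02")], "length": [1, 2]},
    index=pd.Index(["001", "002"], name="transaction_id"),
)
product_combos1 = pd.DataFrame({
    "product_id": [("P06", "P09"), ("P01", "P05", "P06", "P09"),
                   ("P01", "P06"), ("P01", "P09"), ("P02", "P09")],
    "count": [36340, 10085, 36337, 49897, 11573],
    "length": [2, 4, 2, 2, 2],
})


def best_combo(products, combos):
    """Hypothetical helper: return the most frequent combo that is one item
    longer than `products` and contains all of them, or None if none match."""
    wanted = set(products)
    right_length = combos["length"] == len(products) + 1
    contains_all = combos["product_id"].apply(lambda combo: wanted.issubset(combo))
    candidates = combos[right_length & contains_all]
    if candidates.empty:
        return None
    # idxmax gives the index label of the row with the highest count.
    return candidates.loc[candidates["count"].idxmax(), "product_id"]


results = testing_df["product_id"].apply(lambda p: best_combo(p, product_combos1))
print(results)
```

For transaction 001 this picks the row at index 3 of the sample combos, which is the row the question asks for. Transaction 002 has no match because the sample contains no combo of length 3.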

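Let me know if this works, or if there are any other problems! Good luck!

【Comments】:

Thanks, but unfortunately this doesn't work. However, I was able to solve the problem using another method.
Sorry to hear that! With the example setup given, it runs fine in my notebook. Glad to hear you were able to solve the problem! Remember to post it as an answer when you get a chance, so that others who come to this post can refer to it.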