Iteratively sample from a list until a condition is met - Python answers

【Question title】: Iteratively Sample from a list Until a Condition is Met - Python
【Posted】: 2020-05-13 07:48:49
【Question】:

I have a list of values:

address_ids = [123,123,123,123,456,789,112,115]

From the address_ids list, I want to check what percentage of the whole list each individual value accounts for.

This is how I compute it:

unique_adres = list(set(address_ids))

save_vals = {}
for i in unique_adres:
    temp_val =  address_ids.count(i)/len(address_ids)
    save_vals[i] = temp_val  
save_vals
>> {456: 0.125, 112: 0.125, 115: 0.125, 789: 0.125, 123: 0.5}

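123 accounts for 50%. I need a condition: if a single value accounts for more than 50% of the data, I want to resample with replacement, drawing 8 samples, so that no single value makes up 50% of the whole list. The result would look something like this (it won't be exactly the same, because the sampling is random):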

>> {456: 0.125, 112: 0.125, 115: 0.125, 789: 0.225, 123: 0.4}

or

>> {456: 0.125, 112: 0.225, 115: 0.125, 789: 0.225, 123: 0.3}

I tried something like this:

from random import choices
for k,v in save_vals.items():
    if v >= 0.50:
        break
    choices_vals = choices(address_ids, k=8)

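But I'm not sure how to keep resampling and rechecking my condition for as long as the `if v >= 0.50:` check is not satisfied.

Any help or suggestions would be great.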

【Question discussion】:

Tags: python-3.x random

【Solution 1】:

Turn the condition into a function and use a loop:

from random import choices

def needs_improvement(sample):
    # share of the sample taken up by each distinct value
    for i in set(sample):
        temp_val = sample.count(i)/len(sample)
        # this checks if you need to change something
        # (>= matches the `v >= 0.50` condition in the question)
        if temp_val >= 0.5:
            return True
    return False

# start from the original list, then resample while the condition holds
choices_vals = address_ids
while needs_improvement(choices_vals):
    choices_vals = choices(address_ids, k=8)
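
As a self-contained variant of the loop above, here is a minimal sketch that wraps the check and the resampling in functions. The names `has_dominant_value` and `resample_until_balanced`, and the `threshold` and `max_tries` parameters, are made up for illustration and are not part of the original answer. `max_tries` is an added safety guard: if the input can never satisfy the condition (for example, a list with a single distinct value), the plain `while` loop would never stop. Unlike the loop above, this version always draws a fresh sample instead of first checking the original list.

```python
import random

def has_dominant_value(sample, threshold=0.5):
    """Return True if any single value makes up `threshold` or more of `sample`."""
    return any(sample.count(v) / len(sample) >= threshold for v in set(sample))

def resample_until_balanced(values, k=8, threshold=0.5, max_tries=1000):
    """Draw k items with replacement until no value reaches `threshold`.

    Raises RuntimeError after `max_tries` draws (illustrative safety guard).
    """
    for _ in range(max_tries):
        sample = random.choices(values, k=k)  # sampling with replacement
        if not has_dominant_value(sample, threshold):
            return sample
    raise RuntimeError("no acceptable sample found within max_tries draws")

address_ids = [123, 123, 123, 123, 456, 789, 112, 115]
balanced = resample_until_balanced(address_ids)
print(balanced)  # output varies from run to run
```

Because 123 fills half of `address_ids`, each draw still picks it with probability 0.5, so several attempts may be needed before a sample passes the check.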

【Discussion】:

123 is not taken into account when resampling :(

【Solution 2】:

The Counter class from collections makes this simple:

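from collections import Counter
from random import choices

address_ids = [123,123,123,123,456,789,112,115]
counter = Counter(address_ids) # {123: 4, 456: 1, 789: 1, 112: 1, 115: 1}
common = counter.most_common(1) # [(123, 4)]
# where most_common(n) gives the n most common values as a list of tuples

sample = address_ids
while common[0][1] >= 0.5*len(sample):
    # resample with replacement (8 draws, as in the question)
    sample = choices(address_ids, k=8)
    # recheck common on the new sample
    common = Counter(sample).most_common(1)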

【Discussion】:

Can you explain how #recheck common is done? Is it a "continue"?
For a new list "address_ids_2": common = Counter(address_ids_2).most_common(1) ... same as above ... you just update the common variable.
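
The "recheck" step described in this exchange can be sketched as follows. The helper name `top_share` is hypothetical, and the sketch assumes a non-empty list. It puts the `Counter(...).most_common(1)` lookup into a function, so the `while` condition recomputes it for every new sample and no `continue` is needed.

```python
from collections import Counter
from random import choices

def top_share(values):
    """Return (most common value, its share of the list); assumes a non-empty list."""
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)

address_ids = [123, 123, 123, 123, 456, 789, 112, 115]
print(top_share(address_ids))  # (123, 0.5)

sample = address_ids
while top_share(sample)[1] >= 0.5:
    # resample with replacement; the loop condition then rechecks the new sample
    sample = choices(address_ids, k=8)

print(sample, top_share(sample))  # output varies from run to run
```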