【Question title】: Iteratively Sample from a list Until a Condition is Met - Python
【Posted】: 2020-05-13 07:48:49
【Question】:

I have a list of values:

address_ids = [123,123,123,123,456,789,112,115]

From the address_ids list, I want to check what fraction of the whole list each individual value accounts for.

This is how I compute it:

unique_adres = list(set(address_ids))

save_vals = {}
for i in unique_adres:
    temp_val =  address_ids.count(i)/len(address_ids)
    save_vals[i] = temp_val  
save_vals
>> {456: 0.125, 112: 0.125, 115: 0.125, 789: 0.125, 123: 0.5}

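123 accounts for 50% of the list. I need a condition: if a single value holds more than 50% of the data, I want to resample with replacement, drawing 8 samples, until no single value makes up 50% of the whole list. The result would look something like this (because the sampling is random, it will not be exactly the same each time):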

>> {456: 0.125, 112: 0.125, 115: 0.125, 789: 0.225, 123: 0.4}

>> {456: 0.125, 112: 0.225, 115: 0.125, 789: 0.225, 123: 0.3}

I tried something like this:

from random import choices
for k,v in save_vals.items():
    if v >= 0.50:
        break
    choices_vals = choices(address_ids, k=8)

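But I am not sure how to keep resampling and rechecking my condition for as long as the check if v >= 0.50: still finds a dominant value.

Any help or suggestions would be great.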

【Question discussion】:

    Tags: python-3.x random


    【Solution 1】:

    Make the condition a function and use a loop:

    from random import choices

    def needs_improvement(sample):
        # check every distinct value in the sample
        for i in set(sample):
            temp_val = sample.count(i) / len(sample)
            # a value holding 50% or more means we need to resample
            if temp_val >= 0.5:
                return True
        return False

    # address_ids is the list from the question; start by checking it as-is
    choices_vals = address_ids
    while needs_improvement(choices_vals):
        # draw 8 values with replacement from the full original list
        choices_vals = choices(address_ids, k=8)
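
To confirm that a loop like the one above did its job, the shares can be recomputed for the new sample in the same way the question computes them. The following is a minimal sketch, not part of the original answer; the helper name `value_shares` is hypothetical.

```python
from random import choices

def value_shares(sample):
    """Map each distinct value to its fraction of `sample` (hypothetical helper)."""
    return {v: sample.count(v) / len(sample) for v in set(sample)}

address_ids = [123, 123, 123, 123, 456, 789, 112, 115]
print(value_shares(address_ids))  # 123 holds 0.5 of the original list

# Resample with replacement until no value reaches half of the sample.
choices_vals = choices(address_ids, k=8)
while max(value_shares(choices_vals).values()) >= 0.5:
    choices_vals = choices(address_ids, k=8)

print(value_shares(choices_vals))  # output varies from run to run
```

Sampling always draws from the full `address_ids`, so 123 remains eligible in every round; only samples in which some value reaches 50% are rejected.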
    

    【Discussion】:

    • 123 is not taken into account when resampling :(
    【Solution 2】:

    The Counter class from collections makes this simple:

    from collections import Counter
    from random import choices

    address_ids = [123,123,123,123,456,789,112,115]
    counter = Counter(address_ids) # Counter({123: 4, 456: 1, 789: 1, 112: 1, 115: 1})
    common = counter.most_common(1) # [(123, 4)]
    # where most_common(n) gives the n most common values as a list of tuples

    sample = address_ids
    while common[0][1] >= 0.5*len(sample):
        sample = choices(address_ids, k=8)       # resample
        common = Counter(sample).most_common(1)  # recheck common

    【Discussion】:

    • Can you explain how #recheck common is done? Is it "continue"?
    • For a new list "address_ids_2": common = Counter(address_ids_2).most_common(1) ... same as above ... just update the common variable
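
The "just update the common variable" advice can also be packaged as a function. The sketch below is one possible arrangement and not code from either answer: the name `resample_until_balanced` and the parameters `threshold`, `max_tries` and `seed` are assumptions added for illustration. The retry cap exists because a heavily skewed list might never yield an acceptable sample.

```python
from collections import Counter
import random

def resample_until_balanced(values, k=8, threshold=0.5, max_tries=1000, seed=None):
    """Draw k values with replacement until no value reaches `threshold` of the sample."""
    rng = random.Random(seed)  # a seed makes the draws reproducible
    for _ in range(max_tries):
        sample = rng.choices(values, k=k)                  # resample
        top_count = Counter(sample).most_common(1)[0][1]   # recheck the most common value
        if top_count / k < threshold:
            return sample
    raise RuntimeError(f"no balanced sample found in {max_tries} tries")

address_ids = [123, 123, 123, 123, 456, 789, 112, 115]
balanced = resample_until_balanced(address_ids, seed=0)
print(Counter(balanced))
```

Unlike the loops above, this version always draws a fresh sample and never returns the original list unchanged. No `continue` is needed: each pass through the loop rebinds `sample` and recomputes the count.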