【Question title】: Python Nested Function Variable Scoping
【Posted】: 2018-08-24 09:59:30
【Question】:

I have the following functions:

def print_hamming_distance(calls):
    #calls is a dictionary
    samples = calls.keys() 
    with Pool(8) as pool: #Parallel Process
        for dist, sample1, sample2 in pool.imap(multi_proc_hamming_distance, itertools.combinations(samples,2)):
            print( dist, sample1, sample2 )   

def multi_proc_hamming_distance(samples): # specifically created function to use with pool
    return hamming_distance(calls[samples[0]],calls[samples[1]]), samples[0], samples[1]

When I call them in my code, I get this error:

NameError: name 'calls' is not defined

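I was under the impression that nested functions can access variables defined outside the function. Can someone explain why I am getting this error?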

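I realize that one solution is to pass the dictionary as an argument to the second function, which is how I worked around the problem, but that increased the run time. Also, when I run the code in Jupyter without wrapping it in print_hamming_distance(calls), it works.

By "without wrapping" I mean something like this: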

def multi_proc_hamming_distance(samples): # specifically created function to use with pool
    return hamming_distance(calls[samples[0]],calls[samples[1]]), samples[0], samples[1]


#calls is already defined somewhere
samples = calls.keys() 
with Pool(8) as pool: #Parallel Process
    for dist, sample1, sample2 in pool.imap(multi_proc_hamming_distance, itertools.combinations(samples,2)):
        print( dist, sample1, sample2 )

Edit: full traceback

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/usr/anaconda3/envs/some_env/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/project/pipeline/project_name/distance.py", line 44, in multi_proc_hamming_distance
    return hamming_distance(calls[samples[0]],calls[samples[1]]), samples[0], samples[1]
NameError: name 'calls' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/project/pipeline/project_name.py", line 264, in <module>
    main()
  File "/usr/project/pipeline/project_name.py", line 259, in main
    distance(param)
  File "/usr/project/pipeline/project_name.py", line 169, in distance
    distance = get_distance[param.data_type](calls)
  File "/usr/project/pipeline/project_name/distance.py", line 37, in get_param_type_distance
    for dist, sample1, sample2 in pool.imap(multi_proc_hamming_distance, itertools.combinations(samples,2)):
  File "/home/usr/anaconda3/envs/some_env/lib/python3.5/multiprocessing/pool.py", line 731, in next
    raise value

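【Comments】:

  • Can you post the full traceback?
  • Added the full traceback as an edit
  • Check the answer and see whether the solution works now
  • There is no lexical nesting here - you have one function that passes a second function to a third. Their lexical environments are unrelated. I suspect you just forgot to indent multi_proc_hamming_distance
  • @molbdnilo I think that was just an editing mistake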
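
The point about lexical nesting can be illustrated with a minimal sketch. The names `outer_nested`, `outer_not_nested`, and `helper` are hypothetical and do not come from the question. The sketch only shows that a function sees an enclosing function's parameter when it is defined inside that function, not when it is merely called or passed along from there.

```python
def outer_nested(calls):
    # inner_helper is defined inside outer_nested, so it closes over
    # the parameter `calls`.
    def inner_helper(key):
        return calls[key]
    return inner_helper("a")


def helper(key):
    # Defined at module level: `calls` is looked up in the module's globals,
    # not in the scope of whichever function happens to call helper.
    return calls[key]


def outer_not_nested(calls):
    return helper("a")


nested_result = outer_nested({"a": 1})

try:
    outer_not_nested({"a": 1})
    not_nested_error = None
except NameError as exc:
    not_nested_error = str(exc)
```

This is likely also why the unwrapped notebook version worked: there `calls` was a module-level global, so the lookup inside the worker function succeeded (assuming the default fork start method on Linux, where worker processes inherit the parent's globals).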

Tags: python-3.x variables scope


【Solution 1】:

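Yes, a nested function can access variables of the enclosing function, and that includes its parameters. In your case, however, multi_proc_hamming_distance is defined at module level, not inside print_hamming_distance, so calls is not in its scope. Moving it inside would not help with Pool either, because Pool has to pickle the worker function and functions defined inside another function cannot be pickled. You can fix the problem by passing calls to the worker explicitly, for example by binding it with functools.partial as follows.

import itertools
from functools import partial
from multiprocessing import Pool

# Module-level worker: takes calls explicitly instead of relying on an outer scope
def multi_proc_hamming_distance(calls, samples):
    # hamming_distance is assumed to be defined elsewhere, as in the question
    return hamming_distance(calls[samples[0]], calls[samples[1]]), samples[0], samples[1]

def print_hamming_distance(calls):
    # calls is a dictionary
    samples = calls.keys()
    # partial binds calls, so pool.imap still sees a one-argument function
    worker = partial(multi_proc_hamming_distance, calls)
    with Pool(8) as pool:  # parallel processing
        for dist, sample1, sample2 in pool.imap(worker, itertools.combinations(samples, 2)):
            print(dist, sample1, sample2)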
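
One caveat when nesting a worker function for use with Pool: the function is sent to the worker processes by pickling, and the standard pickle module can only pickle functions that are importable by name from a module. The following is a minimal sketch of that behaviour without starting any processes. The names `top_level_worker` and `make_nested_worker` are hypothetical.

```python
import pickle


def top_level_worker(samples):
    # Module-level functions are pickled by reference (module + qualified name).
    return samples


def make_nested_worker(calls):
    def nested_worker(samples):
        # Closes over `calls`, but exists only inside make_nested_worker.
        return calls[samples[0]], calls[samples[1]]
    return nested_worker


# Round-tripping a module-level function gives back the same function object.
top_level_ok = pickle.loads(pickle.dumps(top_level_worker)) is top_level_worker

try:
    pickle.dumps(make_nested_worker({"a": 1, "b": 2}))
    nested_ok = True
except (AttributeError, pickle.PicklingError):
    # The exact exception type differs between Python versions.
    nested_ok = False
```

So a closure gives the worker access to `calls`, but it cannot be handed to `pool.imap`. The worker has to stay at module level and receive the data some other way.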

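【Comments】:

  • I have edited it so that the nested function receives calls
  • Is multi_proc_hamming_distance really needed, given that its return just calls a third function?
  • You may be right that I could drop the intermediate function, but either way I would like to avoid adding a second parameter. The second parameter basically breaks pool.imap. Long story short, there are ways around that using starmap or partial(), but the result is a longer run time, as mentioned in the original post. Do you see any way to solve this without adding a second parameter?
  • How about assigning distance = hamming_distance(calls[samples[0]],calls[samples[1]]) and then doing return distance, samples[0], samples[1]? Then you need neither multi_proc_hamming_distance nor a second parameter
  • I don't think that would work. What I could do is build a list of tuples holding the sample ids and the dictionary, but at that point I might as well go back to my modified implementation. Thanks for your help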