Expensive precalculated data only relevant to one function [duplicate] (answers)

【Question title】: Expensive precalculated data only relevant to one function [duplicate]
【Posted】: 2013-07-01 21:48:29
【Question】:

I find myself writing code like this a lot:

import re

_munge_text_re = re.compile("... complicated regex ...")
def munge_text(text):
    match = _munge_text_re.match(text)
    # ... do stuff with match ...

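Only munge_text uses _munge_text_re, so it would be nice to make it local to the function somehow, but if I move the re.compile line inside the def, it will be evaluated every time the function is called, which defeats the purpose of compiling the regex.

Is there a way to make _munge_text_re local to munge_text while still evaluating its initializer only once? The single evaluation need not happen at module load time; the first call to munge_text would be good enough.

The example uses a regex, and most of the time I need this it is for a regex, but it could be any piece of data that is expensive to instantiate (so you don't want to do it on every call) and fixed for the lifetime of the program. ConfigParser instances also come to mind.

Extra credit: for reasons too tedious to get into here, my current project requires extreme backward compatibility, so a solution that works in Python 2.0 would be better than one that doesn't.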

【Question comments】:

Tags: python

【Solution 1】:

I think you could do it like this:

def munge_text(text):
    global munge_text
    _munge_text_re = re.compile("... complicated regex ...")
    def t(text):
        match = _munge_text_re.match(text)
        # ... do stuff with match ...
    # Rebind the module-level name so later calls skip re.compile.
    munge_text = t
    return t(text)
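To see that the rebinding really does limit the expensive step to one run, here is a minimal sketch of the same idea. The names find_number and compile_count and the pattern are made up for illustration; the counter exists only to show how often the compile step executes.

```python
import re

compile_count = 0  # hypothetical counter, only here to show how often we compile


def find_number(text):
    """Return the first run of digits in text, or None."""
    global find_number, compile_count
    compile_count += 1
    number_re = re.compile(r"\d+")  # stand-in for the expensive regex

    def fast_find_number(text):
        match = number_re.search(text)
        return match.group(0) if match else None

    # Rebind the module-level name so later calls skip the compile step.
    find_number = fast_find_number
    return fast_find_number(text)


print(find_number("abc 123"))  # first call: compiles, then rebinds the name
print(find_number("x 45 y"))   # later calls go straight to fast_find_number
print(compile_count)           # the compile step ran once
```

One caveat with this pattern: code that grabbed a reference to the original function before the first call (for example via from module import find_number) keeps calling the slow version, which recompiles every time.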

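【Comments】:

【Solution 2】:

Another approach, which I mention only for information and not because I would use it in production (so I am making this community wiki), is to store the state on the function itself. You can use hasattr, or catch AttributeError: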

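def munge1(x):
    # Cache the value as an attribute on the function object itself.
    if not hasattr(munge1, 'val'):
        print('expensive calculation')
        munge1.val = 10
    return x + munge1.val

def munge2(x):
    # Same idea, but ask forgiveness instead of permission.
    try:
        munge2.val
    except AttributeError:
        print('expensive calculation')
        munge2.val = 10
    return x + munge2.val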

After which:

>>> munge1(3)
expensive calculation
13
>>> munge1(3)
13
>>> munge2(4)
expensive calculation
14
>>> munge2(4)
14

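But honestly, I usually switch to a class at this point.

【Comments】:

Note: this doesn't work in Python 2.0.
@Zack: It works in recent Pythons, even in the 2.x line. Or do you mean literally 2.0, the one that came out more than a decade ago?
Yes, I mean 2.0. Backward compatibility all the way to 2.0 is a requirement for this particular project. You are probably happier not knowing why.

【Solution 3】:

Now that it has state, just make a class for it: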

class TextMunger(object):

    def __init__(self, regex):
        self._pattern = regex
        self._munge_text_re = None

    def __call__(self, text):
        if self._munge_text_re is None:
            self._munge_text_re = re.compile(self._pattern)

        match = self._munge_text_re.match(text)
        # ... do stuff with match ...

munge_text = TextMunger("... complicated regex ...")

# The rest of your code stays the same

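In case you didn't know, defining a __call__ method on a class means its instances can be called as if they were functions, so you can keep using munge_text(text) just as before.

(This kind of problem is actually what led to my question about a lazy property decorator in Python, which might interest you too; I wouldn't bother with it unless you find yourself repeating this pattern a lot.)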
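For readers who haven't seen one, a lazy property decorator could look roughly like the sketch below. This is not the code from the linked question; lazy_property, compile_count and the pattern are illustrative names, and the counter is only there to show that the compile runs once per instance.

```python
import re


class lazy_property(object):
    """Non-data descriptor that computes a value once per instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        value = self.func(instance)
        # Store the result in the instance dict. Because this descriptor
        # defines no __set__, the stored value wins on later lookups.
        instance.__dict__[self.name] = value
        return value


class TextMunger(object):
    def __init__(self, pattern):
        self._pattern = pattern
        self.compile_count = 0  # hypothetical counter for demonstration

    @lazy_property
    def regex(self):
        self.compile_count += 1
        return re.compile(self._pattern)

    def __call__(self, text):
        match = self.regex.match(text)
        return match.group(0) if match else None


munge_digits = TextMunger(r"\d+")
print(munge_digits("123abc"))
print(munge_digits("9"))
print(munge_digits.compile_count)  # the pattern was compiled once
```

On Python 3.8 and later, functools.cached_property in the standard library covers the same use case, so a hand-rolled descriptor like this is mainly of interest on older interpreters.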

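【Comments】:

+1, nice approach. But how do you make docstrings and the like "look right" to introspection (in ipython, code completers, etc.)?
@Dougal - Good question. Usually I only care about what Sphinx picks up, so documenting the class like any other class is enough. You could try assigning to self.__doc__, but that won't fool, for example, help(munge_text). Maybe it will work for the other tools you use. If not, consider posting it as a separate question.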
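As a rough sketch of the self.__doc__ idea from the comment above: the doc parameter and the names below are assumptions added for illustration, not part of the original answer. It shows that inspect.getdoc picks up a per-instance docstring; as the comment says, help() may still describe the class rather than the instance.

```python
import inspect
import re


class TextMunger(object):
    """Callable that lazily compiles a pattern (class-level docstring)."""

    def __init__(self, pattern, doc=None):
        self._pattern = pattern
        self._regex = None
        if doc is not None:
            # Instance attribute; shadows the class docstring for this object.
            self.__doc__ = doc

    def __call__(self, text):
        if self._regex is None:
            self._regex = re.compile(self._pattern)
        match = self._regex.match(text)
        return match.group(0) if match else None


leading_digits = TextMunger(
    r"\d+", doc="Return the leading digits of text, or None."
)

print(inspect.getdoc(leading_digits))  # the per-instance docstring
print(inspect.getdoc(TextMunger))      # the class docstring is unchanged
```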

【Solution 4】:

_munge_text_re = None
def munge_text(text):
    global _munge_text_re
    # Compile on the first call only; later calls reuse the cached pattern.
    if _munge_text_re is None:
        _munge_text_re = re.compile("... complicated regex ...")
    match = _munge_text_re.match(text)
    # ... do stuff with match ...

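【Comments】:

That... doesn't actually get rid of the global variable...
No, but it gets rid of the expensive initialization at load time.
@Malvolio The OP wants to use a local.
The other solution is no better: instead of a module-level variable used only to cache the RE, you get a module-level class used only to cache the RE. Python is not the best language for flexible scoping.
I wonder whether you could get rid of both the global and the class by using some kind of closure, but as far as I can tell you would still end up with a second global function.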
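For what it's worth, here is a minimal sketch of that closure idea, next to the default-argument variant that is often used for the same purpose. All names and the pattern are illustrative. The factory is indeed a second module-level function, as the comment suspects, but it can be deleted once it has run. I have not tested either form on Python 2.0; as far as I know, nested scopes only arrived in later releases (PEP 227), so the default-argument form is the likelier candidate for very old interpreters.

```python
import re


def _make_squeeze_spaces():
    spaces_re = re.compile(r"\s+")  # evaluated once, when the factory runs

    def squeeze_spaces(text):
        # spaces_re is captured from the enclosing scope (a closure).
        return spaces_re.sub(" ", text).strip()

    return squeeze_spaces


squeeze_spaces = _make_squeeze_spaces()
del _make_squeeze_spaces  # the factory is no longer needed as a module name


# Variant: default values are evaluated once, when the def statement runs,
# so the compiled pattern is bound to the function without any global.
def squeeze_spaces_default(text, _spaces_re=re.compile(r"\s+")):
    return _spaces_re.sub(" ", text).strip()


print(squeeze_spaces("a   b \n c"))
print(squeeze_spaces_default("  x\t\ty  "))
```

The trade-off with the default-argument form is that the cached object shows up in the signature, so a caller could override it by accident; the leading underscore is only a convention that signals "do not pass this".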