【Question Title】: What _really_ caches a Django QuerySet?
【Posted】: 2012-12-31 00:48:01
【Question】:

According to (my reading of) the official docs here:

https://docs.djangoproject.com/en/dev/ref/models/querysets/#when-querysets-are-evaluated

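once a Django QuerySet is evaluated, its results should be cached. But that does not seem to be the case. In the example below, TrackingImport is a model backed by a very large table. (Output slightly edited for brevity.)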

recs = TrackingImport.objects.filter(...stuff...)

In [102]: time(recs[0])
Wall time: 1.84 s

In [103]: time(recs[0])
Wall time: 1.84 s

Calling len() seems to work as advertised:

In [104]: len(recs)
Out[104]: 1823

In [105]: time(recs[0])
Wall time: 0.00 s

I don't understand why indexing into the QuerySet doesn't cache its results. It has to evaluate it, right? So what am I missing?

【Comments】:

    Tags: django django-models django-queryset


    【Solution 1】:

    If you look at the source (django.db.models.query), it becomes clear. Here is the relevant part of query.py from Django 1.3.4:

    def __getitem__(self, k):
        """
        Retrieves an item or slice from the set of results.
        """
        if not isinstance(k, (slice, int, long)):
            raise TypeError
        assert ((not isinstance(k, slice) and (k >= 0))
                or (isinstance(k, slice) and (k.start is None or k.start >= 0)
                    and (k.stop is None or k.stop >= 0))), \
                "Negative indexing is not supported."
    
        if self._result_cache is not None:
            if self._iter is not None:
                # The result cache has only been partially populated, so we may
                # need to fill it out a bit more.
                if isinstance(k, slice):
                    if k.stop is not None:
                        # Some people insist on passing in strings here.
                        bound = int(k.stop)
                    else:
                        bound = None
                else:
                    bound = k + 1
                if len(self._result_cache) < bound:
                    self._fill_cache(bound - len(self._result_cache))
            return self._result_cache[k]
    
        if isinstance(k, slice):
            qs = self._clone()
            if k.start is not None:
                start = int(k.start)
            else:
                start = None
            if k.stop is not None:
                stop = int(k.stop)
            else:
                stop = None
            qs.query.set_limits(start, stop)
            return k.step and list(qs)[::k.step] or qs
        try:
            qs = self._clone()
            qs.query.set_limits(k, k + 1)
            return list(qs)[0]
        except self.model.DoesNotExist, e:
            raise IndexError(e.args)
    

    As long as the queryset has not been iterated, _result_cache is None, so a call to recs[0] falls through to these lines:

    try:
       qs = self._clone()
       qs.query.set_limits(k, k + 1)
       return list(qs)[0]
    except self.model.DoesNotExist, e:
       raise IndexError(e.args)
    

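    As you can see, _result_cache is never set on this path. Each call clones the queryset, limits the clone to a single row, runs it, and discards the result. That is why repeated recs[0] calls each take the same amount of time.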

    Calling len(recs) behaves differently. Here is the source:

    def __len__(self):
        # Since __len__ is called quite frequently (for example, as part of
        # list(qs), we make some effort here to be as efficient as possible
        # whilst not messing up any existing iterators against the QuerySet.
        if self._result_cache is None:
            if self._iter:
                self._result_cache = list(self._iter)
            else:
                self._result_cache = list(self.iterator())
        elif self._iter:
            self._result_cache.extend(self._iter)
        return len(self._result_cache)
    

    Here _result_cache is populated with the full result list, so a subsequent recs[0] simply reads from the cache:

     if self._result_cache is not None:
             ....
         return self._result_cache[k]
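
The two code paths can be imitated without Django. The following is a minimal sketch, not Django code: `ToyQuerySet`, `_run_query`, and `query_count` are hypothetical names made up for illustration, and only `_result_cache` mirrors the attribute in the source quoted above. It assumes non-negative integer indexes only and counts each simulated database round trip.

```python
class ToyQuerySet:
    """Toy stand-in for the Django 1.3 caching logic (hypothetical, not Django)."""

    def __init__(self, rows):
        self._rows = list(rows)     # pretend this is the database table
        self._result_cache = None   # same attribute name as in the Django source
        self.query_count = 0        # hypothetical counter, for illustration only

    def _run_query(self, start=None, stop=None):
        # Stands in for one round trip to the database.
        self.query_count += 1
        return self._rows[start:stop]

    def __len__(self):
        # Like QuerySet.__len__: fetch everything and keep it in the cache.
        if self._result_cache is None:
            self._result_cache = self._run_query()
        return len(self._result_cache)

    def __getitem__(self, k):
        if not isinstance(k, int) or k < 0:
            raise TypeError("this sketch supports non-negative integer indexes only")
        if self._result_cache is not None:
            # Cache already filled: no query needed.
            return self._result_cache[k]
        # No cache yet: run a separate one-row query and do NOT store the result.
        return self._run_query(k, k + 1)[0]


recs = ToyQuerySet(range(1823))
recs[0]
recs[0]
print(recs.query_count)  # 2: each uncached index ran its own query
len(recs)
recs[0]
print(recs.query_count)  # 3: len() ran one query, the last index hit the cache
```

In this sketch, as in the question, forcing a full evaluation first (here with `len()`) is what makes later index lookups free.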
    

    The source code never lies, so when the documentation does not answer your question, it is best to read the source.

    【Comments】:

    • Thank you for such a thorough answer.