【Question title】: How to cache a paginated Django queryset
【Posted】: 2014-01-31 18:05:12
【Question】:

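How do you cache a paginated Django queryset, specifically in a ListView?

I noticed that one query takes a long time to run, so I am trying to cache it. The queryset is large (over 100k records), so I am trying to cache only paginated subsections of it. I can't cache the entire view or template, because some parts are user/session specific and need to change constantly.

ListView has two standard methods for retrieving the queryset: get_queryset(), which returns the unpaginated data, and paginate_queryset(), which filters it down to the current page.

I first tried caching the query in get_queryset(), but quickly realized that calling cache.set(my_query_key, super(MyView, self).get_queryset()) causes the entire query to be serialized.

So then I tried overriding paginate_queryset(), like this: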

import time
from django.core.cache import cache
from django.views.generic import ListView

class MyView(ListView):

    ...

    def paginate_queryset(self, queryset, page_size):
        cache_key = 'myview-queryset-%s-%s' % (self.page, page_size)
        print('paginate_queryset.cache_key:', cache_key)
        t0 = time.time()
        ret = cache.get(cache_key)
        if ret is None:
            print('re-caching')
            ret = super(MyView, self).paginate_queryset(queryset, page_size)
            # Cache the whole (paginator, page, object_list, ...) tuple for one hour
            cache.set(cache_key, ret, 60*60)
        td = time.time() - t0
        print('paginate_queryset.time.seconds:', td)
        (paginator, page, object_list, other_pages) = ret
        print('total objects:', len(object_list))
        return ret

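However, this takes nearly a minute to run even though only 10 objects are retrieved, and every request prints "re-caching", which means nothing is being saved to the cache.

My settings.CACHES looks like: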

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

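service memcached status shows that memcached is running, and tail -f /var/log/memcached.log shows nothing at all.

What am I doing wrong? What is the correct way to cache a paginated query so that the entire queryset isn't retrieved?

Edit: I think this may be a bug in memcached or in the Python wrapper. Django appears to support two different memcached backends, one using python-memcached and one using pylibmc. python-memcached seems to silently hide the error raised when caching the paginate_queryset() value. After switching to the pylibmc backend, I now get an explicit error message, "error 10 from memcached_set: SERVER ERROR", traced back to line 78 of django/core/cache/backends/memcached.py, in set.

【Comments】:

Tags: python django django-models memcached django-views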


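【Solution 1】:

You can extend Paginator to support caching via a provided cache_key.

A blog post about the usage and implementation of CachedPaginator can be found here. The source code is posted at djangosnippets.org (here is a web-archive link, because the original no longer works).

However, I will post an example slightly modified from the original version, which caches not only the objects per page but also the total count (sometimes even the count can be an expensive operation).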

from django.core.cache import cache
from django.utils.functional import cached_property
from django.core.paginator import Paginator, Page, PageNotAnInteger


class CachedPaginator(Paginator):
    """A paginator that caches the results on a page by page basis."""
    def __init__(self, object_list, per_page, orphans=0, allow_empty_first_page=True, cache_key=None, cache_timeout=300):
        super(CachedPaginator, self).__init__(object_list, per_page, orphans, allow_empty_first_page)
        self.cache_key = cache_key
        self.cache_timeout = cache_timeout

    @cached_property
    def count(self):
        """
            The original django.core.paginator.Paginator.count attribute in Django 1.8
            is not writable and can't be set manually, but we would like
            to override it when loading data from the cache (instead of recalculating it).
            So we make it writable via @cached_property.
        """
        return super(CachedPaginator, self).count

    def set_count(self, count):
        """
            Override the paginator.count value (to prevent recalculation)
            and clear num_pages and page_range, whose values depend on it.
        """
        self.count = count
        # If .num_pages or .page_range (both cached properties) have already been stored,
        # they can lead to wrong page calculations, because they depend on paginator.count.
        # Clear them to force recalculation on the next access.
        try:
            del self.num_pages
        except AttributeError:
            pass
        try:
            del self.page_range
        except AttributeError:
            pass

    @cached_property
    def num_pages(self):
        """This is not writable in Django1.8. We want to make it writable"""
        return super(CachedPaginator, self).num_pages

    @cached_property
    def page_range(self):
        """This is not writable in Django1.8. We want to make it writable"""
        return super(CachedPaginator, self).page_range

    def page(self, number):
        """
        Returns a Page object for the given 1-based page number.

        This will attempt to pull the results out of the cache first, based on
        the requested page number. If not found in the cache,
        it will pull a fresh list and then cache that result + the total result count.
        """
        if self.cache_key is None:
            return super(CachedPaginator, self).page(number)

        # To avoid counting the queryset,
        # we only validate that the provided number is an integer.
        # The rest of the validation happens when we fetch fresh data,
        # so if the number is invalid, nothing is cached.
        # number = self.validate_number(number)
        try:
            number = int(number)
        except (TypeError, ValueError):
            raise PageNotAnInteger('That page number is not an integer')

        page_cache_key = "%s:%s:%s" % (self.cache_key, self.per_page, number)
        page_data = cache.get(page_cache_key)

        if page_data is None:
            page = super(CachedPaginator, self).page(number)
            # Cache not only the objects, but the total count too.
            page_data = (page.object_list, self.count)
            cache.set(page_cache_key, page_data, self.cache_timeout)
        else:
            cached_object_list, cached_total_count = page_data
            self.set_count(cached_total_count)
            page = Page(cached_object_list, number, self)

        return page
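
The cache_key passed to CachedPaginator has to identify the underlying queryset, otherwise two differently filtered lists would share cached pages. A minimal sketch of one way to build such a key follows. The helper name make_queryset_cache_key and its prefix and params arguments are assumptions for illustration and are not part of the original snippet. Hashing keeps the key short and free of whitespace, which memcached keys require.

```python
import hashlib
from urllib.parse import urlencode


def make_queryset_cache_key(prefix, params):
    """Build a short, memcached-safe key from a prefix and filter parameters.

    `prefix` and `params` are hypothetical names: `params` stands for whatever
    distinguishes one queryset from another (GET filters, ordering, ...).
    """
    # Sort the items so the same filters always produce the same key,
    # regardless of the order in which they were supplied.
    canonical = urlencode(sorted(params.items()))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "%s:%s" % (prefix, digest)
```

In a view, the result could be passed along as CachedPaginator(queryset, 25, cache_key=make_queryset_cache_key("posts", request.GET.dict())). If the queryset depends on the current user, the user's id would need to go into params as well.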

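【Comments】:

  • This is very useful, thank you for the effort you put into caching the count
【Solution 2】:

The problem turned out to be a combination of factors. Mainly, the result returned by paginate_queryset() contains a reference to the unbounded queryset, which means it is essentially uncacheable. When I called cache.set(mykey, (paginator, page, object_list, other_pages)), it tried to serialize thousands of records instead of just the page_size records I expected, so the cached item exceeded memcached's size limit and the set failed.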
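
One rough way to spot this kind of problem before it reaches memcached is to measure how large a value becomes once pickled. The sketch below assumes the cache backend serializes values roughly the way pickle.dumps does and that the server runs with memcached's default 1 MB item limit. The helper names are hypothetical, and the check is approximate because the key and item header also count towards the limit.

```python
import pickle

# memcached's default per-item limit is 1 MB unless the server is started
# with a different -I value (assumed here; adjust to match your server).
DEFAULT_ITEM_LIMIT = 1024 * 1024


def pickled_size(value):
    """Return the number of bytes `value` occupies once pickled."""
    return len(pickle.dumps(value, pickle.HIGHEST_PROTOCOL))


def fits_in_cache(value, limit=DEFAULT_ITEM_LIMIT):
    """Return True if the pickled value is within the per-item limit."""
    return pickled_size(value) <= limit
```

Logging pickled_size(ret) just before cache.set() in the question's paginate_queryset() would have shown that the value was far larger than one page of objects.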

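The other factor was the terrible default error reporting in memcached/python-memcached, which silently hides all errors and turns cache.set() into a no-op whenever anything goes wrong, making the problem very time-consuming to track down.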
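
When a backend fails silently like this, reading the key back right after writing it is a simple way to notice the failure while debugging. This is a minimal sketch: SilentlyFailingCache is a hypothetical stand-in that imitates the silent no-op, and set_and_verify is an illustrative helper that works with any object exposing get() and set() in the style of Django's cache.

```python
_MISSING = object()


class SilentlyFailingCache:
    """Hypothetical stand-in for a backend that drops oversized values without raising."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._data = {}

    def set(self, key, value, timeout=None):
        # Imitate the silent no-op: oversized values are simply discarded.
        if len(value) <= self.max_items:
            self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def set_and_verify(cache, key, value, timeout=None):
    """Store a value, then read the key back; return True only if something is stored."""
    cache.set(key, value, timeout)
    return cache.get(key, _MISSING) is not _MISSING
```

Note that this only detects a missing key. It does not tell a fresh value from a stale one left by an earlier successful set.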

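I fixed this by essentially rewriting paginate_queryset() to abandon Django's built-in paginator functionality altogether and slicing the queryset myself:

object_list = queryset[page_size*(page-1):page_size*(page-1)+page_size]

and then caching that object_list.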
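
The slice-then-cache approach can be sketched as follows. This is an illustration under assumptions, not the answer author's actual code: DictCache is a hypothetical in-memory stand-in for Django's cache object, items stands in for the queryset, and get_cached_page is an invented helper name. With a real queryset, the slice would limit the database query to one page, and list() forces evaluation so that only that page's rows are stored.

```python
class DictCache:
    """Hypothetical in-memory stand-in exposing get()/set() like Django's cache."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # The timeout is ignored in this stand-in.
        self._data[key] = value


def get_cached_page(items, page, page_size, cache, key_prefix="myview", timeout=3600):
    """Return one 1-based page of `items` as a plain list, caching it per page."""
    cache_key = "%s-%s-%s" % (key_prefix, page, page_size)
    object_list = cache.get(cache_key)
    if object_list is None:
        start = page_size * (page - 1)
        # Materialize only the requested slice before caching it.
        object_list = list(items[start:start + page_size])
        cache.set(cache_key, object_list, timeout)
    return object_list
```

Because a plain list is cached instead of the paginator and page objects, the cached value no longer carries a reference to the full queryset.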

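【Comments】:

【Solution 3】:

I wanted to paginate an infinite-scroll view on my home page, and this is the solution I came up with. It is a mix of Django CCBV and the author's original solution.

However, the response time did not improve as much as I had hoped, but that is probably because I only have 6 posts and 2 users when testing it locally, haha.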

# Imports
from django.core.cache import cache
from django.core.paginator import InvalidPage
from django.http import Http404
from django.utils.translation import gettext as _
from django.views.generic.list import ListView


class MyListView(ListView):
    template_name = 'MY TEMPLATE NAME'  # placeholder: your template path
    model = MY POST MODEL  # placeholder: your post model class
    paginate_by = 10

    def paginate_queryset(self, queryset, page_size):
        """Paginate the queryset"""
        paginator = self.get_paginator(
            queryset, page_size, orphans=self.get_paginate_orphans(),
            allow_empty_first_page=self.get_allow_empty())

        page_kwarg = self.page_kwarg

        page = self.kwargs.get(page_kwarg) or self.request.GET.get(page_kwarg) or 1

        try:
            page_number = int(page)
        except ValueError:
            if page == 'last':
                page_number = paginator.num_pages
            else:
                raise Http404(_("Page is not 'last', nor can it be converted to an int."))

        try:
            # Validates the page number; raises InvalidPage if it is out of range
            page = paginator.page(page_number)
            cache_key = 'mylistview-%s-%s' % (page_number, page_size)
            retreive_cache = cache.get(cache_key)

            if retreive_cache is None:
                print('re-caching')
                retreive_cache = super(MyListView, self).paginate_queryset(queryset, page_size)

                # Caching for 1 day
                cache.set(cache_key, retreive_cache, 86400)

            return retreive_cache
        except InvalidPage as e:
            raise Http404(_('Invalid page (%(page_number)s): %(message)s') % {
                'page_number': page_number,
                'message': str(e)
            })
    

【Comments】:
