【Question title】: Django: optimizing many-to-many query
【Posted】: 2011-08-03 22:27:46
【Question】:

I have Post and Tag models:

class Tag(models.Model):
    """ Tag for blog entry """
    title           = models.CharField(max_length=255, unique=True)

class Post(models.Model):
    """ Blog entry """
    tags            = models.ManyToManyField(Tag)
    title           = models.CharField(max_length=255)
    text            = models.TextField()

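I need to output a list of blog entries together with the set of tags for each post. I would like to do this in two queries, using the following workflow:

  1. Get the list of posts
  2. Get the list of tags used in those posts
  3. Link the tags to the posts in Python

I am having trouble with the last step. This is the code I came up with, but it gives me 'Tag' object has no attribute 'post__id'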

#getting posts
posts = Post.objects.filter(published=True).order_by('-added')[:20]
#making a dict, like {5:<post>}
post_list = dict([(obj.id, obj) for obj in posts])
#gathering ids to list
id_list = [obj.id for obj in posts]

#tags used in given posts
objects = Tag.objects.select_related('post').filter(post__id__in=id_list)
relation_dict = {}
for obj in objects:
    #Here I get: 'Tag' object has no attribute 'post__id'
    relation_dict.setdefault(obj.post__id, []).append(obj)

for id, related_items in relation_dict.items():
    post_list[id].tags = related_items

Can you see the error there? How can I solve this task with the Django ORM, or will I have to write custom SQL?

Edit:

I was able to solve this with a raw query:

objects = Tag.objects.raw("""
    SELECT
        bpt.post_id,
        t.*
    FROM
        blogs_post_tags AS bpt,
        blogs_tag AS t
    WHERE
        bpt.post_id IN (""" + ','.join(str(i) for i in id_list) + """)
        AND t.id = bpt.tag_id
""")
relation_dict = {}
for obj in objects:
    relation_dict.setdefault(obj.post_id, []).append(obj)

I would appreciate it if someone could point out how to avoid the raw query.

【Discussion】:

    Tags: python django query-optimization rails-postgresql


    【Solution 1】:

    In this situation I usually do the following:

    posts = Post.objects.filter(...)[:20]
    
    post_id_map = {}
    for post in posts:
        post_id_map[post.id] = post
        # Iteration causes the queryset to be evaluated and cached.
        # We can therefore annotate instances, e.g. with a custom `tag_list`.
        # Note: Don't assign to `tags`, because that would result in an update.
        post.tag_list = []
    
    # We'll now need all relations between Post and Tag. 
    # The auto-generated model that contains this data is `Post.tags.through`.
    for t in Post.tags.through.objects.select_related('tag').filter(post_id__in=list(post_id_map)):
        post_id_map[t.post_id].tag_list.append(t.tag)
    
    # Now you can iterate over `posts` again and use `tag_list` instead of `tags`.
    

    It would be nicer to have this pattern encapsulated somehow, so you may want to add a QuerySet method (e.g. select_tags()) that does it for you.
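
One possible way to package the pattern is a small helper that takes the already-fetched posts plus (post_id, tag) pairs and fills in `tag_list`. The sketch below is deliberately framework-free so that it runs on its own; `attach_tag_lists` and the `SimpleNamespace` stand-ins are hypothetical names used only for illustration, and in a real project the pairs would presumably come from the `Post.tags.through` query.

```python
from types import SimpleNamespace


def attach_tag_lists(posts, links):
    """Give every post a `tag_list` built from (post_id, tag) pairs.

    `posts` is any iterable of objects with an `id` attribute.
    `links` is any iterable of (post_id, tag) pairs.
    """
    posts = list(posts)
    post_id_map = {}
    for post in posts:
        post.tag_list = []  # a plain attribute, not the `tags` manager
        post_id_map[post.id] = post
    for post_id, tag in links:
        post = post_id_map.get(post_id)
        if post is not None:  # ignore links to posts outside this page
            post.tag_list.append(tag)
    return posts


# Stand-ins for model instances, for illustration only.
posts = [SimpleNamespace(id=1, title="First"), SimpleNamespace(id=2, title="Second")]
links = [(1, "django"), (1, "python"), (2, "orm"), (99, "orphan")]
posts = attach_tag_lists(posts, links)
```

With Django, `links` could be a generator such as `((t.post_id, t.tag) for t in Post.tags.through.objects.filter(post_id__in=ids).select_related('tag'))`, where `ids` is the list of post ids. On Django 1.4 and later, `Post.objects.prefetch_related('tags')` performs this kind of two-query join for you, so a hand-written helper is mainly of interest on older versions.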

    【Comments】:

    • Daniel Roseman has encapsulated this nicely in his django-efficient project: github.com/danielroseman/django-efficient
    • This didn't work for me in Django 1.2. I had to do this instead: Post.tags.through.objects.select_related('tag')...
    【Solution 2】:

    If you must do it in two queries, I think you need custom SQL:

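    def custom_query(post_ids):
      from django.db import connection
      post_ids = list(post_ids)
      if not post_ids:
        return {}
      # One %s placeholder per id, so the database driver escapes each value.
      placeholders = ','.join(['%s'] * len(post_ids))
      query = """
      SELECT "blogs_post_tags"."post_id", "blogs_tag"."title"
      FROM "blogs_post_tags"
      INNER JOIN "blogs_tag" ON ("blogs_post_tags"."tag_id"="blogs_tag"."id")
      WHERE "blogs_post_tags"."post_id" IN (%s)
      """ % placeholders
      cursor = connection.cursor()
      cursor.execute(query, post_ids)
      # Group tag titles by post id.
      results = {}
      for post_id, title in cursor.fetchall():
        results.setdefault(post_id, []).append(title)
      return results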
    
    recent_posts = Post.objects.filter(published=True).order_by('-added')[:20]
    post_ids = recent_posts.values_list('id',flat=True)
    post_tags = custom_query(post_ids)
    

    recent_posts is your Post QuerySet and should be cached after a single query.
    post_tags is a mapping from post id to tag titles, built from a single query.
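
The delicate part of the raw-SQL route is passing a list of ids to `IN`. Below is a minimal sketch of one common approach, one placeholder per id, written against the standard-library `sqlite3` module so that it runs without a Django project. The table names follow the ones used above; the sample rows and the `tags_by_post` name are made up for illustration, and note that `sqlite3` uses `?` where Django's cursor uses `%s`.

```python
import sqlite3


def tags_by_post(conn, post_ids):
    """Return {post_id: [tag title, ...]} for the given post ids."""
    post_ids = list(post_ids)
    if not post_ids:
        return {}  # an empty IN () list is rejected by many databases
    # Build "?,?,?" with one placeholder per id; values are bound separately.
    placeholders = ",".join("?" * len(post_ids))
    query = (
        "SELECT pt.post_id, t.title "
        "FROM blogs_post_tags AS pt "
        "INNER JOIN blogs_tag AS t ON pt.tag_id = t.id "
        "WHERE pt.post_id IN (%s) "
        "ORDER BY pt.post_id, t.title" % placeholders
    )
    results = {}
    for post_id, title in conn.execute(query, post_ids):
        results.setdefault(post_id, []).append(title)
    return results


# Illustrative in-memory data only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blogs_tag (id INTEGER PRIMARY KEY, title TEXT UNIQUE);
    CREATE TABLE blogs_post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO blogs_tag VALUES (1, 'django'), (2, 'python'), (3, 'orm');
    INSERT INTO blogs_post_tags VALUES (10, 1), (10, 2), (11, 3), (12, 1);
""")
post_tags = tags_by_post(conn, [10, 11])
```

Only the placeholder string is interpolated into the SQL text; the ids themselves are bound as parameters, which avoids concatenating values into the query by hand.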

    【Comments】:
