【Question title】: Django elasticsearch dsl term and phrase search is not working
【Posted】: 2020-09-19 05:29:11
【Question description】:

I added a search engine to my Django project using two packages: django-elasticsearch-dsl==7.1.4 and django-elasticsearch-dsl-drf==0.20.8. The model I index in Elasticsearch is:

class Article(models.Model):
    created_time = models.DateTimeField(_('created time'), auto_now_add=True)
    updated_time = models.DateTimeField(_('updated time'), auto_now=True)
    profile = models.ForeignKey('accounts.UserProfile', verbose_name=_('profile'), on_delete=models.PROTECT)
    approved_user = models.ForeignKey(settings.AUTH_USER_MODEL, verbose_name=_('approved user'), blank=True, null=True, editable=False, on_delete=models.CASCADE, related_name='article_approved_users')
    approved_time = models.DateTimeField(_('approved time'), blank=True, null=True, db_index=True, editable=False)
    title = models.CharField(_('title'), max_length=50)
    image = models.ImageField(_('image'), blank=True, upload_to=article_directory_path)
    slug = models.SlugField(_('slug'), max_length=50, unique=True)
    content = models.TextField(_('content'))
    summary = models.TextField(_('summary'))
    views_count = models.PositiveIntegerField(verbose_name=_('views count'), default=int, editable=False)
    is_free = models.BooleanField(_('is free'), default=True)
    is_enable = models.BooleanField(_('is enable'), default=True)

    tags = TaggableManager(verbose_name=_('tags'), related_name='articles')
    categories = models.ManyToManyField('Category', verbose_name=_('categories'), related_name='articles')
    namads = models.ManyToManyField('namads.Namad', verbose_name=_('namads'), related_name='articles', blank=True)

I use the following document to index my Article model:

html_strip = analyzer(
    'html_strip',
    tokenizer="whitespace",
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)


@registry.register_document
class ArticleDocument(Document):
    title = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
            'suggest': fields.CompletionField(),
        }
    )

    tags = fields.ObjectField(
        properties={
            "name": fields.TextField(
                analyzer=html_strip,
                fields={
                    'raw': fields.TextField(analyzer='keyword'),
                    'suggest': fields.CompletionField(),
                }
            )
        }
    )
    categories = fields.ObjectField(
        properties={
            'id': fields.IntegerField(),
            'title': fields.TextField(
                analyzer=html_strip,
                fields={
                    'raw': fields.TextField(analyzer='keyword'),
                    'suggest': fields.CompletionField(),
                }
            )
        }
    )
    namads = fields.ObjectField(
        properties={
            "id": fields.IntegerField(),
            "name": fields.TextField(
                analyzer=html_strip,
                fields={
                    'raw': fields.TextField(analyzer='keyword'),
                    'suggest': fields.CompletionField(),
                }
            ),
            "group_name": fields.TextField(
                analyzer=html_strip,
                fields={
                    'raw': fields.TextField(analyzer='keyword'),
                    'suggest': fields.CompletionField(),
                }
            )
        }
    )

    class Index:
        name = settings.ARTICLE_INDEX_NAME
        settings = {
            "number_of_shards": 1,
            "number_of_replicas": 0
        }

    def get_queryset(self):
        return super(ArticleDocument, self).get_queryset().filter(
            approved_user__isnull=False,
            is_enable=True
        ).prefetch_related(
            'tags',
            'categories',
            'namads'
        )

    class Django:
        model = Article
        fields = ['id', 'summary']

Finally, I search it with the following viewset (based on this document):

class ArticleSearchViewSet(DocumentViewSet):
    """

        list:
            Search on all articles, ordered by most recently added.

            query parameters
            -  Search fields: 'title', 'summary', 'tags.name', 'categories.title', 'namads.name',
                'namads.group_name' . Ex: ?search=some random name.
        retrieve:
            Return a specific article details.

    """
    serializer_class = ArticleDocumentSerializer
    document = ArticleDocument
    pagination_class = PageNumberPagination
    lookup_field = 'id'
    filter_backends = [
        FilteringFilterBackend,
        IdsFilterBackend,
        OrderingFilterBackend,
        DefaultOrderingFilterBackend,
        CompoundSearchFilterBackend,
        SuggesterFilterBackend,
    ]
    search_fields = (
        'title',
        'summary',
        'tags.name',
        'categories.title',
        'namads.name',
        'namads.group_name'
    )
    filter_fields = {
        'id': {
            'field': 'id',
            # Note, that we limit the lookups of id field in this example,
            # to `range`, `in`, `gt`, `gte`, `lt` and `lte` filters.
            'lookups': [
                LOOKUP_FILTER_RANGE,
                LOOKUP_QUERY_IN,
                LOOKUP_QUERY_GT,
                LOOKUP_QUERY_GTE,
                LOOKUP_QUERY_LT,
                LOOKUP_QUERY_LTE,
            ],
        },
        'namads': {
            'field': 'namads',
            # Note, that we limit the lookups of `pages` field in this
            # example, to `range`, `gt`, `gte`, `lt` and `lte` filters.
            'lookups': [
                LOOKUP_FILTER_RANGE,
                LOOKUP_QUERY_GT,
                LOOKUP_QUERY_GTE,
                LOOKUP_QUERY_LT,
                LOOKUP_QUERY_LTE,
            ],
        },
        'title': "title.raw",
        'summary': 'summary',
        'categories': {
            'field': 'categories',
            # Note, that we limit the lookups of `pages` field in this
            # example, to `range`, `gt`, `gte`, `lt` and `lte` filters.
            'lookups': [
                LOOKUP_FILTER_RANGE,
                LOOKUP_QUERY_GT,
                LOOKUP_QUERY_GTE,
                LOOKUP_QUERY_LT,
                LOOKUP_QUERY_LTE,
            ],
        },

        'tags': {
            'field': 'tags',
            # Note, that we limit the lookups of `tags` field in
            # this example, to `terms, `prefix`, `wildcard`, `in` and
            # `exclude` filters.
            'lookups': [
                LOOKUP_FILTER_TERMS,
                LOOKUP_FILTER_PREFIX,
                LOOKUP_FILTER_WILDCARD,
                LOOKUP_QUERY_IN,
                LOOKUP_QUERY_EXCLUDE,
            ],
        },
    }
    # Suggester fields
    suggester_fields = {
        'title_suggest': {
            'field': 'title.suggest',
            'suggesters': [
                SUGGESTER_TERM,
                SUGGESTER_COMPLETION,
                SUGGESTER_PHRASE,

            ],
            'default_suggester': SUGGESTER_COMPLETION,
            'options': {
                'size': 10,  # Number of suggestions to retrieve.
                'skip_duplicates': True,  # Whether duplicate suggestions should be filtered out.
            },
        },
        'tags_suggest': {
            'field': 'tags.name.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
            # 'options': {
            #     'size': 20,  # Override default number of suggestions
            # },
        },
        'categories_suggest': {
            'field': 'categories.title.suggest',
            'suggesters': [
                SUGGESTER_TERM,
                SUGGESTER_COMPLETION,
                SUGGESTER_PHRASE,
            ],
        },
        'namads_name_suggest': {
            'field': 'namads.name.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },
        'namad_group_name_suggest': {
            'field': 'namads.group_name.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },

    }
    ordering_fields = {
        'id': 'id',
        'title': 'title',
        'summary': 'summary',
    }
    # Specify default ordering
    ordering = ('-id', )

My document serializer is:

class ArticleDocumentSerializer(DocumentSerializer):
    class Meta:
        document = ArticleDocument
        fields = ['id', 'title', 'summary', 'namads', 'categories', 'tags']

Everything works, except when I search with the term and phrase suggesters using the following query parameters:

?title_suggest__phrase=fi

?title_suggest__term=fi

at the URL localhost/api/v1/blog/articles-search/suggest/. In both cases the result is the same, as shown below:

{
    "title_suggest__term": [
        {
            "text": "fi",
            "offset": 0,
            "length": 2,
            "options": []
        }
    ]
}

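I am quite sure that First article exists in my index, and when I use the completion suggester (i.e. ?title_suggest__completion=fi) everything works and results are returned. Am I missing something? I want to add term and phrase search to my project. How can I fix this (term and phrase suggestions return 0 results)?

【Comments】:

    Tags: python django elasticsearch django-rest-framework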


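    【Solution 1】:

    The problem was in my suggester configuration. First, term and phrase suggestions do not need a completion field (i.e. 'suggest': fields.CompletionField()); we only need to declare the field in the index document, like this: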

    title = fields.TextField(
        fields={
            'raw': fields.TextField(analyzer=html_strip)
        }
    )  # which goes in documents.py
    

    To enable term and phrase suggestions on any field, just add the following:

    suggester_fields = {
        'title_suggest': {
            'field': 'title',
            'suggesters': [
                SUGGESTER_TERM,
                SUGGESTER_PHRASE,
            ],
        },
    }  # which goes in views.py, in the search view's suggester_fields
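
For context, term and phrase suggesters run against an analyzed text field rather than a completion field. Below is a minimal sketch of the kind of Elasticsearch `suggest` request body these two suggesters correspond to, written as a plain Python dict. The helper name `build_suggest_body` and the suggestion names are hypothetical, and the exact body that django-elasticsearch-dsl-drf sends may differ.

```python
import json


def build_suggest_body(text, field="title"):
    """Return a suggest request body with one term and one phrase suggester.

    Both suggesters target the same analyzed text field; no completion
    field is involved. Names used as keys are illustrative only.
    """
    return {
        "suggest": {
            "title_suggest__term": {
                "text": text,
                "term": {"field": field},
            },
            "title_suggest__phrase": {
                "text": text,
                "phrase": {"field": field},
            },
        }
    }


if __name__ == "__main__":
    # Print the body so it can be compared with what the backend sends.
    print(json.dumps(build_suggest_body("something"), indent=2))
```

Printing the body this way can help when comparing against the request logged by Elasticsearch.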
    

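    To see the suggestions, send query parameters such as ?title_suggest__term=something and ?title_suggest__phrase=something. Finally, if we also need completion suggestions for the field, we should add them under a separate key, for example: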

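    suggester_fields = {
        'title_suggest': {
            'field': 'title',
            'suggesters': [
                SUGGESTER_TERM,
                SUGGESTER_PHRASE,
            ],
        },
        'title': {
            # The completion suggester must target the CompletionField
            # sub-field ('suggest'), not the plain text field.
            'field': 'title.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },
    }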
    

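    With this configuration we have three types of suggestions on the title field (term, phrase and completion). So, to get results from all three suggesters at once, we should pass three query parameters, for example: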

    localhost/api/v1/blog/articles-search/suggest/?title_suggest__term=something&title_suggest__phrase=something&title__completion=something
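
The same URL can be assembled programmatically. This is a small sketch using only the standard library; the `http://` scheme and the helper name `build_suggest_url` are assumptions, while the path and parameter names are the ones used above.

```python
from urllib.parse import urlencode

# Assumed scheme and host; the path comes from the example URL above.
BASE_URL = "http://localhost/api/v1/blog/articles-search/suggest/"


def build_suggest_url(text, base_url=BASE_URL):
    """Build a suggest URL that queries the term, phrase and completion suggesters."""
    params = {
        "title_suggest__term": text,
        "title_suggest__phrase": text,
        "title__completion": text,
    }
    # urlencode escapes the values, e.g. spaces become '+'.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_suggest_url("something"))
```

Using `urlencode` keeps multi-word input safe to pass in the query string.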

    Don't forget to change the field configuration in the index document (in case we need completion suggestions):

    title = fields.TextField(
        fields={
            'raw': fields.TextField(analyzer=html_strip),
            'suggest': fields.CompletionField(),
        }
    )  # which goes in documents.py
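
Each key in the suggest response holds a list of entries with an `options` array, as in the response quoted in the question. The following sketch collects the suggested texts per suggester; the helper name `extract_suggestions` is hypothetical. Note that term and phrase suggesters propose corrections for whole words rather than prefix completions, so a two-letter input such as `fi` may still yield empty options even with the corrected configuration.

```python
def extract_suggestions(response):
    """Map each suggester name to the list of suggested texts it returned."""
    results = {}
    for name, entries in response.items():
        texts = []
        for entry in entries:
            # Every option carries the suggested text under the "text" key.
            for option in entry.get("options", []):
                texts.append(option["text"])
        results[name] = texts
    return results


if __name__ == "__main__":
    # Response shape copied from the question.
    sample = {
        "title_suggest__term": [
            {"text": "fi", "offset": 0, "length": 2, "options": []}
        ]
    }
    print(extract_suggestions(sample))  # {'title_suggest__term': []}
```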
    

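    【Discussion】:

    • Do you know how to customize this default ordering so that results are prioritized by a certain value in a column? # Specify default ordering ordering = ('-id', )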